Patent application title: INFLUENZA VIRUS VACCINE COMPOSITION AND METHODS OF USE
Inventors:
Catherine J. Luke (Frederick, MD, US)
Adrian Vilalta (San Diego, CA, US)
Mary K. Wloch (San Diego, CA, US)
Thomas G. Evans (San Diego, CA, US)
Andrew J. Geall (San Marcos, CA, US)
Gretchen S. Jimenez (San Diego, CA, US)
Assignees:
Vical Incorporated
IPC8 Class: AA61K317052FI
USPC Class:
514 44 R
Class name:
Publication date: 2010-08-05
Patent application number: 20100197771
Claims:
1. An isolated polynucleotide comprising a nucleic acid fragment, wherein said nucleic acid fragment is SEQ ID NO: 66.
2. A vector comprising the polynucleotide of claim 1.
3. The polynucleotide of claim 1, further comprising a heterologous nucleic acid ligated to said nucleic acid fragment.
4. A composition comprising the vector of claim 2 and a carrier.
5. The composition of claim 4, further comprising a component selected from the group consisting of an adjuvant and a transfection facilitating compound.
6. The composition of claim 5, wherein said component is a cationic lipid.
7. The composition of claim 5, wherein said adjuvant comprises (±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradecenyloxy)-1-propanaminium bromide (GAP-DMORIE) and a neutral lipid, wherein said neutral lipid is selected from the group consisting of: (a) 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE); (b) 1,2-diphytanoyl-sn-glycero-3-phosphoethanolamine (DPyPE); and (c) 1,2-dimyristoyl-glycero-3-phosphoethanolamine (DMPE).
8. The composition of claim 6, wherein said transfection facilitating compound comprises (±)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide (DMRIE).
9. The composition of claim 8, wherein said transfection facilitating compound further comprises a neutral lipid.
10. The composition of claim 9, wherein said neutral lipid is DOPE.
11. The composition of claim 8 further comprising a 1:1 molar ratio of GAP-DMORIE and DPyPE.
12. A method for treating or preventing influenza infection in a vertebrate comprising administering to a vertebrate in need thereof the composition of claim 4.
13. A method for eliciting an immune response to influenza virus in a vertebrate comprising administering to a vertebrate in need thereof the composition of claim 4.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001]The present application is a continuation of U.S. application Ser. No. 11/131,479 filed May 18, 2005, now pending; which claims benefit under 35 USC §119(e) to U.S. Provisional Application No. 60/571,854 filed May 18, 2004, now abandoned. The disclosure of each prior application is considered part of and is incorporated by reference in the disclosure of this application.
REFERENCE TO A SEQUENCE LISTING SUBMITTED AS AN ELECTRONIC DOCUMENT
[0002]This application includes a "Sequence Listing," which is provided as an electronic document and which is hereby incorporated by reference in its entirety.
BACKGROUND OF THE INVENTION
[0003]The present invention relates to influenza virus vaccine compositions and methods of treating or preventing influenza infection and disease in mammals. Influenza is an acute febrile illness caused by infection of the respiratory tract. There are three types of influenza viruses: A, B, and C ("IAV," "IBV" or "ICV," respectively, or generally "IV"). Type A, which includes several subtypes, causes widespread epidemics and global pandemics such as those that occurred in 1918, 1957 and 1968. Type B causes regional epidemics. Type C causes sporadic cases and minor, local outbreaks. These virus types are distinguished in part on the basis of differences in two structural proteins, the nucleoprotein, found in the center of the virus, and the matrix protein, which forms the viral shell.
[0004]The disease can cause significant systemic symptoms, severe illness requiring hospitalization (such as viral pneumonia), and complications such as secondary bacterial pneumonia. More than 20 million people died during the pandemic flu season of 1918/1919, the largest pandemic of the 20th century. Recent epidemics in the United States are believed to have resulted in greater than 10,000 (up to 40,000) excess deaths per year and 5,000-10,000 deaths per year in non-epidemic years.
[0005]The best strategy for prevention of morbidity and mortality associated with influenza is vaccination. Vaccination is especially recommended for people in high-risk groups, such as residents of nursing or residential homes, as well as for people with diabetes, chronic renal failure, or chronic respiratory conditions.
[0006]Traditional methods of producing influenza vaccines involve growth of an isolated strain in embryonated hens' eggs. Initially, the virus is recovered from a throat swab or similar source and isolated in eggs. The initial isolation in eggs is difficult, but the virus adapts to its egg host and subsequent propagation in eggs takes place relatively easily. It is widely recognized, however, that the egg-derived production of IV for vaccine purposes has several disadvantages. One disadvantage is that such a production process is rather vulnerable due to the varying (micro)biological quality of the eggs. Another disadvantage is that the process completely lacks flexibility if demand suddenly increases, i.e., in case of a serious epidemic or pandemic, because of the logistical problems due to the non-availability of large quantities of suitable eggs. Also, vaccines thus produced are contra-indicated for persons with a known hypersensitivity to chicken and/or egg proteins.
[0007]The influenza vaccines currently in use are designated whole virus (WV) vaccine or subvirion (SV) vaccine (also called "split" or "purified surface antigen"). The WV vaccine contains intact, inactivated virus, whereas the SV vaccine contains purified virus disrupted with detergents that solubilize the lipid-containing viral envelope, followed by chemical inactivation of residual virus. Attenuated viral vaccines against influenza are also in development. A discussion of methods of preparing conventional vaccine may be found in Wright, P. F. & Webster, R. G., FIELDS VIROLOGY, 4th Ed. (Knipe, D. M. et al. Ed.), 1464-65 (2001), for example.
Virus Structures
[0008]An IV is roughly spherical, but it can also be elongated or irregularly shaped. Inside the virus, eight segments of single-stranded RNA contain the genetic instructions for making the virus. The most striking feature of the virus is a layer of spikes projecting outward over its surface. There are two different types of spikes: one is composed of the molecule hemagglutinin (HA), the other of neuraminidase (NA). The HA molecule allows the virus to "stick" to a cell, initiating infection. The NA molecule allows newly formed viruses to exit their host cell without sticking to the cell surface or to each other. The viral capsid is comprised of viral ribonucleic acid and several so-called "internal" proteins (polymerases (PB1, PB2, and PA), matrix protein (M1) and nucleoprotein (NP)). Because antibodies against HA and NA have traditionally proved the most effective in fighting infection, much research has focused on the structure, function, and genetic variation of those molecules. Researchers are also interested in two non-structural proteins, M2 and NS1; both molecules play important roles in viral infection.
[0009]Type A subtypes are described by a nomenclature system that includes the geographic site of discovery, a lab identification number, the year of discovery, and in parentheses the type of HA and NA it possesses, for example, A/Hong Kong/156/97 (H5N1). If the virus infects non-humans, the host species is included before the geographical site, as in A/Chicken/Hong Kong/G9/97 (H9N2).
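The nomenclature described above can be illustrated with a short parsing sketch. The function and field names below are hypothetical and chosen only for illustration; the sketch assumes the slash-separated layout of the two example names given above and an optional trailing subtype of the form (HxNy), and it is not a complete implementation of any official naming standard.

```python
import re
from typing import NamedTuple, Optional


class StrainName(NamedTuple):
    """Fields of a strain designation (hypothetical field names)."""
    virus_type: str
    host: Optional[str]      # present only for non-human isolates
    site: str                # geographic site of discovery
    lab_id: str              # lab identification number
    year: str                # year of discovery, as written
    subtype: Optional[str]   # HA/NA type, e.g. "H5N1"


def parse_strain_name(name: str) -> StrainName:
    # Separate the slash-delimited part from an optional "(HxNy)" suffix.
    match = re.fullmatch(r"\s*(.+?)\s*(?:\((H\d+N\d+)\))?\s*", name)
    if match is None:
        raise ValueError(f"unrecognized strain name: {name!r}")
    fields = [field.strip() for field in match.group(1).split("/")]
    subtype = match.group(2)
    if len(fields) == 4:
        # Human isolate: type/site/lab id/year
        virus_type, site, lab_id, year = fields
        host = None
    elif len(fields) == 5:
        # Non-human isolate: type/host/site/lab id/year
        virus_type, host, site, lab_id, year = fields
    else:
        raise ValueError(f"expected 4 or 5 slash-separated fields: {name!r}")
    return StrainName(virus_type, host, site, lab_id, year, subtype)


human = parse_strain_name("A/Hong Kong/156/97 (H5N1)")
avian = parse_strain_name("A/Chicken/Hong Kong/G9/97 (H9N2)")
print(human)
print(avian)
```

For the first example name the host field is empty, reflecting the convention that the host species is stated only for non-human isolates.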
[0010]Virions contain 7 segments (influenza C virus) to 8 segments (influenza A and B virus) of linear negative-sense single stranded RNA. Most of the segments of the virus genome code for a single protein. For many influenza viruses, the whole genome is now known. Genetic reassortment of the virus results from intermixing of the parental gene segments in the progeny of the viruses when a cell is co-infected by two different viruses of a given type. This phenomenon is facilitated by the segmental nature of the genome of influenza virus. Genetic reassortment is manifested as sudden changes in the viral surface antigens.
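As a rough illustration of why a segmented genome facilitates reassortment, the following sketch enumerates the segment combinations available when a cell is co-infected by two viruses. It rests on a simplifying assumption that is not stated in the text, namely that each segment is inherited independently from either parent and that every combination is viable; the function name is hypothetical.

```python
from itertools import product

SEGMENTS_PER_GENOME = 8  # influenza A and B, per the paragraph above


def reassortant_genotypes(parent_a: str, parent_b: str,
                          n_segments: int = SEGMENTS_PER_GENOME) -> list:
    """List every assignment of segments to one of two parents.

    Each genotype is a string with one single-character parent label per
    segment. Assumes independent inheritance of segments (idealization).
    """
    return ["".join(choice)
            for choice in product((parent_a, parent_b), repeat=n_segments)]


genotypes = reassortant_genotypes("a", "b")
print(len(genotypes))  # 2 ** 8 = 256, including the two parental genotypes
```

Under this idealization, two eight-segment parents give 256 possible progeny genotypes, of which 254 are reassortants; a seven-segment genome (influenza C) would give 128.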
[0011]Antigenic changes in HA and NA allow the influenza virus to have tremendous variability. Antigenic drift is the term used to indicate minor antigenic variations in HA and NA of the influenza virus from the original parent virus, while major changes in HA and NA, which make the new virions significantly different, are called antigenic shift. The difference between the two phenomena is a matter of degree.
[0012]Antigenic drift (minor changes) occurs due to accumulation of point mutations in the gene, which results in changes in the amino acids in the proteins. Changes which are extreme and drastic (too drastic to be explained by mutation alone) result in antigenic shift of the virus. The segmented genomes of the influenza viruses reassort readily in doubly infected cells. Genetic reassortment between human and non-human influenza virus has been suggested as a mechanism for antigenic shift. Influenza is a zoonotic disease, and an important pathogen in a number of animal species, including swine, horses, and birds, both wild and domestic. Influenza viruses are transferred to humans from other species.
[0013]Because of antigenic shift and antigenic drift, immunity to an IV carrying a particular HA and/or NA protein does not necessarily confer protective immunity against IV strains carrying variant, or different HA and/or NA proteins. Because antibodies against HA and NA have traditionally proved the most effective in fighting IV infection, much research has focused on the structure, function and genetic variation of those molecules.
Recent IV Vaccine Candidates
[0014]During the past few years, there has been substantial interest in testing DNA-based vaccines for a number of infectious diseases where the need for a vaccine, or an improved vaccine, exists. Several well-recognized advantages of DNA-based vaccines include the speed, ease and cost of manufacture, the versatility of developing and testing multivalent vaccines, the finding that DNA vaccines can produce a robust cellular response in a wide variety of animal models as well as in humans, and the proven safety of using plasmid DNA as a delivery vector (Donnelly, J. J., et al., Annu. Rev. Immunol. 15:617-648 (1997); Manickan, E., et al., Crit. Rev. Immunol. 17(2):139-154 (1997); U.S. Pat. No. 6,214,804). DNA vaccines represent the next generation in the development of vaccines (Nossal, G., Nat. Med. 4(5 Suppl):475-476 (1998)) and numerous DNA vaccines are in clinical trials. The above references are herein incorporated by reference in their entireties.
[0015]Studies have already been performed using DNA-based vaccines in animals. Ulmer, J. B. et al., Science 259:1745-9 (1993) revealed that mice could be protected by an IV nucleoprotein DNA vaccine alone against severe disease and death resulting from either a homologous or a heterologous IV challenge. Further studies have substantiated this model, and comparative studies of live influenza vaccines versus DNA influenza vaccines show them to be relatively equivalent in immune induction and protection in the murine model.
[0016]WO 94/21797, incorporated herein by reference in its entirety, discloses IV vaccine compositions comprising DNA constructs encoding NP, HA, M1, PB1 and NS1. WO 94/21797 also discloses methods of protecting against IV infection comprising immunization with a prophylactically effective amount of these DNA vaccine compositions.
[0017]The IV nucleoprotein is relatively conserved (see Shu, L. L. et al., J. Virol. 67:2723-9 (1993)), but just as conserved are the M1 matrix protein (which is a major T-cell target), and the M2 protein, which are encoded by separate reading frames of RNA segment 7. See Neirynck, S. et al., Nat. Med. 5:1157-63 (1999); Lamb, R. A. & Lai, C. J., Virology 112:746-51 (1981); Ito, T. et al., J. Virol. 65:5491-8 (1991). Animal DNA vaccine trials have been performed with DNA constructs encoding these genes alone or in combination, usually with success. See Okuda, K., et al., Vaccine 19:3681-91 (2001); Watabe, S. et al., Vaccine 19:4434-44 (2001). Of interest, the M2 protein forms part of an ion channel, is critical in resistance to the antiviral agents amantadine and rimantadine, and approximately 24 of its amino acids are extracellular (eM2). See Fischer, W. B., Biochim Biophys Acta 1561:27-45 (2002); Zhong, Q., FEBS Lett 434:265-71 (1998). Antibodies to this extracellular, highly conserved protein (eM2), which is highly expressed in infected cells (Lamb, R. A., et al., Cell 40:627-33 (1985)), have been shown to be protective in animal models. Treanor, J. J., J. Virol. 64:1375-7 (1990); Slepushkin, V. A. et al., Vaccine 13:1399-402 (1995). An approach using a conjugate hepatitis B core-eM2 protein has been evaluated in an animal model and proposed as a pandemic influenza vaccine. Neirynck, S. et al., Nat. Med. 5:1157-63 (1999). However, in one study vaccination of pigs with a DNA construct expressing eM2-NP fusion protein exacerbated disease after challenge with influenza A virus. Heinen, P. P., J. Gen. Virol. 83:1851-59 (2002). All of the above references are herein incorporated by reference in their entireties.
[0018]Heterologous "prime boost" strategies have been effective for enhancing immune responses and protection against numerous pathogens. Schneider et al., Immunol. Rev. 170:29-38 (1999); Robinson, H. L., Nat. Rev. Immunol. 2:239-50 (2002); Gonzalo, R. M. et al., Vaccine 20:1226-31 (2002); Tanghe, A., Infect. Immun. 69:3041-7 (2001). Providing antigen in different forms in the prime and the boost injections appears to maximize the immune response to the antigen. DNA vaccine priming followed by boosting with protein in adjuvant or by viral vector delivery of DNA encoding antigen appears to be the most effective way of improving antigen-specific antibody and CD4+ T-cell responses or CD8+ T-cell responses, respectively. Shiver, J. W. et al., Nature 415:331-5 (2002); Gilbert, S. C. et al., Vaccine 20:1039-45 (2002); Billaut-Mulot, O. et al., Vaccine 19:95-102 (2000); Sin, J. I. et al., DNA Cell Biol. 18:771-9 (1999). Recent data from monkey vaccination studies suggest that adding CRL1005 poloxamer (12 kDa, 5% POE) to DNA encoding the HIV gag antigen enhances T-cell responses when monkeys are vaccinated with an HIV gag DNA prime followed by a boost with an adenoviral vector expressing HIV gag (Ad5-gag). The cellular immune responses for a DNA/poloxamer prime followed by an Ad5-gag boost were greater than the responses induced with a DNA (without poloxamer) prime followed by Ad5-gag boost or for Ad5-gag only. Shiver, J. W. et al., Nature 415:331-5 (2002). U.S. Patent Appl. Publication No. US 2002/0165172 A1 describes simultaneous administration of a vector construct encoding an immunogenic portion of an antigen and a protein comprising the immunogenic portion of an antigen such that an immune response is generated. The document is limited to hepatitis B antigens and HIV antigens. Moreover, U.S. Pat. No. 6,500,432 is directed to methods of enhancing an immune response of nucleic acid vaccination by simultaneous administration of a polynucleotide and polypeptide of interest. According to the patent, simultaneous administration means administration of the polynucleotide and the polypeptide during the same immune response, preferably within 0-10 or 3-7 days of each other. The antigens contemplated by the patent include, among others, those of Hepatitis (all forms), HSV, HIV, CMV, EBV, RSV, VZV, HPV, polio, influenza, parasites (e.g., from the genus Plasmodium), and pathogenic bacteria (including but not limited to M. tuberculosis, M. leprae, Chlamydia, Shigella, B. burgdorferi, enterotoxigenic E. coli, S. typhosa, H. pylori, V. cholerae, B. pertussis, etc.). All of the above references are herein incorporated by reference in their entireties.
SUMMARY OF THE INVENTION
[0019]The present invention is directed to enhancing the immune response of a vertebrate in need of protection against IV infection by administering in vivo, into a tissue of the vertebrate, at least one polynucleotide, wherein the polynucleotide comprises one or more nucleic acid fragments, where the one or more nucleic acid fragments are optionally fragments of codon-optimized coding regions operably encoding one or more IV polypeptides, or fragments, variants, or derivatives thereof. The present invention is further directed to enhancing the immune response of a vertebrate in need of protection against IV infection by administering, in vivo, into a tissue of the vertebrate, a polynucleotide described above plus at least one isolated IV polypeptide or a fragment, a variant, or derivative thereof. The isolated IV polypeptide can be, for example, a purified subunit, a recombinant protein, a viral vector expressing an isolated IV polypeptide, or can be an inactivated or attenuated IV, such as those present in conventional IV vaccines. According to either method, the polynucleotide is incorporated into the cells of the vertebrate in vivo, and an immunologically effective amount of an immunogenic epitope of the encoded IV polypeptide, or a fragment, variant, or derivative thereof, is produced in vivo. When utilized, an isolated IV polypeptide or a fragment, variant, or derivative thereof is also administered in an immunologically effective amount.
[0020]According to the present invention, the polynucleotide can be administered either prior to, at the same time (simultaneously), or subsequent to the administration of the isolated IV polypeptide. The IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide comprises at least one immunogenic epitope capable of eliciting an immune response to influenza virus in a vertebrate. In addition, an isolated IV polypeptide or fragment, variant, or derivative thereof, when used, comprises at least one immunogenic epitope capable of eliciting an immune response in a vertebrate. The IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide can, but need not, be the same protein or fragment, variant, or derivative thereof as the isolated IV polypeptide which can be administered according to the method.
[0021]The polynucleotide of the invention can comprise a nucleic acid fragment, where the nucleic acid fragment is a fragment of a codon-optimized coding region operably encoding any IV polypeptide or fragment, variant, or derivative thereof, including, but not limited to, HA, NA, NP, M1 or M2 proteins or fragments (e.g., eM2), variants or derivatives thereof. A polynucleotide of the invention can also encode a derivative fusion protein, wherein two or more nucleic acid fragments, at least one of which encodes an IV polypeptide or fragment, variant, or derivative thereof, are joined in frame to encode a single polypeptide, e.g., NP fused to eM2. Additionally, a polynucleotide of the invention can further comprise a heterologous nucleic acid or nucleic acid fragment. Such heterologous nucleic acid or nucleic acid fragment may encode a heterologous polypeptide fused in frame with the polynucleotide encoding the IV polypeptide, e.g., a hepatitis B core protein or a secretory signal peptide. Preferably, the polynucleotide encodes an IV polypeptide or fragment, variant, or derivative thereof comprising at least one immunogenic epitope of IV, wherein the epitope elicits a B-cell (antibody) response, a T-cell (e.g., CTL) response, or both.
[0022]Similarly, the isolated IV polypeptide or fragment, variant, or derivative thereof to be delivered (either a recombinant protein, a purified subunit, or viral vector expressing an isolated IV polypeptide, or in the form of an inactivated IV vaccine) can be any isolated IV polypeptide or fragment, variant, or derivative thereof, including but not limited to the HA, NA, NP, M1 or M2 proteins or fragments (e.g., eM2), variants or derivatives thereof. In certain embodiments, a derivative protein can be a fusion protein, e.g., NP-eM2. In other embodiments, the isolated IV polypeptide or fragment, variant, or derivative thereof can be fused to a heterologous protein, e.g., a secretory signal peptide or the hepatitis B virus core protein. Preferably, the isolated IV polypeptide or fragment, variant, or derivative thereof comprises at least one immunogenic epitope of IV, wherein the epitope elicits a B-cell (antibody) response, a T-cell response, or both.
[0023]Nucleic acids and fragments thereof of the present invention can be altered from their native state in one or more of the following ways. First, a nucleic acid or fragment thereof which encodes an IV polypeptide or fragment, variant, or derivative thereof can be part or all of a codon-optimized coding region, optimized according to codon usage in the animal in which the vaccine is to be delivered. In addition, a nucleic acid or fragment thereof which encodes an IV polypeptide can be a fragment which encodes only a portion of a full-length polypeptide, and/or can be mutated so as to, for example, remove from the encoded polypeptide non-desired protein motifs present in the encoded polypeptide or virulence factors associated with the encoded polypeptide. For example, the nucleic acid sequence could be mutated so as not to encode a membrane anchoring region that would prevent release of the polypeptide from the cell as with, e.g., eM2. Upon delivery, the polynucleotide of the invention is incorporated into the cells of the vertebrate in vivo, and a prophylactically or therapeutically effective amount of an immunologic epitope of an IV is produced in vivo.
[0024]Similarly, the proteins of the invention can be a fragment of a full-length IV polypeptide and/or can be altered so as to, for example, remove from the polypeptide non-desired protein motifs present in the polypeptide or virulence factors associated with the polypeptide. For example, the polypeptide could be altered so as not to include a membrane anchoring region that would prevent release of the polypeptide from the cell.
[0025]The invention further provides immunogenic compositions comprising at least one polynucleotide, wherein the polynucleotide comprises one or more nucleic acid fragments, where each nucleic acid fragment is a fragment of a codon-optimized coding region encoding an IV polypeptide or a fragment, a variant, or a derivative thereof; and immunogenic compositions comprising a polynucleotide as described above and at least one isolated IV polypeptide or a fragment, a variant, or derivative thereof. Such compositions can further comprise, for example, carriers, excipients, transfection facilitating agents, and/or adjuvants as described herein.
[0026]The immunogenic compositions comprising a polynucleotide and an isolated IV polypeptide or fragment, variant, or derivative thereof as described above can be provided so that the polynucleotide and protein formulation are administered separately, for example, when the polynucleotide portion of the composition is administered prior (or subsequent) to the isolated IV polypeptide portion of the composition. Alternatively, immunogenic compositions comprising the polynucleotide and the isolated IV polypeptide or fragment, variant, or derivative thereof can be provided as a single formulation, comprising both the polynucleotide and the protein, for example, when the polynucleotide and the protein are administered simultaneously. In another alternative, the polynucleotide portion of the composition and the isolated IV polypeptide portion of the composition can be provided simultaneously, but in separate formulations.
[0027]Compositions comprising at least one polynucleotide comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide or fragment, variant, or derivative thereof, together with one or more isolated IV polypeptides or fragments, variants or derivatives thereof (as either a recombinant protein, a purified subunit, a viral vector expressing the protein, or in the form of an inactivated or attenuated IV vaccine) will be referred to herein as "combinatorial polynucleotide (e.g., DNA) vaccine compositions" or "single formulation heterologous prime-boost vaccine compositions."
[0028]The compositions of the invention can be univalent, bivalent, trivalent or multivalent. A univalent composition will comprise only one polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide or a fragment, variant, or derivative thereof, and optionally the same IV polypeptide or a fragment, variant, or derivative thereof in isolated form. In a single formulation heterologous prime-boost vaccine composition, a univalent composition can include a polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide or a fragment, variant, or derivative thereof and an isolated polypeptide having the same antigenic region as the polynucleotide. A bivalent composition will comprise, either in polynucleotide or protein form, two different IV polypeptides or fragments, variants, or derivatives thereof, each capable of eliciting an immune response. The polynucleotide(s) of the composition can encode two IV polypeptides or alternatively, the polynucleotide can encode only one IV polypeptide and the second IV polypeptide would be provided by an isolated IV polypeptide of the invention as in, for example, a single formulation heterologous prime-boost vaccine composition. In the case where both IV polypeptides of a bivalent composition are delivered in polynucleotide form, the nucleic acid fragments operably encoding those IV polypeptides need not be on the same polynucleotide, but can be on two different polynucleotides. A trivalent or further multivalent composition will comprise three IV polypeptides or fragments, variants or derivatives thereof, either in isolated form or encoded by one or more polynucleotides of the invention.
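The valency terminology above can be summarized in a small sketch. The function names are hypothetical, antigens are identified by simple labels such as "NP" or "HA" purely for illustration, and treating any count above three as "multivalent" is an assumption made here for the sake of the example.

```python
def valency_label(n_distinct_antigens: int) -> str:
    """Map a count of distinct IV polypeptides to a valency term."""
    if n_distinct_antigens < 1:
        raise ValueError("a composition needs at least one antigen")
    names = {1: "univalent", 2: "bivalent", 3: "trivalent"}
    return names.get(n_distinct_antigens, "multivalent")


def composition_valency(encoded: list, isolated: list) -> str:
    """Count distinct antigens across polynucleotide-encoded and isolated forms.

    The same antigen supplied in both forms is counted once, so a
    polynucleotide plus the same polypeptide in isolated form is univalent.
    """
    return valency_label(len(set(encoded) | set(isolated)))


print(composition_valency(["NP"], ["NP"]))        # univalent
print(composition_valency(["NP"], ["HA"]))        # bivalent
print(composition_valency(["NP", "M2"], ["HA"]))  # trivalent
```

The second call corresponds to the case described above in which the polynucleotide encodes only one IV polypeptide and the second is provided in isolated form.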
[0029]The present invention further provides plasmids and other polynucleotide constructs for delivery of nucleic acid fragments of the invention to a vertebrate, e.g., a human, which provide expression of IV polypeptides, or fragments, variants, or derivatives thereof. The present invention further provides carriers, excipients, transfection-facilitating agents, immunogenicity-enhancing agents, e.g., adjuvants, or other agent or agents to enhance the transfection, expression or efficacy of the administered gene and its gene product.
[0030]In one embodiment, a multivalent composition comprises a single polynucleotide, e.g., plasmid, comprising one or more nucleic acid regions operably encoding IV polypeptides or fragments, variants, or derivatives thereof. Reducing the number of polynucleotides, e.g., plasmids in the compositions of the invention can have significant impacts on the manufacture and release of product, thereby reducing the costs associated with manufacturing the compositions. There are a number of approaches to include more than one expressed antigen coding sequence on a single plasmid. These include, for example, the use of Internal Ribosome Entry Site (IRES) sequences, dual promoters/expression cassettes, and fusion proteins.
[0031]The invention also provides methods for enhancing the immune response of a vertebrate to IV infection by administering to the tissues of a vertebrate one or more polynucleotides each comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide or fragment, variant, or derivative thereof; and optionally administering to the tissues of the vertebrate one or more isolated IV polypeptides, or fragments, variants, or derivatives thereof. The isolated IV polypeptide can be administered prior to, at the same time (simultaneously), or subsequent to administration of the polynucleotides encoding IV polypeptides.
[0032]In addition, the invention provides consensus amino acid sequences for IV polypeptides, or fragments, variants or derivatives thereof, including, but not limited to the HA, NA, NP, M1 or M2 proteins or fragments (e.g. eM2), variants or derivatives thereof. Polynucleotides which encode the consensus polypeptides or fragments, variants or derivatives thereof, are also embodied in this invention. Such polynucleotides can be obtained by known methods, for example by backtranslation of the amino acid sequence and PCR synthesis of the corresponding polynucleotide as described below.
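Backtranslation of an amino acid sequence, as mentioned above, can be sketched as a table lookup that assigns one codon to each residue. The codon choices in the table below are illustrative assumptions only; they are valid codons for each residue under the standard genetic code but are not taken from this application and do not represent any particular codon-optimization scheme described herein. The function names are hypothetical. The example peptide MASQGT is the translation of the first six codons (ATG GCG TCT CAA GGC ACC) of the native NP coding region shown in SEQ ID NO:1.

```python
# One codon per residue (single-letter code); illustrative choices only.
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",  # stop
}

# Reverse lookup covering only the codons used in the table above.
CODON_TO_RESIDUE = {codon: aa for aa, codon in PREFERRED_CODON.items()}


def backtranslate(protein: str) -> str:
    """Return a DNA sequence encoding `protein` using PREFERRED_CODON."""
    try:
        return "".join(PREFERRED_CODON[aa] for aa in protein.upper())
    except KeyError as err:
        raise ValueError(f"unknown residue: {err.args[0]!r}") from None


def translate_preferred(dna: str) -> str:
    """Translate DNA made only of PREFERRED_CODON codons (round-trip check)."""
    return "".join(CODON_TO_RESIDUE[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "MASQGT"
dna = backtranslate(peptide)
print(dna)                       # ATGGCCAGCCAGGGCACC
print(translate_preferred(dna))  # MASQGT
```

The backtranslated sequence differs from the native codons at three positions (GCG to GCC, TCT to AGC, CAA to CAG) while encoding the same six residues, which is the general idea behind altering codon usage without changing the encoded polypeptide.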
BRIEF DESCRIPTION OF THE FIGURES
[0033]FIG. 1 shows an alignment of nucleotides 46-1542 of SEQ ID NO:1 (native NP coding region) with a coding region fully codon-optimized for human usage (SEQ ID NO:23).
[0034]FIG. 2 shows the protocol for the preparation of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005 and 5 mg/ml of DNA in a final volume of 3.6 ml, through the use of thermal cycling.
[0035]FIG. 3 shows the protocol for the preparation of a formulation comprising 0.3 mM BAK, 34 mg/ml or 50 mg/ml CRL 1005 and 2.5 mg/ml DNA in a final volume of 4.0 ml, through the use of thermal cycling.
[0036]FIG. 4 shows the protocol for the simplified preparation (without thermal cycling) of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005 and 5 mg/ml DNA.
[0037]FIG. 5 shows the anti-NP antibody response three weeks after a single administration of a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.
[0038]FIG. 6 shows the anti-NP antibody response twelve days after a second administration of a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.
[0039]FIG. 7 shows the CD8+ T Cell response to a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.
[0040]FIG. 8 shows the CD4+ T Cell response to a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.
[0041]FIGS. 9A and 9B show the results of a two dose mouse immunization regimen study with plasmid DNA encoding IAV HA (H3).
[0042]FIGS. 10A and 10B show the in vitro expression of M1 and M2 from segment 7 and an M1M2 fusion.
[0043]FIGS. 11A and 11B show the in vitro expression of eM2-NP and codon-optimized influenza virus NP protein.
[0044]FIG. 12 shows the influenza A NP protein consensus amino acid sequence aligned with 22 full length NP sequences.
[0045]FIG. 13 is a schematic diagram of various vectors encoding influenza proteins described herein.
[0046]FIG. 14 shows the results of western blot experiments as described in Example 13, Experiment 3. The blots show lysates of VM92 cells transfected with plasmids which express M2 or NP to compare expression of the influenza protein from different expression vectors.
[0047]FIG. 15 shows the results of western blot experiments as described in Example 13, Experiment 3. The blots show lysates of VM92 cells transfected with plasmids which express M1, M2 or NP to compare expression of the influenza protein from expression vectors.
DETAILED DESCRIPTION OF THE INVENTION
[0048]The present invention is directed to compositions and methods for enhancing the immune response of a vertebrate in need of protection against IV infection by administering in vivo, into a tissue of a vertebrate, at least one polynucleotide comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide, or a fragment, variant, or derivative thereof in cells of the vertebrate in need of protection. The present invention is also directed to administering in vivo, into a tissue of the vertebrate the above described polynucleotide and at least one isolated IV polypeptide, or a fragment, variant, or derivative thereof. The isolated IV polypeptide or fragment, variant, or derivative thereof can be, for example, a recombinant protein, a purified subunit protein, a protein expressed and carried by a heterologous live or inactivated or attenuated viral vector expressing the protein, or can be an inactivated IV, such as those present in conventional, commercially available, inactivated IV vaccines. According to either method, the polynucleotide is incorporated into the cells of the vertebrate in vivo, and an immunologically effective amount of the influenza protein, or fragment or variant encoded by the polynucleotide is produced in vivo. The isolated protein or fragment, variant, or derivative thereof is also administered in an immunologically effective amount. The polynucleotide can be administered to the vertebrate in need thereof either prior to, at the same time (simultaneously), or subsequent to the administration of the isolated IV polypeptide or fragment, variant, or derivative thereof.
[0049]Non-limiting examples of IV polypeptides within the scope of the invention include, but are not limited to, NP, HA, NA, M1 and M2 polypeptides, and fragments, e.g., eM2, derivatives, e.g., an NP-eM2 fusion, and variants thereof. Nucleotide and amino acid sequences of IV polypeptides from a wide variety of IV types and subtypes are known in the art. The nucleotide sequences set out below are the wild-type sequences. For example, the nucleotide sequence of the NP protein of Influenza A/PR/8/34 (H1N1) is available as GenBank Accession Number M38279.1, and has the following sequence, referred to herein as SEQ ID NO:1:
TABLE-US-00001 AGCAAAAGCAGGGTAGATAATCACTCACTGAGTGACATCAAAATCATGGCGTCTCAAGGCACC AAACGATCTTACGAACAGATGGAGACTGATGGAGAACGCCAGAATGCCACTGAAATCAGAGCA TCCGTCGGAAAAATGATTGGTGGAATTGGACGATTCTACATCCAAATGTGCACCGAACTCAAA CTCAGTGATTATGAGGGACGGTTGATCCAAAACAGCTTAACAATAGAGAGAATGGTGCTCTCT GCTTTTGACGAAAGGAGAAATAAATACCTTGAAGAACATCCCAGTGCGGGGAAAGATCCTAAG AAAACTGGAGGACCTATATACAGGAGAGTAAACGGAAAGTGGATGAGAGAACTCATCCTTTAT GACAAAGAAGAAATAAGGCGAATCTGGCGCCAAGCTAATAATGGTGACGATGCAACGGCTGGT CTGACTCACATGATGATCTGGCATTCCAATTTGAATGATGCAACTTATCAGAGGACAAGAGCT CTTGTTCGCACCGGAATGGATCCCAGGATGTGCTCTCTGATGCAAGGTTCAACTCTCCCTAGG AGGTCTGGAGCCGCAGGTGCTGCAGTCAAAGGAGTTGGAACAATGGTGATGGAATTGGTCAGA ATGATCAAACGTGGGATCAATGATCGGAACTTCTGGAGGGGTGAGAATGGACGAAAAACAAGA ATTGCTTATGAAAGAATGTGCAACATTCTCAAAGGGAAATTTCAAACTGCTGCACAAAAAGCA ATGATGGATCAAGTGAGAGAGAGCCGGAACCCAGGGAATGCTGAGTTCGAAGATCTCACTTTT CTAGCACGGTCTGCACTCATATTGAGAGGGTCGGTTGCTCACAAGTCCTGCCTGCCTGCCTGT GTGTATGGACCTGCCGTAGCCAGTGGGTACGACTTTGAAAGGGAGGGATACTCTCTAGTCGGA ATAGACCCTTTCAGACTGCTTCAAAACAGCCAAGTGTACAGCCTAATCAGACCAAATGAGAAT CCAGCACACAAGAGTCAACTGGTGTGGATGGCATGCCATTCTGCCGCATTTGAAGATCTAAGA GTATTAAGCTTCATCAAAGGGACGAAGGTGCTCCCAAGAGGGAAGCTTTCCACTAGAGGAGTT CAAATTGCTTCCAATGAAAATATGGAGACTATGGAATCAAGTACACTTGAACTGAGAAGCAGG TACTGGGCCATAAGGACCAGAAGTGGAGGAAACACCAATCAACAGAGGGCATCTGCGGGCCAA ATCAGCATACAACCTACGTTCTCAGTACAGAGAAATCTCCCTTTTGACAGAACAACCGTTATG GCAGCATTCAGTGGGAATACAGAGGGGAGAACATCTGACATGAGGACCGAAATCATAAGGATG ATGGAAAGTGCAAGACCAGAAGATGTGTCTTTCCAGGGGCGGGGAGTCTTCGAGCTCTCGGAC GAAAAGGCAGCGAGCCCGATCGTGCCTTCCTTTGACATGAGTAATGAAGGATCTTATTTCTTC GGAGACAATGCAGAGGAATACGATAATTAAAGAAAAATACCCTTGTTTCTACT
[0050]The amino acid sequence of the NP protein of Influenza A/PR/8/34 (H1N1), encoded by nucleotides 46-1494 of SEQ ID NO:1 is as follows, referred to herein as SEQ ID NO:2:
TABLE-US-00002 MASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLS DYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRV NGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQR TRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELVRMIKRG INDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQKAMMDQVRESRNPGN AEFEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVG IDPFRLLQNSQVYSLIRPNENPAHKSQLVWMACHSAAFEDLRVLSFIKG TKVLPRGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQ RASAGQISIQPTFSVQRNLPFDRTTVMAAFSGNTEGRTSDMRTEIIRMME SARPEDVSFQGRGVFELSDEKAASPIVPSFDMSNEGSYFFGDNAEEYDN
[0051]Segment 7 of the IAV genome encodes both M1 and M2. Segment 7 of Influenza A virus (A/Puerto Rico/8/34/Mount Sinai (H1N1)), is available as GenBank Accession No. AF389121.1, and has the following sequence, referred to herein as SEQ ID NO:3:
TABLE-US-00003 AGCGAAAGCAGGTAGATATTGAAAGATGAGTCTTCTAACCGAGGTCGAAACGTACGTACTCTC TATCATCCCGTCAGGCCCCCTCAAAGCCGAGATCGCACAGAGACTTGAAGATGTCTTTGCAGG GAAGAACACTGATCTTGAGGTTCTCATGGAATGGCTAAAGACAAGACCAATCCTGTCACCTCT GACTAAGGGGATTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAGCG TAGACGCTTTGTCCAAAATGCCCTTAATGGGAACGGGGATCCAAATAACATGGACAAAGCAGT TAAACTGTATAGGAAGCTCAAGAGGGAGATAACATTCCATGGGGCCAAAGAAATCTCACTCAG TTATTCTGCTGGTGCACTTGCCAGTTGTATGGGCCTCATATACAACAGGATGGGGGCTGTGAC CACTGAAGTGGCATTTGGCCTGGTATGTGCAACCTGTGAACAGATTGCTGACTCCCAGCATCG GTCTCATAGGCAAATGGTGACAACAACCAATCCACTAATCAGACATGAGAACAGAATGGTTTT AGCCAGCACTACAGCTAAGGCTATGGAGCAAATGGCTGGATCGAGTGAGCAAGCAGCAGAGGC CATGGAGGTTGCTAGTCAGGCTAGACAAATGGTGCAAGCGATGAGAACCATTGGGACTCATCC TAGCTCCAGTGCTGGTCTGAAAAATGATCTTCTTGAAAATTTGCAGGCCTATCAGAAACGAAT GGGGGTGCAGATGCAACGGTTCAAGTGATCCTCTCGCTATTGCCGCAAATATCATTGGGATCT TGCACTTGACATTGTGGATTCTTGATCGTCTTTTTTTCAAATGCATTTACCGTCGCTTTAAAT ACGGACTGAAAGGAGGGCCTTCTACGGAAGGAGTGCCAAAGTCTATGAGGGAAGAATATCGAA AGGAACAGCAGAGTGCTGTGGATGCTGACGATGGTCATTTTGTCAGCATAGAGCTGGAGTAAA AAACTACCTTGTTTCTACT
[0052]The amino acid sequence of the M1 protein of Influenza A/Puerto Rico/8/34/Mount Sinai (H1N1), encoded by nucleotides 26 to 784 of SEQ ID NO:3 is as follows, referred to herein as SEQ ID NO:4:
TABLE-US-00004 MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRP ILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLY RKLKREITFHGAKEISLSYSAGALASCMGLIYNRMGAVTTEVAFGLVCAT CEQIADSQHRSHRQMVTTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAA EAMEVASQARQMVQAMRTIGTHPSSSAGLKNDLLENLQAYQKRMGVQMQ RFK
[0053]The amino acid sequence of the M2 protein of Influenza A/Puerto Rico/8/34/Mount Sinai (H1N1), encoded (in spliced form) by nucleotides 26 to 51 and 740 to 1007 of SEQ ID NO:3 is as follows, referred to herein as SEQ ID NO:5:
TABLE-US-00005 MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLTLWILDRLFFKC IYRRFKYGLKGGPSTEGVPKSMREEYRKEQQSAVDADDGHFVSIELE
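The coordinate ranges quoted above for M1 and M2 can be sanity-checked with simple arithmetic. The following is a minimal sketch in Python, assuming the ranges are 1-based and inclusive; the helper names region_length and codon_count are hypothetical and are not part of the disclosure.

```python
# Hypothetical helpers; coordinates are assumed to be 1-based and inclusive,
# as in the "nucleotides 26 to 784 of SEQ ID NO:3" notation.

def region_length(segments):
    """Total number of nucleotides covered by a list of (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


def codon_count(segments):
    """Number of whole codons; raises if the length is not a multiple of 3."""
    length = region_length(segments)
    if length % 3 != 0:
        raise ValueError(f"{length} nt is not a whole number of codons")
    return length // 3


m1_segments = [(26, 784)]              # unspliced M1 region of SEQ ID NO:3
m2_segments = [(26, 51), (740, 1007)]  # spliced M2 region of SEQ ID NO:3

print(region_length(m1_segments), codon_count(m1_segments))  # 759 253
print(region_length(m2_segments), codon_count(m2_segments))  # 294 98
```

The codon counts (253 and 98) are each one more than the number of residues listed in SEQ ID NO:4 (252) and SEQ ID NO:5 (97), which is consistent with each quoted range ending in a stop codon.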
[0054]The extracellular region of the M2 protein (eM2) corresponds to the first 24 amino acids of the N-terminal end of the protein (MSLLTEVETPIRNEWGCRCNGSSD in SEQ ID NO:5), and is underlined above. See Fischer, W. B. et al., Biochim. Biophys. Acta. 1561:27-45 (2002); Zhong, Q. et al., FEBS Lett. 434:265-71 (1998).
[0055]A derivative of NP and eM2 described herein is encoded by a construct which encodes the first 24 amino acids of M2 and all or a portion of NP. The fusion constructs may be constructed with the eM2 sequences followed by the NP sequences, or with the NP sequences followed by the eM2 sequences. Exemplary fusion constructs using the NP and M2 sequences from Influenza A/PR/8/34 (H1N1) are set out below. A sequence, using the original influenza virus nucleotide sequences, which encodes the first 24 amino acids of M2 fused at its 3' end to a sequence which encodes NP in its entirety (eM2-NP) is referred to herein as SEQ ID NO:6:
TABLE-US-00006 1 ATGAGTCTTC TAACCGAGGT CGAAACGCCT ATCAGAAACG AATGGGGGTG CAGATGCAAC 61 GGTTCAAGTG ATATGGCGTC TCAAGGCACC AAACGATCTT ACGAACAGAT GGAGACTGAT 121 GGAGAACGCC AGAATGCCAC TGAAATCAGA GCATCCGTCG GAAAAATGAT TGGTGGAATT 181 GGACGATTCT ACATCCAAAT GTGCACCGAA CTCAAACTCA GTGATTATGA GGGACGGTTG 241 ATCCAAAACA GCTTAACAAT AGAGAGAATG GTGCTCTCTG CTTTTGACGA AAGGAGAAAT 301 AAATACCTTG AAGAACATCC CAGTGCGGGG AAAGATCCTA AGAAAACTGG AGGACCTATA 361 TACAGGAGAG TAAACGGAAA GTGGATGAGA GAACTCATCC TTTATGACAA AGAAGAAATA 421 AGGCGAATCT GGCGCCAAGC TAATAATGGT GACGATGCAA CGGCTGGTCT GACTCACATG 481 ATGATCTGGC ATTCCAATTT GAATGATGCA ACTTATCAGA GGACAAGAGC TCTTGTTCGC 541 ACCGGAATGG ATCCCAGGAT GTGCTCTCTG ATGCAAGGTT CAACTCTCCC TAGGAGGTCT 601 GGAGCCGCAG GTGCTGCAGT CAAAGGAGTT GGAACAATGG TGATGGAATT GGTCAGAATG 661 ATCAAACGTG GGATCAATGA TCGGAACTTC TGGAGGGGTG AGAATGGACG AAAAACAAGA 721 ATTGCTTATG AAAGAATGTG CAACATTCTC AAAGGGAAAT TTCAAACTGC TGCACAAAAA 781 GCAATGATGG ATCAAGTGAG AGAGAGCCGG AACCCAGGGA ATGCTGAGTT CGAAGATCTC 841 ACTTTTCTAG CACGGTCTGC ACTCATATTG AGAGGGTCGG TTGCTCACAA GTCCTGCCTG 901 CCTGCCTGTG TGTATGGACC TGCCGTAGCC AGTGGGTACG ACTTTGAAAG GGAGGGATAC 961 TCTCTAGTCG GAATAGACCC TTTCAGACTG CTTCAAAACA GCCAAGTGTA CAGCCTAATC 1021 AGACCAAATG AGAATCCAGC ACACAAGAGT CAACTGGTGT GGATGGCATG CCATTCTGCC 1081 GCATTTGAAG ATCTAAGAGT ATTAAGCTTC ATCAAAGGGA CGAAGGTGCT CCCAAGAGGG 1141 AAGCTTTCCA CTAGAGGAGT TCAAATTGCT TCCAATGAAA ATATGGAGAC TATGGAATCA 1201 AGTACACTTG AACTGAGAAG CAGGTACTGG GCCATAAGGA CCAGAAGTGG AGGAAACACC 1261 AATCAACAGA GGGCATCTGC GGGCCAAATC AGCATACAAC CTACGTTCTC AGTACAGAGA 1321 AATCTCCCTT TTGACAGAAC AACCGTTATG GCAGCATTCA GTGGGAATAC AGAGGGGAGA 1381 ACATCTGACA TGAGGACCGA AATCATAAGG ATGATGGAAA GTGCAAGACC AGAAGATGTG 1441 TCTTTCCAGG GGCGGGGAGT CTTCGAGCTC TCGGACGAAA AGGCAGCGAG CCCGATCGTG 1501 CCTTCCTTTG ACATGAGTAA TGAAGGATCT TATTTCTTCG GAGACAATGC AGAGGAATAC 1561 GATAAT
[0056]The amino acid sequence of the eM2-NP fusion protein of Influenza A/PR/8/34/(H1N1), encoded by nucleotides 1 to 1566 of SEQ ID NO:6 is as follows, referred to herein as SEQ ID NO:7 (eM2 amino acid sequence underlined):
TABLE-US-00007 MSLLTEVETPIRNEWGCRCNGSSDMASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRF YIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRVNG KWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCS LMQGSTLPRRSGAAGAAVKGVGTMVMELVRMIKRGINDRNFWRGENGRKTRIAYERMCNILKG KFQTAAQKAMMDQVRESRNPGNAEFEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDF EREGYSLVGIDPFRLLQNSQVYSLIRPNENPAHKSQLVWMACHSAAFEDLRVLSFIKGTKVLP RGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQRASAGQISIQPTFSVQRN LPFDRTTVMAAFSGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKAASPIVPSFD MSNEGSYFFGDNAEEYDN
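The relationship between the eM2-NP nucleotide sequence and its encoded fusion protein can be illustrated by translating the 5' end of SEQ ID NO:6. The following is a minimal sketch in Python, assuming the standard genetic code; the names CODON_TABLE and translate are hypothetical, and only the first 90 nucleotides of SEQ ID NO:6 are used for brevity.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence; whitespace is ignored (hypothetical helper)."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Nucleotides 1 to 90 of SEQ ID NO:6: 72 nt of eM2 followed by 18 nt of NP.
FUSION_5PRIME = (
    "ATGAGTCTTC TAACCGAGGT CGAAACGCCT ATCAGAAACG AATGGGGGTG "
    "CAGATGCAAC GGTTCAAGTG ATATGGCGTC TCAAGGCACC"
)

protein = translate(FUSION_5PRIME)
print(protein[:24])  # eM2 portion: MSLLTEVETPIRNEWGCRCNGSSD
print(protein[24:])  # start of NP: MASQGT
```

The first 24 residues of the translation correspond to the eM2 portion of SEQ ID NO:7, and the residues that follow correspond to the N-terminus of NP (SEQ ID NO:2).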
[0057]A sequence, using the original influenza virus nucleotide sequences, which encodes NP in its entirety fused at its 3' end to a sequence which encodes the first 24 amino acids of M2 (NP-eM2) is referred to herein as SEQ ID NO:8:
TABLE-US-00008 ATGGCGTCTCAAGGCACCAAACGATCTTACGAACAGATGGAGACTGATGGAGAACGCCAGAATGCCACTG AAATCAGAGCATCCGTCGGAAAAATGATTGGTGGAATTGGACGATTCTACATCCAAATGTGCACCGAACT CAAACTCAGTGATTATGAGGGACGGTTGATCCAAAACAGCTTAACAATAGAGAGAATGGTGCTCTCTGCT TTTGACGAAAGGAGAAATAAATACCTTGAAGAACATCCCAGTGCGGGGAAAGATCCTAAGAAAACTGGAG GACCTATATACAGGAGAGTAAACGGAAAGTGGATGAGAGAACTCATCCTTTATGACAAAGAAGAAATAAG GCGAATCTGGCGCCAAGCTAATAATGGTGACGATGCAACGGCTGGTCTGACTCACATGATGATCTGGCAT TCCAATTTGAATGATGCAACTTATCAGAGGACAAGAGCTCTTGTTCGCACCGGAATGGATCCCAGGATGT GCTCTCTGATGCAAGGTTCAACTCTCCCTAGGAGGTCTGGAGCCGCAGGTGCTGCAGTCAAAGGAGTTGG AACAATGGTGATGGAATTGGTCAGAATGATCAAACGTGGGATCAATGATCGGAACTTCTGGAGGGGTGAG AATGGACGAAAAACAAGAATTGCTTATGAAAGAATGTGCAACATTCTCAAAGGGAAATTTCAAACTGCTG CACAAAAAGCAATGATGGATCAAGTGAGAGAGAGCCGGAACCCAGGGAATGCTGAGTTCGAAGATCTCAC TTTTCTAGCACGGTCTGCACTCATATTGAGAGGGTCGGTTGCTCACAAGTCCTGCCTGCCTGCCTGTGTG TATGGACCTGCCGTAGCCAGTGGGTACGACTTTGAAAGGGAGGGATACTCTCTAGTCGGAATAGACCCTT TCAGACTGCTTCAAAACAGCCAAGTGTACAGCCTAATCAGACCAAATGAGAATCCAGCACACAAGAGTCA ACTGGTGTGGATGGCATGCCATTCTGCCGCATTTGAAGATCTAAGAGTATTAAGCTTCATCAAAGGGACG AAGGTGCTCCCAAGAGGGAAGCTTTCCACTAGAGGAGTTCAAATTGCTTCCAATGAAAATATGGAGACTA TGGAATCAAGTACACTTGAACTGAGAAGCAGGTACTGGGCCATAAGGACCAGAAGTGGAGGAAACACCAA TCAACAGAGGGCATCTGCGGGCCAAATCAGCATACAACCTACGTTCTCAGTACAGAGAAATCTCCCTTTT GACAGAACAACCGTTATGGCAGCATTCAGTGGGAATACAGAGGGGAGAACATCTGACATGAGGACCGAAA TCATAAGGATGATGGAAAGTGCAAGACCAGAAGATGTGTCTTTCCAGGGGCGGGGAGTCTTCGAGCTCTC GGACGAAAAGGCAGCGAGCCCGATCGTGCCTTCCTTTGACATGAGTAATGAAGGATCTTATTTCTTCGGA GACAATGCAGAGGAATACGATAATATGAGTCTTCTAACCGAGGTCGAAACGCCTATCAGAAACGAATGGG GGTGCAGATGCAACGGTTCAAGTGAT
[0058]The amino acid sequence of the NP-eM2 fusion protein of Influenza A/PR/8/34/(H1N1), encoded by nucleotides 1 to 1566 of SEQ ID NO:8 is as follows, referred to herein as SEQ ID NO:9 (eM2 amino acid sequence underlined):
TABLE-US-00009 MASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLSDYEGRLIQNSLTI ERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRVNGKWMRELILYDKEEIRRIWRQANNG DDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTM VMELVRMIKRGINDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQKAMMDQVRESRNPGNAE FEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSL IRPNENPAHKSQLVWMACHSAAFEDLRVLSFIKGTKVLPRGKLSTRGVQIASNENMETMESST LELRSRYWAIRTRSGGNTNQQRASAGQISIQPTFSVQRNLPFDRTTVMAAFSGNTEGRTSDMR TEIIRMMESARPEDVSFQGRGVFELSDEKAASPIVPSFDMSNEGSYFFGDNAEEYDNMSLLTE VETPIRNEWGCRCNGSSD
[0059]The construction of functional fusion proteins often requires a linker sequence between the two fused fragments, in order to adopt an extended conformation to allow maximal flexibility. We used the program LINKER (Crasto, C. J. and Feng, J., Protein Engineering 13:309-312 (2000); program publicly available at http://chutney.med.yale.edu/linker/linker.html (visited Apr. 16, 2003)), which can automatically generate a set of linker sequences that are known to adopt extended conformations as determined by X-ray crystallography and NMR. Examples of suitable linkers to use in various eM2-NP or NP-eM2 fusion proteins are as follows:
TABLE-US-00010
1. GYNTRA (SEQ ID NO: 10)
2. FQMGET (SEQ ID NO: 11)
3. FDRVKHLK (SEQ ID NO: 12)
4. GRNTNGVIT (SEQ ID NO: 13)
5. VNEKTIPDHD (SEQ ID NO: 14)
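Placement of a linker between the two fused fragments can be sketched as simple sequence concatenation. The following Python example is illustrative only: the helper fuse is hypothetical, NP is truncated to its first 10 residues for brevity, and whether a given construct retains the initiator methionine of the downstream fragment is an assumption not stated in the text.

```python
# eM2 is the first 24 residues of SEQ ID NO:5.
EM2 = "MSLLTEVETPIRNEWGCRCNGSSD"
# Only the first 10 residues of NP (SEQ ID NO:2), truncated for brevity.
NP_NTERM = "MASQGTKRSY"

# Linker sequences listed above, keyed by sequence identifier.
LINKERS = {
    "SEQ ID NO:10": "GYNTRA",
    "SEQ ID NO:11": "FQMGET",
    "SEQ ID NO:12": "FDRVKHLK",
    "SEQ ID NO:13": "GRNTNGVIT",
    "SEQ ID NO:14": "VNEKTIPDHD",
}


def fuse(n_terminal, c_terminal, linker=""):
    """Join two polypeptide sequences with an optional linker (hypothetical helper)."""
    return n_terminal + linker + c_terminal


# eM2-linker-NP orientation; swap the first two arguments for NP-linker-eM2.
for seq_id, linker in LINKERS.items():
    print(seq_id, fuse(EM2, NP_NTERM, linker))
```

With an empty linker, fuse(EM2, NP_NTERM) reproduces the direct eM2-NP junction seen at the start of SEQ ID NO:7.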
[0060]The nucleotide sequence of the NP protein of Influenza B/LEE/40 is available as GenBank Accession Number K01395, and has the following sequence, referred to herein as SEQ ID NO:15:
TABLE-US-00011 1 ATGTCCAACA TGGATATTGA CAGTATAAAT ACCGGAACAA TCGATAAAAC ACCAGAAGAA 61 CTGACTCCCG GAACCAGTGG GGCAACCAGA CCAATCATCA AGCCAGCAAC CCTTGCTCCG 121 CCAAGCAACA AACGAACCCG AAATCCATCT CCAGAAAGGA CAACCACAAG CAGTGAAACC 181 GATATCGGAA GGAAAATCCA AAAGAAACAA ACCCCAACAG AGATAAAGAA GAGCGTCTAC 241 AAAATGGTGG TAAAACTGGG TGAATTCTAC AACCAGATGA TGGTCAAAGC TGGACTTAAT 301 GATGACATGG AAAGGAATCT AATTCAAAAT GCACAAGCTG TGGAGAGAAT CCTATTGGCT 361 GCAACTGATG ACAAGAAAAC TGAATACCAA AAGAAAAGGA ATGCCAGAGA TGTCAAAGAA 421 GGGAAGGAAG AAATAGACCA CAACAAGACA GGAGGCACCT TTTATAAGAT GGTAAGAGAT 481 GATAAAACCA TCTACTTCAG CCCTATAAAA ATTACCTTTT TAAAAGAAGA GGTGAAAACA 541 ATGTACAAGA CCACCATGGG GAGTGATGGT TTCAGTGGAC TAAATCACAT TATGATTGGA 601 CATTCACAGA TGAACGATGT CTGTTTCCAA AGATCAAAGG GACTGAAAAG GGTTGGACTT 661 GACCCTTCAT TAATCAGTAC TTTTGCCGGA AGCACACTAC CCAGAAGATC AGGTACAACT 721 GGTGTTGCAA TCAAAGGAGG TGGAACTTTA GTGGATGAAG CCATCCGATT TATAGGAAGA 781 GCAATGGCAG ACAGAGGGCT ACTGAGAGAC ATCAAGGCCA AGACGGCCTA TGAAAAGATT 841 CTTCTGAATC TGAAAAACAA GTGCTCTGCG CCGCAACAAA AGGCTCTAGT TGATCAAGTG 901 ATCGGAAGTA GGAACCCAGG GATTGCAGAC ATAGAAGACC TAACTCTGCT TGCCAGAAGC 961 ATGGTAGTTG TCAGACCCTC TGTAGCGAGC AAAGTGGTGC TTCCCATAAG CATTTATGCT 1021 AAAATACCTC AACTAGGATT CAATACCGAA GAATACTCTA TGGTTGGGTA TGAAGCCATG 1081 GCTCTTTATA ATATGGCAAC ACCTGTTTCC ATATTAAGAA TGGGAGATGA CGCAAAAGAT 1141 AAATCTCAAC TATTCTTCAT GTCGTGCTTC GGAGCTGCCT ATGAAGATCT AAGAGTGTTA 1201 TCTGCACTAA CGGGCACCGA ATTTAAGCCT AGATCAGCAC TAAAATGCAA GGGTTTCCAT 1261 GTCCCGGCTA AGGAGCAAGT AGAAGGAATG GGGGCAGCTC TGATGTCCAT CAAGCTTCAG 1321 TTCTGGGCCC CAATGACCAG ATCTGGAGGG AATGAAGTAA GTGGAGAAGG AGGGTCTGGT 1381 CAAATAAGTT GCAGCCCTGT GTTTGCAGTA GAAAGACCTA TTGCTCTAAG CAAGCAAGCT 1441 GTAAGAAGAA TGCTGTCAAT GAACGTTGAA GGACGTGATG CAGATGTCAA AGGAAATCTA 1501 CTCAAAATGA TGAATGATTC AATGGCAAAG AAAACCAGTG GAAATGCTTT CATTGGGAAG 1561 AAAATGTTTC AAATATCAGA CAAAAACAAA GTCAATCCCA TTGAGATTCC AATTAAGCAG 1621 ACCATCCCCA ATTTCTTCTT TGGGAGGGAC ACAGCAGAGG ATTATGATGA CCTCGATTAT 1681 TAA
[0061]The amino acid sequence of the NP protein of IBV B/LEE/40, encoded by nucleotides 1-1680 of SEQ ID NO:15 is as follows, referred to herein as SEQ ID NO:16:
TABLE-US-00012 MSNMDIDSINTGTIDKTPEELTPGTSGATRPIIKPATLAPPSNKRTRNPSPERTTTSSET DIGRKIQKKQTPTEIKKSVYKMVVKLGEFYNQMMVKAGLNDDMERNLIQNAQAVERILLA ATDDKKTEYQKKRNARDVKEGKEEIDHNKTGGTFYKMVRDDKTIYFSPIKITFLKEEVKT MYKTTMGSDGFSGLNHIMIGHSQMNDVCFQRSKGLKRVGLDPSLISTFAGSTLPRRSGTT GVAIKGGGTLVDEAIRFIGRAMADRGLLRDIKAKTAYEKILLNLKNKCSAPQQKALVDQV IGSRNPGIADIEDLTLLARSMVVVRPSVASKVVLPISIYAKIPQLGFNTEEYSMVGYEAM ALYNMATPVSILRMGDDAKDKSQLFFMSCFGAAYEDLRVLSALTGTEFKPRSALKCKGFH VPAKEQVEGMGAALMSIKLQFWAPMTRSGGNEVSGEGGSGQISCSPVFAVERPIALSKQA VRRMLSMNVEGRDADVKGNLLKMMNDSMAKKTSGNAFIGKKMFQISDKNKVNPIEIPIKQ TIPNFFFGRDTAEDYDDLDY
[0062]Non-limiting examples of nucleotide sequences encoding the IAV hemagglutinin (HA) are as follows. It should be noted that HA sequences vary significantly between IV subtypes. Virtually any nucleotide sequence encoding an IV HA is suitable for the present invention. In fact, HA sequences included in vaccines and therapeutic formulations of the present invention (discussed in more detail below) might change from year to year depending on the prevalent strain or strains of IV.
[0063]The partial nucleotide sequence of the HA protein of IAV A/New_York/1/18(H1N1) is available as GenBank Accession Number AF116576, and has the following sequence, referred to herein as SEQ ID NO:17:
TABLE-US-00013 1 atggaggcaa gactactggt cttgttatgt gcatttgcag ctacaaatgc agacacaata 61 tgtataggct accatgcgaa taactcaacc gacactgttg acacagtact cgaaaagaat 121 gtgaccgtga cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaaa 181 ttaaaaggaa tagccccatt acaattgggg aaatgtaata tcgccggatg gctcttggga 241 aacccggaat gcgatttact gctcacagcg agctcatggt cctatattgt agaaacatcg 301 aactcagaga atggaacatg ttacccagga gatttcatcg actatgaaga actgagggag 361 caattgagct cagtgtcatc gtttgaaaaa ttcgaaatat ttcccaagac aagctcgtgg 421 cccaatcatg aaacaaccaa aggtgtaacg gcagcatgct cctatgcggg agcaagcagt 481 ttttacagaa atttgctgtg gctgacaaag aagggaagct catacccaaa gcttagcaag 541 tcctatgtga acaataaagg gaaagaagtc cttgtactat ggggtgttca tcatccgcct 601 accggtactg atcaacagag tctctatcag aatgcagatg cttatgtctc tgtagggtca 661 tcaaaatata acaggagatt caccccggaa atagcagcga gacccaaagt aagaggtcaa 721 gctgggagga tgaactatta ctggacatta ctagaacccg gagacacaat aacatttgag 781 gcaactggaa atctaatagc accatggtat gctttcgcac tgaatagagg ttctggatcc 841 ggtatcatca cttcagacgc accagtgcat gattgtaaca cgaagtgtca aacaccccat 901 ggtgctataa acagcagtct ccctttccag aatatacatc cagtcacaat aggagagtgc 961 ccaaaatacg tcaggagtac caaattgagg atggctacag gactaagaaa cattccatct 1021 attcaatcca ggggtctatt tggagccatt gccggtttta ttgagggggg atggactgga 1081 atgatagatg gatggtatgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg 1141 gatcaaaaaa gcacacaaaa tgccattgac gggattacaa acaaggtgaa ttctgttatc 1201 gagaaaatga acacccaatt
[0064]The amino acid sequence of the partial HA protein of IAV A/New_York/1/18(H1N1), encoded by nucleotides 1 to 1218 of SEQ ID NO:17 is as follows, referred to herein as SEQ ID NO:18:
TABLE-US-00014 MEARLLVLLCAFAATNADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLL EDSHNGKLCKLKGIAPLQLGKCNIAGWLLGNPECDLLLTASSWSYIVETS NSENGTCYPGDFIDYEELREQLSSVSSFEKFEIFPKTSSWPNHETTKGVT AACSYAGASSFYRNLLWLTKKGSSYPKLSKSYVNNKGKEVLVLWGVHHPP TGTDQQSLYQNADAYVSVGSSKYNRRFTPEIAARPKVRGQAGRMNYYWTL LEPGDTITFEATGNLIAPWYAFALNRGSGSGIITSDAPVHDCNTKCQTPH GAINSSLPFQNIHPVTIGECPKYVRSTKLRMATGLRNIPSIQSRGLFGAI AGFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAIDGITNKVNSVI EKMNTQ
[0065]The nucleotide sequence of the IAV A/Hong Kong/482/97 hemagglutinin (H5) is available as GenBank Accession Number AF046098, and has the following sequence, referred to herein as SEQ ID NO:19:
TABLE-US-00015 1 ctgtcaaaat ggagaaaata gtgcttcttc ttgcaacagt cagtcttgtt aaaagtgatc 61 agatttgcat tggttaccat gcaaacaact cgacagagca ggttgacaca ataatggaaa 121 agaatgttac tgttacacat gcccaagaca tactggaaag gacacacaac gggaagctct 181 gcgatctaaa tggagtgaaa cctctcattt tgagggattg tagtgtagct ggatggctcc 241 tcggaaaccc tatgtgtgac gaattcatca atgtgccgga atggtcttac atagtggaga 301 aggccagtcc agccaatgac ctctgttatc cagggaattt caacgactat gaagaactga 361 aacacctatt gagcagaata aaccattttg agaaaattca gatcatcccc aaaagttctt 421 ggtccaatca tgatgcctca tcaggggtga gctcagcatg tccatacctt gggaggtcct 481 cctttttcag aaatgtggta tggcttatca aaaagaacag tgcataccca acaataaaga 541 ggagctacaa taataccaac caagaagatc ttttggtact gtgggggatt caccatccta 601 atgatgcggc agagcagaca aagctctatc aaaatccaac cacctacatt tccgttggaa 661 catcaacact gaaccagaga ttggttccag aaatagctac tagacccaaa gtaaacgggc 721 aaagtggaag aatggagttc ttctggacaa ttttaaagcc gaatgatgcc atcaatttcg 781 agagtaatgg aaatttcatt gccccagaat atgcatacaa aattgtcaag aaaggggact 841 caacaattat gaaaagtgaa ttggaatatg gtaactgcaa caccaagtgt caaactccaa 901 tgggggcgat aaactctagt atgccattcc acaacataca ccccctcacc atcggggaat 961 gccccaaata tgtgaaatca aacagattag ttcttgcgac tggactcaga aatacccctc 1021 aaagggagag aagaagaaaa aagagaggac tatttggagc tatagcaggt tttatagagg 1081 gaggatggca gggcatggta gatggttggt atgggtacca ccatagcaat gagcagggga 1141 gtggatacgc tgcagacaaa gaatccactc aaaaggcaat agatggagtc accaataagg 1201 tcaactcgat cattaacaaa atgaacactc agtttgaggc cgttggaagg gaatttaata 1261 acttagaaag gagaatagag aatttaaaca agaaaatgga agacggattc ctagatgtct 1321 ggacttacaa tgctgaactt ctggttctca tggaaaatga gagaactctc gactttcatg 1381 actcaaatgt caagaacctt tacgacaagg tccgactaca gcttagggat aatgcaaagg 1441 aactgggtaa tggttgtttc gaattctatc acaaatgtga taatgaatgt atggaaagtg 1501 taaaaaacgg aacgtatgac tacccgcagt attcagaaga agcaagacta aacagagagg 1561 aaataagtgg agtaaaattg gaatcaatgg gaacttacca aatactgtca atttattcaa 1621 cagtggcgag ttccctagca ctggcaatca tggtagctgg tctatcttta tggatgtgct 1681 ccaatggatc 
gttacaatgc agaatttgca tttaaatttg tgagttcaga ttgtagttaa 1741 a
[0066]The amino acid sequence of the HA protein of IAV A/Hong Kong/482/97 (H5), encoded by nucleotides 9 to 1715 of SEQ ID NO:19 is as follows, referred to herein as SEQ ID NO:20:
TABLE-US-00016 MEKIVLLLATVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILERTHNGKLCDLN GVKPLILRDCSVAGWLLGNPMCDEFINVPEWSYIVEKASPANDLCYPGNFNDYEELKHLLS RINHFEKIQIIPKSSWSNHDASSGVSSACPYLGRSSFFRNVVWLIKKNSAYPTIKRSYNNTNQ EDLLVLWGIHHPNDAAEQTKLYQNPTTYISVGTSTLNQRLVPEIATRPKVNGQSGRMEFFW TILKPNDAINFESNGNFIAPEYAYKIVKKGDSTIMKSELEYGNCNTKCQTPMGAINSSMPFH NIHPLTIGECPKYVKSNRLVLATGLRNTPQRERRRKKRGLFGAIAGFIEGGWQGMVDGWY GYHHSNEQGSGYAADKESTQKAIDGVTNKVNSIINKMNTQFEAVGREFNNLERRIENLNK KMEDGFLDVWTYNAELLVLMENERTLDFHDSNVKNLYDKVRLQLRDNAKELGNGCFEF YHKCDNECMESVKNGTYDYPQYSEEARLNREEISGVKLESMGTYQILSIYSTVASSLALAI MVAGLSLWMCSNGSLQCRICI
[0067]The nucleotide sequence of the HA protein of IAV A/Hong Kong/1073/99 (H9N2) is available as GenBank Accession Number INA404626, and has the following sequence, referred to herein as SEQ ID NO:21:
TABLE-US-00017 1 gcaaaagcag gggaattact taactagcaa aatggaaaca atatcactaa taactatact 61 actagtagta acagcaagca atgcagataa aatctgcatc ggccaccagt caacaaactc 121 cacagaaact gtggacacgc taacagaaac caatgttcct gtgacacatg ccaaagaatt 181 gctccacaca gagcataatg gaatgctgtg tgcaacaagc ctgggacatc ccctcattct 241 agacacatgc actattgaag gactagtcta tggcaaccct tcttgtgacc tgctgttggg 301 aggaagagaa tggtcctaca tcgtcgaaag atcatcagct gtaaatggaa cgtgttaccc 361 tgggaatgta gaaaacctag aggaactcag gacacttttt agttccgcta gttcctacca 421 aagaatccaa atcttcccag acacaacctg gaatgtgact tacactggaa caagcagagc 481 atgttcaggt tcattctaca ggagtatgag atggctgact caaaagagcg gtttttaccc 541 tgttcaagac gcccaataca caaataacag gggaaagagc attcttttcg tgtggggcat 601 acatcaccca cccacctata ccgagcaaac aaatttgtac ataagaaacg acacaacaac 661 aagcgtgaca acagaagatt tgaataggac cttcaaacca gtgatagggc caaggcccct 721 tgtcaatggt ctgcagggaa gaattgatta ttattggtcg gtactaaaac caggccaaac 781 attgcgagta cgatccaatg ggaatctaat tgctccatgg tatggacacg ttctttcagg 841 agggagccat ggaagaatcc tgaagactga tttaaaaggt ggtaattgtg tagtgcaatg 901 tcagactgaa aaaggtggct taaacagtac attgccattc cacaatatca gtaaatatgc 961 atttggaacc tgccccaaat atgtaagagt taatagtctc aaactggcag tcggtctgag 1021 gaacgtgcct gctagatcaa gtagaggact atttggagcc atagctggat tcatagaagg 1081 aggttggcca ggactagtcg ctggctggta tggtttccag cattcaaatg atcaaggggt 1141 tggtatggct gcagataggg attcaactca aaaggcaatt gataaaataa catccaaggt 1201 gaataatata gtcgacaaga tgaacaagca atatgaaata attgatcatg aattcagtga 1261 ggttgaaact agactcaata tgatcaataa taagattgat gaccaaatac aagacgtatg 1321 ggcatataat gcagaattgc tagtactact tgaaaatcaa aaaacactcg atgagcatga 1381 tgcgaacgtg aacaatctat ataacaaggt gaagagggca ctgggctcca atgctatgga 1441 agatgggaaa ggctgtttcg agctatacca taaatgtgat gatcagtgca tggaaacaat 1501 tcggaacggg acctataata ggagaaagta tagagaggaa tcaagactag aaaggcagaa 1561 aatagagggg gttaagctgg aatctgaggg aacttacaaa atcctcacca tttattcgac 1621 tgtcgcctca tctcttgtgc ttgcaatggg gtttgctgcc ttcctgttct gggccatgtc 1681 caatggatct 
tgcagatgca acatttgtat ataa
[0068]The amino acid sequence of the HA protein of IAV A/Hong Kong/1073/99 (H9N2), encoded by nucleotides 32 to 1711 of SEQ ID NO:21 is as follows, referred to herein as SEQ ID NO:22:
TABLE-US-00018 METISLITILLVVTASNADKICIGHQSTNSTETVDTLTETNVPVTHAKEL LHTEHNGMLCATSLGHPLILDTCTIEGLVYGNPSCDLLLGGREWSYIVER SSAVNGTCYPGNVENLEELRTLFSSASSYQRIQIFPDTTWNVTYTGTSRA CSGSFYRSMRWLTQKSGFYPVQDAQYTNNRGKSILFVWGIHHPPTYTEQT NLYIRNDTTTSVTTEDLNRTFKPVIGPRPLVNGLQGRIDYYWSVLKPGQT LRVRSNGNLIAPWYGHVLSGGSHGRILKTDLKGGNCVVQCQTEKGGLNST LPFHNISKYAFGTCPKYVRVNSLKLAVGLRNVPARSSRGLFGAIAGFIEG GWPGLVAGWYGFQHSNDQGVGMAADRDSTQKAIDKITSKVNNIVDKMNKQ YEIIDHEFSEVETRLNMINNKIDDQIQDVWAYNAELLVLLENQKTLDEHD ANVNNLYNKVKRALGSNAMEDGKGCFELYHKCDDQCMETIRNGTYNRRKY REESRLERQKIEGVKLESEGTYKILTIYSTVASSLVLAMGFAAFLFWAMS NGSCRCNICI
[0069]The present invention also provides vaccine compositions and methods for delivery of IV coding sequences to a vertebrate with optimal expression and safety conferred through codon optimization and/or other manipulations. These vaccine compositions are prepared and administered in such a manner that the encoded gene products are optimally expressed in the vertebrate of interest. As a result, these compositions and methods are useful in stimulating an immune response against IV infection. Also included in the invention are expression systems, delivery systems, and codon-optimized IV coding regions.
[0070]In a specific embodiment, the invention provides combinatorial polynucleotide (e.g., DNA) vaccines which combine both a polynucleotide vaccine and polypeptide (e.g., either a recombinant protein, a purified subunit protein, a viral vector expressing an isolated IV polypeptide, or in the form of an inactivated or attenuated IV vaccine) vaccine in a single formulation. The single formulation comprises an IV polypeptide-encoding polynucleotide vaccine as described herein, and optionally, an effective amount of a desired isolated IV polypeptide or fragment, variant, or derivative thereof. The polypeptide may exist in any form, for example, a recombinant protein, a purified subunit protein, a viral vector expressing an isolated IV polypeptide, or in the form of an inactivated or attenuated IV vaccine. The IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide vaccine may be identical to the isolated IV polypeptide or fragment, variant, or derivative thereof. Alternatively, the IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide may be different from the isolated IV polypeptide or fragment, variant, or derivative thereof.
[0071]It is to be noted that the term "a" or "an" entity refers to one or more of that entity; for example, "a polynucleotide," is understood to represent one or more polynucleotides. As such, the terms "a" (or "an"), "one or more," and "at least one" can be used interchangeably herein.
[0072]The term "polynucleotide" is intended to encompass a singular nucleic acid or nucleic acid fragment as well as plural nucleic acids or nucleic acid fragments, and refers to an isolated molecule or construct, e.g., a virus genome (e.g., a non-infectious viral genome), messenger RNA (mRNA), plasmid DNA (pDNA), or derivatives of pDNA (e.g., minicircles as described in Darquet, A-M et al., Gene Therapy 4:1341-1349 (1997)) comprising a polynucleotide. A polynucleotide may comprise a conventional phosphodiester bond or a non-conventional bond (e.g., an amide bond, such as found in peptide nucleic acids (PNA)).
[0073]The terms "nucleic acid" or "nucleic acid fragment" refer to any one or more nucleic acid segments, e.g., DNA or RNA fragments, present in a polynucleotide or construct. A nucleic acid or fragment thereof may be provided in linear (e.g., mRNA) or circular (e.g., plasmid) form as well as double-stranded or single-stranded forms. By "isolated" nucleic acid or polynucleotide is intended a nucleic acid molecule, DNA or RNA, which has been removed from its native environment. For example, a recombinant polynucleotide contained in a vector is considered isolated for the purposes of the present invention. Further examples of an isolated polynucleotide include recombinant polynucleotides maintained in heterologous host cells or purified (partially or substantially) polynucleotides in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the polynucleotides of the present invention. Isolated polynucleotides or nucleic acids according to the present invention further include such molecules produced synthetically.
[0074]As used herein, a "coding region" is a portion of nucleic acid which consists of codons translated into amino acids. Although a "stop codon" (TAG, TGA, or TAA) is not translated into an amino acid, it may be considered to be part of a coding region, but any flanking sequences, for example promoters, ribosome binding sites, transcriptional terminators, and the like, are not part of a coding region. Two or more nucleic acids or nucleic acid fragments of the present invention can be present in a single polynucleotide construct, e.g., on a single plasmid, or in separate polynucleotide constructs, e.g., on separate (different) plasmids. Furthermore, any nucleic acid or nucleic acid fragment may encode a single IV polypeptide or fragment, derivative, or variant thereof, or may encode more than one polypeptide, e.g., a nucleic acid may encode two or more polypeptides. In addition, a nucleic acid may include a regulatory element such as a promoter, ribosome binding site, or a transcription terminator, or may encode heterologous coding regions fused to the IV coding region, e.g., specialized elements or motifs, such as a secretory signal peptide or a heterologous functional domain.
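The definition of a coding region given above (codons translated into amino acids, optionally including the stop codon, but excluding flanking sequences) can be illustrated with a short Python sketch. The function coding_region and the toy sequence below are hypothetical and are not taken from any sequence disclosed herein.

```python
STOP_CODONS = {"TAG", "TGA", "TAA"}


def coding_region(dna, start):
    """Return the coding region beginning at 0-based index `start`,
    up to and including the first in-frame stop codon (hypothetical helper)."""
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    raise ValueError("no in-frame stop codon found")


# Toy example: 5' flank + coding region (ATG GCG TCT TAA) + 3' flank.
TOY = "CCGCC" + "ATGGCGTCTTAA" + "AGAAAAATACC"

start = TOY.find("ATG")        # position of the start codon
cds = coding_region(TOY, start)
print(cds)                     # ATGGCGTCTTAA
print(len(cds) // 3 - 1)       # 3 translated codons, excluding the stop codon
```

In this sketch the flanking sequences on either side are excluded from the returned region, while the stop codon is retained as its final codon.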
[0075]The terms "fragment," "variant," "derivative" and "analog" when referring to IV polypeptides of the present invention include any polypeptides which retain at least some of the immunogenicity or antigenicity of the corresponding native polypeptide. Fragments of IV polypeptides of the present invention include proteolytic fragments, deletion fragments and in particular, fragments of IV polypeptides which exhibit increased secretion from the cell or higher immunogenicity or reduced pathogenicity when delivered to an animal. Polypeptide fragments further include any portion of the polypeptide which comprises an antigenic or immunogenic epitope of the native polypeptide, including linear as well as three-dimensional epitopes. Variants of IV polypeptides of the present invention include fragments as described above, and also polypeptides with altered amino acid sequences due to amino acid substitutions, deletions, or insertions. Variants may occur naturally, such as an allelic variant. By an "allelic variant" is intended alternate forms of a gene occupying a given locus on a chromosome or genome of an organism or virus. See Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985), which is incorporated herein by reference. As used herein, the term "variant" also refers to variations in a given gene product. For example, when referring to IV NA or HA proteins, each such protein is a "variant," in that native IV strains are distinguished by the type of NA and HA proteins encoded by the virus. However, within a single HA or NA variant type, further naturally or non-naturally occurring variations such as amino acid deletions, insertions or substitutions may occur. Non-naturally occurring variants may be produced using art-known mutagenesis techniques. Variant polypeptides may comprise conservative or non-conservative amino acid substitutions, deletions or additions. Derivatives of IV polypeptides of the present invention are polypeptides which have been altered so as to exhibit additional features not found on the native polypeptide. Examples include fusion proteins. An analog is another form of an IV polypeptide of the present invention. An example is a proprotein which can be activated by cleavage of the proprotein to produce an active mature polypeptide.
[0076]The terms "infectious polynucleotide" or "infectious nucleic acid" are intended to encompass isolated viral polynucleotides and/or nucleic acids which are solely sufficient to mediate the synthesis of complete infectious virus particles upon uptake by permissive cells. Thus, "infectious nucleic acids" do not require pre-synthesized copies of any of the polypeptides they encode, e.g., viral replicases, in order to initiate their replication cycle in a permissive host cell.
[0077]The terms "non-infectious polynucleotide" or "non-infectious nucleic acid" as defined herein are polynucleotides or nucleic acids which cannot, without additional added materials, e.g., polypeptides, mediate the synthesis of complete infectious virus particles upon uptake by permissive cells. An infectious polynucleotide or nucleic acid is not made "non-infectious" simply because it is taken up by a non-permissive cell. For example, an infectious viral polynucleotide from a virus with limited host range is infectious if it is capable of mediating the synthesis of complete infectious virus particles when taken up by cells derived from a permissive host (i.e., a host permissive for the virus itself). The fact that uptake by cells derived from a non-permissive host does not result in the synthesis of complete infectious virus particles does not make the nucleic acid "non-infectious." In other words, the term is not qualified by the nature of the host cell, the tissue type, or the species taking up the polynucleotide or nucleic acid fragment.
[0078]In some cases, an isolated infectious polynucleotide or nucleic acid may produce fully-infectious virus particles in a host cell population which lacks receptors for the virus particles, i.e., is non-permissive for virus entry. Thus viruses produced will not infect surrounding cells. However, if the supernatant containing the virus particles is transferred to cells which are permissive for the virus, infection will take place.
[0079]The terms "replicating polynucleotide" or "replicating nucleic acid" are meant to encompass those polynucleotides and/or nucleic acids which, upon being taken up by a permissive host cell, are capable of producing multiple, e.g., one or more copies of the same polynucleotide or nucleic acid. Infectious polynucleotides and nucleic acids are a subset of replicating polynucleotides and nucleic acids; the terms are not synonymous. For example, a defective virus genome lacking the genes for virus coat proteins may replicate, e.g., produce multiple copies of itself, but is NOT infectious because it is incapable of mediating the synthesis of complete infectious virus particles unless the coat proteins, or another nucleic acid encoding the coat proteins, are exogenously provided.
[0080]In certain embodiments, the polynucleotide, nucleic acid, or nucleic acid fragment is DNA. In the case of DNA, a polynucleotide comprising a nucleic acid which encodes a polypeptide normally also comprises a promoter and/or other transcription or translation control elements operably associated with the polypeptide-encoding nucleic acid fragment. An operable association is when a nucleic acid fragment encoding a gene product, e.g., a polypeptide, is associated with one or more regulatory sequences in such a way as to place expression of the gene product under the influence or control of the regulatory sequence(s). Two DNA fragments (such as a polypeptide-encoding nucleic acid fragment and a promoter associated with the 5' end of the nucleic acid fragment) are "operably associated" if induction of promoter function results in the transcription of mRNA encoding the desired gene product and if the nature of the linkage between the two DNA fragments does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the expression regulatory sequences to direct the expression of the gene product, or (3) interfere with the ability of the DNA template to be transcribed. Thus, a promoter region would be operably associated with a nucleic acid fragment encoding a polypeptide if the promoter was capable of effecting transcription of that nucleic acid fragment. The promoter may be a cell-specific promoter that directs substantial transcription of the DNA only in predetermined cells. Other transcription control elements, besides a promoter, for example enhancers, operators, repressors, and transcription termination signals, can be operably associated with the polynucleotide to direct cell-specific transcription. Suitable promoters and other transcription control regions are disclosed herein.
[0081]A variety of transcription control regions are known to those skilled in the art. These include, without limitation, transcription control regions which function in vertebrate cells, such as, but not limited to, promoter and enhancer segments from cytomegaloviruses (the immediate early promoter, in conjunction with intron-A), simian virus 40 (the early promoter), and retroviruses (such as Rous sarcoma virus). Other transcription control regions include those derived from vertebrate genes such as actin, heat shock protein, bovine growth hormone and rabbit β-globin, as well as other sequences capable of controlling gene expression in eukaryotic cells. Additional suitable transcription control regions include tissue-specific promoters and enhancers as well as lymphokine-inducible promoters (e.g., promoters inducible by interferons or interleukins).
[0082]Similarly, a variety of translation control elements are known to those of ordinary skill in the art. These include, but are not limited to, ribosome binding sites, translation initiation and termination codons, and elements from picornaviruses (particularly an internal ribosome entry site, or IRES, also referred to as a CITE sequence).
[0083]A DNA polynucleotide of the present invention may be a circular or linearized plasmid or vector, or other linear DNA which may also be non-infectious and nonintegrating (i.e., does not integrate into the genome of vertebrate cells). A linearized plasmid is a plasmid that was previously circular but has been linearized, for example, by digestion with a restriction endonuclease. Linear DNA may be advantageous in certain situations as discussed, e.g., in Cherng, J. Y., et al., J. Control. Release 60:343-53 (1999), and Chen, Z. Y., et al. Mol. Ther. 3:403-10 (2001), both of which are incorporated herein by reference. As used herein, the terms plasmid and vector can be used interchangeably.
[0084]Alternatively, DNA virus genomes may be used to administer DNA polynucleotides into vertebrate cells. In certain embodiments, a DNA virus genome of the present invention is nonreplicative, noninfectious, and/or nonintegrating. Suitable DNA virus genomes include without limitation, herpesvirus genomes, adenovirus genomes, adeno-associated virus genomes, and poxvirus genomes. References citing methods for the in vivo introduction of non-infectious virus genomes to vertebrate tissues are well known to those of ordinary skill in the art, and are cited supra.
[0085]In other embodiments, a polynucleotide of the present invention is RNA, for example, in the form of messenger RNA (mRNA). Methods for introducing RNA sequences into vertebrate cells are described in U.S. Pat. No. 5,580,859, the disclosure of which is incorporated herein by reference in its entirety.
[0086]Polynucleotides, nucleic acids, and nucleic acid fragments of the present invention may be associated with additional nucleic acids which encode secretory or signal peptides, which direct the secretion of a polypeptide encoded by a nucleic acid fragment or polynucleotide of the present invention. According to the signal hypothesis, proteins secreted by mammalian cells have a signal peptide or secretory leader sequence which is cleaved from the mature protein once export of the growing protein chain across the rough endoplasmic reticulum has been initiated. Those of ordinary skill in the art are aware that polypeptides secreted by vertebrate cells generally have a signal peptide fused to the N-terminus of the polypeptide, which is cleaved from the complete or "full length" polypeptide to produce a secreted or "mature" form of the polypeptide. In certain embodiments, the native leader sequence is used, or a functional derivative of that sequence that retains the ability to direct the secretion of the polypeptide that is operably associated with it. Alternatively, a heterologous mammalian leader sequence, or a functional derivative thereof, may be used. For example, the wild-type leader sequence may be substituted with the leader sequence of human tissue plasminogen activator (TPA) or mouse β-glucuronidase.
[0087]In accordance with one aspect of the present invention, there is provided a polynucleotide construct, for example, a plasmid, comprising a nucleic acid fragment, where the nucleic acid fragment is a fragment of a codon-optimized coding region operably encoding an IV-derived polypeptide, where the coding region is optimized for expression in vertebrate cells of a desired vertebrate species, e.g., humans, to be delivered to a vertebrate to be treated or immunized. Suitable IV polypeptides, or fragments, variants, or derivatives thereof may be derived from, but are not limited to, the IV HA, NA, NP, M1, or M2 proteins. Additional IV-derived coding sequences, e.g., coding for HA, NA, NP, M1, M2 or eM2, may also be included on the plasmid, or on a separate plasmid, and expressed, either using native IV codons or codons optimized for expression in the vertebrate to be treated or immunized. When such a plasmid encoding one or more optimized influenza sequences is delivered in vivo to a tissue of the vertebrate to be treated or immunized, one or more of the encoded gene products will be expressed, i.e., transcribed and translated. The level of expression of the gene product(s) will depend to a significant extent on the strength of the associated promoter and the presence and activation of an associated enhancer element, as well as the degree of optimization of the coding region.
[0088]As used herein, the term "plasmid" refers to a construct made up of genetic material (i.e., nucleic acids). Typically a plasmid contains an origin of replication which is functional in bacterial host cells, e.g., Escherichia coli, and selectable markers for detecting bacterial host cells comprising the plasmid. Plasmids of the present invention may include genetic elements as described herein arranged such that an inserted coding sequence can be transcribed and translated in eukaryotic cells. Also, the plasmid may include a sequence from a viral nucleic acid. However, such viral sequences normally are not sufficient to direct or allow the incorporation of the plasmid into a viral particle, and the plasmid is therefore a non-viral vector. In certain embodiments described herein, a plasmid is a closed circular DNA molecule.
[0089]The term "expression" refers to the biological production of a product encoded by a coding sequence. In most cases a DNA sequence, including the coding sequence, is transcribed to form a messenger-RNA (mRNA). The messenger-RNA is then translated to form a polypeptide product which has a relevant biological activity. Also, the process of expression may involve further processing steps to the RNA product of transcription, such as splicing to remove introns, and/or post-translational processing of a polypeptide product.
[0090]As used herein, the term "polypeptide" is intended to encompass a singular "polypeptide" as well as plural "polypeptides," and comprises any chain or chains of two or more amino acids. Thus, as used herein, terms including, but not limited to "peptide," "dipeptide," "tripeptide," "protein," "amino acid chain," or any other term used to refer to a chain or chains of two or more amino acids, are included in the definition of a "polypeptide," and the term "polypeptide" can be used instead of, or interchangeably with any of these terms. The term further includes polypeptides which have undergone post-translational modifications, for example, glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids.
[0091]Also included as polypeptides of the present invention are fragments, derivatives, analogs, or variants of the foregoing polypeptides, and any combination thereof. Polypeptides, and fragments, derivatives, analogs, or variants thereof of the present invention can be antigenic and immunogenic polypeptides related to IV polypeptides, which are used to prevent or treat, i.e., cure, ameliorate, lessen the severity of, or prevent or reduce contagion of infectious disease caused by the IV.
[0092]As used herein, an "antigenic polypeptide" or an "immunogenic polypeptide" is a polypeptide which, when introduced into a vertebrate, reacts with the vertebrate's immune system molecules, i.e., is antigenic, and/or induces an immune response in the vertebrate, i.e., is immunogenic. It is quite likely that an immunogenic polypeptide will also be antigenic, but an antigenic polypeptide, because of its size or conformation, may not necessarily be immunogenic. Examples of antigenic and immunogenic polypeptides of the present invention include, but are not limited to, HA or fragments or variants thereof; NP or fragments thereof; PB1 or fragments or variants thereof; NS1 or fragments or variants thereof; M1 or fragments or variants thereof; M2 or fragments or variants thereof, including the extracellular fragment of M2 (eM2); and any of the foregoing polypeptides or fragments fused to a heterologous polypeptide, for example, a hepatitis B core antigen. Isolated antigenic and immunogenic polypeptides of the present invention, in addition to those encoded by polynucleotides of the invention, may be provided as a recombinant protein, a purified subunit, a viral vector expressing the protein, or in the form of an inactivated or attenuated IV vaccine, e.g., a live-attenuated virus vaccine, a heat-killed virus vaccine, etc.
[0093]By an "isolated" IV polypeptide or a fragment, variant, or derivative thereof is intended an IV polypeptide or protein that is not in its natural form. No particular level of purification is required. For example, an isolated IV polypeptide can be removed from its native or natural environment. Recombinantly produced IV polypeptides and proteins expressed in host cells are considered isolated for purposes of the invention, as are native or recombinant IV polypeptides which have been separated, fractionated, or partially or substantially purified by any suitable technique, including the separation of IV virions from eggs or culture cells in which they have been propagated. In addition, an isolated IV polypeptide or protein can be provided as a live or inactivated viral vector expressing an isolated IV polypeptide and can include those found in inactivated IV vaccine compositions. Thus, isolated IV polypeptides and proteins can be provided as, for example, recombinant IV polypeptides, a purified subunit of IV, a viral vector expressing an isolated IV polypeptide, or in the form of an inactivated or attenuated IV vaccine.
[0094]The term "epitopes," as used herein, refers to portions of a polypeptide having antigenic or immunogenic activity in a vertebrate, for example a human. An "immunogenic epitope," as used herein, is defined as a portion of a protein that elicits an immune response in an animal, as determined by any method known in the art. The term "antigenic epitope," as used herein, is defined as a portion of a protein to which an antibody or T-cell receptor can immunospecifically bind as determined by any method well known in the art. Immunospecific binding excludes non-specific binding but does not exclude cross-reactivity with other antigens. Whereas all immunogenic epitopes are antigenic, antigenic epitopes need not be immunogenic.
[0095]The term "immunogenic carrier" as used herein refers to a first polypeptide or fragment, variant, or derivative thereof which enhances the immunogenicity of a second polypeptide or fragment, variant, or derivative thereof. Typically, an "immunogenic carrier" is fused to or conjugated to the desired polypeptide or fragment thereof. An example of an "immunogenic carrier" is a recombinant hepatitis B core antigen expressing, as a surface epitope, an immunogenic epitope of interest. See, e.g., European Patent No. EP 0385610 B1, which is incorporated herein by reference in its entirety.
[0096]In the present invention, antigenic epitopes preferably contain a sequence of at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, or between about 8 and about 30 amino acids contained within the amino acid sequence of an IV polypeptide of the invention, e.g., an NP polypeptide, an M1 polypeptide or an M2 polypeptide. Certain polypeptides comprising immunogenic or antigenic epitopes are at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acid residues in length. Antigenic as well as immunogenic epitopes may be linear, i.e., composed of contiguous amino acids in a polypeptide, or may be three-dimensional, i.e., composed of non-contiguous amino acids which come together due to the secondary or tertiary structure of the polypeptide, thereby forming an epitope.
[0097]As to the selection of peptides or polypeptides bearing an antigenic epitope (e.g., that contain a region of a protein molecule to which an antibody or T cell receptor can bind), it is well known in the art that relatively short synthetic peptides that mimic part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein. See, e.g., Sutcliffe, J. G., et al., Science 219:660-666 (1983), which is herein incorporated by reference.
[0098]Peptides capable of eliciting an immunogenic response are frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and are confined neither to immunodominant regions of intact proteins nor to the amino or carboxyl terminals. Peptides that are extremely hydrophobic and those of six or fewer residues generally are ineffective at inducing antibodies that bind to the mimicked protein; longer peptides, especially those containing proline residues, usually are effective. Sutcliffe et al., supra, at 661. For instance, 18 of 20 peptides designed according to these guidelines, containing 8-39 residues covering 75% of the sequence of the IV hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with the HA1 protein or intact virus; and 12/12 peptides from the MuLV polymerase and 18/18 from the rabies glycoprotein induced antibodies that precipitated the respective proteins.
Codon Optimization
[0099]"Codon optimization" is defined as modifying a nucleic acid sequence for enhanced expression in the cells of the vertebrate of interest, e.g., human, by replacing at least one, more than one, or a significant number of codons of the native sequence with codons that are more frequently or most frequently used in the genes of that vertebrate. Various species exhibit particular bias for certain codons of a particular amino acid.
[0100]In one aspect, the present invention relates to polynucleotides comprising nucleic acid fragments of codon-optimized coding regions which encode IV polypeptides, or fragments, variants, or derivatives thereof, with the codon usage adapted for optimized expression in the cells of a given vertebrate, e.g., humans. These polynucleotides are prepared by incorporating codons preferred for use in the genes of the vertebrate of interest into the DNA sequence. Also provided are polynucleotide expression constructs, vectors, and host cells comprising nucleic acid fragments of codon-optimized coding regions which encode IV polypeptides, and fragments, variants, or derivatives thereof, and various methods of using the polynucleotide expression constructs, vectors, and host cells to treat or prevent influenza disease in a vertebrate.
[0101]As used herein the term "codon-optimized coding region" means a nucleic acid coding region that has been adapted for expression in the cells of a given vertebrate by replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that vertebrate.
[0102]Deviations in the nucleotide sequence that comprise the codons encoding the amino acids of any polypeptide chain allow for variations in the sequence coding for the gene. Since each codon consists of three nucleotides, and the nucleotides comprising DNA are restricted to four specific bases, there are 64 possible combinations of nucleotides, 61 of which encode amino acids (the remaining three codons encode signals ending translation). The "genetic code" which shows which codons encode which amino acids is reproduced herein as Table 1. As a result, many amino acids are designated by more than one codon. For example, the amino acids alanine and proline are coded for by four triplets, serine and arginine by six, whereas tryptophan and methionine are coded by just one triplet. This degeneracy allows for DNA base composition to vary over a wide range without altering the amino acid sequence of the proteins encoded by the DNA.
TABLE 1. The Standard Genetic Code

(Rows: first base of the codon. Columns: second base. Ter = translation termination signal.)

       T              C              A              G
T      TTT Phe (F)    TCT Ser (S)    TAT Tyr (Y)    TGT Cys (C)
       TTC Phe (F)    TCC Ser (S)    TAC Tyr (Y)    TGC Cys (C)
       TTA Leu (L)    TCA Ser (S)    TAA Ter        TGA Ter
       TTG Leu (L)    TCG Ser (S)    TAG Ter        TGG Trp (W)
C      CTT Leu (L)    CCT Pro (P)    CAT His (H)    CGT Arg (R)
       CTC Leu (L)    CCC Pro (P)    CAC His (H)    CGC Arg (R)
       CTA Leu (L)    CCA Pro (P)    CAA Gln (Q)    CGA Arg (R)
       CTG Leu (L)    CCG Pro (P)    CAG Gln (Q)    CGG Arg (R)
A      ATT Ile (I)    ACT Thr (T)    AAT Asn (N)    AGT Ser (S)
       ATC Ile (I)    ACC Thr (T)    AAC Asn (N)    AGC Ser (S)
       ATA Ile (I)    ACA Thr (T)    AAA Lys (K)    AGA Arg (R)
       ATG Met (M)    ACG Thr (T)    AAG Lys (K)    AGG Arg (R)
G      GTT Val (V)    GCT Ala (A)    GAT Asp (D)    GGT Gly (G)
       GTC Val (V)    GCC Ala (A)    GAC Asp (D)    GGC Gly (G)
       GTA Val (V)    GCA Ala (A)    GAA Glu (E)    GGA Gly (G)
       GTG Val (V)    GCG Ala (A)    GAG Glu (E)    GGG Gly (G)
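The codon counts quoted in paragraph [0102] can be checked against Table 1 with a short script. The following is a minimal illustrative sketch in Python and is not part of the original disclosure; the names `BASES`, `AMINO_ACIDS`, `build_genetic_code`, and `DEGENERACY` are hypothetical. It rebuilds the standard genetic code from a 64-letter string listed in T, C, A, G order and tallies how many codons designate each amino acid.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# One letter per codon in the order TTT, TTC, TTA, TTG, TCT, ..., GGG.
# '*' marks a translation termination signal (Ter in Table 1).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_genetic_code():
    """Return a dict mapping each of the 64 DNA codons to a one-letter amino acid."""
    codons = ("".join(triplet) for triplet in product(BASES, repeat=3))
    return dict(zip(codons, AMINO_ACIDS))


GENETIC_CODE = build_genetic_code()
# Number of codons designating each amino acid (or Ter).
DEGENERACY = Counter(GENETIC_CODE.values())

if __name__ == "__main__":
    sense = len(GENETIC_CODE) - DEGENERACY["*"]
    print(f"{len(GENETIC_CODE)} codons, {sense} encode amino acids")
    for letter in "APSRWM":
        print(letter, DEGENERACY[letter])
```

Run as a script, this prints the total and sense codon counts followed by the degeneracy of alanine, proline, serine, arginine, tryptophan, and methionine, which can be compared with the figures given in the text.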
[0103]Many organisms display a bias for use of particular codons to code for insertion of a particular amino acid in a growing peptide chain. Codon preference or codon bias, differences in codon usage between organisms, is afforded by degeneracy of the genetic code, and is well documented among many organisms. Codon bias often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, inter alia, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.
[0104]Given the large number of gene sequences available for a wide variety of animal, plant and microbial species, it is possible to calculate the relative frequencies of codon usage. Codon usage tables are readily available, for example, at the "Codon Usage Database" available at http://www.kazusa.or.jp/codon/ (visited Jul. 9, 2002), and these tables can be adapted in a number of ways. See Nakamura, Y., et al. "Codon usage tabulated from the international DNA sequence databases: status for the year 2000" Nucl. Acids Res. 28:292 (2000), which is incorporated by reference. As examples, the codon usage tables for human, mouse, domestic cat, and cow, calculated from GenBank Release 128.0 (15 Feb. 2002), are reproduced below as Tables 2-5. These Tables use mRNA nomenclature, and so instead of thymine (T) which is found in DNA, the Tables use uracil (U) which is found in RNA. The Tables have been adapted so that frequencies are calculated for each amino acid, rather than for all 64 codons.
TABLE 2. Codon Usage Table for Human Genes (Homo sapiens)

Amino Acid  Codon   Number    Frequency
Phe         UUU     326146    0.4525
Phe         UUC     394680    0.5475
Total               720826
Leu         UUA     139249    0.0728
Leu         UUG     242151    0.1266
Leu         CUU     246206    0.1287
Leu         CUC     374262    0.1956
Leu         CUA     133980    0.0700
Leu         CUG     777077    0.4062
Total               1912925
Ile         AUU     303721    0.3554
Ile         AUC     414483    0.4850
Ile         AUA     136399    0.1596
Total               854603
Met         AUG     430946    1.0000
Total               430946
Val         GUU     210423    0.1773
Val         GUC     282445    0.2380
Val         GUA     134991    0.1137
Val         GUG     559044    0.4710
Total               1186903
Ser         UCU     282407    0.1840
Ser         UCC     336349    0.2191
Ser         UCA     225963    0.1472
Ser         UCG     86761     0.0565
Ser         AGU     230047    0.1499
Ser         AGC     373362    0.2433
Total               1534889
Pro         CCU     333705    0.2834
Pro         CCC     386462    0.3281
Pro         CCA     322220    0.2736
Pro         CCG     135317    0.1149
Total               1177704
Thr         ACU     247913    0.2419
Thr         ACC     371420    0.3624
Thr         ACA     285655    0.2787
Thr         ACG     120022    0.1171
Total               1025010
Ala         GCU     360146    0.2637
Ala         GCC     551452    0.4037
Ala         GCA     308034    0.2255
Ala         GCG     146233    0.1071
Total               1365865
Tyr         UAU     232240    0.4347
Tyr         UAC     301978    0.5653
Total               534218
His         CAU     201389    0.4113
His         CAC     288200    0.5887
Total               489589
Gln         CAA     227742    0.2541
Gln         CAG     668391    0.7459
Total               896133
Asn         AAU     322271    0.4614
Asn         AAC     376210    0.5386
Total               698481
Lys         AAA     462660    0.4212
Lys         AAG     635755    0.5788
Total               1098415
Asp         GAU     430744    0.4613
Asp         GAC     502940    0.5387
Total               933684
Glu         GAA     561277    0.4161
Glu         GAG     787712    0.5839
Total               1348989
Cys         UGU     190962    0.4468
Cys         UGC     236400    0.5532
Total               427362
Trp         UGG     248083    1.0000
Total               248083
Arg         CGU     90899     0.0830
Arg         CGC     210931    0.1927
Arg         CGA     122555    0.1120
Arg         CGG     228970    0.2092
Arg         AGA     221221    0.2021
Arg         AGG     220119    0.2011
Total               1094695
Gly         GGU     209450    0.1632
Gly         GGC     441320    0.3438
Gly         GGA     315726    0.2459
Gly         GGG     317263    0.2471
Total               1283759
Stop        UAA     13963
Stop        UAG     10631
Stop        UGA     24607
TABLE 3. Codon Usage Table for Mouse Genes (Mus musculus)

Amino Acid  Codon   Number    Frequency
Phe         UUU     150467    0.4321
Phe         UUC     197795    0.5679
Total               348262
Leu         UUA     55635     0.0625
Leu         UUG     116210    0.1306
Leu         CUU     114699    0.1289
Leu         CUC     179248    0.2015
Leu         CUA     69237     0.0778
Leu         CUG     354743    0.3987
Total               889772
Ile         AUU     137513    0.3367
Ile         AUC     208533    0.5106
Ile         AUA     62349     0.1527
Total               408395
Met         AUG     204546    1.0000
Total               204546
Val         GUU     93754     0.1673
Val         GUC     140762    0.2513
Val         GUA     64417     0.1150
Val         GUG     261308    0.4664
Total               560241
Ser         UCU     139576    0.1936
Ser         UCC     160313    0.2224
Ser         UCA     100524    0.1394
Ser         UCG     38632     0.0536
Ser         AGU     108413    0.1504
Ser         AGC     173518    0.2407
Total               720976
Pro         CCU     162613    0.3036
Pro         CCC     164796    0.3077
Pro         CCA     151091    0.2821
Pro         CCG     57032     0.1065
Total               535532
Thr         ACU     119832    0.2472
Thr         ACC     172415    0.3556
Thr         ACA     140420    0.2896
Thr         ACG     52142     0.1076
Total               484809
Ala         GCU     178593    0.2905
Ala         GCC     236018    0.3839
Ala         GCA     139697    0.2272
Ala         GCG     60444     0.0983
Total               614752
Tyr         UAU     108556    0.4219
Tyr         UAC     148772    0.5781
Total               257328
His         CAU     88786     0.3973
His         CAC     134705    0.6027
Total               223491
Gln         CAA     101783    0.2520
Gln         CAG     302064    0.7480
Total               403847
Asn         AAU     138868    0.4254
Asn         AAC     187541    0.5746
Total               326409
Lys         AAA     188707    0.3839
Lys         AAG     302799    0.6161
Total               491506
Asp         GAU     189372    0.4414
Asp         GAC     239670    0.5586
Total               429042
Glu         GAA     235842    0.4015
Glu         GAG     351582    0.5985
Total               587424
Cys         UGU     97385     0.4716
Cys         UGC     109130    0.5284
Total               206515
Trp         UGG     112588    1.0000
Total               112588
Arg         CGU     41703     0.0863
Arg         CGC     86351     0.1787
Arg         CGA     58928     0.1220
Arg         CGG     92277     0.1910
Arg         AGA     101029    0.2091
Arg         AGG     102859    0.2129
Total               483147
Gly         GGU     103673    0.1750
Gly         GGC     198604    0.3352
Gly         GGA     151497    0.2557
Gly         GGG     138700    0.2341
Total               592474
Stop        UAA     5499
Stop        UAG     4661
Stop        UGA     10356
TABLE 4. Codon Usage Table for Domestic Cat Genes (Felis catus)

Amino Acid  Codon   Number    Frequency of usage
Phe         UUU     1204      0.4039
Phe         UUC     1777      0.5961
Total               2981
Leu         UUA     404       0.0570
Leu         UUG     857       0.1209
Leu         CUU     791       0.1116
Leu         CUC     1513      0.2135
Leu         CUA     488       0.0688
Leu         CUG     3035      0.4282
Total               7088
Ile         AUU     1018      0.2984
Ile         AUC     1835      0.5380
Ile         AUA     558       0.1636
Total               3411
Met         AUG     1553      1.0000
Total               1553
Val         GUU     696       0.1512
Val         GUC     1279      0.2779
Val         GUA     463       0.1006
Val         GUG     2164      0.4702
Total               4602
Ser         UCU     940       0.1875
Ser         UCC     1260      0.2513
Ser         UCA     608       0.1213
Ser         UCG     332       0.0662
Ser         AGU     672       0.1340
Ser         AGC     1202      0.2397
Total               5014
Pro         CCU     958       0.2626
Pro         CCC     1375      0.3769
Pro         CCA     850       0.2330
Pro         CCG     465       0.1275
Total               3648
Thr         ACU     822       0.2127
Thr         ACC     1574      0.4072
Thr         ACA     903       0.2336
Thr         ACG     566       0.1464
Total               3865
Ala         GCU     1129      0.2496
Ala         GCC     1951      0.4313
Ala         GCA     883       0.1952
Ala         GCG     561       0.1240
Total               4524
Tyr         UAU     837       0.3779
Tyr         UAC     1378      0.6221
Total               2215
His         CAU     594       0.3738
His         CAC     995       0.6262
Total               1589
Gln         CAA     747       0.2783
Gln         CAG     1937      0.7217
Total               2684
Asn         AAU     1109      0.3949
Asn         AAC     1699      0.6051
Total               2808
Lys         AAA     1445      0.4088
Lys         AAG     2090      0.5912
Total               3535
Asp         GAU     1255      0.4055
Asp         GAC     1840      0.5945
Total               3095
Glu         GAA     1637      0.4164
Glu         GAG     2294      0.5836
Total               3931
Cys         UGU     719       0.4425
Cys         UGC     906       0.5575
Total               1625
Trp         UGG     1073      1.0000
Total               1073
Arg         CGU     236       0.0700
Arg         CGC     629       0.1865
Arg         CGA     354       0.1050
Arg         CGG     662       0.1963
Arg         AGA     712       0.2112
Arg         AGG     779       0.2310
Total               3372
Gly         GGU     648       0.1498
Gly         GGC     1536      0.3551
Gly         GGA     1065      0.2462
Gly         GGG     1077      0.2490
Total               4326
Stop        UAA     55
Stop        UAG     36
Stop        UGA     110
TABLE 5. Codon Usage Table for Cow Genes (Bos taurus)

Amino Acid  Codon   Number    Frequency of usage
Phe         UUU     13002     0.4112
Phe         UUC     18614     0.5888
Total               31616
Leu         UUA     4467      0.0590
Leu         UUG     9024      0.1192
Leu         CUU     9069      0.1198
Leu         CUC     16003     0.2114
Leu         CUA     4608      0.0609
Leu         CUG     32536     0.4298
Total               75707
Ile         AUU     12474     0.3313
Ile         AUC     19800     0.5258
Ile         AUA     5381      0.1429
Total               37655
Met         AUG     17770     1.0000
Total               17770
Val         GUU     8212      0.1635
Val         GUC     12846     0.2558
Val         GUA     4932      0.0982
Val         GUG     24222     0.4824
Total               50212
Ser         UCU     10287     0.1804
Ser         UCC     13258     0.2325
Ser         UCA     7678      0.1347
Ser         UCG     3470      0.0609
Ser         AGU     8040      0.1410
Ser         AGC     14279     0.2505
Total               57012
Pro         CCU     11695     0.2684
Pro         CCC     15221     0.3493
Pro         CCA     11039     0.2533
Pro         CCG     5621      0.1290
Total               43576
Thr         ACU     9372      0.2203
Thr         ACC     16574     0.3895
Thr         ACA     10892     0.2560
Thr         ACG     5712      0.1342
Total               42550
Ala         GCU     13923     0.2592
Ala         GCC     23073     0.4295
Ala         GCA     10704     0.1992
Ala         GCG     6025      0.1121
Total               53725
Tyr         UAU     9441      0.3882
Tyr         UAC     14882     0.6118
Total               24323
His         CAU     6528      0.3649
His         CAC     11363     0.6351
Total               17891
Gln         CAA     8060      0.2430
Gln         CAG     25108     0.7570
Total               33168
Asn         AAU     12491     0.4088
Asn         AAC     18063     0.5912
Total               30554
Lys         AAA     17244     0.3897
Lys         AAG     27000     0.6103
Total               44244
Asp         GAU     16615     0.4239
Asp         GAC     22580     0.5761
Total               39195
Glu         GAA     21102     0.4007
Glu         GAG     31555     0.5993
Total               52657
Cys         UGU     7556      0.4200
Cys         UGC     10436     0.5800
Total               17992
Trp         UGG     10706     1.0000
Total               10706
Arg         CGU     3391      0.0824
Arg         CGC     7998      0.1943
Arg         CGA     4558      0.1108
Arg         CGG     8300      0.2017
Arg         AGA     8237      0.2001
Arg         AGG     8671      0.2107
Total               41155
Gly         GGU     8508      0.1616
Gly         GGC     18517     0.3518
Gly         GGA     12838     0.2439
Gly         GGG     12772     0.2427
Total               52635
Stop        UAA     555
Stop        UAG     394
Stop        UGA     392
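The per-amino-acid frequencies in Tables 2-5 follow directly from the codon counts: each count is divided by the total for its amino acid, not by the total for all 64 codons. The sketch below is illustrative only and is not part of the original disclosure; the function name `per_amino_acid_frequencies` and the variable `HUMAN_COUNTS` are hypothetical, and rounding to four decimal places is assumed from the precision shown in the Tables. The counts are the phenylalanine and tyrosine rows of Table 2.

```python
def per_amino_acid_frequencies(counts):
    """Convert {amino_acid: {codon: count}} into {amino_acid: {codon: frequency}}.

    Each frequency is the codon's share of all codons for the same amino acid,
    rounded to four decimal places (an assumption matching the Tables).
    """
    frequencies = {}
    for amino_acid, codon_counts in counts.items():
        total = sum(codon_counts.values())
        frequencies[amino_acid] = {
            codon: round(number / total, 4)
            for codon, number in codon_counts.items()
        }
    return frequencies


# Counts copied from the Phe and Tyr rows of Table 2 (human).
HUMAN_COUNTS = {
    "Phe": {"UUU": 326146, "UUC": 394680},
    "Tyr": {"UAU": 232240, "UAC": 301978},
}

HUMAN_FREQUENCIES = per_amino_acid_frequencies(HUMAN_COUNTS)

if __name__ == "__main__":
    for amino_acid, table in HUMAN_FREQUENCIES.items():
        print(amino_acid, table)
```

The same function can be applied to any row group of Tables 2-5 to confirm that the listed frequencies for an amino acid sum to approximately one.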
[0105]By utilizing these or similar tables, one of ordinary skill in the art can apply the frequencies to any given polypeptide sequence, and produce a nucleic acid fragment of a codon-optimized coding region which encodes the polypeptide, but which uses codons more optimal for a given species. Codon-optimized coding regions can be designed by various different methods.
[0106]In one method, termed "uniform optimization," a codon usage table is used to find the single most frequent codon used for any given amino acid, and that codon is used each time that particular amino acid appears in the polypeptide sequence. For example, referring to Table 2 above, for leucine, the most frequent codon in humans is CUG, which is used 41% of the time. Thus all the leucine residues in a given amino acid sequence would be assigned the codon CUG. A coding region for IAV NP (SEQ ID NO:2) optimized by the "uniform optimization" method is presented herein as SEQ ID NO:24:
    1 ATGGCCAGCC AGGGCACCAA GCGGAGCTAC GAGCAGATGG AGACCGACGG CGAGCGGCAG
   61 AACGCCACCG AGATCCGGGC CAGCGTGGGC AAGATGATCG GCGGCATCGG CCGGTTCTAC
  121 ATCCAGATGT GCACCGAGCT GAAGCTGAGC GACTACGAGG GCCGGCTGAT CCAGAACAGC
  181 CTGACCATCG AGCGGATGGT GCTGAGCGCC TTCGACGAGC GGCGGAACAA GTACCTGGAG
  241 GAGCACCCCA GCGCCGGCAA GGACCCCAAG AAGACCGGCG GCCCCATCTA CCGGCGGGTG
  301 AACGGCAAGT GGATGCGGGA GCTGATCCTG TACGACAAGG AGGAGATCCG GCGGATCTGG
  361 CGGCAGGCCA ACAACGGCGA CGACGCCACC GCCGGCCTGA CCCACATGAT GATCTGGCAC
  421 AGCAACCTGA ACGACGCCAC CTACCAGCGG ACCCGGGCCC TGGTGCGGAC CGGCATGGAC
  481 CCCCGGATGT GCAGCCTGAT GCAGGGCAGC ACCCTGCCCC GGCGGAGCGG CGCCGCCGGC
  541 GCCGCCGTGA AGGGCGTGGG CACCATGGTG ATGGAGCTGG TGCGGATGAT CAAGCGGGGC
  601 ATCAACGACC GGAACTTCTG GCGGGGCGAG AACGGCCGGA AGACCCGGAT CGCCTACGAG
  661 CGGATGTGCA ACATCCTGAA GGGCAAGTTC CAGACCGCCG CCCAGAAGGC CATGATGGAC
  721 CAGGTGCGGG AGAGCCGGAA CCCCGGCAAC GCCGAGTTCG AGGACCTGAC CTTCCTGGCC
  781 CGGAGCGCCC TGATCCTGCG GGGCAGCGTG GCCCACAAGA GCTGCCTGCC CGCCTGCGTG
  841 TACGGCCCCG CCGTGGCCAG CGGCTACGAC TTCGAGCGGG AGGGCTACAG CCTGGTGGGC
  901 ATCGACCCCT TCCGGCTGCT GCAGAACAGC CAGGTGTACA GCCTGATCCG GCCCAACGAG
  961 AACCCCGCCC ACAAGAGCCA GCTGGTGTGG ATGGCCTGCC ACAGCGCCGC CTTCGAGGAC
 1021 CTGCGGGTGC TGAGCTTCAT CAAGGGCACC AAGGTGCTGC CCCGGGGCAA GCTGAGCACC
 1081 CGGGGCGTGC AGATCGCCAG CAACGAGAAC ATGGAGACCA TGGAGAGCAG CACCCTGGAG
 1141 CTGCGGAGCC GGTACTGGGC CATCCGGACC CGGAGCGGCG GCAACACCAA CCAGCAGCGG
 1201 GCCAGCGCCG GCCAGATCAG CATCCAGCCC ACCTTCAGCG TGCAGCGGAA CCTGCCCTTC
 1261 GACCGGACCA CCGTGATGGC CGCCTTCAGC GGCAACACCG AGGGCCGGAC CAGCGACATG
 1321 CGGACCGAGA TCATCCGGAT GATGGAGAGC GCCCGGCCCG AGGACGTGAG CTTCCAGGGC
 1381 CGGGGCGTGT TCGAGCTGAG CGACGAGAAG GCCGCCAGCC CCATCGTGCC CAGCTTCGAC
 1441 ATGAGCAACG AGGGCAGCTA CTTCTTCGGC GACAACGCCG AGGAGTACGA CAACTGA
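The "uniform optimization" method of paragraph [0106] can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used to generate SEQ ID NO:24: the function name `uniform_optimize` and the dictionary `HUMAN_CODON_FREQ` are hypothetical, the frequencies are copied from Table 2, the default stop codon UGA is assumed because it has the highest count among the Stop rows of Table 2, and the 20-residue test peptide is simply the translation of nucleotides 1-60 of SEQ ID NO:24 above.

```python
# Per-amino-acid codon frequencies for human genes, copied from Table 2
# (one-letter amino acid codes, mRNA nomenclature).
HUMAN_CODON_FREQ = {
    "F": {"UUU": 0.4525, "UUC": 0.5475},
    "L": {"UUA": 0.0728, "UUG": 0.1266, "CUU": 0.1287,
          "CUC": 0.1956, "CUA": 0.0700, "CUG": 0.4062},
    "I": {"AUU": 0.3554, "AUC": 0.4850, "AUA": 0.1596},
    "M": {"AUG": 1.0000},
    "V": {"GUU": 0.1773, "GUC": 0.2380, "GUA": 0.1137, "GUG": 0.4710},
    "S": {"UCU": 0.1840, "UCC": 0.2191, "UCA": 0.1472,
          "UCG": 0.0565, "AGU": 0.1499, "AGC": 0.2433},
    "P": {"CCU": 0.2834, "CCC": 0.3281, "CCA": 0.2736, "CCG": 0.1149},
    "T": {"ACU": 0.2419, "ACC": 0.3624, "ACA": 0.2787, "ACG": 0.1171},
    "A": {"GCU": 0.2637, "GCC": 0.4037, "GCA": 0.2255, "GCG": 0.1071},
    "Y": {"UAU": 0.4347, "UAC": 0.5653},
    "H": {"CAU": 0.4113, "CAC": 0.5887},
    "Q": {"CAA": 0.2541, "CAG": 0.7459},
    "N": {"AAU": 0.4614, "AAC": 0.5386},
    "K": {"AAA": 0.4212, "AAG": 0.5788},
    "D": {"GAU": 0.4613, "GAC": 0.5387},
    "E": {"GAA": 0.4161, "GAG": 0.5839},
    "C": {"UGU": 0.4468, "UGC": 0.5532},
    "W": {"UGG": 1.0000},
    "R": {"CGU": 0.0830, "CGC": 0.1927, "CGA": 0.1120,
          "CGG": 0.2092, "AGA": 0.2021, "AGG": 0.2011},
    "G": {"GGU": 0.1632, "GGC": 0.3438, "GGA": 0.2459, "GGG": 0.2471},
}


def uniform_optimize(protein, freq_table=HUMAN_CODON_FREQ, stop_codon="UGA"):
    """Return a DNA coding sequence using the single most frequent codon
    for every residue of `protein` (one-letter codes), followed by `stop_codon`.
    """
    # Pick the highest-frequency codon for each amino acid once.
    best = {aa: max(codons, key=codons.get) for aa, codons in freq_table.items()}
    rna = "".join(best[aa] for aa in protein) + stop_codon
    # The tables use mRNA nomenclature; convert to DNA for the coding strand.
    return rna.replace("U", "T")


if __name__ == "__main__":
    # N-terminal 20 residues obtained by translating nucleotides 1-60 of SEQ ID NO:24.
    print(uniform_optimize("MASQGTKRSYEQMETDGERQ", stop_codon=""))
```

Because the mapping is one codon per amino acid, the output is fully determined by the protein sequence and the usage table; no random choice is involved.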
[0107]In another method, termed "full-optimization," the actual frequencies of the codons are distributed randomly throughout the coding region. Thus, using this method for optimization, if a hypothetical polypeptide sequence had 100 leucine residues, referring to Table 2 for frequency of usage in humans, about 7, or 7% of the leucine codons would be UUA, about 13, or 13% of the leucine codons would be UUG, about 13, or 13% of the leucine codons would be CUU, about 20, or 20% of the leucine codons would be CUC, about 7, or 7% of the leucine codons would be CUA, and about 41, or 41% of the leucine codons would be CUG. These frequencies would be distributed randomly throughout the leucine codons in the coding region encoding the hypothetical polypeptide. As will be understood by those of ordinary skill in the art, the distribution of codons in the sequence can vary significantly using this method; however, the sequence always encodes the same polypeptide.
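By way of illustration only, the random distribution of codons described above can be sketched as follows, using the rounded leucine percentages quoted in this paragraph. Because those rounded percentages sum to 101, the sketch absorbs the surplus in the most frequent codon; that adjustment, the function name `assign_codons`, and the `seed` parameter are assumptions made for this example rather than part of the described method.

```python
import random

# Rounded human leucine codon usage (percent), as quoted in the text.
LEU_PERCENT = {"UUA": 7, "UUG": 13, "CUU": 13, "CUC": 20, "CUA": 7, "CUG": 41}


def assign_codons(n_residues, percent, seed=None):
    """Return n_residues codons in random order, with each codon present
    in approximately the proportion given by `percent`."""
    # Round half up, as in the rounding rule used for "about".
    counts = {c: int(n_residues * p / 100 + 0.5) for c, p in percent.items()}
    # Rounded counts may not sum to n_residues; absorb the difference in
    # the most frequent codon (an illustrative choice).
    top = max(percent, key=percent.get)
    counts[top] += n_residues - sum(counts.values())
    pool = [codon for codon, k in counts.items() for _ in range(k)]
    random.Random(seed).shuffle(pool)
    return pool


if __name__ == "__main__":
    leucines = assign_codons(100, LEU_PERCENT, seed=1)
    print(leucines[:10])
```

Each call with a different seed yields a different ordering of the same codon pool, which is why many distinct nucleotide sequences can encode the same polypeptide under this method.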
[0108]As an example, a nucleotide sequence for NP (SEQ ID NO:2) fully optimized for human codon usage, is shown as SEQ ID NO:23. An alignment of nucleotides 46-1542 of SEQ ID NO:1 (native NP coding region) with the codon-optimized coding region (SEQ ID NO:23) is presented in FIG. 1.
[0109]In using the "full-optimization" method, an entire polypeptide sequence may be codon-optimized as described above. With respect to various desired fragments, variants or derivatives of the complete polypeptide, the fragment, variant, or derivative may first be designed, and is then codon-optimized individually. Alternatively, a full-length polypeptide sequence is codon-optimized for a given species resulting in a codon-optimized coding region encoding the entire polypeptide, and then nucleic acid fragments of the codon-optimized coding region, which encode fragments, variants, and derivatives of the polypeptide, are made from the original codon-optimized coding region. As would be well understood by those of ordinary skill in the art, if codons have been randomly assigned to the full-length coding region based on their frequency of use in a given species, nucleic acid fragments encoding fragments, variants, and derivatives would not necessarily be fully codon-optimized for the given species. However, such sequences are still much closer to the codon usage of the desired species than the native codon usage. The advantage of this approach is that synthesizing codon-optimized nucleic acid fragments encoding each fragment, variant, and derivative of a given polypeptide, although routine, would be time-consuming and would result in significant expense.
[0110]When using the "full-optimization" method, the term "about" is used precisely to account for fractional percentages of codon frequencies for a given amino acid. As used herein, "about" is defined as one amino acid more or one amino acid less than the value given. The whole number value of amino acids is rounded up if the fractional frequency of usage is 0.50 or greater, and is rounded down if the fractional frequency of use is 0.49 or less. Using again the example of the frequency of usage of leucine in human genes for a hypothetical polypeptide having 62 leucine residues, the fractional frequency of codon usage would be calculated by multiplying 62 by the frequencies for the various codons. Thus, 7.28 percent of 62 equals 4.51 UUA codons, or "about 5," i.e., 4, 5, or 6 UUA codons, 12.66 percent of 62 equals 7.85 UUG codons or "about 8," i.e., 7, 8, or 9 UUG codons, 12.87 percent of 62 equals 7.98 CUU codons, or "about 8," i.e., 7, 8, or 9 CUU codons, 19.56 percent of 62 equals 12.13 CUC codons or "about 12," i.e., 11, 12, or 13 CUC codons, 7.00 percent of 62 equals 4.34 CUA codons or "about 4," i.e., 3, 4, or 5 CUA codons, and 40.62 percent of 62 equals 25.19 CUG codons, or "about 25," i.e., 24, 25, or 26 CUG codons.
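By way of illustration only, the arithmetic in the preceding paragraph can be reproduced with the short sketch below. It applies the stated rounding rule (0.50 or greater rounds up) and the stated meaning of "about" (one more or one less). The function name `about_counts` and the returned data layout are assumptions made for this example.

```python
from decimal import Decimal, ROUND_HALF_UP

# Human leucine codon frequencies (percent) as quoted in the text.
LEU_PERCENT = {
    "UUA": "7.28", "UUG": "12.66", "CUU": "12.87",
    "CUC": "19.56", "CUA": "7.00", "CUG": "40.62",
}


def about_counts(n_residues, percent):
    """Map each codon to (nominal count, (lowest, highest) allowed count)."""
    result = {}
    for codon, pct in percent.items():
        exact = Decimal(n_residues) * Decimal(pct) / Decimal(100)
        # Fractions of 0.50 or greater round up; smaller fractions round down.
        nominal = int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))
        result[codon] = (nominal, (max(nominal - 1, 0), nominal + 1))
    return result


if __name__ == "__main__":
    for codon, (nominal, allowed) in about_counts(62, LEU_PERCENT).items():
        print(codon, nominal, allowed)
```

For 62 leucine residues the nominal counts are 5, 8, 8, 12, 4, and 25, which sum to 62 and agree with the values stated in the paragraph.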
[0111]In a third method termed "minimal optimization," coding regions are only partially optimized. For example, the invention includes a nucleic acid fragment of a codon-optimized coding region encoding a polypeptide in which at least about 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 100% of the codon positions have been codon-optimized for a given species. That is, they contain a codon that is preferentially used in the genes of a desired species, e.g., a vertebrate species, e.g., humans, in place of a codon that is normally used in the native nucleic acid sequence. Codons that are rarely found in the genes of the vertebrate of interest are changed to codons more commonly utilized in the coding regions of the vertebrate of interest.
[0112]Thus, those codons which are used more frequently in the IV gene of interest than in genes of the vertebrate of interest are substituted with more frequently-used codons. The difference in frequency at which the IV codons are substituted may vary based on a number of factors as discussed below. For example, codons used at least twice as often per thousand in IV genes as in genes of the vertebrate of interest are substituted with the most frequently used codon for that amino acid in the vertebrate of interest. This ratio may be adjusted higher or lower depending on various factors such as those discussed below. Accordingly, a codon in an IV native coding region would be substituted with a codon used more frequently for that amino acid in coding regions of the vertebrate of interest if the codon is used 1.1 times, 1.2 times, 1.3 times, 1.4 times, 1.5 times, 1.6 times, 1.7 times, 1.8 times, 1.9 times, 2.0 times, 2.1 times, 2.2 times, 2.3 times, 2.4 times, 2.5 times, 2.6 times, 2.7 times, 2.8 times, 2.9 times, 3.0 times, 3.1 times, 3.2 times, 3.3 times, 3.4 times, 3.5 times, 3.6 times, 3.7 times, 3.8 times, 3.9 times, 4.0 times, 4.1 times, 4.2 times, 4.3 times, 4.4 times, 4.5 times, 4.6 times, 4.7 times, 4.8 times, 4.9 times, 5.0 times, 5.5 times, 6.0 times, 6.5 times, 7.0 times, 7.5 times, 8.0 times, 8.5 times, 9.0 times, 9.5 times, 10.0 times, 10.5 times, 11.0 times, 11.5 times, 12.0 times, 12.5 times, 13.0 times, 13.5 times, 14.0 times, 14.5 times, 15.0 times, 15.5 times, 16.0 times, 16.5 times, 17.0 times, 17.5 times, 18.0 times, 18.5 times, 19.0 times, 19.5 times, 20 times, 21 times, 22 times, 23 times, 24 times, 25 times, or greater, more frequently in IV coding regions than in coding regions of the vertebrate of interest.
[0113]This minimal human codon optimization for highly variant codons has several advantages, which include but are not limited to the following examples. Since fewer changes are made to the nucleotide sequence of the gene of interest, fewer manipulations are required, which leads to reduced risk of introducing unwanted mutations and lower cost, as well as allowing the use of commercially available site-directed mutagenesis kits, and reducing the need for expensive oligonucleotide synthesis. Further, decreasing the number of changes in the nucleotide sequence decreases the potential of altering the secondary structure of the sequence, which can have a significant impact on gene expression in certain host cells. The introduction of undesirable restriction sites is also reduced, facilitating the subcloning of the genes of interest into the plasmid expression vector.
[0114]The present invention also provides isolated polynucleotides comprising coding regions of IV polypeptides, e.g., NP, M1, M2, HA, NA, PB1, PB2, PA, NS1 or NS2, or fragments, variants, or derivatives thereof. The isolated polynucleotides can also be codon-optimized.
[0115]In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:2 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:2 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:2, optimized according to codon usage in humans are designed as follows. The amino acid composition of SEQ ID NO:2 is shown in Table 6.
TABLE-US-00025 TABLE 6
AMINO ACID    Number in SEQ ID NO: 2
A   Ala       39
R   Arg       49
C   Cys       6
G   Gly       41
H   His       6
I   Ile       26
L   Leu       33
K   Lys       21
M   Met       25
F   Phe       18
P   Pro       17
S   Ser       40
T   Thr       28
W   Trp       6
Y   Tyr       15
V   Val       23
N   Asn       26
D   Asp       22
Q   Gln       21
E   Glu       36
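By way of illustration only, an amino acid composition of the kind tabulated above can be tallied from any coding region by translating it with the standard genetic code and counting residues. The sketch below assumes an in-frame DNA sequence; the function name `composition` is introduced for this example only, and whitespace in the input (as in the grouped sequence listings herein) is ignored.

```python
from collections import Counter
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def composition(coding_region):
    """Count amino acids encoded by an in-frame DNA coding region.
    A trailing stop codon, if present, is not counted."""
    seq = "".join(coding_region.split()).upper()
    usable = len(seq) - len(seq) % 3  # ignore an incomplete final codon
    protein = "".join(GENETIC_CODE[seq[i:i + 3]] for i in range(0, usable, 3))
    return Counter(protein.rstrip("*"))


if __name__ == "__main__":
    # First 33 nucleotides of SEQ ID NO:24 as printed in the listing.
    print(composition("ATGGCCAGCC AGGGCACCAA GCGGAGCTAC GAG"))
```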
[0116]Using the amino acid composition shown in Table 6, a human codon-optimized coding region which encodes SEQ ID NO:2 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:2 as follows: the 18 phenylalanine codons are TTC, the 33 leucine codons are CTG, the 26 isoleucine codons are ATC, the 25 methionine codons are ATG, the 23 valine codons are GTG, the 40 serine codons are AGC, the 17 proline codons are CCC, the 28 threonine codons are ACC, the 39 alanine codons are GCC, the 15 tyrosine codons are TAC, the 6 histidine codons are CAC, the 21 glutamine codons are CAG, the 26 asparagine codons are AAC, the 21 lysine codons are AAG, the 22 aspartic acid codons are GAC, the 36 glutamic acid codons are GAG, the 6 cysteine codons are TGC, the 6 tryptophan codons are TGG, the 49 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 41 glycine codons are GGC.
[0117]Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:2 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 6 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:2 as follows: about 8 of the 18 phenylalanine codons are TTT, and about 10 of the phenylalanine codons are TTC; about 2 of the 33 leucine codons are TTA, about 4 of the leucine codons are TTG, about 4 of the leucine codons are CTT, about 6 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 13 of the leucine codons are CTG; about 9 of the 26 isoleucine codons are ATT, about 13 of the isoleucine codons are ATC, and about 4 of the isoleucine codons are ATA; the 25 methionine codons are ATG; about 4 of the 23 valine codons are GTT, about 5 of the valine codons are GTC, about 3 of the valine codons are GTA, and about 11 of the valine codons are GTG; about 7 of the 40 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 5 of the 17 proline codons are CCT, about 6 of the proline codons are CCC, about 5 of the proline codons are CCA, and about 2 of the proline codons are CCG; about 7 of the 28 threonine codons are ACT, about 10 of the threonine codons are ACC, about 8 of the threonine codons are ACA, and about 3 of the threonine codons are ACG; about 10 of the 39 alanine codons are GCT, about 16 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 7 of the 15 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 6 histidine codons are CAT and about 4 of the histidine codons are CAC; about 5 of the 21 glutamine codons are CAA and
about 16 of the glutamine codons are CAG; about 12 of the 26 asparagine codons are AAT and about 14 of the asparagine codons are AAC; about 9 of the 21 lysine codons are AAA and about 12 of the lysine codons are AAG; about 10 of the 22 aspartic acid codons are GAT and about 12 of the aspartic acid codons are GAC; about 15 of the 36 glutamic acid codons are GAA and about 21 of the glutamic acid codons are GAG; about 3 of the 6 cysteine codons are TGT and about 3 of the cysteine codons are TGC; the 6 tryptophan codons are TGG; about 4 of the 49 arginine codons are CGT, about 9 of the arginine codons are CGC, about 5 of the arginine codons are CGA, about 10 of the arginine codons are CGG, about 10 of the arginine codons are AGA, and about 10 of the arginine codons are AGG; and about 7 of the 41 glycine codons are GGT, about 14 of the glycine codons are GGC, about 10 of the glycine codons are GGA, and about 10 of the glycine codons are GGG.
[0118]As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
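By way of illustration only, the constraint described in the preceding paragraph can be expressed as a simple check: a proposed codon assignment for one amino acid is acceptable if every count lies within one of its nominal value and the counts still sum to the number of residues. The function name `within_about` and the example values (the phenylalanine counts given above for SEQ ID NO:2) are used here purely as a sketch.

```python
def within_about(nominal, proposed, total):
    """Return True if `proposed` codon counts stay within one of the
    `nominal` counts and still add up to `total` residues."""
    if set(nominal) != set(proposed):
        return False
    if sum(proposed.values()) != total:
        return False
    return all(abs(proposed[c] - nominal[c]) <= 1 for c in nominal)


if __name__ == "__main__":
    nominal_phe = {"TTT": 8, "TTC": 10}  # 18 phenylalanine residues
    # One more TTT is balanced by one less TTC.
    print(within_about(nominal_phe, {"TTT": 9, "TTC": 9}, 18))
    # One more TTT without a matching reduction changes the total.
    print(within_about(nominal_phe, {"TTT": 9, "TTC": 10}, 18))
```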
[0119]A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:2, optimized according to codon usage in humans, is presented herein as SEQ ID NO:23.
[0120]Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:2 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.
TABLE-US-00026 TABLE 7
Codon Usage Table for Human Genes and IV Genes
Amino Acid   Codon   Human   IV
Ala  A       GCA     16      25
             GCG     8       5
             GCC     19      11
             GCT     19      15
Arg  R       AGA     12      28*
             AGG     11      14
             CGA     6       7
             CGG     12      4
             CGC     11      3
             CGT     5       3
Asn  N       AAC     20      27
             AAT     17      34*
Asp  D       GAC     26      20
             GAT     22      25
Cys  C       TGC     12      13
             TGT     10      12
Gln  Q       CAA     12      18
             CAG     35      20
Glu  E       GAA     30      39
             GAG     40      28
Gly  G       GGA     16      30
             GGG     16      19
             GGC     23      9
             GGT     11      13
His  H       CAC     15      13
             CAT     11      7
Ile  I       ATA     7       25*
             ATC     22      18
             ATT     16      23
Leu  L       CTA     7       14*
             CTG     40      17
             CTC     20      14
             CTT     13      14
             TTA     7       8
             TTG     13      14
Lys  K       AAA     24      35
             AAG     33      20
Met  M       ATG     22      30
Phe  F       TTC     21      17
             TTT     17      19
Pro  P       CCA     17      12
             CCG     7       4
             CCC     20      8
             CCT     17      13
Ser  S       AGC     19      14
             AGT     12      16
             TCA     12      23
             TCG     5       4
             TCC     18      12
             TCT     15      15
Thr  T       ACA     15      24
             ACG     6       4
             ACC     19      13
             ACT     13      19
Trp  W       TGG     13      18
Tyr  Y       TAC     16      12
             TAT     12      19
Val  V       GTA     7       13
             GTG     29      20
             GTC     15      12
             GTT     11      15
Term         TAA     1       2
             TAG     0.5     0.4
             TGA     1       1
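By way of illustration only, the ratio-based selection described for Table 7 can be sketched as follows. The values are transcribed from Table 7 for the sense codons only; termination codons are deliberately left out of this sketch. The replacement codon is taken to be the most frequent human codon among the codons that were not themselves flagged, which is one way of arriving at the "e.g." choices given above. The function names and the `min_ratio` parameter are assumptions made for this example.

```python
# Sense-codon values from Table 7: codon -> (human, IV).
TABLE_7 = {
    "Ala": {"GCA": (16, 25), "GCG": (8, 5), "GCC": (19, 11), "GCT": (19, 15)},
    "Arg": {"AGA": (12, 28), "AGG": (11, 14), "CGA": (6, 7),
            "CGG": (12, 4), "CGC": (11, 3), "CGT": (5, 3)},
    "Asn": {"AAC": (20, 27), "AAT": (17, 34)},
    "Asp": {"GAC": (26, 20), "GAT": (22, 25)},
    "Cys": {"TGC": (12, 13), "TGT": (10, 12)},
    "Gln": {"CAA": (12, 18), "CAG": (35, 20)},
    "Glu": {"GAA": (30, 39), "GAG": (40, 28)},
    "Gly": {"GGA": (16, 30), "GGG": (16, 19), "GGC": (23, 9), "GGT": (11, 13)},
    "His": {"CAC": (15, 13), "CAT": (11, 7)},
    "Ile": {"ATA": (7, 25), "ATC": (22, 18), "ATT": (16, 23)},
    "Leu": {"CTA": (7, 14), "CTG": (40, 17), "CTC": (20, 14),
            "CTT": (13, 14), "TTA": (7, 8), "TTG": (13, 14)},
    "Lys": {"AAA": (24, 35), "AAG": (33, 20)},
    "Met": {"ATG": (22, 30)},
    "Phe": {"TTC": (21, 17), "TTT": (17, 19)},
    "Pro": {"CCA": (17, 12), "CCG": (7, 4), "CCC": (20, 8), "CCT": (17, 13)},
    "Ser": {"AGC": (19, 14), "AGT": (12, 16), "TCA": (12, 23),
            "TCG": (5, 4), "TCC": (18, 12), "TCT": (15, 15)},
    "Thr": {"ACA": (15, 24), "ACG": (6, 4), "ACC": (19, 13), "ACT": (13, 19)},
    "Trp": {"TGG": (13, 18)},
    "Tyr": {"TAC": (16, 12), "TAT": (12, 19)},
    "Val": {"GTA": (7, 13), "GTG": (29, 20), "GTC": (15, 12), "GTT": (11, 15)},
}


def flagged_substitutions(table, min_ratio=2.0):
    """Map each codon used at least `min_ratio` times as often in IV genes
    as in human genes to a more frequent human codon for the same amino acid."""
    subs = {}
    for codons in table.values():
        over = [c for c, (human, iv) in codons.items() if iv >= min_ratio * human]
        keep = {c: human for c, (human, iv) in codons.items() if c not in over}
        if not keep:
            continue  # no alternative codon available
        target = max(keep, key=keep.get)
        for codon in over:
            subs[codon] = target
    return subs


def minimally_optimize(seq, subs):
    """Replace flagged codons in an in-frame DNA sequence; leave the rest."""
    return "".join(subs.get(seq[i:i + 3], seq[i:i + 3])
                   for i in range(0, len(seq), 3))


if __name__ == "__main__":
    subs = flagged_substitutions(TABLE_7)
    print(subs)
    print(minimally_optimize("ATGAGAAATATACTA", subs))
```

With the default ratio of 2.0, the sketch flags the same four codons that carry an asterisk in Table 7 (AGA, AAT, ATA, and CTA) and selects CGG, AAC, ATC, and CTG as their replacements. Raising or lowering `min_ratio` corresponds to adjusting the ratio as discussed above.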
[0121]In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.
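By way of illustration only, one reading of the comparison described in the preceding paragraph is sketched below. Usage is expressed as a percentage within each amino acid; any amino acid with a codon whose usage differs by at least 10 percentage points between the sequence in question and the human values is flagged, and all of its codons are then set to the predominant human codon. Only two amino acids are included, with human values transcribed from Table 7; the toy input sequence, the function names, and the `min_diff` parameter are hypothetical.

```python
from collections import Counter, defaultdict

# Human usage for two amino acids, transcribed from Table 7.
# A real comparison would cover all twenty amino acids.
HUMAN_USAGE = {
    "Lys": {"AAA": 24, "AAG": 33},
    "Asp": {"GAC": 26, "GAT": 22},
}


def to_percent(per_aa):
    """Convert raw usage numbers to percentages within each amino acid."""
    return {aa: {c: 100.0 * n / sum(t.values()) for c, n in t.items()}
            for aa, t in per_aa.items()}


def harmonize(seq, human_usage, min_diff=10.0):
    """Set every codon of a flagged amino acid to the predominant human codon."""
    human_pct = to_percent(human_usage)
    codon_to_aa = {c: aa for aa, t in human_pct.items() for c in t}
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]

    # Codon usage table (CUT) for the sequence in question.
    tally = defaultdict(Counter)
    for codon in codons:
        if codon in codon_to_aa:
            tally[codon_to_aa[codon]][codon] += 1
    seq_pct = to_percent(tally)

    flagged = {
        aa for aa, t in seq_pct.items()
        if any(abs(t.get(c, 0.0) - p) >= min_diff
               for c, p in human_pct[aa].items())
    }
    best = {aa: max(t, key=t.get) for aa, t in human_pct.items()}
    # Codons for amino acids outside the table are left unchanged.
    return "".join(
        best[codon_to_aa[c]] if codon_to_aa.get(c) in flagged else c
        for c in codons
    )


if __name__ == "__main__":
    # Toy sequence: Lys-Lys-Lys-Asp-Asp.
    print(harmonize("AAAAAAAAGGACGAT", HUMAN_USAGE))
```

In the toy input, lysine usage departs from the human values by more than 10 percentage points, so all three lysine codons become AAG, while the aspartic acid codons are left as they are.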
[0122]A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:2, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:25:
TABLE-US-00027 1 ATGGCCTCAC AGGGCACCAA GCGGAGTTAT GAGCAGATGG AGACCGATGG CGAGAGACAG 61 AACGCCACAG AGATCAGAGC CTCAGTTGGC AAGATGATCG GCGGCATCGG CCGGTTCTAT 121 ATCCAGATGT GCACGGAGCT GAAGCTGAGC GACTACGAGG GCAGACTGAT TCAGAACTCT 181 CTGACCATCG AGAGAATGGT CCTGAGTGCC TTCGATGAGA GACGAAACAA GTATCTGGAG 241 GAGCATCCCT CCGCCGGCAA GGACCCCAAG AAGACGGGCG GCCCCATATA TAGAAGAGTT 301 AACGGCAAGT GGATGAGAGA GCTGATCCTG TACGATAAGG AGGAGATCCG CAGAATATGG 361 AGGCAGGCCA ACAACGGCGA CGATGCCACT GCCGGCCTGA CACATATGAT GATATGGCAC 421 AGTAACCTGA ACGACGCCAC CTACCAGAGA ACAAGGGCCC TGGTTCGCAC GGGCATGGAT 481 CCCAGAATGT GTTCACTGAT GCAGGGCTCT ACACTGCCCA GAAGGTCTGG CGCCGCCGGC 541 GCCGCCGTCA AGGGCGTTGG CACAATGGTG ATGGAGCTGG TGCGGATGAT CAAGAGAGGC 601 ATTAACGATC GGAACTTTTG GAGGGGCGAG AACGGCAGAA AGACCAGGAT AGCCTACGAG 661 CGAATGTGCA ACATTCTGAA GGGCAAGTTC CAGACTGCCG CCCAGAAGGC CATGATGGAT 721 CAGGTGCGGG AGAGCAGAAA CCCCGGCAAC GCCGAGTTCG AGGACCTGAC TTTCCTGGCC 781 AGATCTGCCC TGATACTGAG GGGCTCTGTA GCCCACAAGT CCTGCCTGCC CGCCTGCGTG 841 TACGGCCCCG CCGTGGCCTC CGGCTATGAC TTCGAGCGAG AGGGCTACTC CCTGGTAGGC 901 ATCGATCCCT TTAGACTGCT GCAGAACTCT CAGGTCTACA GTCTGATTAG ACCCAACGAG 961 AACCCCGCCC ATAAGAGCCA GCTGGTGTGG ATGGCCTGCC ACAGTGCCGC CTTCGAGGAC 1021 CTGAGGGTGC TGTCTTTTAT AAAGGGCACA AAGGTGCTGC CCCGCGGCAA GCTGTCTACT 1081 AGGGGCGTCC AGATAGCCTC CAACGAGAAC ATGGAGACAA TGGAGTCTAG TACTCTGGAG 1141 CTGAGGTCTA GGTACTGGGC CATCAGGACT AGGAGCGGCG GCAACACCAA CCAGCAGAGG 1201 GCCAGCGCCG GCCAGATCAG CATTCAGCCC ACCTTCAGTG TACAGAGAAA CCTGCCCTTT 1261 GATAGAACTA CTGTTATGGC CGCCTTCTCT GGCAACACTG AGGGCAGAAC TAGTGACATG 1321 CGAACAGAGA TCATAAGAAT GATGGAGTCG GCCCGTCCCG AGGATGTGTC CTTTCAGGGC 1381 AGGGGCGTCT TCGAGCTGAG CGACGAGAAG GCCGCCAGCC CCATCGTACC CTCTTTCGAT 1441 ATGAGTAACG AGGGCTCGTA CTTTTTTGGC GACAACGCCG AGGAGTATGA TAACTGA
[0123]In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:4 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:4 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:4, optimized according to codon usage in humans are designed as follows. The amino acid composition of SEQ ID NO:4 is shown in Table 8.
TABLE-US-00028 TABLE 8
AMINO ACID    Number in SEQ ID NO: 4
A   Ala       25
R   Arg       17
C   Cys       3
G   Gly       16
H   His       5
I   Ile       11
L   Leu       26
K   Lys       13
M   Met       14
F   Phe       7
P   Pro       8
S   Ser       18
T   Thr       18
W   Trp       1
Y   Tyr       5
V   Val       16
N   Asn       11
D   Asp       6
Q   Gln       15
E   Glu       17
[0124]Using the amino acid composition shown in Table 8, a human codon-optimized coding region which encodes SEQ ID NO:4 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:4 as follows: the 7 phenylalanine codons are TTC, the 26 leucine codons are CTG, the 11 isoleucine codons are ATC, the 14 methionine codons are ATG, the 16 valine codons are GTG, the 18 serine codons are AGC, the 8 proline codons are CCC, the 18 threonine codons are ACC, the 25 alanine codons are GCC, the 5 tyrosine codons are TAC, the 5 histidine codons are CAC, the 15 glutamine codons are CAG, the 11 asparagine codons are AAC, the 13 lysine codons are AAG, the 6 aspartic acid codons are GAC, the 17 glutamic acid codons are GAG, the 3 cysteine codons are TGC, the 1 tryptophan codon is TGG, the 17 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 16 glycine codons are GGC. The codon-optimized coding region designed by this method is presented herein as SEQ ID NO:27:
TABLE-US-00029 ATGAGCCTGCTGACCGAGGTGGAGACCTACGTGCTGAGCATCATCCCCAG CGGCCCCCTGAAGGCCGAGATCGCCCAGAGGCTGGAGGACGTGTTCGCCG GCAAGAACACCGACCTGGAGGTGCTGATGGAGTGGCTGAAGACCAGGCCC ATCCTGAGCCCCCTGACCAAGGGCATCCTGGGCTTCGTGTTCACCCTGAC CGTGCCCAGCGAGAGGGGCCTGCAGAGGAGGAGGTTCGTGCAGAACGCCC TGAACGGCAACGGCGACCCCAACAACATGGACAAGGCCGTGAAGCTGTAC AGGAAGCTGAAGAGGGAGATCACCTTCCACGGCGCCAAGGAGATCAGCCT GAGCTACAGCGCCGGCGCCCTGGCCAGCTGCATGGGCCTGATCTACAACA GGATGGGCGCCGTGACCACCGAGGTGGCCTTCGGCCTGGTGTGCGCCACC TGCGAGCAGATCGCCGACAGCCAGCACAGGAGCCACAGGCAGATGGTGAC CACCACCAACCCCCTGATCAGGCACGAGAACAGGATGGTGCTGGCCAGCA CCACCGCCAAGGCCATGGAGCAGATGGCCGGCAGCAGCGAGCAGGCCGCC GAGGCCATGGAGGTGGCCAGCCAGGCCAGGCAGATGGTGCAGGCCATGAG GACCATCGGCACCCACCCCAGCAGCAGCGCCGGCCTGAAGAACGACCTGC TGGAGAACCTGCAGGCCTACCAGAAGAGGATGGGCGTGCAGATGCAGAGG TTCAAG
[0125]Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:4 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 8 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:4 as follows: about 3 of the 7 phenylalanine codons are TTT, and about 4 of the phenylalanine codons are TTC; about 2 of the 26 leucine codons are TTA, about 3 of the leucine codons are TTG, about 3 of the leucine codons are CTT, about 5 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 11 of the leucine codons are CTG; about 4 of the 11 isoleucine codons are ATT, about 5 of the isoleucine codons are ATC, and about 2 of the isoleucine codons are ATA; the 14 methionine codons are ATG; about 3 of the 16 valine codons are GTT, about 4 of the valine codons are GTC, about 2 of the valine codons are GTA, and about 8 of the valine codons are GTG; about 3 of the 18 serine codons are TCT, about 4 of the serine codons are TCC, about 3 of the serine codons are TCA, about 1 of the serine codons is TCG, about 3 of the serine codons are AGT, and about 4 of the serine codons are AGC; about 2 of the 8 proline codons are CCT, about 3 of the proline codons are CCC, about 2 of the proline codons are CCA, and about 1 of the proline codons is CCG; about 4 of the 18 threonine codons are ACT, about 7 of the threonine codons are ACC, about 5 of the threonine codons are ACA, and about 2 of the threonine codons are ACG; about 7 of the 25 alanine codons are GCT, about 10 of the alanine codons are GCC, about 6 of the alanine codons are GCA, and about 3 of the alanine codons are GCG; about 2 of the 5 tyrosine codons are TAT and about 3 of the tyrosine codons are TAC; about 2 of the 5 histidine codons are CAT and about 3 of the histidine codons are CAC; about 4 of the 15 glutamine codons are CAA and about 11 of
the glutamine codons are CAG; about 5 of the 11 asparagine codons are AAT and about 6 of the asparagine codons are AAC; about 5 of the 13 lysine codons are AAA and about 8 of the lysine codons are AAG; about 3 of the 6 aspartic acid codons are GAT and about 3 of the aspartic acid codons are GAC; about 7 of the 17 glutamic acid codons are GAA and about 10 of the glutamic acid codons are GAG; about 1 of the 3 cysteine codons is TGT and about 2 of the cysteine codons are TGC; the 1 tryptophan codon is TGG; about 1 of the 17 arginine codons is CGT, about 3 of the arginine codons are CGC, about 2 of the arginine codons are CGA, about 4 of the arginine codons are CGG, about 3 of the arginine codons are AGA, and about 3 of the arginine codons are AGG; and about 3 of the 16 glycine codons are GGT, about 6 of the glycine codons are GGC, about 4 of the glycine codons are GGA, and about 4 of the glycine codons are GGG.
[0126]As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
[0127]A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:4, optimized according to codon usage in humans, is presented herein as SEQ ID NO:26:
TABLE-US-00030 ATGAGCTTGCTAACAGAAGTGGAAACCTATGTCCTCAGTATCATTCCTAG CGGCCCCTTAAAAGCCGAAATCGCTCAGCGGCTCGAGGATGTTTTTGCCG GCAAGAACACCGACCTGGAGGTATTGATGGAGTGGCTGAAAACGCGACCT ATTCTGAGCCCCCTGACTAAGGGAATACTCGGCTTCGTTTTTACATTGAC CGTGCCCTCAGAGAGGGGTCTCCAAAGGAGGCGCTTCGTGCAGAACGCCT TAAACGGGAACGGGGACCCAAATAATATGGATAAGGCAGTGAAACTGTAT CGCAAATTAAAGCGGGAGATAACCTTCCATGGAGCCAAGGAGATCTCCCT GTCTTACTCTGCAGGTGCTCTCGCGTCGTGTATGGGACTTATCTACAACC GAATGGGCCCGTCACAACAGAAGTGGCTTTCGGGCTGGTGTGCGCAACTT GCGAACAGATTGCTGACAGTCAGCACCGGTCCCACCGTCAAATGGTCACC ACCACCAATCCGCTGATTAGACATGAAAATCGCATGGTTCTAGCATCAAC TACAGCCAAAGCAATGGAACAAATGGCCGGAAGCTCCGAGCAGGCTGCCG AGGCGATGGAGGTGGCGTCCCAGGCCAGACAGATGGTACAGGCTATGAGA ACTATCGGTACGCACCCAAGTTCTTCAGCTGGGCTGAAGAATGATCTTCT TGAGAACCTGCAGGCCTACCAAAAGCGGATGGGCGTCCAGATGCAGAGAT TTAAA
[0128]Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:4 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.
[0129]In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.
[0130]A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:4, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:28:
TABLE-US-00031 ATGAGTCTGCTGACAGAGGTTGAGACGTACGTGCTGTCCATCATTCCCTC AGGCCCCCTGAAGGCCGAGATTGCCCAGAGACTGGAGGACGTCTTCGCC GGCAAGAACACCGATCTGGAGGTGCTGATGGAGTGGCTGAAGACTCGCCC CATCCTGTCTCCCCTGACAAAGGGCATCCTGGGCTTCGTATTTACACTGA CCGTCCCCTCCGAGAGAGGCCTGCAGCGGAGGAGGTTCGTTCAGAACGCC CTGAACGGCAACGGCGATCCCAACAACATGGATAAGGCCGTGAAGCTGTA TAGAAAGCTGAAGCGAGAGATCACATTTCATGGCGCCAAGGAGATATCGC TGAGCTACAGTGCCGGCGCCCTGGCCTCTTGCATGGGCCTGATATACAAC AGAATGGGCGCCGTTACTACAGAGGTAGCCTTTGGCCTGGTCTGCGCCAC TTGCGAGCAGATCGCCGACTCTCAGCATAGATCTCACAGACAGATGGTGA CGACTACAAACCCCCTGATACGGCACGAGAACAGGATGGTGCTGGCCTCT ACTACCGCCAAGGCCATGGAGCAGATGGCCGGCAGCAGTGAGCAGGCCGC CGAGGCCATGGAGGTAGCCTCACAGGCCAGGCAGATGGTGCAGGCCATGC GAACCATCGGCACTCACCCCTCCAGCTCTGCCGGCCTGAAGAACGACCTG CTGGAGAACCTGCAGGCCTATCAGAAGAGAATGGGCGTACAGATGCAGAG GTTCAAG
[0131]In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:5 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:5 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:5, optimized according to codon usage in humans are designed as follows. The amino acid composition of SEQ ID NO:5 is shown in Table 9.
TABLE-US-00032 TABLE 9
AMINO ACID    Number in SEQ ID NO: 5
A   Ala       5
R   Arg       7
C   Cys       3
G   Gly       8
H   His       2
I   Ile       8
L   Leu       10
K   Lys       5
M   Met       2
F   Phe       4
P   Pro       4
S   Ser       7
T   Thr       4
W   Trp       2
Y   Tyr       3
V   Val       4
N   Asn       3
D   Asp       5
Q   Gln       2
E   Glu       9
[0132]Using the amino acid composition shown in Table 9, a human codon-optimized coding region which encodes SEQ ID NO:5 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:5 as follows: the 4 phenylalanine codons are TTC, the 10 leucine codons are CTG, the 8 isoleucine codons are ATC, the 2 methionine codons are ATG, the 4 valine codons are GTG, the 7 serine codons are AGC, the 4 proline codons are CCC, the 4 threonine codons are ACC, the 5 alanine codons are GCC, the 3 tyrosine codons are TAC, the 2 histidine codons are CAC, the 2 glutamine codons are CAG, the 3 asparagine codons are AAC, the 5 lysine codons are AAG, the 5 aspartic acid codons are GAC, the 9 glutamic acid codons are GAG, the 3 cysteine codons are TGC, the 2 tryptophan codons are TGG, the 7 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 8 glycine codons are GGC. The codon-optimized coding region designed by this method is presented herein as SEQ ID NO:30:
TABLE-US-00033 1 ATGAGCCTGC TGACCGAGGT GGAGACCCCC ATCCGGAACG AGTGGGGCTG CCGGTGCAAC 61 GGCAGCAGCG ACCCCCTGGC CATCGCCGCC AACATCATCG GCATCCTGCA CCTGACCCTG 121 TGGATCCTGG ACCGGCTGTT CTTCAAGTGC ATCTACCGGC GGTTCAAGTA CGGCCTGAAG 181 GGCGGCCCCA GCACCGAGGG CGTGCCCAAG AGCATGCGGG AGGAGTACCG GAAGGAGCAG 241 CAGAGCGCCG TGGACGCCGA CGACGGCCAC TTCGTGAGCA TCGAGCTGGA GTGA
[0133]Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:5 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 9 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:5 as follows: about 2 of the 4 phenylalanine codons are TTT, and about 2 of the phenylalanine codons are TTC; about 1 of the 10 leucine codons is TTA, about 1 of the leucine codons is TTG, about 1 of the leucine codons is CTT, about 2 of the leucine codons are CTC, about 1 of the leucine codons is CTA, and about 4 of the leucine codons are CTG; about 3 of the 8 isoleucine codons are ATT, about 4 of the isoleucine codons are ATC, and about 1 of the isoleucine codons is ATA; the 2 methionine codons are ATG; about 1 of the 4 valine codons is GTT, about 1 of the valine codons is GTC, about 0 of the valine codons are GTA, and about 2 of the valine codons are GTG; about 1 of the 7 serine codons is TCT, about 2 of the serine codons are TCC, about 1 of the serine codons is TCA, about 0 of the serine codons are TCG, about 1 of the serine codons is AGT, and about 2 of the serine codons are AGC; about 1 of the 4 proline codons is CCT, about 1 of the proline codons is CCC, about 2 of the proline codons are CCA, and about 0 of the proline codons are CCG; about 1 of the 4 threonine codons is ACT, about 1 of the threonine codons is ACC, about 1 of the threonine codons is ACA, and about 0 of the threonine codons are ACG; about 1 of the 5 alanine codons is GCT, about 2 of the alanine codons are GCC, about 1 of the alanine codons is GCA, and about 1 of the alanine codons is GCG; about 1 of the 3 tyrosine codons is TAT and about 2 of the tyrosine codons are TAC; about 1 of the 2 histidine codons is CAT and about 1 of the histidine codons is CAC; about 1 of the 2 glutamine codons is CAA and about 1 of the
glutamine codons is CAG; about 1 of the 3 asparagine codons is AAT and about 2 of the asparagine codons are AAC; about 2 of the 5 lysine codons are AAA and about 3 of the lysine codons are AAG; about 2 of the 5 aspartic acid codons are GAT and about 3 of the aspartic acid codons are GAC; about 4 of the 9 glutamic acid codons are GAA and about 5 of the glutamic acid codons are GAG; about 1 of the 3 cysteine codons is TGT and about 2 of the cysteine codons are TGC; the 2 tryptophan codons are TGG; about 1 of the 7 arginine codons is CGT, about 1 of the arginine codons is CGC, about 1 of the arginine codons is CGA, about 1 of the arginine codons is CGG, about 1 of the arginine codons is AGA, and about 1 of the arginine codons is AGG; and about 1 of the 8 glycine codons is GGT, about 3 of the glycine codons are GGC, about 2 of the glycine codons are GGA, and about 2 of the glycine codons are GGG.
[0134]As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
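The bookkeeping behind the "about" counts can be sketched in Python. This is only an illustration under stated assumptions: the function name `allocate_codons` and the phenylalanine frequencies (0.45 and 0.55) are hypothetical stand-ins for values taken from a codon usage table, and largest-remainder rounding is one possible way, not necessarily the one used here, to keep the per-codon counts summing to the number of residues.

```python
import math

def allocate_codons(total, freqs):
    """Split `total` residues of one amino acid among its synonymous codons.

    `freqs` maps codon -> relative usage frequency. Largest-remainder
    rounding guarantees that the counts sum to `total`.
    """
    norm = sum(freqs.values())
    exact = {codon: total * f / norm for codon, f in freqs.items()}
    counts = {codon: math.floor(x) for codon, x in exact.items()}
    leftover = total - sum(counts.values())
    # Hand the remaining residues to the codons with the largest fractional parts.
    by_remainder = sorted(freqs, key=lambda c: exact[c] - counts[c], reverse=True)
    for codon in by_remainder[:leftover]:
        counts[codon] += 1
    return counts

# Hypothetical usage frequencies for phenylalanine (illustrative values only).
PHE_FREQS = {"TTT": 0.45, "TTC": 0.55}

print(allocate_codons(4, PHE_FREQS))
print(allocate_codons(18, PHE_FREQS))
```

Because one codon can only gain a residue when another loses one, any manual adjustment of these counts by plus or minus one has to be balanced within the same amino acid.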
[0135]A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:5, optimized according to codon usage in humans, is presented herein as SEQ ID NO:29:
TABLE-US-00034
  1 ATGAGTCTTC TAACCGAGGT CGAAACGCCT ATCAGAAACG AATGGGGGTG CAGATGCAAC
 61 GGTTCAAGTG ATCCTCTCGC TATTGCCGCA AATATCATTG GGATCTTGCA CTTGACATTG
121 TGGATTCTTG ATCGTCTTTT TTTCAAATGC ATTTACCGTC GCTTTAAATA CGGACTGAAA
181 GGAGGGCCTT CTACGGAAGG AGTGCCAAAG TCTATGAGGG AAGAATATCG AAAGGAACAG
241 CAGAGTGCTG TGGATGCTGA CGATGGTCAT TTTGTCAGCA TAGAGCTGGA GTAA
[0136]Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:5 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.
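The substitution rule in the preceding paragraph amounts to an in-frame codon swap, which can be sketched as follows. The four swaps are the ones named in the text; the function name `minimally_optimize` and the toy input sequence are hypothetical and serve only to illustrate the idea.

```python
# Codons named in the text as occurring at least 2 times more frequently in
# IV genes, mapped to the example human-preferred replacements given there.
SWAPS = {"AGA": "CGG", "AAT": "AAC", "ATA": "ATC", "CTA": "CTG"}

def minimally_optimize(coding_seq, swaps=SWAPS):
    """Replace only the listed codons, reading the sequence in frame."""
    seq = coding_seq.upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Every swap is synonymous, so the encoded polypeptide is unchanged.
    return "".join(swaps.get(codon, codon) for codon in codons)

# Toy input (not a sequence from this document): Met-Asn-Ile-Leu-Arg.
print(minimally_optimize("ATGAATATACTAAGA"))
```

Splitting into codons before substituting matters: a plain string replacement could match a triplet that straddles two codons and change the encoded protein.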
[0137]In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of the codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.
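The CUT comparison described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the function names, the toy input sequence, and the partial human usage table `HUMAN_CUT_TOY` (percentages per amino acid) are all hypothetical and are not the values of Table 7.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def split_codons(seq):
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def usage_percent(seq):
    """Codon usage table for one sequence: amino acid -> {codon: percent}."""
    counts = defaultdict(Counter)
    for codon in split_codons(seq):
        counts[CODON_TO_AA[codon]][codon] += 1
    return {aa: {c: 100.0 * n / sum(cs.values()) for c, n in cs.items()}
            for aa, cs in counts.items()}

def minimal_optimize_by_cut(seq, human_cut, threshold=10.0):
    """Flag amino acids whose usage differs from the human CUT by at least
    `threshold` percentage points, then use the predominant human codon for
    every occurrence of each flagged amino acid."""
    viral_cut = usage_percent(seq)
    replacement = {}
    for aa, human in human_cut.items():
        viral = viral_cut.get(aa)
        if not viral:
            continue  # amino acid absent from this sequence
        largest_gap = max(abs(viral.get(c, 0.0) - pct) for c, pct in human.items())
        if largest_gap >= threshold:
            replacement[aa] = max(human, key=human.get)
    return "".join(replacement.get(CODON_TO_AA[c], c) for c in split_codons(seq))

# Hypothetical partial human CUT, in percent (illustrative values only).
HUMAN_CUT_TOY = {"I": {"ATT": 36.0, "ATC": 48.0, "ATA": 16.0},
                 "K": {"AAA": 42.0, "AAG": 58.0}}

print(minimal_optimize_by_cut("ATGATAATAATTAAAAAG", HUMAN_CUT_TOY))
```

In the toy input, isoleucine usage departs from the assumed human table by far more than 10 points, so all three isoleucine codons become ATC, while the lysine codons differ by only 8 points and are left unchanged.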
[0138]A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:5, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:31:
TABLE-US-00035
  1 ATGTCTCTGC TGACAGAGGT GGAGACACCC ATAAGGAACG AGTGGGGCTG CAGGTGCAAC
 61 GGCTCTAGTG ATCCCCTGGC CATCGCCGCC AACATCATTG GCATACTGCA TCTGACCCTG
121 TGGATCCTGG ATAGACTGTT CTTTAAGTGC ATTTACAGAC GATTTAAGTA TGGCCTGAAG
181 GGCGGCCCCT CAACTGAGGG CGTGCCCAAG AGTATGAGAG AGGAGTACCG GAAGGAGCAG
241 CAGAGCGCCG TTGACGCCGA TGACGGCCAC TTCGTCTCCA TCGAGCTGGA GTGA
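Since every optimization method discussed here must leave the encoded polypeptide unchanged, a simple sanity check is to translate the candidate coding regions and compare the results. The sketch below uses the standard genetic code and, as inputs, only the first 60 nucleotides of the representative "fully optimized" (SEQ ID NO:29) and "minimally optimized" (SEQ ID NO:31) sequences printed above; the function name `translate` is an assumption of this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

def translate(coding_seq):
    """Translate an in-frame DNA coding sequence; whitespace is ignored."""
    seq = "".join(coding_seq.split()).upper()
    if len(seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3))

# First 60 nucleotides of each representative sequence, as printed above.
FULLY_OPT_START = "ATGAGTCTTC TAACCGAGGT CGAAACGCCT ATCAGAAACG AATGGGGGTG CAGATGCAAC"
MINIMAL_OPT_START = "ATGTCTCTGC TGACAGAGGT GGAGACACCC ATAAGGAACG AGTGGGGCTG CAGGTGCAAC"

same = translate(FULLY_OPT_START) == translate(MINIMAL_OPT_START)
print(translate(FULLY_OPT_START), same)
```

Running the same comparison over the complete sequences would confirm that the two differ only in codon choice and not in the polypeptide they encode.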
[0139]In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:7 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:7 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:7, optimized according to codon usage in humans, are designed as follows. The amino acid composition of SEQ ID NO:7 is shown in Table 10.
TABLE-US-00036
TABLE 10
AMINO ACID    Number in SEQ ID NO: 7
A  Ala        39
R  Arg        51
C  Cys         8
G  Gly        43
H  His         6
I  Ile        27
L  Leu        35
K  Lys        21
M  Met        26
F  Phe        18
P  Pro        18
S  Ser        43
T  Thr        30
W  Trp         7
Y  Tyr        15
V  Val        24
N  Asn        28
D  Asp        23
Q  Gln        21
E  Glu        39
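An amino acid composition table of this kind can be tallied directly from a polypeptide sequence written in one-letter code. The sketch below is illustrative only: the function name `composition` and the short example peptide are assumptions, not material from this document's sequence listing.

```python
from collections import Counter

def composition(protein):
    """Count each amino acid (one-letter code) in a polypeptide sequence."""
    return Counter(protein.upper())

# Hypothetical 20-residue peptide used only to demonstrate the tally.
counts = composition("MSLLTEVETPIRNEWGCRCN")
for amino_acid, number in sorted(counts.items()):
    print(amino_acid, number)
```

The resulting counts are the per-amino-acid totals that each codon assignment method must preserve.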
[0140]Using the amino acid composition shown in Table 10, a human codon-optimized coding region which encodes SEQ ID NO:7 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:7 as follows: the 18 phenylalanine codons are TTC, the 35 leucine codons are CTG, the 27 isoleucine codons are ATC, the 26 methionine codons are ATG, the 24 valine codons are GTG, the 43 serine codons are AGC, the 18 proline codons are CCC, the 30 threonine codons are ACC, the 39 alanine codons are GCC, the 15 tyrosine codons are TAC, the 6 histidine codons are CAC, the 21 glutamine codons are CAG, the 28 asparagine codons are AAC, the 21 lysine codons are AAG, the 23 aspartic acid codons are GAC, the 39 glutamic acid codons are GAG, the 7 tryptophan codons are TGG, the 51 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 43 glycine codons are GGC. The codon-optimized PA coding region designed by this method is presented herein as SEQ ID NO:33:
TABLE-US-00037 ATGAGCCTGCTGACCGAGGTGGAGACCCCCATCAGGAACGAGTGGGGCTGCAGGTGCAACGGCAGCAGCG ACATGGCCAGCCAGGGCACCAAGAGGAGCTACGAGCAGATGGAGACCGACGGCGAGAGGCAGAACGCCAC CGAGATCAGGGCCAGCGTGGGCAAGATGATCGGCGGCATCGGCAGGTTCTACATCCAGATGTGCACCGAG CTGAAGCTGAGCGACTACGAGGGCAGGCTGATCCAGAACAGCCTGACCATCGAGAGGATGGTGCTGAGCG CCTTCGACGAGAGGAGGAACAAGTACCTGGAGGAGCACCCCAGCGCCGGCAAGGACCCCAAGAAGACCGG CGGCCCCATCTACAGGAGGGTGAACGGCAAGTGGATGAGGGAGCTGATCCTGTACGACAAGGAGGAGATC AGGAGGATCTGGAGGCAGGCCAACAACGGCGACGACGCCACCGCCGGCCTGACCCACATGATGATCTGGC ACAGCAACCTGAACGACGCCACCTACCAGAGGACCAGGGCCCTGGTGAGGACCGGCATGGACCCCAGGAT GTGCAGCCTGATGCAGGGCAGCACCCTGCCCAGGAGGAGCGGCGCCGCCGGCGCCGCCGTGAAGGGCGTG GGCACCATGGTGATGGAGCTGGTGAGGATGATCAAGAGGGGCATCAACGACAGGAACTTCTGGAGGGGCG AGAACGGCAGGAAGACCAGGATCGCCTACGAGAGGATGTGCAACATCCTGAAGGGCAAGTTCCAGACCGC CGCCCAGAAGGCCATGATGGACCAGGTGAGGGAGAGCAGGAACCCCGGCAACGCCGAGTTCGAGGACCTG ACCTTCCTGGCCAGGAGCGCCCTGATCCTGAGGGGCAGCGTGGCCCACAAGAGCTGCCTGCCCGCCTGCG TGTACGGCCCCGCCGTGGCCAGCGGCTACGACTTCGAGAGGGAGGGCTACAGCCTGGTGGGCATCGACCC CTTCAGGCTGCTGCAGAACAGCCAGGTGTACAGCCTGATCAGGCCCAACGAGAACCCCGCCCACAAGAGC CAGCTGGTGTGGATGGCCTGCCACAGCGCCGCCTTCGAGGACCTGAGGGTGCTGAGCTTCATCAAGGGCA CCAAGGTGCTGCCCAGGGGCAAGCTGAGCACCAGGGGCGTGCAGATCGCCAGCAACGAGAACATGGAGAC CATGGAGAGCAGCACCCTGGAGCTGAGGAGCAGGTACTGGGCCATCAGGACCAGGAGCGGCGGCAACACC AACCAGCAGAGGGCCAGCGCCGGCCAGATCAGCATCCAGCCCACCTTCAGCGTGCAGAGGAACCTGCCCT TCGACAGGACCACCGTGATGGCCGCCTTCAGCGGCAACACCGAGGGCAGGACCAGCGACATGAGGACCGA GATCATCAGGATGATGGAGAGCGCCAGGCCCGAGGACGTGAGCTTCCAGGGCAGGGGCGTGTTCGAGCTG AGCGACGAGAAGGCCGCCAGCCCCATCGTGCCCAGCTTCGACATGAGCAACGAGGGCAGCTACTTCTTCG GCGACAACGCCGAGGAGTACGACAAC
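The "uniform" method reduces to a one-codon-per-amino-acid lookup, sketched below. The codon assignments are those listed in the paragraph above; choosing AGG for arginine (the text allows CGG, AGA, or AGG) and TGC for cysteine (which the paragraph does not list) are assumptions of this sketch, as is the function name `uniform_optimize`.

```python
# One codon per amino acid, following the assignments given in the text.
# Assumptions: AGG is picked from the three permitted arginine codons, and
# TGC is used for cysteine, which the paragraph above does not list.
UNIFORM_CODON = {
    "F": "TTC", "L": "CTG", "I": "ATC", "M": "ATG", "V": "GTG",
    "S": "AGC", "P": "CCC", "T": "ACC", "A": "GCC", "Y": "TAC",
    "H": "CAC", "Q": "CAG", "N": "AAC", "K": "AAG", "D": "GAC",
    "E": "GAG", "W": "TGG", "R": "AGG", "G": "GGC", "C": "TGC",
}

def uniform_optimize(protein):
    """Back-translate a polypeptide using a single codon per amino acid."""
    return "".join(UNIFORM_CODON[aa] for aa in protein.upper())

# The first 20 residues encoded by the sequence printed above.
print(uniform_optimize("MSLLTEVETPIRNEWGCRCN"))
```

With these choices the output reproduces the first 60 nucleotides of the sequence printed above, which suggests that AGG and TGC were the arginine and cysteine codons used in that representative sequence.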
[0141]Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:7 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 10 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:7 as follows: about 8 of the 18 phenylalanine codons are TTT, and about 10 of the phenylalanine codons are TTC; about 3 of the 35 leucine codons are TTA, about 4 of the leucine codons are TTG, about 5 of the leucine codons are CTT, about 7 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 14 of the leucine codons are CTG; about 10 of the 27 isoleucine codons are ATT, about 13 of the isoleucine codons are ATC, and about 4 of the isoleucine codons are ATA; the 26 methionine codons are ATG; about 4 of the 24 valine codons are GTT, about 6 of the valine codons are GTC, about 3 of the valine codons are GTA, and about 11 of the valine codons are GTG; about 8 of the 43 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 5 of the 18 proline codons are CCT, about 6 of the proline codons are CCC, about 5 of the proline codons are CCA, and about 2 of the proline codons are CCG; about 7 of the 30 threonine codons are ACT, about 11 of the threonine codons are ACC, about 8 of the threonine codons are ACA, and about 4 of the threonine codons are ACG; about 10 of the 39 alanine codons are GCT, about 16 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 7 of the 15 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 6 histidine codons are CAT and about 4 of the histidine codons are CAC; about 5 of the 21 glutamine codons are CAA and about 16 of the glutamine codons are CAG; about 13 of the 28 asparagine codons are AAT and about 15 of the asparagine codons are AAC; about 9 of the 21 lysine codons are AAA and about 12 of the lysine codons are AAG; about 11 of the 23 aspartic acid codons are GAT and about 12 of the aspartic acid codons are GAC; about 16 of the 39 glutamic acid codons are GAA and about 23 of the glutamic acid codons are GAG; about 4 of the 8 cysteine codons are TGT and about 4 of the cysteine codons are TGC; the 7 tryptophan codons are TGG; about 4 of the 51 arginine codons are CGT, about 10 of the arginine codons are CGC, about 6 of the arginine codons are CGA, about 11 of the arginine codons are CGG, about 10 of the arginine codons are AGA, and about 10 of the arginine codons are AGG; and about 7 of the 43 glycine codons are GGT, about 15 of the glycine codons are GGC, about 11 of the glycine codons are GGA, and about 11 of the glycine codons are GGG.
[0142]As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
[0143]A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:7, optimized according to codon usage in humans, is presented herein as SEQ ID NO:32:
TABLE-US-00038 ATGAGCCTTCTCACAGAAGTGGAAACACCTATCAGAAATGAATGGGGATGCAGATGCAATGGGTCGAGTG ATATGGCCTCTCAAGGTACGAAAAGAAGCTACGAGCAAATGGAAACGGATGGAGAAAGACAAAACGCGAC CGAAATCAGAGCATCCGTCGGGAAGATGATTGGAGGAATCGGACGATTCTACATCCAGATGTGCACAGAG CTAAAGCTATCGGATTATGAAGGGAGACTAATACAAAATAGCCTAACTATCGAGAGAATGGTGCTGTCTG CATTTGACGAAAGGAGAAACAAATACCTGGAAGAACACCCCTCTGCAGGGAAAGACCCAAAAAAAACTGG AGGTCCGATATACCGGAGAGTCAACGGTAAATGGATGAGAGAGCTGATCTTGTATGATAAGGAAGAAATA AGACGCATCTGGCGGCAAGCTAATAATGGAGACGACGCTACTGCAGGGCTCACGCATATGATGATCTGGC ACTCTAATTTGAATGATGCAACGTACCAAAGAACCCGCGCACTTGTGCGGACCGGAATGGACCCTCGTAT GTGCAGCCTTATGCAGGGGTCCACACTGCCCAGAAGGTCCGGAGCAGCTGGAGCAGCAGTAAAGGGGGTT GGAACCATGGTGATGGAGCTGGTGAGAATGATTAAGAGGGGGATCAATGACAGGAACTTCTGGCGAGGAG AAAACGGGAGAAAAACTAGGATAGCATATGAGAGGATGTGTAACATCCTCAAAGGAAAATTCCAAACCGC TGCTCAGAAAGCAATGATGGATCAAGTACGCGAAAGTAGAAATCCTGGAAATGCAGAGTTTGAAGATCTC ACTTTCCTCGCGCGAAGCGCTCTCATCCTCAGAGGGAGTGTCGCTCATAAAAGTTGCCTGCCTGCCTGCG TATATGGTCCTGCCGTGGCAAGTGGATACGACTTTGAGAGAGAGGGGTACTCTCTTGTTCGAATAGATCC ATTCAGATTACTTCAGAATTCCCAGGTGTACAGTTTAATAAGGCCAAACGAAAATCCTGCACACAAATCA CAACTTGTTTGGATGGCATGCCATAGTGCCGCATTCGAAGATCTAAGAGTTCTCTCTTTCATCAAAGGTA CAAAGGTCCTTCCAAGGGGAAAACTCTCTACCAGAGGGGTACAAATAGCTTCAAATGAGAACATGGAGAC AATGGAATCTAGCACATTGGAATTGAGAAGTAGGTATTGGGCCATTAGAACCAGGAGTGGAGGCAATACT AATCAACAGCGGGCTTCTGCCGGTCAAATTAGCATACAACCTACTTTTTCAGTGCAACGGAATCTCCCTT TTGATAGGACAACTGTCATGGCGGCATTCTCTGGAAATACCGAAGGAAGGACTTCCGATATGAGGACTGA GATCATTAGGATGATGGAAAGTGCCCGACCTGAAGACGTCAGTTTTCAAGGAAGAGGTGTGTTCGAACTC TCTGACGAAAAGGCAGCTAGCCCAATCGTTCCTTCTTTTGATATGTCAAATGAAGGATCCTACTTCTTCG GCGATAATGCGGAGGAATATGACAAC
[0144]In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:9 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:9 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:9, optimized according to codon usage in humans, are designed as follows. The amino acid composition of SEQ ID NO:9 is shown in Table 11.
TABLE-US-00039
TABLE 11
AMINO ACID    Number in SEQ ID NO: 9
A  Ala        39
R  Arg        51
C  Cys         8
G  Gly        43
H  His         6
I  Ile        27
L  Leu        35
K  Lys        21
M  Met        26
F  Phe        18
P  Pro        18
S  Ser        43
T  Thr        30
W  Trp         7
Y  Tyr        15
V  Val        24
N  Asn        28
D  Asp        23
Q  Gln        21
E  Glu        39
[0145]Using the amino acid composition shown in Table 11, a human codon-optimized coding region which encodes SEQ ID NO:9 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:9 as follows: the 18 phenylalanine codons are TTC, the 35 leucine codons are CTG, the 27 isoleucine codons are ATC, the 26 methionine codons are ATG, the 24 valine codons are GTG, the 43 serine codons are AGC, the 18 proline codons are CCC, the 30 threonine codons are ACC, the 39 alanine codons are GCC, the 15 tyrosine codons are TAC, the 6 histidine codons are CAC, the 21 glutamine codons are CAG, the 28 asparagine codons are AAC, the 21 lysine codons are AAG, the 23 aspartic acid codons are GAC, the 39 glutamic acid codons are GAG, the 7 tryptophan codons are TGG, the 51 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 43 glycine codons are GGC. The codon-optimized PA coding region designed by this method is presented herein as SEQ ID NO:35:
TABLE-US-00040 ATGGCCAGCCAGGGCACCAAGAGGAGCTACGAGCAGATGGAGACCGACGGCGAGAGGCA GAACGCCACCGAGATCAGGGCCAGCGTGGGCAAGATGATCGGCGGCATCGGCAGGTTCTA CATCCAGATGTGCACCGAGCTGAAGCTGAGCGACTACGAGGGCAGGCTGATCCAGAACAG CCTGACCATCGAGAGGATGGTGCTGAGCGCCTTCGACGAGAGGAGGAACAAGTACCTGGA GGAGCACCCCAGCGCCGGCAAGGACCCCAAGAAGACCGGCGGCCCCATCTACAGGAGGGT GAACGGCAAGTGGATGAGGGAGCTGATCCTGTACGACAAGGAGGAGATCAGGAGGATCTG GAGGCAGGCCAACAACGGCGACGACGCCACCGCCGGCCTGACCCACATGATGATCTGGCA CAGCAACCTGAACGACGCCACCTACCAGAGGACCAGGGCCCTGGTGAGGACCGGCATGGA CCCCAGGATGTGCAGCCTGATGCAGGGCAGCACCCTGCCCAGGAGGAGCGGCGCCGCCGG CGCCGCCGTGAAGGGCGTGGGCACCATGGTGATGGAGCTGGTGAGGATGATCAAGAGGGG CATCAACGACAGGAACTTCTGGAGAGGGGCGAGAACGGCAGGAAGACCAGGATCGCCTACGA GAGGATGTGCAACATCCTGAAGGGCAAGTTCCAGACCGCCGCCCAGAAGGCCATGATGGA CCAGGTGAGGGAGAGCAGGAACCCCGGCAACGCCGAGTTCGAGGACCTGACCTTCCTGGC CAGGAGCGCCCTGATCCTGAGGGGCAGCGTGGCCCACAAGAGCTGCCTGCCCGCCTGCGTG TACGGCCCCGCCGTGGCCAGCGGCTACGACTTCGAGAGGGAGGGCTACAGCCTGGTGGGCA TCGACCCCTTCAGGCTGCTGCAGAACAGCCAGGTGTACAGCCTGATCAGGCCCAACGAGAA CCCCGCCCACAAGAGCCAGCTGGTGTGGATGGCCTGCCACAGCGCCGCCTTCGAGGACCTG AGGGTGCTGAGCTTCATCAAGGGCACCAAGGTGCTGCCCAGGGGCAAGCTGAGCACCAGG GGCGTGCAGATCGCCAGCAACGAGAACATGGAGACCATGGAGAGCAGCACCCTGGAGCTG AGGAGCAGGTACTGGGCCATCAGGACCAGGAGCGGCGGCAACACCAACCAGCAGAGGGCC AGCGCCGGCCAGATCAGCATCCAGCCCACCTTCAGCGTGCAGAGGAACCTGCCCTTCGACA GGACCACCGTGATGGCCGCCTTCAGCGGCAACACCGAGGGCAGGACCAGCGACATGAGGA CCGAGATCATCAGGATGATGGAGAGCGCCAGGCCCGAGGACGTGAGCTTCCAGGGCAGGG GCGTGTTCGAGCTGAGCGACGAGAAGGCCGCCAGCCCCATCGTGCCCAGCTTCGACATGAG CAACGAGGGCAGCTACTTCTTCGGCGACAACGCCGAGGAGTACGACAACATGAGCCTGCTG ACCGAGGTGGAGACCCCCATCAGGAACGAGTGGGGCTGCAGGTGCAACGGCAGCAGCGAC
[0146]Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:9 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 11 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:9 as follows: about 8 of the 18 phenylalanine codons are TTT, and about 10 of the phenylalanine codons are TTC; about 3 of the 35 leucine codons are TTA, about 4 of the leucine codons are TTG, about 5 of the leucine codons are CTT, about 7 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 14 of the leucine codons are CTG; about 10 of the 27 isoleucine codons are ATT, about 13 of the isoleucine codons are ATC, and about 4 of the isoleucine codons are ATA; the 26 methionine codons are ATG; about 4 of the 24 valine codons are GTT, about 6 of the valine codons are GTC, about 3 of the valine codons are GTA, and about 11 of the valine codons are GTG; about 8 of the 43 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 5 of the 18 proline codons are CCT, about 6 of the proline codons are CCC, about 5 of the proline codons are CCA, and about 2 of the proline codons are CCG; about 7 of the 30 threonine codons are ACT, about 11 of the threonine codons are ACC, about 8 of the threonine codons are ACA, and about 4 of the threonine codons are ACG; about 10 of the 39 alanine codons are GCT, about 16 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 7 of the 15 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 6 histidine codons are CAT and about 4 of the histidine codons are CAC; about 5 of the 21 glutamine codons are CAA and about 16 of the glutamine codons are CAG; about 13 of the 28 asparagine codons are AAT and about 15 of the asparagine codons are AAC; about 9 of the 21 lysine codons are AAA and about 12 of the lysine codons are AAG; about 11 of the 23 aspartic acid codons are GAT and about 12 of the aspartic acid codons are GAC; about 16 of the 39 glutamic acid codons are GAA and about 23 of the glutamic acid codons are GAG; about 4 of the 8 cysteine codons are TGT and about 4 of the cysteine codons are TGC; the 7 tryptophan codons are TGG; about 4 of the 51 arginine codons are CGT, about 10 of the arginine codons are CGC, about 6 of the arginine codons are CGA, about 11 of the arginine codons are CGG, about 10 of the arginine codons are AGA, and about 10 of the arginine codons are AGG; and about 7 of the 43 glycine codons are GGT, about 15 of the glycine codons are GGC, about 11 of the glycine codons are GGA, and about 11 of the glycine codons are GGG.
[0147]As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
[0148]A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:9, optimized according to codon usage in humans, is presented herein as SEQ ID NO:34:
TABLE-US-00041 ATGGCAAGCCAGGGCACAAAACGCAGTTACGAGCAGATGGAGACTGATGGTGAGAGGCAGAACGCCACCG AAATCCGGGCCTCCGTCGGCAAGATGATTGGTGGCATCGGAAGATTCTATATCCAGATGTGCACGGAGCT TAAGCTGTCCGATTACGAGGGGCGCTTAATACAGAACTCTCTGACTATCGAGCGAATGGTCTTGAGCGCC TTTGATGAGCGGCGTAATAAGTATCTCGAAGAGCACCCTTCTGCTGGAAAAGACCCCAAAAAGACCGGGG GACCTATCTACCGACGTGTGAACGGAAAATGGATGCGCGAACTGATACTGTACGACAAGGAGGAGATCCG TAGGATCTGGAGACAGGCTAATAACGGAGATGATGCCACAGCTGGGCTGACCCATATGATGATATGGCAT AGCAACCTGAACGACGCAACCTATCAACGCACTAGAGCACTCGTGAGGACCGGTATGGACCCACGCATGT GCTCATTGATGCAAGGTAGCACATTGCCTCGGAGGTCAGGCGCCGCCGGTGCCGCCGTAAAGGGGGTGGG CACAATGGTGATGGAACTGGTCCGAATGATCAAAAGAGGCATCAATGACAGGAACTTTTGGCGCGGAGAA AACGGGCGCAAGACCCGCATTGCCTACGAGCGCATGTGTAACATTTTAAAAGGCAAATTCCAGACTGCAG CCCAGAAAGCAATGATGGACCAAGTTAGAGAAAGTAGAAATCCCGGGAATGCCGAGTTTGAAGACCTGAC TTTCCTGGCTAGAAGCGCCTTGATCCTGCGGGGCTCTGTCGCCCACAAGAGCTGCCTCCCCGCTTGCGTT TACGGCCCCGCGGTCGCAAGTGGCTACGATTTCGAGAGGGAGGGGTATTCCCTAGTTGGGATCGATCCCT TCCGGCTCCTACAGAATTCTCAGGTGTATAGTCTGATTAGACCCAACGAAAACCCGGCTCACAAGAGTCA GCTTGTTTGGATGGCATGTCACTCAGCAGCTTTCGAAGACCTGCGGGTACTCAGCTTTATTAAAGGCACC AAGGTCCTGCCAAGAGGAAAGCTCTCCACGAGGGGAGTACAGATCGCCTCAAACGAGAACATGGAGACAA TGGAAAGCTCCACCCTTGAGCTTAGGTCGCGGTATTGGGCTATTAGAACACGATCTGGGGGGAATACCAA TCAGCAACGAGCGAGTGCTGGTCAGATTTCCATTCAGCCTACTTTCTCTGTGCAACGGAATCTACCATTT GACAGGACAACTGTGATGGCAGCGTTCTCCGGCAATACAGAAGGACGAACATCAGACATGAGGACCGAAA TTATCCGGATGATGGAGAGCGCTCGGCCAGAAGATGTGTCGTTCCAGGGCCGGGGCGTGTTTGAGCTCAG CGACGAGAAGGCCGCGTCTCCAATTGTGCCTTCCTTTGATATGAGCAATGAGGGGTCATACTTTTTCGGA GACAATGCCGAAGAGTATGATAATATGTCTCTGCTTACCGAGGTGGAAACGCCGATACGCAACGAATGGG GTTGTCGTTGTAACGGCTCCAGTGAT
[0149]In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:16 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:16 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:16, optimized according to codon usage in humans, are designed as follows. The amino acid composition of SEQ ID NO:16 is shown in Table 12.
TABLE-US-00042
TABLE 12
AMINO ACID    Number in SEQ ID NO: 16
A  Ala        41
R  Arg        30
C  Cys         5
G  Gly        44
H  His         4
I  Ile        38
L  Leu        39
K  Lys        52
M  Met        27
F  Phe        21
P  Pro        26
S  Ser        40
T  Thr        38
W  Trp         1
Y  Tyr        14
V  Val        32
N  Asn        25
D  Asp        34
Q  Gln        19
E  Glu        30
[0150]Using the amino acid composition shown in Table 12, a human codon-optimized coding region which encodes SEQ ID NO:16 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:16 as follows: the 21 phenylalanine codons are TTC, the 39 leucine codons are CTG, the 38 isoleucine codons are ATC, the 27 methionine codons are ATG, the 32 valine codons are GTG, the 40 serine codons are AGC, the 26 proline codons are CCC, the 38 threonine codons are ACC, the 41 alanine codons are GCC, the 14 tyrosine codons are TAC, the 4 histidine codons are CAC, the 19 glutamine codons are CAG, the 25 asparagine codons are AAC, the 52 lysine codons are AAG, the 34 aspartic acid codons are GAC, the 30 glutamic acid codons are GAG, the 1 tryptophan codon is TGG, the 30 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 44 glycine codons are GGC. The codon-optimized PA coding region designed by this method is presented herein as SEQ ID NO:37:
TABLE-US-00043 ATGAGCAACATGGACATCGACAGCATCAACACCGGCACCATCGACAAGACCCCCGAGGAG CTGACCCCCGGCACCAGCGGCGCCACCCGGCCCATCATCAAGCCCGCCACCCTGGCCCCCC CCAGCAACAAGCGGACCCGGAACCCCAGCCCCGAGCGGACCACCACCAGCAGCGAGACCG ACATCGGCCGGAAGATCCAGAAGAAGCAGACCCCCACCGAGATCAAGAAGAGCGTGTACA AGATGGTGGTGAAGCTGGGCGAGTTCTACAACCAGATGATGGTGAAGGCCGGCCTGAACG ACGACATGGAGCGGAACCTGATCCAGAACGCCCAGGCCGTGGAGCGGATCCTGCTGGCCG CCACCGACGACAAGAAGACCGAGTACCAGAAGAAGCGGAACGCCCGGGACGTGAAGGAG GGCAAGGAGGAGATCGACCACAACAAGACCGGCGGCACCTTCTACAAGATGGTGCGGGAC GACAAGACCATCTACTTCAGCCCCATCAAGATCACCTTCCTGAAGGAGGAGGTGAAGACCA TGTACAAGACCACCATGGGCAGCGACGGCTTCAGCGGCCTGAACCACATCATGATCGGCCA CAGCCAGATGAACGACGTGTGCTTCCAGCGGAGCAAGGGCCTGAAGCGGGTGGGCCTGGA CCCCAGCCTGATCAGCACCTTCGCCGGCAGCACCCTGCCCCGGCGGAGCGGCACCACCGGC GTGGCCATCAAGGGCGGCGGCACCCTGGTGGACGAGGCCATCCGGTTCATCGGCCGGGCCA TGGCCGACCGGGGCCTGCTGCGGGACATCAAGGCCAAGACCGCCTACGAGAAGATCCTGCT GAACCTGAAGAACAAGTGCAGCGCCCCCCAGCAGAAGGCCCTGGTGGACCAGGTGATCGG CAGCCGGAACCCCGGCATCGCCGACATCGAGGACCTGACCCTGCTGGCCCGGAGCATGGTG GTGGTGCGGCCCAGCGTGGCCAGCAAGGTGGTGCTGCCCATCAGCATCTACGCCAAGATCC CCCAGCTGGGCTTCAACACCGAGGAGTACAGCATGGTGGGCTACGAGGCCATGGCCCTGTA CAACATGGCCACCCCCGTGAGCATCCTGCGGATGGGCGACGACGCCAAGGACAAGAGCCA GCTGTTCTTCATGAGCTGCTTCGGCGCCGCCTACGAGGACCTGCGGGTGCTGAGCGCCCTGA CCGGCACCGAGTTCAAGCCCCGGAGCGCCCTGAAGTGCAAGGGCTTCCACGTGCCCGCCAA GGAGCAGGTGGAGGGCATGGGCGCCGCCCTGATGAGCATCAAGCTGCAGTTCTGGGCCCCC ATGACCCGGAGCGGCGGCAACGAGGTGAGCGGCGAGGGCGGCAGCGGCCAGATCAGCTGC AGCCCCGTGTTCGCCGTGGAGCGGCCCATCGCCCTGAGCAAGCAGGCCGTGCGGCGGATGC TGAGCATGAACGTGGAGGGCCGGGACGCCGACGTGAAGGGCAACCTGCTGAAGATGATGA ACGACAGCATGGCCAAGAAGACCAGCGGCAACGCCTTCATCGGCAAGAAGATGTTCCAGA TCAGCGACAAGAACAAGGTGAACCCCATCGAGATCCCCATCAAGCAGACCATCCCCAACTT CTTCTTCGGCCGGGACACCGCCGAGGACTACGACGACCTGGACTACTGA
[0151]Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:16 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 12 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:16 as follows: about 10 of the 21 phenylalanine codons are TTT, and about 12 of the phenylalanine codons are TTC; about 3 of the 39 leucine codons are TTA, about 5 of the leucine codons are TTG, about 5 of the leucine codons are CTT, about 8 of the leucine codons are CTC, about 3 of the leucine codons are CTA, and about 16 of the leucine codons are CTG; about 14 of the 38 isoleucine codons are ATT, about 18 of the isoleucine codons are ATC, and about 6 of the isoleucine codons are ATA; the 27 methionine codons are ATG; about 6 of the 32 valine codons are GTT, about 8 of the valine codons are GTC, about 4 of the valine codons are GTA, and about 15 of the valine codons are GTG; about 7 of the 40 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 7 of the 26 proline codons are CCT, about 9 of the proline codons are CCC, about 7 of the proline codons are CCA, and about 3 of the proline codons are CCG; about 9 of the 38 threonine codons are ACT, about 14 of the threonine codons are ACC, about 11 of the threonine codons are ACA, and about 4 of the threonine codons are ACG; about 11 of the 41 alanine codons are GCT, about 17 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 6 of the 14 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 4 histidine codons are CAT and about 2 of the histidine codons are CAC; about 5 of the 19 glutamine codons are CAA and about 14 of the glutamine codons are CAG; about 12 of the 25 asparagine codons are AAT and about 13 of the asparagine codons are AAC; about 22 of the 52 lysine codons are AAA and about 30 of the lysine codons are AAG; about 16 of the 34 aspartic acid codons are GAT and about 18 of the aspartic acid codons are GAC; about 12 of the 30 glutamic acid codons are GAA and about 18 of the glutamic acid codons are GAG; about 2 of the 5 cysteine codons are TGT and about 3 of the cysteine codons are TGC; the single tryptophan codon is TGG; about 2 of the 30 arginine codons are CGT, about 6 of the arginine codons are CGC, about 3 of the arginine codons are CGA, about 6 of the arginine codons are CGG, about 6 of the arginine codons are AGA, and about 6 of the arginine codons are AGG; and about 7 of the 44 glycine codons are GGT, about 15 of the glycine codons are GGC, about 11 of the glycine codons are GGA, and about 11 of the glycine codons are GGG.
[0152]As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
[0153]A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:16, optimized according to codon usage in humans, is presented herein as SEQ ID NO:36:
TABLE-US-00044 ATGTCGAACATGGACATCGACAGCATTAACACAGGTACTATTGACAAAACCCCCGAAGAACTAACCCCTG GAACCTCAGGAGCAACACGCCCAATAATCAAACCGGCCACCCTCGCGCCCCCTAGCAATAAGAGGACCCG CAATCCAAGTCCTGAGAGAACCACTACTTCATCTGAAACGGATATCGGTCGGAAAATTCAAAAAAAGCAG ACGCCCACAGAGATAAAGAAGTCTGTTTACAAAATGGTGGTAAAGCTCGGTGAGTTTTATAACCAGATGA TGGTCAAGGCGGGGCTTAACGACGATATGGAACGAAATCTTATACAGAATGCACAGGCAGTAGAGAGAAT ACTGCTGGCCGCTACTGATGACAAGAAAACGGAGTACCAAAAAAAACGGAATGCTCGAGATGTGAAAGAA GGAAAAGAAGAAATTGACCATAACAAAACTGGGGGGACATTCTATAAGATGGTGCGGGACGATAAGACAA TCTATTTTAGCCCGATAAAGATTACCTTCCTGAAGGAGGAGGTTAAAACAATGTACAAGACGACGATGGG CAGCGATGGGTTTTCCGGACTTAATCATATAATGATTGGTCACTCGCAGATGAACGATGTATGTTTCCAG CGCTCCAAGGGCTTAAAGAGGGTAGGTCTTGACCCGTCTCTAATATCAACTTTCGCAGGATCCACTTTGC CGAGGCGTTCTGGCACGACAGGCGTGGCTATCAAGGGCGGGGGGACGCTGGTCGATGAGGCCATTCGCTT TATTGGTAGGGCCATGGCCGATAGAGGGCTTCTACGAGACATCAAAGCAAAAACAGCATATGAGAAGATA TTATTAAACTTAAAGAACAAATGCTCCGCTCCTCAGCAAAAAGCGCTCGTTGACCAAGTAATCGGTTCGA GAAATCCAGGCATTGCCGATATCGAAGATCTTACACTCTTGGCGCGAAGCATGGTCGTTGTCCGTCCCAG TGTCGCTAGTAAGGTGGTACTACCAATCTCGATTTACGCAAAAATTCCACAACTCGGCTTTAATACAGAG GAATATTCTATGGTAGGTTATGAAGCCATGGCGTTGTATAATATGGCTACACCAGTCTCCATATTGCGTA TGGGAGATGACGCAAAAGATAAGAGTCAACTCTTTTTCATGTCATGTTTCGGCGCAGCGTACGAAGATCT GAGAGTACTATCCGCCTTGACTGGAACGGAATTTAAACCACGGTCAGCCTTAAAGTGTAAGGGTTTTCAC GTCCCTGCTAAGGAGCAAGTTGAGGGAATGGGCGCGGCACTGATGAGTATAAAATTACAATTTTGGGCTC CAATGACGCGTTCGGGAGGGAATGAAGTTTCTGGTGAGGGAGGGAGTGGACAGATATCATGCTCGCCCGT GTTCGCGGTTGAACGTCCGATTGCTTTGAGTAAGCAGGCGGTTAGGCGGATGTTAAGTATGAATGTGGAG GGCCGCGATGCCGACGTCAAAGGCAACTTATTAAAAATGATGAACGACAGCATGGCAAAGAAGACTAGTG GGAATGCTTTTATAGGGAAAAAAATGTTCCAAATAAGTGACAAAAACAAAGTGAACCCCATCGAAATACC TATCAAGCAAACCATCCCGAATTTCTTTTTCGGTCGAGACACCGCGGAGGACTACGATGACCTAGATTAC TAA
[0154]Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:16 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.
[0155]In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of the codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.
[0156]A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:16, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:38:
TABLE-US-00045 ATGTCTAACATGGACATCGACTCTATAAACACAGGCACGATCGATAAGACCCCCGAGGAGC TGACACCCGGCACTTCAGGCGCCACCAGACCCATAATAAAGCCCGCCACTCTGGCCCCCCC CTCTAACAAGAGGACGAGGAACCCCTCTCCCGAGCGCACCACAACGAGTAGCGAGACGGA CATCGGCAGGAAGATACAGAAGAAGCAGACTCCCACTGAGATTAAGAAGTCCGTGTATAA GATGGTGGTTAAGCTGGGCGAGTTTTACAACCAGATGATGGTGAAGGCCGGCCTGAACGAT GACATGGAGAGGAACCTGATACAGAACGCCCAGGCCGTGGAGAGGATTCTGCTGGCCGCC ACCGATGACAAGAAGACTGAGTATCAGAAGAAGAGAAACGCCCGGGACGTTAAGGAGGGC AAGGAGGAGATCGATCACAACAAGACAGGCGGCACTTTCTATAAGATGGTCCGTGATGAC AAGACAATCTACTTTTCTCCCATCAAGATCACATTCCTGAAGGAGGAGGTAAAGACTATGT ACAAGACAACTATGGGCTCCGATGGCTTCAGTGGCCTGAACCACATAATGATAGGCCATAG TCAGATGAACGATGTGTGCTTCCAGAGAAGCAAGGGCCTGAAGAGGGTCGGCCTGGATCCC TCGCTGATTAGTACCTTCGCCGGCAGCACTCTGCCCAGAAGATCTGGCACTACTGGCGTAGC CATAAAGGGCGGCGGCACACTGGTAGACGAGGCCATAAGGTTTATTGGCAGAGCCATGGC CGACCGCGGCCTGCTGAGAGATATCAAGGCCAAGACCGCCTACGAGAAGATACTGCTGAA CCTGAAGAACAAGTGCTCAGCCCCCCAGCAGAAGGCCCTGGTGGATCAGGTGATCGGCAGT AGAAACCCCGGCATCGCCGACATCGAGGATCTGACTCTGCTGGCCAGAAGCATGGTAGTCG TAAGACCCTCTGTGGCCTCTAAGGTTGTGCTGCCCATCTCCATCTACGCCAAGATTCCCCAG CTGGGCTTTAACACTGAGGAGTACTCCATGGTGGGCTATGAGGCCATGGCCCTGTATAACA TGGCCACACCCGTCTCTATCCTGCGGATGGGCGACGATGCCAAGGACAAGTCTCAGCTGTT TTTTATGAGTTGTTTCGGCGCCGCCTATGAGGATCTGAGAGTCCTGTCAGCCCTGACAGGCA CTGAGTTCAAGCCCAGGTCCGCCCTGAAGTGCAAGGGCTTTCATGTGCCCGCCAAGGAGCA GGTGGAGGGCATGGGCGCCGCCCTGATGAGCATCAAGCTGCAGTTCTGGGCCCCCATGACC CGGTCTGGCGGCAACGAGGTCTCGGGCGAGGGCGGCAGTGGCCAGATAAGTTGCAGCCCC GTTTTTGCCGTTGAGAGACCCATCGCCCTGTCTAAGCAGGCCGTTAGACGAATGCTGAGTAT GAACGTCGAGGGCCGAGACGCCGATGTGAAGGGCAACCTGCTGAAGATGATGAACGATTC CATGGCCAAGAAGACAAGCGGCAACGCCTTCATTGGCAAGAAGATGTTCCAGATAAGCGA TAAGAACAAGGTTAACCCCATCGAGATTCCCATCAAGCAGACCATCCCCAACTTCTTCTTCG GCAGGGATACCGCCGAGGATTACGATGACCTGGACTACTGA
[0157]Randomly assigning codons at an optimized frequency to encode a given polypeptide sequence using the "full-optimization" or "minimal optimization" methods can be done manually by calculating codon frequencies for each amino acid, and then assigning the codons to the polypeptide sequence randomly. Additionally, various algorithms and computer software programs are readily available to those of ordinary skill in the art. Examples include the "EditSeq" function in the Lasergene Package, available from DNAstar, Inc., Madison, Wis., the backtranslation function in the Vector NTI Suite, available from InforMax, Inc., Bethesda, Md., and the "backtranslate" function in the GCG-Wisconsin Package, available from Accelrys, Inc., San Diego, Calif. In addition, various resources are publicly available to codon-optimize coding region sequences, for example, the "backtranslation" function found at http://www.entelechon.com/eng/backtranslation.html (visited Jul. 9, 2002), and the "backtranseq" function available at http://bioinfo.pbi.nrc.ca:8090/EMBOSS/index.html (visited Oct. 15, 2002). Constructing a rudimentary algorithm to assign codons based on a given frequency can also easily be accomplished with basic mathematical functions by one of ordinary skill in the art.
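A rudimentary algorithm of the kind mentioned above might look like the following Python sketch. It is a simplification offered for illustration only: each codon is drawn independently with a probability equal to its frequency, rather than by first computing exact codon counts per amino acid. The function name `assign_codons` and the `CODON_FREQS` values are hypothetical and are not taken from Table 7.

```python
import random

# Hypothetical codon frequencies per amino acid (illustrative values only).
CODON_FREQS = {
    "M": {"ATG": 1.0},
    "K": {"AAG": 0.58, "AAA": 0.42},
    "A": {"GCC": 0.40, "GCT": 0.26, "GCA": 0.23, "GCG": 0.11},
}


def assign_codons(protein, freqs, rng=None):
    """Backtranslate `protein`, choosing each codon at random with a
    probability proportional to its frequency in `freqs`."""
    rng = rng or random.Random()
    codons = []
    for aa in protein:
        options = list(freqs[aa])
        weights = [freqs[aa][codon] for codon in options]
        codons.append(rng.choices(options, weights=weights, k=1)[0])
    return "".join(codons)


# A seeded generator makes the random assignment reproducible.
print(assign_codons("MKAAK", CODON_FREQS, random.Random(0)))
```

Over a long polypeptide the observed codon usage approaches the supplied frequencies; for short sequences an exact-count allocation followed by random placement would track the target frequencies more closely.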
[0158]A number of options are available for synthesizing codon-optimized coding regions designed by any of the methods described above, using standard and routine molecular biological manipulations well known to those of ordinary skill in the art. In one approach, a series of complementary oligonucleotide pairs of 80-90 nucleotides each in length and spanning the length of the desired sequence are synthesized by standard methods. These oligonucleotide pairs are synthesized such that upon annealing, they form double stranded fragments of 80-90 base pairs, containing cohesive ends, e.g., each oligonucleotide in the pair is synthesized to extend 3, 4, 5, 6, 7, 8, 9, 10, or more bases beyond the region that is complementary to the other oligonucleotide in the pair. The single-stranded ends of each pair of oligonucleotides are designed to anneal with the single-stranded end of another pair of oligonucleotides. The oligonucleotide pairs are allowed to anneal, and approximately five to six of these double-stranded fragments are then allowed to anneal together via the cohesive single stranded ends, and then they are ligated together and cloned into a standard bacterial cloning vector, for example, a TOPO® vector available from Invitrogen Corporation, Carlsbad, Calif. The construct is then sequenced by standard methods. Several of these constructs, each consisting of 5 to 6 fragments of 80 to 90 base pairs ligated together, i.e., fragments of about 500 base pairs, are prepared, such that the entire desired sequence is represented in a series of plasmid constructs. The inserts of these plasmids are then cut with appropriate restriction enzymes and ligated together to form the final construct. The final construct is then cloned into a standard bacterial cloning vector, and sequenced. Additional methods would be immediately apparent to the skilled artisan. In addition, gene synthesis is readily available commercially.
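The tiling of a desired sequence into complementary oligonucleotide pairs with cohesive ends can be sketched in Python as follows. The sketch is illustrative only and rests on assumptions: the function names, the default length of 85 nucleotides, and the 6-base offset are chosen for the example, and no attempt is made to handle the terminal ends, melting temperatures, or restriction sites that a real design would require.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligo_pairs(seq, length=85, overhang=6):
    """Tile `seq` into (top, bottom) oligonucleotide pairs.

    Each bottom-strand oligo is shifted `overhang` bases downstream of its
    top-strand partner, so an annealed pair carries single-stranded ends
    that can anneal with the neighboring pair.
    """
    pairs = []
    for start in range(0, len(seq), length):
        end = min(start + length, len(seq))
        top = seq[start:end]
        # Shifted window, clipped at the end of the sequence.
        b_start = min(start + overhang, len(seq))
        b_end = min(end + overhang, len(seq))
        bottom = reverse_complement(seq[b_start:b_end])
        pairs.append((top, bottom))
    return pairs


toy = "ACGT" * 50  # 200-nucleotide placeholder, not a disclosed sequence
for top, bottom in design_oligo_pairs(toy):
    print(len(top), len(bottom))
```

Joining the top-strand oligos in order reproduces the input sequence, which gives a simple check that the tiling leaves no gaps.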
[0159]The codon-optimized coding regions can be versions encoding any gene products from any strain, derivative, or variant of IV, or fragments, variants, or derivatives of such gene products. Examples include nucleic acid fragments of codon-optimized coding regions encoding the NP, M1 and M2 polypeptides, or fragments, variants or derivatives thereof. Codon-optimized coding regions encoding other IV polypeptides or fragments, variants, or derivatives thereof (e.g., HA, NA, PB1, PB2, PA, NS1 or NS2) are included within the present invention. Additional, non-codon-optimized polynucleotides encoding IV polypeptides or other polypeptides are included as well.
Consensus Sequences
[0160]The present invention is further directed to specific consensus sequences of influenza virus proteins, and fragments, derivatives and variants thereof. A "consensus sequence" is, e.g., an idealized sequence that represents the amino acids most often present at each position of two or more sequences which have been compared to each other. A consensus sequence is a theoretical representative amino acid sequence in which each amino acid is the one which occurs most frequently at that site in the different sequences which occur in nature. The term also refers to an actual sequence which approximates the theoretical consensus. A consensus sequence can be derived from sequences which have, e.g., shared functional or structural purposes. It can be defined by aligning as many known examples of a particular structural or functional domain as possible to maximize the homology. A sequence is generally accepted as a consensus when each particular amino acid is reasonably predominant at its position, and most of the sequences which form the basis of the comparison are related to the consensus by rather few substitutions, e.g., from 0 to about 100 substitutions. In general, the wild-type comparison sequences are at least about 50%, 75%, 80%, 90%, 95%, 96%, 97%, 98% or 99% identical to the consensus sequence. Accordingly, polypeptides of the invention are about 50%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to the consensus sequence. Consensus amino acid sequences can be prepared for any of the influenza antigens. By analyzing amino acid sequences from influenza A strains sequenced since 1990, consensus amino acid sequences were derived for the influenza A NP (SEQ ID NO: 76), M1 (SEQ ID NO:77) and M2 (SEQ ID NO:78) proteins (Example 3).
[0161]A "consensus amino acid" is an amino acid chosen to occupy a given position in the consensus protein. A system which is organized to select consensus amino acids can be a computer program, or a combination of one or more computer programs with "by hand" analysis and calculation. When a consensus amino acid is obtained for each position of the aligned amino acid sequences, then these consensus amino acids are "lined up" to obtain the amino acid sequence of the consensus protein.
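The selection of a consensus amino acid at each aligned position can be illustrated with a short Python sketch. It assumes the sequences have already been aligned to equal length by some other tool; the function name and the three five-residue inputs are hypothetical placeholders, not the sequences analyzed in Example 3. Ties are broken in favor of the residue seen first, which is an arbitrary choice made for the example.

```python
from collections import Counter


def consensus_sequence(aligned):
    """Return the most frequent residue at each column of pre-aligned,
    equal-length amino acid sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        # most_common(1) yields the (residue, count) pair with the top count.
        residue, _count = Counter(column).most_common(1)[0]
        consensus.append(residue)
    return "".join(consensus)


print(consensus_sequence(["MASQG", "MASKG", "MTSQG"]))  # MASQG
```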
[0162]Another embodiment of this invention is directed to a process for the preparation of a consensus protein comprising a process to calculate an amino acid residue for nearly all positions of a so-called consensus protein and to synthesize a complete gene from this sequence that could be expressed in a prokaryotic or eukaryotic expression system.
[0163]Polynucleotides which encode the consensus influenza polypeptides, or fragments, variants or derivatives thereof, are also part of this invention. Such polynucleotides can be obtained by known methods, for example by backtranslation of the amino acid sequence and PCR synthesis of the corresponding polynucleotide.
Compositions and Methods
[0164]In certain embodiments, the present invention is directed to compositions and methods of enhancing the immune response of a vertebrate in need of protection against IV infection by administering in vivo, into a tissue of a vertebrate, one or more polynucleotides comprising at least one codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof. In addition, the present invention is directed to compositions and methods of enhancing the immune response of a vertebrate in need of protection against IV infection by administering to the vertebrate a composition comprising one or more polynucleotides as described herein, and at least one isolated IV polypeptide, or a fragment, variant, or derivative thereof. The polynucleotide may be administered either prior to, at the same time (simultaneously), or subsequent to the administration of the isolated polypeptide.
[0165]The coding regions encoding IV polypeptides or fragments, variants, or derivatives thereof may be codon optimized for a particular vertebrate. Codon optimization is carried out by the methods described herein, for example, in certain embodiments codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof are optimized according to the codon usage of the particular vertebrate. The polynucleotides of the invention are incorporated into the cells of the vertebrate in vivo, and an immunologically effective amount of an IV polypeptide or a fragment, variant, or derivative thereof is produced in vivo. The coding regions encoding an IV polypeptide or a fragment, variant, or derivative thereof may be codon optimized for mammals, e.g., humans, apes, monkeys (e.g., owl, squirrel, cebus, rhesus, African green, patas, cynomolgus, and cercopithecus), orangutans, baboons, gibbons, and chimpanzees, dogs, wolves, cats, lions, and tigers, horses, donkeys, zebras, cows, pigs, sheep, deer, giraffes, bears, rabbits, mice, ferrets, seals, whales; birds, e.g., ducks, geese, terns, shearwaters, gulls, turkeys, chickens, quail, pheasants, geese, starlings and budgerigars, or other vertebrates.
[0166]In one embodiment, the present invention relates to codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof, which have been optimized according to human codon usage. For example, human codon-optimized coding regions encoding polypeptides of IV, or fragments, variants, or derivatives thereof are prepared by substituting one or more codons preferred for use in human genes for the codons naturally used in the DNA sequence encoding the IV polypeptide or a fragment, variant, or derivative thereof. Also provided are polynucleotides, vectors, and other expression constructs comprising codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof; pharmaceutical compositions comprising polynucleotides, vectors, and other expression constructs comprising codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof; and various methods of using such polynucleotides, vectors and other expression constructs. Coding regions encoding IV polypeptides can be uniformly optimized, fully optimized, minimally optimized, codon-optimized by region and/or not codon-optimized, as described herein.
[0167]The present invention is further directed towards polynucleotides comprising codon-optimized coding regions encoding polypeptides of IV antigens, for example, HA, NA, NP, M1 and M2, optionally in conjunction with other antigens. The invention is also directed to polynucleotides comprising codon-optimized nucleic acid fragments encoding fragments, variants and derivatives of these polypeptides, e.g., an eM2 or a fusion of NP and eM2.
[0168]In certain embodiments, the present invention provides an isolated polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is a fragment of a codon-optimized coding region encoding a polypeptide at least 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to an IV polypeptide, e.g., HA, NA, NP, M1 or M2, and where the nucleic acid fragment is a variant of a codon-optimized coding region encoding an IV polypeptide, e.g., HA, NA, NP, M1 or M2. The human codon-optimized coding region can be optimized for any vertebrate species and by any of the methods described herein.
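The percent-identity thresholds recited above can be illustrated with a minimal Python sketch. It assumes a pairwise alignment has already been computed and simply counts identical columns; this column count is an assumption made for the example and is not put forward as the method by which identity is determined for the purposes of this application. The function name and the ten-residue inputs are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of aligned columns holding the same residue in both sequences.

    Inputs must be pre-aligned to equal length, with '-' marking gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable positions")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


print(percent_identity("MASQGTKRSY", "MASKGTKRSY"))  # 90.0
```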
Isolated IV Polypeptides
[0169]The present invention is further drawn to compositions which include at least one polynucleotide comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide or fragment, variant, or derivative thereof; together with one or more isolated IV component or isolated polypeptide. The IV component may be inactivated virus, attenuated virus, a viral vector expressing an isolated influenza virus polypeptide, or an influenza virus protein, fragment, variant or derivative thereof.
[0170]The polypeptides or fragments, variants or derivatives thereof, in combination with the codon-optimized nucleic acid compositions may be referred to as "combinatorial polynucleotide vaccine compositions" or "single formulation heterologous prime-boost vaccine compositions."
[0171]The isolated IV polypeptides of the invention may be in any form, and are generated using techniques well known in the art. Examples include isolated IV proteins produced recombinantly, isolated IV proteins directly purified from their natural milieu, recombinant (non-IV) virus vectors expressing an isolated IV protein, or proteins delivered in the form of an inactivated IV vaccine, such as conventional vaccines.
[0172]When utilized, an isolated IV polypeptide or fragment, variant or derivative thereof is administered in an immunologically effective amount. Conventional IV vaccines have been standardized to micrograms of viral antigens HA and NA. See Subbarao, K., Advances in Viral Research 54:349-373 (1999), incorporated herein by reference in its entirety. The recommended dose for these vaccines is 15 μg of each HA per 0.5 ml. Id. The effective amount of conventional IV vaccines is determinable by one of ordinary skill in the art based upon several factors, including the antigen being expressed, the age and weight of the subject, and the precise condition requiring treatment and its severity, and route of administration.
[0173]In the instant invention, the combination of conventional antigen vaccine compositions with the codon-optimized nucleic acid compositions provides for therapeutically beneficial effects at dose sparing concentrations. For example, immunological responses sufficient for a therapeutically beneficial effect in patients predetermined for an approved commercial product, such as for the conventional product described above, can be attained by using less of the approved commercial product when supplemented or enhanced with the appropriate amount of codon-optimized nucleic acid. Thus, dose sparing is contemplated by administration of conventional IV vaccines in combination with the codon-optimized nucleic acids of the invention.
[0174]In particular, the dose of conventional vaccine may be reduced by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60% or at least 70% when administered in combination with the codon-optimized nucleic acid compositions of the invention.
[0175]Similarly, a desirable level of an immunological response afforded by a DNA based pharmaceutical alone may be attained with less DNA by including an aliquot of a conventional vaccine. Further, using a combination of conventional and DNA based pharmaceuticals may allow both materials to be used in lesser amounts while still affording the desired level of immune response arising from administration of either component alone in higher amounts (e.g., one may use less of either immunological product when they are used in combination). This may be manifest not only by using lower amounts of materials being delivered at any time, but also by reducing the number of administration points in a vaccination regime (e.g., 2 versus 3 or 4 injections), and/or by accelerating the kinetics of the immunological response (e.g., desired response levels are attained in 3 weeks instead of 6 after immunization).
[0176]In particular, the dose of DNA based pharmaceuticals may be reduced by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60% or at least 70% when administered in combination with conventional IV vaccines.
[0177]Determining the precise amounts of DNA based pharmaceutical and conventional antigen is based on a number of factors as described above, and is readily determined by one of ordinary skill in the art.
[0178]In addition to dose sparing, the claimed combinatorial compositions provide for a broadening of the immune response and/or enhanced beneficial immune responses. Such broadened or enhanced immune responses are achieved by: adding DNA to enhance cellular responses to a conventional vaccine; adding a conventional vaccine to a DNA pharmaceutical to enhance humoral response; using a combination that induces additional epitopes (both humoral and/or cellular) to be recognized and/or more desirably responded to (epitope broadening); employing a DNA-conventional vaccine combination designed for a particular desired spectrum of immunological responses; obtaining a desirable spectrum by using higher amounts of either component. The broadened immune response is measurable by one of ordinary skill in the art by standard immunological assay specific for the desirable response spectrum.
[0179]Both broadening and dose sparing can be obtained simultaneously.
[0180]The isolated IV polypeptide or fragment, variant, or derivative thereof to be delivered (either a recombinant protein, a purified subunit, or viral vector expressing an isolated IV polypeptide, or in the form of an inactivated IV vaccine) can be any isolated IV polypeptide or fragment, variant, or derivative thereof, including but not limited to the HA, NA, NP, M1, or M2 proteins or fragments, variants or derivatives thereof. Fragments include, but are not limited to, the eM2 protein. In certain embodiments, a derivative protein can be a fusion protein, e.g., NP-eM2. It should be noted that any isolated IV polypeptide or fragment, variant, or derivative thereof described herein can be combined in a composition with any polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide or fragment, variant, or derivative thereof. The proteins can be different, the same, or can be combined in any combination of one or more isolated IV proteins and one or more polynucleotides.
[0181]In certain embodiments, the isolated IV polypeptides, or fragments, derivatives or variants thereof can be fused to or conjugated to a second isolated IV polypeptide, or fragment, derivative or variant thereof, or can be fused to other heterologous proteins, including for example, hepatitis B proteins including, but not limited to the hepatitis B core antigen (HBcAg), or those derived from diphtheria or tetanus. The second isolated IV polypeptide or other heterologous protein can act as a "carrier" that potentiates the immunogenicity of the IV polypeptide or a fragment, variant, or derivative thereof to which it is attached. Hepatitis B virus proteins and fragments and variants thereof useful as carriers within the scope of the invention are disclosed in U.S. Pat. Nos. 6,231,864 and 5,143,726, which are incorporated by reference in their entireties. Polynucleotides comprising coding regions encoding said fused or conjugated proteins are also within the scope of the invention.
[0182]The use of recombinant particles comprising hepatitis B core antigen ("HBcAg") and heterologous protein sequences as potent immunogenic moieties is well documented. For example, addition of heterologous sequences to the amino terminus of a recombinant HBcAg results in the spontaneous assembly of particulate structures which express the heterologous epitope on their surface, and which are highly immunogenic when inoculated into experimental animals. See Clarke et al., Nature 330:381-384 (1987). Heterologous epitopes can also be inserted into HBcAg particles by replacing approximately 40 amino acids of the carboxy terminus of the protein with the heterologous sequences. These recombinant HBcAg proteins also spontaneously form immunogenic particles. See Stahl and Murray, Proc. Natl. Acad. Sci. USA, 86:6283-6287 (1989). Additionally, chimeric HBcAg particles may be constructed where the heterologous epitope is inserted in or replaces all or part of the sequence of amino acid residues in a more central region of the HBcAg protein, in an immunodominant loop, thereby allowing the heterologous epitope to be displayed on the surface of the resulting particles. See EP Patent No. 0421635 B1. Shown below are the DNA and amino acid sequences of the human hepatitis B core protein (HBc), subtype ayw (SEQ ID NOs 39 and 40), as described in Galibert, F., et al., Nature 281:646-650 (1979); see also U.S. Pat. Nos. 4,818,527, 4,882,145 and 5,143,726. All of the above references are incorporated herein by reference in their entireties. The nucleotide and amino acid sequences are presented herein as SEQ ID NO:39:
TABLE-US-00046 ATGGACATCGACCCTTATAAAGAATTTGGAGCTACTGTGGAGTTACTCTC GTTTTTGCCTTCTGACTTCTTTCCTTCAGTACGAGATCTTCTAGATACCG CCTCAGCTCTGTATCGGGAAGCCTTAGAGTCTCCTGAGCATTGTTCACCT CACCATACTGCACTCAGGCAAGCAATTCTTTGCTGGGGGGAACTAATGAC TCTAGCTACCTGGGTGGGTGTTAATTTGGAAGATCCAGCGTCTAGAGACC TAGTAGTCAGTTATGTCAACACTAATATGGGCCTAAAGTTCAGGCAACTC TTGTGGTTTCACATTTCTTGTCTCACTTTTGGAAGAGAAACAGTTATAGA GTATTTGGTGTCTTTCGGAGTGTGGATTCGCACTCCTCCAGCTTATAGAC CACCAAATGCCCCTATCCTATCAACACTTCCGGAGACTACTGTTGTTAGA CGACGAGGCAGGTCCCCTAGAAGAAGAACTCCCTCGCCTCGCAGACGAAG GTCTCAATCGCCGCGTCGCAGAAGATCTCAATCTCGGGAATCTCAATGTT AG
and SEQ ID NO:40:
TABLE-US-00047 [0183]MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSP HHTALRQAILCWGELMTLATWVGVNLEDPASRDLVVSYVNTNMGLKFRQL LWFHISCLTFGRETVIEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVR RRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC
[0184]A completely synthetic HBcAg has been synthesized as well. See Nassal, M. Gene 66:279-294 (1988). The nucleotide and amino acid sequences are presented herein as SEQ ID NO 41:
TABLE-US-00048 ATGGATATCGATCCTTATAAAGAATTCGGAGCTACTGGGAGTTACTCTCG TTTCTCCCGAGTGACTTCTTTCCTTCAGTACGAGATCTTCTGGATACCGC CAGCGCGCTGTATCGGGAAGCCTTGGAGTCTCCTGAGCACTGCAGCCCTC ACCATACTGCCCTCAGGCAAGCAATTCTTTGCTGGGGGGAGCTCATGACT CTGGCCACGTGGGTGGGTGTTAACTTGGAAGATCCAGCTAGCAGGGACCT GGTAGTCAGTTATGTCAACACTAATATGGGTTTAAAGTTCAGGCAACTCT TGTGGTTTCACATTAGCTGCCTCACTTTCGGCCGAGAAACAGTTCTAGAA TATTTGGTGTCTTTCGGAGTGTGGATCCGCACTCCTCCAGCTTATAGGCC TCCGAATGCCCCTATCCTGTCGACACTCCCGGAGACTACTGTTGTTAGAC GTCGAGGCAGGTCACCTAGAAGAAGAACTCCTTCGCCTCGCAGGCGAAGG TCTCAATCGCCGCGGCGCCGAAGATCTCAATCTCGGGAATCTCAATGTTA GTGA
and SEQ ID NO:42:
TABLE-US-00049 [0185]MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSP HHTALRQAILCWGELMTLATWVGVNLEDPASRDLVVSYVNTNMGLKFRQL LWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVR RRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC
[0186]Chimaeric HBcAg particles comprising isolated IV proteins or variants, fragments or derivatives thereof are prepared by recombinant techniques well known to those of ordinary skill in the art. A polynucleotide, e.g., a plasmid, which carries the coding region for the HBcAg operably associated with a promoter is constructed. Convenient restriction sites are engineered into the coding region encoding the N-terminal, central, and/or C-terminal portions of the HBcAg, such that heterologous sequences may be inserted. A construct which expresses a HBcAg/IV fusion protein is prepared by inserting a DNA sequence encoding an IV protein or variant, fragment or derivative thereof, in frame, into a desired restriction site in the coding region of the HBcAg. The resulting construct is then inserted into a suitable host cell, e.g., E. coli, under conditions where the chimeric HBcAg will be expressed. The chimaeric HBcAg self-assembles into particles when expressed, and can then be isolated, e.g., by ultracentrifugation. The particles formed resemble the natural 27 nm HBcAg particles isolated from a hepatitis B virus, except that an isolated IV protein or fragment, variant, or derivative thereof is contained in the particle, preferably exposed on the outer particle surface.
[0187]The IV protein or fragment, variant, or derivative thereof expressed in a chimaeric HBcAg particle may be of any size which allows suitable particles of the chimeric HBcAg to self-assemble. As discussed above, even small antigenic epitopes may be immunogenic when expressed in the context of an immunogenic carrier, e.g., a HBcAg. Thus, HBcAg particles of the invention may comprise at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, or between about 15 to about 30 amino acids of an IV protein fragment of interest inserted therein. HBcAg particles of the invention may further comprise immunogenic or antigenic epitopes of at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acid residues of an IV protein fragment of interest inserted therein.
[0188]The immunodominant loop region of HBcAg was mapped to about amino acid residues 75 to 83, to about amino acids 75 to 85 or to about amino acids 130 to 140. See Colucci et al., J. Immunol. 141:4376-4380 (1988), and Salfeld et al. J. Virol. 63:798 (1989), which are incorporated by reference. A chimeric HBcAg is still often able to form core particles when foreign epitopes are cloned into the immunodominant loop. Thus, for example, amino acids of the IV protein fragment may be inserted into the sequence of HBcAg amino acids at various positions, for example, at the N-terminus, from about amino acid 75 to about amino acid 85, from about amino acid 75 to about amino acid 83, from about amino acid 130 to about amino acid 140, or at the C-terminus. Where amino acids of the IV protein fragment replace all or part of the native core protein sequence, the inserted IV sequence is generally not shorter, but may be longer, than the HBcAg sequence it replaces.
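The insertion positions described above can be illustrated at the amino acid level with the following Python sketch. The carrier string is SEQ ID NO:40 as presented herein; the function name `insert_epitope` and the 12-residue run of "X" standing in for an IV fragment are hypothetical placeholders chosen for the example, not a disclosed epitope. Positions are 1-based and inclusive, as in the text.

```python
# HBcAg subtype ayw, copied from SEQ ID NO:40 as presented herein.
HBC_AYW = (
    "MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSP"
    "HHTALRQAILCWGELMTLATWVGVNLEDPASRDLVVSYVNTNMGLKFRQL"
    "LWFHISCLTFGRETVIEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVR"
    "RRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC"
)


def insert_epitope(carrier, epitope, start, end=None):
    """Build a chimeric amino acid sequence.

    With `end` omitted, `epitope` is inserted immediately before 1-based
    residue `start`. With `end` given, residues start..end (inclusive) of
    the carrier are replaced by `epitope`.
    """
    if end is None:
        return carrier[:start - 1] + epitope + carrier[start - 1:]
    return carrier[:start - 1] + epitope + carrier[end:]


placeholder = "X" * 12  # hypothetical stand-in for an IV protein fragment

# Replace residues 75 to 83 of the immunodominant loop with the placeholder.
loop_chimera = insert_epitope(HBC_AYW, placeholder, 75, 83)

# Alternatively, add the placeholder at the N-terminus.
n_term_chimera = insert_epitope(HBC_AYW, placeholder, 1)

print(len(HBC_AYW), len(loop_chimera), len(n_term_chimera))
```

Here the 12-residue placeholder replaces 9 native residues, consistent with the statement that the inserted sequence is generally not shorter than the sequence it replaces. Whether a given chimera still assembles into particles is an experimental question that the sketch does not address.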
[0189]Alternatively, if particle formation is not desired, full-length IV coding sequences can be fused to the coding region for the HBcAg. The HBcAg sequences can be fused either at the N- or C-terminus of any of the Influenza antigens described herein, including the eM2-NP constructs. Fusions could include flexible protein linkers as described for NP-eM2 fusions above. Examples of IV coding sequences fused to the HBcAg coding sequence of SEQ ID NO:41 include an IAV NP-HBcAg fusion (SEQ ID NO:43),
TABLE-US-00050 ATGGCGTCTCAAGGCACCAAACGATCTTACGAACAGATGGAGACTGATG GAGAACGCCAGAATGCCACTGAAATCAGAGCATCCGTCGGAAAAATGAT TGGTGGAATTGGACGATTCTACATCCAAATGTGCACCGAACTCAAACTCA GTGATTATGAGGGACGGTTGATCCAAAACAGCTTAACAATAGAGAGAAT GGTGCTCTCTGCTTTTGACGAAAGGAGAAATAAATACCTTGAAGAACATC CCAGTGCGGGGAAAGATCCTAAGAAAACTGGAGGACCTATATACAGGAG AGTAAACGGAAAGTGGATGAGAGAACTCATCCTTTATGACAAAGAAGAA ATAAGGCGAATCTGGCGCCAAGCTAATAATGGTGACGATGCAACGGCTGG TCTGACTCACATGATGATCTGGCATTCCAATTTGAATGATGCAACTTATC AGAGGACAAGAGCTCTTGTTCGCACCGGAATGGATCCCAGGATGTGCTCT CTGATGCAAGGTTCAACTCTCCCTAGGAGGTCTGGAGCCGCAGGTGCTGC AGTCAAAGGAGTTGGAACAATGGTGATGGAATTGGTCAGAATGATCAAA CGTGGGATCAATGATCGGAACTTCTGGAGGGGTGAGAATGGACGAAAAA CAAGAATTGCTTATGAAAGAATGTGCAACATTCTCAAAGGGAAATTTCAA ACTGCTGCACAAAAAGCAATGATGGATCAAGTGAGAGAGAGCCGGAACC CAGGGAATGCTGAGTTCGAAGATCTCACTTTTCTAGCACGGTCTGCACTC ATATTGAGAGGGTCGGTTGCTCACAAGTCCTGCCTGCCTGCCTGTGTGTA TGGACCTGCCGTAGCCAGTGGGTACGACTTTGAAAGGGAGGGATACTCTC TAGTCGGAATAGACCCTTTCAGACTGCTTCAAAACAGCCAAGTGTACAGC CTAATCAGACCAAATGAGAATCCAGCACACAAGAGTCAACTGGTGTGGA TGGCATGCCATTCTGCCGCATTTGAAGATCTAAGAGTATTAAGCTTCATC AAAGGGACGAAGGTGCTCCCAAGAGGGAAGCTTTCCACTAGAGGAGTTC AAATTGCTTCCAATGAAAATATGGAGACTATGGAATCAAGTACACTTGAA CTGAGAAGCAGGTACTGGGCCATAAGGACCAGAAGTGGAGGAAACACCA ATCAACAGAGGGCATCTGCGGGCCAAATCAGCATACAACCTACGTTGTCA GTACAGAGAAATCTCCCTTTTGACAGAACAACCTTATGGCAGCATTCAG TGGGAATACAGAGGGGAGATGGCGTCTCAAGGCACCAAACGATCTTACG AACAGATGGAGACTGATGGAGAACGCCAGAATGCCACTGAAATCAGAGC ATCCGTCGGAAAAATGATTGGTGGAATTGGACGATTCTACATCCAAATGT GCACCGAACTCAAACTCAGTGATTATGAGGGACGGTTGATCCAAAACAG CTTAACAATAGAGAGAATGGTGCTCTCTGCTTTTGACGAAAGGAGAAATA AATACCTTGAAGAACATCCCAGTGCGGGGAAAGATCCTAAGAAAACTGG AGGACCTATATACAGGAGAGTAAACGGAAAGTGGATGAGAGAACTCATC CTTTATGACAAAGAAGAAATAAGGCGAATCTGGCGCCAAGCTAATAATG GTGACGATGCAACGGCTGGTCTGACTCACATGATGATCTGGCATTCCAAT TTGAATGATGCAACTTATCAGAGGACAAGAGCTCTTGTTCGCACCGGAAT GGATCCCAGGATGTGCTCTCTGATGCAAGGTTCAACTCTCCCTAGGAGGT CTGGAGCCGCAGGTGCTGCAGTCAAAGGAGTTGGAACAATGGTGATGGA ATTGGTCAGAATGATCAAACGTGGGATCAATGATCGGAACTTCTGGAGG 
GGTGAGAATGGACGAAAAACAAGAATTGCTTATGAAAGAATGTGCAACA TTCTCAAAGGGAAATTTCAAACTGCTGCACAAAAAGCAATGATGGATCA AGTGAGAGAGAGCCGGAACCCAGGGAATGCTGAGTTCGAAGATCTCACT TTTCTAGCACGGTCTGCACTCATATTGAGAGGGTCGGTTGCTCACAAGTC CTGCCTGCCTGCCTGTGTGTATGGACCTGCCGTAGCCAGTGGGTACGACT TTGAAAGGGAGGGATACTCTCTAGTCGGAATAGACCCTTTCAGACTGCTT CAAAACAGCCAAGTGTACAGCCTAATCAGACCAAATGAGAATCCAGCAC ACAAGAGTCAACTGGTGTGGATGGCATGCCATTCTGCCGCATTTGAAGAT CTAAGAGTATTAAGCTTCATCAAAGGGACGAAGGTGCTCCCAAGAGGGA AGCTTTCCACTAGAGGAGTTCAAATTGCTTCCAATGAAAATATGGAGACT ATGGAATCAAGTACACTTGAACTGAGAAGCAGGTACTGGGCCATAAGGA CCAGAAGTGGAGGAAACACCAATCAACAGAGGGCATCTGCGGGCCAAAT CAGCATACAACCTACGTTCTCAGTACAGAGAAATCTCCCTTTTGACAGAA CAACCGTTATGGCAGCATTCAGTGGGAATACAGAGGGGAGAACATCTGA CATGAGGACCGAAATCATAAGGATGATGGAAAGTGCAAGACCAGAAGAT GTGTCTTTCCAGGGGCGGGGAGTCTTCGAGCTCTCGGACGAAAAGGCAGC GAGCCCGATCGTGCCTTCCTTTGACATGAGTAATGAAGGATCTTATTTCT TCGGAGACAATGCAGAGGAATACGATAATATGGATATCGATCCTTATAAA GAATTCGGAGCTACTGTGGAGTTACTCTCGTTTCTCCCGAGTGACTTCTT TCCTTCAGTACGAGATCTTCTGGATACCGCCAGCGCGCTGTATCGGGAAG CCTTGGAGTCTCCTGAGCACTGCAGCCCTCACCATACTGCCCTCAGGCAA GCAATTCTTTGCTGGGGGGAGCTCATGACTCTGGCCACGTGGGTGGGTGT TAACTTGGAAGATCCAGCTAGCAGGGACCTGGTAGTCAGTTATGTCAACA CTAATATGGGTTTAAAGTTCAGGCAACTCTTGTGGTTTCACATTAGCTGC CTCACTTTCGGCCGAGAAACAGTTCTAGAATATTTGGTGTCTTTCGGAGT GTGGATCCGCACTCCTCCAGCTTATAGGCCTCCGAATGCCCCTATCCTGT CGACACTCCCGGAGACTACTGTTGTTAGACGTCGAGGCAGGTCACCTAGA AGAAGAACTCCTTCGCCTCGCAGGCGAAGGTCTCAATCGCCGCGGCGCCG AAGATCTCAATCTCGGGAATCTCAATGT
an IBV NP-HBcAg fusion (SEQ ID NO:44),
TABLE-US-00051 ATGTCCAACATGGATATTGACAGTATAAATACCGGAACAATCGATAAAA CACCAGAAGAACTGACTCCCGGAACCAGTGGGGCAACCAGACCAATCAT CAAGCCAGCAACCCTTGCTCCGCCAAGCAACAAACGAACCCGAAATCCA TCTCCAGAAAGGACAACCACAAGCAGTGAAACCGATATCGGAAGGAAAA TCCAAAAGAAACAAACCCCAACAGAGATAAAGAAGAGCGTCTACAAAAT GGTGGTAAAACTGGGTGAATTCTACAACCAGATGATGGTCAAAGCTGGA CTTAATGATGACATGGAAAGGAATCTAATTCAAAATGCACAAGCTGTGG AGAGAATCCTATTGGCTGCAACTGATGACAAGAAAACTGAATACCAAAA GAAAAGGAATGCCAGAGATGTCAAAGAAGGGAAGGAAGAAATAGACCA CAACAAGACAGGAGGCACCTTTTATAAGATGGTAAGAGATGATAAAACC ATCTACTTCAGCCCTATAAAAATTACCTTTTTAAAAGAAGAGGTGAAAAC AATGTACAAGACCACCATGGGGAGTGATGGTTTCAGTGGACTAAATCAC ATTATGATTGGACATTCACAGATGAACGATGTCTGTTTCCAAAGATCAAA GGGACTGAAAAGGGTTGGACTTGACCCTTCATTAATCAGTACTTTTGCCG GAAGCACACTACCCAGAAGATCAGGTACAACTGGTGTTGCAATCAAAGG AGGTGGAACTTTAGTGGATGAAGCCATCCGATTTATAGGAAGAGCAATG GCAGACAGAGGGCTACTGAGAGACATCAAGGCCAAGACGGCCTATGAAA AGATTCTTCTGAATCTGAAAAACAAGTGCTCTGCGCCGCAACAAAAGGCT CTAGTTGATCAAGTGATCGGAAGTAGGAACCCAGGGATTGCAGACATAG AAGACCTAACTCTGCTTGCCAGAAGCATGGTAGTTGTCAGACCCTCTGTA GCGAGCAAAGTGGTGCTTCCCATAAGCATTTATGCTAAAATACCTCAACT AGGATTCAATACCGAAGAATACTCTATGGTTGGGTATGAAGCCATGGCTC TTTATAATATGGCAACACCTGTTTCCATATTAAGAATGGGAGATGACGCA AAAGATAAATCTCAACTATTCTTCATGTCGTGCTTCGGAGCTGCCTATGA AGATCTAAGAGTGTTATCTGCACTAACGGGCACCGAATTTAAGCCTAGAT CAGCACTAAAATGCAAGGGTTTCCATGTCCCGGCTAAGGAGCAAGTAGA AGGAATGGGGGCAGCTCTGATGTCCATCAAGCTTCAGTTCTGGGCCCCAA TGACCAGATCTGGAGGGAATGAAGTAAGTGGAGAAGGAGGGTCTGGTCA AATAAGTTGCAGCCCTGTGTTTGCAGTAGAAAGACCTATTGCTCTAAGCA AGCAAGCTGTAAGAAGAATGCTGTCAATGAACGTTGAAGGACGTGATGC AGATGTCAAAGGAAATCTACTCAAAATGATGAATGATTCAATGGCAAAG AAAACCAGTGGAAATGCTTTCATTGGGAAGAAAATGTTTCAAATATCAGA CAAAAACAAAGTCAATCCCATTGAGATTCCAATTAAGCAGACCATCCCCA ATTTCTTCTTTGGGAGGGACACAGCAGAGGATTATGATGACCTCGATTAT ATGGATATCGATCCTTATAAAGAATTCGGAGCTACTGTGGAGTTACTCTC GTTTCTCCCGAGTGACTTCTTTCCTTCAGTACGAGATCTTCTGGATACCG CCAGCGCGCTGTATCGGGAAGCCTTGGAGTCTCCTGAGCACTGCAGCCCT CACCATACTGCCCTCAGGCAAGCAATTCTTTGCTGGGGGGAGCTCATGAC TCTGGCCACGTGGGTGGGTGTTAACTTGGAAGATCCAGCTAGCAGGGACC 
TGGTAGTCAGTTATGTCAACACTAATATGGGTTTAAAGTTCAGGCAACTC TTGTGGTTTCACATTAGCTGCCTCACTTTCGGCCGAGAAACAGTTCTAGA ATATTTGGTGTCTTTCGGAGTGTGGATCCGCACTCCTCCAGCTTATAGGC CTCCGAATGCCCCTATCCTGTCGACACTCCCGGAGACTACTGTTGTTAGA CGTCGAGGCAGGTCACCTAGAAGAAGAACTCCTTCGCCTCGCAGGCGAAG GTCTCAATCGCCGCGGCGCCGAAGATCTCAATCTCGGGAATCTCAATGTT
or an IAV M1-HBcAg fusion (SEQ ID NO:45),
TABLE-US-00052 ATGAGTCTTCTAACCGAGGTCGAAACGTACGTACTCTCTATCATCCCGTC AGGCCCCCTCAAAGCCGAGATCGCACAGAGACTTGAAGATGTCTTTGCAG GGAAGAACACTGATCTTGAGGTTCTCATGGAATGGCTAAAGACAAGACCA ATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCAC CGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCCC TTAATGGGAACGGGGATCCAAATAACATGGACAAAGCAGTTAAACTGTA TAGGAAGCTCAAGAGGGAGATAACATTCCATGGGGCCAAAGAAATCTCAC TCAGTTATTCTGCTGGTGCACTTGCCAGTTGTATGGGCCTCATATACAAC AGGATGGGGGCTGTGACCACTGAAGTGGCATTTGGCCTGGTATGTGCAAC CTGTGAACAGATTGCTGACTCCCAGCATCGGTCTCATAGGCAAATGGTGA CAACAACCAATCCACTAATCAGACATGAGAACAGAATGGTTTTAGCCAG CACTACAGCTAAGGCTATGGAGCAAATGGCTGGATCGAGTGAGCAAGCA GCAGAGGCCATGGAGGTTGCTAGTCAGGCTAGACAAATGGTGCAAGCGA TGAGAACCATTGGGACTCATCCTAGCTCCAGTGCTGGTCTGAAAAATGAT CTTCTTGAAAATTTGCAGGCCTATCAGAAACGAATGGGGGTGCAGATGCA ACGGTTCAAGATGGATATCGATCCTTATAAAGAATTCGGAGCTACTGTGG AGTTACTCTCGTTTCTCCCGAGTGACTTCTTTCCTTCAGTACGAGATCTT CTGGATACCGCCAGCGCGCTGTATCGGGAAGCCTTGGAGTCTCCTGAGCA CTGCAGCCCTCACCATAGTGCCCTCAGGCAAGCAATTCTTTGCTGGGGGG AGCTCATGACTCTGGCCACGTGGGTGGGTGTTAACTTGGAAGATCCAGCT AGCAGGGACCTGGTAGTCAGTTATGTCAACACTAATATGGGTTTAAAGTT CAGGCAACTCTTGTGGTTTCACATTAGCTGCCTCACTTTCGGCCGAGAAA CAGTTCTAGAATATTTGGTGTCTTTCGGAGTGTGGATCCGCACTCCTCCA GCTTATAGGCCTCCGAATGCCCCTATCCTGTCGACACTCCCGGAGACTAC TGTTGTTAGACGTCGAGGCAGGTCACCTAGAAGAAGAACTCCTTCGCCTC AGCGGCGAAGGTCTCAATCGCCGCGGCGCCGAAGATCTCAATCTCGGGAA TCTCAATGT
[0190]These fusion constructs could be codon optimized by any of the methods described.
[0191]The chimeric HBcAg can be used in the present invention in conjunction with a polynucleotide comprising a nucleic acid fragment, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide, or a fragment, variant, or derivative thereof, as an influenza vaccine for a vertebrate.
Methods and Administration
[0192]The present invention also provides methods for delivering an IV polypeptide or a fragment, variant, or derivative thereof to a human, which comprise administering to a human one or more of the compositions described herein; such that upon administration of compositions such as those described herein, an IV polypeptide or a fragment, variant, or derivative thereof is expressed in human cells, in an amount sufficient to generate an immune response to the IV or administering the IV polypeptide or a fragment, variant, or derivative thereof itself to the human in an amount sufficient to generate an immune response.
[0193]The present invention further provides methods for delivering an IV polypeptide or a fragment, variant, or derivative thereof to a human, which comprise administering to a vertebrate one or more of the compositions described herein; such that upon administration of compositions such as those described herein, an immune response is generated in the vertebrate.
[0194]The term "vertebrate" is intended to encompass a singular "vertebrate" as well as plural "vertebrates" and comprises mammals and birds, as well as fish, reptiles, and amphibians.
[0195]The term "mammal" is intended to encompass a singular "mammal" and plural "mammals," and includes, but is not limited to, humans; primates such as apes, monkeys (e.g., owl, squirrel, cebus, rhesus, African green, patas, cynomolgus, and cercopithecus), orangutans, baboons, gibbons, and chimpanzees; canids such as dogs and wolves; felids such as cats, lions, and tigers; equines such as horses, donkeys, and zebras; food animals such as cows, pigs, and sheep; ungulates such as deer and giraffes; ursids such as bears; and others such as rabbits, mice, ferrets, seals, and whales. In particular, the mammal can be a human subject, a food animal or a companion animal.
[0196]The term "bird" is intended to encompass a singular "bird" and plural "birds," and includes, but is not limited to feral water birds such as ducks, geese, terns, shearwaters, and gulls; as well as domestic avian species such as turkeys, chickens, quail, pheasants, geese, and ducks. The term "bird" also encompasses passerine birds such as starlings and budgerigars.
[0197]The present invention further provides a method for generating, enhancing or modulating an immune response to an IV comprising administering to a vertebrate one or more of the compositions described herein. In this method, the compositions may include one or more isolated polynucleotides comprising at least one nucleic acid fragment where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof. In another embodiment, the compositions may include both a polynucleotide as described above, and also an isolated IV polypeptide, or a fragment, variant, or derivative thereof, wherein the protein is provided as a recombinant protein, in particular, a fusion protein, a purified subunit, viral vector expressing the protein, or in the form of an inactivated IV vaccine. Thus, the latter compositions include both a polynucleotide encoding an IV polypeptide or a fragment, variant, or derivative thereof and an isolated IV polypeptide or a fragment, variant, or derivative thereof. The IV polypeptide or a fragment, variant, or derivative thereof encoded by the polynucleotide of the compositions need not be the same as the isolated IV polypeptide or a fragment, variant, or derivative thereof of the compositions. Compositions to be used according to this method may be univalent, bivalent, trivalent or multivalent.
[0198]The polynucleotides of the compositions may comprise a fragment of a human (or other vertebrate) codon-optimized coding region encoding a protein of the IV, or a fragment, variant, or derivative thereof. The polynucleotides are incorporated into the cells of the vertebrate in vivo, and an antigenic amount of the IV polypeptide, or fragment, variant, or derivative thereof, is produced in vivo. Upon administration of the composition according to this method, the IV polypeptide or a fragment, variant, or derivative thereof is expressed in the vertebrate in an amount sufficient to elicit an immune response. Such an immune response might be used, for example, to generate antibodies to the IV for use in diagnostic assays or as laboratory reagents, or as therapeutic or preventative vaccines as described herein.
[0199]The present invention further provides a method for generating, enhancing, or modulating a protective and/or therapeutic immune response to IV in a vertebrate, comprising administering to a vertebrate in need of therapeutic and/or preventative immunity one or more of the compositions described herein. In this method, the compositions include one or more polynucleotides comprising at least one nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof. In a further embodiment, the composition used in this method includes both an isolated polynucleotide comprising at least one nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof; and at least one isolated IV polypeptide, or a fragment, variant, or derivative thereof. Thus, the latter composition includes both an isolated polynucleotide encoding an IV polypeptide or a fragment, variant, or derivative thereof and an isolated IV polypeptide or a fragment, variant, or derivative thereof, for example, a recombinant protein, a purified subunit, viral vector expressing the protein, or an inactivated virus vaccine. Upon administration of the composition according to this method, the IV polypeptide or a fragment, variant, or derivative thereof is expressed in the human in a therapeutically or prophylactically effective amount.
[0200]As used herein, an "immune response" refers to the ability of a vertebrate to elicit an immune reaction to a composition delivered to that vertebrate. Examples of immune responses include an antibody response or a cellular, e.g., cytotoxic T-cell, response. One or more compositions of the present invention may be used to prevent influenza infection in vertebrates, e.g., as a prophylactic vaccine, to establish or enhance immunity to IV in a healthy individual prior to exposure to influenza or contraction of influenza disease, thus preventing the disease or reducing the severity of disease symptoms.
[0201]As mentioned above, compositions of the present invention can be used both to prevent IV infection, and also to therapeutically treat IV infection. In individuals already exposed to influenza, or already suffering from influenza disease, the present invention is used to further stimulate the immune system of the vertebrate, thus reducing or eliminating the symptoms associated with that disease or disorder. As defined herein, "treatment" refers to the use of one or more compositions of the present invention to prevent, cure, retard, or reduce the severity of influenza disease symptoms in a vertebrate, and/or result in no worsening of influenza disease over a specified period of time in a vertebrate which has already been exposed to IV and is thus in need of therapy. The term "prevention" refers to the use of one or more compositions of the present invention to generate immunity in a vertebrate which has not yet been exposed to a particular strain of IV, thereby preventing or reducing disease symptoms if the vertebrate is later exposed to the particular strain of IV. The methods of the present invention therefore may be referred to as therapeutic vaccination or preventative or prophylactic vaccination. It is not required that any composition of the present invention provide total immunity to influenza or totally cure or eliminate all influenza disease symptoms. As used herein, a "vertebrate in need of therapeutic and/or preventative immunity" refers to an individual for whom it is desirable to treat, i.e., to prevent, cure, retard, or reduce the severity of influenza disease symptoms, and/or result in no worsening of influenza disease over a specified period of time. 
Vertebrates to treat and/or vaccinate include humans, apes, monkeys (e.g., owl, squirrel, cebus, rhesus, African green, patas, cynomolgus, and cercopithecus), orangutans, baboons, gibbons, and chimpanzees, dogs, wolves, cats, lions, and tigers, horses, donkeys, zebras, cows, pigs, sheep, deer, giraffes, bears, rabbits, mice, ferrets, seals, whales, ducks, geese, terns, shearwaters, gulls, turkeys, chickens, quail, pheasants, geese, starlings and budgerigars.
[0202]One or more compositions of the present invention are utilized in a "prime boost" regimen. An example of a "prime boost" regimen may be found in Yang, Z. et al. J. Virol. 77:799-803 (2002), which is incorporated herein by reference in its entirety. In these embodiments, one or more polynucleotide vaccine compositions of the present invention are delivered to a vertebrate, thereby priming the immune response of the vertebrate to an IV, and then a second immunogenic composition is utilized as a boost vaccination. One or more compositions of the present invention are used to prime immunity, and then a second immunogenic composition, e.g., a recombinant viral vaccine or vaccines, a different polynucleotide vaccine, or one or more purified subunit isolated IV polypeptides or fragments, variants or derivatives thereof is used to boost the anti-IV immune response.
[0203]In one embodiment, a priming composition and a boosting composition are combined in a single composition or single formulation. For example, a single composition may comprise an isolated IV polypeptide or a fragment, variant, or derivative thereof as the priming component and a polynucleotide encoding an influenza protein as the boosting component. In this embodiment, the compositions may be contained in a single vial where the priming component and boosting component are mixed together. In general, because the peak levels of expression of protein from the polynucleotide do not occur until later (e.g., 7-10 days) after administration, the polynucleotide component may provide a boost to the isolated protein component. Compositions comprising both a priming component and a boosting component are referred to herein as "combinatorial vaccine compositions" or "single formulation heterologous prime-boost vaccine compositions." In addition, the priming composition may be administered before the boosting composition, or even after the boosting composition, if the boosting composition is expected to take longer to act.
[0204]In another embodiment, the priming composition may be administered simultaneously with the boosting composition, but in separate formulations where the priming component and the boosting component are separated.
[0205]The terms "priming" or "primary" and "boost" or "boosting" as used herein may refer to the initial and subsequent immunizations, respectively, i.e., in accordance with the definitions these terms normally have in immunology. However, in certain embodiments, e.g., where the priming component and boosting component are in a single formulation, initial and subsequent immunizations may not be necessary as both the "prime" and the "boost" compositions are administered simultaneously.
[0206]In certain embodiments, one or more compositions of the present invention are delivered to a vertebrate by methods described herein, thereby achieving an effective therapeutic and/or an effective preventative immune response. More specifically, the compositions of the present invention may be administered to any tissue of a vertebrate, including, but not limited to, muscle, skin, brain tissue, lung tissue, liver tissue, spleen tissue, bone marrow tissue, thymus tissue, heart tissue, e.g., myocardium, endocardium, and pericardium, lymph tissue, blood tissue, bone tissue, pancreas tissue, kidney tissue, gall bladder tissue, stomach tissue, intestinal tissue, testicular tissue, ovarian tissue, uterine tissue, vaginal tissue, rectal tissue, nervous system tissue, eye tissue, glandular tissue, tongue tissue, and connective tissue, e.g., cartilage.
[0207]Furthermore, the compositions of the present invention may be administered to any internal cavity of a vertebrate, including, but not limited to, the lungs, the mouth, the nasal cavity, the stomach, the peritoneal cavity, the intestine, any heart chamber, veins, arteries, capillaries, lymphatic cavities, the uterine cavity, the vaginal cavity, the rectal cavity, joint cavities, ventricles in brain, spinal canal in spinal cord, the ocular cavities, the lumen of a duct of a salivary gland or a liver. When the compositions of the present invention are administered to the lumen of a duct of a salivary gland or liver, the desired polypeptide is expressed in the salivary gland or the liver such that the polypeptide is delivered into the blood stream of the vertebrate from the salivary gland or the liver, respectively. Certain modes for administration to secretory organs of a gastrointestinal system using the salivary gland, liver and pancreas to release a desired polypeptide into the bloodstream are disclosed in U.S. Pat. Nos. 5,837,693 and 6,004,944, both of which are incorporated herein by reference in their entireties.
[0208]In certain embodiments, the compositions are administered into embryonated chicken eggs or by intra-muscular injection into the defeathered breast area of chicks as described in Kodihalli S. et al., Vaccine 18:2592-9 (2000), which is incorporated herein by reference in its entirety.
[0209]In certain embodiments, the compositions are administered to muscle, either skeletal muscle or cardiac muscle, or to lung tissue. Specific, but non-limiting modes for administration to lung tissue are disclosed in Wheeler, C. J., et al., Proc. Natl. Acad. Sci. USA 93:11454-11459 (1996), which is incorporated herein by reference in its entirety.
[0210]According to the disclosed methods, compositions of the present invention can be administered by intramuscular (i.m.), subcutaneous (s.c.), or intrapulmonary routes. Other suitable routes of administration include, but are not limited to, intratracheal, transdermal, intraocular, intranasal, inhalation, intracavity, intravenous (i.v.), intraductal (e.g., into the pancreas) and intraparenchymal (i.e., into any tissue) administration. Transdermal delivery includes, but is not limited to, intradermal (e.g., into the dermis or epidermis), transdermal (e.g., percutaneous) and transmucosal administration (i.e., into or through skin or mucosal tissue). Intracavity administration includes, but is not limited to, administration into oral, vaginal, rectal, nasal, peritoneal, or intestinal cavities as well as intrathecal (i.e., into spinal canal), intraventricular (i.e., into the brain ventricles or the heart ventricles), intraatrial (i.e., into the heart atrium) and sub arachnoid (i.e., into the sub arachnoid spaces of the brain) administration.
[0211]Any mode of administration can be used so long as the mode results in the expression of the desired peptide or protein, in the desired tissue, in an amount sufficient to generate an immune response to IV and/or to generate a prophylactically or therapeutically effective immune response to IV in a human in need of such response. Administration means of the present invention include needle injection, catheter infusion, biolistic injectors, particle accelerators (e.g., "gene guns" or pneumatic "needleless" injectors), Med-E-Jet (Vahlsing, H., et al., J. Immunol. Methods 171:11-22 (1994)), Pigjet (Schrijver, R., et al., Vaccine 15: 1908-1916 (1997)), Biojector (Davis, H., et al., Vaccine 12: 1503-1509 (1994); Gramzinski, R., et al., Mol. Med. 4: 109-118 (1998)), AdvantaJet (Linmayer, I., et al., Diabetes Care 9:294-297 (1986)), Medi-jector (Martins, J., and Roedl, E. J. Occup. Med. 21:821-824 (1979)), gelfoam sponge depots, other commercially available depot materials (e.g., hydrogels), osmotic pumps (e.g., Alza minipumps), oral or suppositorial solid (tablet or pill) pharmaceutical formulations, topical skin creams, and decanting, use of polynucleotide coated suture (Qin, Y., et al., Life Sciences 65: 2193-2203 (1999)) or topical applications during surgery. Certain modes of administration are intramuscular needle-based injection and pulmonary application via catheter infusion. Energy-assisted plasmid delivery (EAPD) methods may also be employed to administer the compositions of the invention. One such method involves the application of brief electrical pulses to injected tissues, a procedure commonly known as electroporation. See generally Mir, L. M. et al., Proc. Natl. Acad. Sci. USA 96:4262-7 (1999); Hartikka, J. et al., Mol. Ther. 4:407-15 (2001); Mathiesen, I., Gene Ther. 6:508-14 (1999); Rizzuto G. et al., Hum. Gen. Ther. 11:1891-900 (2000). Each of the references cited in this paragraph is incorporated herein by reference in its entirety.
[0212]Determining an effective amount of one or more compositions of the present invention depends upon a number of factors including, for example, the antigen being expressed or administered directly, e.g., HA, NA, NP, M1 or M2, or fragments, e.g., eM2, variants, or derivatives thereof, the age and weight of the subject, the precise condition requiring treatment and its severity, and the route of administration. Based on the above factors, determining the precise amount, number of doses, and timing of doses are within the ordinary skill in the art and will be readily determined by the attending physician or veterinarian.
[0213]Compositions of the present invention may include various salts, excipients, delivery vehicles and/or auxiliary agents as are disclosed, e.g., in U.S. Patent Application Publication No. 2002/0019358, published Feb. 14, 2002, which is incorporated herein by reference in its entirety.
[0214]Furthermore, compositions of the present invention may include one or more transfection facilitating compounds that facilitate delivery of polynucleotides to the interior of a cell, and/or to a desired location within a cell. As used herein, the terms "transfection facilitating compound," "transfection facilitating agent," and "transfection facilitating material" are synonymous, and may be used interchangeably. It should be noted that certain transfection facilitating compounds may also be "adjuvants" as described infra, i.e., in addition to facilitating delivery of polynucleotides to the interior of a cell, the compound acts to alter or increase the immune response to the antigen encoded by that polynucleotide. Examples of the transfection facilitating compounds include, but are not limited to, inorganic materials such as calcium phosphate, alum (aluminum sulfate), and gold particles (e.g., "powder" type delivery vehicles); peptides that are, for example, cationic, intercell targeting (for selective delivery to certain cell types), intracell targeting (for nuclear localization or endosomal escape), and amphipathic (helix forming or pore forming); proteins that are, for example, basic (e.g., positively charged) such as histones, targeting (e.g., asialoprotein), viral (e.g., Sendai virus coat protein), and pore-forming; lipids that are, for example, cationic (e.g., DMRIE, DOSPA, DC-Chol), basic (e.g., steryl amine), neutral (e.g., cholesterol), anionic (e.g., phosphatidyl serine), and zwitterionic (e.g., DOPE, DOPC); and polymers such as dendrimers, star-polymers, "homogenous" poly-amino acids (e.g., poly-lysine, poly-arginine), "heterogeneous" poly-amino acids (e.g., mixtures of lysine & glycine), co-polymers, polyvinylpyrrolidinone (PVP), poloxamers (e.g. CRL 1005) and polyethylene glycol (PEG). A transfection facilitating material can be used alone or in combination with one or more other transfection facilitating materials.
Two or more transfection facilitating materials can be combined by chemical bonding (e.g., covalent and ionic such as in lipidated polylysine, PEGylated polylysine) (Toncheva, et al., Biochim. Biophys. Acta 1380(3):354-368 (1988)), mechanical mixing (e.g., free moving materials in liquid or solid phase such as "polylysine+cationic lipids") (Gao and Huang, Biochemistry 35:1027-1036 (1996); Trubetskoy, et al., Biochim. Biophys. Acta 1131:311-313 (1992)), and aggregation (e.g., co-precipitation, gel forming such as in cationic lipids+poly-lactide, and polylysine+gelatin). Each of the references cited in this paragraph is incorporated herein by reference in its entirety.
[0215]One category of transfection facilitating materials is cationic lipids. Examples of cationic lipids are 5-carboxyspermylglycine dioctadecylamide (DOGS) and dipalmitoyl-phosphatidylethanolamine-5-carboxyspermylamide (DPPES). Cationic cholesterol derivatives are also useful, including {3β-[N-(N',N'-dimethylamino)ethane]-carbamoyl}-cholesterol (DC-Chol). Dimethyldioctadecyl-ammonium bromide (DDAB), N-(3-aminopropyl)-N,N-(bis-(2-tetradecyloxyethyl))-N-methyl-ammonium bromide (PA-DEMO), N-(3-aminopropyl)-N,N-(bis-(2-dodecyloxyethyl))-N-methyl-ammonium bromide (PA-DELO), N,N,N-tris-(2-dodecyloxy)ethyl-N-(3-amino)propyl-ammonium bromide (PA-TELO), and N1-(3-aminopropyl)((2-dodecyloxy)ethyl)-N2-(2-dodecyloxy)ethyl-1-piperazinaminium bromide (GA-LOE-BP) can also be employed in the present invention.
[0216]Non-diether cationic lipids, such as DL-1,2-dioleoyl-3-dimethylaminopropyl-β-hydroxyethylammonium (DORI diester), 1-O-oleyl-2-oleoyl-3-dimethylaminopropyl-β-hydroxyethylammonium (DORI ester/ether), and their salts promote in vivo gene delivery. In some embodiments, cationic lipids comprise groups attached via a heteroatom attached to the quaternary ammonium moiety in the head group. A glycyl spacer can connect the linker to the hydroxyl group.
[0217]Specific, but non-limiting cationic lipids for use in certain embodiments of the present invention include DMRIE ((±)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide), GAP-DMORIE ((±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradecenyloxy)-1-propanaminium bromide), and GAP-DLRIE ((±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-(bis-dodecyloxy)-1-propanaminium bromide).
[0218]Other specific but non-limiting cationic surfactants for use in certain embodiments of the present invention include Bn-DHRIE, DhxRIE, DhxRIE-OAc, DhxRIE-OBz and Pr-DOctRIE-OAc. These lipids are disclosed in copending U.S. patent application Ser. No. 10/725,015. In another aspect of the present invention, the cationic surfactant is Pr-DOctRIE-OAc.
[0219]Other cationic lipids include (±)-N,N-dimethyl-N-[2-(sperminecarboxamido)ethyl]-2,3-bis(dioleyloxy)-1-propaniminium pentahydrochloride (DOSPA), (±)-N-(2-aminoethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propaniminium bromide (β-aminoethyl-DMRIE or βAE-DMRIE) (Wheeler, et al., Biochim. Biophys. Acta 1280:1-11 (1996)), and (±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(dodecyloxy)-1-propaniminium bromide (GAP-DLRIE) (Wheeler, et al., Proc. Natl. Acad. Sci. USA 93:11454-11459 (1996)), which have been developed from DMRIE. Both of the references cited in this paragraph are incorporated herein by reference in their entirety.
[0220]Other examples of DMRIE-derived cationic lipids that are useful for the present invention are (±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-(bis-decyloxy)-1-propanaminium bromide (GAP-DDRIE), (±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-(bis-tetradecyloxy)-1-propanaminium bromide (GAP-DMRIE), (±)-N-((N''-methyl)-N'-ureyl)propyl-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide (GMU-DMRIE), (±)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(dodecyloxy)-1-propanaminium bromide (DLRIE), and (±)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis-([Z]-9-octadecenyloxy)propyl-1-propaniminium bromide (HP-DORIE).
[0221]In the embodiments where the immunogenic composition comprises a cationic lipid, the cationic lipid may be mixed with one or more co-lipids. For purposes of definition, the term "co-lipid" refers to any hydrophobic material which may be combined with the cationic lipid component and includes amphipathic lipids, such as phospholipids, and neutral lipids, such as cholesterol. Cationic lipids and co-lipids may be mixed or combined in a number of ways to produce a variety of non-covalently bonded macroscopic structures, including, for example, liposomes, multilamellar vesicles, unilamellar vesicles, micelles, and simple films. One non-limiting class of co-lipids are the zwitterionic phospholipids, which include the phosphatidylethanolamines and the phosphatidylcholines. Examples of phosphatidylethanolamines include DOPE, DMPE and DPyPE. In certain embodiments, the co-lipid is DPyPE, which comprises two phytanoyl substituents incorporated into the diacylphosphatidylethanolamine skeleton. In other embodiments, the co-lipid is DOPE, CAS name 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine.
[0222]When a composition of the present invention comprises a cationic lipid and co-lipid, the cationic lipid:co-lipid molar ratio may be from about 9:1 to about 1:9, from about 4:1 to about 1:4, from about 2:1 to about 1:2, or about 1:1.
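As an illustration only, the molar ratios recited above can be converted to mole fractions of the two lipid components. The following minimal Python sketch assumes a two-component mixture (cationic lipid plus a single co-lipid); the function name mole_fractions is hypothetical and is not part of the disclosure.

```python
from fractions import Fraction


def mole_fractions(cationic_parts, colipid_parts):
    """Convert a cationic lipid:co-lipid molar ratio into mole fractions.

    Assumes a two-component lipid mixture; both ratio parts must be positive.
    Returns (cationic fraction, co-lipid fraction) as exact Fractions.
    """
    if cationic_parts <= 0 or colipid_parts <= 0:
        raise ValueError("ratio parts must be positive")
    total = Fraction(cationic_parts) + Fraction(colipid_parts)
    return Fraction(cationic_parts) / total, Fraction(colipid_parts) / total


# Molar ratios spanning the ranges recited in the text (9:1 through 1:9).
for cationic, colipid in [(9, 1), (4, 1), (2, 1), (1, 1), (1, 2), (1, 4), (1, 9)]:
    x_cat, x_co = mole_fractions(cationic, colipid)
    print(f"{cationic}:{colipid} -> cationic {float(x_cat):.1%}, co-lipid {float(x_co):.1%}")
```

Under this assumption, a 9:1 ratio corresponds to a cationic lipid mole fraction of 9/10 and a 1:1 ratio to 1/2.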
[0223]In order to maximize homogeneity, the cationic lipid and co-lipid components may be dissolved in a solvent such as chloroform, followed by evaporation of the cationic lipid/co-lipid solution under vacuum to dryness as a film on the inner surface of a glass vessel (e.g., a Rotovap round-bottomed flask). Upon suspension in an aqueous solvent, the amphipathic lipid component molecules self-assemble into homogenous lipid vesicles. These lipid vesicles may subsequently be processed to have a selected mean diameter of uniform size prior to complexing with, for example, a codon-optimized polynucleotide of the present invention, according to methods known to those skilled in the art. For example, the sonication of a lipid solution is described in Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987) and in U.S. Pat. No. 5,264,618, the disclosures of which are incorporated herein by reference.
[0224]In those embodiments where the composition includes a cationic lipid, polynucleotides of the present invention are complexed with lipids by mixing, for example, a plasmid in aqueous solution with a solution of cationic lipid:co-lipid prepared as described herein. The concentration of each of the constituent solutions can be adjusted prior to mixing such that the desired final plasmid/cationic lipid:co-lipid ratio and the desired plasmid final concentration will be obtained upon mixing the two solutions. The cationic lipid:co-lipid mixtures are suitably prepared by hydrating a thin film of the mixed lipid materials in an appropriate volume of aqueous solvent by vortex mixing at ambient temperatures for about 1 minute. The thin films are prepared by admixing chloroform solutions of the individual components to afford a desired molar solute ratio followed by aliquoting the desired volume of the solutions into a suitable container. The solvent is removed by evaporation, first with a stream of dry, inert gas (e.g. argon) followed by high vacuum treatment.
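The adjustment of the constituent solutions before mixing can be illustrated with ordinary dilution arithmetic (stock concentration times stock volume equals final concentration times final volume). The sketch below is a non-limiting example only: the function name mixing_volumes and all numerical values are hypothetical, and each final concentration is assumed to be expressed in the same units as its corresponding stock.

```python
def mixing_volumes(final_volume, plasmid_final, plasmid_stock, lipid_final, lipid_stock):
    """Return (plasmid stock volume, lipid stock volume, diluent volume).

    Uses C_stock * V_stock = C_final * V_final for each component.
    Concentrations of a given component must share units (e.g., mg/mL for
    the plasmid, mM for the lipid); volumes come back in the units of
    final_volume.
    """
    v_plasmid = final_volume * plasmid_final / plasmid_stock
    v_lipid = final_volume * lipid_final / lipid_stock
    v_diluent = final_volume - v_plasmid - v_lipid
    if v_diluent < -1e-12:
        # The stocks are too dilute to reach both targets in this volume.
        raise ValueError("stock solutions are too dilute for the requested final concentrations")
    return v_plasmid, v_lipid, max(v_diluent, 0.0)


# Hypothetical values: 1.0 mL final volume, plasmid at 1 mg/mL from a 2 mg/mL
# stock, cationic lipid at 0.5 mM from a 1 mM stock.
v_plasmid, v_lipid, v_diluent = mixing_volumes(1.0, 1.0, 2.0, 0.5, 1.0)
print(f"plasmid stock: {v_plasmid} mL, lipid stock: {v_lipid} mL, diluent: {v_diluent} mL")
```

In this hypothetical case each stock is at twice its target concentration, so equal volumes of the two solutions are mixed and no additional diluent is needed.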
[0225]Other hydrophobic and amphiphilic additives, such as, for example, sterols, fatty acids, gangliosides, glycolipids, lipopeptides, liposaccharides, neobees, niosomes, prostaglandins and sphingolipids, may also be included in compositions of the present invention. In such compositions, these additives may be included in an amount between about 0.1 mol % and about 99.9 mol % (relative to total lipid), about 1-50 mol %, or about 2-25 mol %.
[0226]Additional embodiments of the present invention are drawn to compositions comprising an auxiliary agent which is administered before, after, or concurrently with the polynucleotide. As used herein, an "auxiliary agent" is a substance included in a composition for its ability to enhance, relative to a composition which is identical except for the inclusion of the auxiliary agent, the entry of polynucleotides into vertebrate cells in vivo, and/or the in vivo expression of polypeptides encoded by such polynucleotides. Certain auxiliary agents may, in addition to enhancing entry of polynucleotides into cells, enhance an immune response to an immunogen encoded by the polynucleotide. Auxiliary agents of the present invention include nonionic, anionic, cationic, or zwitterionic surfactants or detergents, with nonionic surfactants or detergents being preferred, chelators, DNase inhibitors, poloxamers, agents that aggregate or condense nucleic acids, emulsifying or solubilizing agents, wetting agents, gel-forming agents, and buffers.
[0227]Auxiliary agents for use in compositions of the present invention include, but are not limited to non-ionic detergents and surfactants IGEPAL CA 630®, NONIDET NP-40, Nonidet® P40, Tween-20®, Tween-80®, Pluronic® F68 (ave. MW: 8400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 80%), Pluronic F77® (ave. MW: 6600; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 70%), Pluronic P65® (ave. MW: 3400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 50%), Triton X100®, and Triton X-114®; the anionic detergent sodium dodecyl sulfate (SDS); the sugar stachyose; the condensing agent DMSO; and the chelator/DNAse inhibitor EDTA, CRL 1005 (12 kDa, 5% POE), and BAK (Benzalkonium chloride 50% solution, available from Ruger Chemical Co. Inc.). In certain specific embodiments, the auxiliary agent is DMSO, Nonidet P40, Pluronic F68® (ave. MW: 8400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 80%), Pluronic F77® (ave. MW: 6600; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 70%), Pluronic P65® (ave. MW: 3400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 50%), Pluronic L64® (ave. MW: 2900; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 40%), and Pluronic F108® (ave. MW: 14600; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 80%). See, e.g., U.S. Patent Application Publication No. 2002/0019358, published Feb. 14, 2002, which is incorporated herein by reference in its entirety.
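The three figures given for each Pluronic surfactant above are related: if the hydrophile accounts for the stated weight percentage of the average molecular weight, the remainder approximates the hydrophobe. The Python sketch below illustrates that relationship using the values recited in this paragraph; the relationship itself is an assumption made for illustration, and the function name estimated_hydrophobe_mw is hypothetical.

```python
# (average MW, approx. hydrophobe MW, approx. hydrophile wt %) as recited in the text.
PLURONICS = {
    "F68": (8400, 1800, 80),
    "F77": (6600, 2100, 70),
    "P65": (3400, 1800, 50),
    "L64": (2900, 1800, 40),
    "F108": (14600, 3000, 80),
}


def estimated_hydrophobe_mw(average_mw, hydrophile_wt_percent):
    """Estimate hydrophobe MW as the non-hydrophile share of the average MW.

    Assumption for illustration: hydrophobe MW ~= average MW * (1 - hydrophile fraction).
    """
    return average_mw * (100 - hydrophile_wt_percent) / 100


for name, (average_mw, listed_hydrophobe, hydrophile_pct) in PLURONICS.items():
    estimate = estimated_hydrophobe_mw(average_mw, hydrophile_pct)
    print(f"Pluronic {name}: listed hydrophobe ~{listed_hydrophobe}, estimated ~{estimate:.0f}")
```

Because the recited figures are approximate, the estimates are expected to agree with the listed hydrophobe values only roughly.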
[0228]Certain compositions of the present invention can further include one or more adjuvants before, after, or concurrently with the polynucleotide. The term "adjuvant" refers to any material having the ability to (1) alter or increase the immune response to a particular antigen or (2) increase or aid an effect of a pharmacological agent. It should be noted, with respect to polynucleotide vaccines, that an "adjuvant" can be a transfection facilitating material. Similarly, certain "transfection facilitating materials" described supra may also be an "adjuvant." An adjuvant may be used with a composition comprising a polynucleotide of the present invention. In a prime-boost regimen, as described herein, an adjuvant may be used with either the priming immunization, the booster immunization, or both. Suitable adjuvants include, but are not limited to, cytokines and growth factors; bacterial components (e.g., endotoxins, in particular superantigens, exotoxins and cell wall components); aluminum-based salts; calcium-based salts; silica; polynucleotides; toxoids; serum proteins, viruses and virally-derived materials, poisons, venoms, imidazoquinoline compounds, poloxamers, and cationic lipids.
[0229]A great variety of materials have been shown to have adjuvant activity through a variety of mechanisms. Any compound which may increase the expression, antigenicity or immunogenicity of the polypeptide is a potential adjuvant. The present invention provides an assay to screen for improved immune responses to potential adjuvants. Potential adjuvants which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to: inert carriers, such as alum, bentonite, latex, and acrylic particles; pluronic block polymers, such as TiterMax® (block copolymer CRL-8941, squalene (a metabolizable oil) and a microparticulate silica stabilizer); depot formers, such as Freund's adjuvant; surface active materials, such as saponin, lysolecithin, retinal, Quil A, liposomes, and pluronic polymer formulations; macrophage stimulators, such as bacterial lipopolysaccharide; alternate pathway complement activators, such as inulin, zymosan, endotoxin, and levamisole; and non-ionic surfactants, such as poloxamers and poly(oxyethylene)-poly(oxypropylene) tri-block copolymers. Also included as adjuvants are transfection-facilitating materials, such as those described above.
[0230]Poloxamers which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to, commercially available poloxamers such as Pluronic® surfactants, which are block copolymers of propylene oxide and ethylene oxide in which the propylene oxide block is sandwiched between two ethylene oxide blocks. Examples of Pluronic® surfactants include Pluronic® L121 (ave. MW: 4400; approx. MW of hydrophobe, 3600; approx. wt. % of hydrophile, 10%), Pluronic® L101 (ave. MW: 3800; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 10%), Pluronic® L81 (ave. MW: 2750; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 10%), Pluronic® L61 (ave. MW: 2000; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 10%), Pluronic® L31 (ave. MW: 1100; approx. MW of hydrophobe, 900; approx. wt. % of hydrophile, 10%), Pluronic® L122 (ave. MW: 5000; approx. MW of hydrophobe, 3600; approx. wt. % of hydrophile, 20%), Pluronic® L92 (ave. MW: 3650; approx. MW of hydrophobe, 2700; approx. wt. % of hydrophile, 20%), Pluronic® L72 (ave. MW: 2750; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 20%), Pluronic® L62 (ave. MW: 2500; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 20%), Pluronic® L42 (ave. MW: 1630; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 20%), Pluronic® L63 (ave. MW: 2650; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 30%), Pluronic® L43 (ave. MW: 1850; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 30%), Pluronic® L64 (ave. MW: 2900; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 40%), Pluronic® L44 (ave. MW: 2200; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 40%), Pluronic® L35 (ave. MW: 1900; approx. MW of hydrophobe, 900; approx. wt. % of hydrophile, 50%), Pluronic® P123 (ave. MW: 5750; approx. MW of hydrophobe, 3600; approx. wt. % of hydrophile, 30%), Pluronic® P103 (ave. MW: 4950; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 30%), Pluronic® P104 (ave. MW: 5900; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 40%), Pluronic® P84 (ave. MW: 4200; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 40%), Pluronic® P105 (ave. MW: 6500; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 50%), Pluronic® P85 (ave. MW: 4600; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 50%), Pluronic® P75 (ave. MW: 4150; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 50%), Pluronic® P65 (ave. MW: 3400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 50%), Pluronic® F127 (ave. MW: 12600; approx. MW of hydrophobe, 3600; approx. wt. % of hydrophile, 70%), Pluronic® F98 (ave. MW: 13000; approx. MW of hydrophobe, 2700; approx. wt. % of hydrophile, 80%), Pluronic® F87 (ave. MW: 7700; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 70%), Pluronic® F77 (ave. MW: 6600; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 70%), Pluronic® F108 (ave. MW: 14600; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 80%), Pluronic® F88 (ave. MW: 11400; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 80%), Pluronic® F68 (ave. MW: 8400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 80%), and Pluronic® F38 (ave. MW: 4700; approx. MW of hydrophobe, 900; approx. wt. % of hydrophile, 80%).
[0231]Reverse poloxamers which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to, Pluronic® R 31R1 (ave. MW: 3250; approx. MW of hydrophobe, 3100; approx. wt. % of hydrophile, 10%), Pluronic® R 25R1 (ave. MW: 2700; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 10%), Pluronic® R 17R1 (ave. MW: 1900; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 10%), Pluronic® R 31R2 (ave. MW: 3300; approx. MW of hydrophobe, 3100; approx. wt. % of hydrophile, 20%), Pluronic® R 25R2 (ave. MW: 3100; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 20%), Pluronic® R 17R2 (ave. MW: 2150; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 20%), Pluronic® R 12R3 (ave. MW: 1800; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 30%), Pluronic® R 31R4 (ave. MW: 4150; approx. MW of hydrophobe, 3100; approx. wt. % of hydrophile, 40%), Pluronic® R 25R4 (ave. MW: 3600; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 40%), Pluronic® R 22R4 (ave. MW: 3350; approx. MW of hydrophobe, 2200; approx. wt. % of hydrophile, 40%), Pluronic® R 17R4 (ave. MW: 3650; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 40%), Pluronic® R 25R5 (ave. MW: 4320; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 50%), Pluronic® R 10R5 (ave. MW: 1950; approx. MW of hydrophobe, 1000; approx. wt. % of hydrophile, 50%), Pluronic® R 25R8 (ave. MW: 8550; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 80%), Pluronic® R 17R8 (ave. MW: 7000; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 80%), and Pluronic® R 10R8 (ave. MW: 4550; approx. MW of hydrophobe, 1000; approx. wt. % of hydrophile, 80%).
[0232]Other commercially available poloxamers which may be screened for their ability to enhance the immune response according to the present invention include compounds that are block copolymers of polyethylene and polypropylene glycol such as Synperonic® L121 (ave. MW: 4400), Synperonic® L122 (ave. MW: 5000), Synperonic® P104 (ave. MW: 5850), Synperonic® P105 (ave. MW: 6500), Synperonic® P123 (ave. MW: 5750), Synperonic® P85 (ave. MW: 4600) and Synperonic® P94 (ave. MW: 4600), in which L indicates that the surfactants are liquids, P that they are pastes, the first digit is a measure of the molecular weight of the polypropylene portion of the surfactant and the last digit of the number, multiplied by 10, gives the percent ethylene oxide content of the surfactant; and compounds that are nonylphenyl polyethylene glycol such as Synperonic® NP10 (nonylphenol ethoxylated surfactant-10% solution), Synperonic® NP30 (condensate of 1 mole of nonylphenol with 30 moles of ethylene oxide) and Synperonic® NP5 (condensate of 1 mole of nonylphenol with 5.5 moles of ethylene oxide).
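The grade nomenclature described above can be illustrated with a short sketch. It decodes a grade such as "L121" or "P85" into its physical form and approximate ethylene oxide content, using the rule stated in the text (last digit multiplied by 10). Two points are assumptions not stated in the text: the factor of about 300 used to turn the leading digit(s) into an approximate hydrophobe molecular weight (chosen because it reproduces the values listed for the Pluronic® grades, e.g., 3600 for L121), and the reading of "F" as a flake (solid) form. The function name is hypothetical.

```python
import re

# Assumption (not stated in the text): leading digit(s) x 300 approximates the
# molecular weight of the hydrophobe; this reproduces the listed values.
HYDROPHOBE_MW_PER_UNIT = 300

# "L" and "P" are defined in the text; "F" = flake is an added assumption.
PHYSICAL_FORM = {"L": "liquid", "P": "paste", "F": "flake"}


def decode_poloxamer_grade(grade: str) -> dict:
    """Decode a grade such as 'L121' into form, hydrophobe MW and hydrophile %."""
    match = re.fullmatch(r"([LPF])(\d{1,2})(\d)", grade.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized poloxamer grade: {grade!r}")
    form_letter, mw_digits, eo_digit = match.groups()
    return {
        "form": PHYSICAL_FORM[form_letter],
        # Leading digit(s) give a measure of the hydrophobe molecular weight.
        "approx_hydrophobe_mw": int(mw_digits) * HYDROPHOBE_MW_PER_UNIT,
        # Last digit multiplied by 10 gives the percent ethylene oxide content.
        "approx_hydrophile_wt_pct": int(eo_digit) * 10,
    }


if __name__ == "__main__":
    for name in ("L121", "P85", "F68"):
        print(name, decode_poloxamer_grade(name))
```

The decoded values are only approximate and are meant as a reading aid for the lists above, not as a substitute for the manufacturer's specifications.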
[0233]Other poloxamers which may be screened for their ability to enhance the immune response according to the present invention include: (a) a polyether block copolymer comprising an A-type segment and a B-type segment, wherein the A-type segment comprises a linear polymeric segment of relatively hydrophilic character, the repeating units of which contribute an average Hansch-Leo fragmental constant of about -0.4 or less and have molecular weight contributions between about 30 and about 500, wherein the B-type segment comprises a linear polymeric segment of relatively hydrophobic character, the repeating units of which contribute an average Hansch-Leo fragmental constant of about -0.4 or more and have molecular weight contributions between about 30 and about 500, wherein at least about 80% of the linkages joining the repeating units for each of the polymeric segments comprise an ether linkage; (b) a block copolymer having a polyether segment and a polycation segment, wherein the polyether segment comprises at least an A-type block, and the polycation segment comprises a plurality of cationic repeating units; and (c) a polyether-polycation copolymer comprising a polymer, a polyether segment and a polycationic segment comprising a plurality of cationic repeating units of formula --NH--R0, wherein R0 is a straight chain aliphatic group of 2 to 6 carbon atoms, which may be substituted, wherein said polyether segments comprise at least one of an A-type or B-type segment. See U.S. Pat. No. 5,656,611, by Kabanov, et al., which is incorporated herein by reference in its entirety. Other poloxamers of interest include CRL1005 (12 kDa, 5% POE), CRL8300 (11 kDa, 5% POE), CRL2690 (12 kDa, 10% POE), CRL4505 (15 kDa, 5% POE) and CRL1415 (9 kDa, 10% POE).
[0234]Other auxiliary agents which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to, Acacia (gum arabic); the polyoxyethylene ether R--O--(C2H4O)x--H (BRIJ®), e.g., polyethylene glycol dodecyl ether (BRIJ® 35, x=23), polyethylene glycol dodecyl ether (BRIJ® 30, x=4), polyethylene glycol hexadecyl ether (BRIJ® 52, x=2), polyethylene glycol hexadecyl ether (BRIJ® 56, x=10), polyethylene glycol hexadecyl ether (BRIJ® 58P, x=20), polyethylene glycol octadecyl ether (BRIJ® 72, x=2), polyethylene glycol octadecyl ether (BRIJ® 76, x=10), polyethylene glycol octadecyl ether (BRIJ® 78P, x=20), polyethylene glycol oleyl ether (BRIJ® 92V, x=2), and polyoxyl 10 oleyl ether (BRIJ® 97, x=10); poly-D-glucosamine (chitosan); chlorbutanol; cholesterol; diethanolamine; digitonin; dimethylsulfoxide (DMSO); ethylenediamine tetraacetic acid (EDTA); glyceryl monostearate; lanolin alcohols; mono- and di-glycerides; monoethanolamine; nonylphenol polyoxyethylene ether (NP-40®); octylphenoxypolyethoxyethanol (NONIDET NP-40 from Amresco); ethyl phenol poly(ethylene glycol ether)n, n=11 (Nonidet® P40 from Roche); octyl phenol ethylene oxide condensate with about 9 ethylene oxide units (nonidet P40); IGEPAL CA 630® ((octyl phenoxy) polyethoxyethanol; structurally same as NONIDET NP-40); oleic acid; oleyl alcohol; polyethylene glycol 8000; polyoxyl 20 cetostearyl ether; polyoxyl 35 castor oil; polyoxyl 40 hydrogenated castor oil; polyoxyl 40 stearate; polyoxyethylene sorbitan monolaurate (polysorbate 20, or TWEEN-20®); polyoxyethylene sorbitan monooleate (polysorbate 80, or TWEEN-80®); propylene glycol diacetate; propylene glycol monostearate; protamine sulfate; proteolytic enzymes; sodium dodecyl sulfate (SDS); sodium monolaurate; sodium stearate; sorbitan derivatives (SPAN®), e.g., sorbitan monopalmitate (SPAN® 40), sorbitan monostearate (SPAN® 60), sorbitan tristearate (SPAN® 65), sorbitan monooleate (SPAN® 80), and sorbitan trioleate (SPAN® 85); 2,6,10,15,19,23-hexamethyl-2,6,10,14,18,22-tetracosa-hexaene (squalene); stachyose; stearic acid; sucrose; surfactin (lipopeptide antibiotic from Bacillus subtilis); dodecylpoly(ethyleneglycolether)9 (Thesit®) MW 582.9; octyl phenol ethylene oxide condensate with about 9-10 ethylene oxide units (Triton X-100®); octyl phenol ethylene oxide condensate with about 7-8 ethylene oxide units (Triton X-114®); tris(2-hydroxyethyl)amine (trolamine); and emulsifying wax.
[0235]In certain adjuvant compositions, the adjuvant is a cytokine. A composition of the present invention can comprise one or more cytokines, chemokines, or compounds that induce the production of cytokines and chemokines, or a polynucleotide encoding one or more cytokines, chemokines, or compounds that induce the production of cytokines and chemokines. Examples include, but are not limited to, granulocyte macrophage colony stimulating factor (GM-CSF), granulocyte colony stimulating factor (G-CSF), macrophage colony stimulating factor (M-CSF), colony stimulating factor (CSF), erythropoietin (EPO), interleukin 2 (IL-2), interleukin-3 (IL-3), interleukin 4 (IL-4), interleukin 5 (IL-5), interleukin 6 (IL-6), interleukin 7 (IL-7), interleukin 8 (IL-8), interleukin 10 (IL-10), interleukin 12 (IL-12), interleukin 15 (IL-15), interleukin 18 (IL-18), interferon alpha (IFNα), interferon beta (IFNβ), interferon gamma (IFNγ), interferon omega (IFNω), interferon tau (IFNτ), interferon gamma inducing factor I (IGIF), transforming growth factor beta (TGF-β), RANTES (regulated upon activation, normal T-cell expressed and presumably secreted), macrophage inflammatory proteins (e.g., MIP-1 alpha and MIP-1 beta), Leishmania elongation initiating factor (LEIF), and Flt-3 ligand.
[0236]In certain compositions of the present invention, the polynucleotide construct may be complexed with an adjuvant composition comprising (±)--N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradeceneyloxy)-1-propanaminium bromide (GAP-DMORIE). The composition may also comprise one or more co-lipids, e.g., 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), 1,2-diphytanoyl-sn-glycero-3-phosphoethanolamine (DPyPE), and/or 1,2-dimyristoyl-glycero-3-phosphoethanolamine (DMPE). An adjuvant composition comprising GAP-DMORIE and DPyPE at a 1:1 molar ratio is referred to herein as Vaxfectin®. See, e.g., PCT Publication No. WO 00/57917, which is incorporated herein by reference in its entirety.
[0237]In other embodiments, the polynucleotide itself may function as an adjuvant, as is the case when the polynucleotides of the invention are derived, in whole or in part, from bacterial DNA. Bacterial DNA containing motifs of unmethylated CpG-dinucleotides (CpG-DNA) triggers innate immune cells in vertebrates through a pattern recognition receptor (including toll receptors such as TLR 9) and thus possesses potent immunostimulatory effects on macrophages, dendritic cells and B-lymphocytes. See, e.g., Wagner, H., Curr. Opin. Microbiol. 5:62-69 (2002); Jung, J. et al., J. Immunol. 169:2368-73 (2002); see also Klinman, D. M. et al., Proc. Natl. Acad. Sci. U.S.A. 93:2879-83 (1996). Methods of using unmethylated CpG-dinucleotides as adjuvants are described in, for example, U.S. Pat. Nos. 6,207,646, 6,406,705 and 6,429,199, the disclosures of which are herein incorporated by reference.
[0238]The ability of an adjuvant to increase the immune response to an antigen is typically manifested by a significant increase in immune-mediated protection. For example, an increase in humoral immunity is typically manifested by a significant increase in the titer of antibodies raised to the antigen, and an increase in T-cell activity is typically manifested in increased cell proliferation, or cellular cytotoxicity, or cytokine secretion. An adjuvant may also alter an immune response, for example, by changing a primarily humoral or Th2 response into a primarily cellular, or Th1 response.
[0239]Nucleic acid molecules and/or polynucleotides of the present invention, e.g., plasmid DNA, mRNA, linear DNA or oligonucleotides, may be solubilized in any of various buffers. Suitable buffers include, for example, phosphate buffered saline (PBS), normal saline, Tris buffer, and sodium phosphate (e.g., 150 mM sodium phosphate). Insoluble polynucleotides may be solubilized in a weak acid or weak base, and then diluted to the desired volume with a buffer. The pH of the buffer may be adjusted as appropriate. In addition, a pharmaceutically acceptable additive can be used to provide an appropriate osmolarity. Such additives are within the purview of one skilled in the art. For aqueous compositions used in vivo, sterile pyrogen-free water can be used. Such formulations will contain an effective amount of a polynucleotide together with a suitable amount of an aqueous solution in order to prepare pharmaceutically acceptable compositions suitable for administration to a human.
[0240]Compositions of the present invention can be formulated according to known methods. Suitable preparation methods are described, for example, in Remington's Pharmaceutical Sciences, 16th Edition, A. Osol, ed., Mack Publishing Co., Easton, Pa. (1980), and Remington's Pharmaceutical Sciences, 19th Edition, A. R. Gennaro, ed., Mack Publishing Co., Easton, Pa. (1995), both of which are incorporated herein by reference in their entireties. Although the composition may be administered as an aqueous solution, it can also be formulated as an emulsion, gel, solution, suspension, lyophilized form, or any other form known in the art. In addition, the composition may contain pharmaceutically acceptable additives including, for example, diluents, binders, stabilizers, and preservatives.
[0241]The following examples are included for purposes of illustration only and are not intended to limit the scope of the present invention, which is defined by the appended claims. All references cited in the Examples are incorporated herein by reference in their entireties.
EXAMPLES
Materials and Methods
[0242]The following materials and methods apply generally to all the examples disclosed herein. Specific materials and methods are disclosed in each example, as necessary.
[0243]The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology (including PCR), vaccinology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, for example, Molecular Cloning: A Laboratory Manual, 2nd Ed., Sambrook et al., ed., Cold Spring Harbor Laboratory Press (1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al., U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.); Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); and Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md. (1989). Each of the references cited in this paragraph is incorporated herein by reference in its entirety.
Gene Construction
[0244]Constructs of the present invention are constructed based on the sequence information provided herein or in the art utilizing standard molecular biology techniques, including, but not limited to, the following. First, a series of complementary oligonucleotide pairs of 80-90 nucleotides each in length and spanning the length of the construct are synthesized by standard methods. These oligonucleotide pairs are synthesized such that upon annealing, they form double stranded fragments of 80-90 base pairs, containing cohesive ends. The single-stranded ends of each pair of oligonucleotides are designed to anneal with a single-stranded end of an adjacent oligonucleotide duplex. Several adjacent oligonucleotide pairs prepared in this manner are allowed to anneal, and approximately five to six adjacent oligonucleotide duplex fragments are then allowed to anneal together via the cohesive single stranded ends. This series of annealed oligonucleotide duplex fragments is then ligated together and cloned into a suitable plasmid, such as the TOPO® vector available from Invitrogen Corporation, Carlsbad, Calif. The construct is then sequenced by standard methods. Constructs prepared in this manner, comprising 5 to 6 adjacent 80 to 90 base pair fragments ligated together, i.e., fragments of about 500 base pairs, are prepared, such that the entire desired sequence of the construct is represented in a series of plasmid constructs. The inserts of these plasmids are then cut with appropriate restriction enzymes and ligated together to form the final construct. The final construct is then cloned into a standard bacterial cloning vector, and sequenced. The oligonucleotides and primers referred to herein can easily be designed by a person of skill in the art based on the sequence information provided herein and in the art, and such can be synthesized by any of a number of commercial nucleotide providers, for example Retrogen, San Diego, Calif., and GENEART, Regensburg, Germany.
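The oligonucleotide tiling described above can be sketched in silico as follows. This is a minimal illustration under stated assumptions, not the procedure actually used: the 8-nucleotide overhang, the even spacing of fragment boundaries, and all function names are hypothetical choices made for the example. Each top-strand oligonucleotide covers one fragment, and each bottom-strand oligonucleotide is shifted downstream by the overhang so that adjacent duplexes carry complementary single-stranded ends. Duplexes are then grouped in sets of up to six, corresponding to the intermediate clones of about 500 base pairs.

```python
import math

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_duplexes(seq: str, max_len: int = 90, overhang: int = 8) -> list:
    """Tile seq into duplexes of at most max_len bp with cohesive ends.

    The overhang length is an illustrative assumption; the text does not
    specify it.
    """
    n = math.ceil(len(seq) / max_len)
    # Spread fragment boundaries evenly so that no fragment exceeds max_len.
    bounds = [i * len(seq) // n for i in range(n + 1)]
    duplexes = []
    for start, end in zip(bounds, bounds[1:]):
        top = seq[start:end]
        # The bottom strand is shifted downstream by `overhang`, leaving a
        # single-stranded end that pairs with the next duplex's top strand.
        bottom_end = min(end + overhang, len(seq))
        bottom = reverse_complement(seq[start + overhang:bottom_end])
        duplexes.append({"start": start, "top": top, "bottom": bottom})
    return duplexes


def group_into_subclones(duplexes: list, per_clone: int = 6) -> list:
    """Group adjacent duplexes into sets that are ligated and cloned together."""
    return [duplexes[i:i + per_clone] for i in range(0, len(duplexes), per_clone)]
```

For a 1500-nucleotide coding region, this sketch yields 17 duplexes of 88 or 89 base pairs, grouped into three intermediate clones. Real designs would also account for melting temperatures and for the restriction sites used to join the intermediate clones, which are outside the scope of this illustration.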
Plasmid Vectors
[0245]Constructs of the present invention can be inserted, for example, into eukaryotic expression vectors VR1012 or VR10551. These vectors are built on a modified pUC18 background (see Yanisch-Perron, C., et al. Gene 33:103-119 (1985)), and contain a kanamycin resistance gene, the human cytomegalovirus immediate early promoter/enhancer and intron A, and the bovine growth hormone transcription termination signal, and a polylinker for inserting foreign genes. See Hartikka, J., et al., Hum. Gene Ther. 7:1205-1217 (1996). However, other standard commercially available eukaryotic expression vectors may be used in the present invention, including, but not limited to: plasmids pcDNA3, pHCMV/Zeo, pCR3.1, pEF1/His, pIND/GS, pRc/HCMV2, pSV40/Zeo2, pTRACER-HCMV, pUB6/V5-His, pVAX1, and pZeoSV2 (available from Invitrogen, San Diego, Calif.), and plasmid pCI (available from Promega, Madison, Wis.).
[0246]An optimized backbone plasmid, termed VR10551, has minor changes from the VR1012 backbone described above. The VR10551 vector is derived from and similar to VR1012 in that it uses the human cytomegalovirus immediate early (hCMV-IE) gene enhancer/promoter and 5' untranslated region (UTR), including the hCMV-IE Intron A. The changes from the VR1012 to the VR10551 include some modifications to the multiple cloning site, and a modified rabbit β-globin 3' untranslated region/polyadenylation signal sequence/transcriptional terminator has been substituted for the same functional domain derived from the bovine growth hormone gene.
[0247]Additionally, constructs of the present invention can be inserted into other eukaryotic expression vector backbones such as VR10682 or VR10686. The VR10682 expression vector backbone (SEQ ID NO:94) contains a modified Rous sarcoma virus (RSV) promoter from expression plasmid VCL1005, the bovine growth hormone (BGH) poly-adenylation site, a polylinker for inserting foreign genes, and a kanamycin resistance gene. The RSV promoter in VCL1005 and VR10682 contains an XbaI endonuclease restriction site near the transcription start site in the sequence TAC TCT AGA CG (SEQ ID NO:82). This XbaI site distinguishes the modified RSV promoter contained in VR10682 from the wild-type RSV sequence. Expression plasmid VCL1005 is described in U.S. Pat. No. 5,561,064, which is incorporated herein by reference.
[0248]The VR10686 expression vector backbone (SEQ ID NO:112) was created by replacing the West Nile Virus (WNV) antigen insert in VR6430 (SEQ ID NO:89) with the multiple cloning site from the VR1012 vector. The VR10686 and VR6430 expression vector backbones contain the RSV promoter, derived from VCL1005, which has been modified back to the wild-type RSV sequence (TAC AAT AAA CG (SEQ ID NO:83)). The wild-type RSV promoter is fused to the "R" region plus the first 39 nucleotides of the U5 region from Human T-Cell Leukemia Virus I (HTLV-I), hereinafter referred to as the RU5 element. The R and U5 regions are portions of the long terminal repeat (LTR) region of HTLV-I, which controls expression of the HTLV-I transcript and is duplicated at either end of the integrated viral genome as a result of the retroviral integration mechanism. The LTRs of HTLV-I and most retroviruses are divided into three regions: U3, R and U5. Transcription from the integrated viral genome commences at the U3-R boundary of the 5' LTR and the transcript is polyadenylated at the R-U5 boundary of the 3' LTR. (See Goff, S. P., Retroviridae, Fields Virology 4th ed. 2:1871-1939 (2001).) This RU5 HTLV-I element has been shown to be a potent stimulator of translation when fused to the SV40 early gene promoter. See Takebe et al., Mol. Cell. Biol. 8:466-472 (1988). It has been proposed that the stimulation of translation by the HTLV-I RU5 element is due to its function, in part, as a translation-enhancing internal ribosome entry site (IRES). See Attal et al., FEBS Letters 392:220-224 (1996). Additionally, the HTLV-I RU5 element provides the 5'-splice donor site. Immediately downstream of the RU5 element is the 3'-end of the HCMV intron A sequence containing the splice acceptor sequence.
The VR10686 and VR6430 expression vectors contain a hybrid intron composed of the 5'-HTLV-I intron sequence fused to the 3'-end of the HCMV intron A, a bovine growth hormone poly-adenylation site, a polylinker for insertion of foreign genes and a kanamycin resistance gene. The VR6430 vector expresses the prM and E West Nile Virus antigens (GenBank Accession No. AF202541).
[0249]The vector backbones described above may be used to create expression vectors which express multiple influenza proteins, fragments, variants or derivatives thereof. An expression vector as described herein may contain an additional promoter. For example, construct VR4774 (described in Example 13) contains a CMV promoter and an RSV promoter. Thus, the vector backbones described herein may contain multiple expression cassettes which comprise a promoter and an influenza coding sequence including, inter alia, polynucleotides as described herein. The expression cassettes may encode the same or different influenza polypeptides. Additionally, the expression cassettes may be in the same or opposite orientation relative to each other. As such, transcription from each cassette may be in the same or opposite direction (i.e., 5' to 3' in both expression cassettes or, alternatively, 5' to 3' in one expression cassette and 3' to 5' in the other expression cassette).
Plasmid DNA Purification
[0250]Plasmid DNA may be transformed into competent cells of an appropriate Escherichia coli strain (including but not limited to the DH5α strain) and highly purified covalently closed circular plasmid DNA was isolated by a modified lysis procedure (Horn, N. A., et al., Hum. Gene Ther. 6:565-573 (1995)) followed by standard double CsCl-ethidium bromide gradient ultracentrifugation (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989)). Alternatively, plasmid DNAs are purified using Giga columns from Qiagen (Valencia, Calif.) according to the kit instructions. All plasmid preparations were free of detectable chromosomal DNA, RNA and protein impurities based on gel analysis and the bicinchoninic protein assay (Pierce Chem. Co., Rockford Ill.). Endotoxin levels were measured using Limulus Amebocyte Lysate assay (LAL, Associates of Cape Cod, Falmouth, Mass.) and were less than 0.6 Endotoxin Units/mg of plasmid DNA. The spectrophotometric A260/A280 ratios of the DNA solutions were typically above 1.8. Plasmids were ethanol precipitated and resuspended in an appropriate solution, e.g., 150 mM sodium phosphate (for other appropriate excipients and auxiliary agents, see U.S. Patent Application Publication 2002/0019358, published Feb. 14, 2002). DNA was stored at -20° C. until use. DNA was diluted by mixing it with 300 mM salt solutions and by adding an appropriate amount of USP water to obtain 1 mg/ml plasmid DNA in the desired salt at the desired molar concentration.
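The final dilution step can be worked through with a short calculation. The sketch below computes the volumes of DNA stock, 300 mM salt solution and water needed to reach 1 mg/ml plasmid DNA at a chosen salt concentration. It assumes, for simplicity, that the DNA stock itself contributes no salt (for example, a stock in water); this assumption, the example numbers and the function name are illustrative and are not taken from the text.

```python
def dilution_volumes(dna_stock_mg_per_ml: float,
                     final_volume_ml: float,
                     target_salt_mM: float,
                     salt_stock_mM: float = 300.0,
                     target_dna_mg_per_ml: float = 1.0) -> dict:
    """Volumes (ml) of DNA stock, salt stock and water for the final dilution.

    Assumes the DNA stock contains no salt (illustrative simplification).
    """
    if min(dna_stock_mg_per_ml, final_volume_ml, salt_stock_mM) <= 0:
        raise ValueError("stock concentrations and final volume must be positive")
    # C1 * V1 = C2 * V2 for each component.
    dna_ml = final_volume_ml * target_dna_mg_per_ml / dna_stock_mg_per_ml
    salt_ml = final_volume_ml * target_salt_mM / salt_stock_mM
    water_ml = final_volume_ml - dna_ml - salt_ml
    if water_ml < -1e-12:
        raise ValueError("targets cannot be reached with these stock solutions")
    return {"dna_ml": dna_ml, "salt_ml": salt_ml, "water_ml": max(water_ml, 0.0)}


if __name__ == "__main__":
    # Hypothetical example: 5 mg/ml DNA stock, 1 ml final volume, 150 mM salt.
    print(dilution_volumes(5.0, 1.0, 150.0))
```

With a 5 mg/ml stock and a 150 mM target, the calculation gives approximately 0.2 ml of DNA stock, 0.5 ml of 300 mM salt solution and 0.3 ml of water per ml of final formulation.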
Plasmid Expression in Mammalian Cell Lines
[0251]The expression plasmids were analyzed in vitro by transfecting the plasmids into a well characterized mouse melanoma cell line (VM-92, also known as UM-449). See, e.g., Wheeler, C. J., Sukhu, L., Yang, G., Tsai, Y., Bustamente, C., Felgner, P., Norman, J. & Manthorpe, M. "Converting an Alcohol to an Amine in a Cationic Lipid Dramatically Alters the Co-lipid Requirement, Cellular Transfection Activity and the Ultrastructure of DNA-Cytofectin Complexes," Biochim. Biophys. Acta. 1280:1-11 (1996). Other well-characterized human cell lines can also be used, e.g., MRC-5 cells (ATCC Accession No. CCL-171) or human rhabdomyosarcoma cell line RD (ATCC CCL-136). The transfection was performed using cationic lipid-based transfection procedures well known to those of skill in the art. Other transfection procedures are well known in the art and may be used, for example electroporation and calcium chloride-mediated transfection (Graham, F. L. and A. J. van der Eb, Virology 52:456-67 (1973)). Following transfection, cell lysates and culture supernatants of transfected cells were evaluated to compare relative levels of expression of IV antigen proteins. The samples were assayed by western blots and ELISAs, using commercially available polyclonal and/or monoclonal antibodies (available, e.g., from Research Diagnostics Inc., Flanders N.J.), so as to compare both the quality and the quantity of expressed antigen.
Injections of Plasmid DNA
[0252]The quadriceps muscles of restrained awake mice (e.g., female 6-12 week old BALB/c mice from Harlan Sprague Dawley, Indianapolis, Ind.) are injected bilaterally with 1-50 μg of DNA in 50 μl solution (100 μg in 100 μl total per mouse) using a disposable elastic insulin syringe and 28 G 1/2 needle (Becton-Dickinson, Franklin Lakes, N.J., Cat. No. 329430) fitted with a plastic collar cut from a micropipette tip, as previously described (Hartikka, J., et al., Hum. Gene Ther. 7:1205-1217 (1996)).
[0253]Animal care throughout the study was in compliance with the "Guide for the Care and Use of Laboratory Animals", Institute of Laboratory Animal Resources, Commission on Life Sciences, National Research Council, National Academy Press, Washington, D.C., 1996, as well as with the requirements of Vical's Institutional Animal Care and Use Committee.
Example 1
Construction of Expression Vectors
[0254]Plasmid constructs comprising the native coding regions encoding the IV proteins NP, M1, M2, HA, and eM2, or fragments, variants or derivatives thereof, are constructed as follows. The NP, M1, and M2 genes from IV (A/PR/8/34) are isolated from viral RNA by RT-PCR, or prepared by direct synthesis if the wild-type sequence is known, by standard methods and are inserted into the vector VR10551 via standard restriction sites, by standard methods.
[0255]Plasmid constructs comprising human codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or other codon-optimized coding regions encoding other IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, are prepared as follows. The codon-optimized coding regions are generated using the full, minimal, or uniform codon optimization methods described herein. The codon optimized coding regions are constructed using standard PCR methods described herein, or are ordered commercially. Oligonucleotides representing about the first 23-24 aa extracellular region of M2 are constructed, and are used in an overlap PCR reaction with the NP coding regions described above, to create a coding region coding for an eM2/NP fusion protein, for example as shown in SEQ ID NOs 6 and 7. The codon-optimized coding regions are inserted into the vector VR10551 via standard restriction sites, by standard methods.
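The eM2/NP fusion described above can be pictured at the sequence level with a brief sketch. It joins the codons for the first 24 residues of an M2 coding region, in frame, to an NP coding region. This is only an in silico illustration of the resulting coding region, not of the overlap PCR itself; the function name is hypothetical, the NP start codon is retained as an assumption, and any sequences passed to it in testing are placeholders rather than influenza sequences.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_em2_to_np(m2_cds: str, np_cds: str, em2_residues: int = 24) -> str:
    """Join the first em2_residues codons of M2 in frame to an NP coding region.

    Assumption: the NP coding region is kept intact, including its ATG.
    """
    m2_cds = m2_cds.upper()
    np_cds = np_cds.upper()
    n_nt = 3 * em2_residues
    if len(m2_cds) < n_nt:
        raise ValueError("M2 coding region is shorter than the requested eM2 length")
    if len(np_cds) % 3 != 0:
        raise ValueError("NP coding region length is not a multiple of three")
    em2 = m2_cds[:n_nt]
    # A stop codon inside the eM2 portion would truncate the fusion protein.
    codons = [em2[i:i + 3] for i in range(0, n_nt, 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("eM2 portion contains an in-frame stop codon")
    return em2 + np_cds
```

The same function can be applied to wild-type or codon-optimized coding regions, since it operates only on codon boundaries.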
[0256]Plasmids constructed as above are propagated in Escherichia coli and purified by the alkaline lysis method (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N. Y., ed. 2 (1989)). CsCl-banded DNA is ethanol precipitated and resuspended in 0.9% saline or PBS to a final concentration of 2 mg/ml for injection. Alternatively, plasmids are purified using any of a variety of commercial kits, or by other known procedures involving differential precipitation and/or chromatographic purification.
[0257]Expression is tested by formulating each of the plasmids in DMRIE/DOPE and transfecting VM92 cells. The supernatants are collected and the protein production is tested by Western blot or ELISA. The relative expression of the wild type and codon optimized constructs is compared.
[0258]Examples of constructs made according to the above methods are listed in Table 13. The experimental procedure for generating the listed constructs is as described above, with particular parameters and materials employed as described herein.
TABLE-US-00053 TABLE 13
Plasmid #    Description
VR4700    TPA leader-NP (A/PR/34) in VR 1255
VR4707    TPA leader-M2 with transmembrane deletion, glycine linker inserted
VR4710    TPA leader-1st 24 amino acids of M2 from VR4707 fused to NP from VR4700
VR4750    full length HA from mouse adapted virus (H3, Hong Kong 68)
VR4752    full length HA from mouse adapted virus (H1, Puerto Rico 34)
VR4755    algorithm to codon optimize consensus amino acid sequence, direct fusion M2 to ATG of M1
VR4756    native sequence from A/Niigata/137/96 influenza strain (matches amino acid consensus sequence)
VR4757    Contracted codon optimized-1st 24 amino acids of M2 from consensus fused to full-length NP consensus
VR4758    Applicants' codon optimized-1st 24 amino acids of M2 from consensus fused to full-length NP consensus
VR4759    Full-length M2 derived from VR4755
VR4760    Full-length M1 derived from VR4755
VR4761    Full-length NP derived from VR4757
VR4762    Full-length NP derived from VR4758
VR4763    Selectively codon-optimized regions of segment 7
[0259]The pDNA expression vector VR4700, which encodes the influenza NP protein, has been described in the art. See, e.g., Sankar, V., Baccaglini, L., Sawdey, M., Wheeler, C. J., Pillemer, S. R., Baum, B. J. and Atkinson, J. C., "Salivary Gland Delivery of pDNA-Cationic Lipoplexes Elicits Systemic Immune Responses," Oral Diseases 8:275-281 (2002). The following is the open reading frame for TPA-NP (from VR4700), referred to herein as SEQ ID NO:46:
TABLE-US-00054 1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 61 tcgcccagcg ctagaggatc gggaatggcg tcccaaggca ccaaacggtc ttacgaacag 121 atggagactg atggagaacg ccagaatgcc actgaaatca gagcatccgt cggaaaaatg 181 attggtggaa ttggacgatt ctacatccaa atgtgcaccg aactcaaact cagtgattat 241 gagggacggt tgatccaaaa cagcttaaca atagagagaa tggtgctctc tgcttttgac 301 gaaaggagaa ataaatacct ggaagaacat cccagtgcgg ggaaagatcc taagaaaact 361 ggaggaccta tatacaggag agtaaacgga aagtggatga gagaactcat cctttatgac 421 aaagaagaaa taaggcgaat ctggcgccaa gctaataatg gtgacgatgc aacggctggt 481 ctgactcaca tgatgatctg gcattccaat ttgaatgatg caacttatca gaggacaaga 541 gctcttgttc gcaccggaat ggatcccagg atgtgctctc tgatgcaagg ttcaactctc 601 cctaggaggt ctggagccgc aggtgctgca gtcaaaggag ttggaacaat ggtgatggaa 661 ttggtcagga tgatcaaacg tgggatcaat gatcggaact tctggagggg tgagaatgga 721 cgaaaaacaa gaattgctta tgaaagaatg tgcaacattc tcaaagggaa atttcaaact 781 gctgcacaaa aagcaatgat ggatcaagtg agagagagcc ggaacccagg gaatgctgag 841 ttcgaagatc tcacttttct agcacggtct gcactcatat tgagagggtc ggttgctcac 901 aagtcctgcc tgcctgcctg tgtgtatgga cctgccgtag ccagtgggta cgactttgaa 961 agagagggat actctctagt cggaatagac cctttcagac tgcttcaaaa cagccaagtg 1021 tacagcctaa tcagaccaaa tgagaatcca gcacacaaga gtcaactggt gtggatggca 1081 tgccattctg ccgcatttga agatctaaga gtattaagct tcatcaaagg gacgaaggtg 1141 ctcccaagag ggaagctttc cactagagga gttcaaattg cttccaatga aaatatggag 1201 actatggaat caagtacact tgaactgaga agcaggtact gggccataag gaccagaagt 1261 ggaggaaaca ccaatcaaca gagggcatct gcgggccaaa tcagcataca acctacgttc 1321 tcagtacaga gaaatctccc ttttgacaga acaaccatta tggcagcatt caatgggaat 1381 acagagggaa gaacatctga catgaggacc gaaatcataa ggatgatgga aagtgcaaga 1441 ccagaagatg tgtctttcca ggggcgggga gtcttcgagc tctcggacga aaaggcagcg 1501 agcccgatcg tgccttcctt tgacatgagt aatgaaggat cttatttctt cggagacaat 1561 gcagatgagt acgacaatta a
[0260]Purified VR4700 DNA was used to transfect the murine cell line VM92 to determine expression of the NP protein. Expression of NP was confirmed with a Western Blot assay. Western blot analysis showed very low level expression of VR4700 in vitro as detected with mouse polyclonal anti-NP antibody. In vivo antibody response was detected by ELISA with an average titer of 62,578.
[0261]Plasmid VR4707 expresses a secreted form of M2, i.e., TPA-M2. The sequence was assembled using synthetic oligonucleotides in which the oligos were annealed amongst themselves, and then ligated and gel purified. The purified product was then ligated (cloned) into Eco RI/Sal I of VR10551. The M2 sequence lacks the transmembrane domain; the cloned sequence contains amino acids [TPA(1-23)]ARGSG[M2(1-25)]GGG[M2(44-97)]. Amino acid residues between TPA and M2 and between M2 domains were added as flexible linkers. The following mutations were introduced to generate appropriate T-cell epitopes: 74S->G and 78S->N. The following is the open reading frame for TPA-M2ΔTM (from VR4707), referred to herein as SEQ ID NO:47:
TABLE-US-00055 1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 61 tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac ccctatcaga 121 aacgaatggg ggtgcagatg caacgattca agtgatcctg gcggcggcga tcggcttttt 181 ttcaaatgca tttatcggcg ctttaaatac ggcttgaaaa gagggccttc taccgaagga 241 gtgccagagt ctatgaggga agaatatcgg aaggaacagc agaatgctgt ggatgttgac 301 gatagccatt ttgtcagcat cgagctggag taa
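As a consistency check on the domain layout stated above for VR4707, the expected length of the open reading frame can be computed from the residue ranges. The following is a minimal sketch only; the names `LAYOUT` and `orf_length_nt` are hypothetical and are introduced here for illustration, and the residue counts are simply those implied by [TPA(1-23)]ARGSG[M2(1-25)]GGG[M2(44-97)].

```python
# Illustrative sketch: names are hypothetical; residue ranges come from the text.
LAYOUT = [
    ("TPA leader (1-23)", 23),
    ("linker ARGSG", 5),
    ("M2 (1-25)", 25),
    ("linker GGG", 3),
    ("M2 (44-97)", 97 - 44 + 1),
]


def orf_length_nt(layout, stop_codons=1):
    """Return the ORF length in nucleotides for a list of (label, residue_count)."""
    residues = sum(count for _, count in layout)
    return 3 * (residues + stop_codons)


if __name__ == "__main__":
    print(orf_length_nt(LAYOUT))
```

Under these assumptions the layout corresponds to 110 residues plus one stop codon, i.e., 333 nucleotides, which agrees with the length of the listing for SEQ ID NO:47.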
[0262]Purified VR4707 DNA was used to transfect the murine cell line VM92 to determine expression of the M2 protein. Expression of M2 was confirmed with a Western Blot assay. Expression was visualized with a commercially available anti-M2 monoclonal antibody. In vivo M2 antibody response to VR4707, as assayed by ELISA, resulted in an average titer of 110, which is lower than the average titer of 9,240 for VR4756, encoding full-length M2 from segment 7. An IFNγ ELISPOT assay for M2-specific T cells resulted in an average of 61 SFU/10^6 cells versus an average of 121 SFU/10^6 cells for the segment 7 construct.
[0263]VR4710 was created by fusing the TPA leader and the first 24 amino acids of M2 from VR4707 to the full-length NP gene from VR4700. Primers 5'-GCCGAATCCATGGATGCAATGAAG-3' (SEQ ID NO:48) and 5'-GGTGCCTTGGGACGCCATATCACTTGAATCGTTGCA-3' (SEQ ID NO:49) were used to amplify the TPA-M2 fragment from VR4707. Primers 5'-TGCAACGATTCAAGTGATATGGCGTCCCAAGGCACC-3' (SEQ ID NO:50) and 5'-GCCGTCGACTTAATTGTCGTACTC-3' (SEQ ID NO:51) were used to amplify the NP gene from VR4700. Then the N-terminal and C-terminal primers were used to assemble the fusion, and the eM2NP fusion was cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for TPA-M2-NP (from VR4710), referred to herein as SEQ ID NO:52:
TABLE-US-00056 1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 61 tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac ccctatcaga 121 aacgaatggg ggtgcagatg caacgattca agtgatatgg cgtcccaagg caccaaacgg 181 tcttacgaac agatggagac tgatggagaa cgccagaatg ccactgaaat cagagcatcc 241 gtcggaaaaa tgattggtgg aattggacga ttctacatcc aaatgtgcac cgaactcaaa 301 ctcagtgatt atgagggacg gttgatccaa aacagcttaa caatagagag aatggtgctc 361 tctgcttttg acgaaaggag aaataaatac ctggaagaac atcccagtgc ggggaaagat 421 cctaagaaaa ctggaggacc tatatacagg agagtaaacg gaaagtggat gagagaactc 481 atcctttatg acaaagaaga aataaggcga atctggcgcc aagctaataa tggtgacgat 541 gcaacggctg gtctgactca catgatgatc tggcattcca atttgaatga tgcaacttat 601 cagaggacaa gagctcttgt tcgcaccgga atggatccca ggatgtgctc tctgatgcaa 661 ggttcaactc tccctaggag gtctggagcc gcaggtgctg cagtcaaagg agttggaaca 721 atggtgatgg aattggtcag gatgatcaaa cgtgggatca atgatcggaa cttctggagg 781 ggtgagaatg gacgaaaaac aagaattgct tatgaaagaa tgtgcaacat tctcaaaggg 841 aaatttcaaa ctgctgcaca aaaagcaatg atggatcaag tgagagagag ccggaaccca 901 gggaatgctg agttcgaaga tctcactttt ctagcacggt ctgcactcat attgagaggg 961 tcggttgctc acaagtcctg cctgcctgcc tgtgtgtatg gacctgccgt agccagtggg 1021 tacgactttg aaagagaggg atactctcta gtcggaatag accctttcag actgcttcaa 1081 aacagccaag tgtacagcct aatcagacca aatgagaatc cagcacacaa gagtcaactg 1141 gtgtggatgg catgccattc tgccgcattt gaagatctaa gagtattaag cttcatcaaa 1201 gggacgaagg tgctcccaag agggaagctt tccactagag gagttcaaat tgcttccaat 1261 gaaaatatgg agactatgga atcaagtaca cttgaactga gaagcaggta ctgggccata 1321 aggaccagaa gtggaggaaa caccaatcaa cagagggcat ctgcgggcca aatcagcata 1381 caacctacgt tctcagtaca gagaaatctc ccttttgaca gaacaaccat tatggcagca 1441 ttcaatggga atacagaggg aagaacatct gacatgagga ccgaaatcat aaggatgatg 1501 gaaagtgcaa gaccagaaga tgtgtctttc caggggcggg gagtcttcga gctctcggac 1561 gaaaaggcag cgagcccgat cgtgccttcc tttgacatga gtaatgaagg atcttatttc 1621 ttcggagaca atgcagatga gtacgacaat taa
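The overlap assembly described above for VR4710 relies on the two internal primers (SEQ ID NO:49 and SEQ ID NO:50) being complementary to one another, so that the TPA-M2 and NP amplicons can anneal at the fusion junction. The following sketch checks that relationship. The helper `reverse_complement` and the constant names are hypothetical; the primer sequences are copied from the text.

```python
# Illustrative sketch: helper and constant names are hypothetical.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


# Internal (junction) primers, written 5' to 3' as given in the text.
SEQ_ID_NO_49 = "GGTGCCTTGGGACGCCATATCACTTGAATCGTTGCA"  # reverse primer, TPA-M2 fragment
SEQ_ID_NO_50 = "TGCAACGATTCAAGTGATATGGCGTCCCAAGGCACC"  # forward primer, NP fragment

# The junction: last 18 nt of the M2 portion followed by the first 18 nt of NP.
M2_SIDE = SEQ_ID_NO_50[:18]
NP_SIDE = SEQ_ID_NO_50[18:]

if __name__ == "__main__":
    print(reverse_complement(SEQ_ID_NO_50) == SEQ_ID_NO_49)
    print(M2_SIDE, NP_SIDE)
```

The 36-nucleotide overlap spans the junction that appears in SEQ ID NO:52, where the M2-derived sequence ending in tgcaacgattcaagtgat is followed directly by the NP start atggcgtcccaaggcacc.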
[0264]Purified VR4710 DNA was used to transfect the murine cell line VM92 to determine expression of the eM2-NP fusion protein. Expression of eM2-NP was confirmed with a Western Blot assay. Expression was visualized with a commercially available monoclonal antibody to M2 and with mouse polyclonal antibody to NP. ELISA assay results following 2 injections of pDNA into mice revealed little antibody response to M2, but an average titer of 66,560 for anti-NP antibody.
[0265]VR4750 was created by first reverse transcribing RNA from the mouse-adapted A/Hong Kong/1/68 virus stock using random hexamer to create a cDNA library. Then primers 5' GGGCTAGCGCCGCCACCATGAAGACCATCATTGCT 3' (SEQ ID NO:53) and 5' CCGTCGACTCAAATGCAAATGTTGCA 3' (SEQ ID NO:54) were employed to PCR the HA gene. The gene was inserted into the Invitrogen TOPO-TA vector first, and then sub-cloned into VR10551 using restriction enzymes NheI and SalI. The following is the open reading frame for HA (H3N2) from mouse-adapted A/Hong Kong/68 (from VR4750), referred to herein as SEQ ID NO:55:
TABLE-US-00057 1 atgaagacca tcattgcttt gagctacatt ttctgtctgg ctctcggcca agaccttcca 61 ggaaatgaca acaacacagc aacgctgtgc ctgggacatc atgcggtgcc aaacggaaca 121 ctagtgaaaa caatcacaga tgatcagatt gaagtgacta atgctactga gctagttcag 181 agctcctcaa cggggaaaat atgcaacaat cctcatcgaa tccttgatgg aatagactgc 241 acactgatag atgctctatt gggggaccct cattgtgatg tttttcaaaa tgagacatgg 301 gaccttttcg ttgaacgcag caaagctttc agcaactgtt acccttatga tgtgccagat 361 tatgcccccc ttaggtcact agttgcctcg tcaggcactc tggagtttat cactgagggt 421 ttcacttgga ctggggtcac tcagaatggg ggaagcagtg cttgcaaaag gggacctggt 481 agcggttttt tcagtagact gaactggttg accaaatcag gaagcacata tccagtgctg 541 aacgtgacta tgccaaacaa tgacaatttt gacaaactat acatttgggg ggttcaccac 601 ccgagcacga accaagaaca aaccagcctg tatgttcaag catcagggag agtcacagtc 661 tctaccagga gaagccagca aactataatc ccgaatatcg agtccagacc ctgggtaagg 721 ggtctgtcta gtagaataag catctattgg acaatagtta agccgggaga cgtactggta 781 attaatagta atgggaacct aatcgctcct cggggttatt tcaagatgcg cactgggaaa 841 agctcaataa tgaggtcaga tgcacctatt gatacctgta tttctgaatg catcactcca 901 aatggaagca ttcccaatga caagcccttt caaaacgtaa acaaaatcac gtatggagca 961 tgccccaagt atgttaagca aaacaccctg aagttggcaa cagggatgcg gaatgtacca 1021 gagaaacaaa ctagaggcct attcggcgca atagcaggtt tcatagaaaa tggttgggag 1081 ggaatgatag acggttggta cggtttcagg catcaaaatt ctgagggcac aggacaagca 1141 gcagatctta aaagcactca agcagccatc gaccaaatca atgggaaatt gaacaggata 1201 atcaagaaga cgaacgagaa attccatcaa atcgaaaagg aattctcaga agtagaaggg 1261 agaattcagg acctcgagaa atacgttgaa gacactaaaa tagatctctg gtcttacaat 1321 gcggagcttc ttgtcgctct ggagaatcaa catacaattg acctgactga ctcggaaatg 1381 aacaagctgt ttgaaaaaac aaggaggcaa ctgagggaaa atgctgaaga catgggcaat 1441 ggttgcttca aaatatacca caaatgtgac aacgcttgca tagagtcaat cagaactggg 1501 acttatgacc atgatgtata cagagacgaa gcattaaaca accggtttca gatcaaaggt 1561 gttgaactga agtctggata caaagactgg atcctgtgga tttcctttgc catatcatgc 1621 tttttgcttt gtgttgtttt gctggggttc atcatgtggg cctgccagaa aggcaacatt 1681 aggtgcaaca 
tttgcatttg a
[0266]While VR4750 expression was not clearly detected in vitro by Western blot assay, two 100 μg vaccinations of VR4750 have been shown to protect mice from intranasal challenge with mouse-adapted A/Hong Kong/68 virus.
[0267]VR4752 was created by first reverse transcribing RNA from the mouse-adapted A/Puerto Rico/8/34 virus stock using random hexamer to create a cDNA library. Then primers 5' GGGCTAGCGCCGCCACCATGAAGGCAAACCTACTG 3' (SEQ ID NO:56) and 5' CCGTCGACTCAGATGCATATTCTGCA 3' (SEQ ID NO:57) were employed to PCR the HA gene. The gene was then cloned into the TOPO-TA vector first, and then sub-cloned into VR10551 using restriction enzymes NheI and SalI. The following is the open reading frame for HA (H1N1) cloned from mouse-adapted A/Puerto Rico/34 (from VR4752), referred to herein as SEQ ID NO:58:
TABLE-US-00058 1 atgaaggcaa acctactggt cctgttatgt gcacttgcag ctgcagatgc agacacaata 61 tgtataggct accatgcgaa caattcaacc gacactgttg acacagtgct cgagaagaat 121 gtgacagtga cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaga 181 ttaaaaggaa tagccccact acaattgggg aaatgtaaca tcgccggatg gctcttggga 241 aacccagaat gcgacccact gcttccagtg agatcatggt cctacattgt agaaacacca 301 aactctgaga atggaatatg ttatccagga gatttcatcg actatgagga gctgagggag 361 caattgagct cagtgtcatc attcgaaaga ttcgaaatat ttcccaaaga aagctcatgg 421 cccaaccaca acacaaccaa aggagtaacg gcagcatgct cccatgcggg gaaaagcagt 481 ttttacagaa atttgctatg gctgacggag aaggagggct catacccaaa gctgaaaaat 541 tcttatgtga acaagaaagg gaaagaagtc cttgtactgt ggggtattca tcacccgtct 601 aacagtaagg atcaacagaa tatctatcag aatgaaaatg cttatgtctc tgtagtgact 661 tcaaattata acaggagatt taccccggaa atagcagaaa gacccaaagt aagagatcaa 721 gctgggagga tgaactatta ctggaccttg ctaaaacccg gagacacaat aatatttgag 781 gcaaatggaa atctaatagc accaaggtat gctttcgcac tgagtagagg ctttgggtcc 841 ggcatcatca cctcaaacgc atcaatgcat gagtgtaaca cgaagtgtca aacacccctg 901 ggagctataa acagcagtat ccctttccag aatatacacc cagtcacaat aggagagtgc 961 ccaaaatacg tcaggagtgc caaattgagg atggttacag gactaaggaa cattccgtcc 1021 attcaatcca gaggtctatt tggagccatt gccggtttta ttgaaggggg atggactgga 1081 atgatagatg gatggtacgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg 1141 gatcaaaaaa gcacacaaaa tgccattaac gggattacaa acaaggtgaa ctctgttatc 1201 gagaaaatga acattcaatt cacagctgtg ggtaaagaat tcaacaaatt agaaaaaagg 1261 atggaaaatt taaataaaaa agttgatgat ggatttctgg acatttggac atataatgca 1321 gaattgttag ttctactgga aaatgaaagg actctggatt tccatgactc aaatgtgaag 1381 aatctgtatg agaaagtaaa aagccaatta aagaataatg ccaaagaaat cggaaatgga 1441 tgttttgagt tctaccacaa gtgtgacaat gaatgcatgg aaagtgtaag aaatgggact 1501 tatgattatc ccaaatattc agaagagtca aagttgaaca gggaaaaggt agatggagtg 1561 aaattggaat caatggggat ctatcagatt ctggcgatct actcaactgt cgccagttca 1621 ctggtgcttt tggtctccct gggggcaatc agtttctgga tgtgttctaa tggatctttg 1681 cagtgcagaa 
tatgcatctg a
[0268]Purified VR4752 DNA was used to transfect the murine cell line VM92 to determine expression of the HA protein. Expression of HA was confirmed with a Western Blot assay. Expression was visualized with a commercially available goat anti-influenza A (H1N1) antibody.
[0269]A direct fusion of the M2 gene to the M1 gene was synthesized based on a codon-optimized sequence derived from methods described in Example 4 using the "universal" optimization strategy. The synthesized gene was received in the pUC119 vector and then sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for the M2M1 fusion (from VR4755), referred to herein as SEQ ID NO:59:
TABLE-US-00059 1 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 61 gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 121 tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 181 agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 241 cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gatgtccctg 301 ctgacagaag tggaaacata cgtgctgagc atcgtgccca gcggccccct gaaggccgag 361 atcgcccaga gactggagga cgtgttcgcc ggcaagaaca ccgacctgga ggccctgatg 421 gagtggctga agaccagacc catcctgagc cccctgacca agggcatcct gggcttcgtg 481 ttcaccctga ccgtgcccag cgagagaggc ctgcagagaa gaagattcgt gcagaacgcc 541 ctgaacggca acggcgaccc caacaacatg gaccgggccg tgaagctgta ccggaagctg 601 aagagagaga tcaccttcca cggcgccaag gagatcgccc tgagctacag cgccggcgcc 661 ctggccagct gcatgggcct gatctacaac agaatgggcg ccgtgaccac cgaggtggcc 721 ttcggcctgg tgtgcgccac ctgcgagcag atcgccgaca gccagcacag aagccacaga 781 cagatggtgg ccaccaccaa ccccctgatc agacacgaga acagaatggt gctggccagc 841 accaccgcca aggccatgga gcagatggcc ggcagcagcg agcaggccgc cgaggccatg 901 gagatcgcca gccaggccag acagatggtg caggccatga gagccatcgg cacccacccc 961 agcagcagcg ccggcctgaa ggacgacctg ctggagaacc tgcagaccta ccagaagaga 1021 atgggcgtgc agatgcagag attcaagtga
[0270]Purified VR4755 DNA was used to transfect the murine cell line VM92 to determine expression of the M2M1 fusion protein. Expression of M2M1 was confirmed with a Western Blot assay. Expression of the M2M1 fusion was visualized with commercially available anti-M1 and anti-M2 monoclonal antibodies.
[0271]The segment 7 RNA of influenza A encodes both the M1 and M2 genes. A consensus amino acid sequence for M1 and M2 was derived according to methods described herein. The consensus sequences for both proteins, however, are identical to the M1 and M2 amino acid sequences derived from the IV strain A/Niigata/137/96, represented herein as SEQ ID NO:77 and SEQ ID NO:78, respectively. Accordingly, the native sequence for segment 7, A/Niigata/137/96, was synthesized and received as an insert in pUC119. The segment 7 insert was sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for segment 7 (from VR4756), referred to herein as SEQ ID NO:60:
TABLE-US-00060 1 atgagccttc taaccgaggt cgaaacgtat gttctctcta tcgttccatc aggccccctc 61 aaagccgaaa tcgcgcagag acttgaagat gtctttgctg ggaaaaacac agatcttgag 121 gctctcatgg aatggctaaa gacaagacca atcctgtcac ctctgactaa ggggattttg 181 gggtttgtgt tcacgctcac cgtgcccagt gagcgaggac tgcagcgtag acgctttgtc 241 caaaatgccc tcaatgggaa tggggatcca aataacatgg acagagcagt taaactatat 301 agaaaactta agagggagat tacattccat ggggccaaag aaatagcact cagttattct 361 gctggtgcac ttgccagttg catgggcctc atatacaaca gaatgggggc tgtaaccact 421 gaagtggcct ttggcctggt atgtgcaaca tgtgaacaga ttgctgactc ccagcacagg 481 tctcataggc aaatggtggc aacaaccaat ccattaataa ggcatgagaa cagaatggtt 541 ttggccagca ctacagctaa ggctatggag caaatggctg gatcaagtga gcaggcagcg 601 gaggccatgg aaattgctag tcaggccagg caaatggtgc aggcaatgag agccattggg 661 actcatccta gctccagtgc tggtctaaaa gatgatcttc ttgaaaattt gcagacctat 721 cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac ccgcttgttg ttgctgcgag 781 tatcattggg atcttgcact tgatattgtg gattcttgat cgtctttttt tcaaatgcat 841 ctatcgactc ttcaaacacg gtctgaaaag agggccttct acggaaggag tacctgagtc 901 tatgagggaa gaatatcgaa aggaacagca gaatgctgtg gatgctgacg acagtcattt 961 tgtcagcata gagctggagt aa
[0272]SEQ ID NO:77 ("consensus" (A/Niigata/137/96) M1):
TABLE-US-00061 MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTR PILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKL YRKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCA TCEQIADSQHRSHRQMVATTNPLIRHENRMVLASTTAKAMEQMAGSSEQ AAEAMEIASQARQMVQAMRAIGTHPSSSAGLKDDLLENLQTYQKRMGVQ MQRFK
[0273]SEQ ID NO:78 ("consensus" (A/Niigata/137/96) M2):
TABLE-US-00062 MSLLTEVETPIRNEWGCRCNDSSDPLVVAASIIGILHLILWILDRLFFKC IYRLFKHGLKRGPSTEGVPESMREEYRKEQQNAVDADDSHFVSIELE
[0274]Purified VR4756 DNA was used to transfect the murine cell line VM92 to determine expression of the proteins encoded by segment 7. Expression of both M1 and M2 was confirmed with a Western blot assay using commercially available anti-M1 and anti-M2 monoclonal antibodies. ELISA assay results following 2 injections of pDNA into mice revealed an average anti-M2 antibody titer of 9,240 versus a 110 average titer for VR4707. An IFNγ ELISPOT assay for M2-specific T cells resulted in an average of 121 SFU/10^6 cells for VR4756 injected mice versus an average of 61 SFU/10^6 cells for the VR4707 construct.
[0275]An additional segment 7 sequence is created, VR4763, which contains selectively codon-optimized regions of segment 7. Optimization of the coding regions in segment 7 is selective, because segment 7 contains two overlapping coding regions (i.e., encoding M1 and M2) and these coding regions are partially in different reading frames. From the AUG encoded by nucleotides 1 to 3 of segment 7, M1 is encoded by nucleotides 1 through 759 of the segment 7 RNA, while M2 is encoded by a spliced messenger RNA which includes nucleotides 1 to 26 of segment 7 spliced to nucleotides 715 to 982 of segment 7. Optimization of the region from 715 to 759 is avoided because the M1 and M2 coding sequences (in different reading frames) overlap in that region. Due to the splicing that occurs to join bp 26 to an alternate frame at bp 715 of the segment 7 sequence, optimization in these splicing regions is also avoided; adjacent regions that arguably could also participate in splicing are likewise avoided. Optimization is done in a manner to ensure that no new splicing sites are inadvertently introduced. The areas that are optimized are optimized using the "universal" strategy, e.g., inserting the most frequently used codon for each amino acid. The following is the nucleotide sequence for codon-optimized segment 7 (from VR4763), referred to herein as SEQ ID NO:61:
TABLE-US-00063 1 atgagcctgc tgaccgaggt cgaaacgtat gttctctcta tcgtgcccag cggccccctg 61 aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag 121 gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 181 ggcttcgtgt tcaccctgac cgtgcccagc gagagaggac tgcagagaag aagattcgtg 241 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg acagagccgt gaagctgtac 301 agaaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 361 gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 421 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 481 agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 541 ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 601 gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 661 acccacccca gcagcagcgc cggcctgaaa gatgatcttc ttgaaaattt gcagacctat 721 cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac cccctggtgg tggccgccag 781 catcatcggc atcctgcacc tgatcctgtg gatcctggac agactgttct tcaagtgcat 841 ctacagactg ttcaagcacg gcctgaagag aggccccagc accgagggcg tgcccgagag 901 catgagagag gagtacagaa aggagcagca gaacgccgtg gacgccgacg acagccactt 961 cgtgagcatc gagctggagt ga
[0276]The codon-optimized coding region for M1 extends from nucleotide 1 to nucleotide 759 of SEQ ID NO:61, including the stop codon, and is represented herein as SEQ ID NO:79. The codon-optimized coding region for M2 extends from nucleotide 1 to nucleotide 26 of SEQ ID NO:61 spliced to nucleotide 715 through nucleotide 982 of SEQ ID NO:61, including the stop codon, and is represented herein as SEQ ID NO:80.
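The segment 7 coordinates can be illustrated with a short sketch that joins 1-based, inclusive nucleotide ranges and reports the reading frame in which a given nucleotide is read. It uses the coordinates given for segment 7 in paragraph [0275] (nucleotides 1 to 759 for M1; nucleotides 1 to 26 joined to nucleotides 715 to 982 for M2). The function and constant names are hypothetical, and the sketch is not a description of how the constructs were actually designed.

```python
# Illustrative sketch: names are hypothetical; coordinates are 1-based and inclusive.
M1_REGIONS = [(1, 759)]
M2_REGIONS = [(1, 26), (715, 982)]


def region_length(regions):
    """Total number of nucleotides covered by the listed regions."""
    return sum(end - start + 1 for start, end in regions)


def extract_regions(seq, regions):
    """Join the listed regions of seq into one coding sequence."""
    return "".join(seq[start - 1:end] for start, end in regions)


def codon_position(position, regions):
    """Return 0, 1 or 2: where `position` falls within its codon after splicing."""
    offset = 0
    for start, end in regions:
        if start <= position <= end:
            return (offset + position - start) % 3
        offset += end - start + 1
    raise ValueError(f"position {position} is not within the listed regions")


if __name__ == "__main__":
    print(region_length(M1_REGIONS), region_length(M2_REGIONS))
    print(codon_position(715, M1_REGIONS), codon_position(715, M2_REGIONS))
```

With these coordinates the M1 region is 759 nucleotides and the spliced M2 region is 294 nucleotides, both multiples of three. Nucleotide 715 falls at a different codon position in the two messages, which is consistent with the statement that the overlapping region from 715 to 759 is read in different frames.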
[0277]Optimized M1 Coding Region (SEQ ID NO:79):
TABLE-US-00064 ATGAGCCTGCTGACCGAGGTCGAAACGTATGTTCTCTCTATCGTGCCCAG CGGCCCCCTGAAGGCCGAGATCGCCCAGAGACTGGAGGACGTGTTCGCCG GCAAGAACACCGACCTGGAGGCCCTGATGGAGTGGCTGAAGACCAGACCC ATCCTGAGCCCCCTGACCAAGGGCATCCTGGGCTTCGTGTTCACCCTGAC CGTGCCCAGCGAGAGAGGCCTGCAGAGAAGAAGATTCGTGCAGAACGCCC TGAACGGCAACGGCGACCCCAACAACATGGACAGAGCCGTGAAGCTGTAC AGAAAGCTGAAGAGAGAGATCACCTTCCACGGCGCCAAGGAGATCGCCCT GAGCTACAGCGCCGGCGCCCTGGCCAGCTGCATGGGCCTGATCTACAACA GAATGGGCGCCGTGACCACCGAGGTGGCCTTCGGCCTGGTGTGCGCCACC TGCGAGCAGATCGCCGACAGCCAGCACAGAAGCCACAGACAGATGGTGGC CACCACCAACCCCCTGATCAGACACGAGAACAGAATGGTGCTGGCCAGCA GAGGCCATGGAGCCACCGCCAAGGCCATGGAGCAGATGGCCGGCAGCAG CGAGCAGGCCGCCATCGCCAGCCAGGCCAGACAGATGGTGCAGGCCATG AGAGCCATCGGCACCCACCCCAGCAGCAGCGCCGGCCTGAAAGATGATC TTCTTGAAAATTTGCAGACCTATCAGAAACGAATGGGGGTGCAGATGCAA CGATTCAAGTGA
[0278]Optimized M2 Coding Region (SEQ ID NO:80):
TABLE-US-00065 ATGAGCCTGCTGACCGAGGTCGAAACACCTATCAGAAACGAATGGGGGTG CAGATGCAACGATTCAAGTGACCCCCTGGTGGTGGCCGCCAGCATCATCG GCATCCTGCACCTGATCCTGTGGATCCTGGACAGACTGTTCTTCAAGTGC ATCTACAGACTGTTCAAGCACGGCCTGAAGAGAGGCCCCAGCACCGAGGG CGTGCCCGAGAGCATGAGAGAGGAGTACAGAAAGGAGCAGCAGAACGCCG TGGACGCCGACGACAGCCACTTCGTGAGCATCGAGCTGGAGTGA
[0279]The eM2-NP fusion was codon-optimized, inserted in pUC119 and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for eM2-NP: codon-optimized by Contract (from VR4757), referred to herein as SEQ ID NO:62:
TABLE-US-00066 1 atgagcttgc tcactgaagt cgagacacca atcagaaacg aatggggatg tagatgcaac 61 gatagctcag acatggcctc ccagggaacc aaaagaagct atgaacagat ggagactgac 121 ggagagagac agaacgccac agagatcaga gctagtgtag gaaagatgat agacggtatc 181 gggcgatttt acattcaaat gtgtacggaa ttgaaactca gcgactatga aggcagactt 241 atccagaact cactcacaat tgagcgcatg gtactcagtg catttgatga aagaaggaat 301 aggtacctcg aagaacaccc cagcgccggc aaagatccca agaagactgg cggcccaatt 361 tacagaagag tggacggtaa gtggatgaga gagctggtat tgtacgataa agaagaaatt 421 agaagaatct ggaggcaagc aaacaatgga gaggatgcta cagctggcct gacccacatg 481 atgatttggc atagtaacct gaatgatacc acctaccagc ggacaagggc tctcgttcga 541 accgggatgg atccccgcat gtgctcattg atgcagggta gtacactccc gaggaggtca 601 ggcgcggccg gtgcagccgt gaaaggaatc ggcactatgg taatggaatt gataagaatg 661 attaaaaggg ggattaatga caggaacttt tggagaggag aaaatggacg caaaacaagg 721 agtgcgtatg aacggatgtg caatattttg aaaggaaaat tccaaactgc agcacagcgc 781 gccatgatgg atcaggtacg agaaagtcgc aacccaggta atgctgaaat agaggacctt 841 atatttctcg cccggagtgc tctcatactt agaggaagcg tggcccataa aagttgtctc 901 cccgcatgcg tatacggtcc cgctgtgtct tccggatacg attttgaaaa agagggatat 961 tcattggtgg gaatcgaccc ttttaagctg cttcagaact cacaggttta cagtttgatt 1021 agaccaaacg agaacccagc ccacaaatca caactcgtgt ggatggcatg ccactctgcc 1081 gctttcgaag atctgagact gctctcattt attagaggca ctaaagtgag cccgagggga 1141 aaactgagca cacgaggagt acagatagca tctaacgaaa atatggataa tatgggatct 1201 agcacactcg aattgaggtc acgatactgg gctattagaa cacggagcgg agggaacacc 1261 aaccagcaga gagcatccgc cggtcagata agcgttcagc ctacattttc agtacaacga 1321 aacctgccat ttgaaaagag tacagtgatg gccgcattta ctggcaacac cgagggacga 1381 acaagcgaca tgagagcaga gattattaga atgatggaag gagctaaacc agaggaggtt 1441 tcatttagag gaaggggagt cttcgaattg tccgatgaga aagccacaaa tcccatagta 1501 cctagcttcg acatgtccaa cgaaggctct tacttttttg gtgacaatgc cgaagagtac 1561 gacaattga
[0280]Purified VR4757 DNA was used to transfect the murine cell line VM92 to determine expression of the eM2-NP fusion protein. Expression of eM2-NP was confirmed with a Western Blot assay. Expression was visualized with a commercially available monoclonal antibody to M2 and with mouse polyclonal antibody to NP. In vivo antibody response to NP was detected by ELISA with an average titer of 51,200.
[0281]The eM2-NP fusion gene in VR4758 was codon-optimized and synthesized. The gene was inserted into pUC119 and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for eM2-NP: codon-optimized by Applicants (from VR4758), referred to herein as SEQ ID NO:63:
TABLE-US-00067 1 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 61 gacagcagcg acatggccag ccagggcacc aagagaagct acgagcagat ggagaccgac 121 ggcgagagac agaacgccac cgagatcaga gccagcgtgg gcaagatgat cgacggcatc 181 ggcagattct acatccagat gtgcaccgag ctgaagctga gcgactacga gggcagactg 241 atccagaaca gcctgaccat cgagagaatg gtgctgagcg ccttcgacga gagaagaaac 301 agatacctgg aggagcaccc cagcgccggc aaggacccca agaagaccgg cggccccatc 361 tacagaagag tggacggcaa gtggatgaga gagctggtgc tgtacgacaa ggaggagatc 421 agaagaatct ggagacaggc caacaacggc gaggacgcca ccgccggcct gacccacatg 481 atgatctggc acagcaacct gaacgacaac acctaccaga gaaccagagc cctggtgcgg 541 accggcatgg accccagaat gtgcagcctg atgcagggca gcaccctgcc cagaagaagc 601 ggcgccgccg gcgccgccgt gaagggcatc ggcaccatgg tgatggagct gatcagaatg 661 atcaagagag gcatcaacga cagaaacttc tggagaggcg agaacggcag aaagaccaga 721 agcgcctacg agagaatgtg caacatcctg aagggcaagt tccagaccgc cgcccagaga 781 gccatgatgg accaggtccg ggagagcaga aaccccggca acgccgagat cgaggacctg 841 atcttcctgg ccagaagcgc cctgatcctg agaggcagcg tggcccacaa gagctgcctg 901 cccgcctgcg tgtacggccc cgccgtgagc agcggctacg acttcgagaa ggagggctac 961 agcctggtgg gcatcgaccc cttcaagctg ctgcagaaca gccaggtgta cagcctgatc 1021 agacccaacg agaaccccgc ccacaagagc cagctggtgt ggatggcctg ccacagcgcc 1081 gccttcgagg acctgagact gctgagcttc atcagaggca ccaaggtgtc ccccagaggc 1141 aagctgagca ccagaggcgt gcagatcgcc agcaacgaga acatggacaa catgggcagc 1201 agcaccctgg agctgagaag cagatactgg gccatcagaa ccagaagcgg cggcaacacc 1261 aaccagcaga gagccagcgc cggccagatc agcgtgcagc ccaccttcag cgtgcagaga 1321 aacctgccct tcgagaagag caccgtgatg gccgccttca ccggcaacac cgagggcaga 1381 accagcgaca tgagagccga gatcatcaga atgatggagg gcgccaagcc cgaggaggtg 1441 tccttcagag gcagaggcgt gttcgagctg agcgacgaga aggccaccaa ccccatcgtg 1501 cctagcttcg acatgagcaa cgagggcagc tacttcttcg gcgacaacgc cgaggagtac 1561 gacaactga
[0282]Purified VR4758 DNA was used to transfect the murine cell line VM92 to determine expression of the eM2-NP protein. Expression of eM2-NP was confirmed with a Western Blot assay. Expression was visualized with a commercially available monoclonal antibody to M2 and with mouse polyclonal antibody to NP. In vivo antibody response to NP was detected by ELISA with an average titer of 48,640.
[0283]The M2 gene was PCR-amplified from VR4755 using the primers 5'-GCCGAATTCGCCACCATGAGCCTGCTGACC-3' (SEQ ID NO:64) and 5'-GCCGTCGACTGATCACTCCAGCTCGATGCTCAC-3' (SEQ ID NO:65) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for M2 (from VR4759), referred to herein as SEQ ID NO:66:
TABLE-US-00068 1 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 61 gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 121 tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 181 agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 241 cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gtga
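The strategy of assigning a single codon to each amino acid can be illustrated by back-translating the first 24 amino acids of M2 (SEQ ID NO:78) and comparing the result with nucleotides 1 to 72 of SEQ ID NO:66. This is a sketch under stated assumptions: the codon table below is a hypothetical table chosen to match the codons observed in that portion of SEQ ID NO:66, with commonly cited preferred human codons assumed for the remaining amino acids; it is not presented as the table or algorithm actually used to generate the constructs.

```python
# Illustrative sketch: PREFERRED_CODON is an assumed table, not the Applicants' table.
PREFERRED_CODON = {
    "A": "gcc", "C": "tgc", "D": "gac", "E": "gag", "F": "ttc",
    "G": "ggc", "H": "cac", "I": "atc", "K": "aag", "L": "ctg",
    "M": "atg", "N": "aac", "P": "ccc", "Q": "cag", "R": "aga",
    "S": "agc", "T": "acc", "V": "gtg", "W": "tgg", "Y": "tac",
}


def back_translate(peptide: str) -> str:
    """Replace each amino acid with its single assumed preferred codon."""
    return "".join(PREFERRED_CODON[residue] for residue in peptide.upper())


# First 24 residues of M2 (from SEQ ID NO:78) and nucleotides 1-72 of SEQ ID NO:66.
EM2_PEPTIDE = "MSLLTEVETPIRNEWGCRCNDSSD"
SEQ_ID_NO_66_NT_1_72 = (
    "atgagcctgctgaccgaggtggagaccccc"
    "atcagaaacgagtggggctgcagatgcaac"
    "gacagcagcgac"
)

if __name__ == "__main__":
    print(back_translate(EM2_PEPTIDE) == SEQ_ID_NO_66_NT_1_72)
```

The comparison is limited to the first 72 nucleotides. Other portions of the codon-optimized sequences use more than one codon for some amino acids, so a single-codon table of this kind would not be expected to reproduce every listing in full.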
[0284]Purified VR4759 DNA was used to transfect the murine cell line VM92 to determine expression of the M2 protein. Expression of M2 was confirmed with a Western Blot assay. Expression was visualized with a commercially available anti-M2 monoclonal antibody.
[0285]The M1 gene was PCR-amplified from VR4755 using the primers 5'-GCCGAATTCGCCACCATGTCCCTGCTGACAGAAGTG-3' (SEQ ID NO:67) and 5'-GCCGTCGACTGATCACTTGAATCTCTGCATC-3' (SEQ ID NO:68) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for M1 (from VR4760), referred to herein as SEQ ID NO:69:
TABLE-US-00069 1 atgtccctgc tgacagaagt ggaaacatac gtgctgagca tcgtgcccag cggccccctg 61 aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag 121 gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 181 ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg 241 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg accgggccgt gaagctgtac 301 cggaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 361 gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 421 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 481 agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 541 ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 601 gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 661 acccacccca gcagcagcgc cggcctgaag gacgacctgc tggagaacct gcagacctac 721 cagaagagaa tgggcgtgca gatgcagaga ttcaagtga
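The primers used to amplify M2 and M1 from VR4755 (SEQ ID NOs:64, 65, 67 and 68) carry the restriction sites used for sub-cloning as an EcoRI-SalI fragment. The following sketch locates those sites in each primer. The names `SITES`, `PRIMERS` and `sites_in` are hypothetical; the recognition sequences GAATTC (EcoRI) and GTCGAC (SalI) are the standard ones for those enzymes, and the primer sequences are copied from the text.

```python
# Illustrative sketch: names are hypothetical; primer sequences are from the text.
SITES = {"EcoRI": "GAATTC", "SalI": "GTCGAC"}

PRIMERS = {
    "SEQ ID NO:64": "GCCGAATTCGCCACCATGAGCCTGCTGACC",
    "SEQ ID NO:65": "GCCGTCGACTGATCACTCCAGCTCGATGCTCAC",
    "SEQ ID NO:67": "GCCGAATTCGCCACCATGTCCCTGCTGACAGAAGTG",
    "SEQ ID NO:68": "GCCGTCGACTGATCACTTGAATCTCTGCATC",
}


def sites_in(primer: str):
    """Return the names of the restriction sites found in a primer."""
    sequence = primer.upper()
    return [name for name, site in SITES.items() if site in sequence]


if __name__ == "__main__":
    for label, primer in PRIMERS.items():
        print(label, sites_in(primer))
```

In this check each forward primer contains the EcoRI site and each reverse primer contains the SalI site, which matches the stated cloning orientation.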
[0286]Purified VR4760 DNA was used to transfect the murine cell line VM92 to determine expression of the M1 protein. Expression of M1 was confirmed with a Western Blot assay. Expression was visualized with a commercially available anti-M1 monoclonal antibody.
[0287]The NP gene was PCR-amplified from VR4757 using primers 5'-GCCGAATTCGCCACCATGGCCTCCCAGGGAACCAAAAG-3' (SEQ ID NO:70) and 5'-GCCGTCGACTGATCAATTGTCGTACTCTTC-3' (SEQ ID NO:71) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for NP: codon-optimized by Contract (from VR4761), referred to herein as SEQ ID NO:72:
TABLE-US-00070 1 atg gcc tcc cag gga acc aaa aga agc tat gaa cag atg gag act gac 49 gga gag aga cag aac gcc aca gag atc aga gct agt gta gga aag atg 97 ata gac ggt atc ggg cga ttt tac att caa atg tgt acg gaa ttg aaa 145 ctc agc gac tat gaa ggc aga ctt atc cag aac tca ctc aca att gag 193 cgc atg gta ctc agt gca ttt gat gaa aga agg aat agg tac ctc gaa 241 gaa cac ccc agc gcc ggc aaa gat ccc aag aag act ggc ggc cca att 289 tac aga aga gtg gac ggt aag tgg atg aga gag ctg gta ttg tac gat 337 aaa gaa gaa att aga aga atc tgg agg caa gca aac aat gga gag gat 385 gct aca gct ggc ctg acc cac atg atg att tgg cat agt aac ctg aat 433 gat acc acc tac cag cgg aca agg gct ctc gtt cga acc ggg atg gat 481 ccc cgc atg tgc tca ttg atg cag ggt agt aca ctc ccg agg agg tca 529 ggc gcg gcc ggt gca gcc gtg aaa gga atc ggc act atg gta atg gaa 577 ttg ata aga atg att aaa agg ggg att aat gac agg aac ttt tgg aga 625 gga gaa aat gga cgc aaa aca agg agt gcg tat gaa cgg atg tgc aat 673 att ttg aaa gga aaa ttc caa act gca gca cag cgc gcc atg atg gat 721 cag gta cga gaa agt cgc aac cca ggt aat gct gaa ata gag gac ctt 769 ata ttt ctc gcc cgg agt gct ctc ata ctt aga gga agc gtg gcc cat 817 aaa agt tgt ctc ccc gca tgc gta tac ggt ccc gct gtg tct tcc gga 865 tac gat ttt gaa aaa gag gga tat tca ttg gtg gga atc gac cct ttt 913 aag ctg ctt cag aac tca cag gtt tac agt ttg att aga cca aac gag 961 aac cca gcc cac aaa tca caa ctc gtg tgg atg gca tgc cac tct gcc 1009 gct ttc gaa gat ctg aga ctg ctc tca ttt att aga ggc act aaa gtg 1057 agc ccg agg gga aaa ctg agc aca cga gga gta cag ata gca tct aac 1105 gaa aat atg gat aat atg gga tct agc aca ctc gaa ttg agg tca cga 1153 tac tgg gct att aga aca cgg agc gga ggg aac acc aac cag cag aga 1201 gca tcc gcc ggt cag ata agc gtt cag cct aca ttt tca gta caa cga 1249 aac ctg cca ttt gaa aag agt aca gtg atg gcc gca ttt act ggc aac 1297 acc gag gga cga aca agc gac atg aga gca gag att att aga atg atg 1345 gaa gga gct aaa cca gag gag gtt tca ttt aga gga agg gga gtc ttc 1393 gaa 
ttg tcc gat gag aaa gcc aca aat ccc ata gta cct agc ttc gac 1441 atg tcc aac gaa ggc tct tac ttt ttt ggt gac aat gcc gaa gag tac 1489 gac aat tga
[0288]Purified VR4761 DNA was used to transfect the murine cell line VM92 to determine expression of the NP protein. Expression of NP was confirmed with a Western Blot assay. Expression was visualized with a mouse polyclonal anti-NP antibody. In vitro expression of VR4761 was significantly higher than VR4700 and comparable to VR4762.
[0289]The NP gene was PCR-amplified from VR4758 using primers 5'-GCCGAATTCGCCACCATGGCCAGCCAGGGCACCAAG-3' (SEQ ID NO:73) and 5'-GCCGTCGACTGATCAGTTGTCGTACTCC-3' (SEQ ID NO:74) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for NP: codon-optimized by Applicants (from VR4762), referred to herein as SEQ ID NO:75:
TABLE-US-00071 1 atg gcc agc cag ggc acc aag aga agc tac gag cag atg gag acc gac 49 ggc gag aga cag aac gcc acc gag atc aga gcc agc gtg ggc aag atg 97 atc gac ggc atc ggc aga ttc tac atc cag atg tgc acc gag ctg aag 145 ctg agc gac tac gag ggc aga ctg atc cag aac agc ctg acc atc gag 193 aga atg gtg ctg agc gcc ttc gac gag aga aga aac aga tac ctg gag 241 gag cac ccc agc gcc ggc aag gac ccc aag aag acc ggc ggc ccc atc 289 tac aga aga gtg gac ggc aag tgg atg aga gag ctg gtg ctg tac gac 337 aag gag gag atc aga aga atc tgg aga cag gcc aac aac ggc gag gac 385 gcc acc gcc ggc ctg acc cac atg atg atc tgg cac agc aac ctg aac 433 gac acc acc tac cag aga acc aga gcc ctg gtg cgg acc ggc atg gac 481 ccc aga atg tgc agc ctg atg cag ggc agc acc ctg ccc aga aga agc 529 ggc gcc gcc ggc gcc gcc gtg aag ggc atc ggc acc atg gtg atg gag 577 ctg atc aga atg atc aag aga ggc atc aac gac aga aac ttc tgg aga 625 ggc gag aac ggc aga aag acc aga agc gcc tac gag aga atg tgc aac 673 atc ctg aag ggc aag ttc cag acc gcc gcc cag aga gcc atg atg gac 721 cag gtc cgg gag agc aga aac ccc ggc aac gcc gag atc gag gac ctg 769 atc ttc ctg gcc aga agc gcc ctg atc ctg aga ggc agc gtg gcc cac 817 aag agc tgc ctg ccc gcc tgc gtg tac ggc ccc gcc gtg agc agc ggc 865 tac gac ttc gag aag gag ggc tac agc ctg gtg ggc atc gac ccc ttc 913 aag ctg ctg cag aac agc cag gtg tac agc ctg atc aga ccc aac gag 961 aac ccc gcc cac aag agc cag ctg gtg tgg atg gcc tgc cac agc gcc 1009 gcc ttc gag gac ctg aga ctg ctg agc ttc atc aga ggc acc aag gtg 1057 tcc ccc aga ggc aag ctg agc acc aga ggc gtg cag atc gcc agc aac 1105 gag aac atg gac aac atg ggc agc agc acc ctg gag ctg aga agc aga 1153 tac tgg gcc atc aga acc aga agc ggc ggc aac acc aac cag cag aga 1201 gcc agc gcc ggc cag atc agc gtg cag ccc acc ttc agc gtg cag aga 1249 aac ctg ccc ttc gag aag agc acc gtg atg gcc gcc ttc acc ggc aac 1297 acc gag ggc aga acc agc gac atg aga gcc gag atc atc aga atg atg 1345 gag ggc gcc aag ccc gag gag gtg tcc ttc aga ggc aga ggc gtg ttc 1393 gag 
ctg agc gac gag aag gcc acc aac ccc atc gtg cct agc ttc gac 1441 atg agc aac gag ggc agc tac ttc ttc ggc gac aac gcc gag gag tac 1489 gac aac tga
[0290]Purified VR4762 DNA was used to transfect the murine cell line VM92 to determine expression of the NP protein. Expression of NP was confirmed with a Western Blot assay. Expression was visualized with a mouse polyclonal anti-NP antibody. In vitro expression of VR4762 was significantly higher than VR4700 and comparable to VR4761.
[0291]In addition to plasmids encoding single IV proteins, single plasmids which contain two or more IV coding regions are constructed according to standard methods. For example, a polycistronic construct, where two or more IV coding regions are transcribed as a single transcript in eukaryotic cells, may be constructed by separating the various coding regions with IRES sequences. Alternatively, two or more coding regions may be inserted into a single plasmid, each with its own promoter sequence.
Example 2
Preparation of Recombinant NP DNA and Protein
[0292]Recombinant NP DNA and protein may be prepared using the following procedure. Eukaryotic cells may be used to express the NP protein from a transfected expression plasmid. Alternatively, a baculovirus system can be used wherein insect cells such as, but not limited to, Sf9, Sf21, or D.Mel-2 cells are infected with a recombinant baculovirus which expresses the NP protein. Cells which have been infected with recombinant baculoviruses, or contain expression plasmids, encoding recombinant NP are collected by knocking and scraping cells off the bottom of the flask in which they are grown. Cells infected for 24 or 48 hours are less easy to detach from the flask and may lyse, thus care must be taken with their removal. The flask containing the cells is then rinsed with PBS and the cells are transferred to 250 ml conical tubes. The tubes are spun at 1000 rpm in a J-6 centrifuge (300×g) for about 5-10 minutes. The cell pellets are washed two times with PBS and then resuspended in about 10-20 ml of PBS in order to count. The cells are finally resuspended at a concentration of about 2×10^7 cells/ml in RSB (10 mM Tris pH=7.5, 1.5 mM MgCl2, 10 mM KCl).
[0293]Approximately 10^6 cells are used per lane of a standard SDS-PAGE mini-protein gel, which is equivalent to the whole cell fraction for gel analysis purposes. 10% NP40 is added to the cells for a final concentration of 0.5%. The cell-NP40 mixture is vortexed and placed on ice for 10 minutes, vortexing occasionally. After ice incubation, the cells are spun at 1500 rpm in a J-6 centrifuge (600×g) for 10 minutes. The supernatant, which is the cytoplasmic fraction, is removed. The remaining pellet, containing the nuclei, is washed two times with buffer C (20 mM HEPES pH=7.9, 1.5 mM MgCl2, 0.2 mM EDTA, 0.5 mM PMSF, 0.5 mM DTT) to remove cytoplasmic proteins. The nuclei are resuspended in buffer C to 5×10^7 nuclei/ml. The nuclei are vortexed vigorously to break up particles and an aliquot, which is the nuclei fraction, is removed for the mini-protein gel.
[0294]To the remaining nuclei a quarter of the volume of 5M NaCl is added and the mixture is sonicated for 5 minutes at maximum output in a bath-type sonicator at 4° C., in 1-2 minute bursts, resting 30 seconds between bursts. The sonicated mixture is stirred at 4° C., then spun at 12000×g for 10 minutes. A sample equivalent to approximately 10^6 nuclei is removed for the protein mini-gel. The sample for the gel is centrifuged; the supernatant is the nuclear extract and the pellet is the nuclear pellet for gel analysis.
[0295]For gel analysis, a small amount (about 10^6 nuclear equivalents) of the nuclear pellet is resuspended directly in gel sample buffer and run with equivalent amounts of whole cells, cytoplasm, nuclei, nuclear extract and nuclear pellet. The above method gives relatively crude NP. To recover NP of a higher purity, 2.1M NaCl can be added to the nuclear pellet instead of 5M NaCl. This will bring the salt content to 0.42M NaCl. The supernatant will then contain about 60-70% of the total NP plus nuclear proteins. The resulting pellet is then extracted with 1M NaCl and centrifuged as above. The supernatant will contain NP at more than 95% purity.
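By way of illustration only, the arithmetic behind the detergent and salt additions described above can be sketched as follows. The function names are hypothetical. The sketch assumes that the 2.1M NaCl is added at the same quarter volume as the 5M NaCl, and that the starting suspension contributes no NaCl or NP40 of its own; neither assumption is stated in the procedure, although the first reproduces the 0.42M figure given above.

```python
def final_conc_after_addition(stock_conc, added_fraction):
    """Final concentration when a stock is added at `added_fraction` of the
    starting volume to a suspension containing none of the solute."""
    return stock_conc * added_fraction / (1.0 + added_fraction)


def stock_volume_for_target(start_volume, stock_conc, target_conc):
    """Volume of stock to add to `start_volume` so that the mixture reaches
    `target_conc` (same units as `stock_conc`)."""
    return start_volume * target_conc / (stock_conc - target_conc)


# A quarter volume of 5 M or 2.1 M NaCl added to the nuclei suspension.
print(round(final_conc_after_addition(5.0, 0.25), 2))   # 1.0 (M)
print(round(final_conc_after_addition(2.1, 0.25), 2))   # 0.42 (M)

# Volume of 10% NP40 to add to 1.0 ml of cells for a final 0.5%.
print(round(stock_volume_for_target(1.0, 10.0, 0.5), 4))  # 0.0526 (ml)
```

Under these assumptions, a quarter volume of 5M NaCl gives a final concentration of 1M, the same salt concentration used for the second extraction of the pellet.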
Example 3
Consensus Amino Acid Sequences of NP, M1 and M2
[0296]By analyzing amino acid sequences from influenza strains sequenced since 1990, consensus amino acid sequences were derived for influenza NP, M1 and M2 antigens.
NP Consensus Amino Acid Sequence
[0297]The method by which amino acid sequences for influenza NP (strain A) were chosen is as follows. The http://www.flu.lanl.gov database containing influenza sequences for each segment was searched for influenza A strains, human, NP, amino acids. Results gave about 400 sequences, the majority of which were only partial sequences. The sequences were subsequently narrowed down to 85 approximately full length sequences. If different passages of the same strain were found, the earliest passage was chosen. The sequences were further narrowed down to 28 full length NP sequences isolated from 1990 to 2000 (no full-length sequences from 2001-2003). Five additional sequences were eliminated which were identical to another sequence isolated from the same year based on the assumption that sequences with the same year and identical amino acid sequences were likely to be the same virus strain (in order to avoid double weighting). If there were sequences from the same year with different amino acid sequences, both sequences were kept. This left 23 sequences for the alignment.
[0298]Sequences were aligned to the A/PR/8/34 strain in descending order by most recent, and the consensus sequence was determined by utilizing the amino acid with the majority (FIG. 12). There are 32 amino acid changes between the A/PR/8/34 and the consensus sequence, and all amino acid changes are also present in the two year 2000 NP sequences. For one additional amino acid (aa 275) 15/23 have changed from E (in A/PR/8/34) to G, D or V (7G, 7D, 1V). Since the two 2000 strains both contain a G at this position, G was chosen. The changes total 33 amino acids, which is about a 7% difference from the A/PR/8/34 strain.
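The majority rule described above can be sketched, for illustration only, as a column-wise vote over aligned sequences. The function names and the toy alignment are hypothetical and are not taken from the database. The sketch does not reproduce the manual choice made at position 275, where the most recent strains rather than the most common residue decided the outcome, and it breaks ties by order of first appearance, which is an assumption.

```python
from collections import Counter


def majority_consensus(aligned):
    """Return the most common residue in each column of equal-length,
    pre-aligned sequences (ties go to the residue seen first)."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def count_differences(a, b):
    """Number of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy alignment (hypothetical residues, for illustration only).
toy = ["MASQGTKR", "MASQGTKR", "MATQGTKR", "MASQGSKR", "MASQGTKR"]
consensus = majority_consensus(toy)
print(consensus)                              # MASQGTKR
print(count_differences(toy[2], consensus))   # 1

# 33 changes over the 498 residues listed for SEQ ID NO:76.
print(round(100 * 33 / 498, 1))               # 6.6
```

The last line gives 6.6%, consistent with the "about 7%" difference stated above.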
[0299]The dominant Balb/c epitope TYQRTRALV is still maintained in the new consensus; changes to other theoretical human epitopes have not been determined as yet.
[0300]The A strains used in the last 8 years of flu vaccines (USA) are as follows (no full length sequences are available for any of these strains' NP genes):
[0301]a. 2002-2003 A/Moscow/10/99, A/New Caledonia/20/99
[0302]b. 2001-2002 A/Moscow/10/99, A/New Caledonia/20/99
[0303]c. 2000-2001 A/Panama/2007/99, A/New Caledonia/20/99
[0304]d. 1999-2000 A/Sydney/05/97, A/Beijing/262/95
[0305]e. 1998-1999 A/Sydney/05/97, A/Beijing/262/95
[0306]f. 1997-1998 A/Nanchang/933/95, A/Johannesburg/82/96
[0307]g. 1996-1997 A/Nanchang/933/95, A/Texas/36/91
[0308]h. 1995-1996 A/Johannesburg/33/94, A/Texas/36/91
[0309]The final NP consensus amino acid sequence derived using this method is referred to herein as SEQ ID NO:76:
TABLE-US-00072 1 masqgtkrsy eqmetdgerq nateirasvg kmidgigrfy iqmctelkls dyegrliqns 61 ltiermvlsa fderrnryle ehpsagkdpk ktggpiyrrv dgkwmrelvl ydkeeirriw 121 rqanngedat aglthmmiwh snlndttyqr tralvrtgmd prmcslmqgs tlprrsgaag 181 aavkgigtmv melirmikrg indrnfwrge ngrktrsaye rmcnilkgkf qtaaqrammd 241 qvresrnpgn aeiedlifla rsalilrgsv ahksclpacv ygpavssgyd fekegyslvg 301 idpfkllqns qvyslirpne npahksqlvw machsaafed lrllsfirgt kvsprgklst 361 rgvqiasnen mdnmgsstle lrsrywairt rsggntnqqr asagqisvqp tfsvqrnlpf 421 ekstvmaaft gntegrtsdm raeiirmmeg akpeevsfrg rgvfelsdek atnpivpsfd 481 msnegsyffg dnaeeydn
M1 and M2 Consensus Amino Acid Sequences
[0310]Consensus sequences for M1 and M2 were determined in a similar fashion, as follows. The search parameters on the http://www.flu.lanl.gov/ website were: influenza A strains, human, segment 7, nucleotide (both M1 and M2 are derived from segment 7). Full-length sequences from 1990-1999 (no 2000+ sequences were available) were chosen. For sequences with the same year and city, only the earliest passage was used. For entries for the same year, sequences were eliminated that were identical to another sequence isolated from the same year (even if from a different city). Twenty-one sequences, full-length for both M1 and M2 from 1993-1999, were compared. At each position, the amino acid with the simple majority was used.
[0311]The M1 amino acid consensus sequence is referred to herein as SEQ ID NO:77:
TABLE-US-00073 1 mslltevety vlsivpsgpl kaeiaqrled vfagkntdle almewlktrp ilspltkgil 61 gfvftltvps erglqrrrfv qnalngngdp nnmdravkly rklkreitfh gakeialsys 121 agalascmgl iynrmgavtt evafglvcat ceqiadsqhr shrqmvattn plirhenrmv 181 lasttakame qmagsseqaa eameiasqar qmvqamraig thpsssaglk ddllenlqty 241 qkrmgvqmqr fk
[0312]The M2 amino acid consensus sequence is referred to herein as SEQ ID NO:78:
TABLE-US-00074 1 mslltevetp irnewgcrcn dssdplvvaa siigilhlil wildrlffkc iyrlfkhglk 61 rgpstegvpe smreeyrkeq qnavdaddsh fvsiele
Example 4
Codon Optimization Algorithm
[0313]The following is an outline of the algorithm used to derive human codon-optimized sequences of influenza antigens.
Back Translation
[0314]Starting with the amino acid sequence, one can either (a) manually backtranslate using the human codon usage table from http://www.kazusa.or.jp/codon/
[0315]Homo sapiens [gbpri]: 55194 CDS's (24298072 codons)
[0316]Fields: [triplet] [frequency: per thousand] ([number])
TABLE-US-00075
UUU 17.1(415589)  UCU 14.7(357770)  UAU 12.1(294182)  UGU 10.0(243198)
UUC 20.6(500964)  UCC 17.6(427664)  UAC 15.5(377811)  UGC 12.2(297010)
UUA 7.5(182466)   UCA 12.0(291788)  UAA 0.7(17545)    UGA 1.5(36163)
UUG 12.6(306793)  UCG 4.4(107809)   UAG 0.6(13416)    UGG 12.7(309683)
CUU 13.0(315804)  CCU 17.3(419521)  CAU 10.5(255135)  CGU 4.6(112673)
CUC 19.8(480790)  CCC 20.1(489224)  CAC 15.0(364828)  CGC 10.7(259950)
CUA 7.8(189383)   CCA 16.7(405320)  CAA 12.0(292745)  CGA 6.3(152905)
CUG 39.8(967277)  CCG 6.9(168542)   CAG 34.1(827754)  CGG 11.6(281493)
AUU 16.1(390571)  ACU 13.0(315736)  AAU 16.7(404867)  AGU 11.9(289294)
AUC 21.6(525478)  ACC 19.4(471273)  AAC 19.5(473208)  AGC 19.3(467869)
AUA 7.7(186138)   ACA 15.1(366753)  AAA 24.1(585243)  AGA 11.5(278843)
AUG 22.2(538917)  ACG 6.1(148277)   AAG 32.2(781752)  AGG 11.4(277693)
GUU 11.0(266493)  GCU 18.6(451517)  GAU 21.9(533009)  GGU 10.8(261467)
GUC 14.6(354537)  GCC 28.4(690382)  GAC 25.6(621290)  GGC 22.5(547729)
GUA 7.2(174572)   GCA 16.1(390964)  GAA 29.0(703852)  GGA 16.4(397574)
GUG 28.4(690428)  GCG 7.5(181803)   GAG 39.9(970417)  GGG 16.3(396931)
*Coding GC 52.45%; 1st letter GC 56.04%; 2nd letter GC 42.37%; 3rd letter GC 58.93% (Table as of Nov. 6, 2003)
[0317]Or (b) log on to www.syntheticgenes.com and use the backtranslation tool, as follows:
[0318](1) Under Protein tab, paste amino acid sequence;
[0319](2) Under download codon usage tab, highlight homo sapiens and then download CUT.
TABLE-US-00076
UUU 17.1(415589)  UCU 14.7(357770)  UAU 12.1(294182)  UGU 10.0(243198)
UUC 20.6(500964)  UCC 17.6(427664)  UAC 15.5(377811)  UGC 12.2(297010)
UUA 7.5(182466)   UCA 12.0(291788)  UAA 0.7(17545)    UGA 1.5(36163)
UUG 12.6(306793)  UCG 4.4(107809)   UAG 0.6(13416)    UGG 12.7(309683)
CUU 13.0(315804)  CCU 17.3(419521)  CAU 10.5(255135)  CGU 4.6(112673)
CUC 19.8(480790)  CCC 20.1(489224)  CAC 15.0(364828)  CGC 10.7(259950)
CUA 7.8(189383)   CCA 16.7(405320)  CAA 12.0(292745)  CGA 6.3(152905)
CUG 39.8(967277)  CCG 6.9(168542)   CAG 34.1(827754)  CGG 11.6(281493)
AUU 16.1(390571)  ACU 13.0(315736)  AAU 16.7(404867)  AGU 11.9(289294)
AUC 21.6(525478)  ACC 19.4(471273)  AAC 19.5(473208)  AGC 19.3(467869)
AUA 7.7(186138)   ACA 15.1(366753)  AAA 24.1(585243)  AGA 11.5(278843)
AUG 22.2(538917)  ACG 6.1(148277)   AAG 32.2(781752)  AGG 11.4(277693)
GUU 11.0(266493)  GCU 18.6(451517)  GAU 21.9(533009)  GGU 10.8(261467)
GUC 14.6(354537)  GCC 28.4(690382)  GAC 25.6(621290)  GGC 22.5(547729)
GUA 7.2(174572)   GCA 16.1(390964)  GAA 29.0(703852)  GGA 16.4(397574)
GUG 28.4(690428)  GCG 7.5(181803)   GAG 39.9(970417)  GGG 16.3(396931)
(Table as of Nov. 6, 2003)
[0320](3) Hit Apply button.
[0321](4) Under Optimize TAB, open General TAB.
[0322](5) Check use only most frequent codon box.
[0323](6) Hit Apply button.
[0324](7) Under Optimize TAB, open Motif TAB.
[0325](8) Load desired cloning restriction sites into bad motifs; load any undesirable sequences, such as Pribnow Box sequences (TATAA), Chi sequences (GCTGGCGG), and restriction sites into bad motifs.
[0326](9) Under Output TAB, click on Start box. Output will include sequence, motif search results (under Report TAB), and codon usage report.
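The motif check in step (8) can be approximated, for illustration only, by a plain substring search over both strands. This is a sketch and not the search performed by the cited program. The function names are hypothetical, and the choice of EcoRI and SalI as the cloning sites to exclude is an assumption based on the EcoRI-SalI sub-cloning described for the constructs above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string, returned in upper case."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_motifs(seq, motifs):
    """Return {name: sorted 0-based start positions} for every motif that
    occurs in `seq` on either strand (positions refer to the given strand)."""
    seq = seq.upper()
    hits = {}
    for name, motif in motifs.items():
        positions = set()
        for pattern in {motif.upper(), reverse_complement(motif)}:
            start = seq.find(pattern)
            while start != -1:
                positions.add(start)
                start = seq.find(pattern, start + 1)
        if positions:
            hits[name] = sorted(positions)
    return hits


BAD_MOTIFS = {
    "Pribnow box": "TATAA",   # as given in the text
    "Chi": "GCTGGCGG",        # as given in the text
    "EcoRI": "GAATTC",        # assumed cloning site
    "SalI": "GTCGAC",         # assumed cloning site
}

# First 60 nt of the M1 open reading frame (SEQ ID NO:69).
m1_start = "atgtccctgctgacagaagtggaaacatacgtgctgagcatcgtgcccagcggccccctg"
print(find_motifs(m1_start, BAD_MOTIFS))                  # {}
print(find_motifs("ccGAATTCgg" + m1_start, BAD_MOTIFS))   # {'EcoRI': [2]}
```

Searching for each motif and its reverse complement matters for the non-palindromic Pribnow and Chi sequences; the two restriction sites read the same on both strands.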
[0327]The program did not always use the most frequent codon for amino acids such as cysteine, proline, and arginine. To change this, go back to the Edit CUT TAB and manually drag the rainbow colored bar to 100% for the desired codon. Then re-do start under the Output TAB.
[0328]The use of CGG for arginine can lead to very high GC content, so AGA can be used for arginine as an alternative. The difference in codon usage is 11.6 per thousand for CGG vs. 11.5 per thousand for AGA.
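Manual back translation with the most frequent codon, including the optional substitution of AGA for CGG, can be sketched as follows. This is an illustration under stated assumptions and not the cited program: the function names are hypothetical, the frequencies (per thousand) are copied from the Homo sapiens table above, and the standard genetic code is assumed.

```python
# Standard genetic code, codons ordered with U, C, A, G at each position.
BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
GENETIC_CODE = dict(zip(CODONS, AMINO))

# Frequencies per thousand, in the layout of the codon usage table above.
TABLE = """
UUU 17.1 UCU 14.7 UAU 12.1 UGU 10.0
UUC 20.6 UCC 17.6 UAC 15.5 UGC 12.2
UUA  7.5 UCA 12.0 UAA  0.7 UGA  1.5
UUG 12.6 UCG  4.4 UAG  0.6 UGG 12.7
CUU 13.0 CCU 17.3 CAU 10.5 CGU  4.6
CUC 19.8 CCC 20.1 CAC 15.0 CGC 10.7
CUA  7.8 CCA 16.7 CAA 12.0 CGA  6.3
CUG 39.8 CCG  6.9 CAG 34.1 CGG 11.6
AUU 16.1 ACU 13.0 AAU 16.7 AGU 11.9
AUC 21.6 ACC 19.4 AAC 19.5 AGC 19.3
AUA  7.7 ACA 15.1 AAA 24.1 AGA 11.5
AUG 22.2 ACG  6.1 AAG 32.2 AGG 11.4
GUU 11.0 GCU 18.6 GAU 21.9 GGU 10.8
GUC 14.6 GCC 28.4 GAC 25.6 GGC 22.5
GUA  7.2 GCA 16.1 GAA 29.0 GGA 16.4
GUG 28.4 GCG  7.5 GAG 39.9 GGG 16.3
"""
_tokens = TABLE.split()
FREQ = {_tokens[i]: float(_tokens[i + 1]) for i in range(0, len(_tokens), 2)}


def most_frequent_codons(overrides=None):
    """Map each amino acid (and '*') to its most frequent codon, then apply
    any manual overrides such as {"R": "AGA"}."""
    best = {}
    for codon, aa in GENETIC_CODE.items():
        if aa not in best or FREQ[codon] > FREQ[best[aa]]:
            best[aa] = codon
    best.update(overrides or {})
    return best


def back_translate(protein, overrides=None):
    """Back-translate a one-letter protein string to lower-case DNA."""
    table = most_frequent_codons(overrides)
    rna = "".join(table[aa] for aa in protein.upper())
    return rna.replace("U", "T").lower()


peptide = "MASQGTKRSY"   # first ten residues of SEQ ID NO:76
print(back_translate(peptide))                  # arginine as CGG
print(back_translate(peptide, {"R": "AGA"}))    # arginine as AGA
```

With the AGA override, the output for these ten residues is atggccagccagggcaccaagagaagctac, which matches the first ten codons listed for SEQ ID NO:75.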
Splice Donor and Acceptor Site Search
[0329](1) Log on to Berkeley Drosophila Genome Project Website at http://www.fruitfly.org/seq_tools/splice.html
[0330](2) Check boxes for Human or other and both splice sites.
[0331](3) Select minimum scores for 5' and 3' splice sites between 0 and 1.
[0332]The default setting of 0.4 was used.
[0333]At the default minimum score of 0.4:
TABLE-US-00077
                        % splice sites recognized   % false positives
Human 5' splice sites   93.2%                       5.2%
Human 3' splice sites   83.8%                       3.1%
[0334](4) Paste in sequence.
[0335](5) Submit.
[0336](6) Based on predicted donors or acceptors, change the individual codons until the sites are no longer predicted.
Add in 5' and 3' sequences.
[0337]On the 5' end of the gene sequence, the restriction enzyme site and Kozak sequence (gccacc) were added before ATG. On the 3' end of the sequence, tca was added following the stop codon (tga on opposite strand) and then a restriction enzyme site. The GC content and Open Reading Frames were then checked in SEC Central.
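The assembly of the flanking sequences and the subsequent checks can be sketched as follows, for illustration only. The function names and the short open reading frame are hypothetical, and EcoRI and SalI are assumed as the restriction sites because the constructs above were sub-cloned as EcoRI-SalI fragments. The sketch does not reproduce the checks performed by the software named above.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")
STOPS = ("tga", "taa", "tag")


def reverse_complement(dna):
    """Reverse complement of a DNA string, returned in lower case."""
    return dna.lower().translate(COMPLEMENT)[::-1]


def check_orf(orf):
    """Raise ValueError unless orf is in frame, starts with atg, and has a
    single stop codon at its end."""
    orf = orf.lower()
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    if len(orf) % 3 != 0 or codons[0] != "atg" or codons[-1] not in STOPS:
        raise ValueError("not a complete in-frame open reading frame")
    if any(c in STOPS for c in codons[:-1]):
        raise ValueError("internal stop codon")


def add_flanks(orf, site_5prime, site_3prime, kozak="gccacc", spacer="tca"):
    """5' site + Kozak + ORF (ending in its stop codon) + spacer + 3' site."""
    check_orf(orf)
    return site_5prime.lower() + kozak + orf.lower() + spacer + site_3prime.lower()


def gc_percent(dna):
    """Percentage of G and C bases in a DNA string."""
    dna = dna.lower()
    return 100.0 * (dna.count("g") + dna.count("c")) / len(dna)


ECORI, SALI = "gaattc", "gtcgac"           # assumed cloning sites
toy_orf = "atggccagccagggcaccaagtga"       # hypothetical: 7 codons plus stop
insert = add_flanks(toy_orf, ECORI, SALI)
print(insert)
print(reverse_complement(insert)[:12])     # gtcgactgatca
print(round(gc_percent(insert), 1))
```

Read on the opposite strand, the 3' end begins gtcgactgatca, the same site-plus-tgatca arrangement found in the reverse primers listed above (for example SEQ ID NO:68).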
Example 5
Preparation of Vaccine Formulations
[0338]Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are formulated with the poloxamer CRL 1005 and BAK (Benzalkonium chloride 50% solution, available from Ruger Chemical Co. Inc.) by the following methods. Specific final concentrations of each component of the formulae are described in the following methods, but for any of these methods, the concentrations of each component may be varied by basic stoichiometric calculations known by those of ordinary skill in the art to make a final solution having the desired concentrations.
[0339]For example, the concentration of CRL 1005 is adjusted depending on, for example, transfection efficiency, expression efficiency, or immunogenicity, to achieve a final concentration of between about 1 mg/ml to about 75 mg/ml, for example, about 1 mg/ml, about 2 mg/ml, about 3 mg/ml, about 4 mg/ml, about 5 mg/ml, about 6.5 mg/ml, about 7 mg/ml, about 7.5 mg/ml, about 8 mg/ml, about 9 mg/ml, about 10 mg/ml, about 15 mg/ml, about 20 mg/ml, about 25 mg/ml, about 30 mg/ml, about 35 mg/ml, about 40 mg/ml, about 45 mg/ml, about 50 mg/ml, about 55 mg/ml, about 60 mg/ml, about 65 mg/ml, about 70 mg/ml, or about 75 mg/ml of CRL 1005.
[0340]Similarly the concentration of DNA is adjusted depending on many factors, including the amount of a formulation to be delivered, the age and weight of the subject, the delivery method and route and the immunogenicity of the antigen being delivered. In general, formulations of the present invention are adjusted to have a final concentration from about 1 ng/ml to about 30 mg/ml of plasmid (or other polynucleotide). For example, a formulation of the present invention may have a final concentration of about 1 ng/ml, about 5 ng/ml, about 10 ng/ml, about 50 ng/ml, about 100 ng/ml, about 500 ng/ml, about 1 μg/ml, about 5 μg/ml, about 10 μg/ml, about 50 μg/ml, about 200 μg/ml, about 400 μg/ml, about 600 μg/ml, about 800 μg/ml, about 1 mg/ml, about 2 mg/ml, about 2.5 mg/ml, about 3 mg/ml, about 3.5 mg/ml, about 4 mg/ml, about 4.5 mg/ml, about 5 mg/ml, about 5.5 mg/ml, about 6 mg/ml, about 7 mg/ml, about 8 mg/ml, about 9 mg/ml, about 10 mg/ml, about 20 mg/ml, or about 30 mg/ml of a plasmid.
[0341]Certain formulations of the present invention include a cocktail of plasmids (see, e.g., Example 2 supra) of the present invention, e.g., comprising coding regions encoding IV proteins NP, M1 and/or M2 and optionally, plasmids encoding immunity enhancing proteins, e.g., cytokines. Various plasmids desired in a cocktail are combined together in PBS or other diluent prior to the addition to the other ingredients. Furthermore, plasmids may be present in a cocktail at equal proportions, or the ratios may be adjusted based on, for example, relative expression levels of the antigens or the relative immunogenicity of the encoded antigens. Thus, various plasmids in the cocktail may be present in equal proportion, or up to twice or three times as much of one plasmid may be included relative to other plasmids in the cocktail.
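As an illustration of how a total DNA concentration may be divided among the plasmids of a cocktail, the following sketch splits a total by relative proportion. The function name is hypothetical, the proportions are interpreted as mass ratios (an assumption), and the 6.4 mg/ml total is the cocktail concentration used in the methods of this example.

```python
def cocktail_concentrations(total_mg_per_ml, ratios):
    """Split a total DNA concentration among plasmids by relative mass ratio.

    ratios: {plasmid name: relative parts}, e.g. {"NP": 2, "M1": 1, "M2": 1}.
    """
    total_parts = sum(ratios.values())
    return {name: total_mg_per_ml * parts / total_parts
            for name, parts in ratios.items()}


# Equal proportions of two plasmids in a 6.4 mg/ml cocktail.
print(cocktail_concentrations(6.4, {"NP": 1, "M2": 1}))
# {'NP': 3.2, 'M2': 3.2}

# Twice as much of one plasmid relative to the other two.
print(cocktail_concentrations(6.4, {"NP": 2, "M1": 1, "M2": 1}))
# {'NP': 3.2, 'M1': 1.6, 'M2': 1.6}
```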
[0342]Additionally, the concentration of BAK may be adjusted depending on, for example, a desired particle size and improved stability. Indeed, in certain embodiments, formulations of the present invention include CRL 1005 and DNA, but are free of BAK. In general BAK-containing formulations of the present invention are adjusted to have a final concentration of BAK from about 0.05 mM to about 0.5 mM. For example, a formulation of the present invention may have a final BAK concentration of about 0.05 mM, 0.1 mM, 0.2 mM, 0.3 mM, 0.4 mM or 0.5 mM.
[0343]The total volume of the formulations produced by the methods below may be scaled up or down, by choosing apparatus of proportional size. Finally, in carrying out any of the methods described below, the three components of the formulation, BAK, CRL 1005, and plasmid DNA, may be added in any order. In each of these methods described below the term "cloud point" refers to the point in a temperature shift, or other titration, at which a clear solution becomes cloudy, i.e., when a component dissolved in a solution begins to precipitate out of solution.
Thermal Cycling of a Pre-Mixed Formulation
[0344]This example describes the preparation of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005, and 5 mg/ml of DNA in a total volume of 3.6 ml. The ingredients are combined together at a temperature below the cloud point and then the formulation is thermally cycled to room temperature (above the cloud point) several times, according to the protocol outlined in FIG. 2.
[0345]A 1.28 mM solution of BAK is prepared in PBS, 846 μl of the solution is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar, and the solution is stirred with moderate speed, in an ice bath on top of a stirrer/hotplate (hotplate off) for 10 minutes. CRL 1005 (27 μl) is then added using a 100 μl positive displacement pipette and the solution is stirred for a further 60 minutes on ice. Plasmids comprising codon-optimized coding regions encoding, for example, NP, M1, and M2 as described herein, and optionally, additional plasmids comprising codon-optimized or non-codon-optimized coding regions encoding, e.g., additional IV proteins, and/or other proteins, e.g., cytokines, are mixed together at desired proportions in PBS to achieve 6.4 mg/ml total DNA. This plasmid cocktail is added dropwise, slowly, to the stirring solution over 1 min using a 5 ml pipette. The solution at this point (on ice) is clear since it is below the cloud point of the poloxamer and is further stirred on ice for 15 min. The ice bath is then removed, and the solution is stirred at ambient temperature for 15 minutes to produce a cloudy solution as the poloxamer passes through the cloud point.
[0346]The flask is then placed back into the ice bath and stirred for a further 15 minutes to produce a clear solution as the mixture is cooled below the poloxamer cloud point. The ice bath is again removed and the solution stirred at ambient temperature for a further 15 minutes. Stirring for 15 minutes above and below the cloud point (total of 30 minutes), is defined as one thermal cycle. The mixture is cycled six more times. The resulting formulation may be used immediately, or may be placed in a glass vial, cooled below the cloud point, and frozen at -80° C. for use at a later time.
Thermal Cycling, Dilution and Filtration of a Pre-mixed Formulation, Using Increased Concentrations of CRL 1005
[0347]This example describes the preparation of a formulation comprising 0.3 mM BAK, 34 mg/ml or 50 mg/ml CRL 1005, and 5.0 mg/ml of DNA in a final volume of 4.0 ml. The ingredients are combined together at a temperature below the cloud point, then the formulation is thermally cycled to room temperature (above the cloud point) several times, diluted, and filtered according to the protocol outlined in FIG. 3.
[0348]Plasmids comprising codon-optimized coding regions encoding, for example, NP, M1, and M2 as described herein, and optionally, additional plasmids comprising codon-optimized or non-codon-optimized coding regions encoding, e.g., additional IV proteins, and/or other proteins, e.g., cytokines, are mixed together at desired proportions in PBS to achieve 6.4 mg/ml total DNA. This plasmid cocktail is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar, and for the formulation containing 50 mg/ml CRL 1005, 3.13 ml of a solution containing about 3.2 mg/ml of NP encoding plasmid and about 3.2 mg/ml M2 encoding plasmid (about 6.4 mg/ml total DNA) is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar, and the solutions are stirred with moderate speed, in an ice bath on top of a stirrer/hotplate (hotplate off) for 10 minutes. CRL 1005 (136 μl for 34 mg/ml final concentration, and 200 μl for 50 mg/ml final concentration) is then added using a 200 μl positive displacement pipette and the solution is stirred for a further 30 minutes on ice. Solutions of 1.6 mM and 1.8 mM BAK are prepared in PBS, and 734 μl of 1.6 mM and 670 μl of 1.8 mM are then added dropwise, slowly, to the stirring 34 mg/ml and 50 mg/ml poloxamer mixtures, respectively, over 1 min using a 1 ml pipette. The solutions at this point are clear since they are below the cloud point of the poloxamer and are stirred on ice for 30 min. The ice baths are then removed, and the solutions are stirred at ambient temperature for 15 minutes to produce cloudy solutions as the poloxamer passes through the cloud point.
[0349]The flasks are then placed back into the ice baths and stirred for a further 15 minutes to produce clear solutions as the mixtures are cooled below the poloxamer cloud point. The ice baths are again removed and the solutions stirred for a further 15 minutes. Stirring for 15 minutes above and below the cloud point (total of 30 minutes), is defined as one thermal cycle. The mixtures are cycled two more times.
[0350]In the meantime, two Steriflip® 50 ml disposable vacuum filtration devices, each with a 0.22 μm Millipore Express® membrane (available from Millipore, cat #SCGP00525) are placed in an ice bucket, with a vacuum line attached and left for 1 hour to allow the devices to equilibrate to the temperature of the ice. The poloxamer formulations are then diluted to 2.5 mg/ml DNA with PBS and filtered under vacuum.
[0351]The resulting formulations may be used immediately, or may be transferred to glass vials, cooled below the cloud point, and frozen at -80° C. for use at a later time.
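The stated volumes for the 50 mg/ml formulation can be checked against the nominal final concentrations with a short calculation, sketched below for illustration only. The function name is hypothetical. The sketch assumes that neat CRL 1005 contributes about 1 mg per μl, which is not stated in the method but is consistent with 200 μl giving 50 mg/ml in 4.0 ml, and that volumes are additive.

```python
def mix(components):
    """components: list of (name, stock_conc, volume_ul).
    Returns (total volume in ul, {name: final concentration})."""
    total = sum(volume for _, _, volume in components)
    return total, {name: conc * volume / total
                   for name, conc, volume in components}


total_ul, final = mix([
    ("DNA mg/ml", 6.4, 3130.0),          # 3.13 ml of 6.4 mg/ml cocktail
    ("CRL 1005 mg/ml", 1000.0, 200.0),   # assumption: ~1 mg per ul neat
    ("BAK mM", 1.8, 670.0),              # 670 ul of 1.8 mM BAK
])
print(total_ul)                          # 4000.0
for name, value in final.items():
    print(f"{name}: {value:.3f}")

# PBS needed to dilute the mixture to 2.5 mg/ml DNA before filtration.
pbs_ul = total_ul * (final["DNA mg/ml"] / 2.5 - 1.0)
print(round(pbs_ul))
```

Under these assumptions the three volumes sum to the stated 4.0 ml, and the final concentrations come out at about 5.0 mg/ml DNA, 50 mg/ml CRL 1005 and 0.30 mM BAK.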
A Simplified Method Without Thermal Cycling
[0352]This example describes a simplified preparation of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005, and 5 mg/ml of DNA in a total volume of 2.0 ml. The ingredients are combined together at a temperature below the cloud point and then the formulation is simply filtered and then used or stored, according to the protocol outlined in FIG. 4.
[0353]A 0.77 mM solution of BAK is prepared in PBS, and 780 μl of the solution is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar, and the solution is stirred with moderate speed, in an ice bath on top of a stirrer/hotplate (hotplate off) for 15 minutes. CRL 1005 (15 μl) is then added using a 100 μl positive displacement pipette and the solution is stirred for a further 60 minutes on ice. Plasmids comprising codon-optimized coding regions encoding, for example, NP, M1, and M2 as described herein, and optionally, additional plasmids comprising codon-optimized or non-codon-optimized coding regions encoding, e.g., additional IV proteins, and/or other proteins, e.g., cytokines, are mixed together at desired proportions in PBS to achieve a final concentration of about 8.3 mg/ml total DNA. This plasmid cocktail is added dropwise, slowly, to the stirring solution over 1 min using a 5 ml pipette. The solution at this point (on ice) is clear since it is below the cloud point of the poloxamer and is further stirred on ice for 15 min.
[0354]In the meantime, one Steriflip® 50 ml disposable vacuum filtration device, with a 0.22 μm Millipore Express® membrane (available from Millipore, cat #SCGP00525) is placed in an ice bucket, with a vacuum line attached and left for 1 hour to allow the device to equilibrate to the temperature of the ice. The poloxamer formulation is then filtered under vacuum, below the cloud point and then allowed to warm above the cloud point. The resulting formulation may be used immediately, or may be transferred to glass vials, cooled below the cloud point and then frozen at -80° C. for use at a later time.
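The inverse calculation, going from target concentrations to the volume of each stock, is sketched below for the simplified method. It is an illustration only: the function name is hypothetical, the DNA volume is not stated in the method and is computed here from the target, and neat CRL 1005 is assumed to contribute about 1 mg per μl.

```python
def volumes_for_targets(total_ul, bak_stock_mm, bak_target_mm,
                        crl_target_mg_ml, dna_stock_mg_ml, dna_target_mg_ml,
                        crl_mg_per_ul=1.0):
    """Volume (ul) of each stock needed to reach the target concentrations
    in `total_ul` of formulation."""
    return {
        "BAK": total_ul * bak_target_mm / bak_stock_mm,
        "CRL 1005": (total_ul / 1000.0) * crl_target_mg_ml / crl_mg_per_ul,
        "DNA": total_ul * dna_target_mg_ml / dna_stock_mg_ml,
    }


volumes = volumes_for_targets(
    total_ul=2000.0,
    bak_stock_mm=0.77, bak_target_mm=0.3,
    crl_target_mg_ml=7.5,
    dna_stock_mg_ml=8.3, dna_target_mg_ml=5.0,
)
for name, ul in volumes.items():
    print(f"{name}: {ul:.0f} ul")
print(f"sum: {sum(volumes.values()):.0f} ul")
```

The computed BAK and CRL 1005 volumes round to the 780 μl and 15 μl given above, and together with the computed DNA volume they sum to approximately the stated total of 2.0 ml.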
Example 6
Animal Immunizations
[0355]The immunogenicity of the various IV expression products encoded by the codon-optimized polynucleotides described herein is initially evaluated based on each plasmid's ability to mount an immune response in vivo. Plasmids are tested individually and in combinations by injecting single constructs as well as multiple constructs. Immunizations are initially carried out in animals, such as mice, rabbits, goats, sheep, non-human primates, or other suitable animal, by intramuscular (IM) injections. Serum is collected from immunized animals, and the antigen specific antibody response is quantified by ELISA assay using purified immobilized antigen proteins in a protein--immunized subject antibody--anti-species antibody type assay, according to standard protocols. The tests of immunogenicity further include measuring antibody titer, neutralizing antibody titer, T-cell proliferation, T-cell secretion of cytokines, and cytolytic T cell responses, and direct enumeration of antigen specific CD4+ and CD8+ T-cells. Correlation to protective levels of the immune responses in humans is made according to methods well known by those of ordinary skill in the art. See above.
A. DNA Formulations
[0356]Plasmid DNA is formulated with a poloxamer by any of the methods described in Example 3. Alternatively, plasmid DNA is prepared as described above and dissolved at a concentration of about 0.1 mg/ml to about 10 mg/ml, preferably about 1 mg/ml, in PBS with or without transfection-facilitating cationic lipids, e.g., DMRIE/DOPE at a 4:1 DNA:lipid mass ratio. Alternative DNA formulations include 150 mM sodium phosphate instead of PBS, adjuvants, e.g., Vaxfectin® at a 4:1 DNA:Vaxfectin® mass ratio, mono-phosphoryl lipid A (detoxified endotoxin) from S. minnesota (MPL) and trehalosedicorynomycolateAF (TDM), in 2% oil (squalene)-Tween 80-water (MPL+TDM, available from Sigma/Aldrich, St. Louis, Mo., (catalog #M6536)), a solubilized mono-phosphoryl lipid A formulation (AF, available from Corixa), or (±)-N-(3-Acetoxypropyl)-N,N-dimethyl-2,3-bis(octyloxy)-1-propanaminium chloride (compound # VC1240) (see Shriver, J. W. et al., Nature 415:331-335 (2002), and P.C.T. Publication No. WO 02/00844 A2, each of which is incorporated herein by reference in its entirety).
B. Animal Immunizations
[0357]Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are injected into BALB/c mice as single plasmids or as cocktails of two or more plasmids, as either DNA in PBS or formulated with the poloxamer-based delivery system: 2 mg/ml DNA, 3 mg/ml CRL 1005, and 0.1 mM BAK. Groups of 10 mice are immunized three times, at biweekly intervals, and serum is obtained to determine antibody titers to each of the antigens. Groups are also included in which mice are immunized with a trivalent preparation, containing each of the three plasmid constructs in equal mass.
[0358]The immunization schedule is as follows:
[0359]Day -3 Pre-bleed
[0360]Day 0 Plasmid injections, intramuscular, bilateral in rectus femoris, 5-50 μg/leg
[0361]Day 21 Plasmid injections, intramuscular, bilateral in rectus femoris, 5-50 μg/leg
[0362]Day 49 Plasmid injections, intramuscular, bilateral in rectus femoris, 5-50 μg/leg
[0363]Day 59 Serum collection
[0364]Serum antibody titers are determined by ELISA with recombinant proteins, peptides or transfection supernatants and lysates from transfected VM-92 cells, or with live, inactivated, or lysed virus.
C. Immunization of Mice with Vaccine Formulations Using a Vaxfectin® Adjuvant
[0365]Vaxfectin® (a 1:1 molar ratio of the cationic lipid VC1052 and the neutral co-lipid DPyPE) is a synthetic cationic lipid formulation which has shown promise for its ability to enhance antibody titers against an antigen when administered with DNA intramuscularly to mice.
[0366]In mice, intramuscular injection of Vaxfectin® formulated with NP DNA increased antibody titers up to 20-fold to levels that could not be reached with DNA alone. In rabbits, complexing DNA with Vaxfectin® enhanced antibody titers up to 50-fold. Thus, Vaxfectin® shows promise as a delivery system and as an adjuvant in a DNA vaccine.
[0367]Vaxfectin® mixtures are prepared by mixing chloroform solutions of VC1052 cationic lipid with chloroform solutions of DPyPE neutral co-lipid. Dried films are prepared in 2 ml sterile glass vials by evaporating the chloroform under a stream of nitrogen, and placing the vials under vacuum overnight to remove solvent traces. Each vial contains 1.5 mole each of VC1052 and DPyPE. Liposomes are prepared by adding sterile water followed by vortexing. The resulting liposome solution is mixed with DNA at a phosphate mole:cationic lipid mole ratio of 4:1.
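The 4:1 phosphate mole:cationic lipid mole ratio described above can be illustrated with a short calculation. The following is a minimal sketch only: it assumes an average mass of about 330 g/mol per nucleotide (one phosphate per nucleotide), a figure that is not stated in this application, and the 100 μg DNA amount and the function names are hypothetical.

```python
# Sketch: cationic lipid needed for a 4:1 phosphate:cationic lipid mole ratio.
# ASSUMPTION: ~330 g/mol per nucleotide (one phosphate each); not from the text.
AVG_NUCLEOTIDE_MW = 330.0  # g/mol


def dna_phosphate_nmol(dna_ug: float) -> float:
    """Nanomoles of phosphate contained in dna_ug micrograms of DNA."""
    # ug / (g/mol) = umol; multiply by 1000 to obtain nmol
    return dna_ug / AVG_NUCLEOTIDE_MW * 1000.0


def cationic_lipid_nmol(dna_ug: float, phosphate_to_lipid: float = 4.0) -> float:
    """Nanomoles of cationic lipid giving the requested phosphate:lipid mole ratio."""
    return dna_phosphate_nmol(dna_ug) / phosphate_to_lipid


if __name__ == "__main__":
    dna_ug = 100.0  # hypothetical amount of plasmid DNA
    print(f"phosphate: {dna_phosphate_nmol(dna_ug):.1f} nmol")
    print(f"cationic lipid at 4:1: {cationic_lipid_nmol(dna_ug):.1f} nmol")
```

Because the ratio is defined on DNA phosphate rather than on plasmid molecules, the result does not depend on plasmid size under this approximation.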
[0368]Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are mixed together at desired proportions in PBS to achieve a final concentration of 1.0 mg/ml. The plasmid cocktail, as well as the controls, are formulated with Vaxfectin®. Groups of 5 BALB/c female mice are injected bilaterally in the rectus femoris muscle with 50 μl of DNA solution (100 μl total/mouse), on days 1, 21, and 49 with each formulation. Mice are bled for serum on days 0 (prebleed), 20 (bleed 1), 41 (bleed 2), and 62 (bleed 3), and up to 40 weeks post-injection. Antibody titers to the various IV proteins encoded by the plasmid DNAs are measured by ELISA as described elsewhere herein.
[0369]Cytolytic T-cell responses are measured as described in Hartikka et al. "Vaxfectin Enhances the Humoral Response to Plasmid DNA-encoded Antigens," Vaccine 19:1911-1923 (2001), which is incorporated herein in its entirety by reference. Standard ELISPOT technology is used for the CD4+ and CD8+ T-cell assays as described in Example 6, part A.
D. Production of NP, M1 or M2 Antisera in Animals
[0370]Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are prepared according to the immunization scheme described above and injected into a suitable animal for generating polyclonal antibodies. Serum is collected and the antibody titered as above.
[0371]Monoclonal antibodies are also produced using hybridoma technology (Kohler, et al., Nature 256:495 (1975); Kohler, et al., Eur. J. Immunol. 6:511 (1976); Kohler, et al., Eur. J. Immunol. 6:292 (1976); Hammerling, et al., in Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, N.Y., (1981), pp. 563-681, each of which is incorporated herein by reference in its entirety). In general, such procedures involve immunizing an animal (preferably a mouse) as described above. The splenocytes of such mice are extracted and fused with a suitable myeloma cell line. Any suitable myeloma cell line may be employed in accordance with the present invention; however, it is preferable to employ the parent myeloma cell line (SP2O), available from the American Type Culture Collection, Rockville, Md. After fusion, the resulting hybridoma cells are selectively maintained in HAT medium, and then cloned by limiting dilution as described by Wands et al., Gastroenterology 80:225-232 (1981), incorporated herein by reference in its entirety. The hybridoma cells obtained through such a selection are then assayed to identify clones which secrete antibodies capable of binding the various IV proteins.
[0372]Alternatively, additional antibodies capable of binding to IV proteins described herein may be produced in a two-step procedure through the use of anti-idiotypic antibodies. Such a method makes use of the fact that antibodies are themselves antigens, and that, therefore, it is possible to obtain an antibody which binds to a second antibody. In accordance with this method, various IV-specific antibodies are used to immunize an animal, preferably a mouse. The splenocytes of such an animal are then used to produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce an antibody whose ability to bind to the IV protein-specific antibody can be blocked by the cognate IV protein. Such antibodies comprise anti-idiotypic antibodies to the IV protein-specific antibody and can be used to immunize an animal to induce formation of further IV-specific antibodies.
[0373]It will be appreciated that Fab and F(ab')2 and other fragments of the antibodies of the present invention may be used according to the methods disclosed herein. Such fragments are typically produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab')2 fragments). Alternatively, NP, M1, M2, HA and eM2 binding fragments can be produced through the application of recombinant DNA technology or through synthetic chemistry.
[0374]It may be preferable to use "humanized" chimeric monoclonal antibodies. Such antibodies can be produced using genetic constructs derived from hybridoma cells producing the monoclonal antibodies described above. Methods for producing chimeric antibodies are known in the art. See, for review, Morrison, Science 229:1202 (1985); Oi, et al., BioTechniques 4:214 (1986); Cabilly, et al., U.S. Pat. No. 4,816,567; Taniguchi, et al., EP 171496; Morrison, et al., EP 173494; Neuberger, et al., WO 8601533; Robinson, et al., WO 8702671; Boulianne, et al., Nature 312:643 (1984); Neuberger, et al., Nature 314:268 (1985).
[0375]These antibodies are used, for example, in diagnostic assays, as a research reagent, or to further immunize animals to generate IV-specific anti-idiotypic antibodies. Non-limiting examples of uses for anti-IV antibodies include use in Western blots, ELISA (competitive, sandwich, and direct), immunofluorescence, immunoelectron microscopy, radioimmunoassay, immunoprecipitation, agglutination assays, immunodiffusion, immunoelectrophoresis, and epitope mapping (Weir, D. Ed. Handbook of Experimental Immunology, 4th ed. Vols. I and II, Blackwell Scientific Publications (1986)).
Example 7
Mucosal Vaccination and Electrically Assisted Plasmid Delivery
A. Mucosal DNA Vaccination
[0376]Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, (100 μg/50 μl total DNA) are delivered to BALB/c mice at 0, 2 and 4 weeks via i.m., intranasal (i.n.), intravenous (i.v.), intravaginal (i.vag.), intrarectal (i.r.) or oral routes. The DNA is delivered unformulated or formulated with the cationic lipids DMRIE/DOPE (DD) or GAP-DLRIE/DOPE (GD). As endpoints, serum IgG titers against the various IV antigens are measured by ELISA and splenic T-cell responses are measured by antigen-specific production of IFN-gamma and IL-4 in ELISPOT assays. Standard chromium release assays are used to measure specific cytotoxic T lymphocyte (CTL) activity against the various IV antigens. Tetramer assays are used to detect and quantify antigen specific T-cells, with quantification being confirmed and phenotypic characterization accomplished by intracellular cytokine staining. In addition, IgG and IgA responses against the various IV antigens are analyzed by ELISA of vaginal washes.
B. Electrically-Assisted Plasmid Delivery
[0377]In vivo gene delivery may be enhanced through the application of brief electrical pulses to injected tissues, a procedure referred to herein as electrically-assisted plasmid delivery. See, e.g., Aihara, H. & Miyazaki, J. Nat. Biotechnol. 16:867-70 (1998); Mir, L. M. et al., Proc. Natl. Acad. Sci. USA 96:4262-67 (1999); Hartikka, J. et al., Mol. Ther. 4:407-15 (2001); and Mir, L. M. et al.; Rizzuto, G. et al., Hum Gene Ther 11:1891-900 (2000); Widera, G. et al, J. of Immuno. 164: 4635-4640 (2000). The use of electrical pulses for cell electropermeabilization has been used to introduce foreign DNA into prokaryotic and eukaryotic cells in vitro. Cell permeabilization can also be achieved locally, in vivo, using electrodes and optimal electrical parameters that are compatible with cell survival.
[0378]The electroporation procedure can be performed with various electroporation devices. These devices include external plate type electrodes or invasive needle/rod electrodes and can possess two electrodes or multiple electrodes placed in an array. Distances between the plate or needle electrodes can vary depending upon the number of electrodes, size of target area and treatment subject.
[0379]The TriGrid needle array, used in examples described herein, is a three electrode array comprising three elongate electrodes in the approximate shape of a geometric triangle. Needle arrays may include single, double, three, four, five, six or more needles arranged in various array formations. The electrodes are connected through conductive cables to a high voltage switching device that is connected to a power supply.
[0380]The electrode array is placed into the muscle tissue, around the site of nucleic acid injection, to a depth of approximately 3 mm to 3 cm. The depth of insertion varies depending upon the target tissue and size of patient receiving electroporation. After injection of foreign nucleic acid, such as plasmid DNA, and a period of time sufficient for distribution of the nucleic acid, square wave electrical pulses are applied to the tissue. The amplitude of each pulse ranges from about 100 volts to about 1500 volts, e.g., about 100 volts, about 200 volts, about 300 volts, about 400 volts, about 500 volts, about 600 volts, about 700 volts, about 800 volts, about 900 volts, about 1000 volts, about 1100 volts, about 1200 volts, about 1300 volts, about 1400 volts, or about 1500 volts or about 1-1.5 kV/cm, based on the spacing between electrodes. Each pulse has a duration of about 1 μs to about 1000 μs, e.g., about 1 μs, about 10 μs, about 50 μs, about 100 μs, about 200 μs, about 304 μs, about 400 μs, about 500 μs, about 600 μs, about 700 μs, about 800 μs, about 900 μs, or about 1000 μs, and a pulse frequency on the order of about 140 Hz. The polarity of the pulses may be reversed during the electroporation procedure by switching the connectors to the pulse generator. Pulses are repeated multiple times. The electroporation parameters (e.g. voltage amplitude, duration of pulse, number of pulses, depth of electrode insertion and frequency) will vary based on target tissue type, number of electrodes used and distance of electrode spacing, as would be understood by one of ordinary skill in the art.
[0381]Immediately after completion of the pulse regimen, subjects receiving electroporation can be optionally treated with membrane stabilizing agents to prolong cell membrane permeability as a result of the electroporation. Examples of membrane stabilizing agents include, but are not limited to, steroids (e.g. dexamethasone, methylprednisone and progesterone), angiotensin II and vitamin E. A single dose of dexamethasone, approximately 0.1 mg per kilogram of body weight, should be sufficient to achieve a beneficial effect.
[0382]EAPD techniques such as electroporation can also be used for plasmids contained in liposome formulations. The liposome-plasmid suspension is administered to the animal or patient and the site of injection is treated with a safe but effective electrical field generated, for example, by a TriGrid needle array. The electroporation may aid in plasmid delivery to the cell by destabilizing the liposome bilayer so that membrane fusion between the liposome and the target cellular structure occurs. Electroporation may also aid in plasmid delivery to the cell by triggering the release of the plasmid, in high concentrations, from the liposome at the surface of the target cell so that the plasmid is driven across the cell membrane by a concentration gradient via the pores created in the cell membrane as a result of the electroporation.
[0383]Female BALB/c mice aged 8-10 weeks are anesthetized with inhalant isoflurane and maintained under anesthesia for the duration of the electroporation procedure. The legs are shaved prior to treatment. Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are administered to BALB/c mice (n=10) via unilateral injection in the quadriceps with 25 μg total of a plasmid DNA per mouse using a 0.3 cc insulin syringe and a 26 gauge, 1/2 length needle fitted with a plastic collar to regulate injection depth. Approximately one minute after injection, electrodes are applied. Modified caliper electrodes are used to apply the electrical pulse. See Hartikka J. et al. Mol Ther 4:407-415 (2001). The caliper electrode plates are coated with conductivity gel and applied to the sides of the injected muscle before closing to a gap of 3 mm for administration of pulses. EAPD is applied using a square pulse type at 1-10 Hz with a field strength of 100-500 V/cm, 1-10 pulses, of 10-100 ms each.
[0384]Mice are vaccinated ± EAPD at 0, 2 and 4 weeks. As endpoints, serum IgG titers against the various IV antigens are measured by ELISA and splenic T-cell responses are measured by antigen-specific production of IFN-gamma and IL-4 in ELISPOT assays. Standard chromium release assays are used to measure specific cytotoxic T lymphocyte (CTL) activity against the various IV antigens.
[0385]Rabbits (n=3) are given bilateral injections in the quadriceps muscle with plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, HA, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector. The implantation area is shaved and the TriGrid electrode array is implanted into the target region of the muscle. 3.0 mg of plasmid DNA is administered per dose through the injection port of the electrode array. An injection collet is used to control the depth of injection. Electroporation begins approximately one minute after injection of the plasmid DNA is complete. Electroporation is administered with a TriGrid needle array, with electrodes evenly spaced 7 mm apart, using an Ichor TGP-2 pulse generator. The array is inserted into the target muscle to a depth of about 1 to 2 cm. 4-8 pulses are administered. Each pulse has a duration of about 50-100 μs, an amplitude of about 1-1.2 kV/cm and a pulse frequency of 1 Hz. The injection and electroporation may be repeated.
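Pulse amplitudes in this Example are stated as field strengths (V/cm) "based on the spacing between electrodes." The sketch below converts the stated field strengths into nominal voltages for the 3 mm caliper gap used in mice and the 7 mm TriGrid spacing used in rabbits. It assumes the simple uniform-field relation V = E x d, which is an approximation for needle electrodes; the function names are illustrative and are not part of any device interface.

```python
# Sketch: nominal pulse voltage from a stated field strength and electrode gap.
# ASSUMPTION: uniform-field approximation V = E * d (approximate for needle arrays).
from typing import Tuple


def pulse_voltage(field_v_per_cm: float, gap_cm: float) -> float:
    """Nominal voltage between two electrodes for a given field strength."""
    return field_v_per_cm * gap_cm


def voltage_range(field_range_v_per_cm: Tuple[float, float],
                  gap_cm: float) -> Tuple[float, float]:
    """Nominal voltage range for a (low, high) field-strength range."""
    low, high = field_range_v_per_cm
    return pulse_voltage(low, gap_cm), pulse_voltage(high, gap_cm)


if __name__ == "__main__":
    # Mouse caliper electrodes: 100-500 V/cm across a 3 mm gap
    caliper = voltage_range((100.0, 500.0), 0.3)
    # Rabbit TriGrid array: 1-1.2 kV/cm with electrodes 7 mm apart
    trigrid = voltage_range((1000.0, 1200.0), 0.7)
    print(f"caliper, 3 mm gap: {caliper[0]:.0f}-{caliper[1]:.0f} V")
    print(f"TriGrid, 7 mm spacing: {trigrid[0]:.0f}-{trigrid[1]:.0f} V")
```

Under this approximation the rabbit protocol corresponds to nominal amplitudes within the 100 to 1500 volt range given in the general description of the procedure.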
[0386]Sera are collected from vaccinated rabbits at various time points. As endpoints, serum IgG titers against the various IV antigens are measured by ELISA, and PBMC T-cell proliferative responses are measured.
[0387]To test the effect of electroporation on therapeutic protein expression in non-human primates, male or female rhesus monkeys are given either 2 or 6 i.m. injections of plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, (0.1 to 10 mg DNA total per animal). Target muscle groups include, but are not limited to, bilateral rectus femoris, cranial tibialis, biceps, gastrocnemius or deltoid muscles. The target area is shaved and a needle array, comprising between 4 and 10 electrodes, spaced between 0.5-1.5 cm apart, is implanted into the target muscle. Once injections are complete, a sequence of brief electrical pulses is applied to the electrodes implanted in the target muscle using an Ichor TGP-2 pulse generator. The pulses have an amplitude of approximately 120-200V. The pulse sequence is completed within one second. During this time, the target muscle may make brief contractions or twitches. The injection and electroporation may be repeated.
[0388]Sera are collected from vaccinated monkeys at various time points. As endpoints, serum IgG titers against the various IV antigens are measured by ELISA and PBMC T-cell proliferative responses are measured by antigen-specific production of IFN-gamma and IL-4 in ELISPOT assays or by tetramer assays to detect and quantify antigen specific T-cells, with quantification being confirmed and phenotypic characterization accomplished by intracellular cytokine staining. Standard chromium release assays are used to measure specific cytotoxic T lymphocyte (CTL) activity against the various IV antigens.
Example 8
Combinatorial DNA Vaccine Using Heterologous Prime-Boost Vaccination
[0389]This Example describes vaccination with a combinatorial formulation including one or more polynucleotides comprising one or more codon-optimized coding regions encoding an IV protein or fragment, variant, or derivative thereof prepared with an adjuvant and/or transfection facilitating agent; and also an isolated IV protein or fragment, variant, or derivative thereof. Thus, antigen is provided in two forms. The exogenous isolated protein stimulates antigen specific antibody and CD4+ T-cell responses, while the polynucleotide-encoded protein, produced as a result of cellular uptake and expression of the coding region, stimulates a CD8+ T-cell response. Unlike conventional "prime-boost" vaccination strategies, this approach provides different forms of antigen in the same formulation. Because antigen expression from the DNA vaccine does not peak until 7-10 days after injection, the DNA vaccine provides a boost for the protein component. Furthermore, the formulation takes advantage of the immunostimulatory properties of the bacterial plasmid DNA.
[0390]A. Non-Codon Optimized NP Gene
[0391]This example demonstrates the efficacy of this procedure using a non-codon-optimized polynucleotide encoding NP, however, the methods described herein are applicable to any IV polynucleotide vaccine formulation. Because only a small amount of protein is needed in this method, it is conceivable that the approach could be used to reduce the dose of conventional vaccines, thus increasing the availability of scarce or expensive vaccines. This feature would be particularly important for vaccines against pandemic influenza or biological warfare agents.
[0392]An injection dose of 10 μg influenza A/PR/8/34 nucleoprotein (NP) DNA per mouse, prepared essentially as described in Ulmer, J. B., et al., Science 259:1745-49 (1993) and Ulmer, J. B. et al., J. Virol. 72:5648-53 (1998), was pre-determined in dose response studies to induce T cell and antibody responses in the linear range of the dose response and to result in a response rate of greater than 95% of mice injected. Each formulation, NP DNA alone, or NP DNA+/-NP protein formulated with Ribi I or the cationic lipids, DMRIE:DOPE or Vaxfectin®, was prepared in the recommended buffer for that vaccine modality. For injections with NP DNA formulated with cationic lipid, the DNA was diluted in 2× PBS to 0.2 mg/ml+/-purified recombinant NP protein (produced in baculovirus as described in Example 2) at 0.08 mg/ml. Each cationic lipid was reconstituted from a dried film by adding 1 ml of sterile water for injection (SWFI) to each vial and vortexing continuously for 2 min., then diluted with SWFI to a final concentration of 0.15 mM. Equal volumes of NP DNA (+/-NP protein) and cationic lipid were mixed to obtain a DNA to cationic lipid molar ratio of 4:1. For injections with DNA containing Ribi I adjuvant (Sigma), Ribi I was reconstituted with saline to twice the final concentration. Ribi I (2×) was mixed with an equal volume of NP DNA at 0.2 mg/ml in saline+/-NP protein at 0.08 mg/ml. For immunizations without cationic lipid or Ribi, NP DNA was prepared in 150 mM sodium phosphate buffer, pH 7.2. For each experiment, groups of 9 BALB/c female mice at 7-9 weeks of age were injected with 50 μl of NP DNA+/-NP protein, cationic lipid or Ribi I. Injections were given bilaterally in each rectus femoris at day 0 and day 21. The mice were bled by OSP on day 20 and day 33 and serum titers of individual mice were measured.
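The concentrations and volumes given above can be cross-checked against the stated 10 μg DNA dose per mouse and the stated 4:1 molar ratio. The following sketch makes three assumptions that are not stated in the paragraph: that the 0.15 mM figure refers to the cationic lipid, that the "DNA to cationic lipid molar ratio" is counted as moles of DNA phosphate per mole of cationic lipid, and that DNA averages about 330 g/mol per nucleotide. Function names are illustrative.

```python
# Sketch: consistency check of the DNA/lipid mixing scheme.
# ASSUMPTIONS (not from the text): 0.15 mM is the cationic lipid concentration,
# the molar ratio is DNA phosphate : cationic lipid, and DNA is ~330 g/mol
# per nucleotide.
AVG_NUCLEOTIDE_MW = 330.0  # g/mol


def phosphate_mM(dna_mg_per_ml: float) -> float:
    """Millimolar concentration of DNA phosphate for a DNA solution in mg/ml."""
    # mg/ml equals g/L; g/L divided by g/mol gives mol/L; times 1000 gives mM
    return dna_mg_per_ml / AVG_NUCLEOTIDE_MW * 1000.0


def phosphate_to_lipid_ratio(dna_mg_per_ml: float, lipid_mM: float) -> float:
    """Mole ratio after equal-volume mixing (both components are halved)."""
    return phosphate_mM(dna_mg_per_ml) / lipid_mM


def dose_per_mouse_ug(stock_mg_per_ml: float,
                      injection_ul: float = 50.0,
                      injections: int = 2) -> float:
    """Micrograms delivered per mouse after 1:1 dilution of the stock."""
    mixed_mg_per_ml = stock_mg_per_ml / 2.0  # equal-volume mixing halves it
    return mixed_mg_per_ml * injection_ul * injections  # mg/ml equals ug/ul


if __name__ == "__main__":
    print(f"phosphate:lipid ratio = {phosphate_to_lipid_ratio(0.2, 0.15):.2f}")
    print(f"NP DNA per mouse      = {dose_per_mouse_ug(0.2):.1f} ug")
    print(f"NP protein per mouse  = {dose_per_mouse_ug(0.08):.1f} ug")
```

Under these assumptions, 0.2 mg/ml DNA mixed with an equal volume of 0.15 mM lipid gives a ratio close to 4:1, and two 50 μl injections deliver about 10 μg of DNA per mouse, in agreement with the stated dose.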
[0393]NP specific serum antibody titers were determined by indirect binding ELISA using 96 well ELISA plates coated overnight at 4° C. with purified recombinant NP protein at 0.5 μg per well in BBS buffer pH 8.3. NP coated wells were blocked with 1% bovine serum albumin in BBS for 1 h at room temperature. Two-fold serial dilutions of sera in blocking buffer were incubated for 2 h at room temperature and detected by incubating with alkaline phosphatase conjugated (AP) goat anti-mouse IgG-Fc (Jackson Immunoresearch, West Grove, Pa.) at 1:5000 for 2 h at room temperature. Color was developed with 1 mg/ml para-nitrophenyl phosphate (Calbiochem, La Jolla, Calif.) in 50 mM sodium bicarbonate buffer, pH 9.8 and 1 mM MgCl2 and the absorbance read at 405 nm. The titer is the reciprocal of the last dilution exhibiting an absorbance value 2 times that of pre-bleed samples.
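The endpoint titer rule stated above can be sketched as a small function. This is an illustration under stated assumptions, not the analysis actually used: "2 times that of pre-bleed samples" is read here as "at least 2 times," the series is assumed to be ordered from least to most dilute, and the dilution factors and absorbance values are invented for demonstration.

```python
# Sketch: endpoint titer = reciprocal of the last dilution whose absorbance is
# at least `fold` times the pre-bleed absorbance.
# ASSUMPTIONS: ">= 2x" reading of the rule; dilutions ordered least to most dilute.
from typing import Optional, Sequence


def endpoint_titer(dilution_factors: Sequence[int],
                   absorbances: Sequence[float],
                   prebleed_absorbance: float,
                   fold: float = 2.0) -> Optional[int]:
    """Return the reciprocal dilution of the last positive well, or None."""
    cutoff = fold * prebleed_absorbance
    titer = None
    for factor, a405 in zip(dilution_factors, absorbances):
        if a405 >= cutoff:
            titer = factor      # still positive at this dilution
        else:
            break               # first negative well ends the series
    return titer


if __name__ == "__main__":
    # Hypothetical two-fold series starting at 1:100 (values are illustrative)
    factors = [100 * 2 ** i for i in range(8)]
    a405 = [2.10, 1.80, 1.30, 0.90, 0.50, 0.28, 0.15, 0.10]
    print(endpoint_titer(factors, a405, prebleed_absorbance=0.10))
```

Returning None when even the first dilution falls below the cutoff keeps a negative serum distinct from a low but measurable titer.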
[0394]Standard ELISPOT technology, used to identify the number of interferon gamma (IFN-γ) secreting cells after stimulation with specific antigen (spot forming cells per million splenocytes, expressed as SFU/million), was used for the CD4+ and CD8+ T-cell assays. For the screening assays, 3 mice from each group were sacrificed on day 34, 35, and 36. At the time of collection, spleens from each group were pooled, and single cell suspensions made in cell culture media using a dounce homogenizer. Red blood cells were lysed, and cells washed and counted. For the CD4+ and CD8+ assays, cells were serially diluted 3-fold, starting at 10^6 cells per well and transferred to 96 well ELISPOT plates pre-coated with anti-murine IFN-γ monoclonal antibody. Spleen cells were stimulated with the H-2Kd binding peptide, TYQRTRALV (SEQ ID NO:81), at 1 μg/ml and recombinant murine IL-2 at 1 U/ml for the CD8+ assay and with purified recombinant NP protein at 20 μg/ml for the CD4+ assay. Cells were stimulated for 20-24 hours at 37° C. in 5% CO2, then the cells were washed out and biotin labeled anti-IFN-γ monoclonal antibody added for a 2 hour incubation at room temperature. Plates were washed and horseradish peroxidase-labeled avidin was added. After a 1-hour incubation at room temperature, AEC substrate was added and "spots" developed for 15 min. Spots were counted using the Immunospot automated spot counter (C.T.L. Inc., Cleveland, Ohio). Thus, CD4+ and CD8+ responses were measured in three separate assays, using spleens collected on each of three consecutive days.
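The normalization of spot counts to SFU/million and the 3-fold dilution series described above can be sketched as follows. The spot count in the example is invented for illustration, and the function names are hypothetical; background subtraction and replicate averaging, which the paragraph does not describe, are omitted.

```python
# Sketch: ELISPOT normalization to spot forming units (SFU) per million cells.
# ASSUMPTION: no background subtraction or replicate averaging is applied.
from typing import List


def dilution_series(start_cells: float = 1_000_000,
                    fold: int = 3,
                    wells: int = 4) -> List[float]:
    """Cells per well for a serial dilution starting at start_cells."""
    return [start_cells / fold ** i for i in range(wells)]


def sfu_per_million(spots: float, cells_per_well: float) -> float:
    """Scale a raw spot count to spots per one million plated cells."""
    return spots / cells_per_well * 1_000_000


if __name__ == "__main__":
    cells = dilution_series()
    for n in cells:
        print(f"{n:,.0f} cells per well")
    # Hypothetical count: 50 spots in the second well of the series
    print(f"{sfu_per_million(50, cells[1]):.0f} SFU/million")
```

Counting at several dilutions allows a well with well-separated spots to be chosen before scaling to one million cells.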
[0395]Three weeks after a single injection, antibody responses in mice receiving vaccine formulations containing purified protein were 6 to 8-fold higher than for mice receiving NP DNA only (FIG. 5, Table 15). The titers for mice receiving DNA and protein formulated with a cationic lipid were similar to those for mice receiving protein in Ribi adjuvant or DNA and protein in Ribi adjuvant. These data indicate that the levels of antibody seen when protein is injected with an adjuvant can be obtained with DNA vaccines containing DNA and protein formulated with a cationic lipid, without the addition of conventional adjuvant.
[0396]Twelve days after a second injection, antibody responses in mice receiving vaccine formulations containing purified protein were 9 to 129-fold higher than for mice receiving NP DNA only (FIG. 6, Table 15). With a mean anti-NP antibody titer of 750,933 at day 33, the titers for mice receiving DNA and protein formulated with Vaxfectin® were 25-fold higher than for mice receiving DNA alone (mean titer=30,578), and nearly as high as those for mice injected with protein in Ribi adjuvant (mean titer=1,748,133).
TABLE-US-00078 TABLE 15
Fold increase in antibody response over DNA alone

Formulation                    20 days after one injection    12 days after second injection
protein + Ribi                 7X (p = 0.0002)                57X (p = 0.002)
DNA + protein + DMRIE:DOPE     6X (p = 0.00005)               9X (p = 0.0002)
DNA + protein + Vaxfectin®     8X (p = 0.00003)               25X (p = 0.0004)
DNA + protein + Ribi           7X (p = 0.01)                  129X (p = 0.003)

*protein = purified recombinant NP protein
[0397]As expected, an NP specific CD8+ T-cell IFN-γ response was not detected in spleens of mice injected with NP protein in Ribi (FIG. 7). All of the other groups had detectable NP specific CD8+ T-cell responses. The CD8+ T-cell responses for all groups receiving vaccine formulations containing NP DNA were not statistically different from each other.
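The fold increases reported for day 33 follow from the mean titers quoted in paragraph [0396]. The sketch below reproduces that arithmetic for the two formulations whose mean titers are given in the text; the dictionary keys and function name are illustrative labels only.

```python
# Sketch: fold increase in mean antibody titer over the DNA-alone group,
# using the day 33 mean titers quoted in paragraph [0396].
def fold_increase(mean_titer: float, reference_titer: float) -> float:
    """Ratio of a group's mean titer to the reference group's mean titer."""
    return mean_titer / reference_titer


if __name__ == "__main__":
    dna_alone = 30_578
    groups = {
        "DNA + protein + Vaxfectin": 750_933,
        "protein + Ribi": 1_748_133,
    }
    for name, titer in groups.items():
        print(f"{name}: {fold_increase(titer, dna_alone):.1f}-fold")
```

The computed ratios round to the 25X and 57X entries shown in the second column of Table 15.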
[0398]Mice from all of the groups had detectable NP specific CD4+ T-cell responses (FIG. 8). The CD4+ T-cell responses of splenocytes from groups receiving vaccine formulations containing NP DNA and NP protein formulated with cationic lipid were 2-6 fold higher than the group injected with DNA alone.
B. Codon-Optimized IV Constructs
[0399]Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are used in the prime-boost compositions described herein. For the prime-boost modalities, the same protein may be used for the boost, e.g., DNA encoding NP with NP protein, or a heterologous boost may be used, e.g., DNA encoding NP with an M1 protein boost. Each formulation, the plasmid comprising a coding region for the IV protein alone, or the plasmid comprising a coding region for the IV protein plus the isolated protein, is formulated with Ribi I or the cationic lipids, DMRIE:DOPE or Vaxfectin®. The formulations are prepared in the recommended buffer for that vaccine modality. Exemplary formulations, using NP as an example, are described herein. Other plasmid/protein formulations, including multivalent formulations, can be easily prepared by one of ordinary skill in the art by following this example. For injections with DNA formulated with cationic lipid, the DNA is diluted in 2× PBS to 0.2 mg/ml+/-purified recombinant NP protein at 0.08 mg/ml. Each cationic lipid is reconstituted from a dried film by adding 1 ml of sterile water for injection (SWFI) to each vial and vortexing continuously for 2 min., then diluted with SWFI to a final concentration of 0.15 mM. Equal volumes of NP DNA (+/-NP protein) and cationic lipid are mixed to obtain a DNA to cationic lipid molar ratio of 4:1. For injections with DNA containing Ribi I adjuvant (Sigma), Ribi I is reconstituted with saline to twice the final concentration. Ribi I (2×) is mixed with an equal volume of NP DNA at 0.2 mg/ml in saline+/-NP protein at 0.08 mg/ml. For immunizations without cationic lipid or Ribi, NP DNA is prepared in 150 mM sodium phosphate buffer, pH 7.2. For each experiment, groups of 9 BALB/c female mice at 7-9 weeks of age are injected with 50 μl of NP DNA+/-NP protein, cationic lipid or Ribi I. The formulations are administered to BALB/c mice (n=10) via bilateral injection in each rectus femoris at day 0 and day 21.
[0400]The mice are bled on day 20 and day 33 and serum titers of individual mice to the various IV antigens are measured. Serum antibody titers specific for the various IV antigens are determined by ELISA. Standard ELISPOT technology, used to identify the number of interferon gamma (IFN-γ) secreting cells after stimulation with specific antigen (spot forming cells per million splenocytes, expressed as SFU/million), is used for the CD4+ and CD8+ T-cell assays using 3 mice from each group vaccinated above, sacrificed on day 34, 35 and 36, post vaccination.
Example 9
Murine Challenge Model of Influenza
General Experimental Procedure
[0401]A murine challenge model with influenza A virus is used to test the efficacy of the immunotherapies. The model used is based on that described in Ulmer, J. B., et al., Science 259:1745-49 (1993) and Ulmer, J. B. et al., J. Virol. 72: 5648-53 (1998), both of which are incorporated herein by reference in their entireties. This model utilizes a mouse-adapted strain of influenza A/HK/8/68 which replicates in mouse lungs and is titered in tissue culture in Madin Darby Canine Kidney cells. The LD90 of this mouse-adapted influenza virus is determined in female BALB/c mice age 13-15 weeks. In this model, two types of challenge study can be conducted: lethal challenge, where the virus is administered intranasally to heavily sedated mice under ketamine anesthesia; and a sub-lethal challenge, where mice are not anesthetized when the viral inoculum is administered (also intranasally). The endpoint for lethal challenge is survival, but loss in body mass and body temperature can also be monitored. The read-outs for the sublethal challenge include lung virus titer and loss in body mass and body temperature.
[0402]In the studies described here, mice are subjected to lethal challenge. Mice that are previously vaccinated with DNA encoding IV antigens are anesthetized and challenged intranasally with 0.02 mL of mouse-adapted influenza A/HK/8/68 (mouse passage #6), diluted 1 to 10,000 (500 PFU) in PBS containing 0.2% wt/vol BSA.
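The challenge dose in paragraph [0402] implies a titer for the undiluted virus stock, which the following sketch back-calculates. It assumes that the 500 PFU figure refers to the full 0.02 mL inoculum after the 1 to 10,000 dilution; the function name is hypothetical and the resulting titer is an inference, not a value given in the text.

```python
def implied_stock_titer(pfu_per_dose: float, dose_volume_ml: float,
                        dilution_factor: float) -> float:
    """Back-calculate the undiluted stock titer in PFU/mL."""
    # PFU/mL of the diluted inoculum, scaled back up by the dilution factor.
    return pfu_per_dose / dose_volume_ml * dilution_factor


stock_pfu_per_ml = implied_stock_titer(500, 0.02, 10_000)
print(f"implied stock titer: {stock_pfu_per_ml:.2e} PFU/mL")
```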
[0403]These challenge studies utilize groups of 10 mice. The route of administration is intramuscular in rectus femoris (quadriceps), using 0.1 μg up to 1 mg total plasmid DNA. Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are tested singly and in multivalent cocktails for the ability to protect against challenge. The plasmids are formulated with an adjuvant and/or a transfection facilitating agent, e.g., Vaxfectin®, by methods described elsewhere herein. Mice are vaccinated on days 0 and 21 using amounts of plasmids as described in Example 6. Subsequent injections can be administered. Nasal challenge of mice takes place 3 weeks after the final immunization, and animals are monitored daily for body mass, hypothermia, general appearance and then death.
[0404]For each group of mice that are studied, blood is taken at 2 weeks following the second injection, and/or any subsequent injection, and the animals are terminally bled two weeks following the last injection. Antibody titers are determined for M2, M1, and NP using ELISAs as previously described.
Plasmids
[0405]As described above, constructs of the present invention were inserted into the expression vector VR10551. VR10551 is an expression vector without any transgene insert.
[0406]VR4750 contains the coding sequence for hemagglutinin (HA) (H3N2) from mouse adapted A/Hong Kong/68. The DNA was prepared using Qiagen plasmid purification kits.
Experimental Procedure
[0407]The experimental procedure for the following example is as described above, with particular parameters and materials employed as described herein. In order to provide a pDNA control for protection in the mouse influenza challenge model, the hemagglutinin (HA) gene was cloned from the influenza A/HK/8/68 challenge virus stock, which was passaged 6 times in mice.
[0408]Mice were vaccinated twice at 3 week intervals with either 100 μg pDNA VR4750 encoding the HA gene cloned directly from the mouse-adapted influenza A/HK/8/68 strain, or with 100 μg blank vector pDNA (VR10551). An additional control group was immunized intranasally with live A/HK/8/68 virus (500 PFU). Three weeks after the last injection, mice were challenged intranasally with mouse-adapted influenza A/HK/8/68 with one of 3 doses (50, 500 and 5,000 PFU). Following viral challenge, mice were monitored daily for symptoms of disease, loss in body mass and survival.
[0409]FIG. 9 shows that homologous HA-pDNA vaccinated mice are completely protected over a range of viral challenge doses (FIG. 9A) and did not suffer significant weight loss (FIG. 9B) during the 3 week period following challenge.
[0410]Based on these results, future mouse flu challenge studies can include VR4750 (HA) pDNA as a positive control for protection and utilize 500 PFU, which is the LD90 for this mouse-adapted virus, as the challenge dose.
Example 10
Challenge in Non-Human Primates
[0411]The purpose of these studies is to evaluate three or more of the optimal plasmid DNA vaccine formulations for immunogenicity in non-human primates. Rhesus or cynomolgus monkeys (6/group) are vaccinated with plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, HA, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, intramuscularly with 0.1 to 2 mg DNA combined with cationic lipid, and/or poloxamer and/or aluminum phosphate based or other adjuvants at 0, 1 and 4 months.
[0412]Blood is drawn twice at baseline and then again at the time of and two weeks following each vaccination, and then again 4 months following the last vaccination. At 2 weeks post-vaccination, plasma is analyzed for humoral response and PBMCs are monitored for cellular responses, by standard methods described herein. Animals are monitored for 4 months following the final vaccination to determine the durability of the immune response.
[0413]Animals are challenged within 2-4 weeks following the final vaccination. Animals are challenged intratracheally with the suitable dose of virus based on preliminary challenge studies. Nasal swabs, pharyngeal swabs and lung lavages are collected at days 0, 2, 4, 6, 8 and 11 post-challenge and will be assayed for cell-free virus titers on monkey kidney cells. After challenge, animals are monitored for clinical symptoms, e.g., rectal temperature, body weight, leukocyte counts, and in addition, hematocrit and respiratory rate. Oropharyngeal swab samples are taken to allow determination of the length of viral shedding. Illness is scored using the system developed by Berendt & Hall (Infect Immun 16:476-479 (1977)), and will be analyzed by analysis of variance and the method of least significant difference.
Example 11
Challenge in Birds
[0414]In this example, various vaccine formulations of the present invention are tested in the chicken influenza model. For these studies an IV H5N1 virus, known to infect birds, is used. Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are formulated with cationic lipid, and/or poloxamer and/or aluminum phosphate based or other adjuvants. The vaccine formulations are delivered at a dose of about 1-10 μg, delivered IM into the defeathered breast area, at 0 and 1 month. The animals are bled for antibody results 3 weeks following the second vaccine. Antibody titers against the various IV antigens are determined using techniques described in the literature. See, e.g., Kodihalli S. et al., Vaccine 18:2592-9 (2000). The birds are challenged intranasally with 0.1 mL containing 100 LD50 3 weeks post second vaccination. The birds are monitored daily for 10 days for disease symptoms, which include loss of appetite, diarrhea, swollen faces, cyanosis, paralysis and death. Tracheal and cloacal swabs are taken 4 days following challenge for virus titration.
Example 12
Formulation Selection Studies
[0415]The potency of different vaccine formulations was evaluated in different experimental studies using the NP protein of Influenza A/PR/8/34.
Vaccination Regimen
[0416]Groups of nine, six- to eight-week old BALB/c mice (Harlan-Sprague-Dawley) received bilateral (50 μL/leg) intramuscular (rectus femoris) injections of plasmid DNA. Control mice received DNA in PBS alone. Mice received injections on days 0, 20 and 49. Mice were bled by OSP on day 62, and NP-specific antibodies analyzed by ELISA. Splenocytes were harvested from 3 mice/group/day for three sequential days beginning day 63, and NP-specific T cells were analyzed by IFN-γ ELISPOT using overlapping peptide stimulation.
Cell Culture Media
[0417]Splenocyte cultures were grown in RPMI-1640 medium containing 25 mM HEPES buffer and L-glutamine and supplemented with 10% (v/v) FBS, 55 μM β-mercaptoethanol, 100 U/mL of penicillin G sodium salt, and 100 μg/mL of streptomycin sulfate.
Standard Influenza NP Indirect Binding Assay
[0418]NP specific serum antibody titers were determined by indirect binding ELISA using 96 well ELISA plates coated overnight at 4° C. with purified recombinant NP protein at 0.5 μg per well in BBS buffer, pH 8.3. NP coated wells were blocked with 1% bovine serum albumin in BBS for 1 hour at room temperature. Two-fold serial dilutions of sera in blocking buffer were incubated for 2 hours at room temperature and detected by incubating with alkaline phosphatase conjugated (AP) goat anti-mouse IgG-Fc (Jackson Immunoresearch, West Grove, Pa.) at 1:5000 for 2 hours at room temperature. Color was developed with 1 mg/ml para-nitrophenyl phosphate (Calbiochem, La Jolla, Calif.) in 50 mM sodium bicarbonate buffer, pH 9.8 and 1 mM MgCl2 and the absorbance read at 405 nm. The titer is the reciprocal of the last dilution exhibiting an absorbance value 2 times that of pre-bleed samples.
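The endpoint titer rule in paragraph [0418] can be sketched as a short function. This is a minimal illustration under stated assumptions: "2 times that of pre-bleed samples" is read as an absorbance at or above twice the pre-bleed value, the starting dilution of 1:100 is hypothetical, and all absorbance values below are invented for the example rather than taken from the studies.

```python
def endpoint_titer(reciprocal_dilutions, absorbances, prebleed_absorbance, fold=2.0):
    """Return the reciprocal of the last dilution in the series whose A405
    is at least `fold` times the pre-bleed absorbance, or None if even the
    first dilution falls below that cutoff."""
    cutoff = fold * prebleed_absorbance
    titer = None
    for reciprocal, a405 in zip(reciprocal_dilutions, absorbances):
        if a405 >= cutoff:
            titer = reciprocal
        else:
            break  # stop at the first dilution that drops below the cutoff
    return titer


# Hypothetical two-fold series starting at 1:100 (invented example data).
dilutions = [100 * 2 ** i for i in range(8)]          # 100, 200, ..., 12800
a405 = [2.10, 1.80, 1.30, 0.90, 0.50, 0.25, 0.12, 0.08]
titer = endpoint_titer(dilutions, a405, prebleed_absorbance=0.10)
print(titer)
```

Stopping at the first sub-cutoff well is a design choice of this sketch; it avoids counting a later well that rises above the cutoff through noise.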
Standard NP CD8+ and CD4+ T-Cell ELISPOT Assay
[0419]Standard ELISPOT technology, used to identify the number of interferon gamma (IFN-γ) secreting cells after stimulation with specific antigen (spot forming cells per million splenocytes, expressed as SFU/million), was used for the CD4+ and CD8+ T-cell assays. Three mice from each group were sacrificed on each of three consecutive days. At the time of collection, spleens from each group were pooled, and single cell suspensions were made in cell culture media using a dounce homogenizer. Red blood cells were lysed, and cells were washed and counted. For the CD4+ and CD8+ assays, cells were serially diluted 3-fold, starting at 10^6 cells per well and transferred to 96 well ELISPOT plates pre-coated with anti-murine IFN-γ monoclonal antibody. Spleen cells were stimulated with the H-2Kd binding peptide, TYQRTRALV, at 1 μg/ml and recombinant murine IL-2 at 1 U/ml for the CD8+ assay and with purified recombinant NP protein at 20 μg/ml for the CD4+ assay. Cells were stimulated for 20-24 hours at 37° C. in 5% CO2, and then the cells were washed out and biotin labeled anti-IFN-γ monoclonal antibody added for a 2 hour incubation at room temperature. Plates were washed and horseradish peroxidase-labeled avidin was added. After a 1-hour incubation at room temperature, AEC substrate was added and "spots" developed for 15 minutes. Spots were counted using the Immunospot automated spot counter (C.T.L. Inc., Cleveland, Ohio).
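The conversion from a raw spot count to SFU per million splenocytes, given the 3-fold dilution series described in paragraph [0419], can be sketched as below. The function names and the spot count are hypothetical, and the sketch applies no background subtraction because none is described in the text.

```python
def three_fold_series(start_cells: float, n_wells: int) -> list[float]:
    """Cells per well for a 3-fold serial dilution starting at start_cells."""
    return [start_cells / 3 ** i for i in range(n_wells)]


def sfu_per_million(spot_count: float, cells_per_well: float) -> float:
    """Normalize a raw spot count to spot forming units per 10^6 cells."""
    return spot_count * 1_000_000 / cells_per_well


cells = three_fold_series(1_000_000, 4)   # 10^6, then three 3-fold steps
# Hypothetical reading: 81 spots counted in the second well of the series.
normalized = sfu_per_million(81, cells[1])
print(round(normalized))
```

Reading the count from a diluted well and scaling it up is useful when the undiluted well has too many spots to resolve.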
Experiment 1
[0420]The purpose of this experiment was to determine a dose response to naked pDNA (VR4700) and for pDNA formulated with VF-P1205-02A. VR4700 is a plasmid encoding influenza A/PR/8/34 nucleoprotein (NP) in a VR10551 backbone. VR10551 is an expression vector without any transgene insert. VF-P1205-02A is a formulation containing a poloxamer with a POP molecular weight of 12 KDa and POE of 5% (CRL1005) at a DNA:poloxamer:BAK ratio of 5 mg/ml:7.5 mg/ml:0.3 mM. The results of this experiment are shown in the following Table:
TABLE-US-00079 TABLE 16
DNA dose   CRL1005 dose   BAK conc.   Serum Ab titers      CD8+ T cells   CD4+ T cells
(μg)       (μg)           (μM)        (total IgG, n = 9)   (SFU/10^6)     (SFU/10^6)
1          --             --          11,206               28             24
10         --             --          31,289               77             99
100        --             --          65,422               243            304
1          1.5            0.06        9,956                48             57
10         15             0.6         45,511               174            220
100        150            6           79,644               397            382
[0421]The results of this experiment indicate that increasing the dose of DNA increases both the humoral and cell mediated immune responses. When the DNA is formulated with poloxamer and BAK, increasing the dose also increases both the humoral and cell mediated immune responses.
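The CRL1005 doses listed in Table 16 follow from the DNA:poloxamer mass ratio stated for VF-P1205-02A (5 mg/ml DNA to 7.5 mg/ml CRL1005). The sketch below only reproduces that scaling; the function name is hypothetical, and the BAK column is deliberately left out because converting it requires an injection volume that this example does not assume.

```python
# Mass concentrations from the stated DNA:poloxamer ratio of VF-P1205-02A.
DNA_MG_PER_ML = 5.0
CRL1005_MG_PER_ML = 7.5


def crl1005_dose_ug(dna_dose_ug: float) -> float:
    """CRL1005 co-delivered with a given DNA dose at the fixed mass ratio."""
    return dna_dose_ug * CRL1005_MG_PER_ML / DNA_MG_PER_ML


doses = {dna: crl1005_dose_ug(dna) for dna in (1, 10, 100)}
for dna_ug, crl_ug in doses.items():
    print(f"{dna_ug} ug DNA -> {crl_ug} ug CRL1005")
```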
Experiment 2
[0422]The purpose of this experiment was to determine a dose response to CRL1005, with a fixed pDNA (VR4700) dose and no BAK. The results of this experiment are shown in the following Table:
TABLE-US-00080 TABLE 17
DNA dose   CRL1005 dose   Serum Ab titers      CD8+ T cells   CD4+ T cells
(μg)       (μg)           (total IgG, n = 9)   (SFU/10^6)     (SFU/10^6)
10         --             27,733               45             46
10         15             38,400               69             86
10         50             46,933               66             73
10         150            54,044               90             97
10         450            76,800               90             92
10         750            119,467              83             60
[0423]The results of this experiment indicate that increasing the dose of CRL1005 increases both the humoral and cell mediated immune responses.
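One way to read the Table 17 dose response is as fold change relative to the group that received DNA without CRL1005. The sketch below computes those ratios from the tabulated values; the data structure and function name are hypothetical, and no statistical inference is implied.

```python
# CRL1005 dose (ug) -> (serum Ab titer, CD8+ SFU/10^6, CD4+ SFU/10^6),
# transcribed from Table 17; dose 0 is the group without CRL1005.
TABLE_17 = {
    0: (27_733, 45, 46),
    15: (38_400, 69, 86),
    50: (46_933, 66, 73),
    150: (54_044, 90, 97),
    450: (76_800, 90, 92),
    750: (119_467, 83, 60),
}


def fold_over_baseline(table, baseline_key=0):
    """Divide each readout by the corresponding baseline readout."""
    base = table[baseline_key]
    return {
        dose: tuple(value / ref for value, ref in zip(readouts, base))
        for dose, readouts in table.items()
    }


fold = fold_over_baseline(TABLE_17)
for dose, (ab, cd8, cd4) in fold.items():
    print(f"{dose:>4} ug CRL1005: Ab x{ab:.2f}, CD8+ x{cd8:.2f}, CD4+ x{cd4:.2f}")
```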
Experiment 3
[0424]The purpose of this experiment was to compare immune responses of DMRIE:DOPE (1:1, mol:mol) and Vaxfectin® cationic lipid formulations at different pDNA/cationic lipid molar ratios. The results of this experiment are shown in the following Table:
TABLE-US-00081 TABLE 18
DNA dose   DMRIE:DOPE            Vaxfectin®            Serum Ab titers      CD8+ T cells   CD4+ T cells
(μg)       pDNA/cationic lipid   pDNA/cationic lipid   (total IgG, n = 9)   (SFU/10^6)     (SFU/10^6)
           molar ratio           molar ratio
10         --                    --                    17,778               57             54
10         4:1                   --                    48,356               47             112
10         2:1                   --                    49,778               44             133
10         --                    4:1                   88,178               68             464
10         --                    2:1                   150,756              46             363
[0425]The results of this experiment indicate that formulating the plasmid with DMRIE:DOPE or Vaxfectin® increases both the humoral and cell mediated immune responses.
Experiment 4
[0426]The purpose of this experiment was first to compare immune responses of DMRIE:DOPE (1:1, mol:mol) at pDNA/cationic lipid molar ratios of 4:1 as an MLV (multi lamellar vesicle formulation--multi-vial) or SUV (small unilamellar vesicles--single-vial) formulation. Second, it was to compare sucrose (lyophilized and frozen) and PBS based formulations. The results of this experiment are shown in the following Table:
TABLE-US-00082 TABLE 19
DNA dose   Formulation   Buffer                           Serum Ab titers      CD8+ T cells   CD4+ T cells
(μg)                                                      (total IgG, n = 9)   (SFU/10^6)     (SFU/10^6)
10         --            PBS, pH 7.2                      21,333               107            118
10         SUV           PBS, pH 7.2                      15,644               144            169
10         SUV           PBS, pH 7.8                      13,511               114            173
10         SUV           Sucrose, frozen/thawed, pH 7.8   15,644               103            119
10         SUV           Sucrose, lyophilized, pH 7.8     10,311               ND             246
10         MLV           PBS, pH 7.2                      29,867               170            259
* ND - could not be counted due to high background
[0427]The results of this experiment indicate that formulating the plasmid with DMRIE:DOPE stimulates both the humoral and cell mediated immune responses.
Experiment 5
[0428]The purpose of this experiment was first to determine what effect changing the ratio of DMRIE to DOPE has on immune response at pDNA/cationic lipid molar ratios of 4:1 as an MLV (multi-vial, in PBS) or SUV (single-vial in PBS) formulation. Second, it was to compare the effect of changing the co-lipid from DOPE to cholesterol. The results of this experiment are shown in the following Table:
TABLE-US-00083 TABLE 20
DNA dose   Formulation    DMRIE:DOPE   Serum Ab titers      CD8+ T cells   CD4+ T cells
(μg)                                   (total IgG, n = 9)   (SFU/10^6)     (SFU/10^6)
10         --             --           19,342               65             98
10         MLV, DM:DP     1:0          38,684               70             126
10         MLV, DM:DP     3:1          75,093               82             162
10         MLV, DM:DP     1:1          53,476               78             186
10         SUV, DM:DP     1:1          36,409               96             106
10         MLV, DM:Chol   1:1          52,338               65             154
[0429]The results of this experiment indicate that formulating the plasmid with DMRIE:DOPE stimulates both the humoral and cell mediated immune responses. Changing the co-lipid from DOPE to cholesterol also stimulates both the humoral and cell mediated immune responses.
Experiment 6
[0430]The purpose of this experiment was to obtain a dose response to pDNA formulated with DMRIE:DOPE (1:1, mol:mol) at a 4:1 pDNA/cationic lipid molar ratio. The results of this experiment are shown in the following Table:
TABLE-US-00084 TABLE 21
DNA dose   Formulation   Serum Ab titers      CD8+ T cells   CD4+ T cells
(μg)                     (total IgG, n = 9)   (SFU/10^6)     (SFU/10^6)
10         --            22,044               119            154
1          MLV           5,600                22             67
3          MLV           22,756               46             97
10         MLV           45,511               199            250
30         MLV           60,444               274            473
100        MLV           91,022               277            262
[0431]The results of this experiment indicate that when the plasmid is formulated with DMRIE:DOPE, increasing the dose also increases both the humoral and cell mediated immune responses.
Example 13
In Vitro Expression of Influenza Antigens
Plasmid Vector
[0432]Polynucleotides of the present invention were inserted into eukaryotic expression vector backbones VR10551, VR10682 and VR6430 all of which are described previously. The VR10551 vector is built on a modified pUC18 background (see Yanisch-Perron, C., et al. Gene 33:103-119 (1985)), and contains a kanamycin resistance gene, the human cytomegalovirus immediate early 1 promoter/enhancer and intron A, and the bovine growth hormone transcription termination signal, and a polylinker for inserting foreign genes. See Hartikka, J., et al., Hum. Gene Ther. 7:1205-1217 (1996). However, other standard commercially available eukaryotic expression vectors may be used in the present invention, including, but not limited to: plasmids pcDNA3, pHCMV/Zeo, pCR3.1, pEF1/His, pIND/GS, pRc/HCMV2, pSV40/Zeo2, pTRACER-HCMV, pUB6N5-His, pVAX1, and pZeoSV2 (available from Invitrogen, San Diego, Calif.), and plasmid pCI (available from Promega, Madison, Wis.).
[0433]Various plasmids were generated by cloning the nucleotide sequence for the following influenza A antigens: segment 7 (encodes both M1 and M2 proteins via differential splicing), M2 and NP into expression constructs as described below and pictured in FIG. 13.
[0434]Plasmids VR4756 (SEQ ID NO:91), VR4759 (SEQ ID NO:92) and VR4762 (SEQ ID NO:93) were created by cloning the nucleotide sequence encoding the consensus sequence for the following influenza A antigens respectively: segment 7 (encoding both the M1 and M2 proteins by differential splicing), M2 and NP into the VR10551 backbone. The VR4756, VR4759 and VR4762 plasmids are also described in Table 13.
[0435]The VR4764 (SEQ ID NO:95) and VR4765 (SEQ ID NO:96) plasmids were constructed by ligating the segment 7 and NP coding regions from VR4756 and VR4762 respectively into the VR10682 vector. Specifically, the VR4756 vector was digested with EcoRV and SalI restriction endonucleases and the blunted fragment was ligated into the VR10682 backbone, which had been digested with the EcoRV restriction endonuclease. The VR4765 vector was constructed by digesting the VR4762 vector with EcoRV and NotI and ligating the NP coding region into the VR10682 backbone digested with the same restriction endonucleases.
[0436]VR4766 (SEQ ID NO:97) and VR4767 (SEQ ID NO:98) contain a CMV promoter/intron A-NP expression cassette and a RSV promoter (from VCL1005)-segment 7 expression cassette in the same orientation (VR4766) or opposite orientation (VR4767). These plasmids were generated by digesting VR4762 with the DraIII restriction endonuclease and cutting the RSV-segment 7-mRBG cassette from VR4764 with EcoRV and BamHI restriction endonucleases. After exonuclease digestion with the Klenow fragment of DNA polymerase I, the EcoRV/BamHI fragment was cloned into the DraIII digested VR4762 vector. Both insert orientations were obtained by this blunt end cloning method.
[0437]VR4768 (SEQ ID NO:99) and VR4769 (SEQ ID NO:100), containing a CMV promoter/intron A-segment 7 expression cassette and a RSV promoter-NP expression cassette, were similarly derived. VR4756 was digested with the DraIII restriction endonuclease and blunted by treatment with the Klenow fragment of DNA Polymerase I. The cassette containing the RSV promoter, NP coding region and mRBG terminator was removed from VR4765 by digesting with KpnI and NdeI restriction endonucleases. The fragment was also blunted with the Klenow fragment of DNA polymerase I and ligated into the DraIII-digested VR4756 vector in both gene orientations.
[0438]VR4770 (SEQ ID NO:101), VR4771 (SEQ ID NO:102) and VR4772 (SEQ ID NO:103) were constructed by cloning the coding regions from VR4756, VR4762 and VR4759 respectively into the VR6430 vector backbone. Specifically, the segment 7 gene from VR4756 was removed using SalI and EcoRV restriction endonucleases and blunted with the Klenow fragment of DNA polymerase I. The VR6430 plasmid was digested with EcoRV and BamHI and the vector backbone fragment was blunted with the Klenow fragment of DNA polymerase I. The segment 7 gene fragment was then ligated into the VR6430 vector backbone. VR4771 was derived by removing the NP insert from VR4762 following EcoRV and BglII restriction endonuclease digestion and the fragment was ligated into the VR6430 vector backbone which had been digested with the same restriction endonucleases. VR4772 was derived by subcloning the M2 coding region from VR4759 as a blunted SalI-EcoRV fragment and ligating into the VR6430 vector backbone from a blunted EcoRV-BamHI digest.
[0439]VR4773 (SEQ ID NO:104) and VR4774 (SEQ ID NO:105) contain a CMV promoter/intron A-segment 7 expression cassette and a RSV/R-NP expression cassette with the genes in the same or opposite orientation. These plasmids were generated by digesting VR4756 with the DraIII restriction endonuclease, blunting, and ligating to the RSV/R-NP-BGH fragment from VR4771 (VR4771 digested with NdeI and SfiI and then blunted).
[0440]VR4775 (SEQ ID NO:106) and VR4776 (SEQ ID NO:107) contain a CMV promoter/intron A-NP expression cassette and a RSV/R-segment 7 expression cassette with the genes in the same or opposite orientation. These plasmids were generated by digesting VR4762 with the DraIII restriction enzyme and blunting with the Klenow fragment of DNA polymerase. The RSV/R-segment 7-BGH fragment was generated by digesting VR4770 with NdeI and SfiI restriction endonucleases and ligating the blunted fragment with the DraIII restriction endonuclease digested VR4762.
[0441]VR4777 (SEQ ID NO:108) and VR4778 (SEQ ID NO:109) contain a CMV promoter/intron A-NP expression cassette and a RSV/R-M2 expression cassette in the same or opposite orientation. These plasmids were generated by digesting VR4762 with the MscI restriction endonuclease, digesting VR4772 with NdeI and SfiI restriction endonucleases and treating the RSV/R-M2-BGH with the Klenow fragment of DNA polymerase, followed by ligation of these two gel purified fragments.
[0442]VR4779 and VR4780 contain a CMV promoter/intron A-M2 expression cassette and a RSV/R-NP expression cassette in the same or opposite orientation. These plasmids were generated by digesting VR4759 with the MscI restriction endonuclease, digesting VR4771 with NdeI and SfiI restriction endonucleases and treating the RSV/R-NP-BGH segment with the Klenow fragment of DNA polymerase, followed by ligation of these two gel purified fragments.
Plasmid DNA purification
[0443]Plasmid DNA was transformed into Escherichia coli DH5α competent cells, and highly purified covalently closed circular plasmid DNA was isolated by a modified lysis procedure (Horn, N. A., et al., Hum. Gene Ther. 6:565-573 (1995)) followed by standard double CsCl-ethidium bromide gradient ultracentrifugation (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989)). All plasmid preparations were free of detectable chromosomal DNA, RNA and protein impurities based on gel analysis and the bicinchoninic protein assay (Pierce Chem. Co., Rockford Ill.). Endotoxin levels were measured using Limulus Amebocyte Lysate assay (LAL, Associates of Cape Cod, Falmouth, Mass.) and were less than 0.6 Endotoxin Units/mg of plasmid DNA. The spectrophotometric A260/A280 ratios of the DNA solutions were typically above 1.8. Plasmids were ethanol precipitated and resuspended in an appropriate solution, e.g., 150 mM sodium phosphate (for other appropriate excipients and auxiliary agents, see U.S. Patent Application Publication 2002/0019358, published Feb. 14, 2002). DNA was stored at -20° C. until use. DNA was diluted by mixing it with 300 mM salt solutions and by adding an appropriate amount of USP water to obtain 1 mg/ml plasmid DNA in the desired salt at the desired molar concentration.
Plasmid Expression in Mammalian Cell Lines
[0444]The expression plasmids were analyzed in vitro by transfecting the plasmids into a well characterized mouse melanoma cell line (VM-92, also known as UM-449) and the human rhabdomyosarcoma cell line RD (ATCC CCL-136), both available from the American Type Culture Collection, Manassas, Va. Other well-characterized human cell lines may also be used, e.g. MRC-5 cells, ATCC Accession No. CCL-171. The transfection was performed using cationic lipid-based transfection procedures well known to those of skill in the art. Other transfection procedures are well known in the art and may be used, for example electroporation and calcium chloride-mediated transfection (Graham F. L. and A. J. van der Eb Virology 52:456-67 (1973)). Following transfection, cell lysates and culture supernatants of transfected cells were evaluated to compare relative levels of expression of IV antigen proteins. The samples were assayed by western blots and ELISAs, using commercially available monoclonal antibodies (available, e.g., from Research Diagnostics Inc., Flanders, N.J.), so as to compare both the quality and the quantity of expressed antigen.
[0445]Genes encoding the consensus amino acid sequences (described above) derived for NP, M1 and M2 antigens were cloned in several configurations into several plasmid vector backbones. The pDNAs were tested for in vitro expression and are being assessed in vivo for immunogenicity, as well as for the ability to protect mice from influenza challenge.
Experiment 1
[0446]Following the derivation of an amino acid consensus for M1 and M2, a native segment 7 isolate was found to encode this consensus, and this nucleotide sequence was synthesized according to methods described above. An M2-M1 fusion gene was also created and the nucleotide sequence was human codon-optimized using the above described codon optimization algorithm of Example 4. The individual full-length M2 and M1 genes were also cloned via PCR from this fusion.
[0447]In vitro expression of influenza antigens in cell lysates was assessed 48 hours after transfection into a mouse melanoma cell line. M2 expression was detected following transfection of VR4756 (segment 7), VR4755 (M2-M1 fusion) and VR4759 (full-length M2) using the anti-M2 monoclonal antibody (14C2) from Affinity BioReagents. The data are shown in FIG. 10 for VR4756 and VR4755. Expression of M1 was detected from transfected VR4756, VR4755 and VR4760 (full-length M1) pDNAs, as detected by anti-M1 monoclonal (Serotec) in FIG. 10 for VR4756 and VR4755, or by anti-M1 goat polyclonal (Virostat, data not shown). VR10551 is the empty cloning vector.
Experiment 2
[0448]In order to compare alternative human codon-optimization methods, two versions of a fusion of the first 24 amino acids of M2 to full-length NP ("eM2-NP") were constructed. One nucleotide sequence was derived from the above codon optimization algorithm, while the other was done by an outside vendor. Comparison of expression levels from the two eM2-NP pDNAs was measured in vitro, and comparison of immunogenicity in vivo is on-going. Additionally, the full-length NP genes for both codon-optimized versions were sub-cloned from the eM2-NP pDNAs and analyzed for expression in vitro.
[0449]In vitro expression was tested to compare eM2-NP and NP pDNAs derived from the above described codon-optimization algorithm and an outside vendor algorithm. The data are shown in FIG. 11. Expression levels were approximately the same for VR4757 (eM2-NP vendor optimization) vs. VR4758 (eM2-NP Applicant optimization), as detected by anti-M2 monoclonal (FIG. 11A) or anti-NP mouse polyclonal (data not shown). Similarly, NP expression was approximately equal for VR4761 (vendor optimization) vs. VR4762 (Applicant optimization), detected by anti-NP mouse polyclonal generated by Applicants (FIG. 11B). NP consensus protein expression in vitro was also detected using a goat polyclonal antibody (Fitzgerald) generated against whole H1N1 or H3N2 virus (data not shown). Expression levels of both of these NP constructs were much higher than a pDNA containing A/PR/34 NP (VR4700).
Experiment 3
[0450]Influenza antigen-encoding plasmids were transfected into VM92 cells using methods described above. Cell lysates and media were collected 48 hours after transfection. Cells were lysed in 200 μl of Laemmli buffer, cell debris removed by microcentrifuge spin, and 20 μl was heated and loaded on a 4-12% Bis-Tris gel. To determine expression of those vectors encoding secreted NP protein, 15 μl of media was mixed with 5 μl of loading buffer, heated, and loaded on a gel. Western blots were processed as described above. Primary antibodies were as follows: monoclonal antibody MA1-082 (ABR) to detect M2 protein, monoclonal antibody MCA401 (Serotec) to detect M1 protein, and a polyclonal antibody generated in-house in VR4762-injected rabbits. All primary antibodies were used at a 1:500 dilution.
[0451]FIG. 14 shows Western blot results wherein M2 protein expression from segment 7-encoding plasmids is higher in CMV promoter/intron A-segment 7 (VR4756) and RSV/R-segment 7 (VR4770) than in VR4764 (RSV promoter). NP expression appeared highest from the RSV/R-NP plasmid (VR4771), followed by CMV/intron A-NP (VR4762) and then RSV-NP (VR4765). Similar results were seen in Western blots from human RD-transfected cells.
[0452]For dual promoter plasmids containing RSV-segment 7 and CMV/intron A-NP (VR4766 and VR4767), M2 expression from segment 7 is very low, independent of orientation. The CMV/intron A-NP expression in these dual promoter plasmids does not differ significantly compared to VR4762. In the dual promoter plasmids in which NP is expressed from RSV and segment 7 from CMV/intron A (VR4768 and VR4769), NP expression decreases somewhat, but not as drastically as M2 expression in the dual promoter plasmids VR4766 and VR4767.
[0453]FIG. 15 shows expression of the M1 and M2 proteins from segment 7, as well as NP, from CMV promoter/intron A, RSV promoter, and RSV/R-containing plasmids. For these Western blots, dual promoter plasmids contain the CMV promoter/intron A and RSV/R driving either NP or segment 7. Similar results were seen in Western blots from human RD-transfected cells.
[0454]Western blot results confirm that the M1 and M2 protein expression from both CMV promoter/intron A-segment 7 (VR4756) and RSV/R-segment 7 (VR4770) is superior to RSV-segment 7 (VR4764). M1 and M2 expression decrease slightly when RSV/R-segment 7 or CMV/intron A-segment 7 is combined with CMV/intron A-NP or RSV/R-NP in a dual promoter plasmid (VR4773, VR4774, VR4775, and VR4776). Results were similar in Western blots from human RD transfected cells. Human RD cells transfected with M2 antigen encoding plasmids, RSV/R-M2 (VR4772) and CMV/intron A-M2 (VR4759), showed a similar level of M2 expression, which was decreased in dual promoter plasmids (VR4777, VR4778, VR4779, and VR4780). Human RD cells transfected with NP antigen-encoding plasmids, VR4762, VR4771, VR4777, VR4778, VR4779, and VR4780, all showed similar NP expression levels.
Example 14
Murine Influenza A Challenge Model
[0455]A model influenza A challenge model has been established utilizing a mouse-adapted A/HK/8/68 strain. Positive and negative control Hemagluttinin (HA)-containing plasmids were generated by PCR of the HA genes directly from mouse-adapted A/Hong Kong/68 (H3N2) and A/Puerto Rico/34 (H1N1) viruses, respectively.
[0456]For all experiments, plasmid DNA vaccinations are given as bilateral rectus femoris injections at 0 and 3 weeks, followed by orbital sinus puncture (OSP) bleed at 5 weeks and intranasal viral challenge at 6 weeks with 500 pfu (1 LD90) of virus. Mice are monitored for morbidity and weight loss for about 3 weeks following viral challenge. Endpoint antibody titers for NP and M2 were determined by ELISA. For study GSJ08, 5 additional mice per test group were vaccinated and interferon-γ ELISPOT assays were performed at week number 5.
Study CL88:
[0457]A mouse influenza challenge study was initiated to test the M1, M2, Segment 7, and NP-encoding plasmids alone, or in combination. In addition to HA pDNAs, sub-lethal infection and naive mice serve as additional positive and negative controls, respectively. Mice received 100 μg of each plasmid formulated in poloxamer CRL1005, 02A formulation. The test groups and 21 day post-challenge survival are shown in Table 21:
TABLE-US-00085 TABLE 21
Group  Construct(s)                   Total pDNA per vaccination  # mice/group  21 day Survival (%)
A      VR4762 (NP)                    100 μg                      12            17
B      VR4759 (M2)                    100 μg                      12            25
C      VR4760 (M1)                    100 μg                      12            0
D      VR4756 (S7)                    100 μg                      12            50
E      VR4762 (NP) + VR4759 (M2)      200 μg                      12            100
F      VR4762 (NP) + VR4760 (M1)      200 μg                      12            17
G      VR4762 (NP) + VR4756 (S7)      200 μg                      12            75
H      VR4750 (HA, H3N2, +control)    100 μg                      12            100
I      VR4752 (HA, H1N1, -control)    100 μg                      12            8
J      Naive mice (-control)          N/A                         12            8
K      Sub-lethal (+control)          N/A                         12            100
CL88 Results:
[0458]The performance criteria for this study were survival of >90% for the positive controls, ≦10% for the negative controls, and >75% for the experimental groups. Table 21 shows that all of the control groups, as well as two experimental groups, met the performance criteria. The M2+NP and S7+NP plasmid DNA combinations resulted in 100% and 75% survival, respectively. There was no statistically significant difference (p<0.05) between the two lead plasmid combinations, but the S7, S7+NP, and M2+NP groups differed significantly from the negative controls.
[0459]Weight loss data showed that the positive control groups did not exhibit any weight loss following viral challenge, as opposed to the weight loss seen in all of the experimental groups. Mice that survived the viral challenge recovered to their starting weight by the end of the study. Tables 22 and 23 show endpoint antibody titers for test groups containing M2, Segment 7, and NP antigens. Shaded boxes represent mice that died following viral challenge.
TABLE-US-00086 TABLE 22 CL88 M2 Antibody Titers ##STR00001## ##STR00002## ** An M2 antibody titer of 0 represents a titer of <100.
TABLE-US-00087 TABLE 23 CL88 NP Antibody Titers ##STR00003## ##STR00004##
Study GSJ05:
[0460]To distinguish between the two antigen combinations, S7+NP and M2+NP, a dose-ranging challenge experiment was undertaken with these two plasmid combinations. Mice were injected with 100 μg, 30 μg, or 10 μg per plasmid in the 02A poloxamer formulation at 0 and 3 weeks, followed by bleed at 5 weeks and viral challenge at 6 weeks. Sixteen mice per group were vaccinated for test groups A-H, while 12 mice per group were vaccinated for the controls. Poloxamer 02A-formulated HA plasmids, VR4750 (HA H3) and VR4752 (HA H1), were included as positive and negative controls, respectively. The test groups and 21 day survival post-challenge are shown in Table 24:
TABLE-US-00088 TABLE 24
Group  Construct(s)                    Total pDNA per vaccination  # mice/group  21 day Survival (%)
A      VR4756 (Seg 7) + VR4762 (NP)    200 μg                      16            73
B      VR4756 (Seg 7) + VR4762 (NP)    60 μg                       16            81
C      VR4756 (Seg 7) + VR4762 (NP)    20 μg                       16            69
D      VR4759 (M2) + VR4762 (NP)       200 μg                      16            94
E      VR4759 (M2) + VR4762 (NP)       60 μg                       16            81
F      VR4759 (M2) + VR4762 (NP)       20 μg                       16            75
G      VR4750 (Positive DNA control)   100 μg                      12            100
H      VR4752 (Negative DNA control)   100 μg                      12            8
Results
[0461]The performance criteria of >90% survival with the HA positive control and ≦10% for the HA negative control plasmid again were met. The performance criterion for the experimental groups, >75% survival at the 30 μg per plasmid dose, was met by both M2+NP and S7+NP (Table 24). In fact, at a dose of 10 μg per plasmid, S7+NP and M2+NP resulted in 69% and 75% survival, respectively. There was no statistically significant difference (p<0.05) between the three doses of M2+NP or between the three doses of S7+NP, nor was there a statistically significant difference when comparing M2+NP to S7+NP at the 200 μg, 60 μg, or 20 μg doses. However, there was a statistical difference for the HA positive control vs. S7+NP at 200 μg and 20 μg. Body mass data show weight loss and recovery by all surviving experimental plasmid DNA-vaccinated groups, while the HA positive control mice did not experience weight loss. Antibody data for M2 and NP are shown in Tables 25 and 26.
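The performance criteria described above amount to simple threshold checks on the 21 day survival percentages. The following Python sketch is illustrative only: the survival values are copied from Table 24, while the function name `meets_criteria`, the role labels, and the dictionary layout are hypothetical names introduced here and are not part of the study protocol.

```python
# Survival (%) at 21 days, copied from Table 24.
# The experimental criterion is evaluated at 30 ug per plasmid (60 ug total).
SURVIVAL = {
    "S7+NP, 60 ug total": ("experimental", 81),
    "M2+NP, 60 ug total": ("experimental", 81),
    "VR4750 (HA H3)": ("positive", 100),
    "VR4752 (HA H1)": ("negative", 8),
}


def meets_criteria(role, survival_pct):
    """Return True if a group satisfies the stated performance criterion."""
    if role == "positive":
        return survival_pct > 90      # >90% survival
    if role == "negative":
        return survival_pct <= 10     # at most 10% survival
    if role == "experimental":
        return survival_pct > 75      # >75% survival
    raise ValueError(f"unknown role: {role}")


results = {name: meets_criteria(role, pct)
           for name, (role, pct) in SURVIVAL.items()}

for name, ok in results.items():
    print(f"{name}: {'met' if ok else 'not met'}")
```

Because the experimental threshold is written as a strict inequality, a group at exactly 75% survival would not pass this check; whether the original analysis treated the boundary as inclusive is not stated in the text.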
TABLE-US-00089 TABLE 25 GSJ05 M2 Antibody Titers ##STR00005## ##STR00006##
TABLE-US-00090 TABLE 26 GSJ05 NP Antibody Titers ##STR00007## ##STR00008## Gray shading represents mice that died post-challenge. Group A, mouse 9 (spotted box) died during the OSP bleed procedure.
Study GSJ06:
[0462]The plasmid combination VR4759 (M2) and VR4762 (NP) was utilized in further mouse influenza challenge studies to examine additional formulations.
[0463]Using the experimental protocol described above, 12 mice per group were vaccinated with equal weight VR4759 (M2) and VR4762 (NP) in the following formulations:
[0464]Poloxamer 02A used in the previous two challenge experiments.
[0465]DMRIE+Cholesterol (DM:Chol) at a 4:1 molar ratio of DNA to DMRIE; the molar ratio of DM:Chol is 3:1.
[0466]Vaxfectin® (VC1052+DPyPE) at a 4:1 molar ratio of DNA:VC1052; the molar ratio of VC1052:DPyPE is 1:1.
The GSJ06 study design and 21 day survival post-challenge are shown in Table 27.
TABLE-US-00091 TABLE 27
Group  pDNA                    Total pDNA  21 day Survival (%)
A      Poloxamer 02A           20 μg       92
B      Poloxamer 02A           2 μg        58
C      DMRIE:Cholesterol       20 μg       58
D      DMRIE:Cholesterol       2 μg        17
E      Vaxfectin               20 μg       100
F      Vaxfectin               2 μg        75
G      VR4750 (HA, positive)   100 μg      100
H      VR4752 (HA, negative)   100 μg      0
Results
[0467]Poloxamer 02A and Vaxfectin®-formulated plasmid DNA led to 92% and 100% survival at the 20 μg pDNA dose, and 58% and 75% at the 2 μg dose, respectively (Table 27).
[0468]Average weights were tracked for each group of mice starting at the day of challenge. As shown in Table 28, it was noted in this experiment that the weight recovery for group E (Vaxfectin®-formulated pDNA, 20 μg total) began after day 4, as opposed to the other groups' recovery beginning at day 7. Antibody titers, Tables 29 and 30, were determined for M2 and NP and shaded boxes represent mice that died following viral challenge.
TABLE-US-00092 TABLE 28 GSJ06 Average Body Weights Post-Challenge
Avg Body Weights (g), by days post-challenge
Group  pDNA                    Total pDNA  0             2             4             7             9             11            14            16            18            21
A      Poloxamer 02A           20 μg       20.73         19.98         17.98         ##STR00009##  17.36         18.74         19.94         20.45         20.60         21.08
B      Poloxamer 02A           2 μg        21.08         19.91         17.96         15.17         ##STR00010##  16.03         16.77         17.41         18.10         19.52
C      DMRIE-Cholesterol       20 μg       21.43         20.24         18.14         ##STR00011##  18.68         19.24         20.14         20.50         20.90         21.42
D      DMRIE-Cholesterol       2 μg        21.28         20.24         17.58         ##STR00012##  16.18         17.45         18.80         19.84         20.13         20.98
E      Vaxfectin               20 μg       21.41         19.97         ##STR00013##  18.10         19.12         19.82         20.39         20.87         20.93         21.34
F      Vaxfectin               2 μg        20.47         18.97         16.86         ##STR00014##  16.22         16.84         17.87         18.60         19.08         20.02
G      VR4750 (HA, positive)   100 μg      21.30         20.97         21.60         21.21         21.57         21.79         21.84         22.13         21.94         22.13
H      VR4752 (HA, negative)   100 μg      20.89         20.25         17.57         14.67         -             -             -             -             -             -
Shading represents the lowest group average post-challenge for each test group. Group H (negative control) weight averages are not recorded once the percentage survival has dropped below 50%.
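One way to summarize the weight recovery in Table 28 is to express each group's day 21 average as a percentage of its day 0 average. The Python sketch below does only that arithmetic. The day 0 and day 21 values are copied from Table 28; the names `WEIGHTS`, `percent_of_start`, and `recovery` are hypothetical and introduced here for illustration. Note that later time points may reflect only the mice still alive in each group, which is an assumption about how the averages were computed and is not stated in the table.

```python
# Group average body weights (g) from Table 28 as (day 0, day 21).
# Group H is omitted because its averages were not recorded through day 21.
WEIGHTS = {
    "A": (20.73, 21.08),  # Poloxamer 02A, 20 ug
    "B": (21.08, 19.52),  # Poloxamer 02A, 2 ug
    "C": (21.43, 21.42),  # DMRIE-Cholesterol, 20 ug
    "D": (21.28, 20.98),  # DMRIE-Cholesterol, 2 ug
    "E": (21.41, 21.34),  # Vaxfectin, 20 ug
    "F": (20.47, 20.02),  # Vaxfectin, 2 ug
    "G": (21.30, 22.13),  # VR4750 (HA, positive)
}


def percent_of_start(day0, day21):
    """Day 21 average weight as a percentage of the day 0 average."""
    return 100.0 * day21 / day0


recovery = {group: percent_of_start(*pair) for group, pair in WEIGHTS.items()}

for group, pct in recovery.items():
    print(f"Group {group}: {pct:.1f}% of starting weight at day 21")
```

This summary ignores the timing of the weight nadir, which is the feature the text highlights for group E; the nadir values themselves are shaded cells that are not reproduced in the table.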
TABLE-US-00093 TABLE 29 GSJ06 M2 Antibody Titers ##STR00015## ##STR00016##
TABLE-US-00094 TABLE 30 GSJ06 NP Antibody Titers ##STR00017## ##STR00018##
Study GSJ08:
[0469]Further formulation comparisons were done utilizing VR4759 (M2) and VR4762 (NP). Seventeen mice per test group (A-G) were vaccinated with equal weight VR4759 (M2) and VR4762 (NP) vectors in the following formulations:
[0470]Poloxamer 02A.
[0471]Vaxfectin® (preparations A and B represent different purifications).
[0472]DMRIE:DOPE at a 4:1 molar ratio of DNA to DMRIE.
[0473]DMRIE:DOPE at a 2.5:1 molar ratio of DNA to DMRIE.
[0474]PBS (unformulated pDNA).
Twelve mice per test group were challenged with influenza virus at week number 6. Five mice per test group were sacrificed at days 36-38 for T cell assays (IFN-γ ELISPOT). The test groups and 21 day survival post-challenge are shown in Table 31. Groups A-D and F-G were vaccinated with 20 μg total plasmid DNA per injection to further explore the weight loss/recovery phenomenon seen in study GSJ06 with the Vaxfectin®-formulated pDNA.
TABLE-US-00095 TABLE 31
Group  Construct(s)                   Total pDNA per vaccination  21 Day Survival (%)
A      Poloxamer 02A                  20 μg                       50
B      DMRIE:DOPE 4:1                 20 μg                       92
C      DMRIE:DOPE 2.5:1               20 μg                       92
D      Vaxfectin - prep A             20 μg                       92
E      Vaxfectin - prep A             2 μg                        75
F      Vaxfectin - prep B             20 μg                       100
G      PBS                            20 μg                       42
H      VR4750 (HA, H3N2, +control)    100 μg                      100
I      VR4752 (HA, H1N1, -control)    100 μg                      17
Results
[0475]The DMRIE:DOPE and Vaxfectin® formulated groups resulted in 92-100% survival at a 20 μg pDNA dose. Group A (Poloxamer 02A) and Group G (PBS) survival results were not statistically different from the negative control (as measured by Fisher exact p, one-tailed), while the Vaxfectin® and DMRIE:DOPE groups (Groups B-F) were shown to be statistically superior (p<0.05) as compared to the negative control. Therefore, the plasmid DNA formulated with lipids appears to provide superior protection in the mouse influenza challenge model.
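The one-tailed Fisher exact comparison referred to above can be sketched in a few lines of Python using only the standard library. This is an illustrative calculation, not the analysis actually run for the study. The survivor counts are assumptions back-calculated from the percentages in Table 31 with 12 challenged mice per group (92% taken as 11 of 12, 50% as 6 of 12, 42% as 5 of 12, 17% as 2 of 12), and the function name `fisher_exact_one_tailed` is hypothetical.

```python
from math import comb


def fisher_exact_one_tailed(a, b, c, d):
    """One-tailed Fisher exact p-value for the 2x2 table
           survived  died
    test      a        b
    control   c        d
    i.e. the probability of seeing at least `a` survivors in the test
    group when the row and column totals are held fixed."""
    row1 = a + b              # mice in the test group
    col1 = a + c              # total survivors across both groups
    n = a + b + c + d         # all mice
    upper = min(row1, col1)   # largest possible survivor count in the test group
    numerator = sum(comb(col1, k) * comb(n - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(n, row1)


# Assumed counts (survived, died) derived from Table 31 percentages, n = 12.
NEGATIVE_CONTROL = (2, 10)    # Group I, 17%
GROUPS = {
    "A (Poloxamer 02A)": (6, 6),      # 50%
    "B (DMRIE:DOPE 4:1)": (11, 1),    # 92%
    "G (PBS)": (5, 7),                # 42%
}

p_values = {name: fisher_exact_one_tailed(s, d, *NEGATIVE_CONTROL)
            for name, (s, d) in GROUPS.items()}

for name, p in p_values.items():
    print(f"Group {name}: p = {p:.4f}")
```

Under these assumed counts, Group B falls below the 0.05 threshold while Groups A and G do not, which is in line with the conclusions stated in the text.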
[0476]A repeated measures ANOVA mixed model analysis of the weight loss and recovery data for Groups B, C, and D showed that Group B and Group D were not statistically different, while Group C and Group D were statistically different.
[0477]T cell responses were measured by IFN-γ ELISPOT assay on the last 5 mice per group using an M2 peptide encompassing the first 24 amino acids of M2 (Table 33), an NP protein expressed in baculovirus (Table 34), and an NP CD8+ Balb/c immunodominant peptide (Table 35).
[0478]Antibody titers, Tables 36 and 37, were determined for M2 and NP proteins. The first 12 mice listed for each group were challenged at day 42 and the last 5 mice per group were sacrificed for IFN-γ ELISPOT. The shaded boxes represent mice that died following viral challenge.
TABLE-US-00096 TABLE 32 GSJ08 Average Body Weights Post-Challenge
Avg Body Weights (g), by days post-challenge
Group  Construct(s)                  Total pDNA per vaccination  0             2             4             5             6             7             9             11            14            16            18            22
A      Poloxamer 02A                 20 μg                       20.47         18.97         16.30         15.43         14.75         ##STR00019##  14.35         14.44         16.63         17.64         18.36         20.53
B      DMRIE-DOPE 4:1                20 μg                       21.58         19.94         17.43         16.75         16.17         ##STR00020##  16.43         17.28         18.45         19.50         20.22         20.89
C      DMRIE-DOPE 2.5:1              20 μg                       19.95         18.58         16.44         15.77         ##STR00021##  15.56         15.75         16.22         16.78         17.16         17.31         18.04
D      Vaxfectin - prep A            20 μg                       20.87         19.22         16.81         16.47         ##STR00022##  16.92         17.94         19.48         20.06         20.19         20.64         21.17
E      Vaxfectin - prep A            2 μg                        20.40         19.59         17.97         17.47         17.27         ##STR00023##  18.96         19.83         20.24         20.49         20.57         21.06
F      Vaxfectin - prep B            20 μg                       21.33         20.01         17.88         ##STR00024##  17.74         18.21         18.85         18.85         20.29         20.77         20.88         21.39
G      PBS                           20 μg                       20.84         19.46         16.97         16.00         15.38         ##STR00025##  15.80         16.39         17.35         -             -             -
H      VR4750 (HA, H3N2, + control)  100 μg                      21.25         21.15         21.27         20.77         20.92         21.24         20.74         21.16         21.33         21.40         21.64         21.64
I      VR4752 (HA, H1N1, - control)  100 μg                      21.67         20.65         17.87         16.77         16.05         15.17         15.09         -             -             -             -             -
Shading represents the lowest group average post-challenge for each test group. Group G and I weight averages are not recorded once the percentage survival has dropped below 50%.
TABLE-US-00097 TABLE 33 M2 peptide Interferon-γ ELISPOT
M2 peptide IFN gamma ELISPOT (SFU/10E6 cells)
Mouse  A    B    C    D    E    F    G
1      66   88   145  189  283  253  31
2      11   115  150  269  62   282  47
3      115  247  190  233  99   283  112
4      20   6    51   67   73   93   45
5      93   277  397  248  202  399  93
AVG    61   147  187  201  144  262  66
TABLE-US-00098 TABLE 34 NP CD4 peptide Interferon-γ ELISPOT
NP CD4 peptide IFN gamma ELISPOT (SFU/10E6 cells)
Mouse  A    B    C    D    E    F    G
1      7    32   3    52   72   108  18
2      8    83   34   125  8    34   8
3      22   91   106  293  26   51   73
4      9    15   80   39   53   10   12
5      37   150  374  117  40   217  43
AVG    17   74   119  125  40   84   31
TABLE-US-00099 TABLE 35 NP CD8 peptide Interferon-γ ELISPOT
NP CD8 peptide IFN gamma ELISPOT (SFU/10E6 cells)
Mouse  A    B    C    D    E    F    G
1      11   37   4    14   20   67   8
2      0    3    4    6    1    0    2
3      31   19   15   26   23   51   34
4      1    0    0    12   1    38   3
5      46   36   39   21   13   15   18
AVG    18   19   12   16   12   34   13
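The AVG rows in Tables 33 to 35 are arithmetic means over the five mice in each group, rounded to the nearest whole spot-forming unit. The short Python sketch below recomputes them from the per-mouse values as a consistency check. The per-mouse numbers are copied from the tables; the names `GROUPS`, `TABLES`, and `group_means` are hypothetical and used only for illustration.

```python
GROUPS = ["A", "B", "C", "D", "E", "F", "G"]

# Per-mouse IFN-gamma ELISPOT counts (SFU/10E6 cells); one row per mouse,
# one column per group, copied from Tables 33, 34, and 35.
TABLES = {
    "Table 33 (M2 peptide)": [
        [66, 88, 145, 189, 283, 253, 31],
        [11, 115, 150, 269, 62, 282, 47],
        [115, 247, 190, 233, 99, 283, 112],
        [20, 6, 51, 67, 73, 93, 45],
        [93, 277, 397, 248, 202, 399, 93],
    ],
    "Table 34 (NP CD4)": [
        [7, 32, 3, 52, 72, 108, 18],
        [8, 83, 34, 125, 8, 34, 8],
        [22, 91, 106, 293, 26, 51, 73],
        [9, 15, 80, 39, 53, 10, 12],
        [37, 150, 374, 117, 40, 217, 43],
    ],
    "Table 35 (NP CD8)": [
        [11, 37, 4, 14, 20, 67, 8],
        [0, 3, 4, 6, 1, 0, 2],
        [31, 19, 15, 26, 23, 51, 34],
        [1, 0, 0, 12, 1, 38, 3],
        [46, 36, 39, 21, 13, 15, 18],
    ],
}


def group_means(rows):
    """Mean of each column (group), rounded to the nearest integer."""
    n = len(rows)
    return {g: round(sum(row[i] for row in rows) / n)
            for i, g in enumerate(GROUPS)}


means = {title: group_means(rows) for title, rows in TABLES.items()}

for title, avg in means.items():
    print(title, avg)
```

With only five mice per group and wide mouse-to-mouse variation, these means are best read as a rough ranking of formulations and not as precise estimates.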
TABLE-US-00100 TABLE 36 GSJ08 M2 Antibody Titers ##STR00026## ##STR00027##
TABLE-US-00101 TABLE 37 GSJ08 NP Antibody Titers ##STR00028## ##STR00029##
[0479]The present invention is not to be limited in scope by the specific embodiments described which are intended as single illustrations of individual aspects of the invention, and any compositions or methods which are functionally equivalent are within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Such modifications are intended to fall within the scope of the appended claims.
[0480]All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.
Sequence CWU
1
11211565DNAInfluenza A virus 1agcaaaagca gggtagataa tcactcactg agtgacatca
aaatcatggc gtctcaaggc 60accaaacgat cttacgaaca gatggagact gatggagaac
gccagaatgc cactgaaatc 120agagcatccg tcggaaaaat gattggtgga attggacgat
tctacatcca aatgtgcacc 180gaactcaaac tcagtgatta tgagggacgg ttgatccaaa
acagcttaac aatagagaga 240atggtgctct ctgcttttga cgaaaggaga aataaatacc
ttgaagaaca tcccagtgcg 300gggaaagatc ctaagaaaac tggaggacct atatacagga
gagtaaacgg aaagtggatg 360agagaactca tcctttatga caaagaagaa ataaggcgaa
tctggcgcca agctaataat 420ggtgacgatg caacggctgg tctgactcac atgatgatct
ggcattccaa tttgaatgat 480gcaacttatc agaggacaag agctcttgtt cgcaccggaa
tggatcccag gatgtgctct 540ctgatgcaag gttcaactct ccctaggagg tctggagccg
caggtgctgc agtcaaagga 600gttggaacaa tggtgatgga attggtcaga atgatcaaac
gtgggatcaa tgatcggaac 660ttctggaggg gtgagaatgg acgaaaaaca agaattgctt
atgaaagaat gtgcaacatt 720ctcaaaggga aatttcaaac tgctgcacaa aaagcaatga
tggatcaagt gagagagagc 780cggaacccag ggaatgctga gttcgaagat ctcacttttc
tagcacggtc tgcactcata 840ttgagagggt cggttgctca caagtcctgc ctgcctgcct
gtgtgtatgg acctgccgta 900gccagtgggt acgactttga aagggaggga tactctctag
tcggaataga ccctttcaga 960ctgcttcaaa acagccaagt gtacagccta atcagaccaa
atgagaatcc agcacacaag 1020agtcaactgg tgtggatggc atgccattct gccgcatttg
aagatctaag agtattaagc 1080ttcatcaaag ggacgaaggt gctcccaaga gggaagcttt
ccactagagg agttcaaatt 1140gcttccaatg aaaatatgga gactatggaa tcaagtacac
ttgaactgag aagcaggtac 1200tgggccataa ggaccagaag tggaggaaac accaatcaac
agagggcatc tgcgggccaa 1260atcagcatac aacctacgtt ctcagtacag agaaatctcc
cttttgacag aacaaccgtt 1320atggcagcat tcagtgggaa tacagagggg agaacatctg
acatgaggac cgaaatcata 1380aggatgatgg aaagtgcaag accagaagat gtgtctttcc
aggggcgggg agtcttcgag 1440ctctcggacg aaaaggcagc gagcccgatc gtgccttcct
ttgacatgag taatgaagga 1500tcttatttct tcggagacaa tgcagaggaa tacgataatt
aaagaaaaat acccttgttt 1560ctact
15652498PRTInfluenza A virus 2Met Ala Ser Gln Gly
Thr Lys Arg Ser Thr Glu Gln Met Glu Thr Asp1 5
10 15Gly Glu Arg Gln Asn Ala Thr Glu Ile Arg Ala
Ser Val Gly Lys Met 20 25
30Ile Gly Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu Lys
35 40 45Leu Ser Asp Tyr Glu Gly Arg Leu
Ile Gln Asn Ser Leu Thr Ile Glu 50 55
60Arg Met Val Leu Ser Ala Phe Asp Glu Arg Arg Asn Lys Tyr Leu Glu65
70 75 80Glu His Pro Ser Ala
Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile 85
90 95Tyr Arg Arg Val Asn Gly Lys Trp Met Arg Glu
Leu Ile Leu Tyr Asp 100 105
110Lys Glu Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly Asp Asp
115 120 125Ala Thr Ala Gly Leu Thr His
Met Met Ile Trp His Ser Asn Leu Asn 130 135
140Asp Ala Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg Thr Gly Met
Asp145 150 155 160Pro Arg
Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser
165 170 175Gly Ala Ala Gly Ala Ala Val
Lys Gly Val Gly Thr Met Val Met Glu 180 185
190Leu Val Arg Met Ile Lys Arg Gly Ile Asn Asp Arg Asn Phe
Trp Arg 195 200 205Gly Glu Asn Gly
Arg Lys Thr Arg Ile Ala Tyr Glu Arg Met Cys Asn 210
215 220Ile Leu Lys Gly Lys Phe Gln Thr Ala Ala Gln Lys
Ala Met Met Asp225 230 235
240Gln Val Arg Glu Ser Arg Asn Pro Gly Asn Ala Glu Phe Glu Asp Leu
245 250 255Thr Phe Leu Ala Arg
Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His 260
265 270Lys Ser Cys Leu Pro Ala Cys Val Tyr Gly Pro Ala
Val Ala Ser Gly 275 280 285Tyr Asp
Phe Glu Arg Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe 290
295 300Arg Leu Leu Gln Asn Ser Gln Val Tyr Ser Leu
Ile Arg Pro Asn Glu305 310 315
320Asn Pro Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser Ala
325 330 335Ala Phe Glu Asp
Leu Arg Val Leu Ser Phe Ile Lys Gly Thr Lys Val 340
345 350Leu Pro Arg Gly Lys Leu Ser Thr Arg Gly Val
Gln Ile Ala Ser Asn 355 360 365Glu
Asn Met Glu Thr Met Glu Ser Ser Thr Leu Glu Leu Arg Ser Arg 370
375 380Tyr Trp Ala Ile Arg Thr Arg Ser Gly Gly
Asn Thr Asn Gln Gln Arg385 390 395
400Ala Ser Ala Gly Gln Ile Ser Ile Gln Pro Thr Phe Ser Val Gln
Arg 405 410 415Asn Leu Pro
Phe Asp Arg Thr Thr Val Met Ala Ala Phe Ser Gly Asn 420
425 430Thr Glu Gly Arg Thr Ser Asp Met Arg Thr
Glu Ile Ile Arg Met Met 435 440
445Glu Ser Ala Arg Pro Glu Asp Val Ser Phe Gln Gly Arg Gly Val Phe 450
455 460Glu Leu Ser Asp Glu Lys Ala Ala
Ser Pro Ile Val Pro Ser Phe Asp465 470
475 480Met Ser Asn Glu Gly Ser Tyr Phe Phe Gly Asp Asn
Ala Glu Glu Tyr 485 490
495Asp Asn31027DNAInfluenza A virus 3agcgaaagca ggtagatatt gaaagatgag
tcttctaacc gaggtcgaaa cgtacgtact 60ctctatcatc ccgtcaggcc ccctcaaagc
cgagatcgca cagagacttg aagatgtctt 120tgcagggaag aacactgatc ttgaggttct
catggaatgg ctaaagacaa gaccaatcct 180gtcacctctg actaagggga ttttaggatt
tgtgttcacg ctcaccgtgc ccagtgagcg 240aggactgcag cgtagacgct ttgtccaaaa
tgcccttaat gggaacgggg atccaaataa 300catggacaaa gcagttaaac tgtataggaa
gctcaagagg gagataacat tccatggggc 360caaagaaatc tcactcagtt attctgctgg
tgcacttgcc agttgtatgg gcctcatata 420caacaggatg ggggctgtga ccactgaagt
ggcatttggc ctggtatgtg caacctgtga 480acagattgct gactcccagc atcggtctca
taggcaaatg gtgacaacaa ccaatccact 540aatcagacat gagaacagaa tggttttagc
cagcactaca gctaaggcta tggagcaaat 600ggctggatcg agtgagcaag cagcagaggc
catggaggtt gctagtcagg ctagacaaat 660ggtgcaagcg atgagaacca ttgggactca
tcctagctcc agtgctggtc tgaaaaatga 720tcttcttgaa aatttgcagg cctatcagaa
acgaatgggg gtgcagatgc aacggttcaa 780gtgatcctct cgctattgcc gcaaatatca
ttgggatctt gcacttgaca ttgtggattc 840ttgatcgtct ttttttcaaa tgcatttacc
gtcgctttaa atacggactg aaaggagggc 900cttctacgga aggagtgcca aagtctatga
gggaagaata tcgaaaggaa cagcagagtg 960ctgtggatgc tgacgatggt cattttgtca
gcatagagct ggagtaaaaa actaccttgt 1020ttctact
10274252PRTInfluenza A virus 4Met Ser
Leu Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro1 5
10 15Ser Gly Pro Leu Lys Ala Glu Ile
Ala Gln Arg Leu Glu Asp Val Phe 20 25
30Ala Gly Lys Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys
Thr 35 40 45Arg Pro Ile Leu Ser
Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe 50 55
60Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg
Phe Val65 70 75 80Gln
Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala
85 90 95Val Lys Leu Tyr Arg Lys Leu
Lys Arg Glu Ile Thr Phe His Gly Ala 100 105
110Lys Glu Ile Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser
Cys Met 115 120 125Gly Leu Ile Tyr
Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe 130
135 140Gly Leu Val Cys Ala Thr Cys Glu Gln Ile Ala Asp
Ser Gln His Arg145 150 155
160Ser His Arg Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu
165 170 175Asn Arg Met Val Leu
Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met 180
185 190Ala Gly Ser Ser Glu Gln Ala Ala Glu Ala Met Glu
Val Ala Ser Gln 195 200 205Ala Arg
Gln Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser 210
215 220Ser Ser Ala Gly Leu Lys Asn Asp Leu Leu Glu
Asn Leu Gln Ala Tyr225 230 235
240Gln Lys Arg Met Gly Val Gln Met Gln Arg Phe Lys
245 250597PRTInfluenza A virus 5Met Ser Leu Leu Thr Glu
Val Glu Thr Pro Ile Arg Asn Glu Trp Gly1 5
10 15Cys Arg Cys Asn Gly Ser Ser Asp Pro Leu Ala Ile
Ala Ala Asn Ile 20 25 30Ile
Gly Ile Leu His Leu Thr Leu Trp Ile Leu Asp Arg Leu Phe Phe 35
40 45Lys Cys Ile Tyr Arg Arg Phe Lys Tyr
Gly Leu Lys Gly Gly Pro Ser 50 55
60Thr Glu Gly Val Pro Lys Ser Met Arg Glu Glu Tyr Arg Lys Glu Gln65
70 75 80Gln Ser Ala Val Asp
Ala Asp Asp Gly His Phe Val Ser Ile Glu Leu 85
90 95Glu61566DNAArtificial sequenceeM2NP fusion
6atgagtcttc taaccgaggt cgaaacgcct atcagaaacg aatgggggtg cagatgcaac
60ggttcaagtg atatggcgtc tcaaggcacc aaacgatctt acgaacagat ggagactgat
120ggagaacgcc agaatgccac tgaaatcaga gcatccgtcg gaaaaatgat tggtggaatt
180ggacgattct acatccaaat gtgcaccgaa ctcaaactca gtgattatga gggacggttg
240atccaaaaca gcttaacaat agagagaatg gtgctctctg cttttgacga aaggagaaat
300aaataccttg aagaacatcc cagtgcgggg aaagatccta agaaaactgg aggacctata
360tacaggagag taaacggaaa gtggatgaga gaactcatcc tttatgacaa agaagaaata
420aggcgaatct ggcgccaagc taataatggt gacgatgcaa cggctggtct gactcacatg
480atgatctggc attccaattt gaatgatgca acttatcaga ggacaagagc tcttgttcgc
540accggaatgg atcccaggat gtgctctctg atgcaaggtt caactctccc taggaggtct
600ggagccgcag gtgctgcagt caaaggagtt ggaacaatgg tgatggaatt ggtcagaatg
660atcaaacgtg ggatcaatga tcggaacttc tggaggggtg agaatggacg aaaaacaaga
720attgcttatg aaagaatgtg caacattctc aaagggaaat ttcaaactgc tgcacaaaaa
780gcaatgatgg atcaagtgag agagagccgg aacccaggga atgctgagtt cgaagatctc
840acttttctag cacggtctgc actcatattg agagggtcgg ttgctcacaa gtcctgcctg
900cctgcctgtg tgtatggacc tgccgtagcc agtgggtacg actttgaaag ggagggatac
960tctctagtcg gaatagaccc tttcagactg cttcaaaaca gccaagtgta cagcctaatc
1020agaccaaatg agaatccagc acacaagagt caactggtgt ggatggcatg ccattctgcc
1080gcatttgaag atctaagagt attaagcttc atcaaaggga cgaaggtgct cccaagaggg
1140aagctttcca ctagaggagt tcaaattgct tccaatgaaa atatggagac tatggaatca
1200agtacacttg aactgagaag caggtactgg gccataagga ccagaagtgg aggaaacacc
1260aatcaacaga gggcatctgc gggccaaatc agcatacaac ctacgttctc agtacagaga
1320aatctccctt ttgacagaac aaccgttatg gcagcattca gtgggaatac agaggggaga
1380acatctgaca tgaggaccga aatcataagg atgatggaaa gtgcaagacc agaagatgtg
1440tctttccagg ggcggggagt cttcgagctc tcggacgaaa aggcagcgag cccgatcgtg
1500ccttcctttg acatgagtaa tgaaggatct tatttcttcg gagacaatgc agaggaatac
1560gataat
15667522PRTArtificial sequenceeM2NP fusion 7Met Ser Leu Leu Thr Glu Val
Glu Thr Pro Ile Arg Asn Glu Trp Gly1 5 10
15Cys Arg Cys Asn Gly Ser Ser Asp Met Ala Ser Gln Gly
Thr Lys Arg 20 25 30Ser Tyr
Glu Gln Met Glu Thr Asp Gly Glu Arg Gln Asn Ala Thr Glu 35
40 45Ile Arg Ala Ser Val Gly Lys Met Ile Gly
Gly Ile Gly Arg Phe Tyr 50 55 60Ile
Gln Met Cys Thr Glu Leu Lys Leu Ser Asp Tyr Glu Gly Arg Leu65
70 75 80Ile Gln Asn Ser Leu Thr
Ile Glu Arg Met Val Leu Ser Ala Phe Asp 85
90 95Glu Arg Arg Asn Lys Tyr Leu Glu Glu His Pro Ser
Ala Gly Lys Asp 100 105 110Pro
Lys Lys Thr Gly Gly Pro Ile Tyr Arg Arg Val Asn Gly Lys Trp 115
120 125Met Arg Glu Leu Ile Leu Tyr Asp Lys
Glu Glu Ile Arg Arg Ile Trp 130 135
140Arg Gln Ala Asn Asn Gly Asp Asp Ala Thr Ala Gly Leu Thr His Met145
150 155 160Met Ile Trp His
Ser Asn Leu Asn Asp Ala Thr Tyr Gln Arg Thr Arg 165
170 175Ala Leu Val Arg Thr Gly Met Asp Pro Arg
Met Cys Ser Leu Met Gln 180 185
190Gly Ser Thr Leu Pro Arg Arg Ser Gly Ala Ala Gly Ala Ala Val Lys
195 200 205Gly Val Gly Thr Met Val Met
Glu Leu Val Arg Met Ile Lys Arg Gly 210 215
220Ile Asn Asp Arg Asn Phe Trp Arg Gly Glu Asn Gly Arg Lys Thr
Arg225 230 235 240Ile Ala
Tyr Glu Arg Met Cys Asn Ile Leu Lys Gly Lys Phe Gln Thr
245 250 255Ala Ala Gln Lys Ala Met Met
Asp Gln Val Arg Glu Ser Arg Asn Pro 260 265
270Gly Asn Ala Glu Phe Glu Asp Leu Thr Phe Leu Ala Arg Ser
Ala Leu 275 280 285Ile Leu Arg Gly
Ser Val Ala His Lys Ser Cys Leu Pro Ala Cys Val 290
295 300Tyr Gly Pro Ala Val Ala Ser Gly Tyr Asp Phe Glu
Arg Glu Gly Tyr305 310 315
320Ser Leu Val Gly Ile Asp Pro Phe Arg Leu Leu Gln Asn Ser Gln Val
325 330 335Tyr Ser Leu Ile Arg
Pro Asn Glu Asn Pro Ala His Lys Ser Gln Leu 340
345 350Val Trp Met Ala Cys His Ser Ala Ala Phe Glu Asp
Leu Arg Val Leu 355 360 365Ser Phe
Ile Lys Gly Thr Lys Val Leu Pro Arg Gly Lys Leu Ser Thr 370
375 380Arg Gly Val Gln Ile Ala Ser Asn Glu Asn Met
Glu Thr Met Glu Ser385 390 395
400Ser Thr Leu Glu Leu Arg Ser Arg Tyr Trp Ala Ile Arg Thr Arg Ser
405 410 415Gly Gly Asn Thr
Asn Gln Gln Arg Ala Ser Ala Gly Gln Ile Ser Ile 420
425 430Gln Pro Thr Phe Ser Val Gln Arg Asn Leu Pro
Phe Asp Arg Thr Thr 435 440 445Val
Met Ala Ala Phe Ser Gly Asn Thr Glu Gly Arg Thr Ser Asp Met 450
455 460Arg Thr Glu Ile Ile Arg Met Met Glu Ser
Ala Arg Pro Glu Asp Val465 470 475
480Ser Phe Gln Gly Arg Gly Val Phe Glu Leu Ser Asp Glu Lys Ala
Ala 485 490 495Ser Pro Ile
Val Pro Ser Phe Asp Met Ser Asn Glu Gly Ser Tyr Phe 500
505 510Phe Gly Asp Asn Ala Glu Glu Tyr Asp Asn
515 52081566DNAArtificial sequenceNPeM2 Fusion
Construct 8atggcgtctc aaggcaccaa acgatcttac gaacagatgg agactgatgg
agaacgccag 60aatgccactg aaatcagagc atccgtcgga aaaatgattg gtggaattgg
acgattctac 120atccaaatgt gcaccgaact caaactcagt gattatgagg gacggttgat
ccaaaacagc 180ttaacaatag agagaatggt gctctctgct tttgacgaaa ggagaaataa
ataccttgaa 240gaacatccca gtgcggggaa agatcctaag aaaactggag gacctatata
caggagagta 300aacggaaagt ggatgagaga actcatcctt tatgacaaag aagaaataag
gcgaatctgg 360cgccaagcta ataatggtga cgatgcaacg gctggtctga ctcacatgat
gatctggcat 420tccaatttga atgatgcaac ttatcagagg acaagagctc ttgttcgcac
cggaatggat 480cccaggatgt gctctctgat gcaaggttca actctcccta ggaggtctgg
agccgcaggt 540gctgcagtca aaggagttgg aacaatggtg atggaattgg tcagaatgat
caaacgtggg 600atcaatgatc ggaacttctg gaggggtgag aatggacgaa aaacaagaat
tgcttatgaa 660agaatgtgca acattctcaa agggaaattt caaactgctg cacaaaaagc
aatgatggat 720caagtgagag agagccggaa cccagggaat gctgagttcg aagatctcac
ttttctagca 780cggtctgcac tcatattgag agggtcggtt gctcacaagt cctgcctgcc
tgcctgtgtg 840tatggacctg ccgtagccag tgggtacgac tttgaaaggg agggatactc
tctagtcgga 900atagaccctt tcagactgct tcaaaacagc caagtgtaca gcctaatcag
accaaatgag 960aatccagcac acaagagtca actggtgtgg atggcatgcc attctgccgc
atttgaagat 1020ctaagagtat taagcttcat caaagggacg aaggtgctcc caagagggaa
gctttccact 1080agaggagttc aaattgcttc caatgaaaat atggagacta tggaatcaag
tacacttgaa 1140ctgagaagca ggtactgggc cataaggacc agaagtggag gaaacaccaa
tcaacagagg 1200gcatctgcgg gccaaatcag catacaacct acgttctcag tacagagaaa
tctccctttt 1260gacagaacaa ccgttatggc agcattcagt gggaatacag aggggagaac
atctgacatg 1320aggaccgaaa tcataaggat gatggaaagt gcaagaccag aagatgtgtc
tttccagggg 1380cggggagtct tcgagctctc ggacgaaaag gcagcgagcc cgatcgtgcc
ttcctttgac 1440atgagtaatg aaggatctta tttcttcgga gacaatgcag aggaatacga
taatatgagt 1500cttctaaccg aggtcgaaac gcctatcaga aacgaatggg ggtgcagatg
caacggttca 1560agtgat
15669522PRTArtificial sequenceNPeM2 Fusion Construct 9Met Ala
Ser Gln Gly Thr Lys Arg Ser Tyr Glu Gln Met Glu Thr Asp1 5
10 15Gly Glu Arg Gln Asn Ala Thr Glu
Ile Arg Ala Ser Val Gly Lys Met 20 25
30Ile Gly Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu
Lys 35 40 45Leu Ser Asp Tyr Glu
Gly Arg Leu Ile Gln Asn Ser Leu Thr Ile Glu 50 55
60Arg Met Val Leu Ser Ala Phe Asp Glu Arg Arg Asn Lys Tyr
Leu Glu65 70 75 80Glu
His Pro Ser Ala Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile
85 90 95Tyr Arg Arg Val Asn Gly Lys
Trp Met Arg Glu Leu Ile Leu Tyr Asp 100 105
110Lys Glu Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly
Asp Asp 115 120 125Ala Thr Ala Gly
Leu Thr His Met Met Ile Trp His Ser Asn Leu Asn 130
135 140Asp Ala Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg
Thr Gly Met Asp145 150 155
160Pro Arg Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser
165 170 175Gly Ala Ala Gly Ala
Ala Val Lys Gly Val Gly Thr Met Val Met Glu 180
185 190Leu Val Arg Met Ile Lys Arg Gly Ile Asn Asp Arg
Asn Phe Trp Arg 195 200 205Gly Glu
Asn Gly Arg Lys Thr Arg Ile Ala Tyr Glu Arg Met Cys Asn 210
215 220Ile Leu Lys Gly Lys Phe Gln Thr Ala Ala Gln
Lys Ala Met Met Asp225 230 235
240Gln Val Arg Glu Ser Arg Asn Pro Gly Asn Ala Glu Phe Glu Asp Leu
245 250 255Thr Phe Leu Ala
Arg Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His 260
265 270Lys Ser Cys Leu Pro Ala Cys Val Tyr Gly Pro
Ala Val Ala Ser Gly 275 280 285Tyr
Asp Phe Glu Arg Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe 290
295 300Arg Leu Leu Gln Asn Ser Gln Val Tyr Ser
Leu Ile Arg Pro Asn Glu305 310 315
320Asn Pro Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser
Ala 325 330 335Ala Phe Glu
Asp Leu Arg Val Leu Ser Phe Ile Lys Gly Thr Lys Val 340
345 350Leu Pro Arg Gly Lys Leu Ser Thr Arg Gly
Val Gln Ile Ala Ser Asn 355 360
365Glu Asn Met Glu Thr Met Glu Ser Ser Thr Leu Glu Leu Arg Ser Arg 370
375 380Tyr Trp Ala Ile Arg Thr Arg Ser
Gly Gly Asn Thr Asn Gln Gln Arg385 390
395 400Ala Ser Ala Gly Gln Ile Ser Ile Gln Pro Thr Phe
Ser Val Gln Arg 405 410
415Asn Leu Pro Phe Asp Arg Thr Thr Val Met Ala Ala Phe Ser Gly Asn
420 425 430Thr Glu Gly Arg Thr Ser
Asp Met Arg Thr Glu Ile Ile Arg Met Met 435 440
445Glu Ser Ala Arg Pro Glu Asp Val Ser Phe Gln Gly Arg Gly
Val Phe 450 455 460Glu Leu Ser Asp Glu
Lys Ala Ala Ser Pro Ile Val Pro Ser Phe Asp465 470
475 480Met Ser Asn Glu Gly Ser Tyr Phe Phe Gly
Asp Asn Ala Glu Glu Tyr 485 490
495Asp Asn Met Ser Leu Leu Thr Glu Val Glu Thr Pro Ile Arg Asn Glu
500 505 510Trp Gly Cys Arg Cys
Asn Gly Ser Ser Asp 515 520106PRTArtificial
sequenceLinker Peptide 10Gly Tyr Ala Thr Arg Ala1
5116PRTArtificial sequenceLinker Peptide 11Phe Gln Met Gly Glu Thr1
5128PRTArtificial sequenceLinker Peptide 12Phe Asp Arg Val Lys
His Leu Lys1 5139PRTArtificial sequenceLinker Peptide 13Gly
Arg Asn Thr Asn Gly Val Ile Thr1 51410PRTArtificial
sequenceLinker Peptide 14Val Asn Glu Lys Thr Ile Pro Asp His Asp1
5 10151683DNAInfluenza B virus 15atgtccaaca
tggatattga cagtataaat accggaacaa tcgataaaac accagaagaa 60ctgactcccg
gaaccagtgg ggcaaccaga ccaatcatca agccagcaac ccttgctccg 120ccaagcaaca
aacgaacccg aaatccatct ccagaaagga caaccacaag cagtgaaacc 180gatatcggaa
ggaaaatcca aaagaaacaa accccaacag agataaagaa gagcgtctac 240aaaatggtgg
taaaactggg tgaattctac aaccagatga tggtcaaagc tggacttaat 300gatgacatgg
aaaggaatct aattcaaaat gcacaagctg tggagagaat cctattggct 360gcaactgatg
acaagaaaac tgaataccaa aagaaaagga atgccagaga tgtcaaagaa 420gggaaggaag
aaatagacca caacaagaca ggaggcacct tttataagat ggtaagagat 480gataaaacca
tctacttcag ccctataaaa attacctttt taaaagaaga ggtgaaaaca 540atgtacaaga
ccaccatggg gagtgatggt ttcagtggac taaatcacat tatgattgga 600cattcacaga
tgaacgatgt ctgtttccaa agatcaaagg gactgaaaag ggttggactt 660gacccttcat
taatcagtac ttttgccgga agcacactac ccagaagatc aggtacaact 720ggtgttgcaa
tcaaaggagg tggaacttta gtggatgaag ccatccgatt tataggaaga 780gcaatggcag
acagagggct actgagagac atcaaggcca agacggccta tgaaaagatt 840cttctgaatc
tgaaaaacaa gtgctctgcg ccgcaacaaa aggctctagt tgatcaagtg 900atcggaagta
ggaacccagg gattgcagac atagaagacc taactctgct tgccagaagc 960atggtagttg
tcagaccctc tgtagcgagc aaagtggtgc ttcccataag catttatgct 1020aaaatacctc
aactaggatt caataccgaa gaatactcta tggttgggta tgaagccatg 1080gctctttata
atatggcaac acctgtttcc atattaagaa tgggagatga cgcaaaagat 1140aaatctcaac
tattcttcat gtcgtgcttc ggagctgcct atgaagatct aagagtgtta 1200tctgcactaa
cgggcaccga atttaagcct agatcagcac taaaatgcaa gggtttccat 1260gtcccggcta
aggagcaagt agaaggaatg ggggcagctc tgatgtccat caagcttcag 1320ttctgggccc
caatgaccag atctggaggg aatgaagtaa gtggagaagg agggtctggt 1380caaataagtt
gcagccctgt gtttgcagta gaaagaccta ttgctctaag caagcaagct 1440gtaagaagaa
tgctgtcaat gaacgttgaa ggacgtgatg cagatgtcaa aggaaatcta 1500ctcaaaatga
tgaatgattc aatggcaaag aaaaccagtg gaaatgcttt cattgggaag 1560aaaatgtttc
aaatatcaga caaaaacaaa gtcaatccca ttgagattcc aattaagcag 1620accatcccca
atttcttctt tgggagggac acagcagagg attatgatga cctcgattat 1680taa
168316560PRTArtificial sequenceInfluenza B Virus 16Met Ser Asn Met Asp
Ile Asp Ser Ile Asn Thr Gly Thr Ile Asp Lys1 5
10 15Thr Pro Glu Glu Leu Thr Pro Gly Thr Ser Gly
Ala Thr Arg Pro Ile 20 25
30Ile Lys Pro Ala Thr Leu Ala Pro Pro Ser Asn Lys Arg Thr Arg Asn
35 40 45Pro Ser Pro Glu Arg Thr Thr Thr
Ser Ser Glu Thr Asp Ile Gly Arg 50 55
60Lys Ile Gln Lys Lys Gln Thr Pro Thr Glu Ile Lys Lys Ser Val Tyr65
70 75 80Lys Met Val Val Lys
Leu Gly Glu Phe Tyr Asn Gln Met Met Val Lys 85
90 95Ala Gly Leu Asn Asp Asp Met Glu Arg Asn Leu
Ile Gln Asn Ala Gln 100 105
110Ala Val Glu Arg Ile Leu Leu Ala Ala Thr Asp Asp Lys Lys Thr Glu
115 120 125Tyr Gln Lys Lys Arg Asn Ala
Arg Asp Val Lys Glu Gly Lys Glu Glu 130 135
140Ile Asp His Asn Lys Thr Gly Gly Thr Phe Tyr Lys Met Val Arg
Asp145 150 155 160Asp Lys
Thr Ile Tyr Phe Ser Pro Ile Lys Ile Thr Phe Leu Lys Glu
165 170 175Glu Val Lys Thr Met Tyr Lys
Thr Thr Met Gly Ser Asp Gly Phe Ser 180 185
190Gly Leu Asn His Ile Met Ile Gly His Ser Gln Met Asn Asp
Val Cys 195 200 205Phe Gln Arg Ser
Lys Gly Leu Lys Arg Val Gly Leu Asp Pro Ser Leu 210
215 220Ile Ser Thr Phe Ala Gly Ser Thr Leu Pro Arg Arg
Ser Gly Thr Thr225 230 235
240Gly Val Ala Ile Lys Gly Gly Gly Thr Leu Val Asp Glu Ala Ile Arg
245 250 255Phe Ile Gly Arg Ala
Met Ala Asp Arg Gly Leu Leu Arg Asp Ile Lys 260
265 270Ala Lys Thr Ala Tyr Glu Lys Ile Leu Leu Asn Leu
Lys Asn Lys Cys 275 280 285Ser Ala
Pro Gln Gln Lys Ala Leu Val Asp Gln Val Ile Gly Ser Arg 290
295 300Asn Pro Gly Ile Ala Asp Ile Glu Asp Leu Thr
Leu Leu Ala Arg Ser305 310 315
320Met Val Val Val Arg Pro Ser Val Ala Ser Lys Val Val Leu Pro Ile
325 330 335Ser Ile Tyr Ala
Lys Ile Pro Gln Leu Gly Phe Asn Thr Glu Glu Tyr 340
345 350Ser Met Val Gly Tyr Glu Ala Met Ala Leu Tyr
Asn Met Ala Thr Pro 355 360 365Val
Ser Ile Leu Arg Met Gly Asp Asp Ala Lys Asp Lys Ser Gln Leu 370
375 380Phe Phe Met Ser Cys Phe Gly Ala Ala Tyr
Glu Asp Leu Arg Val Leu385 390 395
400Ser Ala Leu Thr Gly Thr Glu Phe Lys Pro Arg Ser Ala Leu Lys
Cys 405 410 415Lys Gly Phe
His Val Pro Ala Lys Glu Gln Val Glu Gly Met Gly Ala 420
425 430Ala Leu Met Ser Ile Lys Leu Gln Phe Trp
Ala Pro Met Thr Arg Ser 435 440
445Gly Gly Asn Glu Val Ser Gly Glu Gly Gly Ser Gly Gln Ile Ser Cys 450
455 460Ser Pro Val Phe Ala Val Glu Arg
Pro Ile Ala Leu Ser Lys Gln Ala465 470
475 480Val Arg Arg Met Leu Ser Met Asn Val Glu Gly Arg
Asp Ala Asp Val 485 490
495Lys Gly Asn Leu Leu Lys Met Met Asn Asp Ser Met Ala Lys Lys Thr
500 505 510Ser Gly Asn Ala Phe Ile
Gly Lys Lys Met Phe Gln Ile Ser Asp Lys 515 520
525Asn Lys Val Asn Pro Ile Glu Ile Pro Ile Lys Gln Thr Ile
Pro Asn 530 535 540Phe Phe Phe Gly Arg
Asp Thr Ala Glu Asp Tyr Asp Asp Leu Asp Tyr545 550
555 560171220DNAInfluenza A virus 17atggaggcaa
gactactggt cttgttatgt gcatttgcag ctacaaatgc agacacaata 60tgtataggct
accatgcgaa taactcaacc gacactgttg acacagtact cgaaaagaat 120gtgaccgtga
cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaaa 180ttaaaaggaa
tagccccatt acaattgggg aaatgtaata tcgccggatg gctcttggga 240aacccggaat
gcgatttact gctcacagcg agctcatggt cctatattgt agaaacatcg 300aactcagaga
atggaacatg ttacccagga gatttcatcg actatgaaga actgagggag 360caattgagct
cagtgtcatc gtttgaaaaa ttcgaaatat ttcccaagac aagctcgtgg 420cccaatcatg
aaacaaccaa aggtgtaacg gcagcatgct cctatgcggg agcaagcagt 480ttttacagaa
atttgctgtg gctgacaaag aagggaagct catacccaaa gcttagcaag 540tcctatgtga
acaataaagg gaaagaagtc cttgtactat ggggtgttca tcatccgcct 600accggtactg
atcaacagag tctctatcag aatgcagatg cttatgtctc tgtagggtca 660tcaaaatata
acaggagatt caccccggaa atagcagcga gacccaaagt aagaggtcaa 720gctgggagga
tgaactatta ctggacatta ctagaacccg gagacacaat aacatttgag 780gcaactggaa
atctaatagc accatggtat gctttcgcac tgaatagagg ttctggatcc 840ggtatcatca
cttcagacgc accagtgcat gattgtaaca cgaagtgtca aacaccccat 900ggtgctataa
acagcagtct ccctttccag aatatacatc cagtcacaat aggagagtgc 960ccaaaatacg
tcaggagtac caaattgagg atggctacag gactaagaaa cattccatct 1020attcaatcca
ggggtctatt tggagccatt gccggtttta ttgagggggg atggactgga 1080atgatagatg
gatggtatgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg 1140gatcaaaaaa
gcacacaaaa tgccattgac gggattacaa acaaggtgaa ttctgttatc 1200gagaaaatga
acacccaatt
122018406PRTInfluenza A virus 18Met Glu Ala Arg Leu Leu Val Leu Leu Cys
Ala Phe Ala Ala Thr Asn1 5 10
15Ala Asp Thr Ile Cys Ile Gly Tyr His Ala Asn Asn Ser Thr Asp Thr
20 25 30Val Asp Thr Val Leu Glu
Lys Asn Val Thr Val Thr His Ser Val Asn 35 40
45Leu Leu Glu Asp Ser His Asn Gly Lys Leu Cys Lys Leu Lys
Gly Ile 50 55 60Ala Pro Leu Gln Leu
Gly Lys Cys Asn Ile Ala Gly Trp Leu Leu Gly65 70
75 80Asn Pro Glu Cys Asp Leu Leu Leu Thr Ala
Ser Ser Trp Ser Tyr Ile 85 90
95Val Glu Thr Ser Asn Ser Glu Asn Gly Thr Cys Tyr Pro Gly Asp Phe
100 105 110Ile Asp Tyr Glu Glu
Leu Arg Glu Gln Leu Ser Ser Val Ser Ser Phe 115
120 125Glu Lys Phe Glu Ile Phe Pro Lys Thr Ser Ser Trp
Pro Asn His Glu 130 135 140Thr Thr Lys
Gly Val Thr Ala Ala Cys Ser Tyr Ala Gly Ala Ser Ser145
150 155 160Phe Tyr Arg Asn Leu Leu Trp
Leu Thr Lys Lys Gly Ser Ser Tyr Pro 165
170 175Lys Leu Ser Lys Ser Tyr Val Asn Asn Lys Gly Lys
Glu Val Leu Val 180 185 190Leu
Trp Gly Val His His Pro Pro Thr Gly Thr Asp Gln Gln Ser Leu 195
200 205Tyr Gln Asn Ala Asp Ala Tyr Val Ser
Val Gly Ser Ser Lys Tyr Asn 210 215
220Arg Arg Phe Thr Pro Glu Ile Ala Ala Arg Pro Lys Val Arg Gly Gln225
230 235 240Ala Gly Arg Met
Asn Tyr Tyr Trp Thr Leu Leu Glu Pro Gly Asp Thr 245
250 255Ile Thr Phe Glu Ala Thr Gly Asn Leu Ile
Ala Pro Trp Tyr Ala Phe 260 265
270Ala Leu Asn Arg Gly Ser Gly Ser Gly Ile Ile Thr Ser Asp Ala Pro
275 280 285Val His Asp Cys Asn Thr Lys
Cys Gln Thr Pro His Gly Ala Ile Asn 290 295
300Ser Ser Leu Pro Phe Gln Asn Ile His Pro Val Thr Ile Gly Glu
Cys305 310 315 320Pro Lys
Tyr Val Arg Ser Thr Lys Leu Arg Met Ala Thr Gly Leu Arg
325 330 335Asn Ile Pro Ser Ile Gln Ser
Arg Gly Leu Phe Gly Ala Ile Ala Gly 340 345
350Phe Ile Glu Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr
Gly Tyr 355 360 365His His Gln Asn
Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser 370
375 380Thr Gln Asn Ala Ile Asp Gly Ile Thr Asn Lys Val
Asn Ser Val Ile385 390 395
400Glu Lys Met Asn Thr Gln 405191741DNAInfluenza A virus
19ctgtcaaaat ggagaaaata gtgcttcttc ttgcaacagt cagtcttgtt aaaagtgatc
60agatttgcat tggttaccat gcaaacaact cgacagagca ggttgacaca ataatggaaa
120agaatgttac tgttacacat gcccaagaca tactggaaag gacacacaac gggaagctct
180gcgatctaaa tggagtgaaa cctctcattt tgagggattg tagtgtagct ggatggctcc
240tcggaaaccc tatgtgtgac gaattcatca atgtgccgga atggtcttac atagtggaga
300aggccagtcc agccaatgac ctctgttatc cagggaattt caacgactat gaagaactga
360aacacctatt gagcagaata aaccattttg agaaaattca gatcatcccc aaaagttctt
420ggtccaatca tgatgcctca tcaggggtga gctcagcatg tccatacctt gggaggtcct
480cctttttcag aaatgtggta tggcttatca aaaagaacag tgcataccca acaataaaga
540ggagctacaa taataccaac caagaagatc ttttggtact gtgggggatt caccatccta
600atgatgcggc agagcagaca aagctctatc aaaatccaac cacctacatt tccgttggaa
660catcaacact gaaccagaga ttggttccag aaatagctac tagacccaaa gtaaacgggc
720aaagtggaag aatggagttc ttctggacaa ttttaaagcc gaatgatgcc atcaatttcg
780agagtaatgg aaatttcatt gccccagaat atgcatacaa aattgtcaag aaaggggact
840caacaattat gaaaagtgaa ttggaatatg gtaactgcaa caccaagtgt caaactccaa
900tgggggcgat aaactctagt atgccattcc acaacataca ccccctcacc atcggggaat
960gccccaaata tgtgaaatca aacagattag ttcttgcgac tggactcaga aatacccctc
1020aaagggagag aagaagaaaa aagagaggac tatttggagc tatagcaggt tttatagagg
1080gaggatggca gggcatggta gatggttggt atgggtacca ccatagcaat gagcagggga
1140gtggatacgc tgcagacaaa gaatccactc aaaaggcaat agatggagtc accaataagg
1200tcaactcgat cattaacaaa atgaacactc agtttgaggc cgttggaagg gaatttaata
1260acttagaaag gagaatagag aatttaaaca agaaaatgga agacggattc ctagatgtct
1320ggacttacaa tgctgaactt ctggttctca tggaaaatga gagaactctc gactttcatg
1380actcaaatgt caagaacctt tacgacaagg tccgactaca gcttagggat aatgcaaagg
1440aactgggtaa tggttgtttc gaattctatc acaaatgtga taatgaatgt atggaaagtg
1500taaaaaacgg aacgtatgac tacccgcagt attcagaaga agcaagacta aacagagagg
1560aaataagtgg agtaaaattg gaatcaatgg gaacttacca aatactgtca atttattcaa
1620cagtggcgag ttccctagca ctggcaatca tggtagctgg tctatcttta tggatgtgct
1680ccaatggatc gttacaatgc agaatttgca tttaaatttg tgagttcaga ttgtagttaa
1740a
174120568PRTInfluenza A virus 20Met Glu Lys Ile Val Leu Leu Leu Ala Thr
Val Ser Leu Val Lys Ser1 5 10
15Asp Gln Ile Cys Ile Gly Tyr His Ala Asn Asn Ser Thr Glu Gln Val
20 25 30Asp Thr Ile Met Glu Lys
Asn Val Thr Val Thr His Ala Gln Asp Ile 35 40
45Leu Glu Arg Thr His Asn Gly Lys Leu Cys Asp Leu Asn Gly
Val Lys 50 55 60Pro Leu Ile Leu Arg
Asp Cys Ser Val Ala Gly Trp Leu Leu Gly Asn65 70
75 80Pro Met Cys Asp Glu Phe Ile Asn Val Pro
Glu Trp Ser Tyr Ile Val 85 90
95Glu Lys Ala Ser Pro Ala Asn Asp Leu Cys Tyr Pro Gly Asn Phe Asn
100 105 110Asp Tyr Glu Glu Leu
Lys His Leu Leu Ser Arg Ile Asn His Phe Glu 115
120 125Lys Ile Gln Ile Ile Pro Lys Ser Ser Trp Ser Asn
His Asp Ala Ser 130 135 140Ser Gly Val
Ser Ser Ala Cys Pro Tyr Leu Gly Arg Ser Ser Phe Phe145
150 155 160Arg Asn Val Val Trp Leu Ile
Lys Lys Asn Ser Ala Tyr Pro Thr Ile 165
170 175Lys Arg Ser Tyr Asn Asn Thr Asn Gln Glu Asp Leu
Leu Val Leu Trp 180 185 190Gly
Ile His His Pro Asn Asp Ala Ala Glu Gln Thr Lys Leu Tyr Gln 195
200 205Asn Pro Thr Thr Tyr Ile Ser Val Gly
Thr Ser Thr Leu Asn Gln Arg 210 215
220Leu Val Pro Glu Ile Ala Thr Arg Pro Lys Val Asn Gly Gln Ser Gly225
230 235 240Arg Met Glu Phe
Phe Trp Thr Ile Leu Lys Pro Asn Asp Ala Ile Asn 245
250 255Phe Glu Ser Asn Gly Asn Phe Ile Ala Pro
Glu Tyr Ala Tyr Lys Ile 260 265
270Val Lys Lys Gly Asp Ser Thr Ile Met Lys Ser Glu Leu Glu Tyr Gly
275 280 285Asn Cys Asn Thr Lys Cys Gln
Thr Pro Met Gly Ala Ile Asn Ser Ser 290 295
300Met Pro Phe His Asn Ile His Pro Leu Thr Ile Gly Glu Cys Pro
Lys305 310 315 320Tyr Val
Lys Ser Asn Arg Leu Val Leu Ala Thr Gly Leu Arg Asn Thr
325 330 335Pro Gln Arg Glu Arg Arg Arg
Lys Lys Arg Gly Leu Phe Gly Ala Ile 340 345
350Ala Gly Phe Ile Glu Gly Gly Trp Gln Gly Met Val Asp Gly
Trp Tyr 355 360 365Gly Tyr His His
Ser Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Lys 370
375 380Glu Ser Thr Gln Lys Ala Ile Asp Gly Val Thr Asn
Lys Val Asn Ser385 390 395
400Ile Ile Asn Lys Met Asn Thr Gln Phe Glu Ala Val Gly Arg Glu Phe
405 410 415Asn Asn Leu Glu Arg
Arg Ile Glu Asn Leu Asn Lys Lys Met Glu Asp 420
425 430Gly Phe Leu Asp Val Trp Thr Tyr Asn Ala Glu Leu
Leu Val Leu Met 435 440 445Glu Asn
Glu Arg Thr Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu 450
455 460Tyr Asp Lys Val Arg Leu Gln Leu Arg Asp Asn
Ala Lys Glu Leu Gly465 470 475
480Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp Asn Glu Cys Met Glu
485 490 495Ser Val Lys Asn
Gly Thr Tyr Asp Tyr Pro Gln Tyr Ser Glu Glu Ala 500
505 510Arg Leu Asn Arg Glu Glu Ile Ser Gly Val Lys
Leu Glu Ser Met Gly 515 520 525Thr
Tyr Gln Ile Leu Ser Ile Tyr Ser Thr Val Ala Ser Ser Leu Ala 530
535 540Leu Ala Ile Met Val Ala Gly Leu Ser Leu
Trp Met Cys Ser Asn Gly545 550 555
560Ser Leu Gln Cys Arg Ile Cys Ile
565211714DNAInfluenza A virus 21gcaaaagcag gggaattact taactagcaa
aatggaaaca atatcactaa taactatact 60actagtagta acagcaagca atgcagataa
aatctgcatc ggccaccagt caacaaactc 120cacagaaact gtggacacgc taacagaaac
caatgttcct gtgacacatg ccaaagaatt 180gctccacaca gagcataatg gaatgctgtg
tgcaacaagc ctgggacatc ccctcattct 240agacacatgc actattgaag gactagtcta
tggcaaccct tcttgtgacc tgctgttggg 300aggaagagaa tggtcctaca tcgtcgaaag
atcatcagct gtaaatggaa cgtgttaccc 360tgggaatgta gaaaacctag aggaactcag
gacacttttt agttccgcta gttcctacca 420aagaatccaa atcttcccag acacaacctg
gaatgtgact tacactggaa caagcagagc 480atgttcaggt tcattctaca ggagtatgag
atggctgact caaaagagcg gtttttaccc 540tgttcaagac gcccaataca caaataacag
gggaaagagc attcttttcg tgtggggcat 600acatcaccca cccacctata ccgagcaaac
aaatttgtac ataagaaacg acacaacaac 660aagcgtgaca acagaagatt tgaataggac
cttcaaacca gtgatagggc caaggcccct 720tgtcaatggt ctgcagggaa gaattgatta
ttattggtcg gtactaaaac caggccaaac 780attgcgagta cgatccaatg ggaatctaat
tgctccatgg tatggacacg ttctttcagg 840agggagccat ggaagaatcc tgaagactga
tttaaaaggt ggtaattgtg tagtgcaatg 900tcagactgaa aaaggtggct taaacagtac
attgccattc cacaatatca gtaaatatgc 960atttggaacc tgccccaaat atgtaagagt
taatagtctc aaactggcag tcggtctgag 1020gaacgtgcct gctagatcaa gtagaggact
atttggagcc atagctggat tcatagaagg 1080aggttggcca ggactagtcg ctggctggta
tggtttccag cattcaaatg atcaaggggt 1140tggtatggct gcagataggg attcaactca
aaaggcaatt gataaaataa catccaaggt 1200gaataatata gtcgacaaga tgaacaagca
atatgaaata attgatcatg aattcagtga 1260ggttgaaact agactcaata tgatcaataa
taagattgat gaccaaatac aagacgtatg 1320ggcatataat gcagaattgc tagtactact
tgaaaatcaa aaaacactcg atgagcatga 1380tgcgaacgtg aacaatctat ataacaaggt
gaagagggca ctgggctcca atgctatgga 1440agatgggaaa ggctgtttcg agctatacca
taaatgtgat gatcagtgca tggaaacaat 1500tcggaacggg acctataata ggagaaagta
tagagaggaa tcaagactag aaaggcagaa 1560aatagagggg gttaagctgg aatctgaggg
aacttacaaa atcctcacca tttattcgac 1620tgtcgcctca tctcttgtgc ttgcaatggg
gtttgctgcc ttcctgttct gggccatgtc 1680caatggatct tgcagatgca acatttgtat
ataa 171422560PRTInfluenza A virus 22Met
Glu Thr Ile Ser Leu Ile Thr Ile Leu Leu Val Val Thr Ala Ser1
5 10 15Asn Ala Asp Lys Ile Cys Ile
Gly His Gln Ser Thr Asn Ser Thr Glu 20 25
30Thr Val Asp Thr Leu Thr Glu Thr Asn Val Pro Val Thr His
Ala Lys 35 40 45Glu Leu Leu His
Thr Glu His Asn Gly Met Leu Cys Ala Thr Ser Leu 50 55
60Gly His Pro Leu Ile Leu Asp Thr Cys Thr Ile Glu Gly
Leu Val Tyr65 70 75
80Gly Asn Pro Ser Cys Asp Leu Leu Leu Gly Gly Arg Glu Trp Ser Tyr
85 90 95Ile Val Glu Arg Ser Ser
Ala Val Asn Gly Thr Cys Tyr Pro Gly Asn 100
105 110Val Glu Asn Leu Glu Glu Leu Arg Thr Leu Phe Ser
Ser Ala Ser Ser 115 120 125Tyr Gln
Arg Ile Gln Ile Phe Pro Asp Thr Thr Trp Asn Val Thr Tyr 130
135 140Thr Gly Thr Ser Arg Ala Cys Ser Gly Ser Phe
Tyr Arg Ser Met Arg145 150 155
160Trp Leu Thr Gln Lys Ser Gly Phe Tyr Pro Val Gln Asp Ala Gln Tyr
165 170 175Thr Asn Asn Arg
Gly Lys Ser Ile Leu Phe Val Trp Gly Ile His His 180
185 190Pro Pro Thr Tyr Thr Glu Gln Thr Asn Leu Tyr
Ile Arg Asn Asp Thr 195 200 205Thr
Thr Ser Val Thr Thr Glu Asp Leu Asn Arg Thr Phe Lys Pro Val 210
215 220Ile Gly Pro Arg Pro Leu Val Asn Gly Leu
Gln Gly Arg Ile Asp Tyr225 230 235
240Tyr Trp Ser Val Leu Lys Pro Gly Gln Thr Leu Arg Val Arg Ser
Asn 245 250 255Gly Asn Leu
Ile Ala Pro Trp Tyr Gly His Val Leu Ser Gly Gly Ser 260
265 270His Gly Arg Ile Leu Lys Thr Asp Leu Lys
Gly Gly Asn Cys Val Val 275 280
285Gln Cys Gln Thr Glu Lys Gly Gly Leu Asn Ser Thr Leu Pro Phe His 290
295 300Asn Ile Ser Lys Tyr Ala Phe Gly
Thr Cys Pro Lys Tyr Val Arg Val305 310
315 320Asn Ser Leu Lys Leu Ala Val Gly Leu Arg Asn Val
Pro Ala Arg Ser 325 330
335Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp
340 345 350Pro Gly Leu Val Ala Gly
Trp Tyr Gly Phe Gln His Ser Asn Asp Gln 355 360
365Gly Val Gly Met Ala Ala Asp Arg Asp Ser Thr Gln Lys Ala
Ile Asp 370 375 380Lys Ile Thr Ser Lys
Val Asn Asn Ile Val Asp Lys Met Asn Lys Gln385 390
395 400Tyr Glu Ile Ile Asp His Glu Phe Ser Glu
Val Glu Thr Arg Leu Asn 405 410
415Met Ile Asn Asn Lys Ile Asp Asp Gln Ile Gln Asp Val Trp Ala Tyr
420 425 430Asn Ala Glu Leu Leu
Val Leu Leu Glu Asn Gln Lys Thr Leu Asp Glu 435
440 445His Asp Ala Asn Val Asn Asn Leu Tyr Asn Lys Val
Lys Arg Ala Leu 450 455 460Gly Ser Asn
Ala Met Glu Asp Gly Lys Gly Cys Phe Glu Leu Tyr His465
470 475 480Lys Cys Asp Asp Gln Cys Met
Glu Thr Ile Arg Asn Gly Thr Tyr Asn 485
490 495Arg Arg Lys Tyr Arg Glu Glu Ser Arg Leu Glu Arg
Gln Lys Ile Glu 500 505 510Gly
Val Lys Leu Glu Ser Glu Gly Thr Tyr Lys Ile Leu Thr Ile Tyr 515
520 525Ser Thr Val Ala Ser Ser Leu Val Leu
Ala Met Gly Phe Ala Ala Phe 530 535
540Leu Phe Trp Ala Met Ser Asn Gly Ser Cys Arg Cys Asn Ile Cys Ile545
550 555
560231494DNAArtificial sequenceHuman Codon Optimized Influenza A Virus
H1N1 Nucleoprotein 23atggcctctc aggggacaaa gcggtcctac gagcagatgg
agaccgatgg agaaaggcag 60aatgctaccg agatacgagc ctcggtggga aagatgatag
gcgggatcgg taggttttac 120attcagatgt gcactgagct taagctgagt gattatgaag
gtagactgat acagaattca 180ctcaccatcg aaagaatggt gctgagtgca ttcgacgagc
gccgaaacaa atacctggag 240gaacatcctt cagccggcaa ggatcccaag aaaactggcg
gacccatcta ccggagggtg 300aacgggaaat ggatgcgcga gctgattctg tatgataaag
aagaaatccg gcgtatctgg 360aggcaagcta acaacggaga tgatgccaca gccggactga
cgcatatgat gatttggcac 420tctaacctta acgacgcgac ctaccagagg acccgggccc
tcgtgagaac aggcatggat 480ccacgaatgt gctcacttat gcaggggtcc accctgccaa
ggaggagcgg ggcagctggt 540gccgcagtca aaggggtggg aactatggtg atggagctag
tgcgtatgat taagcgcggc 600ataaatgacc gcaatttctg gcggggggaa aacggacgaa
agacacgcat tgcatatgaa 660cgcatgtgca atattctcaa ggggaaattc cagacggctg
ctcaaaaggc catgatggac 720caggtgaggg agtcaagaaa cccaggcaac gccgagtttg
aagacctgac cttcctggca 780cggtctgctc taatcctcag aggtagtgta gcacacaaga
gttgtcttcc ggcttgtgtg 840tatggaccag ctgttgcatc agggtatgat ttcgaaaggg
aaggctacag cctagttggt 900atcgacccgt ttagactctt acagaattcc caagtctatt
ccctgatcag acccaacgag 960aatcctgctc acaaaagcca gttggtctgg atggcctgtc
actccgccgc cttcgaggac 1020ctccgggtct tgtcctttat caaaggcact aaggttctgc
cccgcggcaa gttaagcact 1080aggggagttc agatcgcaag taacgagaac atggagacaa
tggagtctag caccttggaa 1140ttgcgctccc gttattgggc gatccggaca agaagcggag
gtaacacgaa tcagcaacgg 1200gccagcgcgg gccaaatttc gatacagcct actttcagcg
tgcagcggaa tctccccttc 1260gatcgcacca ccgtaatggc cgcgtttagt ggtaatacag
agggcagaac ttctgacatg 1320cgaacagaga ttatccgtat gatggagagc gctcgacctg
aagatgtgtc atttcagggc 1380agaggcgtat ttgagctgtc cgacgagaaa gcagcctctc
ctattgtccc ctctttcgac 1440atgtccaacg aggggagcta cttctttggc gacaatgccg
aagaatacga caat 1494241497DNAArtificial SequenceHuman Codon
Optimized Influenza A Virus H1N1 Nucleoprotein 24atggccagcc
agggcaccaa gcggagctac gagcagatgg agaccgacgg cgagcggcag 60aacgccaccg
agatccgggc cagcgtgggc aagatgatcg gcggcatcgg ccggttctac 120atccagatgt
gcaccgagct gaagctgagc gactacgagg gccggctgat ccagaacagc 180ctgaccatcg
agcggatggt gctgagcgcc ttcgacgagc ggcggaacaa gtacctggag 240gagcacccca
gcgccggcaa ggaccccaag aagaccggcg gccccatcta ccggcgggtg 300aacggcaagt
ggatgcggga gctgatcctg tacgacaagg aggagatccg gcggatctgg 360cggcaggcca
acaacggcga cgacgccacc gccggcctga cccacatgat gatctggcac 420agcaacctga
acgacgccac ctaccagcgg acccgggccc tggtgcggac cggcatggac 480ccccggatgt
gcagcctgat gcagggcagc accctgcccc ggcggagcgg cgccgccggc 540gccgccgtga
agggcgtggg caccatggtg atggagctgg tgcggatgat caagcggggc 600atcaacgacc
ggaacttctg gcggggcgag aacggccgga agacccggat cgcctacgag 660cggatgtgca
acatcctgaa gggcaagttc cagaccgccg cccagaaggc catgatggac 720caggtgcggg
agagccggaa ccccggcaac gccgagttcg aggacctgac cttcctggcc 780cggagcgccc
tgatcctgcg gggcagcgtg gcccacaaga gctgcctgcc cgcctgcgtg 840tacggccccg
ccgtggccag cggctacgac ttcgagcggg agggctacag cctggtgggc 900atcgacccct
tccggctgct gcagaacagc caggtgtaca gcctgatccg gcccaacgag 960aaccccgccc
acaagagcca gctggtgtgg atggcctgcc acagcgccgc cttcgaggac 1020ctgcgggtgc
tgagcttcat caagggcacc aaggtgctgc cccggggcaa gctgagcacc 1080cggggcgtgc
agatcgccag caacgagaac atggagacca tggagagcag caccctggag 1140ctgcggagcc
ggtactgggc catccggacc cggagcggcg gcaacaccaa ccagcagcgg 1200gccagcgccg
gccagatcag catccagccc accttcagcg tgcagcggaa cctgcccttc 1260gaccggacca
ccgtgatggc cgccttcagc ggcaacaccg agggccggac cagcgacatg 1320cggaccgaga
tcatccggat gatggagagc gcccggcccg aggacgtgag cttccagggc 1380cggggcgtgt
tcgagctgag cgacgagaag gccgccagcc ccatcgtgcc cagcttcgac 1440atgagcaacg
agggcagcta cttcttcggc gacaacgccg aggagtacga caactga
1497251497DNAArtificial sequenceHuman Codon Optimized Influenza A Virus
H1N1 Nucleoprotein 25atggcctcac agggcaccaa gcggagttat gagcagatgg
agaccgatgg cgagagacag 60aacgccacag agatcagagc ctcagttggc aagatgatcg
gcggcatcgg ccggttctat 120atccagatgt gcacggagct gaagctgagc gactacgagg
gcagactgat tcagaactct 180ctgaccatcg agagaatggt cctgagtgcc ttcgatgaga
gacgaaacaa gtatctggag 240gagcatccct ccgccggcaa ggaccccaag aagacgggcg
gccccatata tagaagagtt 300aacggcaagt ggatgagaga gctgatcctg tacgataagg
aggagatccg cagaatatgg 360aggcaggcca acaacggcga cgatgccact gccggcctga
cacatatgat gatatggcac 420agtaacctga acgacgccac ctaccagaga acaagggccc
tggttcgcac gggcatggat 480cccagaatgt gttcactgat gcagggctct acactgccca
gaaggtctgg cgccgccggc 540gccgccgtca agggcgttgg cacaatggtg atggagctgg
tgcggatgat caagagaggc 600attaacgatc ggaacttttg gaggggcgag aacggcagaa
agaccaggat agcctacgag 660cgaatgtgca acattctgaa gggcaagttc cagactgccg
cccagaaggc catgatggat 720caggtgcggg agagcagaaa ccccggcaac gccgagttcg
aggacctgac tttcctggcc 780agatctgccc tgatactgag gggctctgta gcccacaagt
cctgcctgcc cgcctgcgtg 840tacggccccg ccgtggcctc cggctatgac ttcgagcgag
agggctactc cctggtaggc 900atcgatccct ttagactgct gcagaactct caggtctaca
gtctgattag acccaacgag 960aaccccgccc ataagagcca gctggtgtgg atggcctgcc
acagtgccgc cttcgaggac 1020ctgagggtgc tgtcttttat aaagggcaca aaggtgctgc
cccgcggcaa gctgtctact 1080aggggcgtcc agatagcctc caacgagaac atggagacaa
tggagtctag tactctggag 1140ctgaggtcta ggtactgggc catcaggact aggagcggcg
gcaacaccaa ccagcagagg 1200gccagcgccg gccagatcag cattcagccc accttcagtg
tacagagaaa cctgcccttt 1260gatagaacta ctgttatggc cgccttctct ggcaacactg
agggcagaac tagtgacatg 1320cgaacagaga tcataagaat gatggagtcg gcccgtcccg
aggatgtgtc ctttcagggc 1380aggggcgtct tcgagctgag cgacgagaag gccgccagcc
ccatcgtacc ctctttcgat 1440atgagtaacg agggctcgta cttttttggc gacaacgccg
aggagtatga taactga 149726756DNAArtificial sequenceHuman Codon
Optimized Influenza A Virus M1 Protein 26atgagcttgc taacagaagt
ggaaacctat gtcctcagta tcattcctag cggcccctta 60aaagccgaaa tcgctcagcg
gctcgaggat gtttttgccg gcaagaacac cgacctggag 120gtattgatgg agtggctgaa
aacgcgacct attctgagcc ccctgactaa gggaatactc 180ggcttcgttt ttacattgac
cgtgccctca gagaggggtc tccaaaggag gcgcttcgtg 240cagaacgcct taaacgggaa
cggggaccca aataatatgg ataaggcagt gaaactgtat 300cgcaaattaa agcgggagat
aaccttccat ggagccaagg agatctccct gtcttactct 360gcaggtgctc tcgcgtcgtg
tatgggactt atctacaacc gaatgggcgc cgtcacaaca 420gaagtggctt tcgggctggt
gtgcgcaact tgcgaacaga ttgctgacag tcagcaccgg 480tcccaccgtc aaatggtcac
caccaccaat ccgctgatta gacatgaaaa tcgcatggtt 540ctagcatcaa ctacagccaa
agcaatggaa caaatggccg gaagctccga gcaggctgcc 600gaggcgatgg aggtggcgtc
ccaggccaga cagatggtac aggctatgag aactatcggt 660acgcacccaa gttcttcagc
tgggctgaag aatgatcttc ttgagaacct gcaggcctac 720caaaagcgga tgggcgtcca
gatgcagaga tttaaa 75627756DNAArtificial
sequenceHuman Codon Optimized Influenza A Virus M1 Protein
27atgagcctgc tgaccgaggt ggagacctac gtgctgagca tcatccccag cggccccctg
60aaggccgaga tcgcccagag gctggaggac gtgttcgccg gcaagaacac cgacctggag
120gtgctgatgg agtggctgaa gaccaggccc atcctgagcc ccctgaccaa gggcatcctg
180ggcttcgtgt tcaccctgac cgtgcccagc gagaggggcc tgcagaggag gaggttcgtg
240cagaacgccc tgaacggcaa cggcgacccc aacaacatgg acaaggccgt gaagctgtac
300aggaagctga agagggagat caccttccac ggcgccaagg agatcagcct gagctacagc
360gccggcgccc tggccagctg catgggcctg atctacaaca ggatgggcgc cgtgaccacc
420gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacagg
480agccacaggc agatggtgac caccaccaac cccctgatca ggcacgagaa caggatggtg
540ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc
600gaggccatgg aggtggccag ccaggccagg cagatggtgc aggccatgag gaccatcggc
660acccacccca gcagcagcgc cggcctgaag aacgacctgc tggagaacct gcaggcctac
720cagaagagga tgggcgtgca gatgcagagg ttcaag
75628756DNAArtificial sequenceHuman Codon Optimized Influenza A Virus M1
Protein 28atgagtctgc tgacagaggt tgagacgtac gtgctgtcca tcattccctc
aggccccctg 60aaggccgaga ttgcccagag actggaggac gtcttcgccg gcaagaacac
cgatctggag 120gtgctgatgg agtggctgaa gactcgcccc atcctgtctc ccctgacaaa
gggcatcctg 180ggcttcgtat ttacactgac cgtcccctcc gagagaggcc tgcagcggag
gaggttcgtt 240cagaacgccc tgaacggcaa cggcgatccc aacaacatgg ataaggccgt
gaagctgtat 300agaaagctga agcgagagat cacatttcat ggcgccaagg agatatcgct
gagctacagt 360gccggcgccc tggcctcttg catgggcctg atatacaaca gaatgggcgc
cgttactaca 420gaggtagcct ttggcctggt ctgcgccact tgcgagcaga tcgccgactc
tcagcataga 480tctcacagac agatggtgac gactacaaac cccctgatac ggcacgagaa
caggatggtg 540ctggcctcta ctaccgccaa ggccatggag cagatggccg gcagcagtga
gcaggccgcc 600gaggccatgg aggtagcctc acaggccagg cagatggtgc aggccatgcg
aaccatcggc 660actcacccct ccagctctgc cggcctgaag aacgacctgc tggagaacct
gcaggcctat 720cagaagagaa tgggcgtaca gatgcagagg ttcaag
75629294DNAArtificial sequenceHuman Codon Optimized Influenza
A Virus M2 Protein 29atgagtcttc taaccgaggt cgaaacgcct atcagaaacg
aatgggggtg cagatgcaac 60ggttcaagtg atcctctcgc tattgccgca aatatcattg
ggatcttgca cttgacattg 120tggattcttg atcgtctttt tttcaaatgc atttaccgtc
gctttaaata cggactgaaa 180ggagggcctt ctacggaagg agtgccaaag tctatgaggg
aagaatatcg aaaggaacag 240cagagtgctg tggatgctga cgatggtcat tttgtcagca
tagagctgga gtaa 29430294DNAArtificial sequenceHuman Codon
Optimized Influenza A Virus M2 Protein 30atgagcctgc tgaccgaggt
ggagaccccc atccggaacg agtggggctg ccggtgcaac 60ggcagcagcg accccctggc
catcgccgcc aacatcatcg gcatcctgca cctgaccctg 120tggatcctgg accggctgtt
cttcaagtgc atctaccggc ggttcaagta cggcctgaag 180ggcggcccca gcaccgaggg
cgtgcccaag agcatgcggg aggagtaccg gaaggagcag 240cagagcgccg tggacgccga
cgacggccac ttcgtgagca tcgagctgga gtga 29431294DNAArtificial
sequenceHuman Codon-Optimized Influenza A Virus M2 Protein
31atgtctctgc tgacagaggt ggagacaccc ataaggaacg agtggggctg caggtgcaac
60ggctctagtg atcccctggc catcgccgcc aacatcattg gcatactgca tctgaccctg
120tggatcctgg atagactgtt ctttaagtgc atttacagac gatttaagta tggcctgaag
180ggcggcccct caactgaggg cgtgcccaag agtatgagag aggagtaccg gaaggagcag
240cagagcgccg ttgacgccga tgacggccac ttcgtctcca tcgagctgga gtga
294321566DNAArtificial sequenceHuman Codon Optimized Coding Region
Encoding eM2NP 32atgagccttc tcacagaagt ggaaacacct atcagaaatg
aatggggatg cagatgcaat 60gggtcgagtg atatggcctc tcaaggtacg aaaagaagct
acgagcaaat ggaaacggat 120ggagaaagac aaaacgcgac cgaaatcaga gcatccgtcg
ggaagatgat tggaggaatc 180ggacgattct acatccagat gtgcacagag ctaaagctat
cggattatga agggagacta 240atacaaaata gcctaactat cgagagaatg gtgctgtctg
catttgacga aaggagaaac 300aaatacctgg aagaacaccc ctctgcaggg aaagacccaa
aaaaaactgg aggtccgata 360taccggagag tcaacggtaa atggatgaga gagctgatct
tgtatgataa ggaagaaata 420agacgcatct ggcggcaagc taataatgga gacgacgcta
ctgcagggct cacgcatatg 480atgatctggc actctaattt gaatgatgca acgtaccaaa
gaacccgcgc acttgtgcgg 540accggaatgg accctcgtat gtgcagcctt atgcaggggt
ccacactgcc cagaaggtcc 600ggagcagctg gagcagcagt aaagggggtt ggaaccatgg
tgatggagct ggtgagaatg 660attaagaggg ggatcaatga caggaacttc tggcgaggag
aaaacgggag aaaaactagg 720atagcatatg agaggatgtg taacatcctc aaaggaaaat
tccaaaccgc tgctcagaaa 780gcaatgatgg atcaagtacg cgaaagtaga aatcctggaa
atgcagagtt tgaagatctc 840actttcctcg cgcgaagcgc tctcatcctc agagggagtg
tcgctcataa aagttgcctg 900cctgcctgcg tatatggtcc tgccgtggca agtggatacg
actttgagag agaggggtac 960tctcttgttg gaatagatcc attcagatta cttcagaatt
cccaggtgta cagtttaata 1020aggccaaacg aaaatcctgc acacaaatca caacttgttt
ggatggcatg ccatagtgcc 1080gcattcgaag atctaagagt tctctctttc atcaaaggta
caaaggtcct tccaagggga 1140aaactctcta ccagaggggt acaaatagct tcaaatgaga
acatggagac aatggaatct 1200agcacattgg aattgagaag taggtattgg gccattagaa
ccaggagtgg aggcaatact 1260aatcaacagc gggcttctgc cggtcaaatt agcatacaac
ctactttttc agtgcaacgg 1320aatctccctt ttgataggac aactgtcatg gcggcattct
ctggaaatac cgaaggaagg 1380acttccgata tgaggactga gatcattagg atgatggaaa
gtgcccgacc tgaagacgtc 1440agttttcaag gaagaggtgt gttcgaactc tctgacgaaa
aggcagctag cccaatcgtt 1500ccttcttttg atatgtcaaa tgaaggatcc tacttcttcg
gcgataatgc ggaggaatat 1560gacaac
1566331566DNAArtificial sequenceHuman Codon
Optimized Coding Region Encoding eM2NP 33atgagcctgc tgaccgaggt
ggagaccccc atcaggaacg agtggggctg caggtgcaac 60ggcagcagcg acatggccag
ccagggcacc aagaggagct acgagcagat ggagaccgac 120ggcgagaggc agaacgccac
cgagatcagg gccagcgtgg gcaagatgat cggcggcatc 180ggcaggttct acatccagat
gtgcaccgag ctgaagctga gcgactacga gggcaggctg 240atccagaaca gcctgaccat
cgagaggatg gtgctgagcg ccttcgacga gaggaggaac 300aagtacctgg aggagcaccc
cagcgccggc aaggacccca agaagaccgg cggccccatc 360tacaggaggg tgaacggcaa
gtggatgagg gagctgatcc tgtacgacaa ggaggagatc 420aggaggatct ggaggcaggc
caacaacggc gacgacgcca ccgccggcct gacccacatg 480atgatctggc acagcaacct
gaacgacgcc acctaccaga ggaccagggc cctggtgagg 540accggcatgg accccaggat
gtgcagcctg atgcagggca gcaccctgcc caggaggagc 600ggcgccgccg gcgccgccgt
gaagggcgtg ggcaccatgg tgatggagct ggtgaggatg 660atcaagaggg gcatcaacga
caggaacttc tggaggggcg agaacggcag gaagaccagg 720atcgcctacg agaggatgtg
caacatcctg aagggcaagt tccagaccgc cgcccagaag 780gccatgatgg accaggtgag
ggagagcagg aaccccggca acgccgagtt cgaggacctg 840accttcctgg ccaggagcgc
cctgatcctg aggggcagcg tggcccacaa gagctgcctg 900cccgcctgcg tgtacggccc
cgccgtggcc agcggctacg acttcgagag ggagggctac 960agcctggtgg gcatcgaccc
cttcaggctg ctgcagaaca gccaggtgta cagcctgatc 1020aggcccaacg agaaccccgc
ccacaagagc cagctggtgt ggatggcctg ccacagcgcc 1080gccttcgagg acctgagggt
gctgagcttc atcaagggca ccaaggtgct gcccaggggc 1140aagctgagca ccaggggcgt
gcagatcgcc agcaacgaga acatggagac catggagagc 1200agcaccctgg agctgaggag
caggtactgg gccatcagga ccaggagcgg cggcaacacc 1260aaccagcaga gggccagcgc
cggccagatc agcatccagc ccaccttcag cgtgcagagg 1320aacctgccct tcgacaggac
caccgtgatg gccgccttca gcggcaacac cgagggcagg 1380accagcgaca tgaggaccga
gatcatcagg atgatggaga gcgccaggcc cgaggacgtg 1440agcttccagg gcaggggcgt
gttcgagctg agcgacgaga aggccgccag ccccatcgtg 1500cccagcttcg acatgagcaa
cgagggcagc tacttcttcg gcgacaacgc cgaggagtac 1560gacaac
1566341566DNAArtificial
sequenceHuman Codon Optimized Coding Region Encoding NPeM2
34atggcaagcc agggcacaaa acgcagttac gagcagatgg agactgatgg tgagaggcag
60aacgccaccg aaatccgggc ctccgtcggc aagatgattg gtggcatcgg aagattctat
120atccagatgt gcacggagct taagctgtcc gattacgagg ggcgcttaat acagaactct
180ctgactatcg agcgaatggt cttgagcgcc tttgatgagc ggcgtaataa gtatctcgaa
240gagcaccctt ctgctggaaa agaccccaaa aagaccgggg gacctatcta ccgacgtgtg
300aacggaaaat ggatgcgcga actgatactg tacgacaagg aggagatccg taggatctgg
360agacaggcta ataacggaga tgatgccaca gctgggctga cccatatgat gatatggcat
420agcaacctga acgacgcaac ctatcaacgc actagagcac tcgtgaggac cggtatggac
480ccacgcatgt gctcattgat gcaaggtagc acattgcctc ggaggtcagg cgccgccggt
540gccgccgtaa agggggtggg cacaatggtg atggaactgg tccgaatgat caaaagaggc
600atcaatgaca ggaacttttg gcgcggagaa aacgggcgca agacccgcat tgcctacgag
660cgcatgtgta acattttaaa aggcaaattc cagactgcag cccagaaagc aatgatggac
720caagttagag aaagtagaaa tcccgggaat gccgagtttg aagacctgac tttcctggct
780agaagcgcct tgatcctgcg gggctctgtc gcccacaaga gctgcctccc cgcttgcgtt
840tacggccccg cggtcgcaag tggctacgat ttcgagaggg aggggtattc cctagttggg
900atcgatccct tccggctcct acagaattct caggtgtata gtctgattag acccaacgaa
960aacccggctc acaagagtca gcttgtttgg atggcatgtc actcagcagc tttcgaagac
1020ctgcgggtac tcagctttat taaaggcacc aaggtcctgc caagaggaaa gctctccacg
1080aggggagtac agatcgcctc aaacgagaac atggagacaa tggaaagctc cacccttgag
1140cttaggtcgc ggtattgggc tattagaaca cgatctgggg ggaataccaa tcagcaacga
1200gcgagtgctg gtcagatttc cattcagcct actttctctg tgcaacggaa tctaccattt
1260gacaggacaa ctgtgatggc agcgttctcc ggcaatacag aaggacgaac atcagacatg
1320aggaccgaaa ttatccggat gatggagagc gctcggccag aagatgtgtc gttccagggc
1380cggggcgtgt ttgagctcag cgacgagaag gccgcgtctc caattgtgcc ttcctttgat
1440atgagcaatg aggggtcata ctttttcgga gacaatgccg aagagtatga taatatgtct
1500ctgcttaccg aggtggaaac gccgatacgc aacgaatggg gttgtcgttg taacggctcc
1560agtgat
1566351566DNAArtificial sequenceHuman Codon Optimized Coding Region
Encoding NPeM2 35atggccagcc agggcaccaa gaggagctac gagcagatgg
agaccgacgg cgagaggcag 60aacgccaccg agatcagggc cagcgtgggc aagatgatcg
gcggcatcgg caggttctac 120atccagatgt gcaccgagct gaagctgagc gactacgagg
gcaggctgat ccagaacagc 180ctgaccatcg agaggatggt gctgagcgcc ttcgacgaga
ggaggaacaa gtacctggag 240gagcacccca gcgccggcaa ggaccccaag aagaccggcg
gccccatcta caggagggtg 300aacggcaagt ggatgaggga gctgatcctg tacgacaagg
aggagatcag gaggatctgg 360aggcaggcca acaacggcga cgacgccacc gccggcctga
cccacatgat gatctggcac 420agcaacctga acgacgccac ctaccagagg accagggccc
tggtgaggac cggcatggac 480cccaggatgt gcagcctgat gcagggcagc accctgccca
ggaggagcgg cgccgccggc 540gccgccgtga agggcgtggg caccatggtg atggagctgg
tgaggatgat caagaggggc 600atcaacgaca ggaacttctg gaggggcgag aacggcagga
agaccaggat cgcctacgag 660aggatgtgca acatcctgaa gggcaagttc cagaccgccg
cccagaaggc catgatggac 720caggtgaggg agagcaggaa ccccggcaac gccgagttcg
aggacctgac cttcctggcc 780aggagcgccc tgatcctgag gggcagcgtg gcccacaaga
gctgcctgcc cgcctgcgtg 840tacggccccg ccgtggccag cggctacgac ttcgagaggg
agggctacag cctggtgggc 900atcgacccct tcaggctgct gcagaacagc caggtgtaca
gcctgatcag gcccaacgag 960aaccccgccc acaagagcca gctggtgtgg atggcctgcc
acagcgccgc cttcgaggac 1020ctgagggtgc tgagcttcat caagggcacc aaggtgctgc
ccaggggcaa gctgagcacc 1080aggggcgtgc agatcgccag caacgagaac atggagacca
tggagagcag caccctggag 1140ctgaggagca ggtactgggc catcaggacc aggagcggcg
gcaacaccaa ccagcagagg 1200gccagcgccg gccagatcag catccagccc accttcagcg
tgcagaggaa cctgcccttc 1260gacaggacca ccgtgatggc cgccttcagc ggcaacaccg
agggcaggac cagcgacatg 1320aggaccgaga tcatcaggat gatggagagc gccaggcccg
aggacgtgag cttccagggc 1380aggggcgtgt tcgagctgag cgacgagaag gccgccagcc
ccatcgtgcc cagcttcgac 1440atgagcaacg agggcagcta cttcttcggc gacaacgccg
aggagtacga caacatgagc 1500ctgctgaccg aggtggagac ccccatcagg aacgagtggg
gctgcaggtg caacggcagc 1560agcgac
1566361683DNAArtificial sequenceHuman Codon
Optimized Coding Region Encoding IBV NP Protein 36atgtcgaaca
tggacatcga cagcattaac acaggtacta ttgacaaaac ccccgaagaa 60ctaacccctg
gaacctcagg agcaacacgc ccaataatca aaccggccac cctcgcgccc 120cctagcaata
agaggacccg caatccaagt cctgagagaa ccactacttc atctgaaacg 180gatatcggtc
ggaaaattca aaaaaagcag acgcccacag agataaagaa gtctgtttac 240aaaatggtgg
taaagctcgg tgagttttat aaccagatga tggtcaaggc ggggcttaac 300gacgatatgg
aacgaaatct tatacagaat gcacaggcag tagagagaat actgctggcc 360gctactgatg
acaagaaaac ggagtaccaa aaaaaacgga atgctcgaga tgtgaaagaa 420ggaaaagaag
aaattgacca taacaaaact ggggggacat tctataagat ggtgcgggac 480gataagacaa
tctattttag cccgataaag attaccttcc tgaaggagga ggttaaaaca 540atgtacaaga
cgacgatggg cagcgatggg ttttccggac ttaatcatat aatgattggt 600cactcgcaga
tgaacgatgt atgtttccag cgctccaagg gcttaaagag ggtaggtctt 660gacccgtctc
taatatcaac tttcgcagga tccactttgc cgaggcgttc tggcacgaca 720ggcgtggcta
tcaagggcgg ggggacgctg gtcgatgagg ccattcgctt tattggtagg 780gccatggccg
atagagggct tctacgagac atcaaagcaa aaacagcata tgagaagata 840ttattaaact
taaagaacaa atgctccgct cctcagcaaa aagcgctcgt tgaccaagta 900atcggttcga
gaaatccagg cattgccgat atcgaagatc ttacactctt ggcgcgaagc 960atggtcgttg
tccgtcccag tgtcgctagt aaggtggtac taccaatctc gatttacgca 1020aaaattccac
aactcggctt taatacagag gaatattcta tggtaggtta tgaagccatg 1080gcgttgtata
atatggctac accagtctcc atattgcgta tgggagatga cgcaaaagat 1140aagagtcaac
tctttttcat gtcatgtttc ggcgcagcgt acgaagatct gagagtacta 1200tccgccttga
ctggaacgga atttaaacca cggtcagcct taaagtgtaa gggttttcac 1260gtccctgcta
aggagcaagt tgagggaatg ggcgcggcac tgatgagtat aaaattacaa 1320ttttgggctc
caatgacgcg ttcgggaggg aatgaagttt ctggtgaggg agggagtgga 1380cagatatcat
gctcgcccgt gttcgcggtt gaacgtccga ttgctttgag taagcaggcg 1440gttaggcgga
tgttaagtat gaatgtggag ggccgcgatg ccgacgtcaa aggcaactta 1500ttaaaaatga
tgaacgacag catggcaaag aagactagtg ggaatgcttt tatagggaaa 1560aaaatgttcc
aaataagtga caaaaacaaa gtgaacccca tcgaaatacc tatcaagcaa 1620accatcccga
atttcttttt cggtcgagac accgcggagg actacgatga cctagattac 1680taa
1683371683DNAArtificial sequenceHuman Codon Optimized Coding Region
Encoding IBV NP Protein 37atgagcaaca tggacatcga cagcatcaac
accggcacca tcgacaagac ccccgaggag 60ctgacccccg gcaccagcgg cgccacccgg
cccatcatca agcccgccac cctggccccc 120cccagcaaca agcggacccg gaaccccagc
cccgagcgga ccaccaccag cagcgagacc 180gacatcggcc ggaagatcca gaagaagcag
acccccaccg agatcaagaa gagcgtgtac 240aagatggtgg tgaagctggg cgagttctac
aaccagatga tggtgaaggc cggcctgaac 300gacgacatgg agcggaacct gatccagaac
gcccaggccg tggagcggat cctgctggcc 360gccaccgacg acaagaagac cgagtaccag
aagaagcgga acgcccggga cgtgaaggag 420ggcaaggagg agatcgacca caacaagacc
ggcggcacct tctacaagat ggtgcgggac 480gacaagacca tctacttcag ccccatcaag
atcaccttcc tgaaggagga ggtgaagacc 540atgtacaaga ccaccatggg cagcgacggc
ttcagcggcc tgaaccacat catgatcggc 600cacagccaga tgaacgacgt gtgcttccag
cggagcaagg gcctgaagcg ggtgggcctg 660gaccccagcc tgatcagcac cttcgccggc
agcaccctgc cccggcggag cggcaccacc 720ggcgtggcca tcaagggcgg cggcaccctg
gtggacgagg ccatccggtt catcggccgg 780gccatggccg accggggcct gctgcgggac
atcaaggcca agaccgccta cgagaagatc 840ctgctgaacc tgaagaacaa gtgcagcgcc
ccccagcaga aggccctggt ggaccaggtg 900atcggcagcc ggaaccccgg catcgccgac
atcgaggacc tgaccctgct ggcccggagc 960atggtggtgg tgcggcccag cgtggccagc
aaggtggtgc tgcccatcag catctacgcc 1020aagatccccc agctgggctt caacaccgag
gagtacagca tggtgggcta cgaggccatg 1080gccctgtaca acatggccac ccccgtgagc
atcctgcgga tgggcgacga cgccaaggac 1140aagagccagc tgttcttcat gagctgcttc
ggcgccgcct acgaggacct gcgggtgctg 1200agcgccctga ccggcaccga gttcaagccc
cggagcgccc tgaagtgcaa gggcttccac 1260gtgcccgcca aggagcaggt ggagggcatg
ggcgccgccc tgatgagcat caagctgcag 1320ttctgggccc ccatgacccg gagcggcggc
aacgaggtga gcggcgaggg cggcagcggc 1380cagatcagct gcagccccgt gttcgccgtg
gagcggccca tcgccctgag caagcaggcc 1440gtgcggcgga tgctgagcat gaacgtggag
ggccgggacg ccgacgtgaa gggcaacctg 1500ctgaagatga tgaacgacag catggccaag
aagaccagcg gcaacgcctt catcggcaag 1560aagatgttcc agatcagcga caagaacaag
gtgaacccca tcgagatccc catcaagcag 1620accatcccca acttcttctt cggccgggac
accgccgagg actacgacga cctggactac 1680tga
1683381683DNAArtificial sequenceHuman
Codon Optimized Coding Region Encoding IBV NP Protein 38atgtctaaca
tggacatcga ctctataaac acaggcacga tcgataagac ccccgaggag 60ctgacacccg
gcacttcagg cgccaccaga cccataataa agcccgccac tctggccccc 120ccctctaaca
agaggacgag gaacccctct cccgagcgca ccacaacgag tagcgagacg 180gacatcggca
ggaagataca gaagaagcag actcccactg agattaagaa gtccgtgtat 240aagatggtgg
ttaagctggg cgagttttac aaccagatga tggtgaaggc cggcctgaac 300gatgacatgg
agaggaacct gatacagaac gcccaggccg tggagaggat tctgctggcc 360gccaccgatg
acaagaagac tgagtatcag aagaagagaa acgcccggga cgttaaggag 420ggcaaggagg
agatcgatca caacaagaca ggcggcactt tctataagat ggtccgtgat 480gacaagacaa
tctacttttc tcccatcaag atcacattcc tgaaggagga ggtaaagact 540atgtacaaga
caactatggg ctccgatggc ttcagtggcc tgaaccacat aatgataggc 600catagtcaga
tgaacgatgt gtgcttccag agaagcaagg gcctgaagag ggtcggcctg 660gatccctcgc
tgattagtac cttcgccggc agcactctgc ccagaagatc tggcactact 720ggcgtagcca
taaagggcgg cggcacactg gtagacgagg ccataaggtt tattggcaga 780gccatggccg
accgcggcct gctgagagat atcaaggcca agaccgccta cgagaagata 840ctgctgaacc
tgaagaacaa gtgctcagcc ccccagcaga aggccctggt ggatcaggtg 900atcggcagta
gaaaccccgg catcgccgac atcgaggatc tgactctgct ggccagaagc 960atggtagtcg
taagaccctc tgtggcctct aaggttgtgc tgcccatctc catctacgcc 1020aagattcccc
agctgggctt taacactgag gagtactcca tggtgggcta tgaggccatg 1080gccctgtata
acatggccac acccgtctct atcctgcgga tgggcgacga tgccaaggac 1140aagtctcagc
tgttttttat gagttgtttc ggcgccgcct atgaggatct gagagtcctg 1200tcagccctga
caggcactga gttcaagccc aggtccgccc tgaagtgcaa gggctttcat 1260gtgcccgcca
aggagcaggt ggagggcatg ggcgccgccc tgatgagcat caagctgcag 1320ttctgggccc
ccatgacccg gtctggcggc aacgaggtct cgggcgaggg cggcagtggc 1380cagataagtt
gcagccccgt ttttgccgtt gagagaccca tcgccctgtc taagcaggcc 1440gttagacgaa
tgctgagtat gaacgtcgag ggccgagacg ccgatgtgaa gggcaacctg 1500ctgaagatga
tgaacgattc catggccaag aagacaagcg gcaacgcctt cattggcaag 1560aagatgttcc
agataagcga taagaacaag gttaacccca tcgagattcc catcaagcag 1620accatcccca
acttcttctt cggcagggat accgccgagg attacgatga cctggactac 1680tga
168339552DNAHepatitis B virus 39atggacatcg acccttataa agaatttgga
gctactgtgg agttactctc gtttttgcct 60tctgacttct ttccttcagt acgagatctt
ctagataccg cctcagctct gtatcgggaa 120gccttagagt ctcctgagca ttgttcacct
caccatactg cactcaggca agcaattctt 180tgctgggggg aactaatgac tctagctacc
tgggtgggtg ttaatttgga agatccagcg 240tctagagacc tagtagtcag ttatgtcaac
actaatatgg gcctaaagtt caggcaactc 300ttgtggtttc acatttcttg tctcactttt
ggaagagaaa cagttataga gtatttggtg 360tctttcggag tgtggattcg cactcctcca
gcttatagac caccaaatgc ccctatccta 420tcaacacttc cggagactac tgttgttaga
cgacgaggca ggtcccctag aagaagaact 480ccctcgcctc gcagacgaag gtctcaatcg
ccgcgtcgca gaagatctca atctcgggaa 540tctcaatgtt ag
55240183PRTArtificial sequenceHepatitis
B Virus 40Met Asp Ile Asp Pro Tyr Lys Glu Phe Gly Ala Thr Val Glu Leu
Leu1 5 10 15Ser Phe Leu
Pro Ser Asp Phe Phe Pro Ser Val Arg Asp Leu Leu Asp 20
25 30Thr Ala Ser Ala Leu Tyr Arg Glu Ala Leu
Glu Ser Pro Glu His Cys 35 40
45Ser Pro His His Thr Ala Leu Arg Gln Ala Ile Leu Cys Trp Gly Glu 50
55 60Leu Met Thr Leu Ala Thr Trp Val Gly
Val Asn Leu Glu Asp Pro Ala65 70 75
80Ser Arg Asp Leu Val Val Ser Tyr Val Asn Thr Asn Met Gly
Leu Lys 85 90 95Phe Arg
Gln Leu Leu Trp Phe His Ile Ser Cys Leu Thr Phe Gly Arg 100
105 110Glu Thr Val Ile Glu Tyr Leu Val Ser
Phe Gly Val Trp Ile Arg Thr 115 120
125Pro Pro Ala Tyr Arg Pro Pro Asn Ala Pro Ile Leu Ser Thr Leu Pro
130 135 140Glu Thr Thr Val Val Arg Arg
Arg Gly Arg Ser Pro Arg Arg Arg Thr145 150
155 160Pro Ser Pro Arg Arg Arg Arg Ser Gln Ser Pro Arg
Arg Arg Arg Ser 165 170
175Gln Ser Arg Glu Ser Gln Cys 18041555DNAArtificial
sequenceSynthetic HBcAg 41atggatatcg atccttataa agaattcgga gctactgtgg
agttactctc gtttctcccg 60agtgacttct ttccttcagt acgagatctt ctggataccg
ccagcgcgct gtatcgggaa 120gccttggagt ctcctgagca ctgcagccct caccatactg
ccctcaggca agcaattctt 180tgctgggggg agctcatgac tctggccacg tgggtgggtg
ttaacttgga agatccagct 240agcagggacc tggtagtcag ttatgtcaac actaatatgg
gtttaaagtt caggcaactc 300ttgtggtttc acattagctg cctcactttc ggccgagaaa
cagttctaga atatttggtg 360tctttcggag tgtggatccg cactcctcca gcttataggc
ctccgaatgc ccctatcctg 420tcgacactcc cggagactac tgttgttaga cgtcgaggca
ggtcacctag aagaagaact 480ccttcgcctc gcaggcgaag gtctcaatcg ccgcggcgcc
gaagatctca atctcgggaa 540tctcaatgtt agtga
55542183PRTArtificial sequenceSynthetic HBcAg
42Met Asp Ile Asp Pro Tyr Lys Glu Phe Gly Ala Thr Val Glu Leu Leu1
5 10 15Ser Phe Leu Pro Ser Asp
Phe Phe Pro Ser Val Arg Asp Leu Leu Asp 20 25
30Thr Ala Ser Ala Leu Tyr Arg Glu Ala Leu Glu Ser Pro
Glu His Cys 35 40 45Ser Pro His
His Thr Ala Leu Arg Gln Ala Ile Leu Cys Trp Gly Glu 50
55 60Leu Met Thr Leu Ala Thr Trp Val Gly Val Asn Leu
Glu Asp Pro Ala65 70 75
80Ser Arg Asp Leu Val Val Ser Tyr Val Asn Thr Asn Met Gly Leu Lys
85 90 95Phe Arg Gln Leu Leu Trp
Phe His Ile Ser Cys Leu Thr Phe Gly Arg 100
105 110Glu Thr Val Leu Glu Tyr Leu Val Ser Phe Gly Val
Trp Ile Arg Thr 115 120 125Pro Pro
Ala Tyr Arg Pro Pro Asn Ala Pro Ile Leu Ser Thr Leu Pro 130
135 140Glu Thr Thr Val Val Arg Arg Arg Gly Arg Ser
Pro Arg Arg Arg Thr145 150 155
160Pro Ser Pro Arg Arg Arg Arg Ser Gln Ser Pro Arg Arg Arg Arg Ser
165 170 175Gln Ser Arg Glu
Ser Gln Cys 180432043DNAArtificial sequenceInfluenza A Virus
NP Gene Fused to Synthetic HBcAg 43atggcgtctc aaggcaccaa acgatcttac
gaacagatgg agactgatgg agaacgccag 60aatgccactg aaatcagagc atccgtcgga
aaaatgattg gtggaattgg acgattctac 120atccaaatgt gcaccgaact caaactcagt
gattatgagg gacggttgat ccaaaacagc 180ttaacaatag agagaatggt gctctctgct
tttgacgaaa ggagaaataa ataccttgaa 240gaacatccca gtgcggggaa agatcctaag
aaaactggag gacctatata caggagagta 300aacggaaagt ggatgagaga actcatcctt
tatgacaaag aagaaataag gcgaatctgg 360cgccaagcta ataatggtga cgatgcaacg
gctggtctga ctcacatgat gatctggcat 420tccaatttga atgatgcaac ttatcagagg
acaagagctc ttgttcgcac cggaatggat 480cccaggatgt gctctctgat gcaaggttca
actctcccta ggaggtctgg agccgcaggt 540gctgcagtca aaggagttgg aacaatggtg
atggaattgg tcagaatgat caaacgtggg 600atcaatgatc ggaacttctg gaggggtgag
aatggacgaa aaacaagaat tgcttatgaa 660agaatgtgca acattctcaa agggaaattt
caaactgctg cacaaaaagc aatgatggat 720caagtgagag agagccggaa cccagggaat
gctgagttcg aagatctcac ttttctagca 780cggtctgcac tcatattgag agggtcggtt
gctcacaagt cctgcctgcc tgcctgtgtg 840tatggacctg ccgtagccag tgggtacgac
tttgaaaggg agggatactc tctagtcgga 900atagaccctt tcagactgct tcaaaacagc
caagtgtaca gcctaatcag accaaatgag 960aatccagcac acaagagtca actggtgtgg
atggcatgcc attctgccgc atttgaagat 1020ctaagagtat taagcttcat caaagggacg
aaggtgctcc caagagggaa gctttccact 1080agaggagttc aaattgcttc caatgaaaat
atggagacta tggaatcaag tacacttgaa 1140ctgagaagca ggtactgggc cataaggacc
agaagtggag gaaacaccaa tcaacagagg 1200gcatctgcgg gccaaatcag catacaacct
acgttctcag tacagagaaa tctccctttt 1260gacagaacaa ccgttatggc agcattcagt
gggaatacag aggggagaac atctgacatg 1320aggaccgaaa tcataaggat gatggaaagt
gcaagaccag aagatgtgtc tttccagggg 1380cggggagtct tcgagctctc ggacgaaaag
gcagcgagcc cgatcgtgcc ttcctttgac 1440atgagtaatg aaggatctta tttcttcgga
gacaatgcag aggaatacga taatatggat 1500atcgatcctt ataaagaatt cggagctact
gtggagttac tctcgtttct cccgagtgac 1560ttctttcctt cagtacgaga tcttctggat
accgccagcg cgctgtatcg ggaagccttg 1620gagtctcctg agcactgcag ccctcaccat
actgccctca ggcaagcaat tctttgctgg 1680ggggagctca tgactctggc cacgtgggtg
ggtgttaact tggaagatcc agctagcagg 1740gacctggtag tcagttatgt caacactaat
atgggtttaa agttcaggca actcttgtgg 1800tttcacatta gctgcctcac tttcggccga
gaaacagttc tagaatattt ggtgtctttc 1860ggagtgtgga tccgcactcc tccagcttat
aggcctccga atgcccctat cctgtcgaca 1920ctcccggaga ctactgttgt tagacgtcga
ggcaggtcac ctagaagaag aactccttcg 1980cctcgcaggc gaaggtctca atcgccgcgg
cgccgaagat ctcaatctcg ggaatctcaa 2040tgt
2043442230DNAArtificial
sequenceInfluenza B Virus NP Gene Fused to Synthetic HBcAg
44atgtccaaca tggatattga cagtataaat accggaacaa tcgataaaac accagaagaa
60ctgactcccg gaaccagtgg ggcaaccaga ccaatcatca agccagcaac ccttgctccg
120ccaagcaaca aacgaacccg aaatccatct ccagaaagga caaccacaag cagtgaaacc
180gatatcggaa ggaaaatcca aaagaaacaa accccaacag agataaagaa gagcgtctac
240aaaatggtgg taaaactggg tgaattctac aaccagatga tggtcaaagc tggacttaat
300gatgacatgg aaaggaatct aattcaaaat gcacaagctg tggagagaat cctattggct
360gcaactgatg acaagaaaac tgaataccaa aagaaaagga atgccagaga tgtcaaagaa
420gggaaggaag aaatagacca caacaagaca ggaggcacct tttataagat ggtaagagat
480gataaaacca tctacttcag ccctataaaa attacctttt taaaagaaga ggtgaaaaca
540atgtacaaga ccaccatggg gagtgatggt ttcagtggac taaatcacat tatgattgga
600cattcacaga tgaacgatgt ctgtttccaa agatcaaagg gactgaaaag ggttggactt
660gacccttcat taatcagtac ttttgccgga agcacactac ccagaagatc aggtacaact
720ggtgttgcaa tcaaaggagg tggaacttta gtggatgaag ccatccgatt tataggaaga
780gcaatggcag acagagggct actgagagac atcaaggcca agacggccta tgaaaagatt
840cttctgaatc tgaaaaacaa gtgctctgcg ccgcaacaaa aggctctagt tgatcaagtg
900atcggaagta ggaacccagg gattgcagac atagaagacc taactctgct tgccagaagc
960atggtagttg tcagaccctc tgtagcgagc aaagtggtgc ttcccataag catttatgct
1020aaaatacctc aactaggatt caataccgaa gaatactcta tggttgggta tgaagccatg
1080gctctttata atatggcaac acctgtttcc atattaagaa tgggagatga cgcaaaagat
1140aaatctcaac tattcttcat gtcgtgcttc ggagctgcct atgaagatct aagagtgtta
1200tctgcactaa cgggcaccga atttaagcct agatcagcac taaaatgcaa gggtttccat
1260gtcccggcta aggagcaagt agaaggaatg ggggcagctc tgatgtccat caagcttcag
1320ttctgggccc caatgaccag atctggaggg aatgaagtaa gtggagaagg agggtctggt
1380caaataagtt gcagccctgt gtttgcagta gaaagaccta ttgctctaag caagcaagct
1440gtaagaagaa tgctgtcaat gaacgttgaa ggacgtgatg cagatgtcaa aggaaatcta
1500ctcaaaatga tgaatgattc aatggcaaag aaaaccagtg gaaatgcttt cattgggaag
1560aaaatgtttc aaatatcaga caaaaacaaa gtcaatccca ttgagattcc aattaagcag
1620accatcccca atttcttctt tgggagggac acagcagagg attatgatga cctcgattat
1680atggatatcg atccttataa agaattcgga gctactgtgg agttactctc gtttctcccg
1740agtgacttct ttccttcagt acgagatctt ctggataccg ccagcgcgct gtatcgggaa
1800gccttggagt ctcctgagca ctgcagccct caccatactg ccctcaggca agcaattctt
1860tgctgggggg agctcatgac tctggccacg tgggtgggtg ttaacttgga agatccagct
1920agcagggacc tggtagtcag ttatgtcaac actaatatgg gtttaaagtt caggcaactc
1980ttgtggtttc acattagctg cctcactttc ggccgagaaa cagttctaga atatttggtg
2040tctttcggag tgtggatccg cactcctcca gcttataggc ctccgaatgc ccctatcctg
2100tcgacactcc cggagactac tgttgttaga cgtcgaggca ggtcacctag aagaagaact
2160ccttcgcctc gcaggcgaag gtctcaatcg ccgcggcgcc gaagatctca atctcgggaa
2220tctcaatgtt
2230451305DNAArtificial sequenceInfluenza A Virus M1 Fused to Synthetic
HBcAg 45atgagtcttc taaccgaggt cgaaacgtac gtactctcta tcatcccgtc aggccccctc
60aaagccgaga tcgcacagag acttgaagat gtctttgcag ggaagaacac tgatcttgag
120gttctcatgg aatggctaaa gacaagacca atcctgtcac ctctgactaa ggggatttta
180ggatttgtgt tcacgctcac cgtgcccagt gagcgaggac tgcagcgtag acgctttgtc
240caaaatgccc ttaatgggaa cggggatcca aataacatgg acaaagcagt taaactgtat
300aggaagctca agagggagat aacattccat ggggccaaag aaatctcact cagttattct
360gctggtgcac ttgccagttg tatgggcctc atatacaaca ggatgggggc tgtgaccact
420gaagtggcat ttggcctggt atgtgcaacc tgtgaacaga ttgctgactc ccagcatcgg
480tctcataggc aaatggtgac aacaaccaat ccactaatca gacatgagaa cagaatggtt
540ttagccagca ctacagctaa ggctatggag caaatggctg gatcgagtga gcaagcagca
600gaggccatgg aggttgctag tcaggctaga caaatggtgc aagcgatgag aaccattggg
660actcatccta gctccagtgc tggtctgaaa aatgatcttc ttgaaaattt gcaggcctat
720cagaaacgaa tgggggtgca gatgcaacgg ttcaagatgg atatcgatcc ttataaagaa
780ttcggagcta ctgtggagtt actctcgttt ctcccgagtg acttctttcc ttcagtacga
840gatcttctgg ataccgccag cgcgctgtat cgggaagcct tggagtctcc tgagcactgc
900agccctcacc atactgccct caggcaagca attctttgct ggggggagct catgactctg
960gccacgtggg tgggtgttaa cttggaagat ccagctagca gggacctggt agtcagttat
1020gtcaacacta atatgggttt aaagttcagg caactcttgt ggtttcacat tagctgcctc
1080actttcggcc gagaaacagt tctagaatat ttggtgtctt tcggagtgtg gatccgcact
1140cctccagctt ataggcctcc gaatgcccct atcctgtcga cactcccgga gactactgtt
1200gttagacgtc gaggcaggtc acctagaaga agaactcctt cgcctcgcag gcgaaggtct
1260caatcgccgc ggcgccgaag atctcaatct cgggaatctc aatgt
1305461581DNAArtificial sequenceOpen Reading Frame for TPANP from VR4700
46atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt
60tcgcccagcg ctagaggatc gggaatggcg tcccaaggca ccaaacggtc ttacgaacag
120atggagactg atggagaacg ccagaatgcc actgaaatca gagcatccgt cggaaaaatg
180attggtggaa ttggacgatt ctacatccaa atgtgcaccg aactcaaact cagtgattat
240gagggacggt tgatccaaaa cagcttaaca atagagagaa tggtgctctc tgcttttgac
300gaaaggagaa ataaatacct ggaagaacat cccagtgcgg ggaaagatcc taagaaaact
360ggaggaccta tatacaggag agtaaacgga aagtggatga gagaactcat cctttatgac
420aaagaagaaa taaggcgaat ctggcgccaa gctaataatg gtgacgatgc aacggctggt
480ctgactcaca tgatgatctg gcattccaat ttgaatgatg caacttatca gaggacaaga
540gctcttgttc gcaccggaat ggatcccagg atgtgctctc tgatgcaagg ttcaactctc
600cctaggaggt ctggagccgc aggtgctgca gtcaaaggag ttggaacaat ggtgatggaa
660ttggtcagga tgatcaaacg tgggatcaat gatcggaact tctggagggg tgagaatgga
720cgaaaaacaa gaattgctta tgaaagaatg tgcaacattc tcaaagggaa atttcaaact
780gctgcacaaa aagcaatgat ggatcaagtg agagagagcc ggaacccagg gaatgctgag
840ttcgaagatc tcacttttct agcacggtct gcactcatat tgagagggtc ggttgctcac
900aagtcctgcc tgcctgcctg tgtgtatgga cctgccgtag ccagtgggta cgactttgaa
960agagagggat actctctagt cggaatagac cctttcagac tgcttcaaaa cagccaagtg
1020tacagcctaa tcagaccaaa tgagaatcca gcacacaaga gtcaactggt gtggatggca
1080tgccattctg ccgcatttga agatctaaga gtattaagct tcatcaaagg gacgaaggtg
1140ctcccaagag ggaagctttc cactagagga gttcaaattg cttccaatga aaatatggag
1200actatggaat caagtacact tgaactgaga agcaggtact gggccataag gaccagaagt
1260ggaggaaaca ccaatcaaca gagggcatct gcgggccaaa tcagcataca acctacgttc
1320tcagtacaga gaaatctccc ttttgacaga acaaccatta tggcagcatt caatgggaat
1380acagagggaa gaacatctga catgaggacc gaaatcataa ggatgatgga aagtgcaaga
1440ccagaagatg tgtctttcca ggggcgggga gtcttcgagc tctcggacga aaaggcagcg
1500agcccgatcg tgccttcctt tgacatgagt aatgaaggat cttatttctt cggagacaat
1560gcagatgagt acgacaatta a
158147333DNAArtificial sequenceOpen Reading Frame for TPAM2 DeltaTM from
VR4707 47atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc
agtcttcgtt 60tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac
ccctatcaga 120aacgaatggg ggtgcagatg caacgattca agtgatcctg gcggcggcga
tcggcttttt 180ttcaaatgca tttatcggcg ctttaaatac ggcttgaaaa gagggccttc
taccgaagga 240gtgccagagt ctatgaggga agaatatcgg aaggaacagc agaatgctgt
ggatgttgac 300gatagccatt ttgtcagcat cgagctggag taa
3334824DNAArtificial sequencePrimer Used to Amplify TPAM2
Fragment 48gccgaatcca tggatgcaat gaag
244936DNAArtificial sequencePrimer Used to Amplify TPAM2 Fragment
49ggtgccttgg gacgccatat cacttgaatc gttgca
365036DNAArtificial sequencePrimer Used to Amplify NP Gene 50tgcaacgatt
caagtgatat ggcgtcccaa ggcacc
365124DNAArtificial sequencePrimer Used to Amplify NP Gene 51gccgtcgact
taattgtcgt actc
24521653DNAArtificial sequenceOpen Reading Frame for TPAM2NP from VR4710
52atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt
60tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac ccctatcaga
120aacgaatggg ggtgcagatg caacgattca agtgatatgg cgtcccaagg caccaaacgg
180tcttacgaac agatggagac tgatggagaa cgccagaatg ccactgaaat cagagcatcc
240gtcggaaaaa tgattggtgg aattggacga ttctacatcc aaatgtgcac cgaactcaaa
300ctcagtgatt atgagggacg gttgatccaa aacagcttaa caatagagag aatggtgctc
360tctgcttttg acgaaaggag aaataaatac ctggaagaac atcccagtgc ggggaaagat
420cctaagaaaa ctggaggacc tatatacagg agagtaaacg gaaagtggat gagagaactc
480atcctttatg acaaagaaga aataaggcga atctggcgcc aagctaataa tggtgacgat
540gcaacggctg gtctgactca catgatgatc tggcattcca atttgaatga tgcaacttat
600cagaggacaa gagctcttgt tcgcaccgga atggatccca ggatgtgctc tctgatgcaa
660ggttcaactc tccctaggag gtctggagcc gcaggtgctg cagtcaaagg agttggaaca
720atggtgatgg aattggtcag gatgatcaaa cgtgggatca atgatcggaa cttctggagg
780ggtgagaatg gacgaaaaac aagaattgct tatgaaagaa tgtgcaacat tctcaaaggg
840aaatttcaaa ctgctgcaca aaaagcaatg atggatcaag tgagagagag ccggaaccca
900gggaatgctg agttcgaaga tctcactttt ctagcacggt ctgcactcat attgagaggg
960tcggttgctc acaagtcctg cctgcctgcc tgtgtgtatg gacctgccgt agccagtggg
1020tacgactttg aaagagaggg atactctcta gtcggaatag accctttcag actgcttcaa
1080aacagccaag tgtacagcct aatcagacca aatgagaatc cagcacacaa gagtcaactg
1140gtgtggatgg catgccattc tgccgcattt gaagatctaa gagtattaag cttcatcaaa
1200gggacgaagg tgctcccaag agggaagctt tccactagag gagttcaaat tgcttccaat
1260gaaaatatgg agactatgga atcaagtaca cttgaactga gaagcaggta ctgggccata
1320aggaccagaa gtggaggaaa caccaatcaa cagagggcat ctgcgggcca aatcagcata
1380caacctacgt tctcagtaca gagaaatctc ccttttgaca gaacaaccat tatggcagca
1440ttcaatggga atacagaggg aagaacatct gacatgagga ccgaaatcat aaggatgatg
1500gaaagtgcaa gaccagaaga tgtgtctttc caggggcggg gagtcttcga gctctcggac
1560gaaaaggcag cgagcccgat cgtgccttcc tttgacatga gtaatgaagg atcttatttc
1620ttcggagaca atgcagatga gtacgacaat taa
16535335DNAArtificial sequencePrimer Used to Amplify the HA Gene
53gggctagcgc cgccaccatg aagaccatca ttgct
355426DNAArtificial sequencePrimer Used to Amplify the HA Gene
54ccgtcgactc aaatgcaaat gttgca
26551701DNAArtificial sequenceOpen Reading Frame for HA H3N2 from VR4750
55atgaagacca tcattgcttt gagctacatt ttctgtctgg ctctcggcca agaccttcca
60ggaaatgaca acaacacagc aacgctgtgc ctgggacatc atgcggtgcc aaacggaaca
120ctagtgaaaa caatcacaga tgatcagatt gaagtgacta atgctactga gctagttcag
180agctcctcaa cggggaaaat atgcaacaat cctcatcgaa tccttgatgg aatagactgc
240acactgatag atgctctatt gggggaccct cattgtgatg tttttcaaaa tgagacatgg
300gaccttttcg ttgaacgcag caaagctttc agcaactgtt acccttatga tgtgccagat
360tatgcccccc ttaggtcact agttgcctcg tcaggcactc tggagtttat cactgagggt
420ttcacttgga ctggggtcac tcagaatggg ggaagcagtg cttgcaaaag gggacctggt
480agcggttttt tcagtagact gaactggttg accaaatcag gaagcacata tccagtgctg
540aacgtgacta tgccaaacaa tgacaatttt gacaaactat acatttgggg ggttcaccac
600ccgagcacga accaagaaca aaccagcctg tatgttcaag catcagggag agtcacagtc
660tctaccagga gaagccagca aactataatc ccgaatatcg agtccagacc ctgggtaagg
720ggtctgtcta gtagaataag catctattgg acaatagtta agccgggaga cgtactggta
780attaatagta atgggaacct aatcgctcct cggggttatt tcaagatgcg cactgggaaa
840agctcaataa tgaggtcaga tgcacctatt gatacctgta tttctgaatg catcactcca
900aatggaagca ttcccaatga caagcccttt caaaacgtaa acaaaatcac gtatggagca
960tgccccaagt atgttaagca aaacaccctg aagttggcaa cagggatgcg gaatgtacca
1020gagaaacaaa ctagaggcct attcggcgca atagcaggtt tcatagaaaa tggttgggag
1080ggaatgatag acggttggta cggtttcagg catcaaaatt ctgagggcac aggacaagca
1140gcagatctta aaagcactca agcagccatc gaccaaatca atgggaaatt gaacaggata
1200atcaagaaga cgaacgagaa attccatcaa atcgaaaagg aattctcaga agtagaaggg
1260agaattcagg acctcgagaa atacgttgaa gacactaaaa tagatctctg gtcttacaat
1320gcggagcttc ttgtcgctct ggagaatcaa catacaattg acctgactga ctcggaaatg
1380aacaagctgt ttgaaaaaac aaggaggcaa ctgagggaaa atgctgaaga catgggcaat
1440ggttgcttca aaatatacca caaatgtgac aacgcttgca tagagtcaat cagaactggg
1500acttatgacc atgatgtata cagagacgaa gcattaaaca accggtttca gatcaaaggt
1560gttgaactga agtctggata caaagactgg atcctgtgga tttcctttgc catatcatgc
1620tttttgcttt gtgttgtttt gctggggttc atcatgtggg cctgccagaa aggcaacatt
1680aggtgcaaca tttgcatttg a
17015635DNAArtificial sequencePrimer Used to Amplify the HA Gene
56gggctagcgc cgccaccatg aaggcaaacc tactg
355726DNAArtificial sequencePrimer Used to Amplify the HA Gene
57ccgtcgactc agatgcatat tctgca
26581701DNAArtificial sequenceOpen Reading Frame for HA H1N1 from VR4752
58atgaaggcaa acctactggt cctgttatgt gcacttgcag ctgcagatgc agacacaata
60tgtataggct accatgcgaa caattcaacc gacactgttg acacagtgct cgagaagaat
120gtgacagtga cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaga
180ttaaaaggaa tagccccact acaattgggg aaatgtaaca tcgccggatg gctcttggga
240aacccagaat gcgacccact gcttccagtg agatcatggt cctacattgt agaaacacca
300aactctgaga atggaatatg ttatccagga gatttcatcg actatgagga gctgagggag
360caattgagct cagtgtcatc attcgaaaga ttcgaaatat ttcccaaaga aagctcatgg
420cccaaccaca acacaaccaa aggagtaacg gcagcatgct cccatgcggg gaaaagcagt
480ttttacagaa atttgctatg gctgacggag aaggagggct catacccaaa gctgaaaaat
540tcttatgtga acaagaaagg gaaagaagtc cttgtactgt ggggtattca tcacccgtct
600aacagtaagg atcaacagaa tatctatcag aatgaaaatg cttatgtctc tgtagtgact
660tcaaattata acaggagatt taccccggaa atagcagaaa gacccaaagt aagagatcaa
720gctgggagga tgaactatta ctggaccttg ctaaaacccg gagacacaat aatatttgag
780gcaaatggaa atctaatagc accaaggtat gctttcgcac tgagtagagg ctttgggtcc
840ggcatcatca cctcaaacgc atcaatgcat gagtgtaaca cgaagtgtca aacacccctg
900ggagctataa acagcagtct ccctttccag aatatacacc cagtcacaat aggagagtgc
960ccaaaatacg tcaggagtgc caaattgagg atggttacag gactaaggaa cattccgtcc
1020attcaatcca gaggtctatt tggagccatt gccggtttta ttgaaggggg atggactgga
1080atgatagatg gatggtacgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg
1140gatcaaaaaa gcacacaaaa tgccattaac gggattacaa acaaggtgaa ctctgttatc
1200gagaaaatga acattcaatt cacagctgtg ggtaaagaat tcaacaaatt agaaaaaagg
1260atggaaaatt taaataaaaa agttgatgat ggatttctgg acatttggac atataatgca
1320gaattgttag ttctactgga aaatgaaagg actctggatt tccatgactc aaatgtgaag
1380aatctgtatg agaaagtaaa aagccaatta aagaataatg ccaaagaaat cggaaatgga
1440tgttttgagt tctaccacaa gtgtgacaat gaatgcatgg aaagtgtaag aaatgggact
1500tatgattatc ccaaatattc agaagagtca aagttgaaca gggaaaaggt agatggagtg
1560aaattggaat caatggggat ctatcagatt ctggcgatct actcaactgt cgccagttca
1620ctggtgcttt tggtctccct gggggcaatc agtttctgga tgtgttctaa tggatctttg
1680cagtgcagaa tatgcatctg a
1701591050DNAArtificial sequenceOpen Reading Frame for the M2M1 Fusion
from VR4755 59atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg
cagatgcaac 60gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca
cctgatcctg 120tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca
cggcctgaag 180agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag
aaaggagcag 240cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga
gatgtccctg 300ctgacagaag tggaaacata cgtgctgagc atcgtgccca gcggccccct
gaaggccgag 360atcgcccaga gactggagga cgtgttcgcc ggcaagaaca ccgacctgga
ggccctgatg 420gagtggctga agaccagacc catcctgagc cccctgacca agggcatcct
gggcttcgtg 480ttcaccctga ccgtgcccag cgagagaggc ctgcagagaa gaagattcgt
gcagaacgcc 540ctgaacggca acggcgaccc caacaacatg gaccgggccg tgaagctgta
ccggaagctg 600aagagagaga tcaccttcca cggcgccaag gagatcgccc tgagctacag
cgccggcgcc 660ctggccagct gcatgggcct gatctacaac agaatgggcg ccgtgaccac
cgaggtggcc 720ttcggcctgg tgtgcgccac ctgcgagcag atcgccgaca gccagcacag
aagccacaga 780cagatggtgg ccaccaccaa ccccctgatc agacacgaga acagaatggt
gctggccagc 840accaccgcca aggccatgga gcagatggcc ggcagcagcg agcaggccgc
cgaggccatg 900gagatcgcca gccaggccag acagatggtg caggccatga gagccatcgg
cacccacccc 960agcagcagcg ccggcctgaa ggacgacctg ctggagaacc tgcagaccta
ccagaagaga 1020atgggcgtgc agatgcagag attcaagtga
105060982DNAArtificial sequenceOpen Reading Frame for Fragment
7 from VR4756 60atgagccttc taaccgaggt cgaaacgtat gttctctcta tcgttccatc
aggccccctc 60aaagccgaaa tcgcgcagag acttgaagat gtctttgctg ggaaaaacac
agatcttgag 120gctctcatgg aatggctaaa gacaagacca atcctgtcac ctctgactaa
ggggattttg 180gggtttgtgt tcacgctcac cgtgcccagt gagcgaggac tgcagcgtag
acgctttgtc 240caaaatgccc tcaatgggaa tggggatcca aataacatgg acagagcagt
taaactatat 300agaaaactta agagggagat tacattccat ggggccaaag aaatagcact
cagttattct 360gctggtgcac ttgccagttg catgggcctc atatacaaca gaatgggggc
tgtaaccact 420gaagtggcct ttggcctggt atgtgcaaca tgtgaacaga ttgctgactc
ccagcacagg 480tctcataggc aaatggtggc aacaaccaat ccattaataa ggcatgagaa
cagaatggtt 540ttggccagca ctacagctaa ggctatggag caaatggctg gatcaagtga
gcaggcagcg 600gaggccatgg aaattgctag tcaggccagg caaatggtgc aggcaatgag
agccattggg 660actcatccta gctccagtgc tggtctaaaa gatgatcttc ttgaaaattt
gcagacctat 720cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac ccgcttgttg
ttgctgcgag 780tatcattggg atcttgcact tgatattgtg gattcttgat cgtctttttt
tcaaatgcat 840ctatcgactc ttcaaacacg gtctgaaaag agggccttct acggaaggag
tacctgagtc 900tatgagggaa gaatatcgaa aggaacagca gaatgctgtg gatgctgacg
acagtcattt 960tgtcagcata gagctggagt aa
98261982DNAArtificial sequenceCodon Optimized Segment 7 from
VR4763 61atgagcctgc tgaccgaggt cgaaacgtat gttctctcta tcgtgcccag
cggccccctg 60aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac
cgacctggag 120gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa
gggcatcctg 180ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag
aagattcgtg 240cagaacgccc tgaacggcaa cggcgacccc aacaacatgg acagagccgt
gaagctgtac 300agaaagctga agagagagat caccttccac ggcgccaagg agatcgccct
gagctacagc 360gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc
cgtgaccacc 420gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag
ccagcacaga 480agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa
cagaatggtg 540ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga
gcaggccgcc 600gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag
agccatcggc 660acccacccca gcagcagcgc cggcctgaaa gatgatcttc ttgaaaattt
gcagacctat 720cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac cccctggtgg
tggccgccag 780catcatcggc atcctgcacc tgatcctgtg gatcctggac agactgttct
tcaagtgcat 840ctacagactg ttcaagcacg gcctgaagag aggccccagc accgagggcg
tgcccgagag 900catgagagag gagtacagaa aggagcagca gaacgccgtg gacgccgacg
acagccactt 960cgtgagcatc gagctggagt ga
982621569DNAArtificial sequenceOpen Reading Frame for eM2NP
Codon Optimized by Contract 62atgagcttgc tcactgaagt cgagacacca
atcagaaacg aatggggatg tagatgcaac 60gatagctcag acatggcctc ccagggaacc
aaaagaagct atgaacagat ggagactgac 120ggagagagac agaacgccac agagatcaga
gctagtgtag gaaagatgat agacggtatc 180gggcgatttt acattcaaat gtgtacggaa
ttgaaactca gcgactatga aggcagactt 240atccagaact cactcacaat tgagcgcatg
gtactcagtg catttgatga aagaaggaat 300aggtacctcg aagaacaccc cagcgccggc
aaagatccca agaagactgg cggcccaatt 360tacagaagag tggacggtaa gtggatgaga
gagctggtat tgtacgataa agaagaaatt 420agaagaatct ggaggcaagc aaacaatgga
gaggatgcta cagctggcct gacccacatg 480atgatttggc atagtaacct gaatgatacc
acctaccagc ggacaagggc tctcgttcga 540accgggatgg atccccgcat gtgctcattg
atgcagggta gtacactccc gaggaggtca 600ggcgcggccg gtgcagccgt gaaaggaatc
ggcactatgg taatggaatt gataagaatg 660attaaaaggg ggattaatga caggaacttt
tggagaggag aaaatggacg caaaacaagg 720agtgcgtatg aacggatgtg caatattttg
aaaggaaaat tccaaactgc agcacagcgc 780gccatgatgg atcaggtacg agaaagtcgc
aacccaggta atgctgaaat agaggacctt 840atatttctcg cccggagtgc tctcatactt
agaggaagcg tggcccataa aagttgtctc 900cccgcatgcg tatacggtcc cgctgtgtct
tccggatacg attttgaaaa agagggatat 960tcattggtgg gaatcgaccc ttttaagctg
cttcagaact cacaggttta cagtttgatt 1020agaccaaacg agaacccagc ccacaaatca
caactcgtgt ggatggcatg ccactctgcc 1080gctttcgaag atctgagact gctctcattt
attagaggca ctaaagtgag cccgagggga 1140aaactgagca cacgaggagt acagatagca
tctaacgaaa atatggataa tatgggatct 1200agcacactcg aattgaggtc acgatactgg
gctattagaa cacggagcgg agggaacacc 1260aaccagcaga gagcatccgc cggtcagata
agcgttcagc ctacattttc agtacaacga 1320aacctgccat ttgaaaagag tacagtgatg
gccgcattta ctggcaacac cgagggacga 1380acaagcgaca tgagagcaga gattattaga
atgatggaag gagctaaacc agaggaggtt 1440tcatttagag gaaggggagt cttcgaattg
tccgatgaga aagccacaaa tcccatagta 1500cctagcttcg acatgtccaa cgaaggctct
tacttttttg gtgacaatgc cgaagagtac 1560gacaattga
1569631569DNAArtificial sequenceOpen
Reading Frame for eM2NP Codon Optimized by Applicants 63atgagcctgc
tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 60gacagcagcg
acatggccag ccagggcacc aagagaagct acgagcagat ggagaccgac 120ggcgagagac
agaacgccac cgagatcaga gccagcgtgg gcaagatgat cgacggcatc 180ggcagattct
acatccagat gtgcaccgag ctgaagctga gcgactacga gggcagactg 240atccagaaca
gcctgaccat cgagagaatg gtgctgagcg ccttcgacga gagaagaaac 300agatacctgg
aggagcaccc cagcgccggc aaggacccca agaagaccgg cggccccatc 360tacagaagag
tggacggcaa gtggatgaga gagctggtgc tgtacgacaa ggaggagatc 420agaagaatct
ggagacaggc caacaacggc gaggacgcca ccgccggcct gacccacatg 480atgatctggc
acagcaacct gaacgacacc acctaccaga gaaccagagc cctggtgcgg 540accggcatgg
accccagaat gtgcagcctg atgcagggca gcaccctgcc cagaagaagc 600ggcgccgccg
gcgccgccgt gaagggcatc ggcaccatgg tgatggagct gatcagaatg 660atcaagagag
gcatcaacga cagaaacttc tggagaggcg agaacggcag aaagaccaga 720agcgcctacg
agagaatgtg caacatcctg aagggcaagt tccagaccgc cgcccagaga 780gccatgatgg
accaggtccg ggagagcaga aaccccggca acgccgagat cgaggacctg 840atcttcctgg
ccagaagcgc cctgatcctg agaggcagcg tggcccacaa gagctgcctg 900cccgcctgcg
tgtacggccc cgccgtgagc agcggctacg acttcgagaa ggagggctac 960agcctggtgg
gcatcgaccc cttcaagctg ctgcagaaca gccaggtgta cagcctgatc 1020agacccaacg
agaaccccgc ccacaagagc cagctggtgt ggatggcctg ccacagcgcc 1080gccttcgagg
acctgagact gctgagcttc atcagaggca ccaaggtgtc ccccagaggc 1140aagctgagca
ccagaggcgt gcagatcgcc agcaacgaga acatggacaa catgggcagc 1200agcaccctgg
agctgagaag cagatactgg gccatcagaa ccagaagcgg cggcaacacc 1260aaccagcaga
gagccagcgc cggccagatc agcgtgcagc ccaccttcag cgtgcagaga 1320aacctgccct
tcgagaagag caccgtgatg gccgccttca ccggcaacac cgagggcaga 1380accagcgaca
tgagagccga gatcatcaga atgatggagg gcgccaagcc cgaggaggtg 1440tccttcagag
gcagaggcgt gttcgagctg agcgacgaga aggccaccaa ccccatcgtg 1500cctagcttcg
acatgagcaa cgagggcagc tacttcttcg gcgacaacgc cgaggagtac 1560gacaactga
1569
SEQ ID NO: 64; length: 30; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify the M2 Gene
64 gccgaattcg ccaccatgag cctgctgacc
30
SEQ ID NO: 65; length: 33; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify the M2 Gene
65 gccgtcgact gatcactcca gctcgatgct cac
33
SEQ ID NO: 66; length: 294; type: DNA; organism: Artificial sequence; description: Open Reading Frame for M2 Gene from VR4759
66 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 60
gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 120
tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 180
agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 240
cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gtga 294
SEQ ID NO: 67; length: 36; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify M1 Gene from VR4755
67 gccgaattcg ccaccatgtc cctgctgaca gaagtg
36
SEQ ID NO: 68; length: 31; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify M1 Gene from VR4755
68 gccgtcgact gatcacttga atctctgcat c
31
SEQ ID NO: 69; length: 759; type: DNA; organism: Artificial sequence; description: Open Reading Frame for M1 Gene from VR4760
69 atgtccctgc tgacagaagt ggaaacatac gtgctgagca tcgtgcccag cggccccctg
60aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag
120gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg
180ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg
240cagaacgccc tgaacggcaa cggcgacccc aacaacatgg accgggccgt gaagctgtac
300cggaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc
360gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc
420gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga
480agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg
540ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc
600gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc
660acccacccca gcagcagcgc cggcctgaag gacgacctgc tggagaacct gcagacctac
720cagaagagaa tgggcgtgca gatgcagaga ttcaagtga
759
SEQ ID NO: 70; length: 38; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify NP Gene from VR4757
70 gccgaattcg ccaccatggc ctcccaggga accaaaag
38
SEQ ID NO: 71; length: 30; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify NP Gene from VR4757
71 gccgtcgact gatcaattgt cgtactcttc
30
SEQ ID NO: 72; length: 1497; type: DNA; organism: Artificial sequence; description: Open Reading Frame for NP Codon Optimized by Contract
72 atggcctccc agggaaccaa aagaagctat gaacagatgg agactgacgg
agagagacag 60aacgccacag agatcagagc tagtgtagga aagatgatag acggtatcgg
gcgattttac 120attcaaatgt gtacggaatt gaaactcagc gactatgaag gcagacttat
ccagaactca 180ctcacaattg agcgcatggt actcagtgca tttgatgaaa gaaggaatag
gtacctcgaa 240gaacacccca gcgccggcaa agatcccaag aagactggcg gcccaattta
cagaagagtg 300gacggtaagt ggatgagaga gctggtattg tacgataaag aagaaattag
aagaatctgg 360aggcaagcaa acaatggaga ggatgctaca gctggcctga cccacatgat
gatttggcat 420agtaacctga atgataccac ctaccagcgg acaagggctc tcgttcgaac
cgggatggat 480ccccgcatgt gctcattgat gcagggtagt acactcccga ggaggtcagg
cgcggccggt 540gcagccgtga aaggaatcgg cactatggta atggaattga taagaatgat
taaaaggggg 600attaatgaca ggaacttttg gagaggagaa aatggacgca aaacaaggag
tgcgtatgaa 660cggatgtgca atattttgaa aggaaaattc caaactgcag cacagcgcgc
catgatggat 720caggtacgag aaagtcgcaa cccaggtaat gctgaaatag aggaccttat
atttctcgcc 780cggagtgctc tcatacttag aggaagcgtg gcccataaaa gttgtctccc
cgcatgcgta 840tacggtcccg ctgtgtcttc cggatacgat tttgaaaaag agggatattc
attggtggga 900atcgaccctt ttaagctgct tcagaactca caggtttaca gtttgattag
accaaacgag 960aacccagccc acaaatcaca actcgtgtgg atggcatgcc actctgccgc
tttcgaagat 1020ctgagactgc tctcatttat tagaggcact aaagtgagcc cgaggggaaa
actgagcaca 1080cgaggagtac agatagcatc taacgaaaat atggataata tgggatctag
cacactcgaa 1140ttgaggtcac gatactgggc tattagaaca cggagcggag ggaacaccaa
ccagcagaga 1200gcatccgccg gtcagataag cgttcagcct acattttcag tacaacgaaa
cctgccattt 1260gaaaagagta cagtgatggc cgcatttact ggcaacaccg agggacgaac
aagcgacatg 1320agagcagaga ttattagaat gatggaagga gctaaaccag aggaggtttc
atttagagga 1380aggggagtct tcgaattgtc cgatgagaaa gccacaaatc ccatagtacc
tagcttcgac 1440atgtccaacg aaggctctta cttttttggt gacaatgccg aagagtacga
caattga 1497
SEQ ID NO: 73; length: 36; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify NP Gene from VR4758
73 gccgaattcg ccaccatggc cagccagggc accaag
36
SEQ ID NO: 74; length: 28; type: DNA; organism: Artificial sequence; description: Primer Used to Amplify NP Gene from VR4758
74 gccgtcgact gatcagttgt cgtactcc
28
SEQ ID NO: 75; length: 1497; type: DNA; organism: Artificial sequence; description: Open Reading Frame for NP Codon Optimized by Applicants from VR4762
75 atggccagcc agggcaccaa
gagaagctac gagcagatgg agaccgacgg cgagagacag 60aacgccaccg agatcagagc
cagcgtgggc aagatgatcg acggcatcgg cagattctac 120atccagatgt gcaccgagct
gaagctgagc gactacgagg gcagactgat ccagaacagc 180ctgaccatcg agagaatggt
gctgagcgcc ttcgacgaga gaagaaacag atacctggag 240gagcacccca gcgccggcaa
ggaccccaag aagaccggcg gccccatcta cagaagagtg 300gacggcaagt ggatgagaga
gctggtgctg tacgacaagg aggagatcag aagaatctgg 360agacaggcca acaacggcga
ggacgccacc gccggcctga cccacatgat gatctggcac 420agcaacctga acgacaccac
ctaccagaga accagagccc tggtgcggac cggcatggac 480cccagaatgt gcagcctgat
gcagggcagc accctgccca gaagaagcgg cgccgccggc 540gccgccgtga agggcatcgg
caccatggtg atggagctga tcagaatgat caagagaggc 600atcaacgaca gaaacttctg
gagaggcgag aacggcagaa agaccagaag cgcctacgag 660agaatgtgca acatcctgaa
gggcaagttc cagaccgccg cccagagagc catgatggac 720caggtccggg agagcagaaa
ccccggcaac gccgagatcg aggacctgat cttcctggcc 780agaagcgccc tgatcctgag
aggcagcgtg gcccacaaga gctgcctgcc cgcctgcgtg 840tacggccccg ccgtgagcag
cggctacgac ttcgagaagg agggctacag cctggtgggc 900atcgacccct tcaagctgct
gcagaacagc caggtgtaca gcctgatcag acccaacgag 960aaccccgccc acaagagcca
gctggtgtgg atggcctgcc acagcgccgc cttcgaggac 1020ctgagactgc tgagcttcat
cagaggcacc aaggtgtccc ccagaggcaa gctgagcacc 1080agaggcgtgc agatcgccag
caacgagaac atggacaaca tgggcagcag caccctggag 1140ctgagaagca gatactgggc
catcagaacc agaagcggcg gcaacaccaa ccagcagaga 1200gccagcgccg gccagatcag
cgtgcagccc accttcagcg tgcagagaaa cctgcccttc 1260gagaagagca ccgtgatggc
cgccttcacc ggcaacaccg agggcagaac cagcgacatg 1320agagccgaga tcatcagaat
gatggagggc gccaagcccg aggaggtgtc cttcagaggc 1380agaggcgtgt tcgagctgag
cgacgagaag gccaccaacc ccatcgtgcc tagcttcgac 1440atgagcaacg agggcagcta
cttcttcggc gacaacgccg aggagtacga caactga 1497
SEQ ID NO: 76; length: 498; type: PRT; organism: Artificial sequence; description: NP Consensus Sequence
76 Met Ala Ser Gln Gly Thr Lys Arg Ser Tyr Glu Gln Met Glu Thr Asp 16
Gly Glu Arg Gln Asn Ala Thr Glu Ile Arg Ala Ser Val Gly Lys Met 32
Ile Asp Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu Lys 48
Leu Ser Asp Tyr Glu Gly Arg Leu Ile Gln Asn Ser Leu Thr Ile Glu 64
Arg Met Val Leu Ser Ala Phe Asp Glu Arg Arg Asn Arg Tyr Leu Glu 80
Glu His Pro Ser Ala Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile 96
Tyr Arg Arg Val Asp Gly Lys Trp Met Arg Glu Leu Val Leu Tyr Asp 112
Lys Glu Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly Glu Asp 128
Ala Thr Ala Gly Leu Thr His Met Met Ile Trp His Ser Asn Leu Asn 144
Asp Thr Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg Thr Gly Met Asp 160
Pro Arg Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser 176
Gly Ala Ala Gly Ala Ala Val Lys Gly Ile Gly Thr Met Val Met Glu 192
Leu Ile Arg Met Ile Lys Arg Gly Ile Asn Asp Arg Asn Phe Trp Arg 208
Gly Glu Asn Gly Arg Lys Thr Arg Ser Ala Tyr Glu Arg Met Cys Asn 224
Ile Leu Lys Gly Lys Phe Gln Thr Ala Ala Gln Arg Ala Met Met Asp 240
Gln Val Arg Glu Ser Arg Asn Pro Gly Asn Ala Glu Ile Glu Asp Leu 256
Ile Phe Leu Ala Arg Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His 272
Lys Ser Cys Leu Pro Ala Cys Val Tyr Gly Pro Ala Val Ser Ser Gly 288
Tyr Asp Phe Glu Lys Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe 304
Lys Leu Leu Gln Asn Ser Gln Val Tyr Ser Leu Ile Arg Pro Asn Glu 320
Asn Pro Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser Ala 336
Ala Phe Glu Asp Leu Arg Leu Leu Ser Phe Ile Arg Gly Thr Lys Val 352
Ser Pro Arg Gly Lys Leu Ser Thr Arg Gly Val Gln Ile Ala Ser Asn 368
Glu Asn Met Asp Asn Met Gly Ser Ser Thr Leu Glu Leu Arg Ser Arg 384
Tyr Trp Ala Ile Arg Thr Arg Ser Gly Gly Asn Thr Asn Gln Gln Arg 400
Ala Ser Ala Gly Gln Ile Ser Val Gln Pro Thr Phe Ser Val Gln Arg 416
Asn Leu Pro Phe Glu Lys Ser Thr Val Met Ala Ala Phe Thr Gly Asn 432
Thr Glu Gly Arg Thr Ser Asp Met Arg Ala Glu Ile Ile Arg Met Met 448
Glu Gly Ala Lys Pro Glu Glu Val Ser Phe Arg Gly Arg Gly Val Phe 464
Glu Leu Ser Asp Glu Lys Ala Thr Asn Pro Ile Val Pro Ser Phe Asp 480
Met Ser Asn Glu Gly Ser Tyr Phe Phe Gly Asp Asn Ala Glu Glu Tyr 496
Asp Asn 498
SEQ ID NO: 77; length: 252; type: PRT; organism: Influenza A Virus
77 Met Ser Leu Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Val Pro 16
Ser Gly Pro Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe 32
Ala Gly Lys Asn Thr Asp Leu Glu Ala Leu Met Glu Trp Leu Lys Thr 48
Arg Pro Ile Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe 64
Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val 80
Gln Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Arg Ala 96
Val Lys Leu Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala 112
Lys Glu Ile Ala Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met 128
Gly Leu Ile Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe 144
Gly Leu Val Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg 160
Ser His Arg Gln Met Val Ala Thr Thr Asn Pro Leu Ile Arg His Glu 176
Asn Arg Met Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met 192
Ala Gly Ser Ser Glu Gln Ala Ala Glu Ala Met Glu Ile Ala Ser Gln 208
Ala Arg Gln Met Val Gln Ala Met Arg Ala Ile Gly Thr His Pro Ser 224
Ser Ser Ala Gly Leu Lys Asp Asp Leu Leu Glu Asn Leu Gln Thr Tyr 240
Gln Lys Arg Met Gly Val Gln Met Gln Arg Phe Lys 252
SEQ ID NO: 78; length: 97; type: PRT; organism: Influenza A Virus
78 Met Ser Leu Leu Thr Glu Val Glu Thr Pro Ile Arg Asn Glu Trp Gly 16
Cys Arg Cys Asn Asp Ser Ser Asp Pro Leu Val Val Ala Ala Ser Ile 32
Ile Gly Ile Leu His Leu Ile Leu Trp Ile Leu Asp Arg Leu Phe Phe 48
Lys Cys Ile Tyr Arg Leu Phe Lys His Gly Leu Lys Arg Gly Pro Ser 64
Thr Glu Gly Val Pro Glu Ser Met Arg Glu Glu Tyr Arg Lys Glu Gln 80
Gln Asn Ala Val Asp Ala Asp Asp Ser His Phe Val Ser Ile Glu Leu 96
Glu 97
SEQ ID NO: 79; length: 759; type: DNA; organism: Artificial sequence; description: Optimized M1 Coding Region
79 atgagcctgc
tgaccgaggt cgaaacgtat gttctctcta tcgtgcccag cggccccctg 60aaggccgaga
tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag 120gccctgatgg
agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 180ggcttcgtgt
tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg 240cagaacgccc
tgaacggcaa cggcgacccc aacaacatgg acagagccgt gaagctgtac 300agaaagctga
agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 360gccggcgccc
tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 420gaggtggcct
tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 480agccacagac
agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 540ctggccagca
ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 600gaggccatgg
agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 660acccacccca
gcagcagcgc cggcctgaaa gatgatcttc ttgaaaattt gcagacctat 720cagaaacgaa
tgggggtgca gatgcaacga ttcaagtga
759
SEQ ID NO: 80; length: 294; type: DNA; organism: Artificial sequence; description: Optimized M2 Coding Region
80 atgagcctgc
tgaccgaggt cgaaacacct atcagaaacg aatgggggtg cagatgcaac 60gattcaagtg
accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 120tggatcctgg
acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 180agaggcccca
gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 240cagaacgccg
tggacgccga cgacagccac ttcgtgagca tcgagctgga gtga
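294
SEQ ID NO: 81; length: 9; type: PRT; organism: Artificial sequence; description: H2Kd Binding Peptide
81 Thr Tyr Gln Arg Thr Arg Ala Leu Val 9
SEQ ID NO: 82; length: 11; type: DNA; organism: Artificial sequence; description: RSV Promoter from Plasmid VCL1005
82 tactctagac g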
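11
SEQ ID NO: 83; length: 11; type: DNA; organism: Artificial sequence; description: Promoter RSV/R
83 tacaataaac g 11
SEQ ID NO: 84; length: 27; type: DNA; organism: Artificial sequence; description: Primer RSVfor
84 catcagctgc tccctgcttg tgtgttg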
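27
SEQ ID NO: 85; length: 19; type: DNA; organism: Artificial sequence; description: Primer WNVpst rev
85 cgatatccga cgacggtga 19
SEQ ID NO: 86; length: 39; type: DNA; organism: Artificial sequence; description: Primer RSV HTLV5
86 caccacattg gtgtgcacct ccatcggctc gcatctctc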
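39
SEQ ID NO: 87; length: 42; type: DNA; organism: Artificial sequence; description: Primer HTLV RSVrev
87 aggtgcacac caatgtggtg aatggtcaaa tggcgtttat tg 42
SEQ ID NO: 88; length: 44; type: DNA; organism: Artificial sequence; description: Primer RSVrev
88 aatggtcaaa tggcgtttat tgtatcgagc taggcactta aata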
44
SEQ ID NO: 89; length: 6254; type: DNA; organism: Artificial sequence; description: VR-6430, RSV RWNV
89 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca
60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg
240ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta
300agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg
360ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt
420gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta
480gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc
540aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc
600cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg
660acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag
720ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg
780catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg
840cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt
900aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag
960ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag
1020tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac
1080taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgaattc
1140gccgccacca tgggcaagcg gagcgctggc tcaatcatgt ggctcgcgag cttggcagtt
1200gtcatagctt gtgcaggagc cgttaccctc tctaacttcc aagggaaggt gatgatgacg
1260gtaaatgcta ctgacgtcac agatgtcatc acgattccaa cagctgctgg aaagaaccta
1320tgcattgtca gagcaatgga tgtgggatac atgtgcgatg atactatcac ctatgaatgc
1380ccagtgctgt cggctggtaa tgatccagaa gacatcgact gttggtgcac aaagtcagca
1440gtctacgtca ggtatggaag atgcaccaag acacgccact caagacgcag tcggaggtca
1500ctgacagtgc agacacacgg agaaagcact ctagcgaaca agaagggggc ttggatggac
1560agcaccaagg ccacaaggta tttggtaaaa acagaatcat ggatcttgag gaaccctgga
1620tatgccctgg tggcagccgt cattggttgg atgcttggga gcaacaccat gcagagagtt
1680gtgtttgtcg tgctattgct tttggtggcc ccagcttaca gcttcaactg ccttggaatg
1740agcaacagag acttcttgga aggagtgtct ggagcaacat gggtggattt ggttctcgaa
1800ggcgatagct gcgtgactat catgtctaag gacaagccta ccatcgatgt gaagatgatg
1860aatatggagg cggccaacct ggcagaggtc cgcagttatt gctatttggc taccgtcagc
1920gatctctcca ccaaagctgc gtgcccgacc atgggggaag cccacaatga caaacgtgct
1980gacccagctt ttgtgtgcag acaaggagtg gtggacaggg gctggggcaa cggctgcgga
2040ctatttggca aaggaagcat tgacacatgc gccaaatttg cctgctctac caaggcaata
2100ggaagaacca tcttgaaaga gaatatcaag tacgaagtgg ccatttttgt ccatggacca
2160actactgtgg agtcgcacgg aaactactcc acacaggttg gagccactca ggcagggaga
2220ttcagcatca ctcctgcggc gccttcatac acactaaagc ttggagaata tggagaggtg
2280acagtggact gtgaaccacg gtcagggatt gacaccaatg catactacgt gatgactgtt
2340ggaacaaaga cgttcttggt ccatcgtgag tggttcatgg acctcaacct cccttggagc
2400agtgctggaa gtactgtgtg gaggaacaga gagacgttaa tggagtttga ggaaccacac
2460gccacgaagc agtctgtgat agcattgggc tcacaagagg gagctctgca tcaagctttg
2520gctggagcca ttcctgtgga attttcaagc aacactgtca agttgacgtc gggtcatttg
2580aagtgtagag tgaagatgga aaaattgcag ttgaagggaa caacctatgg cgtctgttca
2640aaggctttca agtttcttgg gactcccgca gacacaggtc acggcactgt ggtgttggaa
2700ttgcagtaca ctggcacgga tggaccttgc aaagttccta tctcgtcagt ggcttcattg
2760aacgacctaa cgccagtggg cagattggtc actgtcaacc cttttgtttc agtggccacg
2820gccaacgcta aggtcctgat tgaattggaa ccaccctttg gagactcata catagtggtg
2880ggcagaggag aacaacagat caatcaccat tggcacaagt ctggaagcag cattggcaaa
2940gcctttacaa ccaccctcaa aggagcgcag agactagccg ctctaggaga cacagcttgg
3000gactttggat cagttggagg ggtgttcacc tcagttggga aggctgtcca tcaagtgttc
3060ggaggagcat tccgctcact gttcggaggc atgtcctgga taacgcaagg attgctgggg
3120gctctcctgt tgtggatggg catcaatgct cgtgataggt ccatagctct cacgtttctc
3180gcagttggag gagttctgct cttcctctcc gtgaacgtgc acgcttgagg atccagatct
3240gctgtgcctt ctagttgcca gccatctgtt gtttgcccct cccccgtgcc ttccttgacc
3300ctggaaggtg ccactcccac tgtcctttcc taataaaatg aggaaattgc atcgcattgt
3360ctgagtaggt gtcattctat tctggggggt ggggtggggc aggacagcaa gggggaggat
3420tgggaagaca atagcaggca tgctggggat gcggtgggct ctatgggtac ccaggtgctg
3480aagaattgac ccggttcctc ctgggccaga aagaagcagg cacatcccct tctctgtgac
3540acaccctgtc cacgcccctg gttcttagtt ccagccccac tcataggaca ctcatagctc
3600aggagggctc cgccttcaat cccacccgct aaagtacttg gagcggtctc tccctccctc
3660atcagcccac caaaccaaac ctagcctcca agagtgggaa gaaattaaag caagataggc
3720tattaagtgc agagggagag aaaatgcctc caacatgtga ggaagtaatg agagaaatca
3780tagaatttta aggccatgat ttaaggccat catggcctta atcttccgct tcctcgctca
3840ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg
3900taatacggtt atccacagaa tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc
3960agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc
4020cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac ccgacaggac
4080tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc
4140tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata
4200gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc
4260acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca
4320acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag
4380cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta
4440gaagaacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg
4500gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt gtttgcaagc
4560agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt tctacggggt
4620ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa
4680ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc taaagtatat
4740atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct atctcagcga
4800tctgtctatt tcgttcatcc atagttgcct gactcggggg gggggggcgc tgaggtctgc
4860ctcgtgaaga aggtgttgct gactcatacc aggcctgaat cgccccatca tccagccaga
4920aagtgaggga gccacggttg atgagagctt tgttgtaggt ggaccagttg gtgattttga
4980acttttgctt tgccacggaa cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca
5040actcagcaaa agttcgattt attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct
5100ctgccagtgt tacaaccaat taaccaattc tgattagaaa aactcatcga gcatcaaatg
5160aaactgcaat ttattcatat caggattatc aataccatat ttttgaaaaa gccgtttctg
5220taatgaagga gaaaactcac cgaggcagtt ccataggatg gcaagatcct ggtatcggtc
5280tgcgattccg actcgtccaa catcaataca acctattaat ttcccctcgt caaaaataag
5340gttatcaagt gagaaatcac catgagtgac gactgaatcc ggtgagaatg gcaaaagctt
5400atgcatttct ttccagactt gttcaacagg ccagccatta cgctcgtcat caaaatcact
5460cgcatcaacc aaaccgttat tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc
5520gctgttaaaa ggacaattac aaacaggaat cgaatgcaac cggcgcagga acactgccag
5580cgcatcaaca atattttcac ctgaatcagg atattcttct aatacctgga atgctgtttt
5640cccggggatc gcagtggtga gtaaccatgc atcatcagga gtacggataa aatgcttgat
5700ggtcggaaga ggcataaatt ccgtcagcca gtttagtctg accatctcat ctgtaacatc
5760attggcaacg ctacctttgc catgtttcag aaacaactct ggcgcatcgg gcttcccata
5820caatcgatag attgtcgcac ctgattgccc gacattatcg cgagcccatt tatacccata
5880taaatcagca tccatgttgg aatttaatcg cggcctcgag caagacgttt cccgttgaat
5940atggctcata acaccccttg tattactgtt tatgtaagca gacagtttta ttgttcatga
6000tgatatattt ttatcttgtg caatgtaaca tcagagattt tgagacacaa cgtggctttc
6060cccccccccc cattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt
6120tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc
6180acctgacgtc taagaaacca ttattatcat gacattaacc tataaaaata ggcgtatcac
6240gaggcccttt cgtc
6254
SEQ ID NO: 90; length: 6425; type: DNA; organism: Artificial sequence; description: VR6307, Ligation of VCL6292 into VR6430
90 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca
60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg
240ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta
300agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg
360ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt
420gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta
480gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc
540aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc
600cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg
660acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag
720ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg
780catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg
840cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt
900aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag
960ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag
1020tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac
1080taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgccacc
1140atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt
1200tcgcccagcg aagtgaagca agaaaatcga cttctgaacg agagcgaaag ttcatcacag
1260ggtcttctcg gatactactt cagtgacttg aatttccaag caccaatggt ggtgactagt
1320agcaccaccg gcgatttgag cattcccagc tctgagttgg agaacattcc cagcgaaaat
1380cagtacttcc agtctgctat ctggtccgga ttcattaagg ttaaaaagtc cgacgaatat
1440acatttgcta cctcggcgga taaccatgtg acaatgtggg tggacgacca ggaagtgatc
1500aacaaggctt caaactctaa taaaatccgg ctcgagaagg ggaggctcta ccagatcaaa
1560attcagtacc agcgggaaaa ccctacagaa aaaggactcg atttcaagct gtactggaca
1620gatagccaaa acaagaaaga agttatcagc tcagacaatc tgcagttacc cgagctcaag
1680cagaagagtt ctaatacaag cgctgggcca actgtgcccg acagagacaa tgatggaatc
1740cctgatagtc tagaggttga gggatacacg gtagatgtca agaacaaaag gacttttctc
1800tcgccttgga tctcaaatat ccatgagaag aaggggctta ccaagtacaa gtcctccccc
1860gagaagtggt ctaccgcttc cgatccatat agcgatttcg agaaggtcac aggccggatc
1920gataaaaatg tgtctccaga ggctagacac cccctggtag cagcctaccc gattgtacac
1980gtggacatgg agaacatcat tctaagcaaa aacgaggacc agtccacaca aaacactgac
2040tccgagaccc gcaccatatc taaaaacacc agtacttcaa ggacccacac ctctgaagtg
2100cacggcaatg cggaagtcca tgcatcgttt ttcgatattg gtggctccgt gtcagccggc
2160tttagcaata gcaactcctc gacggttgcc attgaccact cactgtcatt agcaggtgag
2220aggacttggg ctgaaactat gggtctgaat accgccgata cggcccggct caacgcaaat
2280attcggtacg tcaacacagg gactgctcct atatataacg tgctgcctac gacaagtctt
2340gtcctgggca aaaatcagac cctcgcaacc attaaggcaa aggaaaatca gctgagccag
2400atcctcgccc ctaacaacta ttatccatcc aaaaatttag cccccatagc cctgaacgcc
2460caggacgact tttcctctac ccccataact atgaattaca atcagttcct ggagctggaa
2520aagacgaagc agctgagact agacaccgat caggtgtatg gaaacatagc gacatataac
2580tttgagaacg gccgcgtgcg cgtcgacact gggtcaaact ggtctgaagt tctgccgcaa
2640attcaagaga caaccgccag aattatcttt aatgggaagg acttgaacct tgtcgaacgt
2700agaattgccg ccgtgaaccc cagtgatcca ctcgagacga ctaaaccgga tatgacactg
2760aaagaggctc tgaagattgc cttcggattc aacgaaccta atggcaattt gcagtatcag
2820gggaaagaca tcacagagtt tgatttcaat ttcgatcagc agacttccca aaatatcaaa
2880aatcagttgg cagagctgaa tgccaccaat atctacacgg ttctcgataa aatcaaactt
2940aacgccaaga tgaacatatt gattcgagac aaacgcttcc actacgaccg caacaatata
3000gccgtaggcg ctgatgagtc tgtcgtcaag gaggctcata gggaagttat caacagcagt
3060actgaagggc tgttacttaa tatcgacaag gacattcgga agatcctgtc cgggtatatc
3120gtggagatcg aggataccga gggcctgaag gaagtcatta acgaccgcta tgatatgctg
3180aacatttcca gcttacgaca ggacggtaag acatttattg actttaaaaa gtataacgac
3240aagctacccc tgtacatttc caacccaaat tacaaagtta atgtgtatgc tgtaaccaag
3300gagaacacaa tcatcaatcc aagcgagaac ggcgatacca gcacaaatgg aatcaaaaag
3360atccttatat ttagtaaaaa aggctacgag atcggttgag gatccagatc tgctgtgcct
3420tctagttgcc agccatctgt tgtttgcccc tcccccgtgc cttccttgac cctggaaggt
3480gccactccca ctgtcctttc ctaataaaat gaggaaattg catcgcattg tctgagtagg
3540tgtcattcta ttctgggggg tggggtgggg caggacagca agggggagga ttgggaagac
3600aatagcaggc atgctgggga tgcggtgggc tctatgggta cccaggtgct gaagaattga
3660cccggttcct cctgggccag aaagaagcag gcacatcccc ttctctgtga cacaccctgt
3720ccacgcccct ggttcttagt tccagcccca ctcataggac actcatagct caggagggct
3780ccgccttcaa tcccacccgc taaagtactt ggagcggtct ctccctccct catcagccca
3840ccaaaccaaa cctagcctcc aagagtggga agaaattaaa gcaagatagg ctattaagtg
3900cagagggaga gaaaatgcct ccaacatgtg aggaagtaat gagagaaatc atagaatttt
3960aaggccatga tttaaggcca tcatggcctt aatcttccgc ttcctcgctc actgactcgc
4020tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt
4080tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg
4140ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg
4200agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat
4260accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta
4320ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct
4380gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc
4440ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa
4500gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg
4560taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag
4620tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt
4680gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta
4740cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc
4800agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca
4860cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa
4920cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat
4980ttcgttcatc catagttgcc tgactcgggg ggggggggcg ctgaggtctg cctcgtgaag
5040aaggtgttgc tgactcatac caggcctgaa tcgccccatc atccagccag aaagtgaggg
5100agccacggtt gatgagagct ttgttgtagg tggaccagtt ggtgattttg aacttttgct
5160ttgccacgga acggtctgcg ttgtcgggaa gatgcgtgat ctgatccttc aactcagcaa
5220aagttcgatt tattcaacaa agccgccgtc ccgtcaagtc agcgtaatgc tctgccagtg
5280ttacaaccaa ttaaccaatt ctgattagaa aaactcatcg agcatcaaat gaaactgcaa
5340tttattcata tcaggattat caataccata tttttgaaaa agccgtttct gtaatgaagg
5400agaaaactca ccgaggcagt tccataggat ggcaagatcc tggtatcggt ctgcgattcc
5460gactcgtcca acatcaatac aacctattaa tttcccctcg tcaaaaataa ggttatcaag
5520tgagaaatca ccatgagtga cgactgaatc cggtgagaat ggcaaaagct tatgcatttc
5580tttccagact tgttcaacag gccagccatt acgctcgtca tcaaaatcac tcgcatcaac
5640caaaccgtta ttcattcgtg attgcgcctg agcgagacga aatacgcgat cgctgttaaa
5700aggacaatta caaacaggaa tcgaatgcaa ccggcgcagg aacactgcca gcgcatcaac
5760aatattttca cctgaatcag gatattcttc taatacctgg aatgctgttt tcccggggat
5820cgcagtggtg agtaaccatg catcatcagg agtacggata aaatgcttga tggtcggaag
5880aggcataaat tccgtcagcc agtttagtct gaccatctca tctgtaacat cattggcaac
5940gctacctttg ccatgtttca gaaacaactc tggcgcatcg ggcttcccat acaatcgata
6000gattgtcgca cctgattgcc cgacattatc gcgagcccat ttatacccat ataaatcagc
6060atccatgttg gaatttaatc gcggcctcga gcaagacgtt tcccgttgaa tatggctcat
6120aacacccctt gtattactgt ttatgtaagc agacagtttt attgttcatg atgatatatt
6180tttatcttgt gcaatgtaac atcagagatt ttgagacaca acgtggcttt cccccccccc
6240ccattattga agcatttatc agggttattg tctcatgagc ggatacatat ttgaatgtat
6300ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc cacctgacgt
6360ctaagaaacc attattatca tgacattaac ctataaaaat aggcgtatca cgaggccctt
6420tcgtc
6425
SEQ ID NO: 91; length: 5398; type: DNA; organism: Artificial sequence; description: VR4756, Ligation of Segment7 into VR10551
91 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca
60acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg
120tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg
180cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata
240gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc
300cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac
360ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg
420cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc
480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc
540aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc
600gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct
660cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga
720agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc
780cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt
840atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg
900tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt
960ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt
1020ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat
1080ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca
1140gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg
1200ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg
1260gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca
1320caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa
1380atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag
1440aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg
1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg
1560ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca
1620gtcaccgtcg tcggatatcg aattcgccac catgagcctt ctaaccgagg tcgaaacgta
1680tgttctctct atcgttccat caggccccct caaagccgaa atcgcgcaga gacttgaaga
1740tgtctttgct gggaaaaaca cagatcttga ggctctcatg gaatggctaa agacaagacc
1800aatcctgtca cctctgacta aggggatttt ggggtttgtg ttcacgctca ccgtgcccag
1860tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc ctcaatggga atggggatcc
1920aaataacatg gacagagcag ttaaactata tagaaaactt aagagggaga ttacattcca
1980tggggccaaa gaaatagcac tcagttattc tgctggtgca cttgccagtt gcatgggcct
2040catatacaac agaatggggg ctgtaaccac tgaagtggcc tttggcctgg tatgtgcaac
2100atgtgaacag attgctgact cccagcacag gtctcatagg caaatggtgg caacaaccaa
2160tccattaata aggcatgaga acagaatggt tttggccagc actacagcta aggctatgga
2220gcaaatggct ggatcaagtg agcaggcagc ggaggccatg gaaattgcta gtcaggccag
2280gcaaatggtg caggcaatga gagccattgg gactcatcct agctccagtg ctggtctaaa
2340agatgatctt cttgaaaatt tgcagaccta tcagaaacga atgggggtgc agatgcaacg
2400attcaagtga cccgcttgtt gttgctgcga gtatcattgg gatcttgcac ttgatattgt
2460ggattcttga tcgtcttttt ttcaaatgca tctatcgact cttcaaacac ggtctgaaaa
2520gagggccttc tacggaagga gtacctgagt ctatgaggga agaatatcga aaggaacagc
2580agaatgctgt ggatgctgac gacagtcatt ttgtcagcat agagctggag taatcagtcg
2640accacgtgtg atccagatct acttctggct aataaaagat cagagctcta gagatctgtg
2700tgttggtttt ttgtgtggta ctcttccgct tcctcgctca ctgactcgct gcgctcggtc
2760gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa
2820tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt
2880aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa
2940aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt
3000ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg
3060tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc
3120agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc
3180gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta
3240tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct
3300acagagttct tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc
3360tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa
3420caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa
3480aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa
3540aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt
3600ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac
3660agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc
3720atagttgcct gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct
3780gactcatacc aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg
3840atgagagctt tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa
3900cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt
3960attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat
4020taaccaattc tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat
4080caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac
4140cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa
4200catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac
4260catgagtgac gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt
4320gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat
4380tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac
4440aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac
4500ctgaatcagg atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga
4560gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt
4620ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc
4680catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac
4740ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg
4800aatttaatcg cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg
4860tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg
4920caatgtaaca tcagagattt tgagacacaa cgtggctttc cccccccccc cattattgaa
4980gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata
5040aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca
5100ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc
5160gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt
5220gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg
5280ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata
5340tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcaga ttggctat
5398 92 4710 DNA Artificial sequence VR4759, Ligation of M2 into 10551
92tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca
60acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg
120tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg
180cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata
240gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc
300cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac
360ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg
420cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc
480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc
540aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc
600gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct
660cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga
720agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc
780cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt
840atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg
900tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt
960ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt
1020ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat
1080ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca
1140gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg
1200ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg
1260gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca
1320caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa
1380atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag
1440aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg
1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg
1560ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca
1620gtcaccgtcg tcggatatcg aattcgccac catgagcctg ctgaccgagg tggagacccc
1680catcagaaac gagtggggct gcagatgcaa cgacagcagc gaccccctgg tggtggccgc
1740cagcatcatc ggcatcctgc acctgatcct gtggatcctg gacagactgt tcttcaagtg
1800catctacaga ctgttcaagc acggcctgaa gagaggcccc agcaccgagg gcgtgcccga
1860gagcatgaga gaggagtaca gaaaggagca gcagaacgcc gtggacgccg acgacagcca
1920cttcgtgagc atcgagctgg agtgatcagt cgaccacgtg tgatccagat ctacttctgg
1980ctaataaaag atcagagctc tagagatctg tgtgttggtt ttttgtgtgg tactcttccg
2040cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc
2100actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt
2160gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc
2220ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa
2280acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc
2340ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg
2400cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc
2460tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc
2520gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca
2580ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact
2640acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttaccttcg
2700gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt
2760ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct
2820tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga
2880gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt tttaaatcaa
2940tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac
3000ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcggg gggggggggc
3060gctgaggtct gcctcgtgaa gaaggtgttg ctgactcata ccaggcctga atcgccccat
3120catccagcca gaaagtgagg gagccacggt tgatgagagc tttgttgtag gtggaccagt
3180tggtgatttt gaacttttgc tttgccacgg aacggtctgc gttgtcggga agatgcgtga
3240tctgatcctt caactcagca aaagttcgat ttattcaaca aagccgccgt cccgtcaagt
3300cagcgtaatg ctctgccagt gttacaacca attaaccaat tctgattaga aaaactcatc
3360gagcatcaaa tgaaactgca atttattcat atcaggatta tcaataccat atttttgaaa
3420aagccgtttc tgtaatgaag gagaaaactc accgaggcag ttccatagga tggcaagatc
3480ctggtatcgg tctgcgattc cgactcgtcc aacatcaata caacctatta atttcccctc
3540gtcaaaaata aggttatcaa gtgagaaatc accatgagtg acgactgaat ccggtgagaa
3600tggcaaaagc ttatgcattt ctttccagac ttgttcaaca ggccagccat tacgctcgtc
3660atcaaaatca ctcgcatcaa ccaaaccgtt attcattcgt gattgcgcct gagcgagacg
3720aaatacgcga tcgctgttaa aaggacaatt acaaacagga atcgaatgca accggcgcag
3780gaacactgcc agcgcatcaa caatattttc acctgaatca ggatattctt ctaatacctg
3840gaatgctgtt ttcccgggga tcgcagtggt gagtaaccat gcatcatcag gagtacggat
3900aaaatgcttg atggtcggaa gaggcataaa ttccgtcagc cagtttagtc tgaccatctc
3960atctgtaaca tcattggcaa cgctaccttt gccatgtttc agaaacaact ctggcgcatc
4020gggcttccca tacaatcgat agattgtcgc acctgattgc ccgacattat cgcgagccca
4080tttataccca tataaatcag catccatgtt ggaatttaat cgcggcctcg agcaagacgt
4140ttcccgttga atatggctca taacacccct tgtattactg tttatgtaag cagacagttt
4200tattgttcat gatgatatat ttttatcttg tgcaatgtaa catcagagat tttgagacac
4260aacgtggctt tccccccccc cccattattg aagcatttat cagggttatt gtctcatgag
4320cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc gcacatttcc
4380ccgaaaagtg ccacctgacg tctaagaaac cattattatc atgacattaa cctataaaaa
4440taggcgtatc acgaggccct ttcgtctcgc gcgtttcggt gatgacggtg aaaacctctg
4500acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca
4560agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggctggctta actatgcggc
4620atcagagcag attgtactga gagtgcacca tatgcggtgt gaaataccgc acagatgcgt
4680aaggagaaaa taccgcatca gattggctat
4710 93 5913 DNA Artificial sequence VR4762, Ligation of NP Consensus into
10551 93tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca
60acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg
120tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg
180cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata
240gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc
300cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac
360ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg
420cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc
480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc
540aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc
600gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct
660cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga
720agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc
780cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt
840atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg
900tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt
960ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt
1020ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat
1080ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca
1140gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg
1200ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg
1260gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca
1320caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa
1380atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag
1440aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg
1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg
1560ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca
1620gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta
1680cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg
1740caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag
1800cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc
1860cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa
1920gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct
1980gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac
2040cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag
2100aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag
2160caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt
2220gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga
2280gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt
2340ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa
2400cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt
2460ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga
2520cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag
2580ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg
2640gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac
2700caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa
2760catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac
2820cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc
2880caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac
2940cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg
3000cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa
3060ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg
3120cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc
3180tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt
3240ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag
3300ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca
3360tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt
3420tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc
3480gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct
3540ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg
3600tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca
3660agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact
3720atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta
3780acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta
3840actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct
3900tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt
3960tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga
4020tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca
4080tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat
4140caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg
4200cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg
4260ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc
4320catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc
4380agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg
4440tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca
4500agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc
4560atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg
4620aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag
4680atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc
4740ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga
4800gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc
4860gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag
4920acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg
4980caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac
5040ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg
5100gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat
5160ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc
5220atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc
5280ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga
5340cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag
5400ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga
5460cacaacgtgg ctttcccccc ccccccatta ttgaagcatt tatcagggtt attgtctcat
5520gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt
5580tccccgaaaa gtgccacctg acgtctaaga aaccattatt atcatgacat taacctataa
5640aaataggcgt atcacgaggc cctttcgtct cgcgcgtttc ggtgatgacg gtgaaaacct
5700ctgacacatg cagctcccgg agacggtcac agcttgtctg taagcggatg ccgggagcag
5760acaagcccgt cagggcgcgt cagcgggtgt tggcgggtgt cggggctggc ttaactatgc
5820ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac cgcacagatg
5880cgtaaggaga aaataccgca tcagattggc tat
5913 94 3817 DNA Artificial sequence VR10682 94tcgcgcgttt cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg
cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accatatggt gcactctcag
tacaatctgc tctgatgccg catagttaag ccagtatctg 240ctccctgctt gtgtgttgga
ggtcgctgag tagtgcgcga gcaaaattta agctacaaca 300aggcaaggct tgaccgacaa
ttgcatgaag aatctgctta gggttaggcg ttttgcgctg 360cttcgcgatg tacgggccag
atatacgcgt atctgagggg actagggtgt gtttaggcga 420aaagcggggc ttcggttgta
cgcggttagg agtcccctca ggatatagta gtttcgcttt 480tgcataggga gggggaaatg
tagtcttatg caatactctt gtagtcttgc aacatggtaa 540cgatgagtta gcaacatgcc
ttacaaggag agaaaaagca ccgtgcatgc cgattggtgg 600aagtaaggtg gtacgatcgt
gccttattag gaaggcaaca gacgggtctg acatggattg 660gacgaaccac tgaattccgc
attgcagaga tattgtattt aagtgcctag ctcgatactc 720tagacgccat ttgaccattc
accacattgg tgtgcacctc caagcttccg tcaccgtcgt 780cgacacgtgt gatcagatat
cgcggccgct ctagaccagg cgcctggatc cagatctgct 840gtgccttcta gttgccagcc
atctgttgtt tgcccctccc ccgtgccttc cttgaccctg 900gaaggtgcca ctcccactgt
cctttcctaa taaaatgagg aaattgcatc gcattgtctg 960agtaggtgtc attctattct
ggggggtggg gtggggcagg acagcaaggg ggaggattgg 1020gaagacaata gcaggcatgc
tggggatgcg gtgggctcta tgggtaccca ggtgctgaag 1080aattgacccg gttcctcctg
ggccagaaag aagcaggcac atccccttct ctgtgacaca 1140ccctgtccac gcccctggtt
cttagttcca gccccactca taggacactc atagctcagg 1200agggctccgc cttcaatccc
acccgctaaa gtacttggag cggtctctcc ctccctcatc 1260agcccaccaa accaaaccta
gcctccaaga gtgggaagaa attaaagcaa gataggctat 1320taagtgcaga gggagagaaa
atgcctccaa catgtgagga agtaatgaga gaaatcatag 1380aatttcttcc gcttcctcgc
tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc 1440ggtatcagct cactcaaagg
cggtaatacg gttatccaca gaatcagggg ataacgcagg 1500aaagaacatg tgagcaaaag
gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct 1560ggcgtttttc cataggctcc
gcccccctga cgagcatcac aaaaatcgac gctcaagtca 1620gaggtggcga aacccgacag
gactataaag ataccaggcg tttccccctg gaagctccct 1680cgtgcgctct cctgttccga
ccctgccgct taccggatac ctgtccgcct ttctcccttc 1740gggaagcgtg gcgctttctc
atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt 1800tcgctccaag ctgggctgtg
tgcacgaacc ccccgttcag cccgaccgct gcgccttatc 1860cggtaactat cgtcttgagt
ccaacccggt aagacacgac ttatcgccac tggcagcagc 1920cactggtaac aggattagca
gagcgaggta tgtaggcggt gctacagagt tcttgaagtg 1980gtggcctaac tacggctaca
ctagaagaac agtatttggt atctgcgctc tgctgaagcc 2040agttaccttc ggaaaaagag
ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2100cggtggtttt tttgtttgca
agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2160tcctttgatc ttttctacgg
ggtctgacgc tcagtggaac gaaaactcac gttaagggat 2220tttggtcatg agattatcaa
aaaggatctt cacctagatc cttttaaatt aaaaatgaag 2280ttttaaatca atctaaagta
tatatgagta aacttggtct gacagttacc aatgcttaat 2340cagtgaggca cctatctcag
cgatctgtct atttcgttca tccatagttg cctgactcgg 2400gggggggggg cgctgaggtc
tgcctcgtga agaaggtgtt gctgactcat accaggcctg 2460aatcgcccca tcatccagcc
agaaagtgag ggagccacgg ttgatgagag ctttgttgta 2520ggtggaccag ttggtgattt
tgaacttttg ctttgccacg gaacggtctg cgttgtcggg 2580aagatgcgtg atctgatcct
tcaactcagc aaaagttcga tttattcaac aaagccgccg 2640tcccgtcaag tcagcgtaat
gctctgccag tgttacaacc aattaaccaa ttctgattag 2700aaaaactcat cgagcatcaa
atgaaactgc aatttattca tatcaggatt atcaatacca 2760tatttttgaa aaagccgttt
ctgtaatgaa ggagaaaact caccgaggca gttccatagg 2820atggcaagat cctggtatcg
gtctgcgatt ccgactcgtc caacatcaat acaacctatt 2880aatttcccct cgtcaaaaat
aaggttatca agtgagaaat caccatgagt gacgactgaa 2940tccggtgaga atggcaaaag
cttatgcatt tctttccaga cttgttcaac aggccagcca 3000ttacgctcgt catcaaaatc
actcgcatca accaaaccgt tattcattcg tgattgcgcc 3060tgagcgagac gaaatacgcg
atcgctgtta aaaggacaat tacaaacagg aatcgaatgc 3120aaccggcgca ggaacactgc
cagcgcatca acaatatttt cacctgaatc aggatattct 3180tctaatacct ggaatgctgt
tttcccgggg atcgcagtgg tgagtaacca tgcatcatca 3240ggagtacgga taaaatgctt
gatggtcgga agaggcataa attccgtcag ccagtttagt 3300ctgaccatct catctgtaac
atcattggca acgctacctt tgccatgttt cagaaacaac 3360tctggcgcat cgggcttccc
atacaatcga tagattgtcg cacctgattg cccgacatta 3420tcgcgagccc atttataccc
atataaatca gcatccatgt tggaatttaa tcgcggcctc 3480gagcaagacg tttcccgttg
aatatggctc ataacacccc ttgtattact gtttatgtaa 3540gcagacagtt ttattgttca
tgatgatata tttttatctt gtgcaatgta acatcagaga 3600ttttgagaca caacgtggct
ttcccccccc ccccattatt gaagcattta tcagggttat 3660tgtctcatga gcggatacat
atttgaatgt atttagaaaa ataaacaaat aggggttccg 3720cgcacatttc cccgaaaagt
gccacctgac gtctaagaaa ccattattat catgacatta 3780acctataaaa ataggcgtat
cacgaggccc tttcgtc 3817 95 4822 DNA Artificial
sequence VR4764, Ligation of VR4756 RV-SalI into VR10682 RV
95tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca
60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accatatggt gcactctcag tacaatctgc tctgatgccg catagttaag ccagtatctg
240ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta agctacaaca
300aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg ttttgcgctg
360cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt gtttaggcga
420aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta gtttcgcttt
480tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc aacatggtaa
540cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc cgattggtgg
600aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg acatggattg
660gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag ctcgatactc
720tagacgccat ttgaccattc accacattgg tgtgcacctc caagcttccg tcaccgtcgt
780cgacacgtgt gatcagatat cgaattcgcc accatgagcc ttctaaccga ggtcgaaacg
840tatgttctct ctatcgttcc atcaggcccc ctcaaagccg aaatcgcgca gagacttgaa
900gatgtctttg ctgggaaaaa cacagatctt gaggctctca tggaatggct aaagacaaga
960ccaatcctgt cacctctgac taaggggatt ttggggtttg tgttcacgct caccgtgccc
1020agtgagcgag gactgcagcg tagacgcttt gtccaaaatg ccctcaatgg gaatggggat
1080ccaaataaca tggacagagc agttaaacta tatagaaaac ttaagaggga gattacattc
1140catggggcca aagaaatagc actcagttat tctgctggtg cacttgccag ttgcatgggc
1200ctcatataca acagaatggg ggctgtaacc actgaagtgg cctttggcct ggtatgtgca
1260acatgtgaac agattgctga ctcccagcac aggtctcata ggcaaatggt ggcaacaacc
1320aatccattaa taaggcatga gaacagaatg gttttggcca gcactacagc taaggctatg
1380gagcaaatgg ctggatcaag tgagcaggca gcggaggcca tggaaattgc tagtcaggcc
1440aggcaaatgg tgcaggcaat gagagccatt gggactcatc ctagctccag tgctggtcta
1500aaagatgatc ttcttgaaaa tttgcagacc tatcagaaac gaatgggggt gcagatgcaa
1560cgattcaagt gacccgcttg ttgttgctgc gagtatcatt gggatcttgc acttgatatt
1620gtggattctt gatcgtcttt ttttcaaatg catctatcga ctcttcaaac acggtctgaa
1680aagagggcct tctacggaag gagtacctga gtctatgagg gaagaatatc gaaaggaaca
1740gcagaatgct gtggatgctg acgacagtca ttttgtcagc atagagctgg agtaatcagt
1800cgaatcgcgg ccgctctaga ccaggcgcct ggatccagat ctgctgtgcc ttctagttgc
1860cagccatctg ttgtttgccc ctcccccgtg ccttccttga ccctggaagg tgccactccc
1920actgtccttt cctaataaaa tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct
1980attctggggg gtggggtggg gcaggacagc aagggggagg attgggaaga caatagcagg
2040catgctgggg atgcggtggg ctctatgggt acccaggtgc tgaagaattg acccggttcc
2100tcctgggcca gaaagaagca ggcacatccc cttctctgtg acacaccctg tccacgcccc
2160tggttcttag ttccagcccc actcatagga cactcatagc tcaggagggc tccgccttca
2220atcccacccg ctaaagtact tggagcggtc tctccctccc tcatcagccc accaaaccaa
2280acctagcctc caagagtggg aagaaattaa agcaagatag gctattaagt gcagagggag
2340agaaaatgcc tccaacatgt gaggaagtaa tgagagaaat catagaattt cttccgcttc
2400ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc
2460aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc
2520aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag
2580gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc
2640gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc gctctcctgt
2700tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct
2760ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg
2820ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct
2880tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat
2940tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg
3000ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa
3060aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt
3120ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc
3180tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt
3240atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta
3300aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat
3360ctcagcgatc tgtctatttc gttcatccat agttgcctga ctcggggggg gggggcgctg
3420aggtctgcct cgtgaagaag gtgttgctga ctcataccag gcctgaatcg ccccatcatc
3480cagccagaaa gtgagggagc cacggttgat gagagctttg ttgtaggtgg accagttggt
3540gattttgaac ttttgctttg ccacggaacg gtctgcgttg tcgggaagat gcgtgatctg
3600atccttcaac tcagcaaaag ttcgatttat tcaacaaagc cgccgtcccg tcaagtcagc
3660gtaatgctct gccagtgtta caaccaatta accaattctg attagaaaaa ctcatcgagc
3720atcaaatgaa actgcaattt attcatatca ggattatcaa taccatattt ttgaaaaagc
3780cgtttctgta atgaaggaga aaactcaccg aggcagttcc ataggatggc aagatcctgg
3840tatcggtctg cgattccgac tcgtccaaca tcaatacaac ctattaattt cccctcgtca
3900aaaataaggt tatcaagtga gaaatcacca tgagtgacga ctgaatccgg tgagaatggc
3960aaaagcttat gcatttcttt ccagacttgt tcaacaggcc agccattacg ctcgtcatca
4020aaatcactcg catcaaccaa accgttattc attcgtgatt gcgcctgagc gagacgaaat
4080acgcgatcgc tgttaaaagg acaattacaa acaggaatcg aatgcaaccg gcgcaggaac
4140actgccagcg catcaacaat attttcacct gaatcaggat attcttctaa tacctggaat
4200gctgttttcc cggggatcgc agtggtgagt aaccatgcat catcaggagt acggataaaa
4260tgcttgatgg tcggaagagg cataaattcc gtcagccagt ttagtctgac catctcatct
4320gtaacatcat tggcaacgct acctttgcca tgtttcagaa acaactctgg cgcatcgggc
4380ttcccataca atcgatagat tgtcgcacct gattgcccga cattatcgcg agcccattta
4440tacccatata aatcagcatc catgttggaa tttaatcgcg gcctcgagca agacgtttcc
4500cgttgaatat ggctcataac accccttgta ttactgttta tgtaagcaga cagttttatt
4560gttcatgatg atatattttt atcttgtgca atgtaacatc agagattttg agacacaacg
4620tggctttccc ccccccccca ttattgaagc atttatcagg gttattgtct catgagcgga
4680tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga
4740aaagtgccac ctgacgtcta agaaaccatt attatcatga cattaaccta taaaaatagg
4800cgtatcacga ggccctttcg tc
4822 96 5341 DNA Artificial sequence VR4765, Ligation of NP from 4762 into
VR10682 96tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg
gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg
tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta
ctgagagtgc 180accatatggt gcactctcag tacaatctgc tctgatgccg catagttaag
ccagtatctg 240ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta
agctacaaca 300aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg
ttttgcgctg 360cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt
gtttaggcga 420aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta
gtttcgcttt 480tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc
aacatggtaa 540cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc
cgattggtgg 600aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg
acatggattg 660gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag
ctcgatactc 720tagacgccat ttgaccattc accacattgg tgtgcacctc caagcttccg
tcaccgtcgt 780cgacacgtgt gatcagatat cgaattcgcc accatggcca gccagggcac
caagagaagc 840tacgagcaga tggagaccga cggcgagaga cagaacgcca ccgagatcag
agccagcgtg 900ggcaagatga tcgacggcat cggcagattc tacatccaga tgtgcaccga
gctgaagctg 960agcgactacg agggcagact gatccagaac agcctgacca tcgagagaat
ggtgctgagc 1020gccttcgacg agagaagaaa cagatacctg gaggagcacc ccagcgccgg
caaggacccc 1080aagaagaccg gcggccccat ctacagaaga gtggacggca agtggatgag
agagctggtg 1140ctgtacgaca aggaggagat cagaagaatc tggagacagg ccaacaacgg
cgaggacgcc 1200accgccggcc tgacccacat gatgatctgg cacagcaacc tgaacgacac
cacctaccag 1260agaaccagag ccctggtgcg gaccggcatg gaccccagaa tgtgcagcct
gatgcagggc 1320agcaccctgc ccagaagaag cggcgccgcc ggcgccgccg tgaagggcat
cggcaccatg 1380gtgatggagc tgatcagaat gatcaagaga ggcatcaacg acagaaactt
ctggagaggc 1440gagaacggca gaaagaccag aagcgcctac gagagaatgt gcaacatcct
gaagggcaag 1500ttccagaccg ccgcccagag agccatgatg gaccaggtcc gggagagcag
aaaccccggc 1560aacgccgaga tcgaggacct gatcttcctg gccagaagcg ccctgatcct
gagaggcagc 1620gtggcccaca agagctgcct gcccgcctgc gtgtacggcc ccgccgtgag
cagcggctac 1680gacttcgaga aggagggcta cagcctggtg ggcatcgacc ccttcaagct
gctgcagaac 1740agccaggtgt acagcctgat cagacccaac gagaaccccg cccacaagag
ccagctggtg 1800tggatggcct gccacagcgc cgccttcgag gacctgagac tgctgagctt
catcagaggc 1860accaaggtgt cccccagagg caagctgagc accagaggcg tgcagatcgc
cagcaacgag 1920aacatggaca acatgggcag cagcaccctg gagctgagaa gcagatactg
ggccatcaga 1980accagaagcg gcggcaacac caaccagcag agagccagcg ccggccagat
cagcgtgcag 2040cccaccttca gcgtgcagag aaacctgccc ttcgagaaga gcaccgtgat
ggccgccttc 2100accggcaaca ccgagggcag aaccagcgac atgagagccg agatcatcag
aatgatggag 2160ggcgccaagc ccgaggaggt gtccttcaga ggcagaggcg tgttcgagct
gagcgacgag 2220aaggccacca accccatcgt gcctagcttc gacatgagca acgagggcag
ctacttcttc 2280ggcgacaacg ccgaggagta cgacaactga tcagtcgacc acatcgcggc
cgctctagac 2340caggcgcctg gatccagatc tgctgtgcct tctagttgcc agccatctgt
tgtttgcccc 2400tcccccgtgc cttccttgac cctggaaggt gccactccca ctgtcctttc
ctaataaaat 2460gaggaaattg catcgcattg tctgagtagg tgtcattcta ttctgggggg
tggggtgggg 2520caggacagca agggggagga ttgggaagac aatagcaggc atgctgggga
tgcggtgggc 2580tctatgggta cccaggtgct gaagaattga cccggttcct cctgggccag
aaagaagcag 2640gcacatcccc ttctctgtga cacaccctgt ccacgcccct ggttcttagt
tccagcccca 2700ctcataggac actcatagct caggagggct ccgccttcaa tcccacccgc
taaagtactt 2760ggagcggtct ctccctccct catcagccca ccaaaccaaa cctagcctcc
aagagtggga 2820agaaattaaa gcaagatagg ctattaagtg cagagggaga gaaaatgcct
ccaacatgtg 2880aggaagtaat gagagaaatc atagaatttc ttccgcttcc tcgctcactg
actcgctgcg 2940ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa
tacggttatc 3000cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc
aaaaggccag 3060gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc
ctgacgagca 3120tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat
aaagatacca 3180ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc
cgcttaccgg 3240atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct
cacgctgtag 3300gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg
aaccccccgt 3360tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc
cggtaagaca 3420cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga
ggtatgtagg 3480cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa
gaacagtatt 3540tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta
gctcttgatc 3600cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc
agattacgcg 3660cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg
acgctcagtg 3720gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga
tcttcaccta 3780gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg
agtaaacttg 3840gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct
gtctatttcg 3900ttcatccata gttgcctgac tcgggggggg ggggcgctga ggtctgcctc
gtgaagaagg 3960tgttgctgac tcataccagg cctgaatcgc cccatcatcc agccagaaag
tgagggagcc 4020acggttgatg agagctttgt tgtaggtgga ccagttggtg attttgaact
tttgctttgc 4080cacggaacgg tctgcgttgt cgggaagatg cgtgatctga tccttcaact
cagcaaaagt 4140tcgatttatt caacaaagcc gccgtcccgt caagtcagcg taatgctctg
ccagtgttac 4200aaccaattaa ccaattctga ttagaaaaac tcatcgagca tcaaatgaaa
ctgcaattta 4260ttcatatcag gattatcaat accatatttt tgaaaaagcc gtttctgtaa
tgaaggagaa 4320aactcaccga ggcagttcca taggatggca agatcctggt atcggtctgc
gattccgact 4380cgtccaacat caatacaacc tattaatttc ccctcgtcaa aaataaggtt
atcaagtgag 4440aaatcaccat gagtgacgac tgaatccggt gagaatggca aaagcttatg
catttctttc 4500cagacttgtt caacaggcca gccattacgc tcgtcatcaa aatcactcgc
atcaaccaaa 4560ccgttattca ttcgtgattg cgcctgagcg agacgaaata cgcgatcgct
gttaaaagga 4620caattacaaa caggaatcga atgcaaccgg cgcaggaaca ctgccagcgc
atcaacaata 4680ttttcacctg aatcaggata ttcttctaat acctggaatg ctgttttccc
ggggatcgca 4740gtggtgagta accatgcatc atcaggagta cggataaaat gcttgatggt
cggaagaggc 4800ataaattccg tcagccagtt tagtctgacc atctcatctg taacatcatt
ggcaacgcta 4860cctttgccat gtttcagaaa caactctggc gcatcgggct tcccatacaa
tcgatagatt 4920gtcgcacctg attgcccgac attatcgcga gcccatttat acccatataa
atcagcatcc 4980atgttggaat ttaatcgcgg cctcgagcaa gacgtttccc gttgaatatg
gctcataaca 5040ccccttgtat tactgtttat gtaagcagac agttttattg ttcatgatga
tatattttta 5100tcttgtgcaa tgtaacatca gagattttga gacacaacgt ggctttcccc
ccccccccat 5160tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga
atgtatttag 5220aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc
tgacgtctaa 5280gaaaccatta ttatcatgac attaacctat aaaaataggc gtatcacgag
gccctttcgt 5340c
5341 97 7798 DNA Artificial sequence VR4766, Ligation of Seg7 into
VR4762 97tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg
ctcatgtcca 60acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc
aattacgggg 120tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt
aaatggcccg 180cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta
tgttcccata 240gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg
gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga
cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt
tcctacttgg 420cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg
gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc
cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg
taacaactcc 600gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat
aagcagagct 660cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga
cctccataga 720agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac
gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc
tttggctctt 840atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt
atgctatagg 900tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca
ctcccctatt 960ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac
tatctctatt 1020ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt
tttacaggat 1080ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc
cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt
tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat
gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt
aggcacagca 1320caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat
gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag
gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta
actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct
gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt
cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca
agagaagcta 1680cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag
ccagcgtggg 1740caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc
tgaagctgag 1800cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg
tgctgagcgc 1860cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca
aggaccccaa 1920gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag
agctggtgct 1980gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg
aggacgccac 2040cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca
cctaccagag 2100aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga
tgcagggcag 2160caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg
gcaccatggt 2220gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct
ggagaggcga 2280gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga
agggcaagtt 2340ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa
accccggcaa 2400cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga
gaggcagcgt 2460ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca
gcggctacga 2520cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc
tgcagaacag 2580ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc
agctggtgtg 2640gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca
tcagaggcac 2700caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca
gcaacgagaa 2760catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg
ccatcagaac 2820cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca
gcgtgcagcc 2880caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg
ccgccttcac 2940cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa
tgatggaggg 3000cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga
gcgacgagaa 3060ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct
acttcttcgg 3120cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca
gatctacttc 3180tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg
tggtactctt 3240ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga
gcggtatcag 3300ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca
ggaaagaaca 3360tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg
ctggcgtttt 3420tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt
cagaggtggc 3480gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc
ctcgtgcgct 3540ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct
tcgggaagcg 3600tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc
gttcgctcca 3660agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta
tccggtaact 3720atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca
gccactggta 3780acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag
tggtggccta 3840actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag
ccagttacct 3900tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt
agcggtggtt 3960tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa
gatcctttga 4020tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg
attttggtca 4080tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga
agttttaaat 4140caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta
atcagtgagg 4200cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc
gggggggggg 4260ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc
tgaatcgccc 4320catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg
taggtggacc 4380agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg
ggaagatgcg 4440tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc
cgtcccgtca 4500agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt
agaaaaactc 4560atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac
catatttttg 4620aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata
ggatggcaag 4680atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta
ttaatttccc 4740ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg
aatccggtga 4800gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc
cattacgctc 4860gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg
cctgagcgag 4920acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat
gcaaccggcg 4980caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt
cttctaatac 5040ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat
caggagtacg 5100gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta
gtctgaccat 5160ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca
actctggcgc 5220atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat
tatcgcgagc 5280ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc
tcgagcaaga 5340cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt
aagcagacag 5400ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga
gattttgaga 5460cactatggtg cactctcagt acaatctgct ctgatgccgc atagttaagc
cagtatctgc 5520tccctgcttg tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa
gctacaacaa 5580ggcaaggctt gaccgacaat tgcatgaaga atctgcttag ggttaggcgt
tttgcgctgc 5640ttcgcgatgt acgggccaga tatacgcgta tctgagggga ctagggtgtg
tttaggcgaa 5700aagcggggct tcggttgtac gcggttagga gtcccctcag gatatagtag
tttcgctttt 5760gcatagggag ggggaaatgt agtcttatgc aatactcttg tagtcttgca
acatggtaac 5820gatgagttag caacatgcct tacaaggaga gaaaaagcac cgtgcatgcc
gattggtgga 5880agtaaggtgg tacgatcgtg ccttattagg aaggcaacag acgggtctga
catggattgg 5940acgaaccact gaattccgca ttgcagagat attgtattta agtgcctagc
tcgatactct 6000agacgccatt tgaccattca ccacattggt gtgcacctcc aagcttccgt
caccgtcgtc 6060gacacgtgtg atcagatatc gaattcgcca ccatgagcct tctaaccgag
gtcgaaacgt 6120atgttctctc tatcgttcca tcaggccccc tcaaagccga aatcgcgcag
agacttgaag 6180atgtctttgc tgggaaaaac acagatcttg aggctctcat ggaatggcta
aagacaagac 6240caatcctgtc acctctgact aaggggattt tggggtttgt gttcacgctc
accgtgccca 6300gtgagcgagg actgcagcgt agacgctttg tccaaaatgc cctcaatggg
aatggggatc 6360caaataacat ggacagagca gttaaactat atagaaaact taagagggag
attacattcc 6420atggggccaa agaaatagca ctcagttatt ctgctggtgc acttgccagt
tgcatgggcc 6480tcatatacaa cagaatgggg gctgtaacca ctgaagtggc ctttggcctg
gtatgtgcaa 6540catgtgaaca gattgctgac tcccagcaca ggtctcatag gcaaatggtg
gcaacaacca 6600atccattaat aaggcatgag aacagaatgg ttttggccag cactacagct
aaggctatgg 6660agcaaatggc tggatcaagt gagcaggcag cggaggccat ggaaattgct
agtcaggcca 6720ggcaaatggt gcaggcaatg agagccattg ggactcatcc tagctccagt
gctggtctaa 6780aagatgatct tcttgaaaat ttgcagacct atcagaaacg aatgggggtg
cagatgcaac 6840gattcaagtg acccgcttgt tgttgctgcg agtatcattg ggatcttgca
cttgatattg 6900tggattcttg atcgtctttt tttcaaatgc atctatcgac tcttcaaaca
cggtctgaaa 6960agagggcctt ctacggaagg agtacctgag tctatgaggg aagaatatcg
aaaggaacag 7020cagaatgctg tggatgctga cgacagtcat tttgtcagca tagagctgga
gtaatcagtc 7080gaccacatcg cggccgctct agaccaggcg cctggatcca gatctgctgt
gccttctagt 7140tgccagccat ctgttgtttg cccctccccc gtgccttcct tgaccctgga
aggtgccact 7200cccactgtcc tttcctaata aaatgaggaa attgcatcgc attgtctgag
taggtgtcat 7260tctattctgg ggggtggggt ggggcaggac agcaaggggg aggattggga
agacaatagc 7320aggcatgctg gggatgcggt gggctctatg ggtggctttc cccccccccc
cattattgaa 7380gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt
tagaaaaata 7440aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc
taagaaacca 7500ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt
cgtctcgcgc 7560gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg
gtcacagctt 7620gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg
ggtgttggcg 7680ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga
gtgcaccata 7740tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcaga
ttggctat 7798
SEQ ID NO: 98; length: 7798; type: DNA; organism: Artificial sequence; description: VR4767, Ligation of Inverted RSVSeg7 into VR4762
tggccattgc atacgttgta tccatatcat aatatgtaca
tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg actagttatt
aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc cgcgttacat
aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca ttgacgtcaa
taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt caatgggtgg
agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg ccaagtacgc
cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag tacatgacct
tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt accatggtga
tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa
gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa cgggactttc
caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt gtacggtggg
aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga cgccatccac
gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc cgggaacggt
gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat agactctata
ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc ctatacaccc
ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg ttattgacca
ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata acatggctct
ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag actgacacgg
actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca tatacaacaa
cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca cgcgaatctc
gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca catccgagcc
ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc taacagtgga
ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca aggccgtggc
ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg acgcagatgg
aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat tctgataaga
gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt
actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca gactgttcct
ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac catggccagc
cagggcacca agagaagcta 1680cgagcagatg gagaccgacg gcgagagaca gaacgccacc
gagatcagag ccagcgtggg 1740caagatgatc gacggcatcg gcagattcta catccagatg
tgcaccgagc tgaagctgag 1800cgactacgag ggcagactga tccagaacag cctgaccatc
gagagaatgg tgctgagcgc 1860cttcgacgag agaagaaaca gatacctgga ggagcacccc
agcgccggca aggaccccaa 1920gaagaccggc ggccccatct acagaagagt ggacggcaag
tggatgagag agctggtgct 1980gtacgacaag gaggagatca gaagaatctg gagacaggcc
aacaacggcg aggacgccac 2040cgccggcctg acccacatga tgatctggca cagcaacctg
aacgacacca cctaccagag 2100aaccagagcc ctggtgcgga ccggcatgga ccccagaatg
tgcagcctga tgcagggcag 2160caccctgccc agaagaagcg gcgccgccgg cgccgccgtg
aagggcatcg gcaccatggt 2220gatggagctg atcagaatga tcaagagagg catcaacgac
agaaacttct ggagaggcga 2280gaacggcaga aagaccagaa gcgcctacga gagaatgtgc
aacatcctga agggcaagtt 2340ccagaccgcc gcccagagag ccatgatgga ccaggtccgg
gagagcagaa accccggcaa 2400cgccgagatc gaggacctga tcttcctggc cagaagcgcc
ctgatcctga gaggcagcgt 2460ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc
gccgtgagca gcggctacga 2520cttcgagaag gagggctaca gcctggtggg catcgacccc
ttcaagctgc tgcagaacag 2580ccaggtgtac agcctgatca gacccaacga gaaccccgcc
cacaagagcc agctggtgtg 2640gatggcctgc cacagcgccg ccttcgagga cctgagactg
ctgagcttca tcagaggcac 2700caaggtgtcc cccagaggca agctgagcac cagaggcgtg
cagatcgcca gcaacgagaa 2760catggacaac atgggcagca gcaccctgga gctgagaagc
agatactggg ccatcagaac 2820cagaagcggc ggcaacacca accagcagag agccagcgcc
ggccagatca gcgtgcagcc 2880caccttcagc gtgcagagaa acctgccctt cgagaagagc
accgtgatgg ccgccttcac 2940cggcaacacc gagggcagaa ccagcgacat gagagccgag
atcatcagaa tgatggaggg 3000cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg
ttcgagctga gcgacgagaa 3060ggccaccaac cccatcgtgc ctagcttcga catgagcaac
gagggcagct acttcttcgg 3120cgacaacgcc gaggagtacg acaactgatc agtcgaccac
gtgtgatcca gatctacttc 3180tggctaataa aagatcagag ctctagagat ctgtgtgttg
gttttttgtg tggtactctt 3240ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg
gctgcggcga gcggtatcag 3300ctcactcaaa ggcggtaata cggttatcca cagaatcagg
ggataacgca ggaaagaaca 3360tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa
ggccgcgttg ctggcgtttt 3420tccataggct ccgcccccct gacgagcatc acaaaaatcg
acgctcaagt cagaggtggc 3480gaaacccgac aggactataa agataccagg cgtttccccc
tggaagctcc ctcgtgcgct 3540ctcctgttcc gaccctgccg cttaccggat acctgtccgc
ctttctccct tcgggaagcg 3600tggcgctttc tcatagctca cgctgtaggt atctcagttc
ggtgtaggtc gttcgctcca 3660agctgggctg tgtgcacgaa ccccccgttc agcccgaccg
ctgcgcctta tccggtaact 3720atcgtcttga gtccaacccg gtaagacacg acttatcgcc
actggcagca gccactggta 3780acaggattag cagagcgagg tatgtaggcg gtgctacaga
gttcttgaag tggtggccta 3840actacggcta cactagaaga acagtatttg gtatctgcgc
tctgctgaag ccagttacct 3900tcggaaaaag agttggtagc tcttgatccg gcaaacaaac
caccgctggt agcggtggtt 3960tttttgtttg caagcagcag attacgcgca gaaaaaaagg
atctcaagaa gatcctttga 4020tcttttctac ggggtctgac gctcagtgga acgaaaactc
acgttaaggg attttggtca 4080tgagattatc aaaaaggatc ttcacctaga tccttttaaa
ttaaaaatga agttttaaat 4140caatctaaag tatatatgag taaacttggt ctgacagtta
ccaatgctta atcagtgagg 4200cacctatctc agcgatctgt ctatttcgtt catccatagt
tgcctgactc gggggggggg 4260ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc
ataccaggcc tgaatcgccc 4320catcatccag ccagaaagtg agggagccac ggttgatgag
agctttgttg taggtggacc 4380agttggtgat tttgaacttt tgctttgcca cggaacggtc
tgcgttgtcg ggaagatgcg 4440tgatctgatc cttcaactca gcaaaagttc gatttattca
acaaagccgc cgtcccgtca 4500agtcagcgta atgctctgcc agtgttacaa ccaattaacc
aattctgatt agaaaaactc 4560atcgagcatc aaatgaaact gcaatttatt catatcagga
ttatcaatac catatttttg 4620aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg
cagttccata ggatggcaag 4680atcctggtat cggtctgcga ttccgactcg tccaacatca
atacaaccta ttaatttccc 4740ctcgtcaaaa ataaggttat caagtgagaa atcaccatga
gtgacgactg aatccggtga 4800gaatggcaaa agcttatgca tttctttcca gacttgttca
acaggccagc cattacgctc 4860gtcatcaaaa tcactcgcat caaccaaacc gttattcatt
cgtgattgcg cctgagcgag 4920acgaaatacg cgatcgctgt taaaaggaca attacaaaca
ggaatcgaat gcaaccggcg 4980caggaacact gccagcgcat caacaatatt ttcacctgaa
tcaggatatt cttctaatac 5040ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac
catgcatcat caggagtacg 5100gataaaatgc ttgatggtcg gaagaggcat aaattccgtc
agccagttta gtctgaccat 5160ctcatctgta acatcattgg caacgctacc tttgccatgt
ttcagaaaca actctggcgc 5220atcgggcttc ccatacaatc gatagattgt cgcacctgat
tgcccgacat tatcgcgagc 5280ccatttatac ccatataaat cagcatccat gttggaattt
aatcgcggcc tcgagcaaga 5340cgtttcccgt tgaatatggc tcataacacc ccttgtatta
ctgtttatgt aagcagacag 5400ttttattgtt catgatgata tatttttatc ttgtgcaatg
taacatcaga gattttgaga 5460cacccataga gcccaccgca tccccagcat gcctgctatt
gtcttcccaa tcctccccct 5520tgctgtcctg ccccacccca ccccccagaa tagaatgaca
cctactcaga caatgcgatg 5580caatttcctc attttattag gaaaggacag tgggagtggc
accttccagg gtcaaggaag 5640gcacggggga ggggcaaaca acagatggct ggcaactaga
aggcacagca gatctggatc 5700caggcgcctg gtctagagcg gccgcgatgt ggtcgactga
ttactccagc tctatgctga 5760caaaatgact gtcgtcagca tccacagcat tctgctgttc
ctttcgatat tcttccctca 5820tagactcagg tactccttcc gtagaaggcc ctcttttcag
accgtgtttg aagagtcgat 5880agatgcattt gaaaaaaaga cgatcaagaa tccacaatat
caagtgcaag atcccaatga 5940tactcgcagc aacaacaagc gggtcacttg aatcgttgca
tctgcacccc cattcgtttc 6000tgataggtct gcaaattttc aagaagatca tcttttagac
cagcactgga gctaggatga 6060gtcccaatgg ctctcattgc ctgcaccatt tgcctggcct
gactagcaat ttccatggcc 6120tccgctgcct gctcacttga tccagccatt tgctccatag
ccttagctgt agtgctggcc 6180aaaaccattc tgttctcatg ccttattaat ggattggttg
ttgccaccat ttgcctatga 6240gacctgtgct gggagtcagc aatctgttca catgttgcac
ataccaggcc aaaggccact 6300tcagtggtta cagcccccat tctgttgtat atgaggccca
tgcaactggc aagtgcacca 6360gcagaataac tgagtgctat ttctttggcc ccatggaatg
taatctccct cttaagtttt 6420ctatatagtt taactgctct gtccatgtta tttggatccc
cattcccatt gagggcattt 6480tggacaaagc gtctacgctg cagtcctcgc tcactgggca
cggtgagcgt gaacacaaac 6540cccaaaatcc ccttagtcag aggtgacagg attggtcttg
tctttagcca ttccatgaga 6600gcctcaagat ctgtgttttt cccagcaaag acatcttcaa
gtctctgcgc gatttcggct 6660ttgagggggc ctgatggaac gatagagaga acatacgttt
cgacctcggt tagaaggctc 6720atggtggcga attcgatatc tgatcacacg tgtcgacgac
ggtgacggaa gcttggaggt 6780gcacaccaat gtggtgaatg gtcaaatggc gtctagagta
tcgagctagg cacttaaata 6840caatatctct gcaatgcgga attcagtggt tcgtccaatc
catgtcagac ccgtctgttg 6900ccttcctaat aaggcacgat cgtaccacct tacttccacc
aatcggcatg cacggtgctt 6960tttctctcct tgtaaggcat gttgctaact catcgttacc
atgttgcaag actacaagag 7020tattgcataa gactacattt ccccctccct atgcaaaagc
gaaactacta tatcctgagg 7080ggactcctaa ccgcgtacaa ccgaagcccc gcttttcgcc
taaacacacc ctagtcccct 7140cagatacgcg tatatctggc ccgtacatcg cgaagcagcg
caaaacgcct aaccctaagc 7200agattcttca tgcaattgtc ggtcaagcct tgccttgttg
tagcttaaat tttgctcgcg 7260cactactcag cgacctccaa cacacaagca gggagcagat
actggcttaa ctatgcggca 7320tcagagcaga ttgtactgag agtgcaccat agtggctttc
cccccccccc cattattgaa 7380gcatttatca gggttattgt ctcatgagcg gatacatatt
tgaatgtatt tagaaaaata 7440aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc
acctgacgtc taagaaacca 7500ttattatcat gacattaacc tataaaaata ggcgtatcac
gaggcccttt cgtctcgcgc 7560gtttcggtga tgacggtgaa aacctctgac acatgcagct
cccggagacg gtcacagctt 7620gtctgtaagc ggatgccggg agcagacaag cccgtcaggg
cgcgtcagcg ggtgttggcg 7680ggtgtcgggg ctggcttaac tatgcggcat cagagcagat
tgtactgaga gtgcaccata 7740tgcggtgtga aataccgcac agatgcgtaa ggagaaaata
ccgcatcaga ttggctat 7798
SEQ ID NO: 99; length: 7798; type: DNA; organism: Artificial sequence; description: VR4768, Ligation of RSVNP into VR4756
tggccattgc atacgttgta tccatatcat aatatgtaca
tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg actagttatt
aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc cgcgttacat
aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca ttgacgtcaa
taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt caatgggtgg
agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg ccaagtacgc
cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag tacatgacct
tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt accatggtga
tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa
gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa cgggactttc
caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt gtacggtggg
aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga cgccatccac
gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc cgggaacggt
gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat agactctata
ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc ctatacaccc
ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg ttattgacca
ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata acatggctct
ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag actgacacgg
actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca tatacaacaa
cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca cgcgaatctc
gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca catccgagcc
ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc taacagtgga
ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca aggccgtggc
ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg acgcagatgg
aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat tctgataaga
gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt
actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca gactgttcct
ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac catgagcctt
ctaaccgagg tcgaaacgta 1680tgttctctct atcgttccat caggccccct caaagccgaa
atcgcgcaga gacttgaaga 1740tgtctttgct gggaaaaaca cagatcttga ggctctcatg
gaatggctaa agacaagacc 1800aatcctgtca cctctgacta aggggatttt ggggtttgtg
ttcacgctca ccgtgcccag 1860tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc
ctcaatggga atggggatcc 1920aaataacatg gacagagcag ttaaactata tagaaaactt
aagagggaga ttacattcca 1980tggggccaaa gaaatagcac tcagttattc tgctggtgca
cttgccagtt gcatgggcct 2040catatacaac agaatggggg ctgtaaccac tgaagtggcc
tttggcctgg tatgtgcaac 2100atgtgaacag attgctgact cccagcacag gtctcatagg
caaatggtgg caacaaccaa 2160tccattaata aggcatgaga acagaatggt tttggccagc
actacagcta aggctatgga 2220gcaaatggct ggatcaagtg agcaggcagc ggaggccatg
gaaattgcta gtcaggccag 2280gcaaatggtg caggcaatga gagccattgg gactcatcct
agctccagtg ctggtctaaa 2340agatgatctt cttgaaaatt tgcagaccta tcagaaacga
atgggggtgc agatgcaacg 2400attcaagtga cccgcttgtt gttgctgcga gtatcattgg
gatcttgcac ttgatattgt 2460ggattcttga tcgtcttttt ttcaaatgca tctatcgact
cttcaaacac ggtctgaaaa 2520gagggccttc tacggaagga gtacctgagt ctatgaggga
agaatatcga aaggaacagc 2580agaatgctgt ggatgctgac gacagtcatt ttgtcagcat
agagctggag taatcagtcg 2640accacgtgtg atccagatct acttctggct aataaaagat
cagagctcta gagatctgtg 2700tgttggtttt ttgtgtggta ctcttccgct tcctcgctca
ctgactcgct gcgctcggtc 2760gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg
taatacggtt atccacagaa 2820tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc
agcaaaaggc caggaaccgt 2880aaaaaggccg cgttgctggc gtttttccat aggctccgcc
cccctgacga gcatcacaaa 2940aatcgacgct caagtcagag gtggcgaaac ccgacaggac
tataaagata ccaggcgttt 3000ccccctggaa gctccctcgt gcgctctcct gttccgaccc
tgccgcttac cggatacctg 3060tccgcctttc tcccttcggg aagcgtggcg ctttctcata
gctcacgctg taggtatctc 3120agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc
acgaaccccc cgttcagccc 3180gaccgctgcg ccttatccgg taactatcgt cttgagtcca
acccggtaag acacgactta 3240tcgccactgg cagcagccac tggtaacagg attagcagag
cgaggtatgt aggcggtgct 3300acagagttct tgaagtggtg gcctaactac ggctacacta
gaagaacagt atttggtatc 3360tgcgctctgc tgaagccagt taccttcgga aaaagagttg
gtagctcttg atccggcaaa 3420caaaccaccg ctggtagcgg tggttttttt gtttgcaagc
agcagattac gcgcagaaaa 3480aaaggatctc aagaagatcc tttgatcttt tctacggggt
ctgacgctca gtggaacgaa 3540aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa
ggatcttcac ctagatcctt 3600ttaaattaaa aatgaagttt taaatcaatc taaagtatat
atgagtaaac ttggtctgac 3660agttaccaat gcttaatcag tgaggcacct atctcagcga
tctgtctatt tcgttcatcc 3720atagttgcct gactcggggg gggggggcgc tgaggtctgc
ctcgtgaaga aggtgttgct 3780gactcatacc aggcctgaat cgccccatca tccagccaga
aagtgaggga gccacggttg 3840atgagagctt tgttgtaggt ggaccagttg gtgattttga
acttttgctt tgccacggaa 3900cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca
actcagcaaa agttcgattt 3960attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct
ctgccagtgt tacaaccaat 4020taaccaattc tgattagaaa aactcatcga gcatcaaatg
aaactgcaat ttattcatat 4080caggattatc aataccatat ttttgaaaaa gccgtttctg
taatgaagga gaaaactcac 4140cgaggcagtt ccataggatg gcaagatcct ggtatcggtc
tgcgattccg actcgtccaa 4200catcaataca acctattaat ttcccctcgt caaaaataag
gttatcaagt gagaaatcac 4260catgagtgac gactgaatcc ggtgagaatg gcaaaagctt
atgcatttct ttccagactt 4320gttcaacagg ccagccatta cgctcgtcat caaaatcact
cgcatcaacc aaaccgttat 4380tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc
gctgttaaaa ggacaattac 4440aaacaggaat cgaatgcaac cggcgcagga acactgccag
cgcatcaaca atattttcac 4500ctgaatcagg atattcttct aatacctgga atgctgtttt
cccggggatc gcagtggtga 4560gtaaccatgc atcatcagga gtacggataa aatgcttgat
ggtcggaaga ggcataaatt 4620ccgtcagcca gtttagtctg accatctcat ctgtaacatc
attggcaacg ctacctttgc 4680catgtttcag aaacaactct ggcgcatcgg gcttcccata
caatcgatag attgtcgcac 4740ctgattgccc gacattatcg cgagcccatt tatacccata
taaatcagca tccatgttgg 4800aatttaatcg cggcctcgag caagacgttt cccgttgaat
atggctcata acaccccttg 4860tattactgtt tatgtaagca gacagtttta ttgttcatga
tgatatattt ttatcttgtg 4920caatgtaaca tcagagattt tgagacacta tggtgcactc
tcagtacaat ctgctctgat 4980gccgcatagt taagccagta tctgctccct gcttgtgtgt
tggaggtcgc tgagtagtgc 5040gcgagcaaaa tttaagctac aacaaggcaa ggcttgaccg
acaattgcat gaagaatctg 5100cttagggtta ggcgttttgc gctgcttcgc gatgtacggg
ccagatatac gcgtatctga 5160ggggactagg gtgtgtttag gcgaaaagcg gggcttcggt
tgtacgcggt taggagtccc 5220ctcaggatat agtagtttcg cttttgcata gggaggggga
aatgtagtct tatgcaatac 5280tcttgtagtc ttgcaacatg gtaacgatga gttagcaaca
tgccttacaa ggagagaaaa 5340agcaccgtgc atgccgattg gtggaagtaa ggtggtacga
tcgtgcctta ttaggaaggc 5400aacagacggg tctgacatgg attggacgaa ccactgaatt
ccgcattgca gagatattgt 5460atttaagtgc ctagctcgat actctagacg ccatttgacc
attcaccaca ttggtgtgca 5520cctccaagct tccgtcaccg tcgtcgacac gtgtgatcag
atatcgaatt cgccaccatg 5580gccagccagg gcaccaagag aagctacgag cagatggaga
ccgacggcga gagacagaac 5640gccaccgaga tcagagccag cgtgggcaag atgatcgacg
gcatcggcag attctacatc 5700cagatgtgca ccgagctgaa gctgagcgac tacgagggca
gactgatcca gaacagcctg 5760accatcgaga gaatggtgct gagcgccttc gacgagagaa
gaaacagata cctggaggag 5820caccccagcg ccggcaagga ccccaagaag accggcggcc
ccatctacag aagagtggac 5880ggcaagtgga tgagagagct ggtgctgtac gacaaggagg
agatcagaag aatctggaga 5940caggccaaca acggcgagga cgccaccgcc ggcctgaccc
acatgatgat ctggcacagc 6000aacctgaacg acaccaccta ccagagaacc agagccctgg
tgcggaccgg catggacccc 6060agaatgtgca gcctgatgca gggcagcacc ctgcccagaa
gaagcggcgc cgccggcgcc 6120gccgtgaagg gcatcggcac catggtgatg gagctgatca
gaatgatcaa gagaggcatc 6180aacgacagaa acttctggag aggcgagaac ggcagaaaga
ccagaagcgc ctacgagaga 6240atgtgcaaca tcctgaaggg caagttccag accgccgccc
agagagccat gatggaccag 6300gtccgggaga gcagaaaccc cggcaacgcc gagatcgagg
acctgatctt cctggccaga 6360agcgccctga tcctgagagg cagcgtggcc cacaagagct
gcctgcccgc ctgcgtgtac 6420ggccccgccg tgagcagcgg ctacgacttc gagaaggagg
gctacagcct ggtgggcatc 6480gaccccttca agctgctgca gaacagccag gtgtacagcc
tgatcagacc caacgagaac 6540cccgcccaca agagccagct ggtgtggatg gcctgccaca
gcgccgcctt cgaggacctg 6600agactgctga gcttcatcag aggcaccaag gtgtccccca
gaggcaagct gagcaccaga 6660ggcgtgcaga tcgccagcaa cgagaacatg gacaacatgg
gcagcagcac cctggagctg 6720agaagcagat actgggccat cagaaccaga agcggcggca
acaccaacca gcagagagcc 6780agcgccggcc agatcagcgt gcagcccacc ttcagcgtgc
agagaaacct gcccttcgag 6840aagagcaccg tgatggccgc cttcaccggc aacaccgagg
gcagaaccag cgacatgaga 6900gccgagatca tcagaatgat ggagggcgcc aagcccgagg
aggtgtcctt cagaggcaga 6960ggcgtgttcg agctgagcga cgagaaggcc accaacccca
tcgtgcctag cttcgacatg 7020agcaacgagg gcagctactt cttcggcgac aacgccgagg
agtacgacaa ctgatcagtc 7080gaccacatcg cggccgctct agaccaggcg cctggatcca
gatctgctgt gccttctagt 7140tgccagccat ctgttgtttg cccctccccc gtgccttcct
tgaccctgga aggtgccact 7200cccactgtcc tttcctaata aaatgaggaa attgcatcgc
attgtctgag taggtgtcat 7260tctattctgg ggggtggggt ggggcaggac agcaaggggg
aggattggga agacaatagc 7320aggcatgctg gggatgcggt gggctctatg ggtggctttc
cccccccccc cattattgaa 7380gcatttatca gggttattgt ctcatgagcg gatacatatt
tgaatgtatt tagaaaaata 7440aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc
acctgacgtc taagaaacca 7500ttattatcat gacattaacc tataaaaata ggcgtatcac
gaggcccttt cgtctcgcgc 7560gtttcggtga tgacggtgaa aacctctgac acatgcagct
cccggagacg gtcacagctt 7620gtctgtaagc ggatgccggg agcagacaag cccgtcaggg
cgcgtcagcg ggtgttggcg 7680ggtgtcgggg ctggcttaac tatgcggcat cagagcagat
tgtactgaga gtgcaccata 7740tgcggtgtga aataccgcac agatgcgtaa ggagaaaata
ccgcatcaga ttggctat 7798
SEQ ID NO: 100; length: 7798; type: DNA; organism: Artificial sequence; description: VR4769, Ligation of Inverted NP into VR4756
tggccattgc atacgttgta tccatatcat aatatgtaca
tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg actagttatt
aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc cgcgttacat
aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca ttgacgtcaa
taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt caatgggtgg
agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg ccaagtacgc
cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag tacatgacct
tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt accatggtga
tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa
gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa cgggactttc
caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt gtacggtggg
aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga cgccatccac
gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc cgggaacggt
gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat agactctata
ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc ctatacaccc
ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg ttattgacca
ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata acatggctct
ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag actgacacgg
actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca tatacaacaa
cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca cgcgaatctc
gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca catccgagcc
ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc taacagtgga
ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca aggccgtggc
ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg acgcagatgg
aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat tctgataaga
gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt
actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca gactgttcct
ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac catgagcctt
ctaaccgagg tcgaaacgta 1680tgttctctct atcgttccat caggccccct caaagccgaa
atcgcgcaga gacttgaaga 1740tgtctttgct gggaaaaaca cagatcttga ggctctcatg
gaatggctaa agacaagacc 1800aatcctgtca cctctgacta aggggatttt ggggtttgtg
ttcacgctca ccgtgcccag 1860tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc
ctcaatggga atggggatcc 1920aaataacatg gacagagcag ttaaactata tagaaaactt
aagagggaga ttacattcca 1980tggggccaaa gaaatagcac tcagttattc tgctggtgca
cttgccagtt gcatgggcct 2040catatacaac agaatggggg ctgtaaccac tgaagtggcc
tttggcctgg tatgtgcaac 2100atgtgaacag attgctgact cccagcacag gtctcatagg
caaatggtgg caacaaccaa 2160tccattaata aggcatgaga acagaatggt tttggccagc
actacagcta aggctatgga 2220gcaaatggct ggatcaagtg agcaggcagc ggaggccatg
gaaattgcta gtcaggccag 2280gcaaatggtg caggcaatga gagccattgg gactcatcct
agctccagtg ctggtctaaa 2340agatgatctt cttgaaaatt tgcagaccta tcagaaacga
atgggggtgc agatgcaacg 2400attcaagtga cccgcttgtt gttgctgcga gtatcattgg
gatcttgcac ttgatattgt 2460ggattcttga tcgtcttttt ttcaaatgca tctatcgact
cttcaaacac ggtctgaaaa 2520gagggccttc tacggaagga gtacctgagt ctatgaggga
agaatatcga aaggaacagc 2580agaatgctgt ggatgctgac gacagtcatt ttgtcagcat
agagctggag taatcagtcg 2640accacgtgtg atccagatct acttctggct aataaaagat
cagagctcta gagatctgtg 2700tgttggtttt ttgtgtggta ctcttccgct tcctcgctca
ctgactcgct gcgctcggtc 2760gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg
taatacggtt atccacagaa 2820tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc
agcaaaaggc caggaaccgt 2880aaaaaggccg cgttgctggc gtttttccat aggctccgcc
cccctgacga gcatcacaaa 2940aatcgacgct caagtcagag gtggcgaaac ccgacaggac
tataaagata ccaggcgttt 3000ccccctggaa gctccctcgt gcgctctcct gttccgaccc
tgccgcttac cggatacctg 3060tccgcctttc tcccttcggg aagcgtggcg ctttctcata
gctcacgctg taggtatctc 3120agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc
acgaaccccc cgttcagccc 3180gaccgctgcg ccttatccgg taactatcgt cttgagtcca
acccggtaag acacgactta 3240tcgccactgg cagcagccac tggtaacagg attagcagag
cgaggtatgt aggcggtgct 3300acagagttct tgaagtggtg gcctaactac ggctacacta
gaagaacagt atttggtatc 3360tgcgctctgc tgaagccagt taccttcgga aaaagagttg
gtagctcttg atccggcaaa 3420caaaccaccg ctggtagcgg tggttttttt gtttgcaagc
agcagattac gcgcagaaaa 3480aaaggatctc aagaagatcc tttgatcttt tctacggggt
ctgacgctca gtggaacgaa 3540aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa
ggatcttcac ctagatcctt 3600ttaaattaaa aatgaagttt taaatcaatc taaagtatat
atgagtaaac ttggtctgac 3660agttaccaat gcttaatcag tgaggcacct atctcagcga
tctgtctatt tcgttcatcc 3720atagttgcct gactcggggg gggggggcgc tgaggtctgc
ctcgtgaaga aggtgttgct 3780gactcatacc aggcctgaat cgccccatca tccagccaga
aagtgaggga gccacggttg 3840atgagagctt tgttgtaggt ggaccagttg gtgattttga
acttttgctt tgccacggaa 3900cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca
actcagcaaa agttcgattt 3960attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct
ctgccagtgt tacaaccaat 4020taaccaattc tgattagaaa aactcatcga gcatcaaatg
aaactgcaat ttattcatat 4080caggattatc aataccatat ttttgaaaaa gccgtttctg
taatgaagga gaaaactcac 4140cgaggcagtt ccataggatg gcaagatcct ggtatcggtc
tgcgattccg actcgtccaa 4200catcaataca acctattaat ttcccctcgt caaaaataag
gttatcaagt gagaaatcac 4260catgagtgac gactgaatcc ggtgagaatg gcaaaagctt
atgcatttct ttccagactt 4320gttcaacagg ccagccatta cgctcgtcat caaaatcact
cgcatcaacc aaaccgttat 4380tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc
gctgttaaaa ggacaattac 4440aaacaggaat cgaatgcaac cggcgcagga acactgccag
cgcatcaaca atattttcac 4500ctgaatcagg atattcttct aatacctgga atgctgtttt
cccggggatc gcagtggtga 4560gtaaccatgc atcatcagga gtacggataa aatgcttgat
ggtcggaaga ggcataaatt 4620ccgtcagcca gtttagtctg accatctcat ctgtaacatc
attggcaacg ctacctttgc 4680catgtttcag aaacaactct ggcgcatcgg gcttcccata
caatcgatag attgtcgcac 4740ctgattgccc gacattatcg cgagcccatt tatacccata
taaatcagca tccatgttgg 4800aatttaatcg cggcctcgag caagacgttt cccgttgaat
atggctcata acaccccttg 4860tattactgtt tatgtaagca gacagtttta ttgttcatga
tgatatattt ttatcttgtg 4920caatgtaaca tcagagattt tgagacaccc atagagccca
ccgcatcccc agcatgcctg 4980ctattgtctt cccaatcctc ccccttgctg tcctgcccca
ccccaccccc cagaatagaa 5040tgacacctac tcagacaatg cgatgcaatt tcctcatttt
attaggaaag gacagtggga 5100gtggcacctt ccagggtcaa ggaaggcacg ggggaggggc
aaacaacaga tggctggcaa 5160ctagaaggca cagcagatct ggatccaggc gcctggtcta
gagcggccgc gatgtggtcg 5220actgatcagt tgtcgtactc ctcggcgttg tcgccgaaga
agtagctgcc ctcgttgctc 5280atgtcgaagc taggcacgat ggggttggtg gccttctcgt
cgctcagctc gaacacgcct 5340ctgcctctga aggacacctc ctcgggcttg gcgccctcca
tcattctgat gatctcggct 5400ctcatgtcgc tggttctgcc ctcggtgttg ccggtgaagg
cggccatcac ggtgctcttc 5460tcgaagggca ggtttctctg cacgctgaag gtgggctgca
cgctgatctg gccggcgctg 5520gctctctgct ggttggtgtt gccgccgctt ctggttctga
tggcccagta tctgcttctc 5580agctccaggg tgctgctgcc catgttgtcc atgttctcgt
tgctggcgat ctgcacgcct 5640ctggtgctca gcttgcctct gggggacacc ttggtgcctc
tgatgaagct cagcagtctc 5700aggtcctcga aggcggcgct gtggcaggcc atccacacca
gctggctctt gtgggcgggg 5760ttctcgttgg gtctgatcag gctgtacacc tggctgttct
gcagcagctt gaaggggtcg 5820atgcccacca ggctgtagcc ctccttctcg aagtcgtagc
cgctgctcac ggcggggccg 5880tacacgcagg cgggcaggca gctcttgtgg gccacgctgc
ctctcaggat cagggcgctt 5940ctggccagga agatcaggtc ctcgatctcg gcgttgccgg
ggtttctgct ctcccggacc 6000tggtccatca tggctctctg ggcggcggtc tggaacttgc
ccttcaggat gttgcacatt 6060ctctcgtagg cgcttctggt ctttctgccg ttctcgcctc
tccagaagtt tctgtcgttg 6120atgcctctct tgatcattct gatcagctcc atcaccatgg
tgccgatgcc cttcacggcg 6180gcgccggcgg cgccgcttct tctgggcagg gtgctgccct
gcatcaggct gcacattctg 6240gggtccatgc cggtccgcac cagggctctg gttctctggt
aggtggtgtc gttcaggttg 6300ctgtgccaga tcatcatgtg ggtcaggccg gcggtggcgt
cctcgccgtt gttggcctgt 6360ctccagattc ttctgatctc ctccttgtcg tacagcacca
gctctctcat ccacttgccg 6420tccactcttc tgtagatggg gccgccggtc ttcttggggt
ccttgccggc gctggggtgc 6480tcctccaggt atctgtttct tctctcgtcg aaggcgctca
gcaccattct ctcgatggtc 6540aggctgttct ggatcagtct gccctcgtag tcgctcagct
tcagctcggt gcacatctgg 6600atgtagaatc tgccgatgcc gtcgatcatc ttgcccacgc
tggctctgat ctcggtggcg 6660ttctgtctct cgccgtcggt ctccatctgc tcgtagcttc
tcttggtgcc ctggctggcc 6720atggtggcga attcgatatc tgatcacacg tgtcgacgac
ggtgacggaa gcttggaggt 6780gcacaccaat gtggtgaatg gtcaaatggc gtctagagta
tcgagctagg cacttaaata 6840caatatctct gcaatgcgga attcagtggt tcgtccaatc
catgtcagac ccgtctgttg 6900ccttcctaat aaggcacgat cgtaccacct tacttccacc
aatcggcatg cacggtgctt 6960tttctctcct tgtaaggcat gttgctaact catcgttacc
atgttgcaag actacaagag 7020tattgcataa gactacattt ccccctccct atgcaaaagc
gaaactacta tatcctgagg 7080ggactcctaa ccgcgtacaa ccgaagcccc gcttttcgcc
taaacacacc ctagtcccct 7140cagatacgcg tatatctggc ccgtacatcg cgaagcagcg
caaaacgcct aaccctaagc 7200agattcttca tgcaattgtc ggtcaagcct tgccttgttg
tagcttaaat tttgctcgcg 7260cactactcag cgacctccaa cacacaagca gggagcagat
actggcttaa ctatgcggca 7320tcagagcaga ttgtactgag agtgcaccat agtggctttc
cccccccccc cattattgaa 7380gcatttatca gggttattgt ctcatgagcg gatacatatt
tgaatgtatt tagaaaaata 7440aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc
acctgacgtc taagaaacca 7500ttattatcat gacattaacc tataaaaata ggcgtatcac
gaggcccttt cgtctcgcgc 7560gtttcggtga tgacggtgaa aacctctgac acatgcagct
cccggagacg gtcacagctt 7620gtctgtaagc ggatgccggg agcagacaag cccgtcaggg
cgcgtcagcg ggtgttggcg 7680ggtgtcgggg ctggcttaac tatgcggcat cagagcagat
tgtactgaga gtgcaccata 7740tgcggtgtga aataccgcac agatgcgtaa ggagaaaata
ccgcatcaga ttggctat 7798
SEQ ID NO: 101, length 5161, DNA, Artificial sequence: VR4770, M2 Insert Replacing WNV Insert in VR6430
tcgcgcgttt cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60cagcttgtct gtaagcggat
gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg
cttaactatg cggcatcaga gcagattgta ctgagagtgc 180accatatgcg gtgtgaaata
ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240ctattggctg ctccctgctt
gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta 300agctacaaca aggcaaggct
tgaccgacaa ttgcatgaag aatctgctta gggttaggcg 360ttttgcgctg cttcgcgatg
tacgggccag atatacgcgt atctgagggg actagggtgt 420gtttaggcga aaagcggggc
ttcggttgta cgcggttagg agtcccctca ggatatagta 480gtttcgcttt tgcataggga
gggggaaatg tagtcttatg caatactctt gtagtcttgc 540aacatggtaa cgatgagtta
gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc 600cgattggtgg aagtaaggtg
gtacgatcgt gccttattag gaaggcaaca gacgggtctg 660acatggattg gacgaaccac
tgaattccgc attgcagaga tattgtattt aagtgcctag 720ctcgatacaa taaacgccat
ttgaccattc accacattgg tgtgcacctc catcggctcg 780catctctcct tcacgcgccc
gccgccctac ctgaggccgc catccacgcc ggttgagtcg 840cgttctgccg cctcccgcct
gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt 900aaagctcagg tcgagaccgg
gcctttgtcc ggcgctccct tggagcctac ctagactcag 960ccggctctcc acgctttgcc
tgaccctgct tgctcaactc tagttaacgg tggagggcag 1020tgtagtctga gcagtactcg
ttgctgccgc gcgcgccacc agacataata gctgacagac 1080taacagactg ttcctttcca
tgggtctttt ctgcagtcac cgtcgtcgga tatcgaattc 1140gccaccatga gccttctaac
cgaggtcgaa acgtatgttc tctctatcgt tccatcaggc 1200cccctcaaag ccgaaatcgc
gcagagactt gaagatgtct ttgctgggaa aaacacagat 1260cttgaggctc tcatggaatg
gctaaagaca agaccaatcc tgtcacctct gactaagggg 1320attttggggt ttgtgttcac
gctcaccgtg cccagtgagc gaggactgca gcgtagacgc 1380tttgtccaaa atgccctcaa
tgggaatggg gatccaaata acatggacag agcagttaaa 1440ctatatagaa aacttaagag
ggagattaca ttccatgggg ccaaagaaat agcactcagt 1500tattctgctg gtgcacttgc
cagttgcatg ggcctcatat acaacagaat gggggctgta 1560accactgaag tggcctttgg
cctggtatgt gcaacatgtg aacagattgc tgactcccag 1620cacaggtctc ataggcaaat
ggtggcaaca accaatccat taataaggca tgagaacaga 1680atggttttgg ccagcactac
agctaaggct atggagcaaa tggctggatc aagtgagcag 1740gcagcggagg ccatggaaat
tgctagtcag gccaggcaaa tggtgcaggc aatgagagcc 1800attgggactc atcctagctc
cagtgctggt ctaaaagatg atcttcttga aaatttgcag 1860acctatcaga aacgaatggg
ggtgcagatg caacgattca agtgacccgc ttgttgttgc 1920tgcgagtatc attgggatct
tgcacttgat attgtggatt cttgatcgtc tttttttcaa 1980atgcatctat cgactcttca
aacacggtct gaaaagaggg ccttctacgg aaggagtacc 2040tgagtctatg agggaagaat
atcgaaagga acagcagaat gctgtggatg ctgacgacag 2100tcattttgtc agcatagagc
tggagtaatc agtcgagatc cagatctgct gtgccttcta 2160gttgccagcc atctgttgtt
tgcccctccc ccgtgccttc cttgaccctg gaaggtgcca 2220ctcccactgt cctttcctaa
taaaatgagg aaattgcatc gcattgtctg agtaggtgtc 2280attctattct ggggggtggg
gtggggcagg acagcaaggg ggaggattgg gaagacaata 2340gcaggcatgc tggggatgcg
gtgggctcta tgggtaccca ggtgctgaag aattgacccg 2400gttcctcctg ggccagaaag
aagcaggcac atccccttct ctgtgacaca ccctgtccac 2460gcccctggtt cttagttcca
gccccactca taggacactc atagctcagg agggctccgc 2520cttcaatccc acccgctaaa
gtacttggag cggtctctcc ctccctcatc agcccaccaa 2580accaaaccta gcctccaaga
gtgggaagaa attaaagcaa gataggctat taagtgcaga 2640gggagagaaa atgcctccaa
catgtgagga agtaatgaga gaaatcatag aattttaagg 2700ccatgattta aggccatcat
ggccttaatc ttccgcttcc tcgctcactg actcgctgcg 2760ctcggtcgtt cggctgcggc
gagcggtatc agctcactca aaggcggtaa tacggttatc 2820cacagaatca ggggataacg
caggaaagaa catgtgagca aaaggccagc aaaaggccag 2880gaaccgtaaa aaggccgcgt
tgctggcgtt tttccatagg ctccgccccc ctgacgagca 2940tcacaaaaat cgacgctcaa
gtcagaggtg gcgaaacccg acaggactat aaagatacca 3000ggcgtttccc cctggaagct
ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 3060atacctgtcc gcctttctcc
cttcgggaag cgtggcgctt tctcatagct cacgctgtag 3120gtatctcagt tcggtgtagg
tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 3180tcagcccgac cgctgcgcct
tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 3240cgacttatcg ccactggcag
cagccactgg taacaggatt agcagagcga ggtatgtagg 3300cggtgctaca gagttcttga
agtggtggcc taactacggc tacactagaa gaacagtatt 3360tggtatctgc gctctgctga
agccagttac cttcggaaaa agagttggta gctcttgatc 3420cggcaaacaa accaccgctg
gtagcggtgg tttttttgtt tgcaagcagc agattacgcg 3480cagaaaaaaa ggatctcaag
aagatccttt gatcttttct acggggtctg acgctcagtg 3540gaacgaaaac tcacgttaag
ggattttggt catgagatta tcaaaaagga tcttcaccta 3600gatcctttta aattaaaaat
gaagttttaa atcaatctaa agtatatatg agtaaacttg 3660gtctgacagt taccaatgct
taatcagtga ggcacctatc tcagcgatct gtctatttcg 3720ttcatccata gttgcctgac
tcgggggggg ggggcgctga ggtctgcctc gtgaagaagg 3780tgttgctgac tcataccagg
cctgaatcgc cccatcatcc agccagaaag tgagggagcc 3840acggttgatg agagctttgt
tgtaggtgga ccagttggtg attttgaact tttgctttgc 3900cacggaacgg tctgcgttgt
cgggaagatg cgtgatctga tccttcaact cagcaaaagt 3960tcgatttatt caacaaagcc
gccgtcccgt caagtcagcg taatgctctg ccagtgttac 4020aaccaattaa ccaattctga
ttagaaaaac tcatcgagca tcaaatgaaa ctgcaattta 4080ttcatatcag gattatcaat
accatatttt tgaaaaagcc gtttctgtaa tgaaggagaa 4140aactcaccga ggcagttcca
taggatggca agatcctggt atcggtctgc gattccgact 4200cgtccaacat caatacaacc
tattaatttc ccctcgtcaa aaataaggtt atcaagtgag 4260aaatcaccat gagtgacgac
tgaatccggt gagaatggca aaagcttatg catttctttc 4320cagacttgtt caacaggcca
gccattacgc tcgtcatcaa aatcactcgc atcaaccaaa 4380ccgttattca ttcgtgattg
cgcctgagcg agacgaaata cgcgatcgct gttaaaagga 4440caattacaaa caggaatcga
atgcaaccgg cgcaggaaca ctgccagcgc atcaacaata 4500ttttcacctg aatcaggata
ttcttctaat acctggaatg ctgttttccc ggggatcgca 4560gtggtgagta accatgcatc
atcaggagta cggataaaat gcttgatggt cggaagaggc 4620ataaattccg tcagccagtt
tagtctgacc atctcatctg taacatcatt ggcaacgcta 4680cctttgccat gtttcagaaa
caactctggc gcatcgggct tcccatacaa tcgatagatt 4740gtcgcacctg attgcccgac
attatcgcga gcccatttat acccatataa atcagcatcc 4800atgttggaat ttaatcgcgg
cctcgagcaa gacgtttccc gttgaatatg gctcataaca 4860ccccttgtat tactgtttat
gtaagcagac agttttattg ttcatgatga tatattttta 4920tcttgtgcaa tgtaacatca
gagattttga gacacaacgt ggctttcccc ccccccccat 4980tattgaagca tttatcaggg
ttattgtctc atgagcggat acatatttga atgtatttag 5040aaaaataaac aaataggggt
tccgcgcaca tttccccgaa aagtgccacc tgacgtctaa 5100gaaaccatta ttatcatgac
attaacctat aaaaataggc gtatcacgag gccctttcgt 5160c
5161
SEQ ID NO: 102, length 5684, DNA, Artificial sequence: VR4771, NP Insert Replacing WNV Insert in VR6430
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca
60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg
120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc
180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg
240ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta
300agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg
360ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt
420gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta
480gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc
540aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc
600cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg
660acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag
720ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg
780catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg
840cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt
900aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag
960ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag
1020tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac
1080taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgaattc
1140gccaccatgg ccagccaggg caccaagaga agctacgagc agatggagac cgacggcgag
1200agacagaacg ccaccgagat cagagccagc gtgggcaaga tgatcgacgg catcggcaga
1260ttctacatcc agatgtgcac cgagctgaag ctgagcgact acgagggcag actgatccag
1320aacagcctga ccatcgagag aatggtgctg agcgccttcg acgagagaag aaacagatac
1380ctggaggagc accccagcgc cggcaaggac cccaagaaga ccggcggccc catctacaga
1440agagtggacg gcaagtggat gagagagctg gtgctgtacg acaaggagga gatcagaaga
1500atctggagac aggccaacaa cggcgaggac gccaccgccg gcctgaccca catgatgatc
1560tggcacagca acctgaacga caccacctac cagagaacca gagccctggt gcggaccggc
1620atggacccca gaatgtgcag cctgatgcag ggcagcaccc tgcccagaag aagcggcgcc
1680gccggcgccg ccgtgaaggg catcggcacc atggtgatgg agctgatcag aatgatcaag
1740agaggcatca acgacagaaa cttctggaga ggcgagaacg gcagaaagac cagaagcgcc
1800tacgagagaa tgtgcaacat cctgaagggc aagttccaga ccgccgccca gagagccatg
1860atggaccagg tccgggagag cagaaacccc ggcaacgccg agatcgagga cctgatcttc
1920ctggccagaa gcgccctgat cctgagaggc agcgtggccc acaagagctg cctgcccgcc
1980tgcgtgtacg gccccgccgt gagcagcggc tacgacttcg agaaggaggg ctacagcctg
2040gtgggcatcg accccttcaa gctgctgcag aacagccagg tgtacagcct gatcagaccc
2100aacgagaacc ccgcccacaa gagccagctg gtgtggatgg cctgccacag cgccgccttc
2160gaggacctga gactgctgag cttcatcaga ggcaccaagg tgtcccccag aggcaagctg
2220agcaccagag gcgtgcagat cgccagcaac gagaacatgg acaacatggg cagcagcacc
2280ctggagctga gaagcagata ctgggccatc agaaccagaa gcggcggcaa caccaaccag
2340cagagagcca gcgccggcca gatcagcgtg cagcccacct tcagcgtgca gagaaacctg
2400cccttcgaga agagcaccgt gatggccgcc ttcaccggca acaccgaggg cagaaccagc
2460gacatgagag ccgagatcat cagaatgatg gagggcgcca agcccgagga ggtgtccttc
2520agaggcagag gcgtgttcga gctgagcgac gagaaggcca ccaaccccat cgtgcctagc
2580ttcgacatga gcaacgaggg cagctacttc ttcggcgaca acgccgagga gtacgacaac
2640tgatcagtcg accacgtgtg atccagatct gctgtgcctt ctagttgcca gccatctgtt
2700gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg ccactcccac tgtcctttcc
2760taataaaatg aggaaattgc atcgcattgt ctgagtaggt gtcattctat tctggggggt
2820ggggtggggc aggacagcaa gggggaggat tgggaagaca atagcaggca tgctggggat
2880gcggtgggct ctatgggtac ccaggtgctg aagaattgac ccggttcctc ctgggccaga
2940aagaagcagg cacatcccct tctctgtgac acaccctgtc cacgcccctg gttcttagtt
3000ccagccccac tcataggaca ctcatagctc aggagggctc cgccttcaat cccacccgct
3060aaagtacttg gagcggtctc tccctccctc atcagcccac caaaccaaac ctagcctcca
3120agagtgggaa gaaattaaag caagataggc tattaagtgc agagggagag aaaatgcctc
3180caacatgtga ggaagtaatg agagaaatca tagaatttta aggccatgat ttaaggccat
3240catggcctta atcttccgct tcctcgctca ctgactcgct gcgctcggtc gttcggctgc
3300ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa tcaggggata
3360acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg
3420cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct
3480caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa
3540gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc
3600tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt
3660aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg
3720ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg
3780cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct
3840tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc
3900tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg
3960ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc
4020aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt
4080aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa
4140aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat
4200gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct
4260gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct gactcatacc
4320aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg atgagagctt
4380tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa cggtctgcgt
4440tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt attcaacaaa
4500gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat taaccaattc
4560tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat caggattatc
4620aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac cgaggcagtt
4680ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa catcaataca
4740acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac catgagtgac
4800gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt gttcaacagg
4860ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat tcattcgtga
4920ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac aaacaggaat
4980cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac ctgaatcagg
5040atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga gtaaccatgc
5100atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt ccgtcagcca
5160gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc catgtttcag
5220aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac ctgattgccc
5280gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg aatttaatcg
5340cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg tattactgtt
5400tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg caatgtaaca
5460tcagagattt tgagacacaa cgtggctttc cccccccccc cattattgaa gcatttatca
5520gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg
5580ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca ttattatcat
5640gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtc
5684
SEQ ID NO: 103, length 4473, DNA, Artificial sequence: VR4772, M2 Insert Replacing WNV Insert from VR6430
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat
gcagctcccg gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg
tcagggcgcg tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga
gcagattgta ctgagagtgc 180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag
aaaataccgc atcagattgg 240ctattggctg ctccctgctt gtgtgttgga ggtcgctgag
tagtgcgcga gcaaaattta 300agctacaaca aggcaaggct tgaccgacaa ttgcatgaag
aatctgctta gggttaggcg 360ttttgcgctg cttcgcgatg tacgggccag atatacgcgt
atctgagggg actagggtgt 420gtttaggcga aaagcggggc ttcggttgta cgcggttagg
agtcccctca ggatatagta 480gtttcgcttt tgcataggga gggggaaatg tagtcttatg
caatactctt gtagtcttgc 540aacatggtaa cgatgagtta gcaacatgcc ttacaaggag
agaaaaagca ccgtgcatgc 600cgattggtgg aagtaaggtg gtacgatcgt gccttattag
gaaggcaaca gacgggtctg 660acatggattg gacgaaccac tgaattccgc attgcagaga
tattgtattt aagtgcctag 720ctcgatacaa taaacgccat ttgaccattc accacattgg
tgtgcacctc catcggctcg 780catctctcct tcacgcgccc gccgccctac ctgaggccgc
catccacgcc ggttgagtcg 840cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg
tccgccgtct aggtaagttt 900aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct
tggagcctac ctagactcag 960ccggctctcc acgctttgcc tgaccctgct tgctcaactc
tagttaacgg tggagggcag 1020tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc
agacataata gctgacagac 1080taacagactg ttcctttcca tgggtctttt ctgcagtcac
cgtcgtcgga tatcgaattc 1140gccaccatga gcctgctgac cgaggtggag acccccatca
gaaacgagtg gggctgcaga 1200tgcaacgaca gcagcgaccc cctggtggtg gccgccagca
tcatcggcat cctgcacctg 1260atcctgtgga tcctggacag actgttcttc aagtgcatct
acagactgtt caagcacggc 1320ctgaagagag gccccagcac cgagggcgtg cccgagagca
tgagagagga gtacagaaag 1380gagcagcaga acgccgtgga cgccgacgac agccacttcg
tgagcatcga gctggagtga 1440tcagtcgaga tccagatctg ctgtgccttc tagttgccag
ccatctgttg tttgcccctc 1500ccccgtgcct tccttgaccc tggaaggtgc cactcccact
gtcctttcct aataaaatga 1560ggaaattgca tcgcattgtc tgagtaggtg tcattctatt
ctggggggtg gggtggggca 1620ggacagcaag ggggaggatt gggaagacaa tagcaggcat
gctggggatg cggtgggctc 1680tatgggtacc caggtgctga agaattgacc cggttcctcc
tgggccagaa agaagcaggc 1740acatcccctt ctctgtgaca caccctgtcc acgcccctgg
ttcttagttc cagccccact 1800cataggacac tcatagctca ggagggctcc gccttcaatc
ccacccgcta aagtacttgg 1860agcggtctct ccctccctca tcagcccacc aaaccaaacc
tagcctccaa gagtgggaag 1920aaattaaagc aagataggct attaagtgca gagggagaga
aaatgcctcc aacatgtgag 1980gaagtaatga gagaaatcat agaattttaa ggccatgatt
taaggccatc atggccttaa 2040tcttccgctt cctcgctcac tgactcgctg cgctcggtcg
ttcggctgcg gcgagcggta 2100tcagctcact caaaggcggt aatacggtta tccacagaat
caggggataa cgcaggaaag 2160aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta
aaaaggccgc gttgctggcg 2220tttttccata ggctccgccc ccctgacgag catcacaaaa
atcgacgctc aagtcagagg 2280tggcgaaacc cgacaggact ataaagatac caggcgtttc
cccctggaag ctccctcgtg 2340cgctctcctg ttccgaccct gccgcttacc ggatacctgt
ccgcctttct cccttcggga 2400agcgtggcgc tttctcatag ctcacgctgt aggtatctca
gttcggtgta ggtcgttcgc 2460tccaagctgg gctgtgtgca cgaacccccc gttcagcccg
accgctgcgc cttatccggt 2520aactatcgtc ttgagtccaa cccggtaaga cacgacttat
cgccactggc agcagccact 2580ggtaacagga ttagcagagc gaggtatgta ggcggtgcta
cagagttctt gaagtggtgg 2640cctaactacg gctacactag aagaacagta tttggtatct
gcgctctgct gaagccagtt 2700accttcggaa aaagagttgg tagctcttga tccggcaaac
aaaccaccgc tggtagcggt 2760ggtttttttg tttgcaagca gcagattacg cgcagaaaaa
aaggatctca agaagatcct 2820ttgatctttt ctacggggtc tgacgctcag tggaacgaaa
actcacgtta agggattttg 2880gtcatgagat tatcaaaaag gatcttcacc tagatccttt
taaattaaaa atgaagtttt 2940aaatcaatct aaagtatata tgagtaaact tggtctgaca
gttaccaatg cttaatcagt 3000gaggcaccta tctcagcgat ctgtctattt cgttcatcca
tagttgcctg actcgggggg 3060ggggggcgct gaggtctgcc tcgtgaagaa ggtgttgctg
actcatacca ggcctgaatc 3120gccccatcat ccagccagaa agtgagggag ccacggttga
tgagagcttt gttgtaggtg 3180gaccagttgg tgattttgaa cttttgcttt gccacggaac
ggtctgcgtt gtcgggaaga 3240tgcgtgatct gatccttcaa ctcagcaaaa gttcgattta
ttcaacaaag ccgccgtccc 3300gtcaagtcag cgtaatgctc tgccagtgtt acaaccaatt
aaccaattct gattagaaaa 3360actcatcgag catcaaatga aactgcaatt tattcatatc
aggattatca ataccatatt 3420tttgaaaaag ccgtttctgt aatgaaggag aaaactcacc
gaggcagttc cataggatgg 3480caagatcctg gtatcggtct gcgattccga ctcgtccaac
atcaatacaa cctattaatt 3540tcccctcgtc aaaaataagg ttatcaagtg agaaatcacc
atgagtgacg actgaatccg 3600gtgagaatgg caaaagctta tgcatttctt tccagacttg
ttcaacaggc cagccattac 3660gctcgtcatc aaaatcactc gcatcaacca aaccgttatt
cattcgtgat tgcgcctgag 3720cgagacgaaa tacgcgatcg ctgttaaaag gacaattaca
aacaggaatc gaatgcaacc 3780ggcgcaggaa cactgccagc gcatcaacaa tattttcacc
tgaatcagga tattcttcta 3840atacctggaa tgctgttttc ccggggatcg cagtggtgag
taaccatgca tcatcaggag 3900tacggataaa atgcttgatg gtcggaagag gcataaattc
cgtcagccag tttagtctga 3960ccatctcatc tgtaacatca ttggcaacgc tacctttgcc
atgtttcaga aacaactctg 4020gcgcatcggg cttcccatac aatcgataga ttgtcgcacc
tgattgcccg acattatcgc 4080gagcccattt atacccatat aaatcagcat ccatgttgga
atttaatcgc ggcctcgagc 4140aagacgtttc ccgttgaata tggctcataa caccccttgt
attactgttt atgtaagcag 4200acagttttat tgttcatgat gatatatttt tatcttgtgc
aatgtaacat cagagatttt 4260gagacacaac gtggctttcc cccccccccc attattgaag
catttatcag ggttattgtc 4320tcatgagcgg atacatattt gaatgtattt agaaaaataa
acaaataggg gttccgcgca 4380catttccccg aaaagtgcca cctgacgtct aagaaaccat
tattatcatg acattaacct 4440ataaaaatag gcgtatcacg aggccctttc gtc
4473
SEQ ID NO: 104, length 8450, DNA, Artificial sequence: VR4773, Ligation of RSV RNP into VR4756
tggccattgc atacgttgta tccatatcat aatatgtaca
tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg actagttatt
aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc cgcgttacat
aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca ttgacgtcaa
taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt caatgggtgg
agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg ccaagtacgc
cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag tacatgacct
tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt accatggtga
tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa
gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa cgggactttc
caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt gtacggtggg
aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga cgccatccac
gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc cgggaacggt
gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat agactctata
ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc ctatacaccc
ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg ttattgacca
ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata acatggctct
ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag actgacacgg
actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca tatacaacaa
cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca cgcgaatctc
gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca catccgagcc
ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc taacagtgga
ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca aggccgtggc
ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg acgcagatgg
aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat tctgataaga
gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt
actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca gactgttcct
ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac catgagcctt
ctaaccgagg tcgaaacgta 1680tgttctctct atcgttccat caggccccct caaagccgaa
atcgcgcaga gacttgaaga 1740tgtctttgct gggaaaaaca cagatcttga ggctctcatg
gaatggctaa agacaagacc 1800aatcctgtca cctctgacta aggggatttt ggggtttgtg
ttcacgctca ccgtgcccag 1860tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc
ctcaatggga atggggatcc 1920aaataacatg gacagagcag ttaaactata tagaaaactt
aagagggaga ttacattcca 1980tggggccaaa gaaatagcac tcagttattc tgctggtgca
cttgccagtt gcatgggcct 2040catatacaac agaatggggg ctgtaaccac tgaagtggcc
tttggcctgg tatgtgcaac 2100atgtgaacag attgctgact cccagcacag gtctcatagg
caaatggtgg caacaaccaa 2160tccattaata aggcatgaga acagaatggt tttggccagc
actacagcta aggctatgga 2220gcaaatggct ggatcaagtg agcaggcagc ggaggccatg
gaaattgcta gtcaggccag 2280gcaaatggtg caggcaatga gagccattgg gactcatcct
agctccagtg ctggtctaaa 2340agatgatctt cttgaaaatt tgcagaccta tcagaaacga
atgggggtgc agatgcaacg 2400attcaagtga cccgcttgtt gttgctgcga gtatcattgg
gatcttgcac ttgatattgt 2460ggattcttga tcgtcttttt ttcaaatgca tctatcgact
cttcaaacac ggtctgaaaa 2520gagggccttc tacggaagga gtacctgagt ctatgaggga
agaatatcga aaggaacagc 2580agaatgctgt ggatgctgac gacagtcatt ttgtcagcat
agagctggag taatcagtcg 2640accacgtgtg atccagatct acttctggct aataaaagat
cagagctcta gagatctgtg 2700tgttggtttt ttgtgtggta ctcttccgct tcctcgctca
ctgactcgct gcgctcggtc 2760gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg
taatacggtt atccacagaa 2820tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc
agcaaaaggc caggaaccgt 2880aaaaaggccg cgttgctggc gtttttccat aggctccgcc
cccctgacga gcatcacaaa 2940aatcgacgct caagtcagag gtggcgaaac ccgacaggac
tataaagata ccaggcgttt 3000ccccctggaa gctccctcgt gcgctctcct gttccgaccc
tgccgcttac cggatacctg 3060tccgcctttc tcccttcggg aagcgtggcg ctttctcata
gctcacgctg taggtatctc 3120agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc
acgaaccccc cgttcagccc 3180gaccgctgcg ccttatccgg taactatcgt cttgagtcca
acccggtaag acacgactta 3240tcgccactgg cagcagccac tggtaacagg attagcagag
cgaggtatgt aggcggtgct 3300acagagttct tgaagtggtg gcctaactac ggctacacta
gaagaacagt atttggtatc 3360tgcgctctgc tgaagccagt taccttcgga aaaagagttg
gtagctcttg atccggcaaa 3420caaaccaccg ctggtagcgg tggttttttt gtttgcaagc
agcagattac gcgcagaaaa 3480aaaggatctc aagaagatcc tttgatcttt tctacggggt
ctgacgctca gtggaacgaa 3540aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa
ggatcttcac ctagatcctt 3600ttaaattaaa aatgaagttt taaatcaatc taaagtatat
atgagtaaac ttggtctgac 3660agttaccaat gcttaatcag tgaggcacct atctcagcga
tctgtctatt tcgttcatcc 3720atagttgcct gactcggggg gggggggcgc tgaggtctgc
ctcgtgaaga aggtgttgct 3780gactcatacc aggcctgaat cgccccatca tccagccaga
aagtgaggga gccacggttg 3840atgagagctt tgttgtaggt ggaccagttg gtgattttga
acttttgctt tgccacggaa 3900cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca
actcagcaaa agttcgattt 3960attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct
ctgccagtgt tacaaccaat 4020taaccaattc tgattagaaa aactcatcga gcatcaaatg
aaactgcaat ttattcatat 4080caggattatc aataccatat ttttgaaaaa gccgtttctg
taatgaagga gaaaactcac 4140cgaggcagtt ccataggatg gcaagatcct ggtatcggtc
tgcgattccg actcgtccaa 4200catcaataca acctattaat ttcccctcgt caaaaataag
gttatcaagt gagaaatcac 4260catgagtgac gactgaatcc ggtgagaatg gcaaaagctt
atgcatttct ttccagactt 4320gttcaacagg ccagccatta cgctcgtcat caaaatcact
cgcatcaacc aaaccgttat 4380tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc
gctgttaaaa ggacaattac 4440aaacaggaat cgaatgcaac cggcgcagga acactgccag
cgcatcaaca atattttcac 4500ctgaatcagg atattcttct aatacctgga atgctgtttt
cccggggatc gcagtggtga 4560gtaaccatgc atcatcagga gtacggataa aatgcttgat
ggtcggaaga ggcataaatt 4620ccgtcagcca gtttagtctg accatctcat ctgtaacatc
attggcaacg ctacctttgc 4680catgtttcag aaacaactct ggcgcatcgg gcttcccata
caatcgatag attgtcgcac 4740ctgattgccc gacattatcg cgagcccatt tatacccata
taaatcagca tccatgttgg 4800aatttaatcg cggcctcgag caagacgttt cccgttgaat
atggctcata acaccccttg 4860tattactgtt tatgtaagca gacagtttta ttgttcatga
tgatatattt ttatcttgtg 4920caatgtaaca tcagagattt tgagacacta tgcggtgtga
aataccgcac agatgcgtaa 4980ggagaaaata ccgcatcaga ttggctattg gctgctccct
gcttgtgtgt tggaggtcgc 5040tgagtagtgc gcgagcaaaa tttaagctac aacaaggcaa
ggcttgaccg acaattgcat 5100gaagaatctg cttagggtta ggcgttttgc gctgcttcgc
gatgtacggg ccagatatac 5160gcgtatctga ggggactagg gtgtgtttag gcgaaaagcg
gggcttcggt tgtacgcggt 5220taggagtccc ctcaggatat agtagtttcg cttttgcata
gggaggggga aatgtagtct 5280tatgcaatac tcttgtagtc ttgcaacatg gtaacgatga
gttagcaaca tgccttacaa 5340ggagagaaaa agcaccgtgc atgccgattg gtggaagtaa
ggtggtacga tcgtgcctta 5400ttaggaaggc aacagacggg tctgacatgg attggacgaa
ccactgaatt ccgcattgca 5460gagatattgt atttaagtgc ctagctcgat acaataaacg
ccatttgacc attcaccaca 5520ttggtgtgca cctccatcgg ctcgcatctc tccttcacgc
gcccgccgcc ctacctgagg 5580ccgccatcca cgccggttga gtcgcgttct gccgcctccc
gcctgtggtg cctcctgaac 5640tgcgtccgcc gtctaggtaa gtttaaagct caggtcgaga
ccgggccttt gtccggcgct 5700cccttggagc ctacctagac tcagccggct ctccacgctt
tgcctgaccc tgcttgctca 5760actctagtta acggtggagg gcagtgtagt ctgagcagta
ctcgttgctg ccgcgcgcgc 5820caccagacat aatagctgac agactaacag actgttcctt
tccatgggtc ttttctgcag 5880tcaccgtcgt cggatatcga attcgccacc atggccagcc
agggcaccaa gagaagctac 5940gagcagatgg agaccgacgg cgagagacag aacgccaccg
agatcagagc cagcgtgggc 6000aagatgatcg acggcatcgg cagattctac atccagatgt
gcaccgagct gaagctgagc 6060gactacgagg gcagactgat ccagaacagc ctgaccatcg
agagaatggt gctgagcgcc 6120ttcgacgaga gaagaaacag atacctggag gagcacccca
gcgccggcaa ggaccccaag 6180aagaccggcg gccccatcta cagaagagtg gacggcaagt
ggatgagaga gctggtgctg 6240tacgacaagg aggagatcag aagaatctgg agacaggcca
acaacggcga ggacgccacc 6300gccggcctga cccacatgat gatctggcac agcaacctga
acgacaccac ctaccagaga 6360accagagccc tggtgcggac cggcatggac cccagaatgt
gcagcctgat gcagggcagc 6420accctgccca gaagaagcgg cgccgccggc gccgccgtga
agggcatcgg caccatggtg 6480atggagctga tcagaatgat caagagaggc atcaacgaca
gaaacttctg gagaggcgag 6540aacggcagaa agaccagaag cgcctacgag agaatgtgca
acatcctgaa gggcaagttc 6600cagaccgccg cccagagagc catgatggac caggtccggg
agagcagaaa ccccggcaac 6660gccgagatcg aggacctgat cttcctggcc agaagcgccc
tgatcctgag aggcagcgtg 6720gcccacaaga gctgcctgcc cgcctgcgtg tacggccccg
ccgtgagcag cggctacgac 6780ttcgagaagg agggctacag cctggtgggc atcgacccct
tcaagctgct gcagaacagc 6840caggtgtaca gcctgatcag acccaacgag aaccccgccc
acaagagcca gctggtgtgg 6900atggcctgcc acagcgccgc cttcgaggac ctgagactgc
tgagcttcat cagaggcacc 6960aaggtgtccc ccagaggcaa gctgagcacc agaggcgtgc
agatcgccag caacgagaac 7020atggacaaca tgggcagcag caccctggag ctgagaagca
gatactgggc catcagaacc 7080agaagcggcg gcaacaccaa ccagcagaga gccagcgccg
gccagatcag cgtgcagccc 7140accttcagcg tgcagagaaa cctgcccttc gagaagagca
ccgtgatggc cgccttcacc 7200ggcaacaccg agggcagaac cagcgacatg agagccgaga
tcatcagaat gatggagggc 7260gccaagcccg aggaggtgtc cttcagaggc agaggcgtgt
tcgagctgag cgacgagaag 7320gccaccaacc ccatcgtgcc tagcttcgac atgagcaacg
agggcagcta cttcttcggc 7380gacaacgccg aggagtacga caactgatca gtcgaccacg
tgtgatccag atctgctgtg 7440ccttctagtt gccagccatc tgttgtttgc ccctcccccg
tgccttcctt gaccctggaa 7500ggtgccactc ccactgtcct ttcctaataa aatgaggaaa
ttgcatcgca ttgtctgagt 7560aggtgtcatt ctattctggg gggtggggtg gggcaggaca
gcaaggggga ggattgggaa 7620gacaatagca ggcatgctgg ggatgcggtg ggctctatgg
gtacccaggt gctgaagaat 7680tgacccggtt cctcctgggc cagaaagaag caggcacatc
cccttctctg tgacacaccc 7740tgtccacgcc cctggttctt agttccagcc ccactcatag
gacactcata gctcaggagg 7800gctccgcctt caatcccacc cgctaaagta cttggagcgg
tctctccctc cctcatcagc 7860ccaccaaacc aaacctagcc tccaagagtg ggaagaaatt
aaagcaagat aggctattaa 7920gtgcagaggg agagaaaatg cctccaacat gtgaggaagt
aatgagagaa atcatagaat 7980tttaaggcca tgatttaagg ccagtggctt tccccccccc
cccattattg aagcatttat 8040cagggttatt gtctcatgag cggatacata tttgaatgta
tttagaaaaa taaacaaata 8100ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg
tctaagaaac cattattatc 8160atgacattaa cctataaaaa taggcgtatc acgaggccct
ttcgtctcgc gcgtttcggt 8220gatgacggtg aaaacctctg acacatgcag ctcccggaga
cggtcacagc ttgtctgtaa 8280gcggatgccg ggagcagaca agcccgtcag ggcgcgtcag
cgggtgttgg cgggtgtcgg 8340ggctggctta actatgcggc atcagagcag attgtactga
gagtgcacca tatgcggtgt 8400gaaataccgc acagatgcgt aaggagaaaa taccgcatca
gattggctat 8450
SEQ ID NO: 105, length 8450, DNA, Artificial sequence: VR4774, Ligation of Inverted RSV RNP into VR4756
tggccattgc atacgttgta tccatatcat
aatatgtaca tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg
actagttatt aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc
cgcgttacat aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca
ttgacgtcaa taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt
caatgggtgg agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg
ccaagtacgc cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag
tacatgacct tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt
accatggtga tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg
ggatttccaa gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa
cgggactttc caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt
gtacggtggg aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga
cgccatccac gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc
cgggaacggt gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat
agactctata ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc
ctatacaccc ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg
ttattgacca ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata
acatggctct ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag
actgacacgg actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca
tatacaacaa cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca
cgcgaatctc gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca
catccgagcc ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc
taacagtgga ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca
aggccgtggc ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg
acgcagatgg aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat
tctgataaga gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag
tctgagcagt actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca
gactgttcct ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac
catgagcctt ctaaccgagg tcgaaacgta 1680tgttctctct atcgttccat caggccccct
caaagccgaa atcgcgcaga gacttgaaga 1740tgtctttgct gggaaaaaca cagatcttga
ggctctcatg gaatggctaa agacaagacc 1800aatcctgtca cctctgacta aggggatttt
ggggtttgtg ttcacgctca ccgtgcccag 1860tgagcgagga ctgcagcgta gacgctttgt
ccaaaatgcc ctcaatggga atggggatcc 1920aaataacatg gacagagcag ttaaactata
tagaaaactt aagagggaga ttacattcca 1980tggggccaaa gaaatagcac tcagttattc
tgctggtgca cttgccagtt gcatgggcct 2040catatacaac agaatggggg ctgtaaccac
tgaagtggcc tttggcctgg tatgtgcaac 2100atgtgaacag attgctgact cccagcacag
gtctcatagg caaatggtgg caacaaccaa 2160tccattaata aggcatgaga acagaatggt
tttggccagc actacagcta aggctatgga 2220gcaaatggct ggatcaagtg agcaggcagc
ggaggccatg gaaattgcta gtcaggccag 2280gcaaatggtg caggcaatga gagccattgg
gactcatcct agctccagtg ctggtctaaa 2340agatgatctt cttgaaaatt tgcagaccta
tcagaaacga atgggggtgc agatgcaacg 2400attcaagtga cccgcttgtt gttgctgcga
gtatcattgg gatcttgcac ttgatattgt 2460ggattcttga tcgtcttttt ttcaaatgca
tctatcgact cttcaaacac ggtctgaaaa 2520gagggccttc tacggaagga gtacctgagt
ctatgaggga agaatatcga aaggaacagc 2580agaatgctgt ggatgctgac gacagtcatt
ttgtcagcat agagctggag taatcagtcg 2640accacgtgtg atccagatct acttctggct
aataaaagat cagagctcta gagatctgtg 2700tgttggtttt ttgtgtggta ctcttccgct
tcctcgctca ctgactcgct gcgctcggtc 2760gttcggctgc ggcgagcggt atcagctcac
tcaaaggcgg taatacggtt atccacagaa 2820tcaggggata acgcaggaaa gaacatgtga
gcaaaaggcc agcaaaaggc caggaaccgt 2880aaaaaggccg cgttgctggc gtttttccat
aggctccgcc cccctgacga gcatcacaaa 2940aatcgacgct caagtcagag gtggcgaaac
ccgacaggac tataaagata ccaggcgttt 3000ccccctggaa gctccctcgt gcgctctcct
gttccgaccc tgccgcttac cggatacctg 3060tccgcctttc tcccttcggg aagcgtggcg
ctttctcata gctcacgctg taggtatctc 3120agttcggtgt aggtcgttcg ctccaagctg
ggctgtgtgc acgaaccccc cgttcagccc 3180gaccgctgcg ccttatccgg taactatcgt
cttgagtcca acccggtaag acacgactta 3240tcgccactgg cagcagccac tggtaacagg
attagcagag cgaggtatgt aggcggtgct 3300acagagttct tgaagtggtg gcctaactac
ggctacacta gaagaacagt atttggtatc 3360tgcgctctgc tgaagccagt taccttcgga
aaaagagttg gtagctcttg atccggcaaa 3420caaaccaccg ctggtagcgg tggttttttt
gtttgcaagc agcagattac gcgcagaaaa 3480aaaggatctc aagaagatcc tttgatcttt
tctacggggt ctgacgctca gtggaacgaa 3540aactcacgtt aagggatttt ggtcatgaga
ttatcaaaaa ggatcttcac ctagatcctt 3600ttaaattaaa aatgaagttt taaatcaatc
taaagtatat atgagtaaac ttggtctgac 3660agttaccaat gcttaatcag tgaggcacct
atctcagcga tctgtctatt tcgttcatcc 3720atagttgcct gactcggggg gggggggcgc
tgaggtctgc ctcgtgaaga aggtgttgct 3780gactcatacc aggcctgaat cgccccatca
tccagccaga aagtgaggga gccacggttg 3840atgagagctt tgttgtaggt ggaccagttg
gtgattttga acttttgctt tgccacggaa 3900cggtctgcgt tgtcgggaag atgcgtgatc
tgatccttca actcagcaaa agttcgattt 3960attcaacaaa gccgccgtcc cgtcaagtca
gcgtaatgct ctgccagtgt tacaaccaat 4020taaccaattc tgattagaaa aactcatcga
gcatcaaatg aaactgcaat ttattcatat 4080caggattatc aataccatat ttttgaaaaa
gccgtttctg taatgaagga gaaaactcac 4140cgaggcagtt ccataggatg gcaagatcct
ggtatcggtc tgcgattccg actcgtccaa 4200catcaataca acctattaat ttcccctcgt
caaaaataag gttatcaagt gagaaatcac 4260catgagtgac gactgaatcc ggtgagaatg
gcaaaagctt atgcatttct ttccagactt 4320gttcaacagg ccagccatta cgctcgtcat
caaaatcact cgcatcaacc aaaccgttat 4380tcattcgtga ttgcgcctga gcgagacgaa
atacgcgatc gctgttaaaa ggacaattac 4440aaacaggaat cgaatgcaac cggcgcagga
acactgccag cgcatcaaca atattttcac 4500ctgaatcagg atattcttct aatacctgga
atgctgtttt cccggggatc gcagtggtga 4560gtaaccatgc atcatcagga gtacggataa
aatgcttgat ggtcggaaga ggcataaatt 4620ccgtcagcca gtttagtctg accatctcat
ctgtaacatc attggcaacg ctacctttgc 4680catgtttcag aaacaactct ggcgcatcgg
gcttcccata caatcgatag attgtcgcac 4740ctgattgccc gacattatcg cgagcccatt
tatacccata taaatcagca tccatgttgg 4800aatttaatcg cggcctcgag caagacgttt
cccgttgaat atggctcata acaccccttg 4860tattactgtt tatgtaagca gacagtttta
ttgttcatga tgatatattt ttatcttgtg 4920caatgtaaca tcagagattt tgagacactg
gccttaaatc atggccttaa aattctatga 4980tttctctcat tacttcctca catgttggag
gcattttctc tccctctgca cttaatagcc 5040tatcttgctt taatttcttc ccactcttgg
aggctaggtt tggtttggtg ggctgatgag 5100ggagggagag accgctccaa gtactttagc
gggtgggatt gaaggcggag ccctcctgag 5160ctatgagtgt cctatgagtg gggctggaac
taagaaccag gggcgtggac agggtgtgtc 5220acagagaagg ggatgtgcct gcttctttct
ggcccaggag gaaccgggtc aattcttcag 5280cacctgggta cccatagagc ccaccgcatc
cccagcatgc ctgctattgt cttcccaatc 5340ctcccccttg ctgtcctgcc ccaccccacc
ccccagaata gaatgacacc tactcagaca 5400atgcgatgca atttcctcat tttattagga
aaggacagtg ggagtggcac cttccagggt 5460caaggaaggc acgggggagg ggcaaacaac
agatggctgg caactagaag gcacagcaga 5520tctggatcac acgtggtcga ctgatcagtt
gtcgtactcc tcggcgttgt cgccgaagaa 5580gtagctgccc tcgttgctca tgtcgaagct
aggcacgatg gggttggtgg ccttctcgtc 5640gctcagctcg aacacgcctc tgcctctgaa
ggacacctcc tcgggcttgg cgccctccat 5700cattctgatg atctcggctc tcatgtcgct
ggttctgccc tcggtgttgc cggtgaaggc 5760ggccatcacg gtgctcttct cgaagggcag
gtttctctgc acgctgaagg tgggctgcac 5820gctgatctgg ccggcgctgg ctctctgctg
gttggtgttg ccgccgcttc tggttctgat 5880ggcccagtat ctgcttctca gctccagggt
gctgctgccc atgttgtcca tgttctcgtt 5940gctggcgatc tgcacgcctc tggtgctcag
cttgcctctg ggggacacct tggtgcctct 6000gatgaagctc agcagtctca ggtcctcgaa
ggcggcgctg tggcaggcca tccacaccag 6060ctggctcttg tgggcggggt tctcgttggg
tctgatcagg ctgtacacct ggctgttctg 6120cagcagcttg aaggggtcga tgcccaccag
gctgtagccc tccttctcga agtcgtagcc 6180gctgctcacg gcggggccgt acacgcaggc
gggcaggcag ctcttgtggg ccacgctgcc 6240tctcaggatc agggcgcttc tggccaggaa
gatcaggtcc tcgatctcgg cgttgccggg 6300gtttctgctc tcccggacct ggtccatcat
ggctctctgg gcggcggtct ggaacttgcc 6360cttcaggatg ttgcacattc tctcgtaggc
gcttctggtc tttctgccgt tctcgcctct 6420ccagaagttt ctgtcgttga tgcctctctt
gatcattctg atcagctcca tcaccatggt 6480gccgatgccc ttcacggcgg cgccggcggc
gccgcttctt ctgggcaggg tgctgccctg 6540catcaggctg cacattctgg ggtccatgcc
ggtccgcacc agggctctgg ttctctggta 6600ggtggtgtcg ttcaggttgc tgtgccagat
catcatgtgg gtcaggccgg cggtggcgtc 6660ctcgccgttg ttggcctgtc tccagattct
tctgatctcc tccttgtcgt acagcaccag 6720ctctctcatc cacttgccgt ccactcttct
gtagatgggg ccgccggtct tcttggggtc 6780cttgccggcg ctggggtgct cctccaggta
tctgtttctt ctctcgtcga aggcgctcag 6840caccattctc tcgatggtca ggctgttctg
gatcagtctg ccctcgtagt cgctcagctt 6900cagctcggtg cacatctgga tgtagaatct
gccgatgccg tcgatcatct tgcccacgct 6960ggctctgatc tcggtggcgt tctgtctctc
gccgtcggtc tccatctgct cgtagcttct 7020cttggtgccc tggctggcca tggtggcgaa
ttcgatatcc gacgacggtg actgcagaaa 7080agacccatgg aaaggaacag tctgttagtc
tgtcagctat tatgtctggt ggcgcgcgcg 7140gcagcaacga gtactgctca gactacactg
ccctccaccg ttaactagag ttgagcaagc 7200agggtcaggc aaagcgtgga gagccggctg
agtctaggta ggctccaagg gagcgccgga 7260caaaggcccg gtctcgacct gagctttaaa
cttacctaga cggcggacgc agttcaggag 7320gcaccacagg cgggaggcgg cagaacgcga
ctcaaccggc gtggatggcg gcctcaggta 7380gggcggcggg cgcgtgaagg agagatgcga
gccgatggag gtgcacacca atgtggtgaa 7440tggtcaaatg gcgtttattg tatcgagcta
ggcacttaaa tacaatatct ctgcaatgcg 7500gaattcagtg gttcgtccaa tccatgtcag
acccgtctgt tgccttccta ataaggcacg 7560atcgtaccac cttacttcca ccaatcggca
tgcacggtgc tttttctctc cttgtaaggc 7620atgttgctaa ctcatcgtta ccatgttgca
agactacaag agtattgcat aagactacat 7680ttccccctcc ctatgcaaaa gcgaaactac
tatatcctga ggggactcct aaccgcgtac 7740aaccgaagcc ccgcttttcg cctaaacaca
ccctagtccc ctcagatacg cgtatatctg 7800gcccgtacat cgcgaagcag cgcaaaacgc
ctaaccctaa gcagattctt catgcaattg 7860tcggtcaagc cttgccttgt tgtagcttaa
attttgctcg cgcactactc agcgacctcc 7920aacacacaag cagggagcag ccaatagcca
atctgatgcg gtattttctc cttacgcatc 7980tgtgcggtat ttcacaccgc atagtggctt
tccccccccc cccattattg aagcatttat 8040cagggttatt gtctcatgag cggatacata
tttgaatgta tttagaaaaa taaacaaata 8100ggggttccgc gcacatttcc ccgaaaagtg
ccacctgacg tctaagaaac cattattatc 8160atgacattaa cctataaaaa taggcgtatc
acgaggccct ttcgtctcgc gcgtttcggt 8220gatgacggtg aaaacctctg acacatgcag
ctcccggaga cggtcacagc ttgtctgtaa 8280gcggatgccg ggagcagaca agcccgtcag
ggcgcgtcag cgggtgttgg cgggtgtcgg 8340ggctggctta actatgcggc atcagagcag
attgtactga gagtgcacca tatgcggtgt 8400gaaataccgc acagatgcgt aaggagaaaa
taccgcatca gattggctat 8450

SEQ ID NO: 106
Length: 8442
Type: DNA
Organism: Artificial sequence
Other information: VR4775, Ligation of RSV RSeg7 into VR4762
Sequence 106:
tggccattgc
atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60acattaccgc
catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120tcattagttc
atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180cctggctgac
cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240gtaacgccaa
tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300cacttggcag
tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360ggtaaatggc
ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420cagtacatct
acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480aatgggcgtg
gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540aatgggagtt
tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600gccccattga
cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660cgtttagtga
accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720agacaccggg
accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780cgtgccaaga
gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840atgcatgcta
tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900tgatggtata
gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960ggtgacgata
ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020ggctatatgc
caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080ggggtcccat
ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140gtttttatta
aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200ggctcttctc
cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260gctcatggtc
gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320caatgcccac
caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380atgagcgtgg
agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440aagaagatgc
aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500cggtgctgtt
aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560ccaccagaca
taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620gtcaccgtcg
tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta 1680cgagcagatg
gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740caagatgatc
gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800cgactacgag
ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860cttcgacgag
agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920gaagaccggc
ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 1980gtacgacaag
gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040cgccggcctg
acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100aaccagagcc
ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160caccctgccc
agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220gatggagctg
atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga 2280gaacggcaga
aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340ccagaccgcc
gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400cgccgagatc
gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460ggcccacaag
agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520cttcgagaag
gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580ccaggtgtac
agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg 2640gatggcctgc
cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700caaggtgtcc
cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760catggacaac
atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820cagaagcggc
ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880caccttcagc
gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940cggcaacacc
gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000cgccaagccc
gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa 3060ggccaccaac
cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120cgacaacgcc
gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180tggctaataa
aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240ccgcttcctc
gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3300ctcactcaaa
ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3360tgtgagcaaa
aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420tccataggct
ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480gaaacccgac
aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540ctcctgttcc
gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600tggcgctttc
tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660agctgggctg
tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720atcgtcttga
gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780acaggattag
cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840actacggcta
cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900tcggaaaaag
agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3960tttttgtttg
caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020tcttttctac
ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080tgagattatc
aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140caatctaaag
tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200cacctatctc
agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260ggcgctgagg
tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc 4320catcatccag
ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380agttggtgat
tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440tgatctgatc
cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500agtcagcgta
atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560atcgagcatc
aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620aaaaagccgt
ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680atcctggtat
cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc 4740ctcgtcaaaa
ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga 4800gaatggcaaa
agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860gtcatcaaaa
tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920acgaaatacg
cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980caggaacact
gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040ctggaatgct
gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100gataaaatgc
ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat 5160ctcatctgta
acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220atcgggcttc
ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280ccatttatac
ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340cgtttcccgt
tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400ttttattgtt
catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460cactatgcgg
tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcagattggc 5520tattggctgc
tccctgcttg tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa 5580gctacaacaa
ggcaaggctt gaccgacaat tgcatgaaga atctgcttag ggttaggcgt 5640tttgcgctgc
ttcgcgatgt acgggccaga tatacgcgta tctgagggga ctagggtgtg 5700tttaggcgaa
aagcggggct tcggttgtac gcggttagga gtcccctcag gatatagtag 5760tttcgctttt
gcatagggag ggggaaatgt agtcttatgc aatactcttg tagtcttgca 5820acatggtaac
gatgagttag caacatgcct tacaaggaga gaaaaagcac cgtgcatgcc 5880gattggtgga
agtaaggtgg tacgatcgtg ccttattagg aaggcaacag acgggtctga 5940catggattgg
acgaaccact gaattccgca ttgcagagat attgtattta agtgcctagc 6000tcgatacaat
aaacgccatt tgaccattca ccacattggt gtgcacctcc atcggctcgc 6060atctctcctt
cacgcgcccg ccgccctacc tgaggccgcc atccacgccg gttgagtcgc 6120gttctgccgc
ctcccgcctg tggtgcctcc tgaactgcgt ccgccgtcta ggtaagttta 6180aagctcaggt
cgagaccggg cctttgtccg gcgctccctt ggagcctacc tagactcagc 6240cggctctcca
cgctttgcct gaccctgctt gctcaactct agttaacggt ggagggcagt 6300gtagtctgag
cagtactcgt tgctgccgcg cgcgccacca gacataatag ctgacagact 6360aacagactgt
tcctttccat gggtcttttc tgcagtcacc gtcgtcggat atcgaattcg 6420ccaccatgag
ccttctaacc gaggtcgaaa cgtatgttct ctctatcgtt ccatcaggcc 6480ccctcaaagc
cgaaatcgcg cagagacttg aagatgtctt tgctgggaaa aacacagatc 6540ttgaggctct
catggaatgg ctaaagacaa gaccaatcct gtcacctctg actaagggga 6600ttttggggtt
tgtgttcacg ctcaccgtgc ccagtgagcg aggactgcag cgtagacgct 6660ttgtccaaaa
tgccctcaat gggaatgggg atccaaataa catggacaga gcagttaaac 6720tatatagaaa
acttaagagg gagattacat tccatggggc caaagaaata gcactcagtt 6780attctgctgg
tgcacttgcc agttgcatgg gcctcatata caacagaatg ggggctgtaa 6840ccactgaagt
ggcctttggc ctggtatgtg caacatgtga acagattgct gactcccagc 6900acaggtctca
taggcaaatg gtggcaacaa ccaatccatt aataaggcat gagaacagaa 6960tggttttggc
cagcactaca gctaaggcta tggagcaaat ggctggatca agtgagcagg 7020cagcggaggc
catggaaatt gctagtcagg ccaggcaaat ggtgcaggca atgagagcca 7080ttgggactca
tcctagctcc agtgctggtc taaaagatga tcttcttgaa aatttgcaga 7140cctatcagaa
acgaatgggg gtgcagatgc aacgattcaa gtgacccgct tgttgttgct 7200gcgagtatca
ttgggatctt gcacttgata ttgtggattc ttgatcgtct ttttttcaaa 7260tgcatctatc
gactcttcaa acacggtctg aaaagagggc cttctacgga aggagtacct 7320gagtctatga
gggaagaata tcgaaaggaa cagcagaatg ctgtggatgc tgacgacagt 7380cattttgtca
gcatagagct ggagtaatca gtcgagatcc agatctgctg tgccttctag 7440ttgccagcca
tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac 7500tcccactgtc
ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca 7560ttctattctg
gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag 7620caggcatgct
ggggatgcgg tgggctctat gggtacccag gtgctgaaga attgacccgg 7680ttcctcctgg
gccagaaaga agcaggcaca tccccttctc tgtgacacac cctgtccacg 7740cccctggttc
ttagttccag ccccactcat aggacactca tagctcagga gggctccgcc 7800ttcaatccca
cccgctaaag tacttggagc ggtctctccc tccctcatca gcccaccaaa 7860ccaaacctag
cctccaagag tgggaagaaa ttaaagcaag ataggctatt aagtgcagag 7920ggagagaaaa
tgcctccaac atgtgaggaa gtaatgagag aaatcataga attttaaggc 7980catgatttaa
ggccagtggc tttccccccc cccccattat tgaagcattt atcagggtta 8040ttgtctcatg
agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc 8100gcgcacattt
ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt 8160aacctataaa
aataggcgta tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg 8220tgaaaacctc
tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc 8280cgggagcaga
caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct 8340taactatgcg
gcatcagagc agattgtact gagagtgcac catatgcggt gtgaaatacc 8400gcacagatgc
gtaaggagaa aataccgcat cagattggct at 8442

SEQ ID NO: 107
Length: 8442
Type: DNA
Organism: Artificial sequence
Other information: VR4776, Ligation of Inverted RSV R Seg7 into VR4762
Sequence 107:
tggccattgc atacgttgta tccatatcat aatatgtaca
tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg actagttatt
aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc cgcgttacat
aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca ttgacgtcaa
taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt caatgggtgg
agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg ccaagtacgc
cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag tacatgacct
tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt accatggtga
tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa
gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa cgggactttc
caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt gtacggtggg
aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga cgccatccac
gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc cgggaacggt
gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat agactctata
ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc ctatacaccc
ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg ttattgacca
ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata acatggctct
ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag actgacacgg
actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca tatacaacaa
cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca cgcgaatctc
gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca catccgagcc
ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc taacagtgga
ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca aggccgtggc
ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg acgcagatgg
aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat tctgataaga
gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt
actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca gactgttcct
ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac catggccagc
cagggcacca agagaagcta 1680cgagcagatg gagaccgacg gcgagagaca gaacgccacc
gagatcagag ccagcgtggg 1740caagatgatc gacggcatcg gcagattcta catccagatg
tgcaccgagc tgaagctgag 1800cgactacgag ggcagactga tccagaacag cctgaccatc
gagagaatgg tgctgagcgc 1860cttcgacgag agaagaaaca gatacctgga ggagcacccc
agcgccggca aggaccccaa 1920gaagaccggc ggccccatct acagaagagt ggacggcaag
tggatgagag agctggtgct 1980gtacgacaag gaggagatca gaagaatctg gagacaggcc
aacaacggcg aggacgccac 2040cgccggcctg acccacatga tgatctggca cagcaacctg
aacgacacca cctaccagag 2100aaccagagcc ctggtgcgga ccggcatgga ccccagaatg
tgcagcctga tgcagggcag 2160caccctgccc agaagaagcg gcgccgccgg cgccgccgtg
aagggcatcg gcaccatggt 2220gatggagctg atcagaatga tcaagagagg catcaacgac
agaaacttct ggagaggcga 2280gaacggcaga aagaccagaa gcgcctacga gagaatgtgc
aacatcctga agggcaagtt 2340ccagaccgcc gcccagagag ccatgatgga ccaggtccgg
gagagcagaa accccggcaa 2400cgccgagatc gaggacctga tcttcctggc cagaagcgcc
ctgatcctga gaggcagcgt 2460ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc
gccgtgagca gcggctacga 2520cttcgagaag gagggctaca gcctggtggg catcgacccc
ttcaagctgc tgcagaacag 2580ccaggtgtac agcctgatca gacccaacga gaaccccgcc
cacaagagcc agctggtgtg 2640gatggcctgc cacagcgccg ccttcgagga cctgagactg
ctgagcttca tcagaggcac 2700caaggtgtcc cccagaggca agctgagcac cagaggcgtg
cagatcgcca gcaacgagaa 2760catggacaac atgggcagca gcaccctgga gctgagaagc
agatactggg ccatcagaac 2820cagaagcggc ggcaacacca accagcagag agccagcgcc
ggccagatca gcgtgcagcc 2880caccttcagc gtgcagagaa acctgccctt cgagaagagc
accgtgatgg ccgccttcac 2940cggcaacacc gagggcagaa ccagcgacat gagagccgag
atcatcagaa tgatggaggg 3000cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg
ttcgagctga gcgacgagaa 3060ggccaccaac cccatcgtgc ctagcttcga catgagcaac
gagggcagct acttcttcgg 3120cgacaacgcc gaggagtacg acaactgatc agtcgaccac
gtgtgatcca gatctacttc 3180tggctaataa aagatcagag ctctagagat ctgtgtgttg
gttttttgtg tggtactctt 3240ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg
gctgcggcga gcggtatcag 3300ctcactcaaa ggcggtaata cggttatcca cagaatcagg
ggataacgca ggaaagaaca 3360tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa
ggccgcgttg ctggcgtttt 3420tccataggct ccgcccccct gacgagcatc acaaaaatcg
acgctcaagt cagaggtggc 3480gaaacccgac aggactataa agataccagg cgtttccccc
tggaagctcc ctcgtgcgct 3540ctcctgttcc gaccctgccg cttaccggat acctgtccgc
ctttctccct tcgggaagcg 3600tggcgctttc tcatagctca cgctgtaggt atctcagttc
ggtgtaggtc gttcgctcca 3660agctgggctg tgtgcacgaa ccccccgttc agcccgaccg
ctgcgcctta tccggtaact 3720atcgtcttga gtccaacccg gtaagacacg acttatcgcc
actggcagca gccactggta 3780acaggattag cagagcgagg tatgtaggcg gtgctacaga
gttcttgaag tggtggccta 3840actacggcta cactagaaga acagtatttg gtatctgcgc
tctgctgaag ccagttacct 3900tcggaaaaag agttggtagc tcttgatccg gcaaacaaac
caccgctggt agcggtggtt 3960tttttgtttg caagcagcag attacgcgca gaaaaaaagg
atctcaagaa gatcctttga 4020tcttttctac ggggtctgac gctcagtgga acgaaaactc
acgttaaggg attttggtca 4080tgagattatc aaaaaggatc ttcacctaga tccttttaaa
ttaaaaatga agttttaaat 4140caatctaaag tatatatgag taaacttggt ctgacagtta
ccaatgctta atcagtgagg 4200cacctatctc agcgatctgt ctatttcgtt catccatagt
tgcctgactc gggggggggg 4260ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc
ataccaggcc tgaatcgccc 4320catcatccag ccagaaagtg agggagccac ggttgatgag
agctttgttg taggtggacc 4380agttggtgat tttgaacttt tgctttgcca cggaacggtc
tgcgttgtcg ggaagatgcg 4440tgatctgatc cttcaactca gcaaaagttc gatttattca
acaaagccgc cgtcccgtca 4500agtcagcgta atgctctgcc agtgttacaa ccaattaacc
aattctgatt agaaaaactc 4560atcgagcatc aaatgaaact gcaatttatt catatcagga
ttatcaatac catatttttg 4620aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg
cagttccata ggatggcaag 4680atcctggtat cggtctgcga ttccgactcg tccaacatca
atacaaccta ttaatttccc 4740ctcgtcaaaa ataaggttat caagtgagaa atcaccatga
gtgacgactg aatccggtga 4800gaatggcaaa agcttatgca tttctttcca gacttgttca
acaggccagc cattacgctc 4860gtcatcaaaa tcactcgcat caaccaaacc gttattcatt
cgtgattgcg cctgagcgag 4920acgaaatacg cgatcgctgt taaaaggaca attacaaaca
ggaatcgaat gcaaccggcg 4980caggaacact gccagcgcat caacaatatt ttcacctgaa
tcaggatatt cttctaatac 5040ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac
catgcatcat caggagtacg 5100gataaaatgc ttgatggtcg gaagaggcat aaattccgtc
agccagttta gtctgaccat 5160ctcatctgta acatcattgg caacgctacc tttgccatgt
ttcagaaaca actctggcgc 5220atcgggcttc ccatacaatc gatagattgt cgcacctgat
tgcccgacat tatcgcgagc 5280ccatttatac ccatataaat cagcatccat gttggaattt
aatcgcggcc tcgagcaaga 5340cgtttcccgt tgaatatggc tcataacacc ccttgtatta
ctgtttatgt aagcagacag 5400ttttattgtt catgatgata tatttttatc ttgtgcaatg
taacatcaga gattttgaga 5460cactggcctt aaatcatggc cttaaaattc tatgatttct
ctcattactt cctcacatgt 5520tggaggcatt ttctctccct ctgcacttaa tagcctatct
tgctttaatt tcttcccact 5580cttggaggct aggtttggtt tggtgggctg atgagggagg
gagagaccgc tccaagtact 5640ttagcgggtg ggattgaagg cggagccctc ctgagctatg
agtgtcctat gagtggggct 5700ggaactaaga accaggggcg tggacagggt gtgtcacaga
gaaggggatg tgcctgcttc 5760tttctggccc aggaggaacc gggtcaattc ttcagcacct
gggtacccat agagcccacc 5820gcatccccag catgcctgct attgtcttcc caatcctccc
ccttgctgtc ctgccccacc 5880ccacccccca gaatagaatg acacctactc agacaatgcg
atgcaatttc ctcattttat 5940taggaaagga cagtgggagt ggcaccttcc agggtcaagg
aaggcacggg ggaggggcaa 6000acaacagatg gctggcaact agaaggcaca gcagatctgg
atctcgactg attactccag 6060ctctatgctg acaaaatgac tgtcgtcagc atccacagca
ttctgctgtt cctttcgata 6120ttcttccctc atagactcag gtactccttc cgtagaaggc
cctcttttca gaccgtgttt 6180gaagagtcga tagatgcatt tgaaaaaaag acgatcaaga
atccacaata tcaagtgcaa 6240gatcccaatg atactcgcag caacaacaag cgggtcactt
gaatcgttgc atctgcaccc 6300ccattcgttt ctgataggtc tgcaaatttt caagaagatc
atcttttaga ccagcactgg 6360agctaggatg agtcccaatg gctctcattg cctgcaccat
ttgcctggcc tgactagcaa 6420tttccatggc ctccgctgcc tgctcacttg atccagccat
ttgctccata gccttagctg 6480tagtgctggc caaaaccatt ctgttctcat gccttattaa
tggattggtt gttgccacca 6540tttgcctatg agacctgtgc tgggagtcag caatctgttc
acatgttgca cataccaggc 6600caaaggccac ttcagtggtt acagccccca ttctgttgta
tatgaggccc atgcaactgg 6660caagtgcacc agcagaataa ctgagtgcta tttctttggc
cccatggaat gtaatctccc 6720tcttaagttt tctatatagt ttaactgctc tgtccatgtt
atttggatcc ccattcccat 6780tgagggcatt ttggacaaag cgtctacgct gcagtcctcg
ctcactgggc acggtgagcg 6840tgaacacaaa ccccaaaatc cccttagtca gaggtgacag
gattggtctt gtctttagcc 6900attccatgag agcctcaaga tctgtgtttt tcccagcaaa
gacatcttca agtctctgcg 6960cgatttcggc tttgaggggg cctgatggaa cgatagagag
aacatacgtt tcgacctcgg 7020ttagaaggct catggtggcg aattcgatat ccgacgacgg
tgactgcaga aaagacccat 7080ggaaaggaac agtctgttag tctgtcagct attatgtctg
gtggcgcgcg cggcagcaac 7140gagtactgct cagactacac tgccctccac cgttaactag
agttgagcaa gcagggtcag 7200gcaaagcgtg gagagccggc tgagtctagg taggctccaa
gggagcgccg gacaaaggcc 7260cggtctcgac ctgagcttta aacttaccta gacggcggac
gcagttcagg aggcaccaca 7320ggcgggaggc ggcagaacgc gactcaaccg gcgtggatgg
cggcctcagg tagggcggcg 7380ggcgcgtgaa ggagagatgc gagccgatgg aggtgcacac
caatgtggtg aatggtcaaa 7440tggcgtttat tgtatcgagc taggcactta aatacaatat
ctctgcaatg cggaattcag 7500tggttcgtcc aatccatgtc agacccgtct gttgccttcc
taataaggca cgatcgtacc 7560accttacttc caccaatcgg catgcacggt gctttttctc
tccttgtaag gcatgttgct 7620aactcatcgt taccatgttg caagactaca agagtattgc
ataagactac atttccccct 7680ccctatgcaa aagcgaaact actatatcct gaggggactc
ctaaccgcgt acaaccgaag 7740ccccgctttt cgcctaaaca caccctagtc ccctcagata
cgcgtatatc tggcccgtac 7800atcgcgaagc agcgcaaaac gcctaaccct aagcagattc
ttcatgcaat tgtcggtcaa 7860gccttgcctt gttgtagctt aaattttgct cgcgcactac
tcagcgacct ccaacacaca 7920agcagggagc agccaatagc caatctgatg cggtattttc
tccttacgca tctgtgcggt 7980atttcacacc gcatagtggc tttccccccc cccccattat
tgaagcattt atcagggtta 8040ttgtctcatg agcggataca tatttgaatg tatttagaaa
aataaacaaa taggggttcc 8100gcgcacattt ccccgaaaag tgccacctga cgtctaagaa
accattatta tcatgacatt 8160aacctataaa aataggcgta tcacgaggcc ctttcgtctc
gcgcgtttcg gtgatgacgg 8220tgaaaacctc tgacacatgc agctcccgga gacggtcaca
gcttgtctgt aagcggatgc 8280cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt
ggcgggtgtc ggggctggct 8340taactatgcg gcatcagagc agattgtact gagagtgcac
catatgcggt gtgaaatacc 8400gcacagatgc gtaaggagaa aataccgcat cagattggct
at 8442

SEQ ID NO: 108
Length: 7754
Type: DNA
Organism: Artificial sequence
Other information: VR4777, Ligation of RSVRM2 into VR4762
Sequence 108:
tggccattgc atacgttgta tccatatcat aatatgtaca
tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg actagttatt
aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc cgcgttacat
aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca ttgacgtcaa
taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt caatgggtgg
agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg ccaagtacgc
cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag tacatgacct
tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt accatggtga
tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg ggatttccaa
gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa cgggactttc
caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt gtacggtggg
aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga cgccatccac
gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc cgggaacggt
gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat agactctata
ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc ctatacaccc
ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg ttattgacca
ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata acatggctct
ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag actgacacgg
actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca tatacaacaa
cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca cgcgaatctc
gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca catccgagcc
ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc taacagtgga
ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca aggccgtggc
ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg acgcagatgg
aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat tctgataaga
gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag tctgagcagt
actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca gactgttcct
ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac catggccagc
cagggcacca agagaagcta 1680cgagcagatg gagaccgacg gcgagagaca gaacgccacc
gagatcagag ccagcgtggg 1740caagatgatc gacggcatcg gcagattcta catccagatg
tgcaccgagc tgaagctgag 1800cgactacgag ggcagactga tccagaacag cctgaccatc
gagagaatgg tgctgagcgc 1860cttcgacgag agaagaaaca gatacctgga ggagcacccc
agcgccggca aggaccccaa 1920gaagaccggc ggccccatct acagaagagt ggacggcaag
tggatgagag agctggtgct 1980gtacgacaag gaggagatca gaagaatctg gagacaggcc
aacaacggcg aggacgccac 2040cgccggcctg acccacatga tgatctggca cagcaacctg
aacgacacca cctaccagag 2100aaccagagcc ctggtgcgga ccggcatgga ccccagaatg
tgcagcctga tgcagggcag 2160caccctgccc agaagaagcg gcgccgccgg cgccgccgtg
aagggcatcg gcaccatggt 2220gatggagctg atcagaatga tcaagagagg catcaacgac
agaaacttct ggagaggcga 2280gaacggcaga aagaccagaa gcgcctacga gagaatgtgc
aacatcctga agggcaagtt 2340ccagaccgcc gcccagagag ccatgatgga ccaggtccgg
gagagcagaa accccggcaa 2400cgccgagatc gaggacctga tcttcctggc cagaagcgcc
ctgatcctga gaggcagcgt 2460ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc
gccgtgagca gcggctacga 2520cttcgagaag gagggctaca gcctggtggg catcgacccc
ttcaagctgc tgcagaacag 2580ccaggtgtac agcctgatca gacccaacga gaaccccgcc
cacaagagcc agctggtgtg 2640gatggcctgc cacagcgccg ccttcgagga cctgagactg
ctgagcttca tcagaggcac 2700caaggtgtcc cccagaggca agctgagcac cagaggcgtg
cagatcgcca gcaacgagaa 2760catggacaac atgggcagca gcaccctgga gctgagaagc
agatactggg ccatcagaac 2820cagaagcggc ggcaacacca accagcagag agccagcgcc
ggccagatca gcgtgcagcc 2880caccttcagc gtgcagagaa acctgccctt cgagaagagc
accgtgatgg ccgccttcac 2940cggcaacacc gagggcagaa ccagcgacat gagagccgag
atcatcagaa tgatggaggg 3000cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg
ttcgagctga gcgacgagaa 3060ggccaccaac cccatcgtgc ctagcttcga catgagcaac
gagggcagct acttcttcgg 3120cgacaacgcc gaggagtacg acaactgatc agtcgaccac
gtgtgatcca gatctacttc 3180tggctaataa aagatcagag ctctagagat ctgtgtgttg
gttttttgtg tggtactctt 3240ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg
gctgcggcga gcggtatcag 3300ctcactcaaa ggcggtaata cggttatcca cagaatcagg
ggataacgca ggaaagaaca 3360tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa
ggccgcgttg ctggcgtttt 3420tccataggct ccgcccccct gacgagcatc acaaaaatcg
acgctcaagt cagaggtggc 3480gaaacccgac aggactataa agataccagg cgtttccccc
tggaagctcc ctcgtgcgct 3540ctcctgttcc gaccctgccg cttaccggat acctgtccgc
ctttctccct tcgggaagcg 3600tggcgctttc tcatagctca cgctgtaggt atctcagttc
ggtgtaggtc gttcgctcca 3660agctgggctg tgtgcacgaa ccccccgttc agcccgaccg
ctgcgcctta tccggtaact 3720atcgtcttga gtccaacccg gtaagacacg acttatcgcc
actggcagca gccactggta 3780acaggattag cagagcgagg tatgtaggcg gtgctacaga
gttcttgaag tggtggccta 3840actacggcta cactagaaga acagtatttg gtatctgcgc
tctgctgaag ccagttacct 3900tcggaaaaag agttggtagc tcttgatccg gcaaacaaac
caccgctggt agcggtggtt 3960tttttgtttg caagcagcag attacgcgca gaaaaaaagg
atctcaagaa gatcctttga 4020tcttttctac ggggtctgac gctcagtgga acgaaaactc
acgttaaggg attttggtca 4080tgagattatc aaaaaggatc ttcacctaga tccttttaaa
ttaaaaatga agttttaaat 4140caatctaaag tatatatgag taaacttggt ctgacagtta
ccaatgctta atcagtgagg 4200cacctatctc agcgatctgt ctatttcgtt catccatagt
tgcctgactc gggggggggg 4260ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc
ataccaggcc tgaatcgccc 4320catcatccag ccagaaagtg agggagccac ggttgatgag
agctttgttg taggtggacc 4380agttggtgat tttgaacttt tgctttgcca cggaacggtc
tgcgttgtcg ggaagatgcg 4440tgatctgatc cttcaactca gcaaaagttc gatttattca
acaaagccgc cgtcccgtca 4500agtcagcgta atgctctgcc agtgttacaa ccaattaacc
aattctgatt agaaaaactc 4560atcgagcatc aaatgaaact gcaatttatt catatcagga
ttatcaatac catatttttg 4620aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg
cagttccata ggatggcaag 4680atcctggtat cggtctgcga ttccgactcg tccaacatca
atacaaccta ttaatttccc 4740ctcgtcaaaa ataaggttat caagtgagaa atcaccatga
gtgacgactg aatccggtga 4800gaatggcaaa agcttatgca tttctttcca gacttgttca
acaggccagc cattacgctc 4860gtcatcaaaa tcactcgcat caaccaaacc gttattcatt
cgtgattgcg cctgagcgag 4920acgaaatacg cgatcgctgt taaaaggaca attacaaaca
ggaatcgaat gcaaccggcg 4980caggaacact gccagcgcat caacaatatt ttcacctgaa
tcaggatatt cttctaatac 5040ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac
catgcatcat caggagtacg 5100gataaaatgc ttgatggtcg gaagaggcat aaattccgtc
agccagttta gtctgaccat 5160ctcatctgta acatcattgg caacgctacc tttgccatgt
ttcagaaaca actctggcgc 5220atcgggcttc ccatacaatc gatagattgt cgcacctgat
tgcccgacat tatcgcgagc 5280ccatttatac ccatataaat cagcatccat gttggaattt
aatcgcggcc tcgagcaaga 5340cgtttcccgt tgaatatggc tcataacacc ccttgtatta
ctgtttatgt aagcagacag 5400ttttattgtt catgatgata tatttttatc ttgtgcaatg
taacatcaga gattttgaga 5460cactatgcgg tgtgaaatac cgcacagatg cgtaaggaga
aaataccgca tcagattggc 5520tattggctgc tccctgcttg tgtgttggag gtcgctgagt
agtgcgcgag caaaatttaa 5580gctacaacaa ggcaaggctt gaccgacaat tgcatgaaga
atctgcttag ggttaggcgt 5640tttgcgctgc ttcgcgatgt acgggccaga tatacgcgta
tctgagggga ctagggtgtg 5700tttaggcgaa aagcggggct tcggttgtac gcggttagga
gtcccctcag gatatagtag 5760tttcgctttt gcatagggag ggggaaatgt agtcttatgc
aatactcttg tagtcttgca 5820acatggtaac gatgagttag caacatgcct tacaaggaga
gaaaaagcac cgtgcatgcc 5880gattggtgga agtaaggtgg tacgatcgtg ccttattagg
aaggcaacag acgggtctga 5940catggattgg acgaaccact gaattccgca ttgcagagat
attgtattta agtgcctagc 6000tcgatacaat aaacgccatt tgaccattca ccacattggt
gtgcacctcc atcggctcgc 6060atctctcctt cacgcgcccg ccgccctacc tgaggccgcc
atccacgccg gttgagtcgc 6120gttctgccgc ctcccgcctg tggtgcctcc tgaactgcgt
ccgccgtcta ggtaagttta 6180aagctcaggt cgagaccggg cctttgtccg gcgctccctt
ggagcctacc tagactcagc 6240cggctctcca cgctttgcct gaccctgctt gctcaactct
agttaacggt ggagggcagt 6300gtagtctgag cagtactcgt tgctgccgcg cgcgccacca
gacataatag ctgacagact 6360aacagactgt tcctttccat gggtcttttc tgcagtcacc
gtcgtcggat atcgaattcg 6420ccaccatgag cctgctgacc gaggtggaga cccccatcag
aaacgagtgg ggctgcagat 6480gcaacgacag cagcgacccc ctggtggtgg ccgccagcat
catcggcatc ctgcacctga 6540tcctgtggat cctggacaga ctgttcttca agtgcatcta
cagactgttc aagcacggcc 6600tgaagagagg ccccagcacc gagggcgtgc ccgagagcat
gagagaggag tacagaaagg 6660agcagcagaa cgccgtggac gccgacgaca gccacttcgt
gagcatcgag ctggagtgat 6720cagtcgagat ccagatctgc tgtgccttct agttgccagc
catctgttgt ttgcccctcc 6780cccgtgcctt ccttgaccct ggaaggtgcc actcccactg
tcctttccta ataaaatgag 6840gaaattgcat cgcattgtct gagtaggtgt cattctattc
tggggggtgg ggtggggcag 6900gacagcaagg gggaggattg ggaagacaat agcaggcatg
ctggggatgc ggtgggctct 6960atgggtaccc aggtgctgaa gaattgaccc ggttcctcct
gggccagaaa gaagcaggca 7020catccccttc tctgtgacac accctgtcca cgcccctggt
tcttagttcc agccccactc 7080ataggacact catagctcag gagggctccg ccttcaatcc
cacccgctaa agtacttgga 7140gcggtctctc cctccctcat cagcccacca aaccaaacct
agcctccaag agtgggaaga 7200aattaaagca agataggcta ttaagtgcag agggagagaa
aatgcctcca acatgtgagg 7260aagtaatgag agaaatcata gaattttaag gccatgattt
aaggccagtg gctttccccc 7320cccccccatt attgaagcat ttatcagggt tattgtctca
tgagcggata catatttgaa 7380tgtatttaga aaaataaaca aataggggtt ccgcgcacat
ttccccgaaa agtgccacct 7440gacgtctaag aaaccattat tatcatgaca ttaacctata
aaaataggcg tatcacgagg 7500ccctttcgtc tcgcgcgttt cggtgatgac ggtgaaaacc
tctgacacat gcagctcccg 7560gagacggtca cagcttgtct gtaagcggat gccgggagca
gacaagcccg tcagggcgcg 7620tcagcgggtg ttggcgggtg tcggggctgg cttaactatg
cggcatcaga gcagattgta 7680ctgagagtgc accatatgcg gtgtgaaata ccgcacagat
gcgtaaggag aaaataccgc 7740atcagattgg ctat
7754
109 7754 DNA Artificial sequence
VR4778, Ligation of Inverted RSV RM2 into VR4762
109
tggccattgc atacgttgta tccatatcat
aatatgtaca tttatattgg ctcatgtcca 60acattaccgc catgttgaca ttgattattg
actagttatt aatagtaatc aattacgggg 120tcattagttc atagcccata tatggagttc
cgcgttacat aacttacggt aaatggcccg 180cctggctgac cgcccaacga cccccgccca
ttgacgtcaa taatgacgta tgttcccata 240gtaacgccaa tagggacttt ccattgacgt
caatgggtgg agtatttacg gtaaactgcc 300cacttggcag tacatcaagt gtatcatatg
ccaagtacgc cccctattga cgtcaatgac 360ggtaaatggc ccgcctggca ttatgcccag
tacatgacct tatgggactt tcctacttgg 420cagtacatct acgtattagt catcgctatt
accatggtga tgcggttttg gcagtacatc 480aatgggcgtg gatagcggtt tgactcacgg
ggatttccaa gtctccaccc cattgacgtc 540aatgggagtt tgttttggca ccaaaatcaa
cgggactttc caaaatgtcg taacaactcc 600gccccattga cgcaaatggg cggtaggcgt
gtacggtggg aggtctatat aagcagagct 660cgtttagtga accgtcagat cgcctggaga
cgccatccac gctgttttga cctccataga 720agacaccggg accgatccag cctccgcggc
cgggaacggt gcattggaac gcggattccc 780cgtgccaaga gtgacgtaag taccgcctat
agactctata ggcacacccc tttggctctt 840atgcatgcta tactgttttt ggcttggggc
ctatacaccc ccgcttcctt atgctatagg 900tgatggtata gcttagccta taggtgtggg
ttattgacca ttattgacca ctcccctatt 960ggtgacgata ctttccatta ctaatccata
acatggctct ttgccacaac tatctctatt 1020ggctatatgc caatactctg tccttcagag
actgacacgg actctgtatt tttacaggat 1080ggggtcccat ttattattta caaattcaca
tatacaacaa cgccgtcccc cgtgcccgca 1140gtttttatta aacatagcgt gggatctcca
cgcgaatctc gggtacgtgt tccggacatg 1200ggctcttctc cggtagcggc ggagcttcca
catccgagcc ctggtcccat gcctccagcg 1260gctcatggtc gctcggcagc tccttgctcc
taacagtgga ggccagactt aggcacagca 1320caatgcccac caccaccagt gtgccgcaca
aggccgtggc ggtagggtat gtgtctgaaa 1380atgagcgtgg agattgggct cgcacggctg
acgcagatgg aagacttaag gcagcggcag 1440aagaagatgc aggcagctga gttgttgtat
tctgataaga gtcagaggta actcccgttg 1500cggtgctgtt aacggtggag ggcagtgtag
tctgagcagt actcgttgct gccgcgcgcg 1560ccaccagaca taatagctga cagactaaca
gactgttcct ttccatgggt cttttctgca 1620gtcaccgtcg tcggatatcg aattcgccac
catggccagc cagggcacca agagaagcta 1680cgagcagatg gagaccgacg gcgagagaca
gaacgccacc gagatcagag ccagcgtggg 1740caagatgatc gacggcatcg gcagattcta
catccagatg tgcaccgagc tgaagctgag 1800cgactacgag ggcagactga tccagaacag
cctgaccatc gagagaatgg tgctgagcgc 1860cttcgacgag agaagaaaca gatacctgga
ggagcacccc agcgccggca aggaccccaa 1920gaagaccggc ggccccatct acagaagagt
ggacggcaag tggatgagag agctggtgct 1980gtacgacaag gaggagatca gaagaatctg
gagacaggcc aacaacggcg aggacgccac 2040cgccggcctg acccacatga tgatctggca
cagcaacctg aacgacacca cctaccagag 2100aaccagagcc ctggtgcgga ccggcatgga
ccccagaatg tgcagcctga tgcagggcag 2160caccctgccc agaagaagcg gcgccgccgg
cgccgccgtg aagggcatcg gcaccatggt 2220gatggagctg atcagaatga tcaagagagg
catcaacgac agaaacttct ggagaggcga 2280gaacggcaga aagaccagaa gcgcctacga
gagaatgtgc aacatcctga agggcaagtt 2340ccagaccgcc gcccagagag ccatgatgga
ccaggtccgg gagagcagaa accccggcaa 2400cgccgagatc gaggacctga tcttcctggc
cagaagcgcc ctgatcctga gaggcagcgt 2460ggcccacaag agctgcctgc ccgcctgcgt
gtacggcccc gccgtgagca gcggctacga 2520cttcgagaag gagggctaca gcctggtggg
catcgacccc ttcaagctgc tgcagaacag 2580ccaggtgtac agcctgatca gacccaacga
gaaccccgcc cacaagagcc agctggtgtg 2640gatggcctgc cacagcgccg ccttcgagga
cctgagactg ctgagcttca tcagaggcac 2700caaggtgtcc cccagaggca agctgagcac
cagaggcgtg cagatcgcca gcaacgagaa 2760catggacaac atgggcagca gcaccctgga
gctgagaagc agatactggg ccatcagaac 2820cagaagcggc ggcaacacca accagcagag
agccagcgcc ggccagatca gcgtgcagcc 2880caccttcagc gtgcagagaa acctgccctt
cgagaagagc accgtgatgg ccgccttcac 2940cggcaacacc gagggcagaa ccagcgacat
gagagccgag atcatcagaa tgatggaggg 3000cgccaagccc gaggaggtgt ccttcagagg
cagaggcgtg ttcgagctga gcgacgagaa 3060ggccaccaac cccatcgtgc ctagcttcga
catgagcaac gagggcagct acttcttcgg 3120cgacaacgcc gaggagtacg acaactgatc
agtcgaccac gtgtgatcca gatctacttc 3180tggctaataa aagatcagag ctctagagat
ctgtgtgttg gttttttgtg tggtactctt 3240ccgcttcctc gctcactgac tcgctgcgct
cggtcgttcg gctgcggcga gcggtatcag 3300ctcactcaaa ggcggtaata cggttatcca
cagaatcagg ggataacgca ggaaagaaca 3360tgtgagcaaa aggccagcaa aaggccagga
accgtaaaaa ggccgcgttg ctggcgtttt 3420tccataggct ccgcccccct gacgagcatc
acaaaaatcg acgctcaagt cagaggtggc 3480gaaacccgac aggactataa agataccagg
cgtttccccc tggaagctcc ctcgtgcgct 3540ctcctgttcc gaccctgccg cttaccggat
acctgtccgc ctttctccct tcgggaagcg 3600tggcgctttc tcatagctca cgctgtaggt
atctcagttc ggtgtaggtc gttcgctcca 3660agctgggctg tgtgcacgaa ccccccgttc
agcccgaccg ctgcgcctta tccggtaact 3720atcgtcttga gtccaacccg gtaagacacg
acttatcgcc actggcagca gccactggta 3780acaggattag cagagcgagg tatgtaggcg
gtgctacaga gttcttgaag tggtggccta 3840actacggcta cactagaaga acagtatttg
gtatctgcgc tctgctgaag ccagttacct 3900tcggaaaaag agttggtagc tcttgatccg
gcaaacaaac caccgctggt agcggtggtt 3960tttttgtttg caagcagcag attacgcgca
gaaaaaaagg atctcaagaa gatcctttga 4020tcttttctac ggggtctgac gctcagtgga
acgaaaactc acgttaaggg attttggtca 4080tgagattatc aaaaaggatc ttcacctaga
tccttttaaa ttaaaaatga agttttaaat 4140caatctaaag tatatatgag taaacttggt
ctgacagtta ccaatgctta atcagtgagg 4200cacctatctc agcgatctgt ctatttcgtt
catccatagt tgcctgactc gggggggggg 4260ggcgctgagg tctgcctcgt gaagaaggtg
ttgctgactc ataccaggcc tgaatcgccc 4320catcatccag ccagaaagtg agggagccac
ggttgatgag agctttgttg taggtggacc 4380agttggtgat tttgaacttt tgctttgcca
cggaacggtc tgcgttgtcg ggaagatgcg 4440tgatctgatc cttcaactca gcaaaagttc
gatttattca acaaagccgc cgtcccgtca 4500agtcagcgta atgctctgcc agtgttacaa
ccaattaacc aattctgatt agaaaaactc 4560atcgagcatc aaatgaaact gcaatttatt
catatcagga ttatcaatac catatttttg 4620aaaaagccgt ttctgtaatg aaggagaaaa
ctcaccgagg cagttccata ggatggcaag 4680atcctggtat cggtctgcga ttccgactcg
tccaacatca atacaaccta ttaatttccc 4740ctcgtcaaaa ataaggttat caagtgagaa
atcaccatga gtgacgactg aatccggtga 4800gaatggcaaa agcttatgca tttctttcca
gacttgttca acaggccagc cattacgctc 4860gtcatcaaaa tcactcgcat caaccaaacc
gttattcatt cgtgattgcg cctgagcgag 4920acgaaatacg cgatcgctgt taaaaggaca
attacaaaca ggaatcgaat gcaaccggcg 4980caggaacact gccagcgcat caacaatatt
ttcacctgaa tcaggatatt cttctaatac 5040ctggaatgct gttttcccgg ggatcgcagt
ggtgagtaac catgcatcat caggagtacg 5100gataaaatgc ttgatggtcg gaagaggcat
aaattccgtc agccagttta gtctgaccat 5160ctcatctgta acatcattgg caacgctacc
tttgccatgt ttcagaaaca actctggcgc 5220atcgggcttc ccatacaatc gatagattgt
cgcacctgat tgcccgacat tatcgcgagc 5280ccatttatac ccatataaat cagcatccat
gttggaattt aatcgcggcc tcgagcaaga 5340cgtttcccgt tgaatatggc tcataacacc
ccttgtatta ctgtttatgt aagcagacag 5400ttttattgtt catgatgata tatttttatc
ttgtgcaatg taacatcaga gattttgaga 5460cactggcctt aaatcatggc cttaaaattc
tatgatttct ctcattactt cctcacatgt 5520tggaggcatt ttctctccct ctgcacttaa
tagcctatct tgctttaatt tcttcccact 5580cttggaggct aggtttggtt tggtgggctg
atgagggagg gagagaccgc tccaagtact 5640ttagcgggtg ggattgaagg cggagccctc
ctgagctatg agtgtcctat gagtggggct 5700ggaactaaga accaggggcg tggacagggt
gtgtcacaga gaaggggatg tgcctgcttc 5760tttctggccc aggaggaacc gggtcaattc
ttcagcacct gggtacccat agagcccacc 5820gcatccccag catgcctgct attgtcttcc
caatcctccc ccttgctgtc ctgccccacc 5880ccacccccca gaatagaatg acacctactc
agacaatgcg atgcaatttc ctcattttat 5940taggaaagga cagtgggagt ggcaccttcc
agggtcaagg aaggcacggg ggaggggcaa 6000acaacagatg gctggcaact agaaggcaca
gcagatctgg atctcgactg atcactccag 6060ctcgatgctc acgaagtggc tgtcgtcggc
gtccacggcg ttctgctgct cctttctgta 6120ctcctctctc atgctctcgg gcacgccctc
ggtgctgggg cctctcttca ggccgtgctt 6180gaacagtctg tagatgcact tgaagaacag
tctgtccagg atccacagga tcaggtgcag 6240gatgccgatg atgctggcgg ccaccaccag
ggggtcgctg ctgtcgttgc atctgcagcc 6300ccactcgttt ctgatggggg tctccacctc
ggtcagcagg ctcatggtgg cgaattcgat 6360atccgacgac ggtgactgca gaaaagaccc
atggaaagga acagtctgtt agtctgtcag 6420ctattatgtc tggtggcgcg cgcggcagca
acgagtactg ctcagactac actgccctcc 6480accgttaact agagttgagc aagcagggtc
aggcaaagcg tggagagccg gctgagtcta 6540ggtaggctcc aagggagcgc cggacaaagg
cccggtctcg acctgagctt taaacttacc 6600tagacggcgg acgcagttca ggaggcacca
caggcgggag gcggcagaac gcgactcaac 6660cggcgtggat ggcggcctca ggtagggcgg
cgggcgcgtg aaggagagat gcgagccgat 6720ggaggtgcac accaatgtgg tgaatggtca
aatggcgttt attgtatcga gctaggcact 6780taaatacaat atctctgcaa tgcggaattc
agtggttcgt ccaatccatg tcagacccgt 6840ctgttgcctt cctaataagg cacgatcgta
ccaccttact tccaccaatc ggcatgcacg 6900gtgctttttc tctccttgta aggcatgttg
ctaactcatc gttaccatgt tgcaagacta 6960caagagtatt gcataagact acatttcccc
ctccctatgc aaaagcgaaa ctactatatc 7020ctgaggggac tcctaaccgc gtacaaccga
agccccgctt ttcgcctaaa cacaccctag 7080tcccctcaga tacgcgtata tctggcccgt
acatcgcgaa gcagcgcaaa acgcctaacc 7140ctaagcagat tcttcatgca attgtcggtc
aagccttgcc ttgttgtagc ttaaattttg 7200ctcgcgcact actcagcgac ctccaacaca
caagcaggga gcagccaata gccaatctga 7260tgcggtattt tctccttacg catctgtgcg
gtatttcaca ccgcatagtg gctttccccc 7320cccccccatt attgaagcat ttatcagggt
tattgtctca tgagcggata catatttgaa 7380tgtatttaga aaaataaaca aataggggtt
ccgcgcacat ttccccgaaa agtgccacct 7440gacgtctaag aaaccattat tatcatgaca
ttaacctata aaaataggcg tatcacgagg 7500ccctttcgtc tcgcgcgttt cggtgatgac
ggtgaaaacc tctgacacat gcagctcccg 7560gagacggtca cagcttgtct gtaagcggat
gccgggagca gacaagcccg tcagggcgcg 7620tcagcgggtg ttggcgggtg tcggggctgg
cttaactatg cggcatcaga gcagattgta 7680ctgagagtgc accatatgcg gtgtgaaata
ccgcacagat gcgtaaggag aaaataccgc 7740atcagattgg ctat
7754
110 7765 DNA Artificial sequence
VR4779, 7765 bps DNA Circular
110
tggtatgcgg tgtgaaatac cgcacagatg cgtaaggaga
aaataccgca tcagattggc 60tattggctgc tccctgcttg tgtgttggag gtcgctgagt
agtgcgcgag caaaatttaa 120gctacaacaa ggcaaggctt gaccgacaat tgcatgaaga
atctgcttag ggttaggcgt 180tttgcgctgc ttcgcgatgt acgggccaga tatacgcgta
tctgagggga ctagggtgtg 240tttaggcgaa aagcggggct tcggttgtac gcggttagga
gtcccctcag gatatagtag 300tttcgctttt gcatagggag ggggaaatgt agtcttatgc
aatactcttg tagtcttgca 360acatggtaac gatgagttag caacatgcct tacaaggaga
gaaaaagcac cgtgcatgcc 420gattggtgga agtaaggtgg tacgatcgtg ccttattagg
aaggcaacag acgggtctga 480catggattgg acgaaccact gaattccgca ttgcagagat
attgtattta agtgcctagc 540tcgatacaat aaacgccatt tgaccattca ccacattggt
gtgcacctcc atcggctcgc 600atctctcctt cacgcgcccg ccgccctacc tgaggccgcc
atccacgccg gttgagtcgc 660gttctgccgc ctcccgcctg tggtgcctcc tgaactgcgt
ccgccgtcta ggtaagttta 720aagctcaggt cgagaccggg cctttgtccg gcgctccctt
ggagcctacc tagactcagc 780cggctctcca cgctttgcct gaccctgctt gctcaactct
agttaacggt ggagggcagt 840gtagtctgag cagtactcgt tgctgccgcg cgcgccacca
gacataatag ctgacagact 900aacagactgt tcctttccat gggtcttttc tgcagtcacc
gtcgtcggat atcgaattcg 960ccaccatggc cagccagggc accaagagaa gctacgagca
gatggagacc gacggcgaga 1020gacagaacgc caccgagatc agagccagcg tgggcaagat
gatcgacggc atcggcagat 1080tctacatcca gatgtgcacc gagctgaagc tgagcgacta
cgagggcaga ctgatccaga 1140acagcctgac catcgagaga atggtgctga gcgccttcga
cgagagaaga aacagatacc 1200tggaggagca ccccagcgcc ggcaaggacc ccaagaagac
cggcggcccc atctacagaa 1260gagtggacgg caagtggatg agagagctgg tgctgtacga
caaggaggag atcagaagaa 1320tctggagaca ggccaacaac ggcgaggacg ccaccgccgg
cctgacccac atgatgatct 1380ggcacagcaa cctgaacgac accacctacc agagaaccag
agccctggtg cggaccggca 1440tggaccccag aatgtgcagc ctgatgcagg gcagcaccct
gcccagaaga agcggcgccg 1500ccggcgccgc cgtgaagggc atcggcacca tggtgatgga
gctgatcaga atgatcaaga 1560gaggcatcaa cgacagaaac ttctggagag gcgagaacgg
cagaaagacc agaagcgcct 1620acgagagaat gtgcaacatc ctgaagggca agttccagac
cgccgcccag agagccatga 1680tggaccaggt ccgggagagc agaaaccccg gcaacgccga
gatcgaggac ctgatcttcc 1740tggccagaag cgccctgatc ctgagaggca gcgtggccca
caagagctgc ctgcccgcct 1800gcgtgtacgg ccccgccgtg agcagcggct acgacttcga
gaaggagggc tacagcctgg 1860tgggcatcga ccccttcaag ctgctgcaga acagccaggt
gtacagcctg atcagaccca 1920acgagaaccc cgcccacaag agccagctgg tgtggatggc
ctgccacagc gccgccttcg 1980aggacctgag actgctgagc ttcatcagag gcaccaaggt
gtcccccaga ggcaagctga 2040gcaccagagg cgtgcagatc gccagcaacg agaacatgga
caacatgggc agcagcaccc 2100tggagctgag aagcagatac tgggccatca gaaccagaag
cggcggcaac accaaccagc 2160agagagccag cgccggccag atcagcgtgc agcccacctt
cagcgtgcag agaaacctgc 2220ccttcgagaa gagcaccgtg atggccgcct tcaccggcaa
caccgagggc agaaccagcg 2280acatgagagc cgagatcatc agaatgatgg agggcgccaa
gcccgaggag gtgtccttca 2340gaggcagagg cgtgttcgag ctgagcgacg agaaggccac
caaccccatc gtgcctagct 2400tcgacatgag caacgagggc agctacttct tcggcgacaa
cgccgaggag tacgacaact 2460gatcagtcga ccacgtgtga tccagatctg ctgtgccttc
tagttgccag ccatctgttg 2520tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc
cactcccact gtcctttcct 2580aataaaatga ggaaattgca tcgcattgtc tgagtaggtg
tcattctatt ctggggggtg 2640gggtggggca ggacagcaag ggggaggatt gggaagacaa
tagcaggcat gctggggatg 2700cggtgggctc tatgggtacc caggtgctga agaattgacc
cggttcctcc tgggccagaa 2760agaagcaggc acatcccctt ctctgtgaca caccctgtcc
acgcccctgg ttcttagttc 2820cagccccact cataggacac tcatagctca ggagggctcc
gccttcaatc ccacccgcta 2880aagtacttgg agcggtctct ccctccctca tcagcccacc
aaaccaaacc tagcctccaa 2940gagtgggaag aaattaaagc aagataggct attaagtgca
gagggagaga aaatgcctcc 3000aacatgtgag gaagtaatga gagaaatcat agaattttaa
ggccatgatt taaggccacc 3060attgcatacg ttgtatccat atcataatat gtacatttat
attggctcat gtccaacatt 3120accgccatgt tgacattgat tattgactag ttattaatag
taatcaatta cggggtcatt 3180agttcatagc ccatatatgg agttccgcgt tacataactt
acggtaaatg gcccgcctgg 3240ctgaccgccc aacgaccccc gcccattgac gtcaataatg
acgtatgttc ccatagtaac 3300gccaataggg actttccatt gacgtcaatg ggtggagtat
ttacggtaaa ctgcccactt 3360ggcagtacat caagtgtatc atatgccaag tacgccccct
attgacgtca atgacggtaa 3420atggcccgcc tggcattatg cccagtacat gaccttatgg
gactttccta cttggcagta 3480catctacgta ttagtcatcg ctattaccat ggtgatgcgg
ttttggcagt acatcaatgg 3540gcgtggatag cggtttgact cacggggatt tccaagtctc
caccccattg acgtcaatgg 3600gagtttgttt tggcaccaaa atcaacggga ctttccaaaa
tgtcgtaaca actccgcccc 3660attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc
tatataagca gagctcgttt 3720agtgaaccgt cagatcgcct ggagacgcca tccacgctgt
tttgacctcc atagaagaca 3780ccgggaccga tccagcctcc gcggccggga acggtgcatt
ggaacgcgga ttccccgtgc 3840caagagtgac gtaagtaccg cctatagact ctataggcac
acccctttgg ctcttatgca 3900tgctatactg tttttggctt ggggcctata cacccccgct
tccttatgct ataggtgatg 3960gtatagctta gcctataggt gtgggttatt gaccattatt
gaccactccc ctattggtga 4020cgatactttc cattactaat ccataacatg gctctttgcc
acaactatct ctattggcta 4080tatgccaata ctctgtcctt cagagactga cacggactct
gtatttttac aggatggggt 4140cccatttatt atttacaaat tcacatatac aacaacgccg
tcccccgtgc ccgcagtttt 4200tattaaacat agcgtgggat ctccacgcga atctcgggta
cgtgttccgg acatgggctc 4260ttctccggta gcggcggagc ttccacatcc gagccctggt
cccatgcctc cagcggctca 4320tggtcgctcg gcagctcctt gctcctaaca gtggaggcca
gacttaggca cagcacaatg 4380cccaccacca ccagtgtgcc gcacaaggcc gtggcggtag
ggtatgtgtc tgaaaatgag 4440cgtggagatt gggctcgcac ggctgacgca gatggaagac
ttaaggcagc ggcagaagaa 4500gatgcaggca gctgagttgt tgtattctga taagagtcag
aggtaactcc cgttgcggtg 4560ctgttaacgg tggagggcag tgtagtctga gcagtactcg
ttgctgccgc gcgcgccacc 4620agacataata gctgacagac taacagactg ttcctttcca
tgggtctttt ctgcagtcac 4680cgtcgtcgga tatcgaattc gccaccatga gcctgctgac
cgaggtggag acccccatca 4740gaaacgagtg gggctgcaga tgcaacgaca gcagcgaccc
cctggtggtg gccgccagca 4800tcatcggcat cctgcacctg atcctgtgga tcctggacag
actgttcttc aagtgcatct 4860acagactgtt caagcacggc ctgaagagag gccccagcac
cgagggcgtg cccgagagca 4920tgagagagga gtacagaaag gagcagcaga acgccgtgga
cgccgacgac agccacttcg 4980tgagcatcga gctggagtga tcagtcgacc acgtgtgatc
cagatctact tctggctaat 5040aaaagatcag agctctagag atctgtgtgt tggttttttg
tgtggtactc ttccgcttcc 5100tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc
gagcggtatc agctcactca 5160aaggcggtaa tacggttatc cacagaatca ggggataacg
caggaaagaa catgtgagca 5220aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt
tgctggcgtt tttccatagg 5280ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa
gtcagaggtg gcgaaacccg 5340acaggactat aaagatacca ggcgtttccc cctggaagct
ccctcgtgcg ctctcctgtt 5400ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc
cttcgggaag cgtggcgctt 5460tctcatagct cacgctgtag gtatctcagt tcggtgtagg
tcgttcgctc caagctgggc 5520tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct
tatccggtaa ctatcgtctt 5580gagtccaacc cggtaagaca cgacttatcg ccactggcag
cagccactgg taacaggatt 5640agcagagcga ggtatgtagg cggtgctaca gagttcttga
agtggtggcc taactacggc 5700tacactagaa gaacagtatt tggtatctgc gctctgctga
agccagttac cttcggaaaa 5760agagttggta gctcttgatc cggcaaacaa accaccgctg
gtagcggtgg tttttttgtt 5820tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag
aagatccttt gatcttttct 5880acggggtctg acgctcagtg gaacgaaaac tcacgttaag
ggattttggt catgagatta 5940tcaaaaagga tcttcaccta gatcctttta aattaaaaat
gaagttttaa atcaatctaa 6000agtatatatg agtaaacttg gtctgacagt taccaatgct
taatcagtga ggcacctatc 6060tcagcgatct gtctatttcg ttcatccata gttgcctgac
tcgggggggg ggggcgctga 6120ggtctgcctc gtgaagaagg tgttgctgac tcataccagg
cctgaatcgc cccatcatcc 6180agccagaaag tgagggagcc acggttgatg agagctttgt
tgtaggtgga ccagttggtg 6240attttgaact tttgctttgc cacggaacgg tctgcgttgt
cgggaagatg cgtgatctga 6300tccttcaact cagcaaaagt tcgatttatt caacaaagcc
gccgtcccgt caagtcagcg 6360taatgctctg ccagtgttac aaccaattaa ccaattctga
ttagaaaaac tcatcgagca 6420tcaaatgaaa ctgcaattta ttcatatcag gattatcaat
accatatttt tgaaaaagcc 6480gtttctgtaa tgaaggagaa aactcaccga ggcagttcca
taggatggca agatcctggt 6540atcggtctgc gattccgact cgtccaacat caatacaacc
tattaatttc ccctcgtcaa 6600aaataaggtt atcaagtgag aaatcaccat gagtgacgac
tgaatccggt gagaatggca 6660aaagcttatg catttctttc cagacttgtt caacaggcca
gccattacgc tcgtcatcaa 6720aatcactcgc atcaaccaaa ccgttattca ttcgtgattg
cgcctgagcg agacgaaata 6780cgcgatcgct gttaaaagga caattacaaa caggaatcga
atgcaaccgg cgcaggaaca 6840ctgccagcgc atcaacaata ttttcacctg aatcaggata
ttcttctaat acctggaatg 6900ctgttttccc ggggatcgca gtggtgagta accatgcatc
atcaggagta cggataaaat 6960gcttgatggt cggaagaggc ataaattccg tcagccagtt
tagtctgacc atctcatctg 7020taacatcatt ggcaacgcta cctttgccat gtttcagaaa
caactctggc gcatcgggct 7080tcccatacaa tcgatagatt gtcgcacctg attgcccgac
attatcgcga gcccatttat 7140acccatataa atcagcatcc atgttggaat ttaatcgcgg
cctcgagcaa gacgtttccc 7200gttgaatatg gctcataaca ccccttgtat tactgtttat
gtaagcagac agttttattg 7260ttcatgatga tatattttta tcttgtgcaa tgtaacatca
gagattttga gacacaacgt 7320ggctttcccc ccccccccat tattgaagca tttatcaggg
ttattgtctc atgagcggat 7380acatatttga atgtatttag aaaaataaac aaataggggt
tccgcgcaca tttccccgaa 7440aagtgccacc tgacgtctaa gaaaccatta ttatcatgac
attaacctat aaaaataggc 7500gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga
cggtgaaaac ctctgacaca 7560tgcagctccc ggagacggtc acagcttgtc tgtaagcgga
tgccgggagc agacaagccc 7620gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg
gcttaactat gcggcatcag 7680agcagattgt actgagagtg caccatatgc ggtgtgaaat
accgcacaga tgcgtaagga 7740gaaaataccg catcagattg gctat
7765
111 7765 DNA Artificial sequence
VR4780, 7765 bps DNA Circular
111
tggtggcctt aaatcatggc cttaaaattc tatgatttct ctcattactt
cctcacatgt 60tggaggcatt ttctctccct ctgcacttaa tagcctatct tgctttaatt
tcttcccact 120cttggaggct aggtttggtt tggtgggctg atgagggagg gagagaccgc
tccaagtact 180ttagcgggtg ggattgaagg cggagccctc ctgagctatg agtgtcctat
gagtggggct 240ggaactaaga accaggggcg tggacagggt gtgtcacaga gaaggggatg
tgcctgcttc 300tttctggccc aggaggaacc gggtcaattc ttcagcacct gggtacccat
agagcccacc 360gcatccccag catgcctgct attgtcttcc caatcctccc ccttgctgtc
ctgccccacc 420ccacccccca gaatagaatg acacctactc agacaatgcg atgcaatttc
ctcattttat 480taggaaagga cagtgggagt ggcaccttcc agggtcaagg aaggcacggg
ggaggggcaa 540acaacagatg gctggcaact agaaggcaca gcagatctgg atcacacgtg
gtcgactgat 600cagttgtcgt actcctcggc gttgtcgccg aagaagtagc tgccctcgtt
gctcatgtcg 660aagctaggca cgatggggtt ggtggccttc tcgtcgctca gctcgaacac
gcctctgcct 720ctgaaggaca cctcctcggg cttggcgccc tccatcattc tgatgatctc
ggctctcatg 780tcgctggttc tgccctcggt gttgccggtg aaggcggcca tcacggtgct
cttctcgaag 840ggcaggtttc tctgcacgct gaaggtgggc tgcacgctga tctggccggc
gctggctctc 900tgctggttgg tgttgccgcc gcttctggtt ctgatggccc agtatctgct
tctcagctcc 960agggtgctgc tgcccatgtt gtccatgttc tcgttgctgg cgatctgcac
gcctctggtg 1020ctcagcttgc ctctggggga caccttggtg cctctgatga agctcagcag
tctcaggtcc 1080tcgaaggcgg cgctgtggca ggccatccac accagctggc tcttgtgggc
ggggttctcg 1140ttgggtctga tcaggctgta cacctggctg ttctgcagca gcttgaaggg
gtcgatgccc 1200accaggctgt agccctcctt ctcgaagtcg tagccgctgc tcacggcggg
gccgtacacg 1260caggcgggca ggcagctctt gtgggccacg ctgcctctca ggatcagggc
gcttctggcc 1320aggaagatca ggtcctcgat ctcggcgttg ccggggtttc tgctctcccg
gacctggtcc 1380atcatggctc tctgggcggc ggtctggaac ttgcccttca ggatgttgca
cattctctcg 1440taggcgcttc tggtctttct gccgttctcg cctctccaga agtttctgtc
gttgatgcct 1500ctcttgatca ttctgatcag ctccatcacc atggtgccga tgcccttcac
ggcggcgccg 1560gcggcgccgc ttcttctggg cagggtgctg ccctgcatca ggctgcacat
tctggggtcc 1620atgccggtcc gcaccagggc tctggttctc tggtaggtgg tgtcgttcag
gttgctgtgc 1680cagatcatca tgtgggtcag gccggcggtg gcgtcctcgc cgttgttggc
ctgtctccag 1740attcttctga tctcctcctt gtcgtacagc accagctctc tcatccactt
gccgtccact 1800cttctgtaga tggggccgcc ggtcttcttg gggtccttgc cggcgctggg
gtgctcctcc 1860aggtatctgt ttcttctctc gtcgaaggcg ctcagcacca ttctctcgat
ggtcaggctg 1920ttctggatca gtctgccctc gtagtcgctc agcttcagct cggtgcacat
ctggatgtag 1980aatctgccga tgccgtcgat catcttgccc acgctggctc tgatctcggt
ggcgttctgt 2040ctctcgccgt cggtctccat ctgctcgtag cttctcttgg tgccctggct
ggccatggtg 2100gcgaattcga tatccgacga cggtgactgc agaaaagacc catggaaagg
aacagtctgt 2160tagtctgtca gctattatgt ctggtggcgc gcgcggcagc aacgagtact
gctcagacta 2220cactgccctc caccgttaac tagagttgag caagcagggt caggcaaagc
gtggagagcc 2280ggctgagtct aggtaggctc caagggagcg ccggacaaag gcccggtctc
gacctgagct 2340ttaaacttac ctagacggcg gacgcagttc aggaggcacc acaggcggga
ggcggcagaa 2400cgcgactcaa ccggcgtgga tggcggcctc aggtagggcg gcgggcgcgt
gaaggagaga 2460tgcgagccga tggaggtgca caccaatgtg gtgaatggtc aaatggcgtt
tattgtatcg 2520agctaggcac ttaaatacaa tatctctgca atgcggaatt cagtggttcg
tccaatccat 2580gtcagacccg tctgttgcct tcctaataag gcacgatcgt accaccttac
ttccaccaat 2640cggcatgcac ggtgcttttt ctctccttgt aaggcatgtt gctaactcat
cgttaccatg 2700ttgcaagact acaagagtat tgcataagac tacatttccc cctccctatg
caaaagcgaa 2760actactatat cctgagggga ctcctaaccg cgtacaaccg aagccccgct
tttcgcctaa 2820acacacccta gtcccctcag atacgcgtat atctggcccg tacatcgcga
agcagcgcaa 2880aacgcctaac cctaagcaga ttcttcatgc aattgtcggt caagccttgc
cttgttgtag 2940cttaaatttt gctcgcgcac tactcagcga cctccaacac acaagcaggg
agcagccaat 3000agccaatctg atgcggtatt ttctccttac gcatctgtgc ggtatttcac
accgcatacc 3060attgcatacg ttgtatccat atcataatat gtacatttat attggctcat
gtccaacatt 3120accgccatgt tgacattgat tattgactag ttattaatag taatcaatta
cggggtcatt 3180agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg
gcccgcctgg 3240ctgaccgccc aacgaccccc gcccattgac gtcaataatg acgtatgttc
ccatagtaac 3300gccaataggg actttccatt gacgtcaatg ggtggagtat ttacggtaaa
ctgcccactt 3360ggcagtacat caagtgtatc atatgccaag tacgccccct attgacgtca
atgacggtaa 3420atggcccgcc tggcattatg cccagtacat gaccttatgg gactttccta
cttggcagta 3480catctacgta ttagtcatcg ctattaccat ggtgatgcgg ttttggcagt
acatcaatgg 3540gcgtggatag cggtttgact cacggggatt tccaagtctc caccccattg
acgtcaatgg 3600gagtttgttt tggcaccaaa atcaacggga ctttccaaaa tgtcgtaaca
actccgcccc 3660attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc tatataagca
gagctcgttt 3720agtgaaccgt cagatcgcct ggagacgcca tccacgctgt tttgacctcc
atagaagaca 3780ccgggaccga tccagcctcc gcggccggga acggtgcatt ggaacgcgga
ttccccgtgc 3840caagagtgac gtaagtaccg cctatagact ctataggcac acccctttgg
ctcttatgca 3900tgctatactg tttttggctt ggggcctata cacccccgct tccttatgct
ataggtgatg 3960gtatagctta gcctataggt gtgggttatt gaccattatt gaccactccc
ctattggtga 4020cgatactttc cattactaat ccataacatg gctctttgcc acaactatct
ctattggcta 4080tatgccaata ctctgtcctt cagagactga cacggactct gtatttttac
aggatggggt 4140cccatttatt atttacaaat tcacatatac aacaacgccg tcccccgtgc
ccgcagtttt 4200tattaaacat agcgtgggat ctccacgcga atctcgggta cgtgttccgg
acatgggctc 4260ttctccggta gcggcggagc ttccacatcc gagccctggt cccatgcctc
cagcggctca 4320tggtcgctcg gcagctcctt gctcctaaca gtggaggcca gacttaggca
cagcacaatg 4380cccaccacca ccagtgtgcc gcacaaggcc gtggcggtag ggtatgtgtc
tgaaaatgag 4440cgtggagatt gggctcgcac ggctgacgca gatggaagac ttaaggcagc
ggcagaagaa 4500gatgcaggca gctgagttgt tgtattctga taagagtcag aggtaactcc
cgttgcggtg 4560ctgttaacgg tggagggcag tgtagtctga gcagtactcg ttgctgccgc
gcgcgccacc 4620agacataata gctgacagac taacagactg ttcctttcca tgggtctttt
ctgcagtcac 4680cgtcgtcgga tatcgaattc gccaccatga gcctgctgac cgaggtggag
acccccatca 4740gaaacgagtg gggctgcaga tgcaacgaca gcagcgaccc cctggtggtg
gccgccagca 4800tcatcggcat cctgcacctg atcctgtgga tcctggacag actgttcttc
aagtgcatct 4860acagactgtt caagcacggc ctgaagagag gccccagcac cgagggcgtg
cccgagagca 4920tgagagagga gtacagaaag gagcagcaga acgccgtgga cgccgacgac
agccacttcg 4980tgagcatcga gctggagtga tcagtcgacc acgtgtgatc cagatctact
tctggctaat 5040aaaagatcag agctctagag atctgtgtgt tggttttttg tgtggtactc
ttccgcttcc 5100tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc
agctcactca 5160aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa
catgtgagca 5220aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt
tttccatagg 5280ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg
gcgaaacccg 5340acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg
ctctcctgtt 5400ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag
cgtggcgctt 5460tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc
caagctgggc 5520tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa
ctatcgtctt 5580gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg
taacaggatt 5640agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc
taactacggc 5700tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac
cttcggaaaa 5760agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg
tttttttgtt 5820tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt
gatcttttct 5880acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt
catgagatta 5940tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa
atcaatctaa 6000agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga
ggcacctatc 6060tcagcgatct gtctatttcg ttcatccata gttgcctgac tcgggggggg
ggggcgctga 6120ggtctgcctc gtgaagaagg tgttgctgac tcataccagg cctgaatcgc
cccatcatcc 6180agccagaaag tgagggagcc acggttgatg agagctttgt tgtaggtgga
ccagttggtg 6240attttgaact tttgctttgc cacggaacgg tctgcgttgt cgggaagatg
cgtgatctga 6300tccttcaact cagcaaaagt tcgatttatt caacaaagcc gccgtcccgt
caagtcagcg 6360taatgctctg ccagtgttac aaccaattaa ccaattctga ttagaaaaac
tcatcgagca 6420tcaaatgaaa ctgcaattta ttcatatcag gattatcaat accatatttt
tgaaaaagcc 6480gtttctgtaa tgaaggagaa aactcaccga ggcagttcca taggatggca
agatcctggt 6540atcggtctgc gattccgact cgtccaacat caatacaacc tattaatttc
ccctcgtcaa 6600aaataaggtt atcaagtgag aaatcaccat gagtgacgac tgaatccggt
gagaatggca 6660aaagcttatg catttctttc cagacttgtt caacaggcca gccattacgc
tcgtcatcaa 6720aatcactcgc atcaaccaaa ccgttattca ttcgtgattg cgcctgagcg
agacgaaata 6780cgcgatcgct gttaaaagga caattacaaa caggaatcga atgcaaccgg
cgcaggaaca 6840ctgccagcgc atcaacaata ttttcacctg aatcaggata ttcttctaat
acctggaatg 6900ctgttttccc ggggatcgca gtggtgagta accatgcatc atcaggagta
cggataaaat 6960gcttgatggt cggaagaggc ataaattccg tcagccagtt tagtctgacc
atctcatctg 7020taacatcatt ggcaacgcta cctttgccat gtttcagaaa caactctggc
gcatcgggct 7080tcccatacaa tcgatagatt gtcgcacctg attgcccgac attatcgcga
gcccatttat 7140acccatataa atcagcatcc atgttggaat ttaatcgcgg cctcgagcaa
gacgtttccc 7200gttgaatatg gctcataaca ccccttgtat tactgtttat gtaagcagac
agttttattg 7260ttcatgatga tatattttta tcttgtgcaa tgtaacatca gagattttga
gacacaacgt 7320ggctttcccc ccccccccat tattgaagca tttatcaggg ttattgtctc
atgagcggat 7380acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca
tttccccgaa 7440aagtgccacc tgacgtctaa gaaaccatta ttatcatgac attaacctat
aaaaataggc 7500gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga cggtgaaaac
ctctgacaca 7560tgcagctccc ggagacggtc acagcttgtc tgtaagcgga tgccgggagc
agacaagccc 7620gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg gcttaactat
gcggcatcag 7680agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga
tgcgtaagga 7740gaaaataccg catcagattg gctat 7765
SEQ ID NO: 112; length 4196; DNA; Artificial sequence; VR10686, 4196 bps DNA Circular
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg
gagacggtca 60cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg
tcagcgggtg 120ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta
ctgagagtgc 180accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc
atcagattgg 240ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga
gcaaaattta 300agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta
gggttaggcg 360ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg
actagggtgt 420gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca
ggatatagta 480gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt
gtagtcttgc 540aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca
ccgtgcatgc 600cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca
gacgggtctg 660acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt
aagtgcctag 720ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc
catcggctcg 780catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc
ggttgagtcg 840cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct
aggtaagttt 900aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac
ctagactcag 960ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg
tggagggcag 1020tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata
gctgacagac 1080taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgac
acgtgtgatc 1140agatatcgcg gccgctctag accaggccct ggatccagat ctgctgtgcc
ttctagttgc 1200cagccatctg ttgtttgccc ctcccccgtg ccttccttga ccctggaagg
tgccactccc 1260actgtccttt cctaataaaa tgaggaaatt gcatcgcatt gtctgagtag
gtgtcattct 1320attctggggg gtggggtggg gcaggacagc aagggggagg attgggaaga
caatagcagg 1380catgctgggg atgcggtggg ctctatgggt acccaggtgc tgaagaattg
acccggttcc 1440tcctgggcca gaaagaagca ggcacatccc cttctctgtg acacaccctg
tccacgcccc 1500tggttcttag ttccagcccc actcatagga cactcatagc tcaggagggc
tccgccttca 1560atcccacccg ctaaagtact tggagcggtc tctccctccc tcatcagccc
accaaaccaa 1620acctagcctc caagagtggg aagaaattaa agcaagatag gctattaagt
gcagagggag 1680agaaaatgcc tccaacatgt gaggaagtaa tgagagaaat catagaattt
taaggccatg 1740atttaaggcc atcatggcct taatcttccg cttcctcgct cactgactcg
ctgcgctcgg 1800tcgttcggct gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg
ttatccacag 1860aatcagggga taacgcagga aagaacatgt gagcaaaagg ccagcaaaag
gccaggaacc 1920gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg cccccctgac
gagcatcaca 1980aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga
taccaggcgt 2040ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt
accggatacc 2100tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc
tgtaggtatc 2160tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc
cccgttcagc 2220ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta
agacacgact 2280tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat
gtaggcggtg 2340ctacagagtt cttgaagtgg tggcctaact acggctacac tagaagaaca
gtatttggta 2400tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct
tgatccggca 2460aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt
acgcgcagaa 2520aaaaaggatc tcaagaagat cctttgatct tttctacggg gtctgacgct
cagtggaacg 2580aaaactcacg ttaagggatt ttggtcatga gattatcaaa aaggatcttc
acctagatcc 2640ttttaaatta aaaatgaagt tttaaatcaa tctaaagtat atatgagtaa
acttggtctg 2700acagttacca atgcttaatc agtgaggcac ctatctcagc gatctgtcta
tttcgttcat 2760ccatagttgc ctgactcggg gggggggggc gctgaggtct gcctcgtgaa
gaaggtgttg 2820ctgactcata ccaggcctga atcgccccat catccagcca gaaagtgagg
gagccacggt 2880tgatgagagc tttgttgtag gtggaccagt tggtgatttt gaacttttgc
tttgccacgg 2940aacggtctgc gttgtcggga agatgcgtga tctgatcctt caactcagca
aaagttcgat 3000ttattcaaca aagccgccgt cccgtcaagt cagcgtaatg ctctgccagt
gttacaacca 3060attaaccaat tctgattaga aaaactcatc gagcatcaaa tgaaactgca
atttattcat 3120atcaggatta tcaataccat atttttgaaa aagccgtttc tgtaatgaag
gagaaaactc 3180accgaggcag ttccatagga tggcaagatc ctggtatcgg tctgcgattc
cgactcgtcc 3240aacatcaata caacctatta atttcccctc gtcaaaaata aggttatcaa
gtgagaaatc 3300accatgagtg acgactgaat ccggtgagaa tggcaaaagc ttatgcattt
ctttccagac 3360ttgttcaaca ggccagccat tacgctcgtc atcaaaatca ctcgcatcaa
ccaaaccgtt 3420attcattcgt gattgcgcct gagcgagacg aaatacgcga tcgctgttaa
aaggacaatt 3480acaaacagga atcgaatgca accggcgcag gaacactgcc agcgcatcaa
caatattttc 3540acctgaatca ggatattctt ctaatacctg gaatgctgtt ttcccgggga
tcgcagtggt 3600gagtaaccat gcatcatcag gagtacggat aaaatgcttg atggtcggaa
gaggcataaa 3660ttccgtcagc cagtttagtc tgaccatctc atctgtaaca tcattggcaa
cgctaccttt 3720gccatgtttc agaaacaact ctggcgcatc gggcttccca tacaatcgat
agattgtcgc 3780acctgattgc ccgacattat cgcgagccca tttataccca tataaatcag
catccatgtt 3840ggaatttaat cgcggcctcg agcaagacgt ttcccgttga atatggctca
taacacccct 3900tgtattactg tttatgtaag cagacagttt tattgttcat gatgatatat
ttttatcttg 3960tgcaatgtaa catcagagat tttgagacac aacgtggctt tccccccccc
cccattattg 4020aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta
tttagaaaaa 4080taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg
tctaagaaac 4140cattattatc atgacattaa cctataaaaa taggcgtatc acgaggccct
ttcgtc 4196