Patent application title: SINDBIS CONTROL VIRUS
Inventors:
Russell Garlick (Needham, MA, US)
Catherine Huang (Elkridge, MD, US)
Bharathi Anekella (Clarksburg, MD, US)
Jonathan Li (Chestnut Hill, MA, US)
IPC8 Class: AC12N700FI
USPC Class:
1 1
Class name:
Publication date: 2019-01-03
Patent application number: 20190002838
Abstract:
Disclosed are compositions and methods related to replication deficient
Sindbis viruses that are able to function as controls for nucleic acid
diagnostic assays (e.g., nucleic acid sequencing based assays and/or
nucleic acid amplification based assays).
Claims:
1. A replication deficient recombinant Sindbis virus comprising a RNA
genome comprising: an open reading frame (ORF) encoding functional
Sindbis non-structural proteins; and a heterologous RNA sequence.
2. The replication deficient recombinant Sindbis virus of claim 1, wherein the ORF encoding functional Sindbis non-structural proteins is located 5' of the heterologous RNA sequence.
3. The replication deficient recombinant Sindbis virus of claim 1, wherein the ORF encoding Sindbis non-structural proteins has a nucleotide sequence that is at least 90% identical to nucleotides 1-7648 of SEQ ID NO: 1.
4. (canceled)
5. The replication deficient recombinant Sindbis virus of claim 1, wherein the RNA genome lacks a sequence encoding a functional version of one or more of the Sindbis structural proteins.
6.-9. (canceled)
10. The replication deficient recombinant Sindbis virus of claim 1, wherein the heterologous RNA sequence comprises a non-Sindbis RNA virus sequence or a retrovirus sequence.
11.-16. (canceled)
17. The replication deficient recombinant Sindbis virus of claim 10, wherein the heterologous RNA sequence comprises a non-Sindbis RNA virus sequence.
18. The replication deficient recombinant Sindbis virus of claim 17, wherein the non-Sindbis RNA virus sequence is an Ebolavirus sequence, an influenza virus sequence, a SARS virus sequence, a hepatitis C virus sequence, a West Nile virus sequence, a Zika virus sequence, a poliovirus sequence, or a measles virus sequence.
19. The replication deficient recombinant Sindbis virus of claim 18, wherein the non-Sindbis RNA virus sequence is an Ebolavirus sequence.
20.-26. (canceled)
27. The replication deficient recombinant Sindbis virus of claim 10, wherein the heterologous RNA sequence comprises a retrovirus sequence.
28. The replication deficient recombinant Sindbis virus of claim 27, wherein the retrovirus sequence is an HIV-1 sequence, an HIV-2 sequence, an HTLV-1 sequence, or an HTLV-II sequence.
29. The replication deficient recombinant Sindbis virus of claim 28, wherein the retrovirus sequence is an HIV-1 sequence.
30.-65. (canceled)
66. A composition, comprising a replication deficient Sindbis virus of claim 1.
67. The composition of claim 66, wherein the replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 90% identical to SEQ ID NO: 11; and a replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 90% identical to SEQ ID NO: 13.
68. (canceled)
69. The composition of claim 66, wherein the replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 90% identical to SEQ ID NO: 12; and a replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 90% identical to SEQ ID NO: 14.
70. (canceled)
71. The composition of claim 66, wherein the replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 90% identical to either nucleotides 1-3446 of SEQ ID NO: 15, nucleotides 3294-5575 of SEQ ID NO: 15, nucleotides 5425-7722 of SEQ ID NO: 15, or nucleotides 7542-10272 of SEQ ID NO: 15.
72.-77. (canceled)
78. A nucleic acid molecule encoding the RNA genome of the replication deficient Sindbis virus of claim 1.
79. The nucleic acid molecule of claim 78, wherein the nucleic acid molecule is an RNA molecule.
80.-82. (canceled)
83. A method of making a replication deficient Sindbis virus comprising: (a) transfecting a cell with the RNA molecule of claim 79 encoding the RNA genome of the replication deficient Sindbis virus and a helper RNA molecule encoding functional Sindbis structural proteins; (b) culturing the transfected cell of step (a) under conditions such that the cell produces a replication deficient Sindbis virus comprising the mRNA genome; and (c) collecting the replication deficient Sindbis virus.
84. (canceled)
85. A method of testing a diagnostic assay, comprising performing the diagnostic assay on a composition of claim 66.
Description:
PRIORITY CLAIM
[0001] This application claims the benefit of priority to U.S. Provisional Patent Application No. 62/182,104, filed Jun. 19, 2015, which is hereby incorporated by reference in its entirety.
SEQUENCE LISTING
[0002] The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on May 31, 2016, is named SCX_00325_SL.txt and is 144,758 bytes in size.
BACKGROUND
[0003] Regulatory agencies, such as the FDA, CLIA and CAP, generally require developers of nucleic acid-based in vitro diagnostic devices for pathogen detection to include quality controls in their regulatory submissions. Such quality control materials are important tools for the detection of analytical errors, the monitoring of long-term performance of diagnostic test kits, and the identification of changes in random or systematic error. A well-designed laboratory quality control program will generally incorporate at least some form of control that provides added confidence in the reliability of results obtained for unknown specimens.
[0004] Whole process controls are needed to monitor the entire analytical process, including sample lysis, nucleic acid extraction, amplification, detection and interpretation of results. Such controls can be natural material derived from infected patients, which has the advantage of behaving very similarly to a clinical sample. However, such natural source controls often have limited and unpredictable availability, concentration and stability. The use of cultured virus to generate positive controls alleviates some of these problems, but virus culture is often unavailable or technologically difficult. In addition, the preparation of large amounts of human pathogens carries significant safety risks and is expensive.
[0005] Positive controls for amplification and detection are often provided as part of diagnostic test kits. The materials often have a known amount of input copy number and verify the integrity of the reaction components and instrument. However, such controls are not usually taken through the sample lysis or nucleic acid extraction process and are therefore unable to detect errors arising from these steps. Examples of this type of control include a non-infectious DNA plasmid containing the target sequence, purified RNA transcripts, or packaged RNA materials such as Armored RNA. These materials often also suffer from their limited stability at ambient temperatures.
[0006] Internal controls contain a non-target nucleotide sequence that is co-extracted and co-amplified with the target nucleic acid. Internal controls confirm the integrity of the reagents (e.g., polymerase, primers, etc.), equipment function (e.g., thermal cycler), and the absence of inhibitors in the sample. The internal control can take the form of a non-target organism that is added to the sample prior to sample lysis and extraction. Alternatively, it could be a non-infectious, non-target DNA or RNA sequence that is added to the sample either prior to or after sample lysis and extraction.
[0007] Thus, there is a need for improved compositions able to serve as controls in diagnostic assays.
SUMMARY
[0008] Provided herein are compositions and methods related to replication deficient Sindbis viruses that are able to function as controls for nucleic acid diagnostic assays (e.g., nucleic acid sequencing based assays and/or nucleic acid amplification based assays).
[0009] In certain aspects, disclosed herein is a replication deficient recombinant Sindbis virus comprising a RNA genome comprising (a) an open reading frame (ORF) encoding functional Sindbis non-structural proteins and (b) a heterologous (i.e., non-Sindbis) RNA sequence. In some embodiments, the ORF encoding the functional Sindbis non-structural proteins is located 5' of the heterologous RNA sequence.
[0010] In some embodiments, the ORF encoding the Sindbis non-structural proteins encodes a nsP1 protein, a nsP2 protein, a nsP3 protein and a nsP4 protein. In some embodiments, the ORF encoding Sindbis non-structural proteins has a nucleotide sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to nucleotides 1-7648 of SEQ ID NO: 1. These nucleotides encode non-structural Sindbis proteins.
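The percent-identity thresholds recited above can be illustrated with a short Python sketch. This is a minimal sketch under stated assumptions: the two sequences are already aligned to equal length, identity is taken as identical non-gap columns divided by total alignment columns, and the function names and the 20-nucleotide example sequences are hypothetical. The specification does not state which alignment algorithm or identity convention is intended.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences.

    Assumed convention (not taken from the specification): identical,
    non-gap columns divided by the total number of alignment columns.
    U and T are treated as equivalent so RNA and DNA spellings compare equal.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    a = aligned_a.upper().replace("U", "T")
    b = aligned_b.upper().replace("U", "T")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 90.0) -> bool:
    """True when the identity is at least the given percentage."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical example: an RNA reference and a DNA-spelled candidate
# that differ at 2 of 20 positions.
reference = "AUGGCCAUUGUAAUGGGCCG"
candidate = "ATGGCCATTGTTATGGGCCA"
print(percent_identity(reference, candidate))
print(meets_threshold(reference, candidate, 90.0))
```

In practice an alignment tool would produce the aligned strings first; the sketch only covers the counting step.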
[0011] In some embodiments, the RNA genome of the replication deficient Sindbis virus lacks a sequence encoding a functional version of one or more of the Sindbis structural proteins (e.g., Sindbis capsid protein, E3 protein, E2 protein, 6k protein and/or E1 protein). In some embodiments, the RNA genome lacks an RNA sequence encoding any functional Sindbis structural proteins. In some embodiments, the heterologous RNA sequence replaces the ORF encoding the Sindbis structural proteins in the RNA genome.
[0012] In some embodiments, the RNA genome comprises a 26S subgenomic promoter at the 3' end of the ORF encoding the Sindbis non-structural proteins.
[0013] In some embodiments, the heterologous RNA sequence in the RNA genome comprises a non-Sindbis RNA virus sequence or a retrovirus sequence. In some embodiments, the heterologous RNA sequence includes at least 10, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900 or 2000 bp of a non-Sindbis RNA virus sequence or a retrovirus sequence. In some embodiments, the heterologous RNA sequence includes 100-300 bp of a non-Sindbis RNA virus sequence or a retrovirus sequence. In some embodiments, the heterologous RNA sequence includes 100-200 bp of a non-Sindbis RNA virus sequence or a retrovirus sequence. In some embodiments, the heterologous RNA sequence is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to a non-Sindbis RNA virus sequence or a retrovirus sequence. In some embodiments, the non-Sindbis RNA virus sequence or retrovirus sequence comprises one or more mutations that convey a drug resistant phenotype when present in the non-Sindbis RNA virus or the retrovirus. For example, in some embodiments the non-Sindbis RNA virus sequence or retrovirus sequence comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 mutations that convey a drug resistant phenotype when present in the non-Sindbis RNA virus or the retrovirus.
[0014] In some embodiments, the heterologous RNA sequence comprises a non-Sindbis RNA virus sequence. In some embodiments, the non-Sindbis RNA virus sequence is an Ebolavirus sequence, an influenza virus sequence, a SARS virus sequence, a hepatitis C virus sequence, a West Nile virus sequence, a Zika virus sequence, a poliovirus sequence or a measles virus sequence.
[0015] In some embodiments, the non-Sindbis RNA virus sequence is an Ebolavirus sequence (e.g., a Zaire ebolavirus sequence, a Bundibugyo ebolavirus sequence, a Reston ebolavirus sequence, a Sudan ebolavirus sequence or a Tai Forest ebolavirus sequence). In some embodiments, the Ebolavirus sequence comprises at least a portion of an Ebolavirus GP gene sequence, an Ebolavirus NP gene sequence or an Ebolavirus VP24 gene sequence. In some embodiments, the heterologous RNA sequence does not encode a functional Ebola protein (e.g., the heterologous RNA sequence encodes truncated Ebola proteins, Ebola proteins with frame-shift mutations and/or Ebola protein sequences lacking a start codon). In some embodiments, the heterologous RNA sequence comprises a sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 2 or SEQ ID NO: 3. SEQ ID NO: 2 is the nucleotide sequence of a GP Ebola target sequence used in an exemplary Ebola Sindbis control virus described in Example 1. SEQ ID NO: 3 is the nucleotide sequence of a NP/VP24 Ebola target sequence used in an exemplary Ebola Sindbis control virus described in Example 1. The portion of the Ebola NP gene consists of nucleotides 1 to 1577 of SEQ ID NO: 3, the portion of the Ebola VP24 gene consists of nucleotides 1578 to 2127 and the sequence of the human RNAse P internal control consists of nucleotides 2128 to 2217.
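The segment coordinates given above for SEQ ID NO: 3 can be checked with a short Python sketch. The coordinates are taken from the text; the `Segment` type and `is_contiguous` helper are hypothetical names introduced only for illustration, and coordinates are assumed to be 1-based and inclusive.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    name: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive (assumed convention)

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Coordinates within SEQ ID NO: 3 as stated in the paragraph above.
SEQ3_SEGMENTS = [
    Segment("Ebola NP portion", 1, 1577),
    Segment("Ebola VP24 portion", 1578, 2127),
    Segment("human RNase P internal control", 2128, 2217),
]


def is_contiguous(segments: List[Segment]) -> bool:
    """True when each segment starts one nucleotide after the previous ends."""
    return all(b.start == a.end + 1 for a, b in zip(segments, segments[1:]))


for seg in SEQ3_SEGMENTS:
    print(f"{seg.name}: {seg.start}-{seg.end} ({seg.length} nt)")
print("contiguous:", is_contiguous(SEQ3_SEGMENTS))
```

The three segments tile the target sequence without gaps or overlaps, for a combined length of 2217 nucleotides.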
[0016] In some embodiments, the heterologous RNA sequence comprises a retrovirus sequence. In some embodiments, the retrovirus sequence is an HIV-1 sequence, an HIV-2 sequence, an HTLV-1 sequence, or an HTLV-II sequence.
[0017] In some embodiments, the heterologous RNA sequence comprises an HIV-1 sequence. In some embodiments, the HIV-1 sequence comprises one or more mutations that, when present in an HIV-1 virus, convey a drug resistance phenotype (e.g., resistance to a protease inhibitor, a nucleoside analogue reverse transcriptase inhibitor and/or a non-nucleoside analog reverse transcriptase inhibitor). For example, in some embodiments the HIV-1 virus sequence comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 mutations that convey a drug resistant phenotype. In some embodiments, the one or more mutations, when present in HIV-1 virus, convey resistance to a drug selected from the group consisting of: atazanavir, ritonavir, darunavir, fosamprenavir, indinavir, lopinavir, nelfinavir, saquinavir, tipranavir, abacavir, didanosine, emtricitabine, lamivudine, stavudine, tenofovir, zidovudine, efavirenz, etravirine, nevirapine and rilpivirine. In some embodiments, the one or more mutations are selected from the group consisting of L24I, D30N, V32I, M46I, I47V, G48V, I50V, I54M, G73S, L76V, V82A, I84V, N88D, L90M, M41L, K65R, D67N, T69S insert SS, K70R, L74V, F77L, Y115F, F116Y, Q151M, M184V, L210W, T215Y, K219Q, L100I, K101E, K103N, V106A, V108I, Y181C, Y188L, G190A, P225H and M230L. In some embodiments, the one or more mutations are selected from the group consisting of L24I (TTA to ATA), D30N (GAT to AAT), V32I (GTA to ATA), M46I (ATG to ATA), I47V (ATA to CTA), G48V (GGG to GTG), I50V (ATT to GTT), I54M (ATC to ATG), G73S (GGT to GCT), L76V (TTA to GTA), V82A (GTC to GCC), I84V (ATA to GTA), N88D (AAT to GAT), L90M (TTG to ATG), M41L (ATG to TTG), K65R (AAA to AGA), D67N (GAC to AAC), T69S insert SS (ACT to TCT and insertion of TCC and TCC), K70R (AAA to AGA), L74V (TTA to GTA), F77L (TTC to CTC), Y115F (TAT to TTT), F116Y (TTT to TAT), Q151M (CAG to ATG), M184V (ATG to GTG), L210W (TTG to TGG), T215Y (ACC to TAC), K219Q (AAA to CAA), L100I (TTA to ATA), K101E (AAA to GAA), K103N (AAA to AAC), V106A (GTA to GCA), V108I (GTA to ATA), Y181C (TAT to TGT), Y188L (TAT to TTA), G190A (GGA to GCA), P225H (CCT to CAT) and M230L (CCT to CAT). In some embodiments, the HIV-1 sequence comprises at least a portion of an HIV-1 gene selected from p7, p1, p6, HIV protease, reverse transcriptase, p51 RNAse, integrase and gp120. In some embodiments, the HIV-1 sequence comprises at least a portion of p7, p1, p6, HIV protease, reverse transcriptase and integrase. In certain embodiments, the HIV-1 sequence comprises at least a portion of gp120, wherein the portion comprises the V1-V5 variable loops. In some embodiments, the HIV-1 sequence comprises a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to nucleotides 1900 through 5400 and/or 6300 through 7825 of the HXB2 strain of HIV-1 (SEQ ID NO: 4). SEQ ID NO: 4 is the nucleotide sequence of the HIV-1 HXB2 genome. In some embodiments, the HIV-1 sequence is identical to nucleotides 1900 through 5400 and/or 6300 through 7825 of the HXB2 strain of HIV-1 (SEQ ID NO: 4) except for the presence of the mutations that convey a drug resistance phenotype.
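The mutation entries above pair an amino acid substitution with a codon change, e.g., "D30N (GAT to AAT)". The following Python sketch shows one way such an entry could be checked against the standard genetic code. It is a minimal sketch: the regular expression, the function name `check_substitution` and the restriction to simple single-codon substitutions (entries such as "T69S insert SS" are not handled) are assumptions made for illustration, not part of the disclosure.

```python
import re

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: _AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(_BASES)
    for j, b2 in enumerate(_BASES)
    for k, b3 in enumerate(_BASES)
}

# Matches entries of the form "D30N (GAT to AAT)".
_ENTRY = re.compile(r"^([A-Z])(\d+)([A-Z]) \(([ACGT]{3}) to ([ACGT]{3})\)$")


def check_substitution(entry: str) -> bool:
    """True when both codons translate to the stated amino acids."""
    match = _ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"unsupported entry format: {entry!r}")
    wild_aa, _position, mutant_aa, wild_codon, mutant_codon = match.groups()
    return (
        CODON_TABLE[wild_codon] == wild_aa
        and CODON_TABLE[mutant_codon] == mutant_aa
    )


for entry in ["D30N (GAT to AAT)", "M184V (ATG to GTG)", "K103N (AAA to AAC)"]:
    print(entry, check_substitution(entry))
```

A check of this kind can be run over a full mutation list before a construct is designed, so that any entry whose codons do not translate to the stated residues is reviewed by hand.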
[0019] In some embodiments, the heterologous RNA sequence comprises a sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 5 and/or SEQ ID NO: 7. SEQ ID NO: 5 is the nucleotide sequence of a 5' multi-mutant HIV-1 target sequence comprising a number of drug resistance mutations, used in an exemplary multi-mutant HIV-1 control virus described in Example 2. SEQ ID NO: 7 is the nucleotide sequence of a 3' mutant HIV-1 target sequence used in an exemplary multi-mutant HIV-1 control virus described in Example 2.
[0020] In some embodiments, the heterologous RNA sequence comprises a sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 6 and/or SEQ ID NO: 8. SEQ ID NO: 6 is the nucleotide sequence of a 5' wild-type HIV-1 target sequence in an exemplary HIV-1 control virus, described in Example 2. SEQ ID NO: 8 is the nucleotide sequence of a 3' wild-type HIV-1 target sequence used in an exemplary HIV-1 control virus, described in Example 2.
[0021] In some embodiments, the heterologous RNA sequence comprises a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to either nucleotides 1-3446, nucleotides 3294-5575, nucleotides 5425-7722, or nucleotides 7542-10272 of SEQ ID NO: 15.
[0022] In some embodiments, the RNA genome of the replication deficient Sindbis virus comprises a nucleotide sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, or SEQ ID NO: 15.
[0023] In certain aspects, provided herein is a composition comprising a replication deficient Sindbis virus described herein. In certain aspects, the composition comprises two or more of the replication deficient Sindbis viruses described herein. For example, in some embodiments, the composition comprises a replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 11 and a replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 13. In some embodiments, the composition comprises a replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 12 and a replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 14. In some embodiments, the composition further comprises human DNA. In some embodiments, the replication deficient Sindbis virus is in a human bodily fluid. In some embodiments, the human bodily fluid is human plasma (e.g., defibrinated human plasma). In some embodiments, the composition further comprises a preservative, such as sodium azide.
[0024] In some embodiments, the composition comprises a replication deficient Sindbis virus comprising a RNA genome comprising a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to either nucleotides 1-3446, nucleotides 3294-5575, nucleotides 5425-7722, or nucleotides 7542-10272 of SEQ ID NO: 15.
[0025] In certain aspects, provided herein is a nucleic acid molecule encoding the RNA genome of the replication deficient Sindbis virus described herein. In some embodiments, the nucleic acid molecule is a DNA molecule. In some embodiments, the nucleic acid molecule is an RNA molecule. In some embodiments, the nucleic acid molecule is a plasmid (e.g., a circular plasmid or a linearized plasmid, such as a circular expression plasmid or a linearized expression plasmid). In some embodiments, the nucleic acid molecule is isolated. In certain embodiments, provided herein is a cell comprising a nucleic acid described herein. In some embodiments, the cell is a BHK cell.
[0026] In certain aspects, provided herein is a method of making a replication deficient Sindbis virus. In certain embodiments, the method includes the step of transfecting a cell (e.g., a BHK cell) with a nucleic acid molecule (e.g., an RNA molecule) encoding the RNA genome of a replication deficient Sindbis virus described herein and with a nucleic acid (e.g., an RNA molecule) encoding functional Sindbis structural proteins. In some embodiments, the cell is then cultured under conditions such that the cell produces the replication deficient Sindbis virus into the culture medium. In some embodiments, the method further comprises collecting the replication deficient Sindbis virus (e.g., by collecting the culture supernatant). In some embodiments, the method further comprises filtering and/or heat inactivating the culture supernatant. In some embodiments, the method further comprises determining the titer of the virus (e.g., using real-time PCR).
[0027] In certain aspects, provided herein are methods of testing a diagnostic assay by running the diagnostic assay on a composition comprising the replication deficient Sindbis virus described herein. In some embodiments, the diagnostic assay is a nucleic acid amplification based diagnostic assay. In some embodiments, the diagnostic assay is a sequencing based diagnostic assay. In some embodiments the diagnostic assay is an assay for the detection of a RNA virus and/or a retrovirus. In some embodiments, the diagnostic assay is an assay for the detection of Ebolavirus, an influenza virus, a SARS virus, a hepatitis C virus, a West Nile virus, a Zika virus, a poliovirus, a measles virus, an HIV-1 virus, an HIV-2 virus, an HTLV-I virus and/or an HTLV-II virus. In certain embodiments, the heterologous RNA sequence in the RNA genome of the replication deficient Sindbis virus contains the target sequence detected in the diagnostic assay. In some embodiments, the method includes the performance of a sample lysis step on the composition comprising the replication deficient Sindbis virus. In some embodiments, the method comprises performing a nucleic acid extraction step. In some embodiments, the method comprises performing a nucleic acid amplification step (e.g., performing a real-time nucleic acid amplification/detection process). In some embodiments, the method comprises performing a nucleic acid sequencing step. In some embodiments the method comprises performing a nucleic acid detection step.
BRIEF DESCRIPTION OF FIGURES
[0028] FIG. 1 shows a schematic depiction of the genomic organization of Sindbis virus. Some of the genes shown encode nonstructural proteins (nsP1-4), which include the helicase and RNA polymerase. Some of the genes are the structural genes, and encode the capsid (C) as well as proteins involved in budding.
[0029] FIG. 2 shows a schematic depiction of the genomic organization of a Sindbis control vector of certain embodiments described herein.
[0030] FIG. 3 shows an exemplary schematic for the production of recombinant Sindbis control viruses.
[0031] FIG. 4 shows the results of a TaqMan real time quantitation assay of Sindbis control samples unstressed (time=0) or stressed for 1, 3, 5, 11 or 22 days at 37° C.
[0032] FIG. 5 shows the results of a TaqMan real time quantitation assay of non-stressed Sindbis control samples or samples stressed through one, two or three Freeze/Thaw cycles.
[0033] FIG. 6 shows the results of a TaqMan real time quantitation assay of a Sindbis control virus stored frozen at -20° C., stored refrigerated at 2-8° C. or stored at ambient lab temperature across seven months.
[0034] FIG. 7 shows a workflow overview for the production of Sindbis control virus.
[0035] FIG. 8 shows a map of the SinRep SC vector. Figure discloses "His8" as SEQ ID NO: 16.
[0036] FIG. 9 consists of two maps of the Zika virus genome. The genome was divided into four regions for the construction of four different Zika virus reference materials, and each region is depicted by a rectangle. A first reference material comprises nucleotides 1 to 3446 of the Zika virus from GenBank Accession number EU545988.1, referred to as the "Zika Env Construct" (Construct -1). A second reference material comprises nucleotides 3294 to 5575, referred to as the "Zika NS2/NS3 Construct" (Construct -2). A third reference material comprises nucleotides 5425 to 7722, referred to as the "Zika NS4 Construct" (Construct -3). A fourth reference material comprises nucleotides 7542 to 10272, referred to as the "Zika NS5 Construct" (Construct -4).
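The four Zika construct regions listed above overlap at their boundaries. The Python sketch below computes each region's length and the overlap between adjacent regions from the stated coordinates. The coordinates come from the text; the helper names are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
# Regions of the Zika genome (GenBank EU545988.1) as listed above,
# given as (start, end) with 1-based inclusive coordinates (assumed).
ZIKA_CONSTRUCTS = {
    "Zika Env Construct": (1, 3446),
    "Zika NS2/NS3 Construct": (3294, 5575),
    "Zika NS4 Construct": (5425, 7722),
    "Zika NS5 Construct": (7542, 10272),
}


def span_length(span):
    """Number of nucleotides in an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def overlap(a, b):
    """Number of nucleotides shared by two inclusive spans (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


names = list(ZIKA_CONSTRUCTS)
for name in names:
    print(f"{name}: {span_length(ZIKA_CONSTRUCTS[name])} nt")
for first, second in zip(names, names[1:]):
    shared = overlap(ZIKA_CONSTRUCTS[first], ZIKA_CONSTRUCTS[second])
    print(f"{first} / {second}: {shared} nt shared")
```

Under these assumptions, adjacent constructs share 153, 151 and 181 nucleotides, so every position from 1 to 10272 is covered by at least one construct.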
[0037] FIG. 10A depicts nucleotides 1 to 3446 of the Zika virus from GenBank Accession number EU545988.1, referred to as the "Zika Env Construct," which includes the NS1 gene. This portion of the Zika virus genome was integrated into a Zika virus reference material.
[0038] FIG. 10B depicts nucleotides 3294 to 5575 of the Zika virus from GenBank Accession number EU545988.1, referred to as the "Zika NS2/NS3 Construct," which includes the NS2 and NS3 genes as well as a portion of the NS1 gene. This portion of the Zika virus genome was integrated into a Zika virus reference material.
[0039] FIG. 10C depicts nucleotides 5425 to 7722 of the Zika virus from GenBank Accession number EU545988.1, referred to as the "Zika NS4 Construct," which includes the NS4A and NS4B genes as well as a portion of the NS3 gene. This portion of the Zika virus genome was integrated into a Zika virus reference material.
[0040] FIG. 10D depicts nucleotides 7542 to 10272 of the Zika virus from GenBank Accession number EU545988.1, referred to as the "Zika NS5 Construct," which includes the NS5 gene and a portion of the NS4B gene. This portion of the Zika virus genome was integrated into a Zika virus reference material.
[0041] FIG. 11A shows stability results of TaqMan real time quantitation of an H7N9 influenza reference material stored at -20° C., 4° C., or room temperature (~25° C.) for seventeen months. The results depicted are for reference materials formulated with buffer.
[0042] FIG. 11B shows stability results of TaqMan real time quantitation of an H7N9 influenza reference material stored at -20° C., 4° C., or room temperature (~25° C.) for seventeen months. The results depicted are for reference materials formulated with human plasma.
[0043] FIG. 12 shows stability results of TaqMan real time quantitation of an H7N9 influenza reference material stored at ambient temperature for seventeen months. Each error bar corresponds to 1 standard deviation from the mean.
DETAILED DESCRIPTION
General
[0044] Provided herein are compositions and methods related to replication deficient Sindbis viruses that are able to function as controls for nucleic acid diagnostic assays (e.g., nucleic acid sequencing based assays and/or nucleic acid amplification based assays). In certain aspects, provided herein are Sindbis control viruses that are useful as whole process controls, positive controls and/or internal controls in nucleic acid diagnostic assays. Such control viruses can benefit diagnostics manufacturers by providing a less expensive, consistent and safe source of starting material for controls. The control viruses described herein use Sindbis virus, an RNA-containing enveloped virus that can be engineered to contain target RNA sequences such as sequences from another virus and/or an internal control sequence. The Sindbis virus coat provides the RNA genome with improved stability. In some embodiments, the recombinant Sindbis virus system described herein results in viral particles that are packaged, so they can be used to evaluate nucleic acid extraction processes that are used before nucleic acid detection. Also provided herein are compositions comprising such viruses, nucleic acid molecules encoding the RNA genome of such control viruses, methods of making such control viruses and methods of using such control viruses.
Definitions
[0046] For convenience, certain terms employed in the specification, examples, and appended claims are collected here.
[0047] The articles "a" and "an" are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.
[0048] The term "biological sample," "tissue sample," or simply "sample" each refers to a collection of cells obtained from a tissue of a subject. The source of the tissue sample may be solid tissue, as from a fresh, frozen and/or preserved organ, tissue sample, biopsy, or aspirate; blood, serum or any other blood constituents; bodily fluids such as cerebrospinal fluid, amniotic fluid, peritoneal fluid or interstitial fluid, urine, saliva, stool, tears; or cells from any time in gestation or development of the subject.
[0049] The term "control" includes any portion of an experimental system designed to demonstrate that the factor being tested is responsible for the observed effect, and is therefore useful to isolate and quantify the effect of one variable on a system.
[0050] The term "gene" is used broadly to refer to any nucleic acid associated with a biological function. The term "gene" applies to a specific genomic sequence, as well as to a cDNA or an mRNA encoded by that genomic sequence.
[0051] As used herein, the term "heterologous RNA" refers to RNA present in a recombinant Sindbis virus that is not derived from wild-type Sindbis virus. For example, heterologous RNA in a Sindbis virus can be an RNA sequence normally found in a different virus (e.g., a different RNA virus or retrovirus), can be an RNA sequence normally found in a non-viral organism, or can be a completely artificial RNA sequence.
[0052] The term "isolated nucleic acid" refers to a polynucleotide of natural or synthetic origin or some combination thereof, which (1) is not associated with the cell in which the "isolated nucleic acid" is found in nature, and/or (2) is operably linked to a polynucleotide to which it is not linked in nature.
[0053] The terms "polynucleotide", and "nucleic acid" are used interchangeably. They refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any three-dimensional structure, and may perform any function. The following are non-limiting examples of polynucleotides: coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure may be imparted before or after assembly of the polymer. A polynucleotide may be further modified, such as by conjugation with a labeling component. In all nucleotide sequences provided herein, U nucleotides are interchangeable with T nucleotides.
[0054] As used herein, the term "Sindbis virus" includes viral particles made up of an icosahedral capsid that comprises Sindbis virus capsid, E1 and E2 proteins encompassing a single-stranded RNA genome. The RNA genome can include non-Sindbis RNA (i.e., heterologous RNA) and does not need to include all parts of the wild-type Sindbis genome. For example, in some embodiments the RNA genome does not encode one or more of the Sindbis structural proteins.
Replication Deficient Sindbis Control Viruses
[0055] In certain embodiments, provided herein are replication deficient Sindbis control viruses. In some embodiments, such viruses have an RNA genome that includes (a) an open reading frame (ORF) encoding functional Sindbis non-structural proteins and (b) a heterologous (i.e., non-Sindbis) RNA sequence. In some embodiments, the ORF encoding the functional Sindbis non-structural proteins is located 5' of the heterologous RNA sequence. In some embodiments, the heterologous RNA sequence is a sequence from a different RNA virus (e.g., an Ebolavirus sequence, an influenza virus sequence, a SARS virus sequence, a hepatitis C virus sequence, a West Nile virus sequence, a Zika virus sequence, a poliovirus sequence or a measles virus sequence) or a sequence from a retrovirus (e.g., an HIV-1 sequence, an HIV-2 sequence, an HTLV-1 sequence, or an HTLV-II sequence).
[0056] Wild-type Sindbis virus is a member of the Alphavirus genus, family Togaviridae. The viral genome is approximately 11,700 nucleotides long. As such, Sindbis virus has approximately the same genomic complexity as many human pathogenic viruses, including, for example, HIV-1 (9270 nucleotides), HCV (9700 nucleotides) and Ebola Zaire (18959 nucleotides). This offers a technical advantage over certain other technologies used to package RNA controls, such as Armored RNA, which are based on MS2 bacteriophage technology and produce recombinant RNA molecules as small as 900 bases in length, which in many instances do not adequately reflect the complexity or RNA secondary structure of the pathogenic viruses found in patient samples.
[0057] As depicted in FIG. 1, wild-type Sindbis virus contains a single-stranded positive sense genomic RNA which encodes both viral structural proteins (for capsid assembly and viral budding) as well as the nonstructural proteins (such as the replication enzymes). Upon entry of the virus into a cell, the RNA is released into cytoplasm and drives production of the viral replicase proteins (non-structural proteins 1-4). These proteins form replication and transcription complexes and are responsible for generating the negative strand of the genomic RNA. Promoters in the negative strand genomic RNA drive transcription of two mRNA species: The full-length genomic RNA encodes the nonstructural proteins and the smaller subgenomic RNA encodes the structural proteins. The 5' ends of both transcripts are capped with 7-methylguanosine and the 3' ends are polyadenylated.
[0058] In certain embodiments, the recombinant Sindbis control viruses described herein are replication deficient. In some embodiments, any method can be used to render the recombinant Sindbis control virus replication deficient. For example, in some embodiments the Sindbis control virus does not encode one or more functional structural proteins. In some embodiments, the recombinant Sindbis control virus genome does not encode one or more functional nonstructural proteins. In some embodiments, the Sindbis control virus does not encode a functional nsP1 protein, a functional nsP2 protein, a functional nsP3 protein and/or a functional nsP4 protein.
[0059] As described herein, separation of the Sindbis viral genome into two ORFs facilitates the manipulation of the viral genome through replacement of the genes coding for the structural proteins with target sequences. This modified genomic RNA can be transcribed in vitro and introduced into cells along with a helper RNA (e.g., encoding structural proteins not encoded for in the modified RNA genome) for the defective virus. In some embodiments, the helper RNA encodes the four structural proteins required for Sindbis virus packaging. In some embodiments, the helper RNA does not contain a packaging signal, and so does not get incorporated into the assembled viral particles. Thus, in certain embodiments, the viral particles produced contain the target sequences but are replication defective because they do not bear the genetic information to produce the structural proteins. The recombinant viruses produced are effective quality control materials since they bear the selected target sequences, but the design of the recombinant Sindbis system provided herein ensures that the virus particles are safe and are not capable of establishing continuous infection. This is a distinct advantage for these materials over patient sourced or cultured viral materials as controls. FIG. 2 illustrates transcribed RNAs used for assembly of replication defective recombinant viruses.
[0060] Assembly of the virus particle occurs at the plasma membrane. A heterodimer of the structural proteins, E1 and E2, inserts into the plasma membrane and the E2 cytoplasmic tail is thought to provide the binding site for the nucleocapsid. This interaction between E2 and the nucleocapsid is thought to initiate the actual budding and release of the virus. When recombinant Sindbis viruses are produced in cultured cells, the virus particles are collected from the culture media, where they typically reach concentrations greater than 1×10^8 viral copies/mL. The budding process results in the recombinant Sindbis virus being enveloped into a lipid bilayer. This is important since the structure of the recombinant virus is thus similar to many other viruses generally classified as RNA-containing enveloped viruses such as HIV-1, HCV, HTLV, Influenza, and SARS. Therefore, the replication deficient Sindbis vectors described herein can be a true whole process control as they undergo sample lysis and nucleic acid processing similar to human pathogenic viruses that may be found in patient samples.
[0061] In recombinant Sindbis viruses, the target sequences replace the structural genes. This gives the system great flexibility in the size of the target sequences that can be accommodated and packaged efficiently. Target sequences of less than 100 bp to greater than 4000 bp can be efficiently incorporated in the recombinant viruses. The ability to accommodate large sequences is a distinct advantage, especially when producing controls for multiplexed assays. Multiple target sequences (from different pathogens or from different genes within the same pathogen) can be combined in one recombinant virus to form a multiplex control.
[0062] In some embodiments, the Sindbis control viruses described herein comprise HIV-1 sequence and are therefore useful as a control for HIV-1 diagnostic assays. In some embodiments, the HIV-1 sequence in the Sindbis control virus is distinct from naturally occurring HIV-1 virus sequence in that it contains resistance mutations arising from multiple classes of current HIV-1 therapies. Such multiplexed mutations do not occur in nature. In some embodiments, the control virus has the various drug resistance mutations present at the same allelic ratio. This provides users with a clear expectation for their test results. In certain embodiments, stop codons are engineered into the HIV-1 sequences so that no functional HIV-1 proteins are produced.
[0063] In some aspects provided herein is an HIV-1 Sindbis control virus that comprises an HIV-1 sequence in its RNA genome. In some embodiments, the HIV-1 sequence comprises one or more mutations that, when present in an HIV-1 virus, convey a drug resistance phenotype (e.g., resistance to a protease inhibitor, a nucleoside analogue reverse transcriptase inhibitor and/or a non-nucleoside analog reverse transcriptase inhibitor). For example, in some embodiments the HIV-1 virus sequence comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 mutations that convey a drug resistant phenotype. In some embodiments, the one or more mutations, when present in HIV-1 virus, convey resistance to a drug selected from the group consisting of: atazanavir, ritonavir, darunavir, fosamprenavir, indinavir, lopinavir, nelfinavir, saquinavir, tipranavir, abacavir, didanosine, emtricitabine, lamivudine, stavudine, tenofovir, zidovudine, efavirenz, etravirine, nevirapine or rilpivirine. In some embodiments, the one or more mutations are selected from the group consisting of L24I, D30N, V32I, M46I, I47V, G48V, I50V, I54M, G73S, L76V, V82A, I84V, N88D, L90M, M41L, K65R, D67N, T69S insert SS, K70R, L74V, F77L, Y115F, F116Y, Q151M, M184V, L210W, T215Y, K219Q, L100I, K101E, K103N, V106A, V108I, Y181C, Y188L, G190A, P225H and M230L.
In some embodiments, the one or more mutations are selected from the group consisting of L24I (TTA to ATA), D30N (GAT to AAT), V32I (GTA to ATA), M46I (ATG to ATA), I47V (ATA to CTA), G48V (GGG to GTG), I50V (ATT to GTT), I54M (ATC to ATG), G73S (GGT to GCT), L76V (TTA to GTA), V82A (GTC to GCC), I84V (ATA to GTA), N88D (AAT to GAT), L90M (TTG to ATG), M41L (ATG to TTG), K65R (AAA to AGA), D67N (GAC to AAC), T69S insert SS (ACT to TCT and insertion of TCC and TCC), K70R (AAA to AGA), L74V (TTA to GTA), F77L (TTC to CTC), Y115F (TAT to TTT), F116Y (TTT to TAT), Q151M (CAG to ATG), M184V (ATG to GTG), L210W (TTG to TGG), T215Y (ACC to TAC), K219Q (AAA to CAA), L100I (TTA to ATA), K101E (AAA to GAA), K103N (AAA to AAC), V106A (GTA to GCA), V108I (GTA to ATA), Y181C (TAT to TGT), Y188L (TAT to TTA), G190A (GGA to GCA), P225H (CCT to CAT) and M230L (ATG to CTG). In some embodiments, the HIV-1 sequence comprises at least a portion of an HIV-1 gene selected from p7, p1, p6, HIV protease, reverse transcriptase, p51 RNAse, integrase and gp120. In some embodiments, the HIV-1 sequence comprises at least a portion of p7, p1, p6, HIV protease, reverse transcriptase and integrase. In certain embodiments, the HIV-1 sequence comprises at least a portion of gp120, wherein the portion comprises the V1-V5 variable loops. In some embodiments, the HIV-1 sequence comprises a sequence that is at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to nucleotides 1900 through 5400 and/or 6300 through 7825 of the HXB2 strain of HIV-1 (SEQ ID NO: 4). In some embodiments, the HIV-1 sequence is identical to nucleotides 1900 through 5400 and/or 6300 through 7825 of the HXB2 strain of HIV-1 (SEQ ID NO: 4) except for the presence of the mutations that convey a drug resistance phenotype.
In some embodiments, the heterologous RNA sequence comprises a sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 5 and/or SEQ ID NO: 7. In some embodiments, the heterologous RNA sequence comprises a sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 6 and/or SEQ ID NO: 8.
[0064] In some embodiments, the Sindbis control viruses described herein comprise Ebolavirus sequence and are therefore useful as a control for Ebolavirus diagnostic assays. In some embodiments, the Ebolavirus sequence comprises at least a portion of an Ebolavirus GP gene sequence, an Ebolavirus NP gene sequence or an Ebolavirus VP24 gene sequence. In some embodiments, the heterologous RNA sequence does not encode a functional Ebola protein (e.g., the heterologous RNA sequence encodes truncated Ebola proteins, Ebola proteins with frame-shift mutations and/or Ebola protein sequences lacking a start codon). In some embodiments, the heterologous RNA sequence comprises a sequence at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% identical to SEQ ID NO: 2 or SEQ ID NO: 3.
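The "at least 80% ... identical" thresholds recited above can be illustrated with a short calculation. The following is only a minimal sketch: it assumes the two sequences have already been aligned to equal length (with "-" marking gaps), that a gap opposite a base counts as a mismatch, and that U and T are treated as equivalent, consistent with paragraph [0053]. The specification does not define the alignment method or the denominator, and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns carrying the same base (illustrative only)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Treat U and T as interchangeable and ignore case.
    a = aligned_a.upper().replace("U", "T")
    b = aligned_b.upper().replace("U", "T")
    # Assumption: columns where both sequences have a gap are skipped;
    # a gap opposite a base is counted as a mismatch.
    columns = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # An RNA sequence compared with its DNA counterpart.
    print(percent_identity("AUGCAUGC", "ATGCATGC"))
```

In practice the alignment itself would come from a dedicated tool; this sketch only covers the final counting step.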
[0065] The Sindbis control viruses described herein can be generated using any method known in the art. An exemplary method of generating the Sindbis control viruses described herein is illustrated in FIG. 3. In this exemplary method, capped in vitro transcripts of recombinant RNA bearing the sequence of interest and the helper RNA are first synthesized. The synthesized RNAs are then electroporated into an appropriate cell, such as a BHK cell. The Sindbis structural proteins are expressed, but since the RNA does not encode replicase enzymes, no new RNA is transcribed. Recombinant RNA is packaged by the capsid proteins. Viral glycoproteins associate with the nucleocapsid and viral particles bud into the culture medium. The culture supernatant is then collected, filtered and heat inactivated. The viral titer can then be determined using an appropriate method, such as real-time PCR, and appropriate quality control tests can be performed to ensure that the RNA is fully encapsulated and there is no contaminating template DNA.
Use of Sindbis Control Vectors in Nucleic Acid Diagnostic Assays
[0066] In certain aspects, provided herein are methods of testing a diagnostic assay by running the diagnostic assay on a composition comprising the replication deficient Sindbis virus described herein. In some embodiments, the diagnostic assay is an assay for the detection of Ebolavirus, an influenza virus, a SARS virus, a hepatitis C virus, a West Nile virus, a Zika virus, a poliovirus, a measles virus, an HIV-1 virus, an HIV-2 virus, an HTLV-I virus and/or an HTLV-II virus. In certain embodiments, the heterologous RNA sequence in the RNA genome of the replication deficient Sindbis virus contains the target sequence detected in the diagnostic assay.
[0067] In some embodiments, the diagnostic assay is a nucleic acid amplification based diagnostic assay. In some embodiments, the nucleic acid amplification based diagnostic assay includes a sample lysis step, a nucleic acid extraction step (e.g., a magnetic-bead based nucleic acid extraction step), a nucleic acid amplification step and/or a nucleic acid detection step. In some embodiments, the nucleic acid amplification and detection steps are performed simultaneously (e.g., through the use of a real-time detection technology, such as TaqMan probes or molecular beacons). Examples of nucleic acid amplification processes include, but are not limited to, polymerase chain reaction (PCR), LATE-PCR (a non-symmetric PCR method of amplification), ligase chain reaction (LCR), strand displacement amplification (SDA), transcription mediated amplification (TMA), self-sustained sequence replication (3SR), Qβ replicase based amplification, nucleic acid sequence-based amplification (NASBA), repair chain reaction (RCR), boomerang DNA amplification (BDA) and/or rolling circle amplification (RCA).
[0068] In some embodiments, the diagnostic assay is a nucleic acid sequencing based diagnostic assay (e.g., a next-generation sequencing based diagnostic assay). In some embodiments, the nucleic acid sequencing based diagnostic assay includes a sample lysis step, a nucleic acid extraction step (e.g., a magnetic-bead based nucleic acid extraction step), a nucleic acid amplification step, and/or a nucleic acid sequencing step. Examples of nucleic acid sequencing processes include, but are not limited to chain termination sequencing, sequencing by ligation, sequencing by synthesis, pyrosequencing, ion semiconductor sequencing, single-molecule real-time sequencing, 454 sequencing, and/or Dilute-`N`-Go sequencing.
EXAMPLES
Example 1
Production of an Ebola Sindbis Control Virus
[0069] Ebola is a Filovirus with a single stranded, negative sense RNA genome. The Ebola virus genome includes the glycoprotein gene (GP) and the nucleoprotein gene (NP); these two genes were the targets of common nucleic acid-based diagnostic assays.
[0070] Ebola Sindbis Control virus was generated to serve as a control in such diagnostic assays. Recombinant Sindbis constructs were designed by cloning either about 2 kb of Ebola Zaire GP gene sequence (SEQ ID NO: 9) or about 1.5 kb of NP gene sequence and about 0.5 kb of a third Ebola gene, VP24 (SEQ ID NO: 10) into the Xba I restriction site of a SinRep SC vector (SEQ ID NO: 1). To ensure that no functional Ebola proteins would be produced, the constructs were designed to encode severely truncated GP and NP gene sequences. The GP constructs also lacked the AUG start codon for translation initiation and the NP construct contained a large internal deletion that changes the reading frame. Engineered stop codons were introduced in both constructs. These measures increase the safety of the product, but do not interfere with target detection (primer and probe binding) of the targeted diagnostic assays.
[0071] SEQ ID NO: 9 is an exemplary complete GP Ebola Sindbis control virus genome. Nucleotides 1 to 7652 and 9708 to 10080 of SEQ ID NO: 9 are Sindbis gene sequences, and nucleotides 7653 to 9707 of SEQ ID NO: 9 are Ebola GP insert sequences. SEQ ID NO: 10 is an exemplary complete NP/VP24 Ebola Sindbis control virus genome. Nucleotides 1 to 7652 and 9976 to 10348 of SEQ ID NO: 10 are Sindbis gene sequences, and nucleotides 7653 to 9975 of SEQ ID NO: 10 are Ebola NP/VP24 insert sequences.
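The coordinates recited above can be sanity-checked with a few lines of code. This is a minimal sketch, not part of the described method: it takes the 1-based, inclusive positions given for SEQ ID NO: 9 and SEQ ID NO: 10, confirms that the segments tile each genome without gaps or overlaps, and reports the insert length. The dictionary layout and function names are hypothetical.

```python
# Segment coordinates (1-based, inclusive) as recited for SEQ ID NO: 9 and 10.
CONSTRUCTS = {
    "GP (SEQ ID NO: 9)": [
        ("Sindbis", 1, 7652),
        ("Ebola insert", 7653, 9707),
        ("Sindbis", 9708, 10080),
    ],
    "NP/VP24 (SEQ ID NO: 10)": [
        ("Sindbis", 1, 7652),
        ("Ebola insert", 7653, 9975),
        ("Sindbis", 9976, 10348),
    ],
}


def is_contiguous(segments):
    """True if segments start at position 1 and each begins right after the last."""
    expected_start = 1
    for _label, start, end in segments:
        if start != expected_start or end < start:
            return False
        expected_start = end + 1
    return True


def insert_length(segments):
    """Total length of all non-Sindbis segments (inclusive coordinates)."""
    return sum(end - start + 1 for label, start, end in segments if label != "Sindbis")


if __name__ == "__main__":
    for name, segments in CONSTRUCTS.items():
        total = segments[-1][2]
        print(name, "contiguous:", is_contiguous(segments),
              "insert nt:", insert_length(segments), "genome nt:", total)
```

With these coordinates the GP insert is 2055 nucleotides and the NP/VP24 insert is 2323 nucleotides, in line with the "about 2 kb" and "about 1.5 kb" plus "about 0.5 kb" figures given in Example 1.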
[0072] Capped Ebola Sindbis control virus RNA was transcribed in vitro along with the helper RNA and introduced into baby hamster kidney cells. At 24 hours post-transfection, the cell supernatant was collected and the viral particles were purified and concentrated. Heat treatment was performed using a time and temperature known to inactivate similar RNA viruses as a further safety precaution. After titering the viruses using a TaqMan reverse transcription PCR assay, the viruses were combined and diluted into defibrinated human plasma containing human genomic DNA and 0.09% sodium azide as a preservative.
[0073] Three independent lots of the Ebola Sindbis control virus were tested in a real-time nucleic acid amplification based diagnostic assay developed for the detection of Ebola Zaire virus. The control material was processed identically to how an unknown patient sample would be processed. Representative results of this assay are shown in Table 1. In this table, Ct is the Cycle threshold value and SAC is the sample adequacy control (which verifies human source DNA in the sample).
TABLE 1 Ebola Sindbis control virus tested in an Ebola detection diagnostic assay.
Sample ID   Input Volume   Test Result                             GP Ct   NP Ct   SAC Ct
Lot 1       250 µL         Ebola GP DETECTED; Ebola NP DETECTED    29.5    28.5    35.0
Lot 2       250 µL         Ebola GP DETECTED; Ebola NP DETECTED    30.6    29.5    34.9
Lot 3       250 µL         Ebola GP DETECTED; Ebola NP DETECTED    30.3    29.5    34.4
Example 2
Stability of an Ebola Sindbis Control Virus
[0074] Stability of quality control materials is critical, especially considering that for many automated systems, reagents are loaded onto the instrument and must be stable at ambient temperatures for extended periods. Thus, the stability of the Ebola Sindbis Control virus produced as described in Example 1 under various storage conditions was tested.
[0075] Vials of the Ebola Sindbis Control virus produced as described in Example 1 were subjected to storage at 37° C. At designated time points, vials were removed from the stress condition and extracted using the Qiagen QIAamp Viral RNA Mini Kit. Testing was performed via a TaqMan quantitative real time PCR assay. Results are shown in FIG. 4 and indicate no loss of stability after 22 days at 37° C. Using a model based on the Arrhenius equation, this stability at 37° C. correlates with a stability at a storage temperature of 2-8° C. of at least 2 years.
[0076] Vials of the Ebola Sindbis Control virus produced as described in Example 1 were subjected to multiple rounds of freezing and thawing (F/T). As shown in FIG. 5, subjecting the Sindbis control virus to three freeze-thaw cycles did not have an adverse effect on the stability of the virus.
[0077] To test the extended stability of a Sindbis control vector at various temperatures, a recombinant Sindbis virus (bearing 0.8 kb of target sequence) was diluted into defibrinated human plasma at 5×10^5 copies/mL target concentration. The material was dispensed into vials and vials were stored frozen at -20° C., refrigerated at 2-8° C. or at ambient lab temperature (approximately 25° C.) for up to 200 days. Vials were tested periodically using a TaqMan real time PCR test. No loss of stability was detected across the seven months of storage, even for samples stored at ambient temperatures. This demonstrates that the viral coat proteins and envelope of the Sindbis virus form a stable protective barrier that prevents nucleases in complex clinical matrices such as plasma from degrading the target RNA sequence.
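An Arrhenius-based extrapolation of this kind can be sketched as follows. The specification does not state the activation energy or the exact model used, so the 80 kJ/mol value below is purely a hypothetical placeholder, and 5° C. is taken as a representative point within the 2-8° C. range. The code only illustrates how an acceleration factor converts days at a stress temperature into equivalent days at a storage temperature.

```python
import math

R = 8.314  # universal gas constant, J/(mol*K)


def acceleration_factor(ea_j_per_mol, stress_c, storage_c):
    """Ratio k(stress)/k(storage) of degradation rates, from k = A*exp(-Ea/(R*T))."""
    t_stress = stress_c + 273.15    # convert to kelvin
    t_storage = storage_c + 273.15
    return math.exp((ea_j_per_mol / R) * (1.0 / t_storage - 1.0 / t_stress))


def equivalent_storage_days(days_at_stress, ea_j_per_mol, stress_c, storage_c):
    """Days at the storage temperature equivalent to the observed days at stress."""
    return days_at_stress * acceleration_factor(ea_j_per_mol, stress_c, storage_c)


if __name__ == "__main__":
    EA_ASSUMED = 80_000.0  # hypothetical activation energy (J/mol), not from the source
    days = equivalent_storage_days(22, EA_ASSUMED, stress_c=37.0, storage_c=5.0)
    print(f"{days:.0f} days at 5 C (about {days / 365:.1f} years)")
```

The result is highly sensitive to the assumed activation energy, so any real shelf-life claim would need an experimentally supported value.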
Example 3
Production of HIV-1 Multiplex Drug Resistance Sindbis Control Virus
[0078] A Sindbis control virus was generated for use in diagnostic assays for the detection of drug resistant HIV-1 viruses. The Los Alamos National Laboratory HIV Sequence Database was used to generate a "reference sequence" for the control virus. Based on this database as well as the publication "Special Contribution Update of the Drug Resistance Mutations in HIV-1: March 2013" by Victoria A. Johnson et al., in Topics in Antiviral Medicine, mutations in the HIV-1 genome that confer resistance to therapeutic drugs, and the drugs affected, were identified. These mutations and drugs are summarized in Table 2.
TABLE 2 Drug resistant mutations of HIV included in the HIV-1 multiplex drug resistance Sindbis control virus.

Drug Class: Protease Inhibitors
Therapy: Atazanavir +/- ritonavir; Darunavir/ritonavir; Fosamprenavir/ritonavir; Indinavir/ritonavir; Lopinavir/ritonavir; Nelfinavir; Saquinavir/ritonavir; Tipranavir/ritonavir
Resistance Mutation    DNA Sequence change from reference sequence
L24I                   TTA to ATA
D30N                   GAT to AAT
V32I                   GTA to ATA
M46I                   ATG to ATA
I47V                   ATA to CTA
G48V                   GGG to GTG
I50V                   ATT to GTT
I54M                   ATC to ATG
G73S                   GGT to GCT
L76V                   TTA to GTA
V82A                   GTC to GCC
I84V                   ATA to GTA
N88D                   AAT to GAT
L90M                   TTG to ATG

Drug Class: Nucleoside and Nucleotide Analogue Reverse Transcriptase Inhibitors (NRTI)
Therapy: Abacavir; Didanosine; Emtricitabine; Lamivudine; Stavudine; Tenofovir; Zidovudine
Resistance Mutation    DNA Sequence change from reference sequence
M41L                   ATG to TTG
K65R                   AAA to AGA
D67N                   GAC to AAC
T69S insert SS         ACT to TCT and insertion of TCC TCC
K70R                   AAA to AGA
L74V                   TTA to GTA
F77L                   TTC to CTC
Y115F                  TAT to TTT
F116Y                  TTT to TAT
Q151M                  CAG to ATG
M184V                  ATG to GTG
L210W                  TTG to TGG
T215Y                  ACC to TAC
K219Q                  AAA to CAA

Drug Class: Non-Nucleoside Analogue Reverse Transcriptase Inhibitors (NNRTI)
Therapy: Efavirenz; Etravirine; Nevirapine; Rilpivirine
Resistance Mutation    DNA Sequence change from reference sequence
L100I                  TTA to ATA
K101E                  AAA to GAA
K103N                  AAA to AAC
V106A                  GTA to GCA
V108I                  GTA to ATA
Y181C                  TAT to TGT
Y188L                  TAT to TTA
G190A                  GGA to GCA
P225H                  CCT to CAT
M230L                  ATG to CTG
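Each entry in Table 2 pairs an amino acid substitution name with a codon change, so the pairing can be checked mechanically against the standard genetic code. The sketch below is illustrative only: it covers simple single-codon substitutions (not the T69S insertion), uses a handful of entries from Table 2 as examples, and the function name is hypothetical.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (DNA alphabet).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def check_substitution(name, ref_codon, mut_codon):
    """True if ref_codon and mut_codon encode the residues named in e.g. 'D30N'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"unrecognised mutation name: {name}")
    ref_aa, _position, mut_aa = match.groups()
    return CODON_TABLE[ref_codon] == ref_aa and CODON_TABLE[mut_codon] == mut_aa


if __name__ == "__main__":
    # A few example entries taken from Table 2.
    examples = [
        ("D30N", "GAT", "AAT"),
        ("M184V", "ATG", "GTG"),
        ("K103N", "AAA", "AAC"),
        ("M230L", "ATG", "CTG"),
    ]
    for name, ref, mut in examples:
        print(name, ref, "->", mut, check_substitution(name, ref, mut))
    # A mismatched pair returns False (CCT encodes proline, not methionine).
    print(check_substitution("M230L", "CCT", "CAT"))
```

A check of this kind is a convenient way to catch transcription slips when a mutation list is copied between a table and running text.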
[0079] In addition to the mutations described above, virus entry inhibitor drugs such as Maraviroc are blocked by mutations in the envelope gene. This drug is a CC chemokine receptor 5 (CCR5) antagonist and is only effective for patients with virus that uses the CCR5 co-receptor for viral entry. Viruses that use both CCR5 and CXC chemokine receptor 4 (CXCR4) or only CXCR4 will not respond to treatment with CCR5 antagonists. A virus's ability to use the CXCR4 co-receptor is not defined by a single mutation, but instead is determined by the sequence of several variable "loops" in the gp120 envelope gene.
[0080] HXB2 strain of HIV-1 is a CXCR4 utilizing virus. HXB2 sequence is available from the Los Alamos National Laboratory HIV Sequence Database. Its sequence was used in the development of the recombinant virus representing the mutant CXCR4 virus. BaL strain of HIV-1 uses exclusively CCR5 co-receptor. Its sequence was obtained from the NCI database and used in the development of recombinant Sindbis virus representing wild type CCR5 virus.
[0081] Four DNA sequences were chemically synthesized and cloned into the Xba I restriction site of a SinRep SC Sindbis expression plasmid (SEQ ID NO: 1), which bears genes required for Sindbis virus production. Four Sindbis control viruses were generated, one that contained the 5' end of a wild-type HIV-1 genome, one that contained the 5' end of a multidrug resistant HIV-1 viral genome, one that contained the 3' end of a wild-type HIV-1 genome and one that contained the 3' end of a multidrug resistant HIV-1 viral genome. The insert sequences for these four control viruses are described in Table 3.
TABLE 3 Description of recombinant virus sequences.

Construct Designation: 5' multi-mutant (SEQ ID NO: 11)
  Genes included in the sequence: part of p7, p1, p6, Protease, RT, p51 RNAse and Integrase
  HXB2-Nucleotide Positions: Contains continuous sequence from nucleotides 1900 through 5400. The mutations shown in Table 2 are incorporated.

Construct Designation: 5' WT (SEQ ID NO: 12)
  Genes included in the sequence: part of p7, p1, p6, Protease, RT, p51 RNAse and Integrase
  HXB2-Nucleotide Positions: Contains continuous sequence from nucleotides 1900 through 5400.

Construct Designation: 3' mutant (SEQ ID NO: 13)
  Genes included in the sequence: A portion of gp120 including V1-V5 variable loops
  HXB2-Nucleotide Positions: Nucleotides 6300-7825 of HXB2 sequence are included.

Construct Designation: 3' WT (SEQ ID NO: 14)
  Genes included in the sequence: A portion of gp120 including V1-V5 variable loops
  HXB2-Nucleotide Positions: The BaL sequence which corresponds to HXB2 6300-7825 (as determined by BLAST alignment) is included.
[0082] SEQ ID NO: 11 is the DNA counterpart to an exemplary complete 5' multi-mutant HIV-1 Sindbis control virus genome. Nucleotides 1 to 7646 and 11167 to 11655 indicate Sindbis gene sequences, and nucleotides 7647 to 11166 indicate multi-mutant HIV-1 insert sequences. SEQ ID NO: 12 is the DNA counterpart to an exemplary complete 5' wild-type HIV-1 Sindbis control virus genome. Nucleotides 1 to 7646 and 11161 to 11649 indicate Sindbis gene sequences, and nucleotides 7647 to 11160 indicate wild-type HIV-1 insert sequences. SEQ ID NO: 13 is the DNA counterpart to an exemplary complete 3' mutant HIV-1 Sindbis control virus genome. Nucleotides 1 to 7646 and 9187 to 9675 indicate Sindbis gene sequences, and nucleotides 7647 to 9186 indicate mutant HIV-1 insert sequences. SEQ ID NO: 14 is the DNA counterpart to an exemplary complete 3' wild-type HIV-1 Sindbis control virus genome. Nucleotides 1 to 7646 and 9182 to 9670 indicate Sindbis gene sequences, and nucleotides 7647 to 9181 indicate wild-type HIV-1 insert sequences.
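The insert boundaries recited above allow a quick comparison of the four constructs. The following minimal sketch (variable and function names are hypothetical) computes each insert length and the length of the Sindbis sequence that follows the insert, using the 1-based inclusive coordinates given for SEQ ID NOs: 11-14.

```python
# (insert_start, insert_end, genome_end), 1-based inclusive, per SEQ ID NOs: 11-14.
HIV_CONSTRUCTS = {
    "5' multi-mutant (SEQ ID NO: 11)": (7647, 11166, 11655),
    "5' WT (SEQ ID NO: 12)": (7647, 11160, 11649),
    "3' mutant (SEQ ID NO: 13)": (7647, 9186, 9675),
    "3' WT (SEQ ID NO: 14)": (7647, 9181, 9670),
}


def insert_len(insert_start, insert_end):
    """Length of the HIV-1 insert (inclusive coordinates)."""
    return insert_end - insert_start + 1


def downstream_sindbis_len(insert_end, genome_end):
    """Length of the Sindbis sequence located 3' of the insert."""
    return genome_end - insert_end


if __name__ == "__main__":
    for name, (start, end, total) in HIV_CONSTRUCTS.items():
        print(name, "insert:", insert_len(start, end),
              "3' Sindbis:", downstream_sindbis_len(end, total))
    mutant = insert_len(*HIV_CONSTRUCTS["5' multi-mutant (SEQ ID NO: 11)"][:2])
    wild_type = insert_len(*HIV_CONSTRUCTS["5' WT (SEQ ID NO: 12)"][:2])
    print("5' mutant minus 5' WT:", mutant - wild_type, "nt")
```

The 5' multi-mutant insert comes out 6 nucleotides longer than the 5' wild-type insert, which is consistent with the two inserted TCC codons of the "T69S insert SS" mutation, since the remaining mutations in Table 2 are substitutions that do not change length.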
[0083] The process used to produce the recombinant HIV-1 Sindbis control viruses is outlined in FIG. 7. Briefly, plasmids that contain the target HIV-1 sequences in the SinRep vector were linearized with Not I restriction enzyme. An aliquot was analyzed by agarose gel electrophoresis to ensure that DNA cutting was complete.
[0084] Ambion mMessage mMachine SP6 kit was used for in vitro transcription of large amounts of capped RNA using reaction conditions optimized for long transcripts. DHBB is a helper RNA needed for packaging of the replication defective Sindbis virus; this helper RNA was transcribed from a linearized plasmid as well. The integrity and identity of the transcribed RNA was analyzed by denaturing agarose gel electrophoresis. The RNA was treated with DNAse to remove template plasmid DNA and purified using Ambion MegaClear kit.
[0085] To ensure optimal cell viability, BHK-21 (Baby Hamster Kidney cells) were amplified in culture for 2-4 passages after revival of frozen stock. Immediately prior to electroporation, the fetal bovine serum in the culture media was reduced, which helps reduce this cell's tendency to form clumps. Preventing cell clumps is desirable to maximize the transfection efficiency during electroporation.
[0086] The in vitro transcribed RNA was introduced into the BHK-21 cells via electroporation. The cells were washed at 6 hours post transfection to remove any unincorporated RNA.
[0087] The in vitro transcribed RNAs (HIV-1 sequences in SinRep RNA and DHBB helper RNA) were translated within the cells to produce the proteins required for recombinant Sindbis virus assembly and budding. The recombinant viruses were released into the culture media. The culture media was collected at 24 hours post transfection. The crude viral supernatant and the purified viruses were titered by extracting the viral nucleic acids using the Qiagen QIAamp Viral RNA Mini Kit and then using a quantitative TaqMan real time PCR assay which targets a portion of the Sindbis viral vector RNA.
Example 4
Production of a Zika Sindbis Control Virus
[0088] The Zika virus genome is a positive-sense, single-stranded RNA molecule about 10794 bases long, and it encodes a single polyprotein that is subsequently cleaved into capsid (C), precursor membrane (prM), envelope (E), and non-structural proteins (NS). Zika virus reference materials were designed based on a 2007 Zika virus strain with GenBank Accession number EU545988.1 (SEQ ID NO: 15). For the Zika Reference Materials, this genome was divided across four different constructs with at least approximately 150 bp overlap between constructs and breakpoints at the ends of conserved domains. The overlap design is shown in FIGS. 9 and 10.
[0089] There was a 152 bp overlap between the "Zika Env" and "Zika NS2/NS3" constructs, a 150 bp overlap between the "Zika NS2/NS3" and "Zika NS4" constructs and a 180 bp overlap between the "Zika NS4" and "Zika NS5" constructs. These overlaps are designed to cover any diagnostic assays that target the ends of conserved domains. All four constructs were synthesized and introduced into Sindbis plasmids, which were used to prepare recombinant Sindbis virus.
[0090] The recombinant Zika/Sindbis viruses were expressed, and high titer stock solutions of the viruses were prepared. The high titer stock solutions of recombinant Zika/Sindbis virus were diluted 1:100 in PBS, and RNA was extracted and eluted into 120 µL of 1:10 diluted AVE buffer. Extracted RNA was assayed by droplet digital PCR using a one-step RT-ddPCR master mix (Bio-Rad, 186-4021) at neat and 1:10 dilutions. Vector specific primer/probe sets were used for quantifying all four constructs as shown in Table 4.
TABLE 4 Quantification of Zika Construct Copy Numbers in Zika Virus Reference Materials
Sample              Copies per    Copies per µL of   Average copies per   Copies per mL of    Back calculated
                    20 µL Well    Extracted RNA      µL Extracted RNA     Extracted Sample    copies per mL stock
Zika Env            3580          716                6.81E+02             5.84E+05            5.84E+07
Zika Env            2800          560
Zika Env            3840          768
Zika Env 1:10       396           79.2               7.84E+01             6.72E+04            6.72E+07
Zika Env 1:10       388           77.6
Zika Env 1:10       392           78.4
Zika NS2/NS3        6040          1208               1.20E+03             1.03E+06            1.03E+08
Zika NS2/NS3        6360          1272
Zika NS2/NS3        5580          1116
Zika NS2/NS3 1:10   692           138.4              1.27E+02             1.09E+05            1.09E+08
Zika NS2/NS3 1:10   636           127.2
Zika NS2/NS3 1:10   578           115.6
Zika NS4            6460          1292               1.29E+03             1.10E+06            1.10E+08
Zika NS4            5380          1076
Zika NS4            7480          1496
Zika NS4 1:10       532           106.4              1.33E+02             1.14E+05            1.14E+08
Zika NS4 1:10       750           150
Zika NS4 1:10       718           143.6
Zika NS5            2528          505.6              5.33E+02             4.57E+05            4.57E+07
Zika NS5            2268          453.6
Zika NS5            3200          640
Zika NS5 1:10       264           52.8               5.13E+01             4.40E+04            4.40E+07
Zika NS5 1:10       196           39.2
Zika NS5 1:10       310           62
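The back-calculation behind Table 4 can be sketched as follows. Two of the constants are assumptions rather than stated facts: the 5 µL of RNA per 20 µL well is read off the ratio between the table's first two numeric columns, and the 140 µL extraction input volume is inferred from the ratio between the "per µL of extracted RNA" and "per mL of extracted sample" columns. The 120 µL elution volume and the 1:100 stock dilution come from paragraph [0090]. Function and constant names are hypothetical.

```python
from statistics import mean

RNA_PER_WELL_UL = 5.0     # assumption: inferred from Table 4 (copies per well / copies per uL)
ELUTION_UL = 120.0        # elution volume stated in paragraph [0090]
SAMPLE_INPUT_UL = 140.0   # assumption: inferred from the table ratios, not stated in the text
STOCK_DILUTION = 100.0    # stock diluted 1:100 in PBS before extraction


def copies_per_ml_stock(copies_per_well, rna_dilution=1.0):
    """Back-calculate stock titer (copies/mL) from replicate ddPCR well counts."""
    # Copies per uL of the RNA that was loaded into each well.
    per_ul_rna = [c / RNA_PER_WELL_UL for c in copies_per_well]
    average = mean(per_ul_rna)
    # Total copies in the eluate divided by the sample volume extracted (in mL).
    per_ml_sample = average * ELUTION_UL / (SAMPLE_INPUT_UL / 1000.0)
    # Undo the RNA dilution (1 for neat, 10 for 1:10) and the 1:100 stock dilution.
    return per_ml_sample * rna_dilution * STOCK_DILUTION


if __name__ == "__main__":
    neat = copies_per_ml_stock([3580, 2800, 3840])
    diluted = copies_per_ml_stock([396, 388, 392], rna_dilution=10.0)
    print(f"Zika Env neat: {neat:.2E} copies/mL stock")
    print(f"Zika Env 1:10: {diluted:.2E} copies/mL stock")
```

Under these assumptions the Zika Env replicates reproduce the tabulated stock values of 5.84E+07 (neat) and 6.72E+07 (1:10) copies/mL.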
[0091] Based on the high titer stock concentration, a 35 mL bulk was formulated at 5.0E+05 copies/mL in a diluent of filtered human plasma (Basematrix) containing 0.09% NaN₃ and human genomic DNA (H9 DNA, 50 ng/mL). A Pall Acropak 1000 Filter Capsule (PES RM-1002220) was used for filtering the plasma. To 900 mL of filtered plasma, 810 mg of sodium azide and 45 µg of human genomic DNA were added and mixed for 15 minutes. All four constructs were targeted to 5.0E+05 copies/mL in the prepared bulk. The bulk was mixed thoroughly for about 15 minutes, and RNA was extracted in triplicate and assayed using ddPCR with a One-Step RT-PCR master mix from Bio-Rad Laboratories (Catalogue # 186-4021). Assay-specific primer/probe sets were used to quantify each construct. Data are shown in Table 5.
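The formulation arithmetic in this paragraph can be checked with a short sketch. The diluent figures (810 mg and 45 µg in 900 mL) are taken from the text. The stock volume calculation is only an illustration of C1·V1 = C2·V2: the text does not say which stock titer was used for formulation, so the 5.84E+07 copies/mL value (the neat Zika Env result in Table 4) is an assumption, and all function names are hypothetical.

```python
def percent_w_v(mass_mg, volume_ml):
    """Weight/volume percent: grams of solute per 100 mL of solution."""
    return (mass_mg / 1000.0) / volume_ml * 100.0


def ng_per_ml(mass_ug, volume_ml):
    """Concentration in ng/mL from a mass in µg and a volume in mL."""
    return mass_ug * 1000.0 / volume_ml


def stock_volume_ml(stock_copies_per_ml, target_copies_per_ml, bulk_volume_ml):
    """Solve C1 * V1 = C2 * V2 for V1, the stock volume to add to the bulk."""
    return target_copies_per_ml * bulk_volume_ml / stock_copies_per_ml


print(round(percent_w_v(810, 900), 4))  # sodium azide, % w/v
print(ng_per_ml(45, 900))               # human genomic DNA, ng/mL
# Assumed stock titer (neat Zika Env value from Table 4), 35 mL bulk
print(round(stock_volume_ml(5.84e7, 5.0e5, 35.0), 3))  # mL of stock
```

The first two values agree with the 0.09% NaN₃ and 50 ng/mL figures given for the diluent. The third is roughly 0.3 mL of stock per 35 mL bulk under the assumed titer.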
TABLE 5 Quantification of Zika Construct Copy Numbers in Zika Virus Reference Materials Formulated with Human Plasma

| Sample | Conc. | Copies per 20 µL well | Copies per µL of extracted RNA | Copies per mL of bulk | Average copies per mL of bulk |
|---|---|---|---|---|---|
| Zika Env | 50.8 | 1016 | 203.2 | 1.74E+05 | 1.72E+05 |
| Zika Env | 48.6 | 972 | 194.4 | 1.67E+05 | |
| Zika Env | 50.7 | 1014 | 202.8 | 1.74E+05 | |
| Zika NS2/NS3 | 33.6 | 672 | 134.4 | 1.15E+05 | 1.22E+05 |
| Zika NS2/NS3 | 37.3 | 746 | 149.2 | 1.28E+05 | |
| Zika NS2/NS3 | 36.2 | 724 | 144.8 | 1.24E+05 | |
| Zika NS4 | 125.6 | 2512 | 502.4 | 4.31E+05 | 4.07E+05 |
| Zika NS4 | 117.2 | 2344 | 468.8 | 4.02E+05 | |
| Zika NS4 | 113.1 | 2262 | 452.4 | 3.88E+05 | |
| Zika NS5 | 207 | 4140 | 828 | 7.10E+05 | 6.88E+05 |
| Zika NS5 | 197 | 3940 | 788 | 6.75E+05 | |
| Zika NS5 | 198 | 3960 | 792 | 6.79E+05 | |
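The columns of Table 5 are related by fixed factors, which the following sketch makes explicit. It is a reconstruction under assumptions, not a description of the instrument software. The "Conc." column is taken to be copies per µL of the 20 µL reaction, 5 µL of RNA is assumed per well, and the bulk extraction is assumed to use the same 120 µL elution as in paragraph [0090] with a 140 µL input. None of these is stated for Table 5; they are inferred from the ratios between columns. All names are hypothetical.

```python
from statistics import mean

REACTION_UL = 20.0       # reaction volume per well, from the column header
RNA_PER_WELL_UL = 5.0    # assumed: µL of extracted RNA per well
ELUTION_UL = 120.0       # assumed: same elution volume as in [0090]
SAMPLE_INPUT_ML = 0.140  # assumed: mL of bulk taken into extraction


def copies_per_ml_bulk(conc_per_ul_reaction):
    """Convert a ddPCR concentration (copies/µL reaction) to copies/mL bulk."""
    copies_per_well = conc_per_ul_reaction * REACTION_UL
    copies_per_ul_rna = copies_per_well / RNA_PER_WELL_UL
    return copies_per_ul_rna * ELUTION_UL / SAMPLE_INPUT_ML


# Zika Env triplicate from Table 5
replicates = [copies_per_ml_bulk(c) for c in (50.8, 48.6, 50.7)]
print([f"{r:.2E}" for r in replicates])  # ['1.74E+05', '1.67E+05', '1.74E+05']
print(f"{mean(replicates):.2E}")         # 1.72E+05
```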
[0092] An Altona Realstar Zika RT-PCR assay was performed on the RNA extracted from the prepared bulk. The Altona Zika RT-PCR assay is a qualitative assay that gives a Positive or Negative result, as shown in Table 6. Data are shown for both the Zika and internal control analytes. The internal control (IC Zika (JOE)) should be detected in all negative and positive wells for a valid result, whereas the Zika signal (Zika (FAM)) should be detected only in Positive wells. The bulk was tested in five replicates, with Zika Ct values of about 28.5 to 28.7. The negative control gave an undetermined Zika Ct, as expected, and the positive control Ct was about 32.1.
TABLE 6 Altona Realstar Zika RT-PCR Assay Performed on Zika Virus Reference Materials Formulated with Human Plasma

| Well | Sample Name | Detector | Task | Ct | Result |
|---|---|---|---|---|---|
| A1 | Zika Bulk | Zika (FAM) | Unknown | 28.4665 | POSITIVE |
| A1 | Zika Bulk | IC Zika (JOE) | Unknown | 30.7525 | VALID |
| A2 | Zika Bulk | Zika (FAM) | Unknown | 28.5619 | POSITIVE |
| A2 | Zika Bulk | IC Zika (JOE) | Unknown | 30.9062 | VALID |
| A3 | Zika Bulk | Zika (FAM) | Unknown | 28.5517 | POSITIVE |
| A3 | Zika Bulk | IC Zika (JOE) | Unknown | 30.9627 | VALID |
| A4 | Zika Bulk | Zika (FAM) | Unknown | 28.7069 | POSITIVE |
| A4 | Zika Bulk | IC Zika (JOE) | Unknown | 31.0635 | VALID |
| A5 | Zika Bulk | Zika (FAM) | Unknown | 28.6494 | POSITIVE |
| A5 | Zika Bulk | IC Zika (JOE) | Unknown | 31.0911 | VALID |
| C1 | Negative control | Zika (FAM) | NTC | Undetermined | NEGATIVE |
| C1 | Negative control | IC Zika (JOE) | NTC | 30.8181 | VALID |
| C2 | Positive Control | Zika (FAM) | Unknown | 32.0931 | POSITIVE |
| C2 | Positive Control | IC Zika (JOE) | Unknown | 30.809 | VALID |
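The validity rule stated in paragraph [0092] can be written as a small decision function. This is a minimal sketch of the rule exactly as worded there (internal control detected in every well, Zika signal detected only in positive wells). It is not the kit manufacturer's interpretation algorithm. The function name and the use of `None` for an "Undetermined" Ct are assumptions made for illustration.

```python
def interpret_well(zika_ct, ic_ct):
    """Classify one well from its Zika (FAM) and internal control (JOE) Ct.

    A Ct of None stands for "Undetermined" (no amplification detected).
    """
    if ic_ct is None:
        # Per the rule in the text, a well without internal control
        # signal cannot give a valid result.
        return "INVALID"
    return "POSITIVE" if zika_ct is not None else "NEGATIVE"


# Three wells taken from Table 6: (Zika FAM Ct, IC JOE Ct)
wells = {
    "A1 Zika Bulk": (28.4665, 30.7525),
    "C1 Negative control": (None, 30.8181),
    "C2 Positive Control": (32.0931, 30.809),
}
for name, (zika_ct, ic_ct) in wells.items():
    print(name, interpret_well(zika_ct, ic_ct))
```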
[0093] 6 mL of prepared bulk was sent to a commercial laboratory for bioburden testing. The bioburden result was 0 cfu/mL for bacterial growth, and the Zika reference materials passed the acceptance criteria (<100 cfu/mL or no growth).
[0094] Viral RNA extracted from the recombinant Sindbis viruses was sequence-verified by Sanger sequencing. All four constructs were PCR-amplified at the beginning and end of the insert, and each nucleotide sequence displayed 100% sequence identity with the EU545988.1 sequence used to design the constructs (SEQ ID NO: 15).
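A 100% match of this kind can be expressed as a percent identity over pre-aligned reads. The sketch below is a generic illustration, not the analysis pipeline used for the Sanger data. The function name is hypothetical, and the short strings are toy inputs rather than fragments of SEQ ID NO: 15.

```python
def percent_identity(observed, reference):
    """Percent of positions at which two pre-aligned sequences agree."""
    if len(observed) != len(reference):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(observed.upper(), reference.upper()))
    return 100.0 * matches / len(reference)


# Toy inputs for illustration only
print(percent_identity("atggca", "ATGGCA"))            # 100.0
print(round(percent_identity("ATGGCA", "ATGGTA"), 1))  # 83.3
```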
Example 5
Stability of an Influenza Sindbis Control Virus
[0095] An influenza reference material comprising an 800-nucleotide sequence of the H7N9 influenza virus was constructed using methods similar to those described above. The influenza reference material was diluted into aqueous buffer or defibrinated human plasma at 5×10^5 copies/mL in a commutable matrix. The material was dispensed into vials and stored at -20 °C, 4 °C, or room temperature (~25 °C). Vials were tested periodically using a laboratory-developed H7N9 TaqMan real-time PCR test. As shown in FIGS. 11 and 12, the influenza reference material stored at ambient temperature for 500 days was stable, as only ~15% loss of signal was observed. This stability profile suggests that the viral coat and envelope proteins form a stable protective barrier that prevents nucleases in complex clinical matrices (such as plasma) from degrading the target RNA sequence.
INCORPORATION BY REFERENCE
[0096] All publications, patents, and patent applications mentioned herein are hereby incorporated by reference in their entirety as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference. In case of conflict, the present application, including any definitions herein, will control.
EQUIVALENTS
[0097] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.
SEQUENCE LISTING
Number of sequences: 16. SEQ ID NO: 1; length: 10052; type: DNA; organism: Artificial Sequence. Description of Artificial Sequence: Synthetic
polynucleotide 1attgacggcg tagtacacac tattgaatca aacagccgac
caattgcact accatcacaa 60tggagaagcc agtagtaaac gtagacgtag acccccagag
tccgtttgtc gtgcaactgc 120aaaaaagctt cccgcaattt gaggtagtag cacagcaggt
cactccaaat gaccatgcta 180atgccagagc attttcgcat ctggccagta aactaatcga
gctggaggtt cctaccacag 240cgacgatctt ggacataggc agcgcaccgg ctcgtagaat
gttttccgag caccagtatc 300attgtgtctg ccccatgcgt agtccagaag acccggaccg
catgatgaaa tacgccagta 360aactggcgga aaaagcgtgc aagattacaa acaagaactt
gcatgagaag attaaggatc 420tccggaccgt acttgatacg ccggatgctg aaacaccatc
gctctgcttt cacaacgatg 480ttacctgcaa catgcgtgcc gaatattccg tcatgcagga
cgtgtatatc aacgctcccg 540gaactatcta tcatcaggct atgaaaggcg tgcggaccct
gtactggatt ggcttcgaca 600ccacccagtt catgttctcg gctatggcag gttcgtaccc
tgcgtacaac accaactggg 660ccgacgagaa agtccttgaa gcgcgtaaca tcggactttg
cagcacaaag ctgagtgaag 720gtaggacagg aaaattgtcg ataatgagga agaaggagtt
gaagcccggg tcgcgggttt 780atttctccgt aggatcgaca ctttatccag aacacagagc
cagcttgcag agctggcatc 840ttccatcggt gttccacttg aatggaaagc agtcgtacac
ttgccgctgt gatacagtgg 900tgagttgcga aggctacgta gtgaagaaaa tcaccatcag
tcccgggatc acgggagaaa 960ccgtgggata cgcggttaca cacaatagcg agggcttctt
gctatgcaaa gttactgaca 1020cagtaaaagg agaacgggta tcgttccctg tgtgcacgta
catcccggcc accatatgcg 1080atcagatgac tggtataatg gccacggata tatcacctga
cgatgcacaa aaacttctgg 1140ttgggctcaa ccagcgaatt gtcattaacg gtaggactaa
caggaacacc aacaccatgc 1200aaaattacct tctgccgatc atagcacaag ggttcagcaa
atgggctaag gagcgcaagg 1260atgatcttga taacgagaaa atgctgggta ctagagaacg
caagcttacg tatggctgct 1320tgtgggcgtt tcgcactaag aaagtacatt cgttttatcg
cccacctgga acgcagacct 1380gcgtaaaagt cccagcctct tttagcgctt ttcccatgtc
gtccgtatgg acgacctctt 1440tgcccatgtc gctgaggcag aaattgaaac tggcattgca
accaaagaag gaggaaaaac 1500tgctgcaggt ctcggaggaa ttagtcatgg aggccaaggc
tgcttttgag gatgctcagg 1560aggaagccag agcggagaag ctccgagaag cacttccacc
attagtggca gacaaaggca 1620tcgaggcagc cgcagaagtt gtctgcgaag tggaggggct
ccaggcggac atcggagcag 1680cattagttga aaccccgcgc ggtcacgtaa ggataatacc
tcaagcaaat gaccgtatga 1740tcggacagta tatcgttgtc tcgccaaact ctgtgctgaa
gaatgccaaa ctcgcaccag 1800cgcacccgct agcagatcag gttaagatca taacacactc
cggaagatca ggaaggtacg 1860cggtcgaacc atacgacgct aaagtactga tgccagcagg
aggtgccgta ccatggccag 1920aattcctagc actgagtgag agcgccacgt tagtgtacaa
cgaaagagag tttgtgaacc 1980gcaaactata ccacattgcc atgcatggcc ccgccaagaa
tacagaagag gagcagtaca 2040aggttacaaa ggcagagctt gcagaaacag agtacgtgtt
tgacgtggac aagaagcgtt 2100gcgttaagaa ggaagaagcc tcaggtctgg tcctctcggg
agaactgacc aaccctccct 2160atcatgagct agctctggag ggactgaaga cccgacctgc
ggtcccgtac aaggtcgaaa 2220caataggagt gataggcaca ccggggtcgg gcaagtcagc
tattatcaag tcaactgtca 2280cggcacgaga tcttgttacc agcggaaaga aagaaaattg
tcgcgaaatt gaggccgacg 2340tgctaagact gaggggtatg cagattacgt cgaagacagt
agattcggtt atgctcaacg 2400gatgccacaa agccgtagaa gtgctgtacg ttgacgaagc
gttcgcgtgc cacgcaggag 2460cactacttgc cttgattgct atcgtcaggc cccgcaagaa
ggtagtacta tgcggagacc 2520ccatgcaatg cggattcttc aacatgatgc aactaaaggt
acatttcaat caccctgaaa 2580aagacatatg caccaagaca ttctacaagt atatctcccg
gcgttgcaca cagccagtta 2640cagctattgt atcgacactg cattacgatg gaaagatgaa
aaccacgaac ccgtgcaaga 2700agaacattga aatcgatatt acaggggcca caaagccgaa
gccaggggat atcatcctga 2760catgtttccg cgggtgggtt aagcaattgc aaatcgacta
tcccggacat gaagtaatga 2820cagccgcggc ctcacaaggg ctaaccagaa aaggagtgta
tgccgtccgg caaaaagtca 2880atgaaaaccc actgtacgcg atcacatcag agcatgtgaa
cgtgttgctc acccgcactg 2940aggacaggct agtgtggaaa accttgcagg gcgacccatg
gattaagcag cccactaaca 3000tacctaaagg aaactttcag gctactatag aggactggga
agctgaacac aagggaataa 3060ttgctgcaat aaacagcccc actccccgtg ccaatccgtt
cagctgcaag accaacgttt 3120gctgggcgaa agcattggaa ccgatactag ccacggccgg
tatcgtactt accggttgcc 3180agtggagcga actgttccca cagtttgcgg atgacaaacc
acattcggcc atttacgcct 3240tagacgtaat ttgcattaag tttttcggca tggacttgac
aagcggactg ttttctaaac 3300agagcatccc actaacgtac catcccgccg attcagcgag
gccggtagct cattgggaca 3360acagcccagg aacccgcaag tatgggtacg atcacgccat
tgccgccgaa ctctcccgta 3420gatttccggt gttccagcta gctgggaagg gcacacaact
tgatttgcag acggggagaa 3480ccagagttat ctctgcacag cataacctgg tcccggtgaa
ccgcaatctt cctcacgcct 3540tagtccccga gtacaaggag aagcaacccg gcccggtcaa
aaaattcttg aaccagttca 3600aacaccactc agtacttgtg gtatcagagg aaaaaattga
agctccccgt aagagaatcg 3660aatggatcgc cccgattggc atagccggtg cagataagaa
ctacaacctg gctttcgggt 3720ttccgccgca ggcacggtac gacctggtgt tcatcaacat
tggaactaaa tacagaaacc 3780accactttca gcagtgcgaa gaccatgcgg cgaccttaaa
aaccctttcg cgttcggccc 3840tgaattgcct taacccagga ggcaccctcg tggtgaagtc
ctatggctac gccgaccgca 3900acagtgagga cgtagtcacc gctcttgcca gaaagtttgt
cagggtgtct gcagcgagac 3960cagattgtgt ctcaagcaat acagaaatgt acctgatttt
ccgacaacta gacaacagcc 4020gtacacggca attcaccccg caccatctga attgcgtgat
ttcgtccgtg tatgagggta 4080caagagatgg agttggagcc gcgccgtcat accgcaccaa
aagggagaat attgctgact 4140gtcaagagga agcagttgtc aacgcagcca atccgctggg
tagaccaggc gaaggagtct 4200gccgtgccat ctataaacgt tggccgacca gttttaccga
ttcagccacg gagacaggca 4260ccgcaagaat gactgtgtgc ctaggaaaga aagtgatcca
cgcggtcggc cctgatttcc 4320ggaagcaccc agaagcagaa gccttgaaat tgctacaaaa
cgcctaccat gcagtggcag 4380acttagtaaa tgaacataac atcaagtctg tcgccattcc
actgctatct acaggcattt 4440acgcagccgg aaaagaccgc cttgaagtat cacttaactg
cttgacaacc gcgctagaca 4500gaactgacgc ggacgtaacc atctattgcc tggataagaa
gtggaaggaa agaatcgacg 4560cggcactcca acttaaggag tctgtaacag agctgaagga
tgaagatatg gagatcgacg 4620atgagttagt atggattcat ccagacagtt gcttgaaggg
aagaaaggga ttcagtacta 4680caaaaggaaa attgtattcg tacttcgaag gcaccaaatt
ccatcaagca gcaaaagaca 4740tggcggagat aaaggtcctg ttccctaatg accaggaaag
taatgaacaa ctgtgtgcct 4800acatattggg tgagaccatg gaagcaatcc gcgaaaagtg
cccggtcgac cataacccgt 4860cgtctagccc gcccaaaacg ttgccgtgcc tttgcatgta
tgccatgacg ccagaaaggg 4920tccacagact tagaagcaat aacgtcaaag aagttacagt
atgctcctcc accccccttc 4980ctaagcacaa aattaagaat gttcagaagg ttcagtgcac
gaaagtagtc ctgtttaatc 5040cgcacactcc cgcattcgtt cccgcccgta agtacataga
agtgccagaa cagcctaccg 5100ctcctcctgc acaggccgag gaggcccccg aagttgtagc
gacaccgtca ccatctacag 5160ctgataacac ctcgcttgat gtcacagaca tctcactgga
tatggatgac agtagcgaag 5220gctcactttt ttcgagcttt agcggatcgg acaactctat
tactagtatg gacagttggt 5280cgtcaggacc tagttcacta gagatagtag accgaaggca
ggtggtggtg gctgacgttc 5340atgccgtcca agagcctgcc cctattccac cgccaaggct
aaagaagatg gcccgcctgg 5400cagcggcaag aaaagagccc actccaccgg caagcaatag
ctctgagtcc ctccacctct 5460cttttggtgg ggtatccatg tccctcggat caattttcga
cggagagacg gcccgccagg 5520cagcggtaca acccctggca acaggcccca cggatgtgcc
tatgtctttc ggatcgtttt 5580ccgacggaga gattgatgag ctgagccgca gagtaactga
gtccgaaccc gtcctgtttg 5640gatcatttga accgggcgaa gtgaactcaa ttatatcgtc
ccgatcagcc gtatcttttc 5700cactacgcaa gcagagacgt agacgcagga gcaggaggac
tgaatactga ctaaccgggg 5760taggtgggta catattttcg acggacacag gccctgggca
cttgcaaaag aagtccgttc 5820tgcagaacca gcttacagaa ccgaccttgg agcgcaatgt
cctggaaaga attcatgccc 5880cggtgctcga cacgtcgaaa gaggaacaac tcaaactcag
gtaccagatg atgcccaccg 5940aagccaacaa aagtaggtac cagtctcgta aagtagaaaa
tcagaaagcc ataaccactg 6000agcgactact gtcaggacta cgactgtata actctgccac
agatcagcca gaatgctata 6060agatcaccta tccgaaacca ttgtactcca gtagcgtacc
ggcgaactac tccgatccac 6120agttcgctgt agctgtctgt aacaactatc tgcatgagaa
ctatccgaca gtagcatctt 6180atcagattac tgacgagtac gatgcttact tggatatggt
agacgggaca gtcgcctgcc 6240tggatactgc aaccttctgc cccgctaagc ttagaagtta
cccgaaaaaa catgagtata 6300gagccccgaa tatccgcagt gcggttccat cagcgatgca
gaacacgcta caaaatgtgc 6360tcattgccgc aactaaaaga aattgcaacg tcacgcagat
gcgtgaactg ccaacactgg 6420actcagcgac attcaatgtc gaatgctttc gaaaatatgc
atgtaatgac gagtattggg 6480aggagttcgc tcggaagcca attaggatta ccactgagtt
tgtcaccgca tatgtagcta 6540gactgaaagg ccctaaggcc gccgcactat ttgcaaagac
gtataatttg gtcccattgc 6600aagaagtgcc tatggataga ttcgtcatgg acatgaaaag
agacgtgaaa gttacaccag 6660gcacgaaaca cacagaagaa agaccgaaag tacaagtgat
acaagccgca gaacccctgg 6720cgactgctta cttatgcggg attcaccggg aattagtgcg
taggcttacg gccgtcttgc 6780ttccaaacat tcacacgctt tttgacatgt cggcggagga
ttttgatgca atcatagcag 6840aacacttcaa gcaaggcgac ccggtactgg agacggatat
cgcatcattc gacaaaagcc 6900aagacgacgc tatggcgtta accggtctga tgatcttgga
ggacctgggt gtggatcaac 6960cactactcga cttgatcgag tgcgcctttg gagaaatatc
atccacccat ctacctacgg 7020gtactcgttt taaattcggg gcgatgatga aatccggaat
gttcctcaca ctttttgtca 7080acacagtttt gaatgtcgtt atcgccagca gagtactaga
agagcggctt aaaacgtcca 7140gatgtgcagc gttcattggc gacgacaaca tcatacatgg
agtagtatct gacaaagaaa 7200tggctgagag gtgcgccacc tggctcaaca tggaggttaa
gatcatcgac gcagtcatcg 7260gtgagagacc accttacttc tgcggcggat ttatcttgca
agattcggtt acttccacag 7320cgtgccgcgt ggcggatccc ctgaaaaggc tgtttaagtt
gggtaaaccg ctcccagccg 7380acgacgagca agacgaagac agaagacgcg ctctgctaga
tgaaacaaag gcgtggttta 7440gagtaggtat aacaggcact ttagcagtgg ccgtgacgac
ccggtatgag gtagacaata 7500ttacacctgt cctactggca ttgagaactt ttgcccagag
caaaagagca ttccaagcca 7560tcagagggga aataaagcat ctctacggtg gtcctaaata
gtcagcatag tacatttcat 7620ctgactaata ctacaacacc accacctcta gagtttaaac
aggcctggcg cgccacgtga 7680cgcgtgcatg catttaaata tcgagggcag gagcgagagg
gccaaggcca gggagggcgg 7740ccaccaccat caccaccatc accattagta atgaggtaac
cgtggggccc aatgatccga 7800ccagcaaaac tcgatgtact tccgaggaac tgatgtgcat
aatgcatcag gctggtacat 7860tagatccccg cttaccgcgg gcaatatagc aacactaaaa
actcgatgta cttccgagga 7920agcgcagtgc ataatgctgc gcagtgttgc cacataacca
ctatattaac catttatcta 7980gcggacgcca aaaactcaat gtatttctga ggaagcgtgg
tgcataatgc cacgcagcgt 8040ctgcataact tttattattt cttttattaa tcaacaaaat
tttgttttta acatttcaaa 8100aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaagggaat
tcctcgatta attaagcggc 8160cgctcgaggg gaattaattc ttgaagacga aagggccagg
tggcactttt cggggaaatg 8220tgcgcggaac ccctatttgt ttatttttct aaatacattc
aaatatgtat ccgctcatga 8280gacaataacc ctgataaatg cttcaataat attgaaaaag
gaagagtatg agtattcaac 8340atttccgtgt cgcccttatt cccttttttg cggcattttg
ccttcctgtt tttgctcacc 8400cagaaacgct ggtgaaagta aaagatgctg aagatcagtt
gggtgcacga gtgggttaca 8460tcgaactgga tctcaacagc ggtaagatcc ttgagagttt
tcgccccgaa gaacgttttc 8520caatgatgag cacttttaaa gttctgctat gtggcgcggt
attatcccgt gttgacgccg 8580ggcaagagca actcggtcgc cgcatacact attctcagaa
tgacttggtt gagtactcac 8640cagtcacaga aaagcatctt acggatggca tgacagtaag
agaattatgc agtgctgcca 8700taaccatgag tgataacact gcggccaact tacttctgac
aacgatcgga ggaccgaagg 8760agctaaccgc ttttttgcac aacatggggg atcatgtaac
tcgccttgat cgttgggaac 8820cggagctgaa tgaagccata ccaaacgacg agcgtgacac
cacgatgcct gtagcaatgg 8880caacaacgtt gcgcaaacta ttaactggcg aactacttac
tctagcttcc cggcaacaat 8940taatagactg gatggaggcg gataaagttg caggaccact
tctgcgctcg gcccttccgg 9000ctggctggtt tattgctgat aaatctggag ccggtgagcg
tgggtctcgc ggtatcattg 9060cagcactggg gccagatggt aagccctccc gtatcgtagt
tatctacacg acggggagtc 9120aggcaactat ggatgaacga aatagacaga tcgctgagat
aggtgcctca ctgattaagc 9180attggtaact gtcagaccaa gtttactcat atatacttta
gattgattta aaacttcatt 9240tttaatttaa aaggatctag gtgaagatcc tttttgataa
tctcatgacc aaaatccctt 9300aacgtgagtt ttcgttccac tgagcgtcag accccgtaga
aaagatcaaa ggatcttctt 9360gagatccttt ttttctgcgc gtaatctgct gcttgcaaac
aaaaaaacca ccgctaccag 9420cggtggtttg tttgccggat caagagctac caactctttt
tccgaaggta actggcttca 9480gcagagcgca gataccaaat actgtccttc tagtgtagcc
gtagttaggc caccacttca 9540agaactctgt agcaccgcct acatacctcg ctctgctaat
cctgttacca gtggctgctg 9600ccagtggcga taagtcgtgt cttaccgggt tggactcaag
acgatagtta ccggataagg 9660cgcagcggtc gggctgaacg gggggttcgt gcacacagcc
cagcttggag cgaacgacct 9720acaccgaact gagataccta cagcgtgagc attgagaaag
cgccacgctt cccgaaggga 9780gaaaggcgga caggtatccg gtaagcggca gggtcggaac
aggagagcgc acgagggagc 9840ttccaggggg aaacgcctgg tatctttata gtcctgtcgg
gtttcgccac ctctgacttg 9900agcgtcgatt tttgtgatgc tcgtcagggg ggcggagcct
atggaaaaac gccagcaacg 9960cgagctcgta tggacatatt gtcgttagaa cgcggctaca
attaatacat aaccttatgt 10020atcatacaca tacgatttag gggacactat ag
1005221950DNAZaire ebolavirus 2aggaatattg cagttacctc
gtgatcgatt caagaggaca tcattctttc tttgggtaat 60tatccttttc caaagaacat
tttccatccc gcttggagtt atccacaata gtacattaca 120ggttagtgat gtcgacaaac
tagtttgtcg tgacaaactg tcatccacaa atcaattgag 180atcagttgga ctgaatctcg
aggggaatgg agtggcaact gacgtgccat ctgcgactaa 240aagatggggc ttcaggtccg
gtgtcccacc aaaggtggtc aattatgaag ctggtgaatg 300ggctgaaaac tgctacaatc
ttgaaatcaa aaaacctgac gggagtgagt gtctaccagc 360agcgccagac gggattcggg
gcttcccccg gtgccggtat gtgcacaaag tatcaggaac 420gggaccatgt gccggagact
ttgccttcca caaagagggt gctttcttcc tgtatgatcg 480acttgcttcc acagttatct
accgaggaac gactttcgct gaaggtgtcg ttgcatttct 540gatactgccc caagctaaga
aggacttctt cagctcacac cccttgagag agccggtcaa 600tgcaacggag gacccgtcga
gtggctatta ttctaccaca attagatatc aggctaccgg 660ttttggaact aatgagacag
agtacttgtt cgaggttgac aatttgacct acgtccaact 720tgaatcaaga ttcacaccac
agtttctgct ccagctgaat gagacaatat atgcaagtgg 780gaagaggagc aacaccacgg
gaaaactaat ttggaaggtc aaccccgaaa ttgatacaac 840aatcggggag tgggccttca
gggaaactaa aaaaacctca ctagaaaaat tcgcagtgaa 900gagttgtctt tcacagctgt
atcaaacgga cccaaaaaca tcagtggtca gagtccggcg 960cgaacttctt ccgacccaga
gaccaacaca acaaatgaag accacaaaat catggcttca 1020gaaaattcct ctgcaatggt
tcaagtgcac agtcaaggaa ggaaagctgc agtgtcgcat 1080ctgacaaccc ttgccacaat
ctccacgagt cctcaacctc ccacaaccaa aacaggtccg 1140gacaacagca cccataatac
acccgtgtat aaacttgaca tctctgaggc aactcaagtt 1200ggacaacatc accgtagagc
agacaacgac agcacagcct ccgacactcc ccccgccacg 1260accgcagccg gacccttaaa
agcagagaac accaacacga gtaagagcgc tgactccctg 1320gacctcgcca ccacgataag
cccccaaaac tacagcgaga ctgctggcaa caacaacact 1380catcaccaag ataccggaga
agagagtgcc agcagcggga agctaggctt aattaccaat 1440actattgctg gagtagcagg
actgatcaca ggcgggagaa ggactcgaag agaagtaatt 1500gtcaatgctc aacccaaatg
caaccccaat ttacattact ggactactca ggatgaaggt 1560gctgcaatcg gattggcctg
gataccatat ttcgggccag cagccgaagg aatttacaca 1620gaggggctaa tgcacaacca
agatggttta atctgtgggt tgaggcagct ggccaacgaa 1680acgactcaag ctctccaact
gttcctgaga gccacaactg agctgcgaac cttttcaatc 1740ctcaaccgta aggcaattga
cttcctgctg cagcgatggg gtggcacatg ccacattttg 1800ggaccggact gctgtatcga
accacatgat tggaccaaga acataacaga caaaattgat 1860cagattattc atgattttgt
tgataaaacc cttccggacc agggggacaa tgacaattgg 1920tggacaggat ggagacaatg
gataccggca 195032217DNAArtificial
SequenceDescription of Artificial Sequence Synthetic
polynucleotideMISC_FEATURE(1)..(1577)Ebola NP gene
sequenceMISC_FEATURE(1578)..(2127)Ebola VP24
sequenceMISC_FEATURE(2128)..(2217)Human RNAse P internal control sequence
3tactgtaatc atacctggtt tgtttcagag ccatatcacc aagatagaga acaacctagg
60tctccggagg gggcaagggc atcagtgtgc tcagttgaaa atcccttgtc aacatctagg
120ccttatcaca tcacaagttc cgccttaaac tctgcagggt gatccaacaa ccttaatagc
180aacattattg ttaaaggaca gcattagttc acagtcaaac aagcaagatt gagaattaac
240tttgattttg aacctgaaca cccagaggac tggagactca acaaccctaa agcctggggt
300aaaacattag aaatagttta aagacaaatt gctcggaatc acaaaattcc gagtatggat
360tctcgtcctc agaaagtctg gtagacgccg agtctcactg aatctgacat ggattaccac
420aagatcttga cagcaggtct gtccgttcaa caggggattg ttcggcaaag agtcatccca
480gtgtatcaag taaacaatct tgaggaaatt tgccaactta tcatacaggc ctttgaagct
540ggtgttgatt ttcaagagag tgcggacagt ttccttctca tgctttgtct tcatcatgcg
600taccaaggag attacaaact tttcttggaa agtggcgcag tcaagtattt ggaagggcac
660gggttccgtt ttgaagtcaa gaagcgtgat ggagtgaagc gccttgagga attgctgcca
720gcagtatcta gtgggagaaa cattaagaga acacttgctg ccatgccgga agaggagacg
780acttaatgcc ggacatgatg ccaacgatgc tgtgatttca aattcagtgg ctcaagctcg
840tttttcaggt ctattgattg tcaaaacagt acttgatcat atcctacaaa agacagaacg
900aggagttcgt ctccatcctc ttgcaaggac cgccaaggta aaaaatgagg tgaactcctt
960caaggctgca ctcagctccc tggccaagca tggagagtat gctcctttcg cccgactttt
1020gaacctttct ggagtaaata atcttgagca tggtcttttc cctcaactgt cggcaattgc
1080actcggagtc gccacagccc acgggagcac cctcgcagga gtaaatgttg gagaacagta
1140tcaacagctc agagaggcag ccactgaggc tgagaagcaa ctccaacaat atgcggagtc
1200tcgtgaactt gaccatcttg gacttgatga tcaggaaaag aaaattctta tgaacttcca
1260tcagaaaaag aacgaaatca gcttccagca aacaaacgcg atggtaactc taagaaaaga
1320gcgcctggcc aagctgacag aagctatcac tgctgcatca ctgcccaaaa caagtggaca
1380ttacgatgat gatgacgaca ttccctttcc aggacccatc aatgatgacg acaatcctgg
1440ccatcaagat gatgatccga ctgactcaca ggatacgacc attcccgatg tggtagttga
1500tcccgatgat ggaggctacg gcgaatacca aagttactcg gaaaacggca tgagtgcacc
1560agatgacttg gtcctatgtc ttttagctgt ataccagttg cccctgagat acgccacaaa
1620agtgtctctg agctaaagtg gtctgtacac atctcataca ttgtattagg ggcaataata
1680tctaattgaa cttagccatt taaaatttag tgcataaatc tgggctaact ccaccaggtc
1740aactccattg gctgaaaaga agcccaccta caacgaacat tactttgagc gccctcacaa
1800ttaaaaaata agagcgtcgt tccaacaatc gagcgcaagg ttacaaggtt gaactgagag
1860tgtctagaca acaaaatatc gatactccag acaccaagca agacctgaga aaaaaccatg
1920gccaaagcta cgggacgata caatctaata tcgcccaaaa aggacctgga gaaaggggtt
1980gtcttaagcg acctctgtaa cttcttagtt agtcaaacta ttcaagggtg gaaagtttat
2040tgggctggta ttgagtttga tgtgactcac aaaggaatgg ccctattgca tagactgaaa
2100actaatgact ttgcccctgc atggtcatgg cggtgtttgc agatttggac ctgcgagcgg
2160gttctgacct gaaggctctg cgcggacttg tggagacagc cgctcacctt ggctatt
221749719DNAHuman immunodeficiency virus 4tggaagggct aattcactcc
caacgaagac aagatatcct tgatctgtgg atctaccaca 60cacaaggcta cttccctgat
tagcagaact acacaccagg gccagggatc agatatccac 120tgacctttgg atggtgctac
aagctagtac cagttgagcc agagaagtta gaagaagcca 180acaaaggaga gaacaccagc
ttgttacacc ctgtgagcct gcatggaatg gatgacccgg 240agagagaagt gttagagtgg
aggtttgaca gccgcctagc atttcatcac atggcccgag 300agctgcatcc ggagtacttc
aagaactgct gacatcgagc ttgctacaag ggactttccg 360ctggggactt tccagggagg
cgtggcctgg gcgggactgg ggagtggcga gccctcagat 420cctgcatata agcagctgct
ttttgcctgt actgggtctc tctggttaga ccagatctga 480gcctgggagc tctctggcta
actagggaac ccactgctta agcctcaata aagcttgcct 540tgagtgcttc aagtagtgtg
tgcccgtctg ttgtgtgact ctggtaacta gagatccctc 600agaccctttt agtcagtgtg
gaaaatctct agcagtggcg cccgaacagg gacctgaaag 660cgaaagggaa accagaggag
ctctctcgac gcaggactcg gcttgctgaa gcgcgcacgg 720caagaggcga ggggcggcga
ctggtgagta cgccaaaaat tttgactagc ggaggctaga 780aggagagaga tgggtgcgag
agcgtcagta ttaagcgggg gagaattaga tcgatgggaa 840aaaattcggt taaggccagg
gggaaagaaa aaatataaat taaaacatat agtatgggca 900agcagggagc tagaacgatt
cgcagttaat cctggcctgt tagaaacatc agaaggctgt 960agacaaatac tgggacagct
acaaccatcc cttcagacag gatcagaaga acttagatca 1020ttatataata cagtagcaac
cctctattgt gtgcatcaaa ggatagagat aaaagacacc 1080aaggaagctt tagacaagat
agaggaagag caaaacaaaa gtaagaaaaa agcacagcaa 1140gcagcagctg acacaggaca
cagcaatcag gtcagccaaa attaccctat agtgcagaac 1200atccaggggc aaatggtaca
tcaggccata tcacctagaa ctttaaatgc atgggtaaaa 1260gtagtagaag agaaggcttt
cagcccagaa gtgataccca tgttttcagc attatcagaa 1320ggagccaccc cacaagattt
aaacaccatg ctaaacacag tggggggaca tcaagcagcc 1380atgcaaatgt taaaagagac
catcaatgag gaagctgcag aatgggatag agtgcatcca 1440gtgcatgcag ggcctattgc
accaggccag atgagagaac caaggggaag tgacatagca 1500ggaactacta gtacccttca
ggaacaaata ggatggatga caaataatcc acctatccca 1560gtaggagaaa tttataaaag
atggataatc ctgggattaa ataaaatagt aagaatgtat 1620agccctacca gcattctgga
cataagacaa ggaccaaagg aaccctttag agactatgta 1680gaccggttct ataaaactct
aagagccgag caagcttcac aggaggtaaa aaattggatg 1740acagaaacct tgttggtcca
aaatgcgaac ccagattgta agactatttt aaaagcattg 1800ggaccagcgg ctacactaga
agaaatgatg acagcatgtc agggagtagg aggacccggc 1860cataaggcaa gagttttggc
tgaagcaatg agccaagtaa caaattcagc taccataatg 1920atgcagagag gcaattttag
gaaccaaaga aagattgtta agtgtttcaa ttgtggcaaa 1980gaagggcaca cagccagaaa
ttgcagggcc cctaggaaaa agggctgttg gaaatgtgga 2040aaggaaggac accaaatgaa
agattgtact gagagacagg ctaatttttt agggaagatc 2100tggccttcct acaagggaag
gccagggaat tttcttcaga gcagaccaga gccaacagcc 2160ccaccagaag agagcttcag
gtctggggta gagacaacaa ctccccctca gaagcaggag 2220ccgatagaca aggaactgta
tcctttaact tccctcaggt cactctttgg caacgacccc 2280tcgtcacaat aaagataggg
gggcaactaa aggaagctct attagataca ggagcagatg 2340atacagtatt agaagaaatg
agtttgccag gaagatggaa accaaaaatg atagggggaa 2400ttggaggttt tatcaaagta
agacagtatg atcagatact catagaaatc tgtggacata 2460aagctatagg tacagtatta
gtaggaccta cacctgtcaa cataattgga agaaatctgt 2520tgactcagat tggttgcact
ttaaattttc ccattagccc tattgagact gtaccagtaa 2580aattaaagcc aggaatggat
ggcccaaaag ttaaacaatg gccattgaca gaagaaaaaa 2640taaaagcatt agtagaaatt
tgtacagaga tggaaaagga agggaaaatt tcaaaaattg 2700ggcctgaaaa tccatacaat
actccagtat ttgccataaa gaaaaaagac agtactaaat 2760ggagaaaatt agtagatttc
agagaactta ataagagaac tcaagacttc tgggaagttc 2820aattaggaat accacatccc
gcagggttaa aaaagaaaaa atcagtaaca gtactggatg 2880tgggtgatgc atatttttca
gttcccttag atgaagactt caggaagtat actgcattta 2940ccatacctag tataaacaat
gagacaccag ggattagata tcagtacaat gtgcttccac 3000agggatggaa aggatcacca
gcaatattcc aaagtagcat gacaaaaatc ttagagcctt 3060ttagaaaaca aaatccagac
atagttatct atcaatacat ggatgatttg tatgtaggat 3120ctgacttaga aatagggcag
catagaacaa aaatagagga gctgagacaa catctgttga 3180ggtggggact taccacacca
gacaaaaaac atcagaaaga acctccattc ctttggatgg 3240gttatgaact ccatcctgat
aaatggacag tacagcctat agtgctgcca gaaaaagaca 3300gctggactgt caatgacata
cagaagttag tggggaaatt gaattgggca agtcagattt 3360acccagggat taaagtaagg
caattatgta aactccttag aggaaccaaa gcactaacag 3420aagtaatacc actaacagaa
gaagcagagc tagaactggc agaaaacaga gagattctaa 3480aagaaccagt acatggagtg
tattatgacc catcaaaaga cttaatagca gaaatacaga 3540agcaggggca aggccaatgg
acatatcaaa tttatcaaga gccatttaaa aatctgaaaa 3600caggaaaata tgcaagaatg
aggggtgccc acactaatga tgtaaaacaa ttaacagagg 3660cagtgcaaaa aataaccaca
gaaagcatag taatatgggg aaagactcct aaatttaaac 3720tgcccataca aaaggaaaca
tgggaaacat ggtggacaga gtattggcaa gccacctgga 3780ttcctgagtg ggagtttgtt
aatacccctc ccttagtgaa attatggtac cagttagaga 3840aagaacccat agtaggagca
gaaaccttct atgtagatgg ggcagctaac agggagacta 3900aattaggaaa agcaggatat
gttactaata gaggaagaca aaaagttgtc accctaactg 3960acacaacaaa tcagaagact
gagttacaag caatttatct agctttgcag gattcgggat 4020tagaagtaaa catagtaaca
gactcacaat atgcattagg aatcattcaa gcacaaccag 4080atcaaagtga atcagagtta
gtcaatcaaa taatagagca gttaataaaa aaggaaaagg 4140tctatctggc atgggtacca
gcacacaaag gaattggagg aaatgaacaa gtagataaat 4200tagtcagtgc tggaatcagg
aaagtactat ttttagatgg aatagataag gcccaagatg 4260aacatgagaa atatcacagt
aattggagag caatggctag tgattttaac ctgccacctg 4320tagtagcaaa agaaatagta
gccagctgtg ataaatgtca gctaaaagga gaagccatgc 4380atggacaagt agactgtagt
ccaggaatat ggcaactaga ttgtacacat ttagaaggaa 4440aagttatcct ggtagcagtt
catgtagcca gtggatatat agaagcagaa gttattccag 4500cagaaacagg gcaggaaaca
gcatattttc ttttaaaatt agcaggaaga tggccagtaa 4560aaacaataca tactgacaat
ggcagcaatt tcaccggtgc tacggttagg gccgcctgtt 4620ggtgggcggg aatcaagcag
gaatttggaa ttccctacaa tccccaaagt caaggagtag 4680tagaatctat gaataaagaa
ttaaagaaaa ttataggaca ggtaagagat caggctgaac 4740atcttaagac agcagtacaa
atggcagtat tcatccacaa ttttaaaaga aaagggggga 4800ttggggggta cagtgcaggg
gaaagaatag tagacataat agcaacagac atacaaacta 4860aagaattaca aaaacaaatt
acaaaaattc aaaattttcg ggtttattac agggacagca 4920gaaatccact ttggaaagga
ccagcaaagc tcctctggaa aggtgaaggg gcagtagtaa 4980tacaagataa tagtgacata
aaagtagtgc caagaagaaa agcaaagatc attagggatt 5040atggaaaaca gatggcaggt
gatgattgtg tggcaagtag acaggatgag gattagaaca 5100tggaaaagtt tagtaaaaca
ccatatgtat gtttcaggga aagctagggg atggttttat 5160agacatcact atgaaagccc
tcatccaaga ataagttcag aagtacacat cccactaggg 5220gatgctagat tggtaataac
aacatattgg ggtctgcata caggagaaag agactggcat 5280ttgggtcagg gagtctccat
agaatggagg aaaaagagat atagcacaca agtagaccct 5340gaactagcag accaactaat
tcatctgtat tactttgact gtttttcaga ctctgctata 5400agaaaggcct tattaggaca
catagttagc cctaggtgtg aatatcaagc aggacataac 5460aaggtaggat ctctacaata
cttggcacta gcagcattaa taacaccaaa aaagataaag 5520ccacctttgc ctagtgttac
gaaactgaca gaggatagat ggaacaagcc ccagaagacc 5580aagggccaca gagggagcca
cacaatgaat ggacactaga gcttttagag gagcttaaga 5640atgaagctgt tagacatttt
cctaggattt ggctccatgg cttagggcaa catatctatg 5700aaacttatgg ggatacttgg
gcaggagtgg aagccataat aagaattctg caacaactgc 5760tgtttatcca ttttcagaat
tgggtgtcga catagcagaa taggcgttac tcgacagagg 5820agagcaagaa atggagccag
tagatcctag actagagccc tggaagcatc caggaagtca 5880gcctaaaact gcttgtacca
attgctattg taaaaagtgt tgctttcatt gccaagtttg 5940tttcataaca aaagccttag
gcatctccta tggcaggaag aagcggagac agcgacgaag 6000agctcatcag aacagtcaga
ctcatcaagc ttctctatca aagcagtaag tagtacatgt 6060aacgcaacct ataccaatag
tagcaatagt agcattagta gtagcaataa taatagcaat 6120agttgtgtgg tccatagtaa
tcatagaata taggaaaata ttaagacaaa gaaaaataga 6180caggttaatt gatagactaa
tagaaagagc agaagacagt ggcaatgaga gtgaaggaga 6240aatatcagca cttgtggaga
tgggggtgga gatggggcac catgctcctt gggatgttga 6300tgatctgtag tgctacagaa
aaattgtggg tcacagtcta ttatggggta cctgtgtgga 6360aggaagcaac caccactcta
ttttgtgcat cagatgctaa agcatatgat acagaggtac 6420ataatgtttg ggccacacat
gcctgtgtac ccacagaccc caacccacaa gaagtagtat 6480tggtaaatgt gacagaaaat
tttaacatgt ggaaaaatga catggtagaa cagatgcatg 6540aggatataat cagtttatgg
gatcaaagcc taaagccatg tgtaaaatta accccactct 6600gtgttagttt aaagtgcact
gatttgaaga atgatactaa taccaatagt agtagcggga 6660gaatgataat ggagaaagga
gagataaaaa actgctcttt caatatcagc acaagcataa 6720gaggtaaggt gcagaaagaa
tatgcatttt tttataaact tgatataata ccaatagata 6780atgatactac cagctataag
ttgacaagtt gtaacacctc agtcattaca caggcctgtc 6840caaaggtatc ctttgagcca
attcccatac attattgtgc cccggctggt tttgcgattc 6900taaaatgtaa taataagacg
ttcaatggaa caggaccatg tacaaatgtc agcacagtac 6960aatgtacaca tggaattagg
ccagtagtat caactcaact gctgttaaat ggcagtctag 7020cagaagaaga ggtagtaatt
agatctgtca atttcacgga caatgctaaa accataatag 7080tacagctgaa cacatctgta
gaaattaatt gtacaagacc caacaacaat acaagaaaaa 7140gaatccgtat ccagagagga
ccagggagag catttgttac aataggaaaa ataggaaata 7200tgagacaagc acattgtaac
attagtagag caaaatggaa taacacttta aaacagatag 7260ctagcaaatt aagagaacaa
tttggaaata ataaaacaat aatctttaag caatcctcag 7320gaggggaccc agaaattgta
acgcacagtt ttaattgtgg aggggaattt ttctactgta 7380attcaacaca actgtttaat
agtacttggt ttaatagtac ttggagtact gaagggtcaa 7440ataacactga aggaagtgac
acaatcaccc tcccatgcag aataaaacaa attataaaca 7500tgtggcagaa agtaggaaaa
gcaatgtatg cccctcccat cagtggacaa attagatgtt 7560catcaaatat tacagggctg
ctattaacaa gagatggtgg taatagcaac aatgagtccg 7620agatcttcag acctggagga
ggagatatga gggacaattg gagaagtgaa ttatataaat 7680ataaagtagt aaaaattgaa
ccattaggag tagcacccac caaggcaaag agaagagtgg 7740tgcagagaga aaaaagagca
gtgggaatag gagctttgtt ccttgggttc ttgggagcag 7800caggaagcac tatgggcgca
gcctcaatga cgctgacggt acaggccaga caattattgt 7860ctggtatagt gcagcagcag
aacaatttgc tgagggctat tgaggcgcaa cagcatctgt 7920tgcaactcac agtctggggc
atcaagcagc tccaggcaag aatcctggct gtggaaagat 7980acctaaagga tcaacagctc
ctggggattt ggggttgctc tggaaaactc atttgcacca 8040ctgctgtgcc ttggaatgct
agttggagta ataaatctct ggaacagatt tggaatcaca 8100cgacctggat ggagtgggac
agagaaatta acaattacac aagcttaata cactccttaa 8160ttgaagaatc gcaaaaccag
caagaaaaga atgaacaaga attattggaa ttagataaat 8220gggcaagttt gtggaattgg
tttaacataa caaattggct gtggtatata aaattattca 8280taatgatagt aggaggcttg
gtaggtttaa gaatagtttt tgctgtactt tctatagtga 8340atagagttag gcagggatat
tcaccattat cgtttcagac ccacctccca accccgaggg 8400gacccgacag gcccgaagga
atagaagaag aaggtggaga gagagacaga gacagatcca 8460ttcgattagt gaacggatcc
ttggcactta tctgggacga tctgcggagc ctgtgcctct 8520tcagctacca ccgcttgaga
gacttactct tgattgtaac gaggattgtg gaacttctgg 8580gacgcagggg gtgggaagcc
ctcaaatatt ggtggaatct cctacagtat tggagtcagg 8640aactaaagaa tagtgctgtt
agcttgctca atgccacagc catagcagta gctgagggga 8700cagatagggt tatagaagta
gtacaaggag cttgtagagc tattcgccac atacctagaa 8760gaataagaca gggcttggaa
aggattttgc tataagatgg gtggcaagtg gtcaaaaagt 8820agtgtgattg gatggcctac
tgtaagggaa agaatgagac gagctgagcc agcagcagat 8880agggtgggag cagcatctcg
agacctggaa aaacatggag caatcacaag tagcaataca 8940gcagctacca atgctgcttg
tgcctggcta gaagcacaag aggaggagga ggtgggtttt 9000ccagtcacac ctcaggtacc
tttaagacca atgacttaca aggcagctgt agatcttagc 9060cactttttaa aagaaaaggg
gggactggaa gggctaattc actcccaaag aagacaagat 9120atccttgatc tgtggatcta
ccacacacaa ggctacttcc ctgattagca gaactacaca 9180ccagggccag gggtcagata
tccactgacc tttggatggt gctacaagct agtaccagtt 9240gagccagata agatagaaga
ggccaataaa ggagagaaca ccagcttgtt acaccctgtg 9300agcctgcatg ggatggatga
cccggagaga gaagtgttag agtggaggtt tgacagccgc 9360ctagcatttc atcacgtggc
ccgagagctg catccggagt acttcaagaa ctgctgacat 9420cgagcttgct acaagggact
ttccgctggg gactttccag ggaggcgtgg cctgggcggg 9480actggggagt ggcgagccct
cagatcctgc atataagcag ctgctttttg cctgtactgg 9540gtctctctgg ttagaccaga
tctgagcctg ggagctctct ggctaactag ggaacccact 9600gcttaagcct caataaagct
tgccttgagt gcttcaagta gtgtgtgccc gtctgttgtg 9660tgactctggt aactagagat
ccctcagacc cttttagtca gtgtggaaaa tctctagca 9719
SEQ ID NO: 5
Length: 3506
Type: DNA
Organism: Human immunodeficiency virus
MISC_FEATURE (1)..(650): HXB2 Protease gene nt 1900 through 2549
MISC_FEATURE (651)..(2336): HXB2 RT gene nt 2550 through 4229
MISC_FEATURE (2337)..(3506): HXB2 INT gene nt 4230 through 5400
Sequence 5:
acaaattcag ctaccataat gatgcagaga ggcaatttta ggaaccaaag aaagattgtt
60aagtgtttca attgtggcaa agaagggcac acagccagaa attgcagggc ccctaggaaa
120aagggctgtt ggaaatgtgg aaaggaagga caccaaatga aagattgtac tgagagacag
180gctaattttt tagggaagat ctggccttcc tacaagggaa ggccagggaa ttttcttcag
240agcagaccag agccaacagc cccaccagaa gagagcttca ggtctggggt agagacaaca
300actccccctc agaagcagga gccgatagac aaggaactgt atcctttaac ttccctcagg
360tcactctttg gcaacgaccc ctcgtcacaa taaagatagg ggggcaacta aaggaagctc
420taatataaac aggagcagat aatacaatat tagaagaaat gagtttgcca ggaagatgga
480aaccaaaaat actagtggga gttggaggtt ttatgaaagt aagacagtat gatcagatac
540tcatagaaat ctgtggacat aaagctatat ctacagtagt agtaggacct acacctgcca
600acgtaattgg aagagatctg atgactcaga ttggttgcac tttaaatttt cccattagcc
660ctattgagac tgtaccagta aaattaaagc caggaatgga tggcccaaaa gttaaacaat
720ggccattgac agaagaaaaa ataaaagcat tagtagaaat ttgtacagag ttggaaaagg
780aagggaaaat ttcaaaaatt gggcctgaaa atccatacaa tactccagta tttgccataa
840agagaaaaaa cagttcttcc tccagatgga gaaaagtagt agatctcaga gaacttaata
900agagaactca agacttctgg gaagttcaat taggaatacc acatcccgca gggatagaaa
960agaacaaatc agcaacaata ctgtaagtgg gtgatgcatt ttattcagtt cccttagatg
1020aagacttcag gaagtatact gcatttacca tacctagtat aaacaatgag acaccaggga
1080ttagatatca gtacaatgtg cttccaatgg gatggaaagg atcaccagca atattccaaa
1140gtagcatgac aaaaatctta gagcctttta gaaaacaaaa tccagacata gttatctgtc
1200aatacgtgga tgatttgtta gtagcatctg acttagaaat agggcagcat agaacaaaaa
1260tagaggagct gagacaacat ctgtggaggt ggggacttta cacaccagac caaaaacatc
1320agaaagaaca tccattcctt tggctgggtt atgaactcca tcctgataaa tggacagtac
1380agcctatagt gctgccagaa aaagacagct ggactgtcaa tgacatacag aagttagtgg
1440ggaaattgaa ttgggcaagt cagatttacc cagggattaa agtaaggcaa ttatgtaaac
1500tccttagagg aaccaaagca ctaacagaag taataccact aacagaagaa gcagagctag
1560aactggcaga aaacagagag attctaaaag aaccagtaca tggagtgtat tatgacccat
1620caaaagactt aatagcagaa atacagaagc aggggcaagg ccaatggaca tatcaaattt
1680atcaagagcc atttaaaaat ctgaaaacag gaaaatatgc aagaatgagg ggtgcccaca
1740ctaatgatgt aaaacaatta acagaggcag tgcaaaaaat aaccacagaa agcatagtaa
1800tatggggaaa gactcctaaa tttaaactgc ccatacaaaa ggaaacatgg gaaacatggt
1860ggacagagta ttggcaagcc acctggattc ctgagtggga gtttgttaat acccctccct
1920tagtgaaatt atggtaccag ttagagaaag aacccatagt aggagcagaa accttctatg
1980tagatggggc agctaacagg gagactaaat taggaaaagc aggatatgtt actaatagag
2040gaagacaaaa agttgtcacc ctaactgaca caacaaatca gaagactgag ttacaagcaa
2100tttatctagc tttgcaggat tcgggattag aagtaaacat agtaacagac tcacaatatg
2160cattaggaat cattcaagca caaccagatc aaagtgaatc agagttagtc aatcaaataa
2220tagagcagtt aataaaaaag gaaaaggtct atctggcatg ggtaccagca cacaaaggaa
2280ttggaggaaa tgaacaagta gataaattag tcagtgctgg aatcaggaaa gtactatttt
2340tagatggaat agataaggcc caagatgaac atgagaaata tcacagtaat tggagagcaa
2400tggctagtga ttttaacctg ccacctgtag tagcaaaaga aatagtagcc agctgtgata
2460aatgtcagct aaaaggagaa gccatgcatg gacaagtaga ctgtagtcca ggaatatggc
2520aactagattg tatacattta gaaggaaaag ttatcatggt agcagttcat gtagccagtg
2580gatatataga agcagaagtt attccagcac aaacagggca ggaagcagca tattttcttt
2640taaaattagc aggaagatgg ccagtaaaaa caatacatac tgacaatggc agcaatttca
2700ccggtgctac ggttagggcc gcctgttggt gggcgggaat caagcaggca ttttcaattc
2760cccgcaatcc ccaaagtcac ggagtagtat aatctatgca taaagaatta aagaaaatta
2820taggacaggt aagagatcag gctgaacatc ttaagacagc agtacaaatg gcagtattca
2880tccacaattt taaaagaaaa ggggggattg gggggtacag tgcaggggaa agaatagtag
2940acataatagc aacagacata caaactaaag aattacaaaa acaaattaca aaaattcaaa
3000attttcgggt ttattacagg gacagcagaa atccactttg gaaaggacca gcaaagctcc
3060tctggaaagg tgaaggggca gtagtaatac aagataatag tgacataaaa gtagtgccaa
3120gaagaaaagc aaagatcatt agggattatg gaaaacagat ggcaggtgat gattgtgtgg
3180caagtagaca ggatgaggat tagaacatgg aaaagtttag taaaacacca tatgtatgtt
3240tcagggaaag ctaggggatg gttttataga catcactatg aaagccctca tccaagaata
3300agttcagaag tacacatccc actaggggat gctagattgg taataacaac atattggggt
3360ctgcatacag gagaaagaga ctggcatttg ggtcagggag tctccataga atggaggaaa
3420aagagatata gcacacaagt agaccctgaa ctagcagacc aactaattca tctgtattac
3480tttgactgtt tttcagactc tgctat
3506
SEQ ID NO: 6
Length: 3500
Type: DNA
Organism: Human immunodeficiency virus
MISC_FEATURE (1)..(650): HXB2 Protease gene nt 1900 through 2549
MISC_FEATURE (651)..(2330): HXB2 RT gene nt 2550 through 4229
MISC_FEATURE (2331)..(3500): HXB2 INT gene nt 4230 through 5400
Sequence 6:
acaaattcag ctaccataat gatgcagaga ggcaatttta ggaaccaaag
aaagattgtt 60aagtgtttca attgtggcaa agaagggcac acagccagaa attgcagggc
ccctaggaaa 120aagggctgtt ggaaatgtgg aaaggaagga caccaaatga aagattgtac
tgagagacag 180gctaattttt tagggaagat ctggccttcc tacaagggaa ggccagggaa
ttttcttcag 240agcagaccag agccaacagc cccaccagaa gagagcttca ggtctggggt
agagacaaca 300actccccctc agaagcagga gccgatagac aaggaactgt atcctttaac
ttccctcagg 360tcactctttg gcaacgaccc ctcgtcacaa taaagatagg ggggcaacta
aaggaagctc 420tattataaac aggagcagat gatacagtat tagaagaaat gagtttgcca
ggaagatgga 480aaccaaaaat gataggggga attggaggtt ttatcaaagt aagacagtat
gatcagatac 540tcatagaaat ctgtggacat aaagctatag gtacagtatt agtaggacct
acacctgtca 600acataattgg aagaaatctg ttgactcaga ttggttgcac tttaaatttt
cccattagcc 660ctattgagac tgtaccagta aaattaaagc caggaatgga tggcccaaaa
gttaaacaat 720ggccattgac agaagaaaaa ataaaagcat tagtagaaat ttgtacagag
atggaaaagg 780aagggaaaat ttcaaaaatt gggcctgaaa atccatacaa tactccagta
tttgccataa 840agaaaaaaga cagtactaaa tggagaaaat tagtagattt cagagaactt
aataagagaa 900ctcaagactt ctgggaagtt caattaggaa taccacatcc cgcagggtta
aaaaagaaaa 960aatcagtaac agtactgtaa gtgggtgatg catatttttc agttccctta
gatgaagact 1020tcaggaagta tactgcattt accataccta gtataaacaa tgagacacca
gggattagat 1080atcagtacaa tgtgcttcca cagggatgga aaggatcacc agcaatattc
caaagtagca 1140tgacaaaaat cttagagcct tttagaaaac aaaatccaga catagttatc
tatcaataca 1200tggatgattt gtatgtagga tctgacttag aaatagggca gcatagaaca
aaaatagagg 1260agctgagaca acatctgttg aggtggggac ttaccacacc agacaaaaaa
catcagaaag 1320aacctccatt cctttggatg ggttatgaac tccatcctga taaatggaca
gtacagccta 1380tagtgctgcc agaaaaagac agctggactg tcaatgacat acagaagtta
gtggggaaat 1440tgaattgggc aagtcagatt tacccaggga ttaaagtaag gcaattatgt
aaactcctta 1500gaggaaccaa agcactaaca gaagtaatac cactaacaga agaagcagag
ctagaactgg 1560cagaaaacag agagattcta aaagaaccag tacatggagt gtattatgac
ccatcaaaag 1620acttaatagc agaaatacag aagcaggggc aaggccaatg gacatatcaa
atttatcaag 1680agccatttaa aaatctgaaa acaggaaaat atgcaagaat gaggggtgcc
cacactaatg 1740atgtaaaaca attaacagag gcagtgcaaa aaataaccac agaaagcata
gtaatatggg 1800gaaagactcc taaatttaaa ctgcccatac aaaaggaaac atgggaaaca
tggtggacag 1860agtattggca agccacctgg attcctgagt gggagtttgt taatacccct
cccttagtga 1920aattatggta ccagttagag aaagaaccca tagtaggagc agaaaccttc
tatgtagatg 1980gggcagctaa cagggagact aaattaggaa aagcaggata tgttactaat
agaggaagac 2040aaaaagttgt caccctaact gacacaacaa atcagaagac tgagttacaa
gcaatttatc 2100tagctttgca ggattcggga ttagaagtaa acatagtaac agactcacaa
tatgcattag 2160gaatcattca agcacaacca gatcaaagtg aatcagagtt agtcaatcaa
ataatagagc 2220agttaataaa aaaggaaaag gtctatctgg catgggtacc agcacacaaa
ggaattggag 2280gaaatgaaca agtagataaa ttagtcagtg ctggaatcag gaaagtacta
tttttagatg 2340gaatagataa ggcccaagat gaacatgaga aatatcacag taattggaga
gcaatggcta 2400gtgattttaa cctgccacct gtagtagcaa aagaaatagt agccagctgt
gataaatgtc 2460agctaaaagg agaagccatg catggacaag tagactgtag tccaggaata
tggcaactag 2520attgtacaca tttagaagga aaagttatcc tggtagcagt tcatgtagcc
agtggatata 2580tagaagcaga agttattcca gcagaaacag ggcaggaaac agcatatttt
cttttaaaat 2640tagcaggaag atggccagta aaaacaatac atactgacaa tggcagcaat
ttcaccggtg 2700ctacggttag ggccgcctgt tggtgggcgg gaatcaagca ggaatttgga
attccctaca 2760atccccaaag tcaaggagta gtataatcta tgaataaaga attaaagaaa
attataggac 2820aggtaagaga tcaggctgaa catcttaaga cagcagtaca aatggcagta
ttcatccaca 2880attttaaaag aaaagggggg attggggggt acagtgcagg ggaaagaata
gtagacataa 2940tagcaacaga catacaaact aaagaattac aaaaacaaat tacaaaaatt
caaaattttc 3000gggtttatta cagggacagc agaaatccac tttggaaagg accagcaaag
ctcctctgga 3060aaggtgaagg ggcagtagta atacaagata atagtgacat aaaagtagtg
ccaagaagaa 3120aagcaaagat cattagggat tatggaaaac agatggcagg tgatgattgt
gtggcaagta 3180gacaggatga ggattagaac atggaaaagt ttagtaaaac accatatgta
tgtttcaggg 3240aaagctaggg gatggtttta tagacatcac tatgaaagcc ctcatccaag
aataagttca 3300gaagtacaca tcccactagg ggatgctaga ttggtaataa caacatattg
gggtctgcat 3360acaggagaaa gagactggca tttgggtcag ggagtctcca tagaatggag
gaaaaagaga 3420tatagcacac aagtagaccc tgaactagca gaccaactaa ttcatctgta
ttactttgac 3480tgtttttcag actctgctat
3500
SEQ ID NO: 7
Length: 1526
Type: DNA
Organism: Human immunodeficiency virus
Sequence 7:
atgatctgta
gtgctacaga aaaattgtgg gtcacagtct attatggggt acctgtgtgg 60aaggaagcaa
ccaccactct attttgtgca tcagatgcta aagcatatga tacagaggta 120cataatgttt
gggccacaca tgcctgtgta cccacagacc ccaacccaca agaagtagta 180ttggtaaatg
tgacagaaaa ttttaacatg tggaaaaatg acatggtaga acagatgcat 240gaggatataa
tcagtttatg ggatcaaagc ctaaagccat gtgtaaaatt aaccccactc 300tgtgttagtt
taaagtgcac tgatttgaag aatgatacta ataccaatag tagtagcggg 360agaatgataa
tggagaaagg agagataaaa aactgctctt tcaatatcag cacaagcata 420agaggtaagg
tgcagaaaga atatgcattt ttttataaac ttgatataat accaatagat 480aatgatacta
ccagctataa gttgacaagt tgtaacacct cagtcattac acaggcctgt 540ccaaaggtat
cctttgagcc aattcccata cattattgtg ccccggctgg ttttgcgatt 600ctaaaatgta
ataataagac gttcaatgga acaggaccat gtacaaatgt cagcacagta 660caatgtacac
atggaattag gccagtagta tcaactcaac tgctgttaaa tggcagtcta 720gcagaagaag
aggtagtaat tagatctgtc aatttcacgg acaatgctaa aaccataata 780gtacagctga
acacatctgt agaaattaat tgtacaagac ccaacaacaa tacaagaaaa 840agaatccgta
tccagagagg accagggaga gcatttgtta caataggaaa aataggaaat 900atgagacaag
cacattgtaa cattagtaga gcaaaatgga ataacacttt aaaacagata 960gctagcaaat
taagagaaca atttggaaat aataaaacaa taatctttaa gcaatcctca 1020ggaggggacc
cagaaattgt aacgcacagt tttaattgtg gaggggaatt tttctactgt 1080aattcaacac
aactgtttaa tagtacttgg tttaatagta cttggagtac tgaagggtca 1140aataacactg
aaggaagtga cacaatcacc ctcccatgca gaataaaaca aattataaac 1200atgtggcaga
aagtaggaaa agcaatgtat gcccctccca tcagtggaca aattagatgt 1260tcatcaaata
ttacagggct gctattaaca agagatggtg gtaatagcaa caatgagtcc 1320gagatcttca
gacctggagg aggagatatg agggacaatt ggagaagtga attatataaa 1380tataaagtag
taaaaattga accattagga gtagcaccca ccaaggcaaa gagaagagtg 1440gtgcagagag
aaaaaagagc agtgggaata ggagctttgt tccttgggtt cttgggagca 1500gcaggaagca
ctatgggcgc agcctc
1526
SEQ ID NO: 8
Length: 1520
Type: DNA
Organism: Human immunodeficiency virus
Sequence 8:
atgatctgta gtgctacaga
aaaattgtgg gtcacagtct attatggggt acctgtgtgg 60aaagaagcaa ccaccactct
attttgtgca tcagatgcta aagcatatga tacagaggta 120cataatgttt gggccacaca
tgcctgtgta cccacagacc ccaacccaca agaagtagaa 180ttggaaaatg tgacagaaaa
ttttaacatg tggaaaaata acatggtaga acagatgcat 240gaggatataa tcagtttatg
ggatcaaagc ctaaagccat gtgtaaaatt aactccactc 300tgtgttactt taaattgcac
tgatttgagg aatgctacta atgggaatga cactaatacc 360actagtagta gcagggaaat
gatgggggga ggagaaatga aaaattgctc tttcaaaatc 420accacaaaca taagaggtaa
ggtgcagaaa gaatatgcac ttttttataa acttgatata 480gtaccaatag ataataatag
taataataga tataggttga taagttgtaa cacctcagtc 540attacacagg cctgtccaaa
gatatccttt gagccaattc ccatacatta ttgtgccccg 600gctggttttg cgattctaaa
gtgtaaagat aagaagttca atggaaaagg accatgttca 660aatgtcagca cagtacaatg
tacacatggg attaggccag tagtatcaac tcaactgctg 720ttaaatggca gtctagcaga
agaagaggta gtaattagat ccgaaaattt cgcggacaat 780gctaaaacca taatagtaca
gctgaatgaa tctgtagaaa ttaattgtac aagacccaac 840aacaatacaa gaaaaagtat
acatatagga ccaggcagag cattatatac aacaggaaaa 900ataataggag atataagaca
agcacattgt aaccttagta gagcaaaatg gaatgacact 960ttaaataaaa tagttataaa
attaagagaa caatttggga ataaaacaat agtctttaag 1020cattcctcag gaggggaccc
agaaattgtg acgcacagtt ttaattgtgg aggggaattt 1080ttctactgta attcaacaca
actgtttaat agtacttgga atgttactga agagtcaaat 1140aacactgtag aaaataacac
aatcacactc ccatgcagaa taaaacaaat tataaacatg 1200tggcagaaag taggaagagc
aatgtatgcc cctcccatca gaggacaaat tagatgttca 1260tcaaatatta cagggctgct
attaacaaga gatggtggtc cagaggacaa caagaccgag 1320gtcttcagac ctggaggagg
agatatgagg gacaattgga gaagtgaatt atataaatat 1380aaagtagtaa aaattgaacc
attaggagta gcacccacca aggcaaagag aagagtggtg 1440cagagagaaa aaagagcagt
gggaatagga gctgtgttcc ttgggttctt gggagcagca 1500ggaagcacta tgggcgcagc
1520
SEQ ID NO: 9
Length: 10080
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence Synthetic polynucleotide
MISC_FEATURE (1)..(7652): Sindbis gene sequence
MISC_FEATURE (7653)..(9707): Ebola GP insert sequence
MISC_FEATURE (9708)..(10080): Sindbis gene sequence
Sequence 9:
attgacggcg
tagtacacac tattgaatca aacagccgac caattgcact accatcacaa 60tggagaagcc
agtagtaaac gtagacgtag acccccagag tccgtttgtc gtgcaactgc 120aaaaaagctt
cccgcaattt gaggtagtag cacagcaggt cactccaaat gaccatgcta 180atgccagagc
attttcgcat ctggccagta aactaatcga gctggaggtt cctaccacag 240cgacgatctt
ggacataggc agcgcaccgg ctcgtagaat gttttccgag caccagtatc 300attgtgtctg
ccccatgcgt agtccagaag acccggaccg catgatgaaa tacgccagta 360aactggcgga
aaaagcgtgc aagattacaa acaagaactt gcatgagaag attaaggatc 420tccggaccgt
acttgatacg ccggatgctg aaacaccatc gctctgcttt cacaacgatg 480ttacctgcaa
catgcgtgcc gaatattccg tcatgcagga cgtgtatatc aacgctcccg 540gaactatcta
tcatcaggct atgaaaggcg tgcggaccct gtactggatt ggcttcgaca 600ccacccagtt
catgttctcg gctatggcag gttcgtaccc tgcgtacaac accaactggg 660ccgacgagaa
agtccttgaa gcgcgtaaca tcggactttg cagcacaaag ctgagtgaag 720gtaggacagg
aaaattgtcg ataatgagga agaaggagtt gaagcccggg tcgcgggttt 780atttctccgt
aggatcgaca ctttatccag aacacagagc cagcttgcag agctggcatc 840ttccatcggt
gttccacttg aatggaaagc agtcgtacac ttgccgctgt gatacagtgg 900tgagttgcga
aggctacgta gtgaagaaaa tcaccatcag tcccgggatc acgggagaaa 960ccgtgggata
cgcggttaca cacaatagcg agggcttctt gctatgcaaa gttactgaca 1020cagtaaaagg
agaacgggta tcgttccctg tgtgcacgta catcccggcc accatatgcg 1080atcagatgac
tggtataatg gccacggata tatcacctga cgatgcacaa aaacttctgg 1140ttgggctcaa
ccagcgaatt gtcattaacg gtaggactaa caggaacacc aacaccatgc 1200aaaattacct
tctgccgatc atagcacaag ggttcagcaa atgggctaag gagcgcaagg 1260atgatcttga
taacgagaaa atgctgggta ctagagaacg caagcttacg tatggctgct 1320tgtgggcgtt
tcgcactaag aaagtacatt cgttttatcg cccacctgga acgcagacct 1380gcgtaaaagt
cccagcctct tttagcgctt ttcccatgtc gtccgtatgg acgacctctt 1440tgcccatgtc
gctgaggcag aaattgaaac tggcattgca accaaagaag gaggaaaaac 1500tgctgcaggt
ctcggaggaa ttagtcatgg aggccaaggc tgcttttgag gatgctcagg 1560aggaagccag
agcggagaag ctccgagaag cacttccacc attagtggca gacaaaggca 1620tcgaggcagc
cgcagaagtt gtctgcgaag tggaggggct ccaggcggac atcggagcag 1680cattagttga
aaccccgcgc ggtcacgtaa ggataatacc tcaagcaaat gaccgtatga 1740tcggacagta
tatcgttgtc tcgccaaact ctgtgctgaa gaatgccaaa ctcgcaccag 1800cgcacccgct
agcagatcag gttaagatca taacacactc cggaagatca ggaaggtacg 1860cggtcgaacc
atacgacgct aaagtactga tgccagcagg aggtgccgta ccatggccag 1920aattcctagc
actgagtgag agcgccacgt tagtgtacaa cgaaagagag tttgtgaacc 1980gcaaactata
ccacattgcc atgcatggcc ccgccaagaa tacagaagag gagcagtaca 2040aggttacaaa
ggcagagctt gcagaaacag agtacgtgtt tgacgtggac aagaagcgtt 2100gcgttaagaa
ggaagaagcc tcaggtctgg tcctctcggg agaactgacc aaccctccct 2160atcatgagct
agctctggag ggactgaaga cccgacctgc ggtcccgtac aaggtcgaaa 2220caataggagt
gataggcaca ccggggtcgg gcaagtcagc tattatcaag tcaactgtca 2280cggcacgaga
tcttgttacc agcggaaaga aagaaaattg tcgcgaaatt gaggccgacg 2340tgctaagact
gaggggtatg cagattacgt cgaagacagt agattcggtt atgctcaacg 2400gatgccacaa
agccgtagaa gtgctgtacg ttgacgaagc gttcgcgtgc cacgcaggag 2460cactacttgc
cttgattgct atcgtcaggc cccgcaagaa ggtagtacta tgcggagacc 2520ccatgcaatg
cggattcttc aacatgatgc aactaaaggt acatttcaat caccctgaaa 2580aagacatatg
caccaagaca ttctacaagt atatctcccg gcgttgcaca cagccagtta 2640cagctattgt
atcgacactg cattacgatg gaaagatgaa aaccacgaac ccgtgcaaga 2700agaacattga
aatcgatatt acaggggcca caaagccgaa gccaggggat atcatcctga 2760catgtttccg
cgggtgggtt aagcaattgc aaatcgacta tcccggacat gaagtaatga 2820cagccgcggc
ctcacaaggg ctaaccagaa aaggagtgta tgccgtccgg caaaaagtca 2880atgaaaaccc
actgtacgcg atcacatcag agcatgtgaa cgtgttgctc acccgcactg 2940aggacaggct
agtgtggaaa accttgcagg gcgacccatg gattaagcag cccactaaca 3000tacctaaagg
aaactttcag gctactatag aggactggga agctgaacac aagggaataa 3060ttgctgcaat
aaacagcccc actccccgtg ccaatccgtt cagctgcaag accaacgttt 3120gctgggcgaa
agcattggaa ccgatactag ccacggccgg tatcgtactt accggttgcc 3180agtggagcga
actgttccca cagtttgcgg atgacaaacc acattcggcc atttacgcct 3240tagacgtaat
ttgcattaag tttttcggca tggacttgac aagcggactg ttttctaaac 3300agagcatccc
actaacgtac catcccgccg attcagcgag gccggtagct cattgggaca 3360acagcccagg
aacccgcaag tatgggtacg atcacgccat tgccgccgaa ctctcccgta 3420gatttccggt
gttccagcta gctgggaagg gcacacaact tgatttgcag acggggagaa 3480ccagagttat
ctctgcacag cataacctgg tcccggtgaa ccgcaatctt cctcacgcct 3540tagtccccga
gtacaaggag aagcaacccg gcccggtcaa aaaattcttg aaccagttca 3600aacaccactc
agtacttgtg gtatcagagg aaaaaattga agctccccgt aagagaatcg 3660aatggatcgc
cccgattggc atagccggtg cagataagaa ctacaacctg gctttcgggt 3720ttccgccgca
ggcacggtac gacctggtgt tcatcaacat tggaactaaa tacagaaacc 3780accactttca
gcagtgcgaa gaccatgcgg cgaccttaaa aaccctttcg cgttcggccc 3840tgaattgcct
taacccagga ggcaccctcg tggtgaagtc ctatggctac gccgaccgca 3900acagtgagga
cgtagtcacc gctcttgcca gaaagtttgt cagggtgtct gcagcgagac 3960cagattgtgt
ctcaagcaat acagaaatgt acctgatttt ccgacaacta gacaacagcc 4020gtacacggca
attcaccccg caccatctga attgcgtgat ttcgtccgtg tatgagggta 4080caagagatgg
agttggagcc gcgccgtcat accgcaccaa aagggagaat attgctgact 4140gtcaagagga
agcagttgtc aacgcagcca atccgctggg tagaccaggc gaaggagtct 4200gccgtgccat
ctataaacgt tggccgacca gttttaccga ttcagccacg gagacaggca 4260ccgcaagaat
gactgtgtgc ctaggaaaga aagtgatcca cgcggtcggc cctgatttcc 4320ggaagcaccc
agaagcagaa gccttgaaat tgctacaaaa cgcctaccat gcagtggcag 4380acttagtaaa
tgaacataac atcaagtctg tcgccattcc actgctatct acaggcattt 4440acgcagccgg
aaaagaccgc cttgaagtat cacttaactg cttgacaacc gcgctagaca 4500gaactgacgc
ggacgtaacc atctattgcc tggataagaa gtggaaggaa agaatcgacg 4560cggcactcca
acttaaggag tctgtaacag agctgaagga tgaagatatg gagatcgacg 4620atgagttagt
atggattcat ccagacagtt gcttgaaggg aagaaaggga ttcagtacta 4680caaaaggaaa
attgtattcg tacttcgaag gcaccaaatt ccatcaagca gcaaaagaca 4740tggcggagat
aaaggtcctg ttccctaatg accaggaaag taatgaacaa ctgtgtgcct 4800acatattggg
tgagaccatg gaagcaatcc gcgaaaagtg cccggtcgac cataacccgt 4860cgtctagccc
gcccaaaacg ttgccgtgcc tttgcatgta tgccatgacg ccagaaaggg 4920tccacagact
tagaagcaat aacgtcaaag aagttacagt atgctcctcc accccccttc 4980ctaagcacaa
aattaagaat gttcagaagg ttcagtgcac gaaagtagtc ctgtttaatc 5040cgcacactcc
cgcattcgtt cccgcccgta agtacataga agtgccagaa cagcctaccg 5100ctcctcctgc
acaggccgag gaggcccccg aagttgtagc gacaccgtca ccatctacag 5160ctgataacac
ctcgcttgat gtcacagaca tctcactgga tatggatgac agtagcgaag 5220gctcactttt
ttcgagcttt agcggatcgg acaactctat tactagtatg gacagttggt 5280cgtcaggacc
tagttcacta gagatagtag accgaaggca ggtggtggtg gctgacgttc 5340atgccgtcca
agagcctgcc cctattccac cgccaaggct aaagaagatg gcccgcctgg 5400cagcggcaag
aaaagagccc actccaccgg caagcaatag ctctgagtcc ctccacctct 5460cttttggtgg
ggtatccatg tccctcggat caattttcga cggagagacg gcccgccagg 5520cagcggtaca
acccctggca acaggcccca cggatgtgcc tatgtctttc ggatcgtttt 5580ccgacggaga
gattgatgag ctgagccgca gagtaactga gtccgaaccc gtcctgtttg 5640gatcatttga
accgggcgaa gtgaactcaa ttatatcgtc ccgatcagcc gtatcttttc 5700cactacgcaa
gcagagacgt agacgcagga gcaggaggac tgaatactga ctaaccgggg 5760taggtgggta
catattttcg acggacacag gccctgggca cttgcaaaag aagtccgttc 5820tgcagaacca
gcttacagaa ccgaccttgg agcgcaatgt cctggaaaga attcatgccc 5880cggtgctcga
cacgtcgaaa gaggaacaac tcaaactcag gtaccagatg atgcccaccg 5940aagccaacaa
aagtaggtac cagtctcgta aagtagaaaa tcagaaagcc ataaccactg 6000agcgactact
gtcaggacta cgactgtata actctgccac agatcagcca gaatgctata 6060agatcaccta
tccgaaacca ttgtactcca gtagcgtacc ggcgaactac tccgatccac 6120agttcgctgt
agctgtctgt aacaactatc tgcatgagaa ctatccgaca gtagcatctt 6180atcagattac
tgacgagtac gatgcttact tggatatggt agacgggaca gtcgcctgcc 6240tggatactgc
aaccttctgc cccgctaagc ttagaagtta cccgaaaaaa catgagtata 6300gagccccgaa
tatccgcagt gcggttccat cagcgatgca gaacacgcta caaaatgtgc 6360tcattgccgc
aactaaaaga aattgcaacg tcacgcagat gcgtgaactg ccaacactgg 6420actcagcgac
attcaatgtc gaatgctttc gaaaatatgc atgtaatgac gagtattggg 6480aggagttcgc
tcggaagcca attaggatta ccactgagtt tgtcaccgca tatgtagcta 6540gactgaaagg
ccctaaggcc gccgcactat ttgcaaagac gtataatttg gtcccattgc 6600aagaagtgcc
tatggataga ttcgtcatgg acatgaaaag agacgtgaaa gttacaccag 6660gcacgaaaca
cacagaagaa agaccgaaag tacaagtgat acaagccgca gaacccctgg 6720cgactgctta
cttatgcggg attcaccggg aattagtgcg taggcttacg gccgtcttgc 6780ttccaaacat
tcacacgctt tttgacatgt cggcggagga ttttgatgca atcatagcag 6840aacacttcaa
gcaaggcgac ccggtactgg agacggatat cgcatcattc gacaaaagcc 6900aagacgacgc
tatggcgtta accggtctga tgatcttgga ggacctgggt gtggatcaac 6960cactactcga
cttgatcgag tgcgcctttg gagaaatatc atccacccat ctacctacgg 7020gtactcgttt
taaattcggg gcgatgatga aatccggaat gttcctcaca ctttttgtca 7080acacagtttt
gaatgtcgtt atcgccagca gagtactaga agagcggctt aaaacgtcca 7140gatgtgcagc
gttcattggc gacgacaaca tcatacatgg agtagtatct gacaaagaaa 7200tggctgagag
gtgcgccacc tggctcaaca tggaggttaa gatcatcgac gcagtcatcg 7260gtgagagacc
accttacttc tgcggcggat ttatcttgca agattcggtt acttccacag 7320cgtgccgcgt
ggcggatccc ctgaaaaggc tgtttaagtt gggtaaaccg ctcccagccg 7380acgacgagca
agacgaagac agaagacgcg ctctgctaga tgaaacaaag gcgtggttta 7440gagtaggtat
aacaggcact ttagcagtgg ccgtgacgac ccggtatgag gtagacaata 7500ttacacctgt
cctactggca ttgagaactt ttgcccagag caaaagagca ttccaagcca 7560tcagagggga
aataaagcat ctctacggtg gtcctaaata gtcagcatag tacatttcat 7620ctgactaata
ctacaacacc accacctcta gatctagaag gaatattgca gttacctcgt 7680gatcgattca
agaggacatc attctttctt tgggtaatta tccttttcca aagaacattt 7740tccatcccgc
ttggagttat ccacaatagt acattacagg ttagtgatgt cgacaaacta 7800gtttgtcgtg
acaaactgtc atccacaaat caattgagat cagttggact gaatctcgag 7860gggaatggag
tggcaactga cgtgccatct gcgactaaaa gatggggctt caggtccggt 7920gtcccaccaa
aggtggtcaa ttatgaagct ggtgaatggg ctgaaaactg ctacaatctt 7980gaaatcaaaa
aacctgacgg gagtgagtgt ctaccagcag cgccagacgg gattcggggc 8040ttcccccggt
gccggtatgt gcacaaagta tcaggaacgg gaccatgtgc cggagacttt 8100gccttccaca
aagagggtgc tttcttcctg tatgatcgac ttgcttccac agttatctac 8160cgaggaacga
ctttcgctga aggtgtcgtt gcatttctga tactgcccca agctaagaag 8220gacttcttca
gctcacaccc cttgagagag ccggtcaatg caacggagga cccgtcgagt 8280ggctattatt
ctaccacaat tagatatcag gctaccggtt ttggaactaa tgagacagag 8340tacttgttcg
aggttgacaa tttgacctac gtccaacttg aatcaagatt cacaccacag 8400tttctgctcc
agctgaatga gacaatatat gcaagtggga agaggagcaa caccacggga 8460aaactaattt
ggaaggtcaa ccccgaaatt gatacaacaa tcggggagtg ggccttcagg 8520gaaactaaaa
aaacctcact agaaaaattc gcagtgaaga gttgtctttc acagctgtat 8580caaacggacc
caaaaacatc agtggtcaga gtccggcgcg aacttcttcc gacccagaga 8640ccaacacaac
aaatgaagac cacaaaatca tggcttcaga aaattcctct gcaatggttc 8700aagtgcacag
tcaaggaagg aaagctgcag tgtcgcatct gacaaccctt gccacaatct 8760ccacgagtcc
tcaacctccc acaaccaaaa caggtccgga caacagcacc cataatacac 8820ccgtgtataa
acttgacatc tctgaggcaa ctcaagttgg acaacatcac cgtagagcag 8880acaacgacag
cacagcctcc gacactcccc ccgccacgac cgcagccgga cccttaaaag 8940cagagaacac
caacacgagt aagagcgctg actccctgga cctcgccacc acgataagcc 9000cccaaaacta
cagcgagact gctggcaaca acaacactca tcaccaagat accggagaag 9060agagtgccag
cagcgggaag ctaggcttaa ttaccaatac tattgctgga gtagcaggac 9120tgatcacagg
cgggagaagg actcgaagag aagtaattgt caatgctcaa cccaaatgca 9180accccaattt
acattactgg actactcagg atgaaggtgc tgcaatcgga ttggcctgga 9240taccatattt
cgggccagca gccgaaggaa tttacacaga ggggctaatg cacaaccaag 9300atggtttaat
ctgtgggttg aggcagctgg ccaacgaaac gactcaagct ctccaactgt 9360tcctgagagc
cacaactgag ctgcgaacct tttcaatcct caaccgtaag gcaattgact 9420tcctgctgca
gcgatggggt ggcacatgcc acattttggg accggactgc tgtatcgaac 9480cacatgattg
gaccaagaac ataacagaca aaattgatca gattattcat gattttgttg 9540ataaaaccct
tccggaccag ggggacaatg acaattggtg gacaggatgg agacaatgga 9600taccggcaat
ttaaatatcg agggcaggag cgagagggcc aaggccaggg agggcggcca 9660ccaccatcac
caccatcacc attagtaatg aggtaaccgt ggggcccaat gatccgacca 9720gcaaaactcg
atgtacttcc gaggaactga tgtgcataat gcatcaggct ggtacattag 9780atccccgctt
accgcgggca atatagcaac actaaaaact cgatgtactt ccgaggaagc 9840gcagtgcata
atgctgcgca gtgttgccac ataaccacta tattaaccat ttatctagcg 9900gacgccaaaa
actcaatgta tttctgagga agcgtggtgc ataatgccac gcagcgtctg 9960cataactttt
attatttctt ttattaatca acaaaatttt gtttttaaca tttcaaaaaa 10020aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa agggaattcc tcgattaatt aagcggccgc
10080
SEQ ID NO: 10
Length: 10348
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence Synthetic polynucleotide
MISC_FEATURE (1)..(7652): Sindbis gene sequence
MISC_FEATURE (7653)..(9975): Ebola NP/VP24 insert sequence
MISC_FEATURE (9976)..(10348): Sindbis gene sequence
Sequence 10:
attgacggcg
tagtacacac tattgaatca aacagccgac caattgcact accatcacaa 60tggagaagcc
agtagtaaac gtagacgtag acccccagag tccgtttgtc gtgcaactgc 120aaaaaagctt
cccgcaattt gaggtagtag cacagcaggt cactccaaat gaccatgcta 180atgccagagc
attttcgcat ctggccagta aactaatcga gctggaggtt cctaccacag 240cgacgatctt
ggacataggc agcgcaccgg ctcgtagaat gttttccgag caccagtatc 300attgtgtctg
ccccatgcgt agtccagaag acccggaccg catgatgaaa tacgccagta 360aactggcgga
aaaagcgtgc aagattacaa acaagaactt gcatgagaag attaaggatc 420tccggaccgt
acttgatacg ccggatgctg aaacaccatc gctctgcttt cacaacgatg 480ttacctgcaa
catgcgtgcc gaatattccg tcatgcagga cgtgtatatc aacgctcccg 540gaactatcta
tcatcaggct atgaaaggcg tgcggaccct gtactggatt ggcttcgaca 600ccacccagtt
catgttctcg gctatggcag gttcgtaccc tgcgtacaac accaactggg 660ccgacgagaa
agtccttgaa gcgcgtaaca tcggactttg cagcacaaag ctgagtgaag 720gtaggacagg
aaaattgtcg ataatgagga agaaggagtt gaagcccggg tcgcgggttt 780atttctccgt
aggatcgaca ctttatccag aacacagagc cagcttgcag agctggcatc 840ttccatcggt
gttccacttg aatggaaagc agtcgtacac ttgccgctgt gatacagtgg 900tgagttgcga
aggctacgta gtgaagaaaa tcaccatcag tcccgggatc acgggagaaa 960ccgtgggata
cgcggttaca cacaatagcg agggcttctt gctatgcaaa gttactgaca 1020cagtaaaagg
agaacgggta tcgttccctg tgtgcacgta catcccggcc accatatgcg 1080atcagatgac
tggtataatg gccacggata tatcacctga cgatgcacaa aaacttctgg 1140ttgggctcaa
ccagcgaatt gtcattaacg gtaggactaa caggaacacc aacaccatgc 1200aaaattacct
tctgccgatc atagcacaag ggttcagcaa atgggctaag gagcgcaagg 1260atgatcttga
taacgagaaa atgctgggta ctagagaacg caagcttacg tatggctgct 1320tgtgggcgtt
tcgcactaag aaagtacatt cgttttatcg cccacctgga acgcagacct 1380gcgtaaaagt
cccagcctct tttagcgctt ttcccatgtc gtccgtatgg acgacctctt 1440tgcccatgtc
gctgaggcag aaattgaaac tggcattgca accaaagaag gaggaaaaac 1500tgctgcaggt
ctcggaggaa ttagtcatgg aggccaaggc tgcttttgag gatgctcagg 1560aggaagccag
agcggagaag ctccgagaag cacttccacc attagtggca gacaaaggca 1620tcgaggcagc
cgcagaagtt gtctgcgaag tggaggggct ccaggcggac atcggagcag 1680cattagttga
aaccccgcgc ggtcacgtaa ggataatacc tcaagcaaat gaccgtatga 1740tcggacagta
tatcgttgtc tcgccaaact ctgtgctgaa gaatgccaaa ctcgcaccag 1800cgcacccgct
agcagatcag gttaagatca taacacactc cggaagatca ggaaggtacg 1860cggtcgaacc
atacgacgct aaagtactga tgccagcagg aggtgccgta ccatggccag 1920aattcctagc
actgagtgag agcgccacgt tagtgtacaa cgaaagagag tttgtgaacc 1980gcaaactata
ccacattgcc atgcatggcc ccgccaagaa tacagaagag gagcagtaca 2040aggttacaaa
ggcagagctt gcagaaacag agtacgtgtt tgacgtggac aagaagcgtt 2100gcgttaagaa
ggaagaagcc tcaggtctgg tcctctcggg agaactgacc aaccctccct 2160atcatgagct
agctctggag ggactgaaga cccgacctgc ggtcccgtac aaggtcgaaa 2220caataggagt
gataggcaca ccggggtcgg gcaagtcagc tattatcaag tcaactgtca 2280cggcacgaga
tcttgttacc agcggaaaga aagaaaattg tcgcgaaatt gaggccgacg 2340tgctaagact
gaggggtatg cagattacgt cgaagacagt agattcggtt atgctcaacg 2400gatgccacaa
agccgtagaa gtgctgtacg ttgacgaagc gttcgcgtgc cacgcaggag 2460cactacttgc
cttgattgct atcgtcaggc cccgcaagaa ggtagtacta tgcggagacc 2520ccatgcaatg
cggattcttc aacatgatgc aactaaaggt acatttcaat caccctgaaa 2580aagacatatg
caccaagaca ttctacaagt atatctcccg gcgttgcaca cagccagtta 2640cagctattgt
atcgacactg cattacgatg gaaagatgaa aaccacgaac ccgtgcaaga 2700agaacattga
aatcgatatt acaggggcca caaagccgaa gccaggggat atcatcctga 2760catgtttccg
cgggtgggtt aagcaattgc aaatcgacta tcccggacat gaagtaatga 2820cagccgcggc
ctcacaaggg ctaaccagaa aaggagtgta tgccgtccgg caaaaagtca 2880atgaaaaccc
actgtacgcg atcacatcag agcatgtgaa cgtgttgctc acccgcactg 2940aggacaggct
agtgtggaaa accttgcagg gcgacccatg gattaagcag cccactaaca 3000tacctaaagg
aaactttcag gctactatag aggactggga agctgaacac aagggaataa 3060ttgctgcaat
aaacagcccc actccccgtg ccaatccgtt cagctgcaag accaacgttt 3120gctgggcgaa
agcattggaa ccgatactag ccacggccgg tatcgtactt accggttgcc 3180agtggagcga
actgttccca cagtttgcgg atgacaaacc acattcggcc atttacgcct 3240tagacgtaat
ttgcattaag tttttcggca tggacttgac aagcggactg ttttctaaac 3300agagcatccc
actaacgtac catcccgccg attcagcgag gccggtagct cattgggaca 3360acagcccagg
aacccgcaag tatgggtacg atcacgccat tgccgccgaa ctctcccgta 3420gatttccggt
gttccagcta gctgggaagg gcacacaact tgatttgcag acggggagaa 3480ccagagttat
ctctgcacag cataacctgg tcccggtgaa ccgcaatctt cctcacgcct 3540tagtccccga
gtacaaggag aagcaacccg gcccggtcaa aaaattcttg aaccagttca 3600aacaccactc
agtacttgtg gtatcagagg aaaaaattga agctccccgt aagagaatcg 3660aatggatcgc
cccgattggc atagccggtg cagataagaa ctacaacctg gctttcgggt 3720ttccgccgca
ggcacggtac gacctggtgt tcatcaacat tggaactaaa tacagaaacc 3780accactttca
gcagtgcgaa gaccatgcgg cgaccttaaa aaccctttcg cgttcggccc 3840tgaattgcct
taacccagga ggcaccctcg tggtgaagtc ctatggctac gccgaccgca 3900acagtgagga
cgtagtcacc gctcttgcca gaaagtttgt cagggtgtct gcagcgagac 3960cagattgtgt
ctcaagcaat acagaaatgt acctgatttt ccgacaacta gacaacagcc 4020gtacacggca
attcaccccg caccatctga attgcgtgat ttcgtccgtg tatgagggta 4080caagagatgg
agttggagcc gcgccgtcat accgcaccaa aagggagaat attgctgact 4140gtcaagagga
agcagttgtc aacgcagcca atccgctggg tagaccaggc gaaggagtct 4200gccgtgccat
ctataaacgt tggccgacca gttttaccga ttcagccacg gagacaggca 4260ccgcaagaat
gactgtgtgc ctaggaaaga aagtgatcca cgcggtcggc cctgatttcc 4320ggaagcaccc
agaagcagaa gccttgaaat tgctacaaaa cgcctaccat gcagtggcag 4380acttagtaaa
tgaacataac atcaagtctg tcgccattcc actgctatct acaggcattt 4440acgcagccgg
aaaagaccgc cttgaagtat cacttaactg cttgacaacc gcgctagaca 4500gaactgacgc
ggacgtaacc atctattgcc tggataagaa gtggaaggaa agaatcgacg 4560cggcactcca
acttaaggag tctgtaacag agctgaagga tgaagatatg gagatcgacg 4620atgagttagt
atggattcat ccagacagtt gcttgaaggg aagaaaggga ttcagtacta 4680caaaaggaaa
attgtattcg tacttcgaag gcaccaaatt ccatcaagca gcaaaagaca 4740tggcggagat
aaaggtcctg ttccctaatg accaggaaag taatgaacaa ctgtgtgcct 4800acatattggg
tgagaccatg gaagcaatcc gcgaaaagtg cccggtcgac cataacccgt 4860cgtctagccc
gcccaaaacg ttgccgtgcc tttgcatgta tgccatgacg ccagaaaggg 4920tccacagact
tagaagcaat aacgtcaaag aagttacagt atgctcctcc accccccttc 4980ctaagcacaa
aattaagaat gttcagaagg ttcagtgcac gaaagtagtc ctgtttaatc 5040cgcacactcc
cgcattcgtt cccgcccgta agtacataga agtgccagaa cagcctaccg 5100ctcctcctgc
acaggccgag gaggcccccg aagttgtagc gacaccgtca ccatctacag 5160ctgataacac
ctcgcttgat gtcacagaca tctcactgga tatggatgac agtagcgaag 5220gctcactttt
ttcgagcttt agcggatcgg acaactctat tactagtatg gacagttggt 5280cgtcaggacc
tagttcacta gagatagtag accgaaggca ggtggtggtg gctgacgttc 5340atgccgtcca
agagcctgcc cctattccac cgccaaggct aaagaagatg gcccgcctgg 5400cagcggcaag
aaaagagccc actccaccgg caagcaatag ctctgagtcc ctccacctct 5460cttttggtgg
ggtatccatg tccctcggat caattttcga cggagagacg gcccgccagg 5520cagcggtaca
acccctggca acaggcccca cggatgtgcc tatgtctttc ggatcgtttt 5580ccgacggaga
gattgatgag ctgagccgca gagtaactga gtccgaaccc gtcctgtttg 5640gatcatttga
accgggcgaa gtgaactcaa ttatatcgtc ccgatcagcc gtatcttttc 5700cactacgcaa
gcagagacgt agacgcagga gcaggaggac tgaatactga ctaaccgggg 5760taggtgggta
catattttcg acggacacag gccctgggca cttgcaaaag aagtccgttc 5820tgcagaacca
gcttacagaa ccgaccttgg agcgcaatgt cctggaaaga attcatgccc 5880cggtgctcga
cacgtcgaaa gaggaacaac tcaaactcag gtaccagatg atgcccaccg 5940aagccaacaa
aagtaggtac cagtctcgta aagtagaaaa tcagaaagcc ataaccactg 6000agcgactact
gtcaggacta cgactgtata actctgccac agatcagcca gaatgctata 6060agatcaccta
tccgaaacca ttgtactcca gtagcgtacc ggcgaactac tccgatccac 6120agttcgctgt
agctgtctgt aacaactatc tgcatgagaa ctatccgaca gtagcatctt 6180atcagattac
tgacgagtac gatgcttact tggatatggt agacgggaca gtcgcctgcc 6240tggatactgc
aaccttctgc cccgctaagc ttagaagtta cccgaaaaaa catgagtata 6300gagccccgaa
tatccgcagt gcggttccat cagcgatgca gaacacgcta caaaatgtgc 6360tcattgccgc
aactaaaaga aattgcaacg tcacgcagat gcgtgaactg ccaacactgg 6420actcagcgac
attcaatgtc gaatgctttc gaaaatatgc atgtaatgac gagtattggg 6480aggagttcgc
tcggaagcca attaggatta ccactgagtt tgtcaccgca tatgtagcta 6540gactgaaagg
ccctaaggcc gccgcactat ttgcaaagac gtataatttg gtcccattgc 6600aagaagtgcc
tatggataga ttcgtcatgg acatgaaaag agacgtgaaa gttacaccag 6660gcacgaaaca
cacagaagaa agaccgaaag tacaagtgat acaagccgca gaacccctgg 6720cgactgctta
cttatgcggg attcaccggg aattagtgcg taggcttacg gccgtcttgc 6780ttccaaacat
tcacacgctt tttgacatgt cggcggagga ttttgatgca atcatagcag 6840aacacttcaa
gcaaggcgac ccggtactgg agacggatat cgcatcattc gacaaaagcc 6900aagacgacgc
tatggcgtta accggtctga tgatcttgga ggacctgggt gtggatcaac 6960cactactcga
cttgatcgag tgcgcctttg gagaaatatc atccacccat ctacctacgg 7020gtactcgttt
taaattcggg gcgatgatga aatccggaat gttcctcaca ctttttgtca 7080acacagtttt
gaatgtcgtt atcgccagca gagtactaga agagcggctt aaaacgtcca 7140gatgtgcagc
gttcattggc gacgacaaca tcatacatgg agtagtatct gacaaagaaa 7200tggctgagag
gtgcgccacc tggctcaaca tggaggttaa gatcatcgac gcagtcatcg 7260gtgagagacc
accttacttc tgcggcggat ttatcttgca agattcggtt acttccacag 7320cgtgccgcgt
ggcggatccc ctgaaaaggc tgtttaagtt gggtaaaccg ctcccagccg 7380acgacgagca
agacgaagac agaagacgcg ctctgctaga tgaaacaaag gcgtggttta 7440gagtaggtat
aacaggcact ttagcagtgg ccgtgacgac ccggtatgag gtagacaata 7500ttacacctgt
cctactggca ttgagaactt ttgcccagag caaaagagca ttccaagcca 7560tcagagggga
aataaagcat ctctacggtg gtcctaaata gtcagcatag tacatttcat 7620ctgactaata
ctacaacacc accacctcta gatctagaat actgtaatca tacctggttt 7680gtttcagagc
catatcacca agatagagaa caacctaggt ctccggaggg ggcaagggca 7740tcagtgtgct
cagttgaaaa tcccttgtca acatctaggc cttatcacat cacaagttcc 7800gccttaaact
ctgcagggtg atccaacaac cttaatagca acattattgt taaaggacag 7860cattagttca
cagtcaaaca agcaagattg agaattaact ttgattttga acctgaacac 7920ccagaggact
ggagactcaa caaccctaaa gcctggggta aaacattaga aatagtttaa 7980agacaaattg
ctcggaatca caaaattccg agtatggatt ctcgtcctca gaaagtctgg 8040tagacgccga
gtctcactga atctgacatg gattaccaca agatcttgac agcaggtctg 8100tccgttcaac
aggggattgt tcggcaaaga gtcatcccag tgtatcaagt aaacaatctt 8160gaggaaattt
gccaacttat catacaggcc tttgaagctg gtgttgattt tcaagagagt 8220gcggacagtt
tccttctcat gctttgtctt catcatgcgt accaaggaga ttacaaactt 8280ttcttggaaa
gtggcgcagt caagtatttg gaagggcacg ggttccgttt tgaagtcaag 8340aagcgtgatg
gagtgaagcg ccttgaggaa ttgctgccag cagtatctag tgggagaaac 8400attaagagaa
cacttgctgc catgccggaa gaggagacga cttaatgccg gacatgatgc 8460caacgatgct
gtgatttcaa attcagtggc tcaagctcgt ttttcaggtc tattgattgt 8520caaaacagta
cttgatcata tcctacaaaa gacagaacga ggagttcgtc tccatcctct 8580tgcaaggacc
gccaaggtaa aaaatgaggt gaactccttc aaggctgcac tcagctccct 8640ggccaagcat
ggagagtatg ctcctttcgc ccgacttttg aacctttctg gagtaaataa 8700tcttgagcat
ggtcttttcc ctcaactgtc ggcaattgca ctcggagtcg ccacagccca 8760cgggagcacc
ctcgcaggag taaatgttgg agaacagtat caacagctca gagaggcagc 8820cactgaggct
gagaagcaac tccaacaata tgcggagtct cgtgaacttg accatcttgg 8880acttgatgat
caggaaaaga aaattcttat gaacttccat cagaaaaaga acgaaatcag 8940cttccagcaa
acaaacgcga tggtaactct aagaaaagag cgcctggcca agctgacaga 9000agctatcact
gctgcatcac tgcccaaaac aagtggacat tacgatgatg atgacgacat 9060tccctttcca
ggacccatca atgatgacga caatcctggc catcaagatg atgatccgac 9120tgactcacag
gatacgacca ttcccgatgt ggtagttgat cccgatgatg gaggctacgg 9180cgaataccaa
agttactcgg aaaacggcat gagtgcacca gatgacttgg tcctatgtct 9240tttagctgta
taccagttgc ccctgagata cgccacaaaa gtgtctctga gctaaagtgg 9300tctgtacaca
tctcatacat tgtattaggg gcaataatat ctaattgaac ttagccattt 9360aaaatttagt
gcataaatct gggctaactc caccaggtca actccattgg ctgaaaagaa 9420gcccacctac
aacgaacatt actttgagcg ccctcacaat taaaaaataa gagcgtcgtt 9480ccaacaatcg
agcgcaaggt tacaaggttg aactgagagt gtctagacaa caaaatatcg 9540atactccaga
caccaagcaa gacctgagaa aaaaccatgg ccaaagctac gggacgatac 9600aatctaatat
cgcccaaaaa ggacctggag aaaggggttg tcttaagcga cctctgtaac 9660ttcttagtta
gtcaaactat tcaagggtgg aaagtttatt gggctggtat tgagtttgat 9720gtgactcaca
aaggaatggc cctattgcat agactgaaaa ctaatgactt tgcccctgca 9780tggtcatggc
ggtgtttgca gatttggacc tgcgagcggg ttctgacctg aaggctctgc 9840gcggacttgt
ggagacagcc gctcaccttg gctattattt aaatatcgag ggcaggagcg 9900agagggccaa
ggccagggag ggcggccacc accatcacca ccatcaccat tagtaatgag 9960gtaaccgtgg
ggcccaatga tccgaccagc aaaactcgat gtacttccga ggaactgatg 10020tgcataatgc
atcaggctgg tacattagat ccccgcttac cgcgggcaat atagcaacac 10080taaaaactcg
atgtacttcc gaggaagcgc agtgcataat gctgcgcagt gttgccacat 10140aaccactata
ttaaccattt atctagcgga cgccaaaaac tcaatgtatt tctgaggaag 10200cgtggtgcat
aatgccacgc agcgtctgca taacttttat tatttctttt attaatcaac 10260aaaattttgt
ttttaacatt tcaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaag 10320ggaattcctc
gattaattaa gcggccgc
10348
SEQ ID NO: 11
Length: 11655
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
MISC_FEATURE (1)..(7646): Sindbis gene sequence
MISC_FEATURE (7647)..(11166): Multi-mutant HIV-1 insert sequence
MISC_FEATURE (11167)..(11655): Sindbis gene sequence
Sequence 11:
attgacggcg
tagtacacac tattgaatca aacagccgac caattgcact accatcacaa 60tggagaagcc
agtagtaaac gtagacgtag acccccagag tccgtttgtc gtgcaactgc 120aaaaaagctt
cccgcaattt gaggtagtag cacagcaggt cactccaaat gaccatgcta 180atgccagagc
attttcgcat ctggccagta aactaatcga gctggaggtt cctaccacag 240cgacgatctt
ggacataggc agcgcaccgg ctcgtagaat gttttccgag caccagtatc 300attgtgtctg
ccccatgcgt agtccagaag acccggaccg catgatgaaa tacgccagta 360aactggcgga
aaaagcgtgc aagattacaa acaagaactt gcatgagaag attaaggatc 420tccggaccgt
acttgatacg ccggatgctg aaacaccatc gctctgcttt cacaacgatg 480ttacctgcaa
catgcgtgcc gaatattccg tcatgcagga cgtgtatatc aacgctcccg 540gaactatcta
tcatcaggct atgaaaggcg tgcggaccct gtactggatt ggcttcgaca 600ccacccagtt
catgttctcg gctatggcag gttcgtaccc tgcgtacaac accaactggg 660ccgacgagaa
agtccttgaa gcgcgtaaca tcggactttg cagcacaaag ctgagtgaag 720gtaggacagg
aaaattgtcg ataatgagga agaaggagtt gaagcccggg tcgcgggttt 780atttctccgt
aggatcgaca ctttatccag aacacagagc cagcttgcag agctggcatc 840ttccatcggt
gttccacttg aatggaaagc agtcgtacac ttgccgctgt gatacagtgg 900tgagttgcga
aggctacgta gtgaagaaaa tcaccatcag tcccgggatc acgggagaaa 960ccgtgggata
cgcggttaca cacaatagcg agggcttctt gctatgcaaa gttactgaca 1020cagtaaaagg
agaacgggta tcgttccctg tgtgcacgta catcccggcc accatatgcg 1080atcagatgac
tggtataatg gccacggata tatcacctga cgatgcacaa aaacttctgg 1140ttgggctcaa
ccagcgaatt gtcattaacg gtaggactaa caggaacacc aacaccatgc 1200aaaattacct
tctgccgatc atagcacaag ggttcagcaa atgggctaag gagcgcaagg 1260atgatcttga
taacgagaaa atgctgggta ctagagaacg caagcttacg tatggctgct 1320tgtgggcgtt
tcgcactaag aaagtacatt cgttttatcg cccacctgga acgcagacct 1380gcgtaaaagt
cccagcctct tttagcgctt ttcccatgtc gtccgtatgg acgacctctt 1440tgcccatgtc
gctgaggcag aaattgaaac tggcattgca accaaagaag gaggaaaaac 1500tgctgcaggt
ctcggaggaa ttagtcatgg aggccaaggc tgcttttgag gatgctcagg 1560aggaagccag
agcggagaag ctccgagaag cacttccacc attagtggca gacaaaggca 1620tcgaggcagc
cgcagaagtt gtctgcgaag tggaggggct ccaggcggac atcggagcag 1680cattagttga
aaccccgcgc ggtcacgtaa ggataatacc tcaagcaaat gaccgtatga 1740tcggacagta
tatcgttgtc tcgccaaact ctgtgctgaa gaatgccaaa ctcgcaccag 1800cgcacccgct
agcagatcag gttaagatca taacacactc cggaagatca ggaaggtacg 1860cggtcgaacc
atacgacgct aaagtactga tgccagcagg aggtgccgta ccatggccag 1920aattcctagc
actgagtgag agcgccacgt tagtgtacaa cgaaagagag tttgtgaacc 1980gcaaactata
ccacattgcc atgcatggcc ccgccaagaa tacagaagag gagcagtaca 2040aggttacaaa
ggcagagctt gcagaaacag agtacgtgtt tgacgtggac aagaagcgtt 2100gcgttaagaa
ggaagaagcc tcaggtctgg tcctctcggg agaactgacc aaccctccct 2160atcatgagct
agctctggag ggactgaaga cccgacctgc ggtcccgtac aaggtcgaaa 2220caataggagt
gataggcaca ccggggtcgg gcaagtcagc tattatcaag tcaactgtca 2280cggcacgaga
tcttgttacc agcggaaaga aagaaaattg tcgcgaaatt gaggccgacg 2340tgctaagact
gaggggtatg cagattacgt cgaagacagt agattcggtt atgctcaacg 2400gatgccacaa
agccgtagaa gtgctgtacg ttgacgaagc gttcgcgtgc cacgcaggag 2460cactacttgc
cttgattgct atcgtcaggc cccgcaagaa ggtagtacta tgcggagacc 2520ccatgcaatg
cggattcttc aacatgatgc aactaaaggt acatttcaat caccctgaaa 2580aagacatatg
caccaagaca ttctacaagt atatctcccg gcgttgcaca cagccagtta 2640cagctattgt
atcgacactg cattacgatg gaaagatgaa aaccacgaac ccgtgcaaga 2700agaacattga
aatcgatatt acaggggcca caaagccgaa gccaggggat atcatcctga 2760catgtttccg
cgggtgggtt aagcaattgc aaatcgacta tcccggacat gaagtaatga 2820cagccgcggc
ctcacaaggg ctaaccagaa aaggagtgta tgccgtccgg caaaaagtca 2880atgaaaaccc
actgtacgcg atcacatcag agcatgtgaa cgtgttgctc acccgcactg 2940aggacaggct
agtgtggaaa accttgcagg gcgacccatg gattaagcag cccactaaca 3000tacctaaagg
aaactttcag gctactatag aggactggga agctgaacac aagggaataa 3060ttgctgcaat
aaacagcccc actccccgtg ccaatccgtt cagctgcaag accaacgttt 3120gctgggcgaa
agcattggaa ccgatactag ccacggccgg tatcgtactt accggttgcc 3180agtggagcga
actgttccca cagtttgcgg atgacaaacc acattcggcc atttacgcct 3240tagacgtaat
ttgcattaag tttttcggca tggacttgac aagcggactg ttttctaaac 3300agagcatccc
actaacgtac catcccgccg attcagcgag gccggtagct cattgggaca 3360acagcccagg
aacccgcaag tatgggtacg atcacgccat tgccgccgaa ctctcccgta 3420gatttccggt
gttccagcta gctgggaagg gcacacaact tgatttgcag acggggagaa 3480ccagagttat
ctctgcacag cataacctgg tcccggtgaa ccgcaatctt cctcacgcct 3540tagtccccga
gtacaaggag aagcaacccg gcccggtcaa aaaattcttg aaccagttca 3600aacaccactc
agtacttgtg gtatcagagg aaaaaattga agctccccgt aagagaatcg 3660aatggatcgc
cccgattggc atagccggtg cagataagaa ctacaacctg gctttcgggt 3720ttccgccgca
ggcacggtac gacctggtgt tcatcaacat tggaactaaa tacagaaacc 3780accactttca
gcagtgcgaa gaccatgcgg cgaccttaaa aaccctttcg cgttcggccc 3840tgaattgcct
taacccagga ggcaccctcg tggtgaagtc ctatggctac gccgaccgca 3900acagtgagga
cgtagtcacc gctcttgcca gaaagtttgt cagggtgtct gcagcgagac 3960cagattgtgt
ctcaagcaat acagaaatgt acctgatttt ccgacaacta gacaacagcc 4020gtacacggca
attcaccccg caccatctga attgcgtgat ttcgtccgtg tatgagggta 4080caagagatgg
agttggagcc gcgccgtcat accgcaccaa aagggagaat attgctgact 4140gtcaagagga
agcagttgtc aacgcagcca atccgctggg tagaccaggc gaaggagtct 4200gccgtgccat
ctataaacgt tggccgacca gttttaccga ttcagccacg gagacaggca 4260ccgcaagaat
gactgtgtgc ctaggaaaga aagtgatcca cgcggtcggc cctgatttcc 4320ggaagcaccc
agaagcagaa gccttgaaat tgctacaaaa cgcctaccat gcagtggcag 4380acttagtaaa
tgaacataac atcaagtctg tcgccattcc actgctatct acaggcattt 4440acgcagccgg
aaaagaccgc cttgaagtat cacttaactg cttgacaacc gcgctagaca 4500gaactgacgc
ggacgtaacc atctattgcc tggataagaa gtggaaggaa agaatcgacg 4560cggcactcca
acttaaggag tctgtaacag agctgaagga tgaagatatg gagatcgacg 4620atgagttagt
atggattcat ccagacagtt gcttgaaggg aagaaaggga ttcagtacta 4680caaaaggaaa
attgtattcg tacttcgaag gcaccaaatt ccatcaagca gcaaaagaca 4740tggcggagat
aaaggtcctg ttccctaatg accaggaaag taatgaacaa ctgtgtgcct 4800acatattggg
tgagaccatg gaagcaatcc gcgaaaagtg cccggtcgac cataacccgt 4860cgtctagccc
gcccaaaacg ttgccgtgcc tttgcatgta tgccatgacg ccagaaaggg 4920tccacagact
tagaagcaat aacgtcaaag aagttacagt atgctcctcc accccccttc 4980ctaagcacaa
aattaagaat gttcagaagg ttcagtgcac gaaagtagtc ctgtttaatc 5040cgcacactcc
cgcattcgtt cccgcccgta agtacataga agtgccagaa cagcctaccg 5100ctcctcctgc
acaggccgag gaggcccccg aagttgtagc gacaccgtca ccatctacag 5160ctgataacac
ctcgcttgat gtcacagaca tctcactgga tatggatgac agtagcgaag 5220gctcactttt
ttcgagcttt agcggatcgg acaactctat tactagtatg gacagttggt 5280cgtcaggacc
tagttcacta gagatagtag accgaaggca ggtggtggtg gctgacgttc 5340atgccgtcca
agagcctgcc cctattccac cgccaaggct aaagaagatg gcccgcctgg 5400cagcggcaag
aaaagagccc actccaccgg caagcaatag ctctgagtcc ctccacctct 5460cttttggtgg
ggtatccatg tccctcggat caattttcga cggagagacg gcccgccagg 5520cagcggtaca
acccctggca acaggcccca cggatgtgcc tatgtctttc ggatcgtttt 5580ccgacggaga
gattgatgag ctgagccgca gagtaactga gtccgaaccc gtcctgtttg 5640gatcatttga
accgggcgaa gtgaactcaa ttatatcgtc ccgatcagcc gtatcttttc 5700cactacgcaa
gcagagacgt agacgcagga gcaggaggac tgaatactga ctaaccgggg 5760taggtgggta
catattttcg acggacacag gccctgggca cttgcaaaag aagtccgttc 5820tgcagaacca
gcttacagaa ccgaccttgg agcgcaatgt cctggaaaga attcatgccc 5880cggtgctcga
cacgtcgaaa gaggaacaac tcaaactcag gtaccagatg atgcccaccg 5940aagccaacaa
aagtaggtac cagtctcgta aagtagaaaa tcagaaagcc ataaccactg 6000agcgactact
gtcaggacta cgactgtata actctgccac agatcagcca gaatgctata 6060agatcaccta
tccgaaacca ttgtactcca gtagcgtacc ggcgaactac tccgatccac 6120agttcgctgt
agctgtctgt aacaactatc tgcatgagaa ctatccgaca gtagcatctt 6180atcagattac
tgacgagtac gatgcttact tggatatggt agacgggaca gtcgcctgcc 6240tggatactgc
aaccttctgc cccgctaagc ttagaagtta cccgaaaaaa catgagtata 6300gagccccgaa
tatccgcagt gcggttccat cagcgatgca gaacacgcta caaaatgtgc 6360tcattgccgc
aactaaaaga aattgcaacg tcacgcagat gcgtgaactg ccaacactgg 6420actcagcgac
attcaatgtc gaatgctttc gaaaatatgc atgtaatgac gagtattggg 6480aggagttcgc
tcggaagcca attaggatta ccactgagtt tgtcaccgca tatgtagcta 6540gactgaaagg
ccctaaggcc gccgcactat ttgcaaagac gtataatttg gtcccattgc 6600aagaagtgcc
tatggataga ttcgtcatgg acatgaaaag agacgtgaaa gttacaccag 6660gcacgaaaca
cacagaagaa agaccgaaag tacaagtgat acaagccgca gaacccctgg 6720cgactgctta
cttatgcggg attcaccggg aattagtgcg taggcttacg gccgtcttgc 6780ttccaaacat
tcacacgctt tttgacatgt cggcggagga ttttgatgca atcatagcag 6840aacacttcaa
gcaaggcgac ccggtactgg agacggatat cgcatcattc gacaaaagcc 6900aagacgacgc
tatggcgtta accggtctga tgatcttgga ggacctgggt gtggatcaac 6960cactactcga
cttgatcgag tgcgcctttg gagaaatatc atccacccat ctacctacgg 7020gtactcgttt
taaattcggg gcgatgatga aatccggaat gttcctcaca ctttttgtca 7080acacagtttt
gaatgtcgtt atcgccagca gagtactaga agagcggctt aaaacgtcca 7140gatgtgcagc
gttcattggc gacgacaaca tcatacatgg agtagtatct gacaaagaaa 7200tggctgagag
gtgcgccacc tggctcaaca tggaggttaa gatcatcgac gcagtcatcg 7260gtgagagacc
accttacttc tgcggcggat ttatcttgca agattcggtt acttccacag 7320cgtgccgcgt
ggcggatccc ctgaaaaggc tgtttaagtt gggtaaaccg ctcccagccg 7380acgacgagca
agacgaagac agaagacgcg ctctgctaga tgaaacaaag gcgtggttta 7440gagtaggtat
aacaggcact ttagcagtgg ccgtgacgac ccggtatgag gtagacaata 7500ttacacctgt
cctactggca ttgagaactt ttgcccagag caaaagagca ttccaagcca 7560tcagagggga
aataaagcat ctctacggtg gtcctaaata gtcagcatag tacatttcat 7620ctgactaata
ctacaacacc accacctcta gaacaaattc agctaccata atgatgcaga 7680gaggcaattt
taggaaccaa agaaagattg ttaagtgttt caattgtggc aaagaagggc 7740acacagccag
aaattgcagg gcccctagga aaaagggctg ttggaaatgt ggaaaggaag 7800gacaccaaat
gaaagattgt actgagagac aggctaattt tttagggaag atctggcctt 7860cctacaaggg
aaggccaggg aattttcttc agagcagacc agagccaaca gccccaccag 7920aagagagctt
caggtctggg gtagagacaa caactccccc tcagaagcag gagccgatag 7980acaaggaact
gtatccttta acttccctca ggtcactctt tggcaacgac ccctcgtcac 8040aataaagata
ggggggcaac taaaggaagc tctaatataa acaggagcag ataatacaat 8100attagaagaa
atgagtttgc caggaagatg gaaaccaaaa atactagtgg gagttggagg 8160ttttatgaaa
gtaagacagt atgatcagat actcatagaa atctgtggac ataaagctat 8220atctacagta
gtagtaggac ctacacctgc caacgtaatt ggaagagatc tgatgactca 8280gattggttgc
actttaaatt ttcccattag ccctattgag actgtaccag taaaattaaa 8340gccaggaatg
gatggcccaa aagttaaaca atggccattg acagaagaaa aaataaaagc 8400attagtagaa
atttgtacag agttggaaaa ggaagggaaa atttcaaaaa ttgggcctga 8460aaatccatac
aatactccag tatttgccat aaagagaaaa aacagttctt cctccagatg 8520gagaaaagta
gtagatctca gagaacttaa taagagaact caagacttct gggaagttca 8580attaggaata
ccacatcccg cagggataga aaagaacaaa tcagcaacaa tactgtaagt 8640gggtgatgca
ttttattcag ttcccttaga tgaagacttc aggaagtata ctgcatttac 8700catacctagt
ataaacaatg agacaccagg gattagatat cagtacaatg tgcttccaat 8760gggatggaaa
ggatcaccag caatattcca aagtagcatg acaaaaatct tagagccttt 8820tagaaaacaa
aatccagaca tagttatctg tcaatacgtg gatgatttgt tagtagcatc 8880tgacttagaa
atagggcagc atagaacaaa aatagaggag ctgagacaac atctgtggag 8940gtggggactt
tacacaccag accaaaaaca tcagaaagaa catccattcc tttggctggg 9000ttatgaactc
catcctgata aatggacagt acagcctata gtgctgccag aaaaagacag 9060ctggactgtc
aatgacatac agaagttagt ggggaaattg aattgggcaa gtcagattta 9120cccagggatt
aaagtaaggc aattatgtaa actccttaga ggaaccaaag cactaacaga 9180agtaatacca
ctaacagaag aagcagagct agaactggca gaaaacagag agattctaaa 9240agaaccagta
catggagtgt attatgaccc atcaaaagac ttaatagcag aaatacagaa 9300gcaggggcaa
ggccaatgga catatcaaat ttatcaagag ccatttaaaa atctgaaaac 9360aggaaaatat
gcaagaatga ggggtgccca cactaatgat gtaaaacaat taacagaggc 9420agtgcaaaaa
ataaccacag aaagcatagt aatatgggga aagactccta aatttaaact 9480gcccatacaa
aaggaaacat gggaaacatg gtggacagag tattggcaag ccacctggat 9540tcctgagtgg
gagtttgtta atacccctcc cttagtgaaa ttatggtacc agttagagaa 9600agaacccata
gtaggagcag aaaccttcta tgtagatggg gcagctaaca gggagactaa 9660attaggaaaa
gcaggatatg ttactaatag aggaagacaa aaagttgtca ccctaactga 9720cacaacaaat
cagaagactg agttacaagc aatttatcta gctttgcagg attcgggatt 9780agaagtaaac
atagtaacag actcacaata tgcattagga atcattcaag cacaaccaga 9840tcaaagtgaa
tcagagttag tcaatcaaat aatagagcag ttaataaaaa aggaaaaggt 9900ctatctggca
tgggtaccag cacacaaagg aattggagga aatgaacaag tagataaatt 9960agtcagtgct
ggaatcagga aagtactatt tttagatgga atagataagg cccaagatga 10020acatgagaaa
tatcacagta attggagagc aatggctagt gattttaacc tgccacctgt 10080agtagcaaaa
gaaatagtag ccagctgtga taaatgtcag ctaaaaggag aagccatgca 10140tggacaagta
gactgtagtc caggaatatg gcaactagat tgtatacatt tagaaggaaa 10200agttatcatg
gtagcagttc atgtagccag tggatatata gaagcagaag ttattccagc 10260acaaacaggg
caggaagcag catattttct tttaaaatta gcaggaagat ggccagtaaa 10320aacaatacat
actgacaatg gcagcaattt caccggtgct acggttaggg ccgcctgttg 10380gtgggcggga
atcaagcagg cattttcaat tccccgcaat ccccaaagtc acggagtagt 10440ataatctatg
cataaagaat taaagaaaat tataggacag gtaagagatc aggctgaaca 10500tcttaagaca
gcagtacaaa tggcagtatt catccacaat tttaaaagaa aaggggggat 10560tggggggtac
agtgcagggg aaagaatagt agacataata gcaacagaca tacaaactaa 10620agaattacaa
aaacaaatta caaaaattca aaattttcgg gtttattaca gggacagcag 10680aaatccactt
tggaaaggac cagcaaagct cctctggaaa ggtgaagggg cagtagtaat 10740acaagataat
agtgacataa aagtagtgcc aagaagaaaa gcaaagatca ttagggatta 10800tggaaaacag
atggcaggtg atgattgtgt ggcaagtaga caggatgagg attagaacat 10860ggaaaagttt
agtaaaacac catatgtatg tttcagggaa agctagggga tggttttata 10920gacatcacta
tgaaagccct catccaagaa taagttcaga agtacacatc ccactagggg 10980atgctagatt
ggtaataaca acatattggg gtctgcatac aggagaaaga gactggcatt 11040tgggtcaggg
agtctccata gaatggagga aaaagagata tagcacacaa gtagaccctg 11100aactagcaga
ccaactaatt catctgtatt actttgactg tttttcagac tctgctatgg 11160cgcgccacgt
gacgcgtgca tgcatttaaa tatcgagggc aggagcgaga gggccaaggc 11220cagggagggc
ggccaccacc atcaccacca tcaccattag taatgaggta accgtggggc 11280ccaatgatcc
gaccagcaaa actcgatgta cttccgagga actgatgtgc ataatgcatc 11340aggctggtac
attagatccc cgcttaccgc gggcaatata gcaacactaa aaactcgatg 11400tacttccgag
gaagcgcagt gcataatgct gcgcagtgtt gccacataac cactatatta 11460accatttatc
tagcggacgc caaaaactca atgtatttct gaggaagcgt ggtgcataat 11520gccacgcagc
gtctgcataa cttttattat ttcttttatt aatcaacaaa attttgtttt 11580taacatttca
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaggga attcctcgat 11640taattaagcg
gccgc
11655
SEQ ID NO: 12
Length: 11649
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
MISC_FEATURE (1)..(7646): Sindbis gene sequence
MISC_FEATURE (7647)..(11160): Wild-type HIV-1 insert sequence
MISC_FEATURE (11161)..(11649): Sindbis gene sequence
Sequence 12:
attgacggcg
tagtacacac tattgaatca aacagccgac caattgcact accatcacaa 60tggagaagcc
agtagtaaac gtagacgtag acccccagag tccgtttgtc gtgcaactgc 120aaaaaagctt
cccgcaattt gaggtagtag cacagcaggt cactccaaat gaccatgcta 180atgccagagc
attttcgcat ctggccagta aactaatcga gctggaggtt cctaccacag 240cgacgatctt
ggacataggc agcgcaccgg ctcgtagaat gttttccgag caccagtatc 300attgtgtctg
ccccatgcgt agtccagaag acccggaccg catgatgaaa tacgccagta 360aactggcgga
aaaagcgtgc aagattacaa acaagaactt gcatgagaag attaaggatc 420tccggaccgt
acttgatacg ccggatgctg aaacaccatc gctctgcttt cacaacgatg 480ttacctgcaa
catgcgtgcc gaatattccg tcatgcagga cgtgtatatc aacgctcccg 540gaactatcta
tcatcaggct atgaaaggcg tgcggaccct gtactggatt ggcttcgaca 600ccacccagtt
catgttctcg gctatggcag gttcgtaccc tgcgtacaac accaactggg 660ccgacgagaa
agtccttgaa gcgcgtaaca tcggactttg cagcacaaag ctgagtgaag 720gtaggacagg
aaaattgtcg ataatgagga agaaggagtt gaagcccggg tcgcgggttt 780atttctccgt
aggatcgaca ctttatccag aacacagagc cagcttgcag agctggcatc 840ttccatcggt
gttccacttg aatggaaagc agtcgtacac ttgccgctgt gatacagtgg 900tgagttgcga
aggctacgta gtgaagaaaa tcaccatcag tcccgggatc acgggagaaa 960ccgtgggata
cgcggttaca cacaatagcg agggcttctt gctatgcaaa gttactgaca 1020cagtaaaagg
agaacgggta tcgttccctg tgtgcacgta catcccggcc accatatgcg 1080atcagatgac
tggtataatg gccacggata tatcacctga cgatgcacaa aaacttctgg 1140ttgggctcaa
ccagcgaatt gtcattaacg gtaggactaa caggaacacc aacaccatgc 1200aaaattacct
tctgccgatc atagcacaag ggttcagcaa atgggctaag gagcgcaagg 1260atgatcttga
taacgagaaa atgctgggta ctagagaacg caagcttacg tatggctgct 1320tgtgggcgtt
tcgcactaag aaagtacatt cgttttatcg cccacctgga acgcagacct 1380gcgtaaaagt
cccagcctct tttagcgctt ttcccatgtc gtccgtatgg acgacctctt 1440tgcccatgtc
gctgaggcag aaattgaaac tggcattgca accaaagaag gaggaaaaac 1500tgctgcaggt
ctcggaggaa ttagtcatgg aggccaaggc tgcttttgag gatgctcagg 1560aggaagccag
agcggagaag ctccgagaag cacttccacc attagtggca gacaaaggca 1620tcgaggcagc
cgcagaagtt gtctgcgaag tggaggggct ccaggcggac atcggagcag 1680cattagttga
aaccccgcgc ggtcacgtaa ggataatacc tcaagcaaat gaccgtatga 1740tcggacagta
tatcgttgtc tcgccaaact ctgtgctgaa gaatgccaaa ctcgcaccag 1800cgcacccgct
agcagatcag gttaagatca taacacactc cggaagatca ggaaggtacg 1860cggtcgaacc
atacgacgct aaagtactga tgccagcagg aggtgccgta ccatggccag 1920aattcctagc
actgagtgag agcgccacgt tagtgtacaa cgaaagagag tttgtgaacc 1980gcaaactata
ccacattgcc atgcatggcc ccgccaagaa tacagaagag gagcagtaca 2040aggttacaaa
ggcagagctt gcagaaacag agtacgtgtt tgacgtggac aagaagcgtt 2100gcgttaagaa
ggaagaagcc tcaggtctgg tcctctcggg agaactgacc aaccctccct 2160atcatgagct
agctctggag ggactgaaga cccgacctgc ggtcccgtac aaggtcgaaa 2220caataggagt
gataggcaca ccggggtcgg gcaagtcagc tattatcaag tcaactgtca 2280cggcacgaga
tcttgttacc agcggaaaga aagaaaattg tcgcgaaatt gaggccgacg 2340tgctaagact
gaggggtatg cagattacgt cgaagacagt agattcggtt atgctcaacg 2400gatgccacaa
agccgtagaa gtgctgtacg ttgacgaagc gttcgcgtgc cacgcaggag 2460cactacttgc
cttgattgct atcgtcaggc cccgcaagaa ggtagtacta tgcggagacc 2520ccatgcaatg
cggattcttc aacatgatgc aactaaaggt acatttcaat caccctgaaa 2580aagacatatg
caccaagaca ttctacaagt atatctcccg gcgttgcaca cagccagtta 2640cagctattgt
atcgacactg cattacgatg gaaagatgaa aaccacgaac ccgtgcaaga 2700agaacattga
aatcgatatt acaggggcca caaagccgaa gccaggggat atcatcctga 2760catgtttccg
cgggtgggtt aagcaattgc aaatcgacta tcccggacat gaagtaatga 2820cagccgcggc
ctcacaaggg ctaaccagaa aaggagtgta tgccgtccgg caaaaagtca 2880atgaaaaccc
actgtacgcg atcacatcag agcatgtgaa cgtgttgctc acccgcactg 2940aggacaggct
agtgtggaaa accttgcagg gcgacccatg gattaagcag cccactaaca 3000tacctaaagg
aaactttcag gctactatag aggactggga agctgaacac aagggaataa 3060ttgctgcaat
aaacagcccc actccccgtg ccaatccgtt cagctgcaag accaacgttt 3120gctgggcgaa
agcattggaa ccgatactag ccacggccgg tatcgtactt accggttgcc 3180agtggagcga
actgttccca cagtttgcgg atgacaaacc acattcggcc atttacgcct 3240tagacgtaat
ttgcattaag tttttcggca tggacttgac aagcggactg ttttctaaac 3300agagcatccc
actaacgtac catcccgccg attcagcgag gccggtagct cattgggaca 3360acagcccagg
aacccgcaag tatgggtacg atcacgccat tgccgccgaa ctctcccgta 3420gatttccggt
gttccagcta gctgggaagg gcacacaact tgatttgcag acggggagaa 3480ccagagttat
ctctgcacag cataacctgg tcccggtgaa ccgcaatctt cctcacgcct 3540tagtccccga
gtacaaggag aagcaacccg gcccggtcaa aaaattcttg aaccagttca 3600aacaccactc
agtacttgtg gtatcagagg aaaaaattga agctccccgt aagagaatcg 3660aatggatcgc
cccgattggc atagccggtg cagataagaa ctacaacctg gctttcgggt 3720ttccgccgca
ggcacggtac gacctggtgt tcatcaacat tggaactaaa tacagaaacc 3780accactttca
gcagtgcgaa gaccatgcgg cgaccttaaa aaccctttcg cgttcggccc 3840tgaattgcct
taacccagga ggcaccctcg tggtgaagtc ctatggctac gccgaccgca 3900acagtgagga
cgtagtcacc gctcttgcca gaaagtttgt cagggtgtct gcagcgagac 3960cagattgtgt
ctcaagcaat acagaaatgt acctgatttt ccgacaacta gacaacagcc 4020gtacacggca
attcaccccg caccatctga attgcgtgat ttcgtccgtg tatgagggta 4080caagagatgg
agttggagcc gcgccgtcat accgcaccaa aagggagaat attgctgact 4140gtcaagagga
agcagttgtc aacgcagcca atccgctggg tagaccaggc gaaggagtct 4200gccgtgccat
ctataaacgt tggccgacca gttttaccga ttcagccacg gagacaggca 4260ccgcaagaat
gactgtgtgc ctaggaaaga aagtgatcca cgcggtcggc cctgatttcc 4320ggaagcaccc
agaagcagaa gccttgaaat tgctacaaaa cgcctaccat gcagtggcag 4380acttagtaaa
tgaacataac atcaagtctg tcgccattcc actgctatct acaggcattt 4440acgcagccgg
aaaagaccgc cttgaagtat cacttaactg cttgacaacc gcgctagaca 4500gaactgacgc
ggacgtaacc atctattgcc tggataagaa gtggaaggaa agaatcgacg 4560cggcactcca
acttaaggag tctgtaacag agctgaagga tgaagatatg gagatcgacg 4620atgagttagt
atggattcat ccagacagtt gcttgaaggg aagaaaggga ttcagtacta 4680caaaaggaaa
attgtattcg tacttcgaag gcaccaaatt ccatcaagca gcaaaagaca 4740tggcggagat
aaaggtcctg ttccctaatg accaggaaag taatgaacaa ctgtgtgcct 4800acatattggg
tgagaccatg gaagcaatcc gcgaaaagtg cccggtcgac cataacccgt 4860cgtctagccc
gcccaaaacg ttgccgtgcc tttgcatgta tgccatgacg ccagaaaggg 4920tccacagact
tagaagcaat aacgtcaaag aagttacagt atgctcctcc accccccttc 4980ctaagcacaa
aattaagaat gttcagaagg ttcagtgcac gaaagtagtc ctgtttaatc 5040cgcacactcc
cgcattcgtt cccgcccgta agtacataga agtgccagaa cagcctaccg 5100ctcctcctgc
acaggccgag gaggcccccg aagttgtagc gacaccgtca ccatctacag 5160ctgataacac
ctcgcttgat gtcacagaca tctcactgga tatggatgac agtagcgaag 5220gctcactttt
ttcgagcttt agcggatcgg acaactctat tactagtatg gacagttggt 5280cgtcaggacc
tagttcacta gagatagtag accgaaggca ggtggtggtg gctgacgttc 5340atgccgtcca
agagcctgcc cctattccac cgccaaggct aaagaagatg gcccgcctgg 5400cagcggcaag
aaaagagccc actccaccgg caagcaatag ctctgagtcc ctccacctct 5460cttttggtgg
ggtatccatg tccctcggat caattttcga cggagagacg gcccgccagg 5520cagcggtaca
acccctggca acaggcccca cggatgtgcc tatgtctttc ggatcgtttt 5580ccgacggaga
gattgatgag ctgagccgca gagtaactga gtccgaaccc gtcctgtttg 5640gatcatttga
accgggcgaa gtgaactcaa ttatatcgtc ccgatcagcc gtatcttttc 5700cactacgcaa
gcagagacgt agacgcagga gcaggaggac tgaatactga ctaaccgggg 5760taggtgggta
catattttcg acggacacag gccctgggca cttgcaaaag aagtccgttc 5820tgcagaacca
gcttacagaa ccgaccttgg agcgcaatgt cctggaaaga attcatgccc 5880cggtgctcga
cacgtcgaaa gaggaacaac tcaaactcag gtaccagatg atgcccaccg 5940aagccaacaa
aagtaggtac cagtctcgta aagtagaaaa tcagaaagcc ataaccactg 6000agcgactact
gtcaggacta cgactgtata actctgccac agatcagcca gaatgctata 6060agatcaccta
tccgaaacca ttgtactcca gtagcgtacc ggcgaactac tccgatccac 6120agttcgctgt
agctgtctgt aacaactatc tgcatgagaa ctatccgaca gtagcatctt 6180atcagattac
tgacgagtac gatgcttact tggatatggt agacgggaca gtcgcctgcc 6240tggatactgc
aaccttctgc cccgctaagc ttagaagtta cccgaaaaaa catgagtata 6300gagccccgaa
tatccgcagt gcggttccat cagcgatgca gaacacgcta caaaatgtgc 6360tcattgccgc
aactaaaaga aattgcaacg tcacgcagat gcgtgaactg ccaacactgg 6420actcagcgac
attcaatgtc gaatgctttc gaaaatatgc atgtaatgac gagtattggg 6480aggagttcgc
tcggaagcca attaggatta ccactgagtt tgtcaccgca tatgtagcta 6540gactgaaagg
ccctaaggcc gccgcactat ttgcaaagac gtataatttg gtcccattgc 6600aagaagtgcc
tatggataga ttcgtcatgg acatgaaaag agacgtgaaa gttacaccag 6660gcacgaaaca
cacagaagaa agaccgaaag tacaagtgat acaagccgca gaacccctgg 6720cgactgctta
cttatgcggg attcaccggg aattagtgcg taggcttacg gccgtcttgc 6780ttccaaacat
tcacacgctt tttgacatgt cggcggagga ttttgatgca atcatagcag 6840aacacttcaa
gcaaggcgac ccggtactgg agacggatat cgcatcattc gacaaaagcc 6900aagacgacgc
tatggcgtta accggtctga tgatcttgga ggacctgggt gtggatcaac 6960cactactcga
cttgatcgag tgcgcctttg gagaaatatc atccacccat ctacctacgg 7020gtactcgttt
taaattcggg gcgatgatga aatccggaat gttcctcaca ctttttgtca 7080acacagtttt
gaatgtcgtt atcgccagca gagtactaga agagcggctt aaaacgtcca 7140gatgtgcagc
gttcattggc gacgacaaca tcatacatgg agtagtatct gacaaagaaa 7200tggctgagag
gtgcgccacc tggctcaaca tggaggttaa gatcatcgac gcagtcatcg 7260gtgagagacc
accttacttc tgcggcggat ttatcttgca agattcggtt acttccacag 7320cgtgccgcgt
ggcggatccc ctgaaaaggc tgtttaagtt gggtaaaccg ctcccagccg 7380acgacgagca
agacgaagac agaagacgcg ctctgctaga tgaaacaaag gcgtggttta 7440gagtaggtat
aacaggcact ttagcagtgg ccgtgacgac ccggtatgag gtagacaata 7500ttacacctgt
cctactggca ttgagaactt ttgcccagag caaaagagca ttccaagcca 7560tcagagggga
aataaagcat ctctacggtg gtcctaaata gtcagcatag tacatttcat 7620ctgactaata
ctacaacacc accacctcta gaacaaattc agctaccata atgatgcaga 7680gaggcaattt
taggaaccaa agaaagattg ttaagtgttt caattgtggc aaagaagggc 7740acacagccag
aaattgcagg gcccctagga aaaagggctg ttggaaatgt ggaaaggaag 7800gacaccaaat
gaaagattgt actgagagac aggctaattt tttagggaag atctggcctt 7860cctacaaggg
aaggccaggg aattttcttc agagcagacc agagccaaca gccccaccag 7920aagagagctt
caggtctggg gtagagacaa caactccccc tcagaagcag gagccgatag 7980acaaggaact
gtatccttta acttccctca ggtcactctt tggcaacgac ccctcgtcac 8040aataaagata
ggggggcaac taaaggaagc tctattataa acaggagcag atgatacagt 8100attagaagaa
atgagtttgc caggaagatg gaaaccaaaa atgatagggg gaattggagg 8160ttttatcaaa
gtaagacagt atgatcagat actcatagaa atctgtggac ataaagctat 8220aggtacagta
ttagtaggac ctacacctgt caacataatt ggaagaaatc tgttgactca 8280gattggttgc
actttaaatt ttcccattag ccctattgag actgtaccag taaaattaaa 8340gccaggaatg
gatggcccaa aagttaaaca atggccattg acagaagaaa aaataaaagc 8400attagtagaa
atttgtacag agatggaaaa ggaagggaaa atttcaaaaa ttgggcctga 8460aaatccatac
aatactccag tatttgccat aaagaaaaaa gacagtacta aatggagaaa 8520attagtagat
ttcagagaac ttaataagag aactcaagac ttctgggaag ttcaattagg 8580aataccacat
cccgcagggt taaaaaagaa aaaatcagta acagtactgt aagtgggtga 8640tgcatatttt
tcagttccct tagatgaaga cttcaggaag tatactgcat ttaccatacc 8700tagtataaac
aatgagacac cagggattag atatcagtac aatgtgcttc cacagggatg 8760gaaaggatca
ccagcaatat tccaaagtag catgacaaaa atcttagagc cttttagaaa 8820acaaaatcca
gacatagtta tctatcaata catggatgat ttgtatgtag gatctgactt 8880agaaataggg
cagcatagaa caaaaataga ggagctgaga caacatctgt tgaggtgggg 8940acttaccaca
ccagacaaaa aacatcagaa agaacctcca ttcctttgga tgggttatga 9000actccatcct
gataaatgga cagtacagcc tatagtgctg ccagaaaaag acagctggac 9060tgtcaatgac
atacagaagt tagtggggaa attgaattgg gcaagtcaga tttacccagg 9120gattaaagta
aggcaattat gtaaactcct tagaggaacc aaagcactaa cagaagtaat 9180accactaaca
gaagaagcag agctagaact ggcagaaaac agagagattc taaaagaacc 9240agtacatgga
gtgtattatg acccatcaaa agacttaata gcagaaatac agaagcaggg 9300gcaaggccaa
tggacatatc aaatttatca agagccattt aaaaatctga aaacaggaaa 9360atatgcaaga
atgaggggtg cccacactaa tgatgtaaaa caattaacag aggcagtgca 9420aaaaataacc
acagaaagca tagtaatatg gggaaagact cctaaattta aactgcccat 9480acaaaaggaa
acatgggaaa catggtggac agagtattgg caagccacct ggattcctga 9540gtgggagttt
gttaataccc ctcccttagt gaaattatgg taccagttag agaaagaacc 9600catagtagga
gcagaaacct tctatgtaga tggggcagct aacagggaga ctaaattagg 9660aaaagcagga
tatgttacta atagaggaag acaaaaagtt gtcaccctaa ctgacacaac 9720aaatcagaag
actgagttac aagcaattta tctagctttg caggattcgg gattagaagt 9780aaacatagta
acagactcac aatatgcatt aggaatcatt caagcacaac cagatcaaag 9840tgaatcagag
ttagtcaatc aaataataga gcagttaata aaaaaggaaa aggtctatct 9900ggcatgggta
ccagcacaca aaggaattgg aggaaatgaa caagtagata aattagtcag 9960tgctggaatc
aggaaagtac tatttttaga tggaatagat aaggcccaag atgaacatga 10020gaaatatcac
agtaattgga gagcaatggc tagtgatttt aacctgccac ctgtagtagc 10080aaaagaaata
gtagccagct gtgataaatg tcagctaaaa ggagaagcca tgcatggaca 10140agtagactgt
agtccaggaa tatggcaact agattgtaca catttagaag gaaaagttat 10200cctggtagca
gttcatgtag ccagtggata tatagaagca gaagttattc cagcagaaac 10260agggcaggaa
acagcatatt ttcttttaaa attagcagga agatggccag taaaaacaat 10320acatactgac
aatggcagca atttcaccgg tgctacggtt agggccgcct gttggtgggc 10380gggaatcaag
caggaatttg gaattcccta caatccccaa agtcaaggag tagtataatc 10440tatgaataaa
gaattaaaga aaattatagg acaggtaaga gatcaggctg aacatcttaa 10500gacagcagta
caaatggcag tattcatcca caattttaaa agaaaagggg ggattggggg 10560gtacagtgca
ggggaaagaa tagtagacat aatagcaaca gacatacaaa ctaaagaatt 10620acaaaaacaa
attacaaaaa ttcaaaattt tcgggtttat tacagggaca gcagaaatcc 10680actttggaaa
ggaccagcaa agctcctctg gaaaggtgaa ggggcagtag taatacaaga 10740taatagtgac
ataaaagtag tgccaagaag aaaagcaaag atcattaggg attatggaaa 10800acagatggca
ggtgatgatt gtgtggcaag tagacaggat gaggattaga acatggaaaa 10860gtttagtaaa
acaccatatg tatgtttcag ggaaagctag gggatggttt tatagacatc 10920actatgaaag
ccctcatcca agaataagtt cagaagtaca catcccacta ggggatgcta 10980gattggtaat
aacaacatat tggggtctgc atacaggaga aagagactgg catttgggtc 11040agggagtctc
catagaatgg aggaaaaaga gatatagcac acaagtagac cctgaactag 11100cagaccaact
aattcatctg tattactttg actgtttttc agactctgct atggcgcgcc 11160acgtgacgcg
tgcatgcatt taaatatcga gggcaggagc gagagggcca aggccaggga 11220gggcggccac
caccatcacc accatcacca ttagtaatga ggtaaccgtg gggcccaatg 11280atccgaccag
caaaactcga tgtacttccg aggaactgat gtgcataatg catcaggctg 11340gtacattaga
tccccgctta ccgcgggcaa tatagcaaca ctaaaaactc gatgtacttc 11400cgaggaagcg
cagtgcataa tgctgcgcag tgttgccaca taaccactat attaaccatt 11460tatctagcgg
acgccaaaaa ctcaatgtat ttctgaggaa gcgtggtgca taatgccacg 11520cagcgtctgc
ataactttta ttatttcttt tattaatcaa caaaattttg tttttaacat 11580ttcaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa gggaattcct cgattaatta 11640agcggccgc
11649

SEQ ID NO: 13
Length: 9675
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
MISC_FEATURE (1)..(7646): Sindbis gene sequence
MISC_FEATURE (7647)..(9186): Mutant HIV-1 insert sequence
MISC_FEATURE (9187)..(9675): Sindbis gene sequence
Sequence 13:
attgacggcg
tagtacacac tattgaatca aacagccgac caattgcact accatcacaa 60tggagaagcc
agtagtaaac gtagacgtag acccccagag tccgtttgtc gtgcaactgc 120aaaaaagctt
cccgcaattt gaggtagtag cacagcaggt cactccaaat gaccatgcta 180atgccagagc
attttcgcat ctggccagta aactaatcga gctggaggtt cctaccacag 240cgacgatctt
ggacataggc agcgcaccgg ctcgtagaat gttttccgag caccagtatc 300attgtgtctg
ccccatgcgt agtccagaag acccggaccg catgatgaaa tacgccagta 360aactggcgga
aaaagcgtgc aagattacaa acaagaactt gcatgagaag attaaggatc 420tccggaccgt
acttgatacg ccggatgctg aaacaccatc gctctgcttt cacaacgatg 480ttacctgcaa
catgcgtgcc gaatattccg tcatgcagga cgtgtatatc aacgctcccg 540gaactatcta
tcatcaggct atgaaaggcg tgcggaccct gtactggatt ggcttcgaca 600ccacccagtt
catgttctcg gctatggcag gttcgtaccc tgcgtacaac accaactggg 660ccgacgagaa
agtccttgaa gcgcgtaaca tcggactttg cagcacaaag ctgagtgaag 720gtaggacagg
aaaattgtcg ataatgagga agaaggagtt gaagcccggg tcgcgggttt 780atttctccgt
aggatcgaca ctttatccag aacacagagc cagcttgcag agctggcatc 840ttccatcggt
gttccacttg aatggaaagc agtcgtacac ttgccgctgt gatacagtgg 900tgagttgcga
aggctacgta gtgaagaaaa tcaccatcag tcccgggatc acgggagaaa 960ccgtgggata
cgcggttaca cacaatagcg agggcttctt gctatgcaaa gttactgaca 1020cagtaaaagg
agaacgggta tcgttccctg tgtgcacgta catcccggcc accatatgcg 1080atcagatgac
tggtataatg gccacggata tatcacctga cgatgcacaa aaacttctgg 1140ttgggctcaa
ccagcgaatt gtcattaacg gtaggactaa caggaacacc aacaccatgc 1200aaaattacct
tctgccgatc atagcacaag ggttcagcaa atgggctaag gagcgcaagg 1260atgatcttga
taacgagaaa atgctgggta ctagagaacg caagcttacg tatggctgct 1320tgtgggcgtt
tcgcactaag aaagtacatt cgttttatcg cccacctgga acgcagacct 1380gcgtaaaagt
cccagcctct tttagcgctt ttcccatgtc gtccgtatgg acgacctctt 1440tgcccatgtc
gctgaggcag aaattgaaac tggcattgca accaaagaag gaggaaaaac 1500tgctgcaggt
ctcggaggaa ttagtcatgg aggccaaggc tgcttttgag gatgctcagg 1560aggaagccag
agcggagaag ctccgagaag cacttccacc attagtggca gacaaaggca 1620tcgaggcagc
cgcagaagtt gtctgcgaag tggaggggct ccaggcggac atcggagcag 1680cattagttga
aaccccgcgc ggtcacgtaa ggataatacc tcaagcaaat gaccgtatga 1740tcggacagta
tatcgttgtc tcgccaaact ctgtgctgaa gaatgccaaa ctcgcaccag 1800cgcacccgct
agcagatcag gttaagatca taacacactc cggaagatca ggaaggtacg 1860cggtcgaacc
atacgacgct aaagtactga tgccagcagg aggtgccgta ccatggccag 1920aattcctagc
actgagtgag agcgccacgt tagtgtacaa cgaaagagag tttgtgaacc 1980gcaaactata
ccacattgcc atgcatggcc ccgccaagaa tacagaagag gagcagtaca 2040aggttacaaa
ggcagagctt gcagaaacag agtacgtgtt tgacgtggac aagaagcgtt 2100gcgttaagaa
ggaagaagcc tcaggtctgg tcctctcggg agaactgacc aaccctccct 2160atcatgagct
agctctggag ggactgaaga cccgacctgc ggtcccgtac aaggtcgaaa 2220caataggagt
gataggcaca ccggggtcgg gcaagtcagc tattatcaag tcaactgtca 2280cggcacgaga
tcttgttacc agcggaaaga aagaaaattg tcgcgaaatt gaggccgacg 2340tgctaagact
gaggggtatg cagattacgt cgaagacagt agattcggtt atgctcaacg 2400gatgccacaa
agccgtagaa gtgctgtacg ttgacgaagc gttcgcgtgc cacgcaggag 2460cactacttgc
cttgattgct atcgtcaggc cccgcaagaa ggtagtacta tgcggagacc 2520ccatgcaatg
cggattcttc aacatgatgc aactaaaggt acatttcaat caccctgaaa 2580aagacatatg
caccaagaca ttctacaagt atatctcccg gcgttgcaca cagccagtta 2640cagctattgt
atcgacactg cattacgatg gaaagatgaa aaccacgaac ccgtgcaaga 2700agaacattga
aatcgatatt acaggggcca caaagccgaa gccaggggat atcatcctga 2760catgtttccg
cgggtgggtt aagcaattgc aaatcgacta tcccggacat gaagtaatga 2820cagccgcggc
ctcacaaggg ctaaccagaa aaggagtgta tgccgtccgg caaaaagtca 2880atgaaaaccc
actgtacgcg atcacatcag agcatgtgaa cgtgttgctc acccgcactg 2940aggacaggct
agtgtggaaa accttgcagg gcgacccatg gattaagcag cccactaaca 3000tacctaaagg
aaactttcag gctactatag aggactggga agctgaacac aagggaataa 3060ttgctgcaat
aaacagcccc actccccgtg ccaatccgtt cagctgcaag accaacgttt 3120gctgggcgaa
agcattggaa ccgatactag ccacggccgg tatcgtactt accggttgcc 3180agtggagcga
actgttccca cagtttgcgg atgacaaacc acattcggcc atttacgcct 3240tagacgtaat
ttgcattaag tttttcggca tggacttgac aagcggactg ttttctaaac 3300agagcatccc
actaacgtac catcccgccg attcagcgag gccggtagct cattgggaca 3360acagcccagg
aacccgcaag tatgggtacg atcacgccat tgccgccgaa ctctcccgta 3420gatttccggt
gttccagcta gctgggaagg gcacacaact tgatttgcag acggggagaa 3480ccagagttat
ctctgcacag cataacctgg tcccggtgaa ccgcaatctt cctcacgcct 3540tagtccccga
gtacaaggag aagcaacccg gcccggtcaa aaaattcttg aaccagttca 3600aacaccactc
agtacttgtg gtatcagagg aaaaaattga agctccccgt aagagaatcg 3660aatggatcgc
cccgattggc atagccggtg cagataagaa ctacaacctg gctttcgggt 3720ttccgccgca
ggcacggtac gacctggtgt tcatcaacat tggaactaaa tacagaaacc 3780accactttca
gcagtgcgaa gaccatgcgg cgaccttaaa aaccctttcg cgttcggccc 3840tgaattgcct
taacccagga ggcaccctcg tggtgaagtc ctatggctac gccgaccgca 3900acagtgagga
cgtagtcacc gctcttgcca gaaagtttgt cagggtgtct gcagcgagac 3960cagattgtgt
ctcaagcaat acagaaatgt acctgatttt ccgacaacta gacaacagcc 4020gtacacggca
attcaccccg caccatctga attgcgtgat ttcgtccgtg tatgagggta 4080caagagatgg
agttggagcc gcgccgtcat accgcaccaa aagggagaat attgctgact 4140gtcaagagga
agcagttgtc aacgcagcca atccgctggg tagaccaggc gaaggagtct 4200gccgtgccat
ctataaacgt tggccgacca gttttaccga ttcagccacg gagacaggca 4260ccgcaagaat
gactgtgtgc ctaggaaaga aagtgatcca cgcggtcggc cctgatttcc 4320ggaagcaccc
agaagcagaa gccttgaaat tgctacaaaa cgcctaccat gcagtggcag 4380acttagtaaa
tgaacataac atcaagtctg tcgccattcc actgctatct acaggcattt 4440acgcagccgg
aaaagaccgc cttgaagtat cacttaactg cttgacaacc gcgctagaca 4500gaactgacgc
ggacgtaacc atctattgcc tggataagaa gtggaaggaa agaatcgacg 4560cggcactcca
acttaaggag tctgtaacag agctgaagga tgaagatatg gagatcgacg 4620atgagttagt
atggattcat ccagacagtt gcttgaaggg aagaaaggga ttcagtacta 4680caaaaggaaa
attgtattcg tacttcgaag gcaccaaatt ccatcaagca gcaaaagaca 4740tggcggagat
aaaggtcctg ttccctaatg accaggaaag taatgaacaa ctgtgtgcct 4800acatattggg
tgagaccatg gaagcaatcc gcgaaaagtg cccggtcgac cataacccgt 4860cgtctagccc
gcccaaaacg ttgccgtgcc tttgcatgta tgccatgacg ccagaaaggg 4920tccacagact
tagaagcaat aacgtcaaag aagttacagt atgctcctcc accccccttc 4980ctaagcacaa
aattaagaat gttcagaagg ttcagtgcac gaaagtagtc ctgtttaatc 5040cgcacactcc
cgcattcgtt cccgcccgta agtacataga agtgccagaa cagcctaccg 5100ctcctcctgc
acaggccgag gaggcccccg aagttgtagc gacaccgtca ccatctacag 5160ctgataacac
ctcgcttgat gtcacagaca tctcactgga tatggatgac agtagcgaag 5220gctcactttt
ttcgagcttt agcggatcgg acaactctat tactagtatg gacagttggt 5280cgtcaggacc
tagttcacta gagatagtag accgaaggca ggtggtggtg gctgacgttc 5340atgccgtcca
agagcctgcc cctattccac cgccaaggct aaagaagatg gcccgcctgg 5400cagcggcaag
aaaagagccc actccaccgg caagcaatag ctctgagtcc ctccacctct 5460cttttggtgg
ggtatccatg tccctcggat caattttcga cggagagacg gcccgccagg 5520cagcggtaca
acccctggca acaggcccca cggatgtgcc tatgtctttc ggatcgtttt 5580ccgacggaga
gattgatgag ctgagccgca gagtaactga gtccgaaccc gtcctgtttg 5640gatcatttga
accgggcgaa gtgaactcaa ttatatcgtc ccgatcagcc gtatcttttc 5700cactacgcaa
gcagagacgt agacgcagga gcaggaggac tgaatactga ctaaccgggg 5760taggtgggta
catattttcg acggacacag gccctgggca cttgcaaaag aagtccgttc 5820tgcagaacca
gcttacagaa ccgaccttgg agcgcaatgt cctggaaaga attcatgccc 5880cggtgctcga
cacgtcgaaa gaggaacaac tcaaactcag gtaccagatg atgcccaccg 5940aagccaacaa
aagtaggtac cagtctcgta aagtagaaaa tcagaaagcc ataaccactg 6000agcgactact
gtcaggacta cgactgtata actctgccac agatcagcca gaatgctata 6060agatcaccta
tccgaaacca ttgtactcca gtagcgtacc ggcgaactac tccgatccac 6120agttcgctgt
agctgtctgt aacaactatc tgcatgagaa ctatccgaca gtagcatctt 6180atcagattac
tgacgagtac gatgcttact tggatatggt agacgggaca gtcgcctgcc 6240tggatactgc
aaccttctgc cccgctaagc ttagaagtta cccgaaaaaa catgagtata 6300gagccccgaa
tatccgcagt gcggttccat cagcgatgca gaacacgcta caaaatgtgc 6360tcattgccgc
aactaaaaga aattgcaacg tcacgcagat gcgtgaactg ccaacactgg 6420actcagcgac
attcaatgtc gaatgctttc gaaaatatgc atgtaatgac gagtattggg 6480aggagttcgc
tcggaagcca attaggatta ccactgagtt tgtcaccgca tatgtagcta 6540gactgaaagg
ccctaaggcc gccgcactat ttgcaaagac gtataatttg gtcccattgc 6600aagaagtgcc
tatggataga ttcgtcatgg acatgaaaag agacgtgaaa gttacaccag 6660gcacgaaaca
cacagaagaa agaccgaaag tacaagtgat acaagccgca gaacccctgg 6720cgactgctta
cttatgcggg attcaccggg aattagtgcg taggcttacg gccgtcttgc 6780ttccaaacat
tcacacgctt tttgacatgt cggcggagga ttttgatgca atcatagcag 6840aacacttcaa
gcaaggcgac ccggtactgg agacggatat cgcatcattc gacaaaagcc 6900aagacgacgc
tatggcgtta accggtctga tgatcttgga ggacctgggt gtggatcaac 6960cactactcga
cttgatcgag tgcgcctttg gagaaatatc atccacccat ctacctacgg 7020gtactcgttt
taaattcggg gcgatgatga aatccggaat gttcctcaca ctttttgtca 7080acacagtttt
gaatgtcgtt atcgccagca gagtactaga agagcggctt aaaacgtcca 7140gatgtgcagc
gttcattggc gacgacaaca tcatacatgg agtagtatct gacaaagaaa 7200tggctgagag
gtgcgccacc tggctcaaca tggaggttaa gatcatcgac gcagtcatcg 7260gtgagagacc
accttacttc tgcggcggat ttatcttgca agattcggtt acttccacag 7320cgtgccgcgt
ggcggatccc ctgaaaaggc tgtttaagtt gggtaaaccg ctcccagccg 7380acgacgagca
agacgaagac agaagacgcg ctctgctaga tgaaacaaag gcgtggttta 7440gagtaggtat
aacaggcact ttagcagtgg ccgtgacgac ccggtatgag gtagacaata 7500ttacacctgt
cctactggca ttgagaactt ttgcccagag caaaagagca ttccaagcca 7560tcagagggga
aataaagcat ctctacggtg gtcctaaata gtcagcatag tacatttcat 7620ctgactaata
ctacaacacc accacctcta gaatgatctg tagtgctaca gaaaaattgt 7680gggtcacagt
ctattatggg gtacctgtgt ggaaggaagc aaccaccact ctattttgtg 7740catcagatgc
taaagcatat gatacagagg tacataatgt ttgggccaca catgcctgtg 7800tacccacaga
ccccaaccca caagaagtag tattggtaaa tgtgacagaa aattttaaca 7860tgtggaaaaa
tgacatggta gaacagatgc atgaggatat aatcagttta tgggatcaaa 7920gcctaaagcc
atgtgtaaaa ttaaccccac tctgtgttag tttaaagtgc actgatttga 7980agaatgatac
taataccaat agtagtagcg ggagaatgat aatggagaaa ggagagataa 8040aaaactgctc
tttcaatatc agcacaagca taagaggtaa ggtgcagaaa gaatatgcat 8100ttttttataa
acttgatata ataccaatag ataatgatac taccagctat aagttgacaa 8160gttgtaacac
ctcagtcatt acacaggcct gtccaaaggt atcctttgag ccaattccca 8220tacattattg
tgccccggct ggttttgcga ttctaaaatg taataataag acgttcaatg 8280gaacaggacc
atgtacaaat gtcagcacag tacaatgtac acatggaatt aggccagtag 8340tatcaactca
actgctgtta aatggcagtc tagcagaaga agaggtagta attagatctg 8400tcaatttcac
ggacaatgct aaaaccataa tagtacagct gaacacatct gtagaaatta 8460attgtacaag
acccaacaac aatacaagaa aaagaatccg tatccagaga ggaccaggga 8520gagcatttgt
tacaatagga aaaataggaa atatgagaca agcacattgt aacattagta 8580gagcaaaatg
gaataacact ttaaaacaga tagctagcaa attaagagaa caatttggaa 8640ataataaaac
aataatcttt aagcaatcct caggagggga cccagaaatt gtaacgcaca 8700gttttaattg
tggaggggaa tttttctact gtaattcaac acaactgttt aatagtactt 8760ggtttaatag
tacttggagt actgaagggt caaataacac tgaaggaagt gacacaatca 8820ccctcccatg
cagaataaaa caaattataa acatgtggca gaaagtagga aaagcaatgt 8880atgcccctcc
catcagtgga caaattagat gttcatcaaa tattacaggg ctgctattaa 8940caagagatgg
tggtaatagc aacaatgagt ccgagatctt cagacctgga ggaggagata 9000tgagggacaa
ttggagaagt gaattatata aatataaagt agtaaaaatt gaaccattag 9060gagtagcacc
caccaaggca aagagaagag tggtgcagag agaaaaaaga gcagtgggaa 9120taggagcttt
gttccttggg ttcttgggag cagcaggaag cactatgggc gcagcctcgg 9180cgcgccacgt
gacgcgtgca tgcatttaaa tatcgagggc aggagcgaga gggccaaggc 9240cagggagggc
ggccaccacc atcaccacca tcaccattag taatgaggta accgtggggc 9300ccaatgatcc
gaccagcaaa actcgatgta cttccgagga actgatgtgc ataatgcatc 9360aggctggtac
attagatccc cgcttaccgc gggcaatata gcaacactaa aaactcgatg 9420tacttccgag
gaagcgcagt gcataatgct gcgcagtgtt gccacataac cactatatta 9480accatttatc
tagcggacgc caaaaactca atgtatttct gaggaagcgt ggtgcataat 9540gccacgcagc
gtctgcataa cttttattat ttcttttatt aatcaacaaa attttgtttt 9600taacatttca
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaggga attcctcgat 9660taattaagcg
gccgc
9675

SEQ ID NO: 14
Length: 9670
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic polynucleotide
MISC_FEATURE (1)..(7646): Sindbis gene sequence
MISC_FEATURE (7647)..(9181): Wild-type HIV-1 insert sequence
MISC_FEATURE (9182)..(9670): Sindbis gene sequence
Sequence 14:
attgacggcg
tagtacacac tattgaatca aacagccgac caattgcact accatcacaa 60tggagaagcc
agtagtaaac gtagacgtag acccccagag tccgtttgtc gtgcaactgc 120aaaaaagctt
cccgcaattt gaggtagtag cacagcaggt cactccaaat gaccatgcta 180atgccagagc
attttcgcat ctggccagta aactaatcga gctggaggtt cctaccacag 240cgacgatctt
ggacataggc agcgcaccgg ctcgtagaat gttttccgag caccagtatc 300attgtgtctg
ccccatgcgt agtccagaag acccggaccg catgatgaaa tacgccagta 360aactggcgga
aaaagcgtgc aagattacaa acaagaactt gcatgagaag attaaggatc 420tccggaccgt
acttgatacg ccggatgctg aaacaccatc gctctgcttt cacaacgatg 480ttacctgcaa
catgcgtgcc gaatattccg tcatgcagga cgtgtatatc aacgctcccg 540gaactatcta
tcatcaggct atgaaaggcg tgcggaccct gtactggatt ggcttcgaca 600ccacccagtt
catgttctcg gctatggcag gttcgtaccc tgcgtacaac accaactggg 660ccgacgagaa
agtccttgaa gcgcgtaaca tcggactttg cagcacaaag ctgagtgaag 720gtaggacagg
aaaattgtcg ataatgagga agaaggagtt gaagcccggg tcgcgggttt 780atttctccgt
aggatcgaca ctttatccag aacacagagc cagcttgcag agctggcatc 840ttccatcggt
gttccacttg aatggaaagc agtcgtacac ttgccgctgt gatacagtgg 900tgagttgcga
aggctacgta gtgaagaaaa tcaccatcag tcccgggatc acgggagaaa 960ccgtgggata
cgcggttaca cacaatagcg agggcttctt gctatgcaaa gttactgaca 1020cagtaaaagg
agaacgggta tcgttccctg tgtgcacgta catcccggcc accatatgcg 1080atcagatgac
tggtataatg gccacggata tatcacctga cgatgcacaa aaacttctgg 1140ttgggctcaa
ccagcgaatt gtcattaacg gtaggactaa caggaacacc aacaccatgc 1200aaaattacct
tctgccgatc atagcacaag ggttcagcaa atgggctaag gagcgcaagg 1260atgatcttga
taacgagaaa atgctgggta ctagagaacg caagcttacg tatggctgct 1320tgtgggcgtt
tcgcactaag aaagtacatt cgttttatcg cccacctgga acgcagacct 1380gcgtaaaagt
cccagcctct tttagcgctt ttcccatgtc gtccgtatgg acgacctctt 1440tgcccatgtc
gctgaggcag aaattgaaac tggcattgca accaaagaag gaggaaaaac 1500tgctgcaggt
ctcggaggaa ttagtcatgg aggccaaggc tgcttttgag gatgctcagg 1560aggaagccag
agcggagaag ctccgagaag cacttccacc attagtggca gacaaaggca 1620tcgaggcagc
cgcagaagtt gtctgcgaag tggaggggct ccaggcggac atcggagcag 1680cattagttga
aaccccgcgc ggtcacgtaa ggataatacc tcaagcaaat gaccgtatga 1740tcggacagta
tatcgttgtc tcgccaaact ctgtgctgaa gaatgccaaa ctcgcaccag 1800cgcacccgct
agcagatcag gttaagatca taacacactc cggaagatca ggaaggtacg 1860cggtcgaacc
atacgacgct aaagtactga tgccagcagg aggtgccgta ccatggccag 1920aattcctagc
actgagtgag agcgccacgt tagtgtacaa cgaaagagag tttgtgaacc 1980gcaaactata
ccacattgcc atgcatggcc ccgccaagaa tacagaagag gagcagtaca 2040aggttacaaa
ggcagagctt gcagaaacag agtacgtgtt tgacgtggac aagaagcgtt 2100gcgttaagaa
ggaagaagcc tcaggtctgg tcctctcggg agaactgacc aaccctccct 2160atcatgagct
agctctggag ggactgaaga cccgacctgc ggtcccgtac aaggtcgaaa 2220caataggagt
gataggcaca ccggggtcgg gcaagtcagc tattatcaag tcaactgtca 2280cggcacgaga
tcttgttacc agcggaaaga aagaaaattg tcgcgaaatt gaggccgacg 2340tgctaagact
gaggggtatg cagattacgt cgaagacagt agattcggtt atgctcaacg 2400gatgccacaa
agccgtagaa gtgctgtacg ttgacgaagc gttcgcgtgc cacgcaggag 2460cactacttgc
cttgattgct atcgtcaggc cccgcaagaa ggtagtacta tgcggagacc 2520ccatgcaatg
cggattcttc aacatgatgc aactaaaggt acatttcaat caccctgaaa 2580aagacatatg
caccaagaca ttctacaagt atatctcccg gcgttgcaca cagccagtta 2640cagctattgt
atcgacactg cattacgatg gaaagatgaa aaccacgaac ccgtgcaaga 2700agaacattga
aatcgatatt acaggggcca caaagccgaa gccaggggat atcatcctga 2760catgtttccg
cgggtgggtt aagcaattgc aaatcgacta tcccggacat gaagtaatga 2820cagccgcggc
ctcacaaggg ctaaccagaa aaggagtgta tgccgtccgg caaaaagtca 2880atgaaaaccc
actgtacgcg atcacatcag agcatgtgaa cgtgttgctc acccgcactg 2940aggacaggct
agtgtggaaa accttgcagg gcgacccatg gattaagcag cccactaaca 3000tacctaaagg
aaactttcag gctactatag aggactggga agctgaacac aagggaataa 3060ttgctgcaat
aaacagcccc actccccgtg ccaatccgtt cagctgcaag accaacgttt 3120gctgggcgaa
agcattggaa ccgatactag ccacggccgg tatcgtactt accggttgcc 3180agtggagcga
actgttccca cagtttgcgg atgacaaacc acattcggcc atttacgcct 3240tagacgtaat
ttgcattaag tttttcggca tggacttgac aagcggactg ttttctaaac 3300agagcatccc
actaacgtac catcccgccg attcagcgag gccggtagct cattgggaca 3360acagcccagg
aacccgcaag tatgggtacg atcacgccat tgccgccgaa ctctcccgta 3420gatttccggt
gttccagcta gctgggaagg gcacacaact tgatttgcag acggggagaa 3480ccagagttat
ctctgcacag cataacctgg tcccggtgaa ccgcaatctt cctcacgcct 3540tagtccccga
gtacaaggag aagcaacccg gcccggtcaa aaaattcttg aaccagttca 3600aacaccactc
agtacttgtg gtatcagagg aaaaaattga agctccccgt aagagaatcg 3660aatggatcgc
cccgattggc atagccggtg cagataagaa ctacaacctg gctttcgggt 3720ttccgccgca
ggcacggtac gacctggtgt tcatcaacat tggaactaaa tacagaaacc 3780accactttca
gcagtgcgaa gaccatgcgg cgaccttaaa aaccctttcg cgttcggccc 3840tgaattgcct
taacccagga ggcaccctcg tggtgaagtc ctatggctac gccgaccgca 3900acagtgagga
cgtagtcacc gctcttgcca gaaagtttgt cagggtgtct gcagcgagac 3960cagattgtgt
ctcaagcaat acagaaatgt acctgatttt ccgacaacta gacaacagcc 4020gtacacggca
attcaccccg caccatctga attgcgtgat ttcgtccgtg tatgagggta 4080caagagatgg
agttggagcc gcgccgtcat accgcaccaa aagggagaat attgctgact 4140gtcaagagga
agcagttgtc aacgcagcca atccgctggg tagaccaggc gaaggagtct 4200gccgtgccat
ctataaacgt tggccgacca gttttaccga ttcagccacg gagacaggca 4260ccgcaagaat
gactgtgtgc ctaggaaaga aagtgatcca cgcggtcggc cctgatttcc 4320ggaagcaccc
agaagcagaa gccttgaaat tgctacaaaa cgcctaccat gcagtggcag 4380acttagtaaa
tgaacataac atcaagtctg tcgccattcc actgctatct acaggcattt 4440acgcagccgg
aaaagaccgc cttgaagtat cacttaactg cttgacaacc gcgctagaca 4500gaactgacgc
ggacgtaacc atctattgcc tggataagaa gtggaaggaa agaatcgacg 4560cggcactcca
acttaaggag tctgtaacag agctgaagga tgaagatatg gagatcgacg 4620atgagttagt
atggattcat ccagacagtt gcttgaaggg aagaaaggga ttcagtacta 4680caaaaggaaa
attgtattcg tacttcgaag gcaccaaatt ccatcaagca gcaaaagaca 4740tggcggagat
aaaggtcctg ttccctaatg accaggaaag taatgaacaa ctgtgtgcct 4800acatattggg
tgagaccatg gaagcaatcc gcgaaaagtg cccggtcgac cataacccgt 4860cgtctagccc
gcccaaaacg ttgccgtgcc tttgcatgta tgccatgacg ccagaaaggg 4920tccacagact
tagaagcaat aacgtcaaag aagttacagt atgctcctcc accccccttc 4980ctaagcacaa
aattaagaat gttcagaagg ttcagtgcac gaaagtagtc ctgtttaatc 5040cgcacactcc
cgcattcgtt cccgcccgta agtacataga agtgccagaa cagcctaccg 5100ctcctcctgc
acaggccgag gaggcccccg aagttgtagc gacaccgtca ccatctacag 5160ctgataacac
ctcgcttgat gtcacagaca tctcactgga tatggatgac agtagcgaag 5220gctcactttt
ttcgagcttt agcggatcgg acaactctat tactagtatg gacagttggt 5280cgtcaggacc
tagttcacta gagatagtag accgaaggca ggtggtggtg gctgacgttc 5340atgccgtcca
agagcctgcc cctattccac cgccaaggct aaagaagatg gcccgcctgg 5400cagcggcaag
aaaagagccc actccaccgg caagcaatag ctctgagtcc ctccacctct 5460cttttggtgg
ggtatccatg tccctcggat caattttcga cggagagacg gcccgccagg 5520cagcggtaca
acccctggca acaggcccca cggatgtgcc tatgtctttc ggatcgtttt 5580ccgacggaga
gattgatgag ctgagccgca gagtaactga gtccgaaccc gtcctgtttg 5640gatcatttga
accgggcgaa gtgaactcaa ttatatcgtc ccgatcagcc gtatcttttc 5700cactacgcaa
gcagagacgt agacgcagga gcaggaggac tgaatactga ctaaccgggg 5760taggtgggta
catattttcg acggacacag gccctgggca cttgcaaaag aagtccgttc 5820tgcagaacca
gcttacagaa ccgaccttgg agcgcaatgt cctggaaaga attcatgccc 5880cggtgctcga
cacgtcgaaa gaggaacaac tcaaactcag gtaccagatg atgcccaccg 5940aagccaacaa
aagtaggtac cagtctcgta aagtagaaaa tcagaaagcc ataaccactg 6000agcgactact
gtcaggacta cgactgtata actctgccac agatcagcca gaatgctata 6060agatcaccta
tccgaaacca ttgtactcca gtagcgtacc ggcgaactac tccgatccac 6120agttcgctgt
agctgtctgt aacaactatc tgcatgagaa ctatccgaca gtagcatctt 6180atcagattac
tgacgagtac gatgcttact tggatatggt agacgggaca gtcgcctgcc 6240tggatactgc
aaccttctgc cccgctaagc ttagaagtta cccgaaaaaa catgagtata 6300gagccccgaa
tatccgcagt gcggttccat cagcgatgca gaacacgcta caaaatgtgc 6360tcattgccgc
aactaaaaga aattgcaacg tcacgcagat gcgtgaactg ccaacactgg 6420actcagcgac
attcaatgtc gaatgctttc gaaaatatgc atgtaatgac gagtattggg 6480aggagttcgc
tcggaagcca attaggatta ccactgagtt tgtcaccgca tatgtagcta 6540gactgaaagg
ccctaaggcc gccgcactat ttgcaaagac gtataatttg gtcccattgc 6600aagaagtgcc
tatggataga ttcgtcatgg acatgaaaag agacgtgaaa gttacaccag 6660gcacgaaaca
cacagaagaa agaccgaaag tacaagtgat acaagccgca gaacccctgg 6720cgactgctta
cttatgcggg attcaccggg aattagtgcg taggcttacg gccgtcttgc 6780ttccaaacat
tcacacgctt tttgacatgt cggcggagga ttttgatgca atcatagcag 6840aacacttcaa
gcaaggcgac ccggtactgg agacggatat cgcatcattc gacaaaagcc 6900aagacgacgc
tatggcgtta accggtctga tgatcttgga ggacctgggt gtggatcaac 6960cactactcga
cttgatcgag tgcgcctttg gagaaatatc atccacccat ctacctacgg 7020gtactcgttt
taaattcggg gcgatgatga aatccggaat gttcctcaca ctttttgtca 7080acacagtttt
gaatgtcgtt atcgccagca gagtactaga agagcggctt aaaacgtcca 7140gatgtgcagc
gttcattggc gacgacaaca tcatacatgg agtagtatct gacaaagaaa 7200tggctgagag
gtgcgccacc tggctcaaca tggaggttaa gatcatcgac gcagtcatcg 7260gtgagagacc
accttacttc tgcggcggat ttatcttgca agattcggtt acttccacag 7320cgtgccgcgt
ggcggatccc ctgaaaaggc tgtttaagtt gggtaaaccg ctcccagccg 7380acgacgagca
agacgaagac agaagacgcg ctctgctaga tgaaacaaag gcgtggttta 7440gagtaggtat
aacaggcact ttagcagtgg ccgtgacgac ccggtatgag gtagacaata 7500ttacacctgt
cctactggca ttgagaactt ttgcccagag caaaagagca ttccaagcca 7560tcagagggga
aataaagcat ctctacggtg gtcctaaata gtcagcatag tacatttcat 7620ctgactaata
ctacaacacc accacctcta gaatgatctg tagtgctaca gaaaaattgt 7680gggtcacagt
ctattatggg gtacctgtgt ggaaagaagc aaccaccact ctattttgtg 7740catcagatgc
taaagcatat gatacagagg tacataatgt ttgggccaca catgcctgtg 7800tacccacaga
ccccaaccca caagaagtag aattggaaaa tgtgacagaa aattttaaca 7860tgtggaaaaa
taacatggta gaacagatgc atgaggatat aatcagttta tgggatcaaa 7920gcctaaagcc
atgtgtaaaa ttaactccac tctgtgttac tttaaattgc actgatttga 7980ggaatgctac
taatgggaat gacactaata ccactagtag tagcagggaa atgatggggg 8040gaggagaaat
gaaaaattgc tctttcaaaa tcaccacaaa cataagaggt aaggtgcaga 8100aagaatatgc
acttttttat aaacttgata tagtaccaat agataataat agtaataata 8160gatataggtt
gataagttgt aacacctcag tcattacaca ggcctgtcca aagatatcct 8220ttgagccaat
tcccatacat tattgtgccc cggctggttt tgcgattcta aagtgtaaag 8280ataagaagtt
caatggaaaa ggaccatgtt caaatgtcag cacagtacaa tgtacacatg 8340ggattaggcc
agtagtatca actcaactgc tgttaaatgg cagtctagca gaagaagagg 8400tagtaattag
atccgaaaat ttcgcggaca atgctaaaac cataatagta cagctgaatg 8460aatctgtaga
aattaattgt acaagaccca acaacaatac aagaaaaagt atacatatag 8520gaccaggcag
agcattatat acaacaggaa aaataatagg agatataaga caagcacatt 8580gtaaccttag
tagagcaaaa tggaatgaca ctttaaataa aatagttata aaattaagag 8640aacaatttgg
gaataaaaca atagtcttta agcattcctc aggaggggac ccagaaattg 8700tgacgcacag
ttttaattgt ggaggggaat ttttctactg taattcaaca caactgttta 8760atagtacttg
gaatgttact gaagagtcaa ataacactgt agaaaataac acaatcacac 8820tcccatgcag
aataaaacaa attataaaca tgtggcagaa agtaggaaga gcaatgtatg 8880cccctcccat
cagaggacaa attagatgtt catcaaatat tacagggctg ctattaacaa 8940gagatggtgg
tccagaggac aacaagaccg aggtcttcag acctggagga ggagatatga 9000gggacaattg
gagaagtgaa ttatataaat ataaagtagt aaaaattgaa ccattaggag 9060tagcacccac
caaggcaaag agaagagtgg tgcagagaga aaaaagagca gtgggaatag 9120gagctgtgtt
ccttgggttc ttgggagcag caggaagcac tatgggcgca gctggcgcgc 9180cacgtgacgc
gtgcatgcat ttaaatatcg agggcaggag cgagagggcc aaggccaggg 9240agggcggcca
ccaccatcac caccatcacc attagtaatg aggtaaccgt ggggcccaat 9300gatccgacca
gcaaaactcg atgtacttcc gaggaactga tgtgcataat gcatcaggct 9360ggtacattag
atccccgctt accgcgggca atatagcaac actaaaaact cgatgtactt 9420ccgaggaagc
gcagtgcata atgctgcgca gtgttgccac ataaccacta tattaaccat 9480ttatctagcg
gacgccaaaa actcaatgta tttctgagga agcgtggtgc ataatgccac 9540gcagcgtctg
cataactttt attatttctt ttattaatca acaaaatttt gtttttaaca 9600tttcaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa agggaattcc tcgattaatt 9660aagcggccgc
9670

SEQ ID NO: 15
Length: 10272
Type: DNA
Organism: Zika virus
Sequence 15:
atgaaaaacc ccaaagaaga aatccggagg atccggattg tcaatatgct aaaacgcgga
60gtagcccgtg tgagcccctt tgggggcttg aagaggctgc cagccggact tctgctgggt
120catgggccca tcaggatggt cttggcgata ctagcctttt tgagattcac ggcaatcaag
180ccatcactgg gtctcatcaa tagatggggt tcagtgggga aaaaagaggc tatggaaata
240ataaagaagt tcaagaaaga tctggctgcc atgctgagaa taatcaatgc taggaaggag
300aagaagagac gaggcacaga tactagtgtc ggaattgttg gcctcctgct gaccacagcc
360atggcagtgg aggtcactag acgtgggagt gcatactata tgtacttgga cagaagcgat
420gctggggagg ccatatcttt tccaaccaca ctggggatga acaagtgtta catacagatc
480atggatcttg gacacatgtg tgatgccacc atgagctatg aatgccctat gttggatgag
540ggggtagaac cagatgacgt cgattgttgg tgcaacacga catcaacttg ggttgtgtac
600ggaacctgcc accacaaaaa aggtgaagca cggagatcta gaagagctgt gacgctcccc
660tcccattcca ctaggaagct gcaaacgcgg tcgcagacct ggttggaatc aagagaatat
720acaaagcacc tgattagagt cgaaaattgg atattcagga accctggctt cgcgttagca
780gcagctgcca tcgcctggct tttgggaagt tcaacgagcc aaaaagtcat atacttggtc
840atgatactgc tgattgcccc ggcatacagc atcaggtgca taggagtcag caatagggac
900tttgtggaag gtatgtcagg tgggacttgg gttgatgttg tcttggaaca tggaggttgt
960gttaccgtaa tggcacagga caaaccggct gtcgacatag agctggttac aacaacagtc
1020agcaacatgg cggaggtaag atcctattgc tatgaggcat caatatcgga catggcttcg
1080gacagccgct gcccaacaca aggtgaagcc taccttgaca agcagtcaga cactcaatat
1140gtctgcaaaa gaacgttagt ggacagaggc tggggaaatg gatgtggact ttttggcaaa
1200gggagcctgg tgacatgcgc taagtttgca tgctccaaga aaatgaccgg gaagagcatc
1260cagccagaga atctggagta ccggataatg ctgtcagttc atggctccca gcacagtggg
1320atgatcgtta atgacacagg acatgaaact gatgagaata gagcgaaggt tgagataacg
1380cccaattcac caagagctga agccaccctg gggggttttg gaagcctagg acttgattgt
1440gaaccgagga caggccttga cttttcagat ttgtattact tgactatgaa taacaagcac
1500tggttggttc acaaggagtg gttccacgac attccattac cttggcatgc tggggcagac
1560accggaactc cacattggaa caacaaagaa gcattggtag agttcaagga cgcacatgcc
1620aaaaggcaaa ctgtcgtggt tctagggagt caagaaggag cagttcacac ggcccttgct
1680ggagctctgg aggctgagat ggatggtgca aagggaaggc tgtcctctgg ccacttgaaa
1740tgtcgcctga aaatggataa acttagattg aagggcgtgt catactcctt gtgtaccgca
1800gcgttcacat tcaccaagat cccggctgaa acactgcacg ggacagtcac agtggaggta
1860cagtacgcag ggacagatgg accctgcaag gttccagctc agatggcggt ggacatgcaa
1920actctgaccc cagttgggag gctgataacc gctaaccctg taatcactga aagcactgag
1980aactctaaga tgatgctgga acttgatcca ccatttgggg actcttacat tgtcatagga
2040gtcggggaga agaagatcac ccatcactgg cacaggagtg gcagcaccat tggaaaagca
2100tttgaagcca ctgtgagagg tgccaagaga atggcagtct tgggagacac agcctgggat
2160tttggatcag ttggaggtgc tctcaactca ttgggcaagg gcatccatca aatttttgga
2220gcagctttca aatcattgtt tggaggaatg tcctggttct cacaaattct cattggaacg
2280ttgctggtgt ggttgggtct gaatacaaag aatggatcta tttcccttac gtgcttggcc
2340ttagggggag tgttgatctt tttatccaca gccgtctctg ctgatgtggg gtgctcggtg
2400gacttctcaa agaaggaaac gagatgcggt acgggggtgt tcgtctataa cgacgttgat
2460gcctggaggg acaggtacaa gtaccatcct gactcccctc gtagattagc agcagcagtc
2520aagcaagcct gggragatgg gatctgtggg atctcctctg tttcaagaat ggaaaacatc
2580atgtggagat cagtagaagg ggagctcaac gcaatcctgg aagagaatgg agttcaactg
2640acggtcgttg tgggatctgt aaaaaacccc atgtggagag gtccacagag attgcccgtg
2700cctgtgaacg agctgcccca cggctggaag gcttggggga aatcgtactt cgtcagagca
2760gcaaagacaa ataacagctt tgtcgtggat ggtgacacac tgaaggaatg cccactcaaa
2820catagagcat ggaacagctt tcttgtggag gatcatgggt tcggggtatt tcacactagt
2880gtctggctca aggttagaga agattattca ttagagtgtg atccagccgt tattggaaca
2940gctgctaagg gaaaggaggc tgtgcacagt gatctaggct actggattga gagtgagaag
3000aatgacacat ggaggctgaa gagggcccac ctgatcgaga tgaaaacatg tgaatggcca
3060aagtcccaca cattgtggac agatggaata gaagaaagtg atctgatcat acccaagtct
3120ttagctgggc cactcagcca tcacaacacc agagagggct acaggactca aatgaaaggg
3180ccatggcaca gtgaagagct tgaaattcgg tttgaggaat gcccaggcac taaggtccac
3240gtggaggaaa catgtggaac aagaggacca tctctgagat caaccactgc aagcggaagg
3300gtgatcgagg aatggtgctg cagggaatgc acaatgcccc cactgtcgtt ccgggctaaa
3360gatggctgtt ggtatggaat ggagataagg cccaggaaag aaccagaaag taacttagta
3420aggtcaatgg tgactgcagg atcaactgat cacatggatc acttctccct tggagtgctt
3480gtgattctgc tcatggtgca ggaagggctg aagaagagaa tgaccacaaa gatcatcata
3540agcacatcaa tggcagtgct ggtagctatg atcctgggag gattttcaat gagtgacctg
3600gccaagcttg caattttgat gggtgccacc tttgcggaaa tgaacactgg aggagatgta
3660gctcatctgg cgctgatagc ggcattcaaa gtcagacctg cgttgctggt atctttcatc
3720ttcagagcta attggacacc ccgtgagagc atgctgctgg ccctggcttc gtgtcttctg
3780caaactgcga tctccgcctt ggaaggcgac ctgatggttc tcatcaatgg ttttgctttg
3840gcctggttgg caatacgagc gatggttgtt ccacgcactg acaacatcac cttggcaatc
3900ctgactgcgc tgacaccact ggcccggggc acgctgcttg tggcgtggag agcaggcctt
3960gctacttgcg gggggttcat gcttctctct ctgaagggga agggcagtgt gaagaagaac
4020ctaccatttg tcatggcctt gggactcacc gctgtgaggc tggtcgaccc catcaacgtg
4080gtgggactgc tgttgctcac aaggagtggg aagcggagct ggccccctag tgaagtactc
4140acagctgttg gtctgatatg cgcgttggcc ggagggttcg ccaaggcgga tatagagatg
4200gctgggccca tggccgcggt cggtctgcta attgtcagtt acgtggtctc aggaaagagt
4260gtggacatgt acattgaaag agcaggtgac atcacatggg aaaaagatgc ggaagtcact
4320ggaaacagtc cccggctcga tgtggcactg gatgagagtg gtgatttctc cctagtggag
4380gatgatggtc cccccatgag agagatcata ctcaaagtgg tcctgatgac catctgtggc
4440atgaacccaa tagccatacc ctttgcagct ggagcgtggt acgtgtatgt gaagactgga
4500aaaaggagtg gtgctctatg ggatgtgcct gctcccaagg aagtaaaaaa gggggagacc
4560acagatggag tgtacagagt aatgactcgt agactgcttg gttcaacaca agttggagtg
4620ggagtcatgc aagagggggt cttccacact atgtggcacg tcacaaaagg atccgcgctg
4680agaagcggtg aagggagact tgatccatac tggggagatg tcaagcagga tctggtgtca
4740tattgtggtc cgtggaagct agacgccgcc tgggacgggc acagcgaggt gcagctcttg
4800gccgtgcccc ccggagagag agcgaggaac atccagactc cgcccggaat atttaagaca
4860aaggatgggg acattggagc agttgcgttg gactacccag caggaacttc aggatctcca
4920atcctagaca agtgtgggag agtgatagga ctctatggta atggggtcgt gataaaaaat
4980gggagttatg ttagtgccat cacccaaggg aggagggagg aagagactcc tgttgagtgc
5040ttcgagcctt cgatgttgaa gaagaagcag ctaactgtct tagacctgca tcctggagct
5100gggaaaacca ggagagttct tcctgaaata gtccgtgaag ccataaaaac aagactccgt
5160actgtgatct tagctccaac cagggttgtc gctgctgaaa tggaggaagc ccttagaggg
5220cttccagtgc gttatatgac aacagcagtc aatgtcaccc attctgggac agaaattgtt
5280gacttaatgt gccatgccac cttcacttca cgtctactac aaccaatcag agtccccaac
5340tataatctgt atattatgga cgaggcccac ttcacagatc cctcaagtat agcagcaaga
5400ggatacattt caacaagggt tgagatgggc gaggcggccg ccatcttcat gaccgccacc
5460ccaccaggaa cccgtgacgc attcccggac tccaactcac caattatgga caccgaagtg
5520gaagtcccag agagagcctg gagctcaggc tttgattggg tgacggatca ttctggaaaa
5580acagtttggt ttgttccaag cgtgaggaac ggcaatgaga tcgcagcttg tctgacaaag
5640gctggaaaac gggtcataca gctcagcaga aagacttttg agacagagtt cctgaaaaca
5700aaaaatcaag agtgggactt cgtcgtgaca actgacattt cagagatggg cgccaacttt
5760aaagctgacc gtgtcataga ttccaggaga tgcctaaagc cggtcatact tgatggcgag
5820agagtcattc tggctggacc catgcctgtc acacatgcca gcgctgccca gaggaggggg
5880cgcataggca ggaatcccaa caaacctgga gatgagtatc tgtatggagg tgggtgcgca
5940gagactgatg aagaccatgc acactggctt gaagcaagaa tgcttcttga caacatttac
6000ctccaagatg gcctcatagc ctcgctctat cgacctgagg ccgacaaagt agcagctatt
6060gagggagagt tcaagcttag gacggagcaa aggaagacct ttgtggaact catgaaaaga
6120ggagatcttc ctgtttggct ggcctatcag gttgcatctg ccggaataac ctacacagat
6180agaaaatggt gctttgatgg cacgaccaac aacaccataa tggaagacag tgtgccggca
6240gaggtgtgga ccagatacgg agagaaaaga gtgctcaaac caaggtggat ggacgccaga
6300gtttgttcag atcatgcggc cctgaagtca ttcaaagagt ttgccgctgg gaaaagagga
6360gyggcctttg gagtgatgga agccctggga acattgccgg gacacatgac agagagattc
6420caggaagcca ttgacaacct cgctgtgctc atgcgggcag agactggaag caggccttac
6480gaagccgcgg cggcccaatt gccggagacc ttagagacca tcatgctttt ggggttgctg
6540ggaacagtct cgctgggaat ctttttcgtc ttgatgcgga ataagggcat cgggaagatg
6600ggctttggaa tggtgactct tggggccagc gcatggctta tgtggctctc ggaaattgag
6660ccagccagaa ttgcatgtgt cctcattgtt gtgttcctat tgctggtggt gctcatacct
6720gagccagaaa agcaaagatc tccccaggac aaccaaatgg caatcatcat catgatagca
6780gtgggtcttc tgggtttgat taccgccaat gaacttggat ggttggaaag aacaaagagt
6840gacctaagcc atctaatggg aaggagagag gagggggcaa ccataggatt ctcaatggac
6900attgacctgc ggccagcctc agcttgggct atctatgctg ctctgacaac tttcatcacc
6960ccagccgtcc aacatgcggt gaccacttca tacaacaact actccttaat ggcgatggcc
7020acgcaagctg gagtgttgtt tggtatgggt aaagggatgc cattctacgc atgggacttt
7080ggagtcccgc tgctaatgat gggttgctac tcacaattaa cacccctgac cctaatagtg
7140gccatcattt tgctcgtggc gcactacatg tacttgatcc cagggctgca ggcagcagct
7200gcgcgtgctg cccagaagag aacggcagct ggcatcatga agaaccctgt tgtggatgga
7260atagtggtga ctgacattga cacaatgaca attgaccacc gagtggagaa aaagatggga
7320caggtgctac tcatagcagt agccgtctcc agcgccatac tgtcgcggac cgcctggggg
7380tggggggagg ctggggccct gatcacagct gcaacttcca ctttgtggga aggctctccg
7440aacaagtact ggaactcctc cacagccact tcactgtgta acatttttag gggaagttac
7500ttggctggag cttctctaat ctacacagta acaagaaacg ctggcttggt caagagacgt
7560gggggtggaa cgggagagac cctgggagag aaatggaagg cccgcctgaa ccagatgtcg
7620gccctagagt tctactccta caaaaagtca ggcatcaccg aggtgtgcag agaagaggcc
7680cgccgcgccc tcaaggacgg tgtggcaaca ggaggccatg ctgtgtcccg aggaagtgca
7740aagcttagat ggttggtgga gagaggatac ctgcagccct atggaaaggt cattgatctt
7800ggatgtggca gagggggctg gagttactac gccgccacca tccgcaaagt tcaagaagtg
7860aaaggataca caaaaggagg ccctggtcat gaagaaccca tgttggtgca aagctatggg
7920tggaacatag tccgtcttaa gagtggggtg gacgtctttc atatggcggc tgagccgtgt
7980gacactttgc tgtgtgatat aggtgagtca tcatctagtc ctgaagtgga agaagcacgg
8040acgctcagag tcctttccat ggtgggggat tggcttgaaa aaagaccagg agccttttgt
8100ataaaagtgt tgtgcccata caccagcact atgatggaaa ccctggagcg actgcagcgt
8160aggtatgggg gaggactggt cagagtgcca ctctcccgca actctacaca tgagatgtac
8220tgggtctctg gagcgaaaag caacaccata aaaagtgtgt ccaccacgag ccagctcctc
8280ttggggcgca tggacgggcc caggaggcca gtgaaatatg aggaggatgt gaatctcggc
8340tccggcacgc gggctgtggt aagctgcgct gaagctccca acatgaagat cattggtaac
8400cgcattgaga ggatccgcag tgagcacgcg gaaacgtggt tctttgacga gaaccaccca
8460tataggacat gggcttacca tggaagctat gaggccccta cacaagggtc agcgtcctct
8520ctaataaacg gggttgtcag gctcctgtca aaaccctggg atgtggtgac tggagtcaca
8580ggaatagcca tgaccgacac cacaccgtat ggtcagcaaa gagttttcaa ggaaaaagtg
8640gacactaggg tgccagaccc ccaagaaggc actcgtcagg ttatgagcat ggtctcttcc
8700tggttatgga aagagctagg caaacacaaa cggccacgag tctgtaccaa agaagagttc
8760atcaacaagg ttcgtagcaa tgcagcatta ggggcaatat ttgaagagga aaaagagtgg
8820aagactgcag tggaagctgt gaacgatcca aggttctggg ctctagtgga caaggaaaga
8880gagcaccacc tgagaggaga gtgtcagagc tgtgtgtaca acatgatggg aaaaagagaa
8940aagaaacaag gggaatttgg aaaggccaag ggcagccgcg ccatctggta tatgtggcta
9000ggggctagat tcctggagtt cgaagccctt ggattcttga acgaggatca ctggatgggg
9060agagagaatt caggaggtgg tgttgaaggg ctgggattac aaagactcgg atatgtccta
9120gaagagatga gtcgcatacc aggaggaagg atgtatgctg atgacacagc tggctgggac
9180acccgcatca gcaggtttga tctggagaat gaagctctaa tcaccaacca aatggagaaa
9240gggcacaggg ccttggcatt ggccataatc aagtacacat accaaaacaa agtggtaaag
9300gtcctcagac cagctgaaaa agggaagaca gttatggaca ttatttcaag acaagaccaa
9360agggggagcg gacaagttgt cacttacgct cttaatacat tcaccaacct ggtggtgcag
9420ctcattcgga atatggaggc tgaggaagtt ctagagatgc aagacttgtg gctgctgcgg
9480aggtcagaga aagtgaccaa ctggttgcag agcaacggat gggataggct caaacgaatg
9540gcagtcagtg gagatgattg cgttgtgaaa ccaattgatg ataggtttgc acatgccctc
9600aggttcttga atgatatggg aaaagtcagg aaggacacac aagagtggaa accctcaact
9660ggatgggaca attgggaaga agttccgttt tgctcccacc acttcaacaa gctccatctc
9720aaggacggga ggtccattgt ggttccctgc cgccaccaag atgaactgat tggccgagcc
9780cgcgtctcac caggggcggg atggagcatc cgggagactg cttgcctagc aaaatcatat
9840gcgcaaatgt ggcagctcct ttatttccac agaagggacc tccgactgat ggccaatgcc
9900atttgttcat ctgtgccagt tgactgggtt ccaactggga gaactacctg gtcaatccat
9960ggaaagggag aatggatgac cactgaagac atgcttgtgg tgtggaacag agtgtggatt
10020gaggagaacg accacatgga agacaagacc ccagttacga aatggacaga cattccctat
10080ttgggaaaaa gggaagactt gtggtgtgga tctctcatag ggcacagacc gcgcactacc
10140tgggctgaga acatcaaaaa cacagtcaac atgatgcgca ggatcatagg tgatgaagaa
10200aagtacatgg actacctatc cacccaagtt cgctacttgg gtgaagaagg gtccacaccc
10260ggagtgttgt aa
10272
SEQ ID NO: 16
Length: 8
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: Synthetic 8xHis tag
His His His His His His His His
1               5