Patent application title: NOVEL ADENOVIRUS
Inventors:
IPC8 Class: AA61K3904FI
USPC Class:
Class name:
Publication date: 2021-03-04
Patent application number: 20210060150
Abstract:
There are provided adenoviral vectors derived from a chimp adenovirus and
encoding a mycobacterial antigen, together with related aspects.
Claims:
1-95. (canceled)
96. An immunogenic composition for generating a T cell response in a subject, the immunogenic composition comprising: a first composition to be administered to the subject including a recombinant adenovirus comprising a polynucleotide comprising: (a) a first polynucleotide which encodes a polypeptide having the amino acid sequence according to SEQ ID NO: 1, (b) a second polynucleotide which encodes a polypeptide having the amino acid sequence according to SEQ ID NO: 5, (c) a third polynucleotide which encodes a polypeptide having the amino acid sequence according to SEQ ID NO: 3, and (d) a fourth polynucleotide which encodes a mycobacterial Rv1196 antigen consisting of the amino acid sequence of SEQ ID NO:68, wherein the fourth polynucleotide is operatively linked to one or more sequences which direct expression of said mycobacterial Rv1196 antigen in a host cell.
97. The immunogenic composition of claim 96, further comprising: a second composition to be administered after the first composition, said second composition including a further component comprising: (a) a non-adenoviral vector or a non-viral vector; or (b) a protein.
98. The immunogenic composition of claim 97, wherein the second composition includes a protein comprising the mycobacterial Rv1196 antigen.
99. The immunogenic composition of claim 96, further comprising the first composition and a pharmaceutically acceptable excipient.
100. The immunogenic composition of claim 97, further comprising the second composition and a pharmaceutically acceptable excipient.
101. The immunogenic composition of claim 96, wherein the recombinant adenovirus is replication-incompetent.
102. The adenovirus of claim 96, further comprising a fifth polynucleotide that consists of the amino acid sequence of SEQ ID NO: 70 and encodes a mycobacterial antigen.
103. The adenovirus of claim 96, wherein the one or more sequences which direct expression of said mycobacterial Rv1196 antigen in a host cell includes a promoter sequence, wherein the promoter sequence is selected from the group consisting of an internal promoter, a native promoter, an RSV LTR promoter, a CMV promoter, an SV40 promoter, a dihydrofolate reductase promoter, a β-actin promoter, a PGK promoter, an EF1a promoter and a CASI promoter.
104. The adenovirus according to claim 103, wherein the adenovirus has a non seroprevalence in human subjects.
105. The adenovirus according to claim 96, wherein the adenovirus is a chimpanzee adenovirus.
106. The composition of claim 98, further comprising an adjuvant.
107. The composition of claim 106, wherein the adjuvant is selected from the group consisting of: an inorganic adjuvant, an organic adjuvant, an oil-based adjuvant, a cytokine, a particulate adjuvant, a virosome, a bacterial adjuvant, a synthetic adjuvant, a synthetic polynucleotide adjuvant, or an immunostimulatory oligonucleotide containing unmethylated CpG dinucleotides.
108. The composition of claim 107, wherein the adjuvant is an organic adjuvant.
109. The composition of claim 108, wherein the organic adjuvant is a saponin.
110. A method for the prophylaxis of mycobacterial infection comprising administering the immunogenic composition according to claim 96.
111. A method for the prophylaxis of mycobacterial infection comprising administering the immunogenic composition according to claim 97.
112. The method of claim 111, wherein the second composition is co-administered or sequentially administered with the first composition.
113. A method for the treatment of mycobacterial infection comprising administering the immunogenic composition according to claim 96.
114. A method for the treatment of mycobacterial infection comprising administering the immunogenic composition according to claim 97.
115. The method of claim 114, wherein the second composition is co-administered or sequentially administered with the first composition.
Description:
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This is a Continuation of U.S. patent application Ser. No. 15/747,545, filed Jan. 25, 2018, which is the U.S. National Stage application submitted under 35 U.S.C. § 371 for International Application No. PCT/EP2016/067621, filed Jul. 25, 2016, which claims priority to Application No. GB 1513176.6, filed Jul. 27, 2015, all of which are incorporated herein by reference in their entireties.
SUBMISSION OF SEQUENCE LISTING ON ASCII TEXT FILE
[0002] The content of the following submission on ASCII text file is incorporated herein by reference in its entirety: a computer readable form (CRF) of the Sequence Listing (file name: VB65845AC1_US_Seq_Listing.txt; created 4 Aug. 2020; size: 437,640 bytes).
FIELD OF THE INVENTION
[0003] The present invention relates to adenoviral vectors encoding a mycobacterial antigen, more particularly to adenoviral vectors derived from chimp adenovirus ChAd155, and to related aspects.
BACKGROUND OF THE INVENTION
[0004] Tuberculosis (TB) is a chronic infectious disease caused by infection with Mycobacterium tuberculosis and other Mycobacterium species. It is a major disease in developing countries, as well as an increasing problem in developed areas of the world.
[0005] Vaccination is one of the most effective methods for preventing or treating infectious diseases. Adenovirus has been widely used for gene transfer applications due to its ability to achieve highly efficient gene transfer in a variety of target tissues and large transgene capacity. Conventionally, E1 genes of adenovirus are deleted and replaced with a transgene cassette consisting of the promoter of choice, cDNA sequence of the gene of interest and a poly A signal, resulting in a replication defective recombinant virus.
[0006] Recombinant adenoviruses are useful in gene therapy and as vaccines. Viral vectors based on chimpanzee adenovirus represent an alternative to the use of human derived adenoviral (Ad) vectors for the development of genetic vaccines. Adenoviruses isolated from chimpanzees are closely related to adenoviruses isolated from humans as demonstrated by their efficient propagation in cells of human origin. However, since human and chimp adenoviruses are close relatives, serologic cross reactivity between the two virus species is possible.
[0007] There is a demand for vectors which effectively deliver molecules to a target and minimize the effect of pre-existing immunity to selected adenovirus serotypes in the population. One aspect of pre-existing immunity that is observed in humans is humoral immunity, which can result in the production and persistence of antibodies that are specific for adenoviral proteins. The humoral response elicited by adenovirus is mainly directed against the three major structural capsid proteins: fiber, penton and hexon.
[0008] There remains a need for novel methods of immunising against tuberculosis, which are highly efficacious, safe, convenient, cost-effective, long-lasting and induce a broad spectrum of immune responses.
[0009] Vectors, compositions and methods of the present invention may have one or more of the following improved characteristics over the prior art, including but not limited to higher productivity, improved immunogenicity and increased transgene expression.
SUMMARY OF THE INVENTION
[0010] The present invention provides a recombinant adenovirus comprising at least one polynucleotide or polypeptide selected from the group consisting of:
[0011] (a) a polynucleotide which encodes a polypeptide having the amino acid sequence according to SEQ ID NO:1,
[0012] (b) a polynucleotide which encodes a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1, wherein the functional derivative has an amino acid sequence which is at least 80% identical over its entire length to the amino acid sequence of SEQ ID NO: 1,
[0013] (c) a polynucleotide which encodes a polypeptide having the amino acid sequence according to SEQ ID NO: 3,
[0014] (d) a polypeptide having the amino acid sequence according to SEQ ID NO: 1,
[0015] (e) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1, wherein the functional derivative has an amino acid sequence which is at least 80% identical over its entire length to the amino acid sequence of SEQ ID NO: 1, and
[0016] (f) a polypeptide having the amino acid sequence according to SEQ ID NO: 3,
[0017] wherein the adenovirus comprises a nucleic acid sequence encoding a mycobacterial antigen, wherein the nucleic acid sequence is operatively linked to one or more sequences which direct expression of said mycobacterial antigen in a host cell.
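By way of illustration only, the "at least 80% identical over its entire length" criterion of items (b) and (e) can be sketched as a global alignment followed by an identity count. The Python sketch below is a hypothetical example and not the method used in this application: the scoring values, the use of the alignment length as the denominator, the function names and the toy sequences are all assumptions, and the actual sequence of SEQ ID NO: 1 is not reproduced here.

```python
def global_align(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1):
    """Needleman-Wunsch global alignment; returns the two gapped strings.

    The scoring values are illustrative assumptions, not taken from the text.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(candidate: str, reference: str) -> float:
    """Identical aligned positions as a percentage of the full alignment length."""
    aligned_c, aligned_r = global_align(candidate, reference)
    matches = sum(1 for x, y in zip(aligned_c, aligned_r) if x == y and x != "-")
    return 100.0 * matches / len(aligned_r)


def meets_identity_threshold(candidate: str, reference: str, threshold: float = 80.0) -> bool:
    """True if the candidate is at least `threshold` percent identical to the reference."""
    return percent_identity(candidate, reference) >= threshold


# Toy sequences for demonstration only (hypothetical, not from the Sequence Listing).
TOY_REFERENCE = "ACDEFGHIKL"
TOY_VARIANT = "ACDEFGHIKV"      # one substitution
TOY_TRUNCATED = "ACDEFGHIK"     # one residue missing at the C-terminus
```

In this sketch a gap counts against identity because the denominator is the full alignment length; other conventions (for example, dividing by the reference length) are equally possible and would need to be stated explicitly.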
[0018] Also provided is a composition comprising the recombinant adenovirus and a pharmaceutically acceptable excipient.
[0019] The recombinant adenovirus and compositions may be used as medicaments, in particular for the stimulation of an immune response against mycobacterial infection, such as Mycobacterium tuberculosis infection.
DESCRIPTION OF THE FIGURES
[0020] FIG. 1A-C--Alignment of fiber protein sequences from the indicated simian adenoviruses.
[0021] ChAd3 (SEQ ID NO:27)
[0022] PanAd3 (SEQ ID NO:28)
[0023] ChAd17 (SEQ ID NO:29)
[0024] ChAd19 (SEQ ID NO:30)
[0025] ChAd24 (SEQ ID NO:31)
[0026] ChAd155 (SEQ ID NO:1)
[0027] ChAd11 (SEQ ID NO:32)
[0028] ChAd20 (SEQ ID NO:33)
[0029] ChAd31 (SEQ ID NO:34)
[0030] PanAd1 (SEQ ID NO:35)
[0031] PanAd2 (SEQ ID NO:36)
[0032] FIG. 2--Flow diagram for production of specific ChAd155 BAC and plasmid vectors
[0033] FIG. 3--Species C BAC Shuttle #1365 schematic
[0034] FIG. 4--pArsChAd155 Ad5E4orf6-2 (#1490) schematic
[0035] FIG. 5--pChAd155/RSV schematic
[0036] FIG. 6--BAC ChAd155/RSV schematic
[0037] FIG. 7--Productivity of ChAd3 and ChAd155 vectors expressing an HIV Gag transgene (Experiment 1)
[0038] FIG. 8--Productivity of ChAd3 and ChAd155 vectors expressing an HIV Gag transgene (Experiment 2)
[0039] FIG. 9--Productivity of PanAd3 and ChAd155 vectors expressing RSV transgene
[0040] FIG. 10--Expression levels of ChAd3 and ChAd155 vectors expressing an HIV Gag transgene
[0041] FIG. 11--Expression levels of PanAd3 and ChAd155 vectors expressing an HIV Gag transgene--Western Blot
[0042] FIG. 12--Immunogenicity of ChAd3 and ChAd155 vectors expressing an HIV Gag transgene--IFN-gamma ELISpot
[0043] FIG. 13--Immunogenicity of PanAd3 and ChAd155 vectors expressing an HIV Gag transgene--IFN-gamma ELISpot
DESCRIPTION OF THE SEQUENCES
[0044] SEQ ID NO: 1--Polypeptide sequence of ChAd155 fiber
[0045] SEQ ID NO: 2--Polynucleotide sequence encoding ChAd155 fiber
[0046] SEQ ID NO: 3--Polypeptide sequence of ChAd155 penton
[0047] SEQ ID NO: 4--Polynucleotide sequence encoding ChAd155 penton
[0048] SEQ ID NO: 5--Polypeptide sequence of ChAd155 hexon
[0049] SEQ ID NO: 6--Polynucleotide sequence encoding ChAd155 hexon
[0050] SEQ ID NO: 7--Polynucleotide sequence encoding ChAd155#1434
[0051] SEQ ID NO: 8--Polynucleotide sequence encoding ChAd155#1390
[0052] SEQ ID NO: 9--Polynucleotide sequence encoding ChAd155#1375
[0053] SEQ ID NO: 10--Polynucleotide sequence encoding wild type ChAd155
[0054] SEQ ID NO: 11--Polynucleotide sequence encoding ChAd155/RSV
[0055] SEQ ID NO: 12--Polynucleotide sequence encoding the CASI promoter
[0056] SEQ ID NO: 13--Ad5orf6 primer 1 polynucleotide sequence
[0057] SEQ ID NO: 14--Ad5orf6 primer 2 polynucleotide sequence
[0058] SEQ ID NO: 15--BAC/CHAd155 ΔE1_TetO hCMV RpsL-Kana primer 1 polynucleotide sequence
[0059] SEQ ID NO: 16--BAC/CHAd155 ΔE1_TetO hCMV RpsL-Kana (#1375) primer 2 polynucleotide sequence
[0060] SEQ ID NO: 17--1021-FW E4 Del Step1 primer polynucleotide sequence
[0061] SEQ ID NO: 18--1022-RW E4 Del Step1 primer polynucleotide sequence
[0062] SEQ ID NO: 19--1025-FW E4 Del Step2 primer polynucleotide sequence
[0063] SEQ ID NO: 20--1026-RW E4 Del Step2 primer polynucleotide sequence
[0064] SEQ ID NO: 21--91-SubMonte FW primer polynucleotide sequence
[0065] SEQ ID NO: 22--890-BghPolyA RW primer polynucleotide sequence
[0066] SEQ ID NO: 23--CMVfor primer polynucleotide sequence
[0067] SEQ ID NO: 24--CMVrev primer polynucleotide sequence
[0068] SEQ ID NO: 25--CMVFAM-TAMRA qPCR probe polynucleotide sequence
[0069] SEQ ID NO: 26--Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE) polynucleotide sequence
[0070] SEQ ID NO: 27--Amino acid sequence for the fiber protein of ChAd3
[0071] SEQ ID NO: 28--Amino acid sequence for the fiber protein of PanAd3
[0072] SEQ ID NO: 29--Amino acid sequence for the fiber protein of ChAd17
[0073] SEQ ID NO: 30--Amino acid sequence for the fiber protein of ChAd19
[0074] SEQ ID NO: 31--Amino acid sequence for the fiber protein of ChAd24
[0075] SEQ ID NO: 32--Amino acid sequence for the fiber protein of ChAd11
[0076] SEQ ID NO: 33--Amino acid sequence for the fiber protein of ChAd20
[0077] SEQ ID NO: 34--Amino acid sequence for the fiber protein of ChAd31
[0078] SEQ ID NO: 35--Amino acid sequence for the fiber protein of PanAd1
[0079] SEQ ID NO: 36--Amino acid sequence for the fiber protein of PanAd2
[0080] SEQ ID NO: 37--RSV FΔTM amino acid sequence
[0081] SEQ ID NO: 38--HIV Gag polynucleotide sequence
[0082] SEQ ID NO: 39--polypeptide sequence of Rv1174 (Mtb8.4)
[0083] SEQ ID NO: 40--polypeptide sequence of Rv0287 (Mtb9.8)
[0084] SEQ ID NO: 41--polypeptide sequence of Rv1793 (Mtb9.9)
[0085] SEQ ID NO: 42--polypeptide sequence of Rv0915 (Mtb41)
[0086] SEQ ID NO: 43--polypeptide sequence of Rv3875 (ESAT-6)
[0087] SEQ ID NO: 44--polypeptide sequence of Rv3804 (Ag85A)
[0088] SEQ ID NO: 45--polypeptide sequence of Rv1886 (Ag85B)
[0089] SEQ ID NO: 46--polypeptide sequence of Rv2031 (alpha-crystallin)
[0090] SEQ ID NO: 47--polypeptide sequence of Rv1980 (Mpt64)
[0091] SEQ ID NO: 48--polypeptide sequence of Rv0288 (TB10.4)
[0092] SEQ ID NO: 49--polypeptide sequence of Rv1753
[0093] SEQ ID NO: 50--polypeptide sequence of Rv2386
[0094] SEQ ID NO: 51--polypeptide sequence of Rv3616 (Mtb40)
[0095] SEQ ID NO: 52--polypeptide sequence of Rv3407
[0096] SEQ ID NO: 53--polypeptide sequence of Rv2660
[0097] SEQ ID NO: 54--polypeptide sequence of Rv2608
[0098] SEQ ID NO: 55--polypeptide sequence of Rv3619
[0099] SEQ ID NO: 56--polypeptide sequence of Rv3620
[0100] SEQ ID NO: 57--polypeptide sequence of Rv1813
[0101] SEQ ID NO: 58--polypeptide sequence of Rv1009 (RpfB)
[0102] SEQ ID NO: 59--polypeptide sequence of Rv2389 (RpfD)
[0103] SEQ ID NO: 60--polypeptide sequence of Rv2626
[0104] SEQ ID NO: 61--polypeptide sequence of Rv1733
[0105] SEQ ID NO: 62--polypeptide sequence of Rv3136 (PPE51)
[0106] SEQ ID NO: 63--polypeptide sequence of Rv0475 (HBHA)
[0107] SEQ ID NO: 64--polypeptide sequence of Rv0125
[0108] SEQ ID NO: 65--polypeptide sequence of Ser/Ala mutated mature Rv0125
[0109] SEQ ID NO: 66--polypeptide sequence of Ra12
[0110] SEQ ID NO: 67--polypeptide sequence of Ra35
[0111] SEQ ID NO: 68--polypeptide sequence of Rv1196
[0112] SEQ ID NO: 69--polypeptide sequence of Mtb72f
[0113] SEQ ID NO: 70--polypeptide sequence of M72
DETAILED DESCRIPTION OF THE INVENTION
[0114] Tuberculosis
[0115] Tuberculosis (TB) is a chronic infectious disease caused by infection with Mycobacterium tuberculosis and other Mycobacterium species. It is a major disease in developing countries, as well as an increasing problem in developed areas of the world. About one third of the world's population are believed to be latently infected with TB bacilli, with about 9 million new cases of active TB and 1.5 million deaths each year. Around 10% of those infected with TB bacilli will develop active TB, each person with active TB infecting an average of 10 to 15 others per year. (World Health Organisation Tuberculosis Facts 2014).
[0116] Mycobacterium tuberculosis infects individuals through the respiratory route. Alveolar macrophages engulf the bacterium, but it is able to survive and proliferate by inhibiting phagosome fusion with acidic lysosomes. A complex immune response involving CD4+ and CD8+ T cells ensues, ultimately resulting in the formation of a granuloma. Central to the success of Mycobacterium tuberculosis as a pathogen is the fact that the isolated, but not eradicated, bacterium may persist for long periods, leaving an individual vulnerable to the later development of active TB.
[0117] Fewer than 5% of infected individuals develop active TB in the first years after infection. The granuloma can persist for decades and is believed to contain live Mycobacterium tuberculosis in a state of dormancy, deprived of oxygen and nutrients. However, it has been suggested that the majority of the bacteria in the dormancy state are located in non-macrophage cell types spread throughout the body (Locht et al, Expert Opin. Biol. Ther. 2007 7(11):1665-1677). The development of active TB occurs when the balance between the host's natural immunity and the pathogen changes, for example as a result of an immunosuppressive event (Anderson P Trends in Microbiology 2007 15(1):7-13; Ehlers S Infection 2009 37(2):87-95).
[0118] A dynamic hypothesis describing the balance between latent TB and active TB has also been proposed (Cardona P-J Inflammation & Allergy--Drug Targets 2006 6:27-39; Cardona P-J Infection 2009 37(2):80-86).
[0119] Although an infection may be asymptomatic for a considerable period of time, the active disease is most commonly manifested as an acute inflammation of the lungs, resulting in tiredness, weight loss, fever and a persistent cough. If untreated, serious complications and death typically result.
[0120] Tuberculosis can generally be controlled using extended antibiotic therapy, although such treatment is not sufficient to prevent the spread of the disease. Actively infected individuals may be largely asymptomatic, but contagious, for some time. In addition, although compliance with the treatment regimen (which typically lasts 6 months or more) is critical, patient behaviour is difficult to monitor. Some patients do not complete the course of treatment, which can lead to ineffective treatment and the development of drug resistance.
[0121] Multidrug-resistant TB (MDR-TB) is a form which fails to respond to first line medications. An estimated 480,000 people developed MDR-TB in 2013. MDR-TB is treatable by using second-line drugs. However, second-line treatment options are limited and recommended medicines are not always available. The extensive chemotherapy required (up to two years of treatment) is costly and can produce severe adverse drug reactions in patients.
[0122] Extensively drug-resistant TB (XDR-TB) occurs when resistance to second line medications develops on top of resistance to first line medications. It is estimated that about 9.0% of MDR-TB cases had XDR-TB (World Health Organisation Tuberculosis Facts 2014).
[0123] Even if a full course of antibiotic treatment is completed, infection with M. tuberculosis may not be eradicated from the infected individual and may remain as a latent infection that can be reactivated. Consequently, accurate and early diagnosis of the disease is of utmost importance.
[0124] Currently, vaccination with attenuated live bacteria is the most widely used method for inducing protective immunity. The most common Mycobacterium employed for this purpose is Bacillus Calmette-Guerin (BCG), an avirulent strain of M. bovis which was first developed over 60 years ago. It is administered at birth in TB endemic regions. However, the safety and efficacy of BCG are a source of controversy--while protecting against severe disease manifestation in children, the efficacy of BCG against disease in adults is variable. Additionally, some countries, such as the United States, do not vaccinate the general public with this agent.
[0125] Several of the proteins which are strongly expressed during the early stages of Mycobacterium infection have been shown to provide protective efficacy in animal vaccination models. However, vaccination with antigens which are highly expressed during the early stages of infection may not provide an optimal immune response for dealing with later stages of infection. Adequate control during latent infection may require T cells which are specific for the particular antigens which are expressed at that time. Post-exposure vaccines which directly target the dormant persistent bacteria may aid in protecting against TB reactivation, thereby enhancing TB control, or even enabling clearance of the infection. A vaccine targeting latent TB could therefore significantly and economically reduce global TB infection rates.
[0126] Vaccines based on late stage antigens could also be utilised in combination with early stage antigens to provide a multiphase vaccine. Alternatively, early and/or late stage antigens could be used to complement and improve BCG vaccination.
[0127] Typically, the aim of the methods of the invention is to induce a protective immune response, i.e. immunise or vaccinate the subject against a related pathogen. The invention may therefore be applied for the prophylaxis, treatment or amelioration of infection by mycobacteria, such as infection by Mycobacterium bovis or Mycobacterium tuberculosis, in particular Mycobacterium tuberculosis.
[0128] The invention may be provided for the purpose of:
[0129] prophylaxis of active tuberculosis due to infection (i.e. primary tuberculosis) or reactivation (i.e. secondary tuberculosis), such as by administering to a subject who is uninfected, or alternatively a subject who has a latent infection;
[0130] prophylaxis of latent tuberculosis, such as by administering to a subject who is uninfected;
[0131] treating latent tuberculosis;
[0132] preventing or delaying reactivation of tuberculosis, especially the delay of reactivation, for example by a period of months, years or indefinitely; or
[0133] treating active tuberculosis (such as to reduce the need for chemotherapeutic treatment: such as reduced term of chemotherapeutic treatment, complexity of drug regimen or dosage of chemotherapeutic treatment; alternatively, to reduce the risk of a later relapse following chemotherapeutic treatment).
[0134] The elicited immune response may be an antigen specific T cell response (which may be a systemic and/or a local response). Systemic responses may be detected, for example, from a sample of whole blood. Local responses (for example, the local response in the lung) may be detected from an appropriate sample of tissue (for example, lung tissue) or other locally focused sampling method (e.g. bronchoalveolar lavage). The antigen specific T cell response may comprise a CD4+ T cell response, such as a response involving CD4+ T cells expressing a plurality of cytokines (e.g. IFNgamma, TNFalpha or IL2, especially IFNgamma, TNFalpha and IL2). Alternatively, or additionally, the antigen specific T cell response comprises a CD8+ T cell response, such as a response involving CD8+ T cells expressing a plurality of cytokines (e.g. IFNgamma, TNFalpha or IL2, especially IFNgamma, TNFalpha and IL2).
[0135] The term "active infection" refers to an infection, e.g. infection by M. tuberculosis, with manifested disease symptoms and/or lesions, suitably with manifested disease symptoms.
[0136] The terms "inactive infection", "dormant infection" or "latent infection" or "latent tuberculosis" refer to an infection, e.g. infection by M. tuberculosis, without manifested disease symptoms and/or lesions, suitably without manifested disease symptoms. A subject with latent infection will suitably be one which tests positive for infection, e.g. by Tuberculin skin test (TST) or Interferon-Gamma Release Assays (IGRAs), but which has not demonstrated the disease symptoms and/or lesions which are associated with an active infection.
[0137] The term "primary tuberculosis" refers to clinical illness, e.g., manifestation of disease symptoms, directly following infection, e.g. infection by M. tuberculosis. See, Harrison's Principles of Internal Medicine, Chapter 150, pp. 953-966 (16th ed., Braunwald, et al., eds., 2005).
[0138] The terms "secondary tuberculosis" or "postprimary tuberculosis" refer to the reactivation of a dormant, inactive or latent infection, e.g. infection by M. tuberculosis. See, Harrison's Principles of Internal Medicine, Chapter 150, pp. 953-966 (16th ed., Braunwald, et al., eds., 2005).
[0139] The term "tuberculosis reactivation" refers to the later manifestation of disease symptoms in an individual that tests positive for infection (e.g. by Tuberculin skin test (TST) or Interferon-Gamma Release Assays (IGRAs)) but does not have apparent disease symptoms. Suitably the individual will not have been re-exposed to infection. The positive diagnostic test indicates that the individual is infected, however, the individual may or may not have previously manifested active disease symptoms that had been treated sufficiently to bring the tuberculosis into an inactive or latent state.
[0140] Suitably the methods are applied to a subject who is uninfected or who has a latent infection by mycobacteria, such as infection by Mycobacterium tuberculosis. In one embodiment the methods are applied to a subject who does not have an infection by Mycobacterium tuberculosis (in the context of human subjects) or Mycobacterium bovis (in the context of bovine subjects). In another embodiment the methods are applied to a subject who has a latent infection by mycobacteria, such as Mycobacterium tuberculosis (in the context of human subjects) or Mycobacterium bovis (in the context of bovine subjects).
[0141] In some embodiments, the subject has previously been vaccinated with BCG. The approaches of the present invention may, for example, be utilised for a subject at least one year after BCG vaccination, for example at least two years after BCG vaccination, such as at least five years after BCG vaccination.
[0142] In some embodiments, the subject has previously been infected with M. tuberculosis.
[0143] Antigens of Use in the Invention
[0144] The mycobacterial antigen is an antigenic sequence (i.e. a sequence from a mycobacterial protein which comprises at least one B or T cell epitope). Suitably the mycobacterial antigen comprises at least one T cell epitope.
[0145] Mycobacterial antigens of particular interest in the present invention are those derived from:
[0146] (i) Mtb8.4 (also known as DPV and Rv1174), the polypeptide sequence of which is described in SEQ ID NO: 102 of WO97/09428 (cDNA in SEQ ID NO: 101) and in Coler et al Journal of Immunology 1998 161:2356-2364. Of particular interest is the mature Mtb8.4 sequence which is absent the leading signal peptide (i.e. amino acid residues 15-96 from SEQ ID NO: 102 of WO97/09428). The full-length polypeptide sequence of Rv1174 is shown in SEQ ID NO: 39;
[0147] (ii) Mtb9.8 (also known as MSL and Rv0287), the polypeptide sequence of which is described in SEQ ID NO: 109 of WO98/53075 (fragments of MSL are disclosed in SEQ ID NOs: 110-124 of WO98/53075, SEQ ID NOs: 119 and 120 being of particular interest) and also in Coler et al Vaccine 2009 27:223-233 (in particular the reactive fragments shown in FIG. 2 therein). The full-length polypeptide sequence for Rv0287 is shown in SEQ ID NO: 40;
[0148] (iii) Mtb9.9 (also known as Mtb9.9A, MTI, MTI-A and Rv1793) the polypeptide sequence of which is described in SEQ ID NO: 19 of WO98/53075 and in Alderson et al Journal of Experimental Medicine 2000 7:551-559 (fragments of MTI are disclosed in SEQ ID NOs: 17 and 51-66 of WO98/53075, SEQ ID NOs: 17, 51, 52, 53, 56 and 62-65 being of particular interest). A number of polypeptide variants of MTI are described in SEQ ID NOs: 21, 23, 25, 27, 29 and 31 of WO98/53075 and in Alderson et al Journal of Experimental Medicine 2000 7:551-559. The full-length polypeptide sequence for Rv1793 is shown in SEQ ID NO: 41;
[0149] (iv) Mtb41 (also known as MTCC2 and Rv0915) the polypeptide sequence of which is described in SEQ ID NO: 142 of WO98/53075 (cDNA in SEQ ID NO: 140) and in Skeiky et al Journal of Immunology 2000 165:7140-7149. The full-length polypeptide sequence for Rv0915 is shown in SEQ ID NO: 42;
[0150] (v) ESAT-6 (also known as esxA and Rv3875) the polypeptide sequence of which is described in SEQ ID NO: 103 of WO97/09428 (cDNA in SEQ ID NO: 104) and in Sorensen et al Infection and Immunity 1995 63(5):1710-1717. The full-length polypeptide sequence for Rv3875 is shown in SEQ ID NO: 43;
[0151] (vi) Ag85 complex antigens (e.g. Ag85A, also known as fbpA and Rv3804; or Ag85B, also known as fbpB and Rv1886) which are discussed, for example, in Content et al Infection and Immunity 1991 59:3205-3212 and in Huygen et al Nature Medicine 1996 2(8):893-898. The full-length polypeptide sequence for Rv3804/Ag85A is shown in SEQ ID NO: 44 (the mature protein of residues 43-338, i.e. lacking the signal peptide, being of particular interest). The full-length polypeptide sequence for Ag85B is shown in SEQ ID NO: 45 (the mature protein of residues 41-325, i.e. lacking the signal peptide, being of particular interest);
[0152] (vii) Alpha-crystallin (also known as hspX and Rv2031) which is described in Verbon et al Journal of Bacteriology 1992 174:1352-1359 and Friscia et al Clinical and Experimental Immunology 1995 102:53-57 (of particular interest are the fragments corresponding to residues 71-91, 21-40, 91-110 and 111-130). The full-length polypeptide sequence for Rv2031 is shown in SEQ ID NO: 46;
[0153] (viii) Mpt64 (also known as Rv1980) which is described in Roche et al Scandinavian Journal of Immunology 1996 43:662-670. The full-length polypeptide sequence for Mpt64 is shown in SEQ ID NO: 47 (the mature protein of residues 24-228, i.e. lacking the signal peptide, being of particular interest);
[0154] (ix) TB10.4 (also known as cfp7 and Rv0288), described for example in Skjot et al. Infect Immun 2002 70:5446-5453, Dietrich et al. J Immunol 2005 174:6332-6339 and Elvang et al. PLoS One 2009;4(4):e5139. The full-length polypeptide sequence for Rv0288 is shown in SEQ ID NO: 48;
[0155] (x) Rv1753, such as described in Seq ID NOs: 1 and 2-7 of WO2010010180. The full-length polypeptide sequence for Rv1753 from Mycobacterium tuberculosis strain C is shown in SEQ ID NO: 49;
[0156] (xi) Rv2386c, Seq ID NOs: 1 and 2-7 of WO2010010179. The full-length polypeptide sequence for Rv2386c from Mycobacterium tuberculosis H37Rv is shown in SEQ ID NO: 50; and/or
[0157] (xii) Mtb40 (also known as HTCC1 and Rv3616) such as described in WO2011092253, for example a natural Rv3616 sequence selected from Seq ID NOs: 1 and 2-7 of WO2011092253 or a modified Rv3616 sequence such as those selected from Seq ID NOs: 161 to 169, 179 and 180 of WO2011092253. Rv3616 fragments selected from SEQ ID NO: 127, 128, 130, 131-133, 135, 143-148 or 150-156 of WO2011092253 are of particular interest. The full-length polypeptide sequence for Rv3616 is shown in SEQ ID NO: 51;
[0158] (xiii) Rv3407, the full-length polypeptide sequence for Rv3407 is shown in SEQ ID NO: 52;
[0159] (xiv) Rv2660, the full-length polypeptide sequence for Rv2660 is shown in SEQ ID NO: 53;
[0160] (xv) Rv2608, the full-length polypeptide sequence for Rv2608 is shown in SEQ ID NO: 54;
[0161] (xvi) Rv3619, the full-length polypeptide sequence for Rv3619 is shown in SEQ ID NO: 55;
[0162] (xvii) Rv3620, the full-length polypeptide sequence for Rv3620 is shown in SEQ ID NO: 56;
[0163] (xviii) Rv1813, the full-length polypeptide sequence for Rv1813 is shown in SEQ ID NO: 57;
[0164] (xix) Rv1009, also known as RpfB, the full-length polypeptide sequence for Rv1009 is shown in SEQ ID NO: 58;
[0165] (xx) Rv2389, also known as RpfD, the full-length polypeptide sequence for Rv2389 is shown in SEQ ID NO: 59;
[0166] (xxi) Rv2626, the full-length polypeptide sequence for Rv2626 is shown in SEQ ID NO: 60;
[0167] (xxii) Rv1733, the full-length polypeptide sequence for Rv1733 is shown in SEQ ID NO: 61;
[0169] (xxiii) Rv3136, also known as PPE51, the full-length polypeptide sequence for Rv3136 is shown in SEQ ID NO: 62;
[0170] (xxiv) Rv0475, also known as HBHA and described in WO97044463, WO03044048 and WO2010149657, the full-length polypeptide sequence for Rv0475 is shown in SEQ ID NO: 63;
[0171] (xxv) Mtb32A (also known as Rv0125), the polypeptide sequence of which is described in SEQ ID NO: 2 (full-length) and residues 8-330 of SEQ ID NO: 4 (mature) of WO01/98460, especially variants having at least one of the catalytic triad mutated (e.g. the catalytic serine residue, which may for example be mutated to alanine). The mature polypeptide sequence for Rv0125 is shown in SEQ ID NO: 64. The mature form of Mtb32A having a Ser/Ala mutation is shown in SEQ ID NO: 65. Fragments of Rv0125 of particular interest include Ra12 (also known as Mtb32A C-terminal antigen), the polypeptide sequence of which is described in SEQ ID NO: 10 of WO01/98460 and in Skeiky et al Journal of Immunology 2004 172:7618-7628. The full-length polypeptide sequence for Ra12 is shown in SEQ ID NO: 66. Another fragment of Rv0125 of particular interest is Ra35 (also known as Mtb32A N-terminal antigen), the polypeptide sequence of which is described in SEQ ID NO: 8 of WO01/98460 and in Skeiky et al Journal of Immunology 2004 172:7618-7628. The full-length polypeptide sequence for Ra35 is shown in SEQ ID NO: 67; and
[0172] (xxvi) TbH9 (also known as Mtb39, Mtb39A, TbH9FL and Rv1196), the polypeptide sequence of which is described in SEQ ID NO: 107 of WO97/09428, and also in Dillon et al Infection and Immunity 1999 67(6):2941-2950 and Skeiky et al Journal of Immunology 2004 172:7618-7628. The full-length polypeptide sequence for Rv1196 is shown in SEQ ID NO: 68.
[0173] Combinations of mycobacterial antigens may also be utilised. In such cases the mycobacterial antigens may be encoded individually or as part of one or more fusion proteins. "Fusion polypeptide" or "fusion protein" refers to a protein having at least two heterologous polypeptides (e.g. at least two Mycobacterium sp. polypeptides) covalently linked, either directly or via an amino acid linker. The polypeptides of the fusion protein can be in any order.
[0174] Combinations of antigens which may be encoded (suitably in the form of a single fusion protein) include:
[0175] (a) an Ag85B sequence and an ESAT-6 sequence (such as the H1 fusion protein antigen);
[0176] (b) an Ag85B sequence and a TB10.4 sequence (such as the H4 fusion protein antigen);
[0177] (c) an Ag85B sequence, an ESAT-6 sequence and an Rv2660 sequence (such as the H56 fusion protein antigen);
[0178] (d) an Rv2608 sequence, an Rv3619 sequence, an Rv3620 sequence and an Rv1813 sequence (such as the ID93 fusion protein antigen);
[0179] (e) an Ag85A sequence, an Ag85B sequence and an Rv3407 sequence;
[0180] (f) an Rv1733 sequence and an Rv2626 sequence; such as:
[0181] (f-i) an Rv1733 sequence, an Rv2626 sequence and an ESAT-6 sequence;
[0182] (f-i-a) an Rv1733 sequence, an Rv2626 sequence, an ESAT-6 sequence and an RpfD sequence; such as:
[0183] (f-i-a-i) an Rv1733 sequence, an Rv2626 sequence, an ESAT-6 sequence, an RpfD sequence and an Ag85B sequence;
[0184] (f-i-b) an Rv1733 sequence, an Rv2626 sequence, an ESAT-6 sequence and an RpfB sequence; such as:
[0185] (f-i-b-i) an Rv1733 sequence, an Rv2626 sequence, an ESAT-6 sequence, an RpfB sequence and an Ag85B sequence;
[0186] (f-ii) an Rv1733 sequence, an Rv2626 sequence and an Rv3407 sequence; such as:
[0187] (f-ii-a) an Rv1733 sequence, an Rv2626 sequence, an Rv3407 sequence and an RpfD sequence;
[0188] (f-ii-b) an Rv1733 sequence, an Rv2626 sequence, an Rv3407 sequence and an RpfB sequence;
[0189] (f-iii) an Rv1733 sequence, an Rv2626 sequence and a PPE51 sequence; such as:
[0190] (f-iii-a) an Rv1733 sequence, an Rv2626 sequence, a PPE51 sequence and an RpfD sequence;
[0191] (f-iii-b) an Rv1733 sequence, an Rv2626 sequence, a PPE51 sequence and an RpfB sequence.
[0192] The skilled person will recognise that combinations need not rely upon the specific sequences described above in (i)-(xxvi), and that variants (having at least 80% identity, such as at least 90% identity, in particular at least 95% identity and especially at least 98% identity) or immunogenic fragments (e.g. at least 20% of the full length antigen, such as at least 40% of the antigen, in particular at least 50% and especially at least 75%) of the described sequences can alternatively be used.
[0193] The adenovirus may therefore encode any of combinations (a) to (f), such as in a single fusion protein, wherein each of the components is a variant (having at least 80% identity, such as at least 90% identity, in particular at least 95% identity and especially at least 98% identity) or is an immunogenic fragment (e.g. at least 20% of the full length antigen, such as at least 40% of the antigen, in particular at least 50% and especially at least 75%) of the described sequences (which may be in any order).
[0194] Particularly suitable combinations are those comprising an Rv1196 sequence and an Rv0125 sequence, such as the Mtb72f fusion protein antigen or M72 fusion protein antigen.
[0195] Rv1196 (described, for example, by the name Mtb39a in Dillon et al Infection and Immunity 1999 67(6): 2941-2950) is highly conserved, with 100% sequence identity across H37Rv, C, Haarlem, CDC1551, 94-M4241A, 98-R604INH-RIF-EM, KZN605, KZN1435, KZN4207, KZNR506 strains, the F11 strain having a single point mutation Q30K (most other clinical isolates have in excess of 90% identity to H37Rv). An adenovirus encoding an Rv1196 related antigen is described in Lewinsohn et al Am J Respir Crit Care Med 2002 166:843-848.
[0196] Rv0125 (described, for example, by the name Mtb32a in Skeiky et al Infection and Immunity 1999 67(8): 3998-4007) is also highly conserved, with 100% sequence identity across many strains. An adenovirus (human Ad5) encoding an Rv0125 related antigen is described in Zhang et al Human Vaccines & Therapeutics 2015 11(7):1803-1813 doi: 10.1080/21645515.2015.1042193. Full length Rv0125 includes an N-terminal signal sequence which is cleaved to provide the mature protein.
[0197] Mtb72f has been shown to provide protection in a number of animal models (see, for example: Brandt et al Infect. Immun. 2004 72(11):6622-6632; Skeiky et al J. Immunol. 2004 172:7618-7628; Tsenova et al Infect. Immun. 2006 74(4):2392-2401). Mtb72f has also been the subject of clinical investigations (Von Eschen et al 2009 Human Vaccines 5(7):475-482). M72 is an improved antigen which incorporates a single serine to alanine mutation relative to Mtb72f, resulting in improved stability characteristics. M72 related antigens have also been shown to be of value in a latent TB model (international patent application WO2006/117240, incorporated herein by reference). Previous pre-clinical and clinical investigations have led to M72 being administered in humans in conjunction with the immunostimulants 3-O-deacylated monophosphoryl lipid A (3D-MPL) and QS21 in a liposomal formulation and in a 0,1 month schedule using 10 ug M72 polypeptide, 25 ug 3D-MPL and 25 ug QS21 (see, for example, Leroux-Roels et al Vaccine 2013 31:2196-2206; Montoya et al J. Clin. Immunol. 2013 33(8):1360-1375; Thacher EG et al AIDS 2014 28(12):1769-1781; Idoko OT et al Tuberculosis (Edinb) 2014 94(6):564-578; Penn-Nicholson A, et al Vaccine 2015 33(32):4025-4034 doi:10.1016/j.vaccine.2015.05.088).
[0198] In an embodiment of the invention the adenovirus comprises a nucleic acid sequence encoding a mycobacterial antigen derived from at least one of Rv0125, Rv0287, Rv0288, Rv0475, Rv0915, Rv1009, Rv1174, Rv1196, Rv1733, Rv1753, Rv1793, Rv1813, Rv1886, Rv1980, Rv2031, Rv2386, Rv2389, Rv2608, Rv2626, Rv2660, Rv3136, Rv3407, Rv3616, Rv3619, Rv3620, Rv3804 and Rv3875, in particular at least one of Rv0125, Rv0287, Rv0915, Rv1174, Rv1196, Rv1753, Rv1793, Rv2386 and Rv3616, especially at least one of Rv0125 and Rv1196.
[0199] By the term RvNNNN is meant the protein encoded by the gene number NNNN identified from the H37Rv strain of Mycobacterium tuberculosis or a homologue thereof from another mycobacterium, such as Mycobacterium bovis, or in particular from another strain of Mycobacterium tuberculosis. Sequences for proteins from H37Rv are known in the art and may be obtained, for example, from Tuberculist (tuberculist.epfl.ch/).
[0200] The adenovirus may comprise a nucleic acid sequence encoding a mycobacterial antigen which comprises at least one of Rv0125, Rv0287, Rv0288, Rv0475, Rv0915, Rv1009, Rv1174, Rv1196, Rv1733, Rv1753, Rv1793, Rv1813, Rv1886, Rv1980, Rv2031, Rv2386, Rv2389, Rv2608, Rv2626, Rv2660, Rv3136, Rv3407, Rv3616, Rv3619, Rv3620, Rv3804 and Rv3875, in particular at least one of Rv0125, Rv0287, Rv0915, Rv1174, Rv1196, Rv1753, Rv1793, Rv2386 and Rv3616, especially at least one of Rv0125 and Rv1196.
[0201] The adenovirus may comprise a nucleic acid sequence encoding a mycobacterial antigen which comprises a sequence which has at least 80% identity to Rv0125, Rv0287, Rv0288, Rv0475, Rv0915, Rv1009, Rv1174, Rv1196, Rv1733, Rv1753, Rv1793, Rv1813, Rv1886, Rv1980, Rv2031, Rv2386, Rv2389, Rv2608, Rv2626, Rv2660, Rv3136, Rv3407, Rv3616, Rv3619, Rv3620, Rv3804 or Rv3875, in particular Rv0125, Rv0287, Rv0915, Rv1174, Rv1196, Rv1753, Rv1793, Rv2386 or Rv3616, especially at least one of Rv0125 or Rv1196. Suitably the mycobacterial antigen comprises a sequence which has at least 90% identity to the reference sequence, in particular at least 95%, such as at least 98%.
[0202] The adenovirus may comprise a nucleic acid sequence encoding a mycobacterial antigen which comprises a sequence which is an immunogenic fragment (such as comprising one or more T cell epitopes) of Rv0125, Rv0287, Rv0288, Rv0475, Rv0915, Rv1009, Rv1174, Rv1196, Rv1733, Rv1753, Rv1793, Rv1813, Rv1886, Rv1980, Rv2031, Rv2386, Rv2389, Rv2608, Rv2626, Rv2660, Rv3136, Rv3407, Rv3616, Rv3619, Rv3620, Rv3804 or Rv3875, in particular Rv0125, Rv0287, Rv0915, Rv1174, Rv1196, Rv1753, Rv1793, Rv2386 or Rv3616, especially at least one of Rv0125 or Rv1196. Suitably the mycobacterial antigen comprises a sequence which is an immunogenic fragment of at least 20 amino acid residues, such as at least 50 amino acid residues, from the reference sequence. Alternatively, the mycobacterial antigen comprises a sequence which is an immunogenic fragment of at least 20% of the total length of the reference sequence, such as at least 30% of the total length of the reference sequence, in particular at least 40% of the total length of the reference sequence, especially at least 50% of the total length of the reference sequence, such as at least 75% of the total length of the reference sequence.
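The two alternative fragment-length criteria above (an absolute minimum number of residues, or a minimum share of the reference length) can be sketched as a simple check. This is an illustrative sketch only: the function name and its parameters are hypothetical, the defaults are the lowest thresholds mentioned in the text (20 residues, 20%), and the check says nothing about whether a fragment is actually immunogenic.

```python
def fragment_meets_thresholds(fragment_len, reference_len,
                              min_residues=20, min_percent=20):
    """Return (meets_absolute, meets_relative) for a candidate fragment.

    meets_absolute: fragment has at least `min_residues` residues.
    meets_relative: fragment covers at least `min_percent` % of the reference.
    Length only; immunogenicity must be established experimentally.
    """
    if reference_len <= 0 or fragment_len < 0:
        raise ValueError("lengths must be non-negative and reference non-empty")
    if fragment_len > reference_len:
        raise ValueError("a fragment cannot be longer than its reference")
    meets_absolute = fragment_len >= min_residues
    # Integer arithmetic avoids floating-point edge cases at the boundary.
    meets_relative = fragment_len * 100 >= min_percent * reference_len
    return meets_absolute, meets_relative
```

For example, a 50-residue fragment of a 400-residue reference satisfies the absolute criterion but covers only 12.5% of the reference, so it fails the 20% criterion.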
[0203] In an embodiment of the invention the adenovirus comprises a nucleic acid sequence encoding a mycobacterial antigen derived from at least one of SEQ ID NOs: 39 to 68, in particular SEQ ID NOs: 65 to 68 and especially SEQ ID NO: 70.
[0204] Each of the above individual antigen sequences is also disclosed in Cole et al Nature 1998 393:537-544 and Camus Microbiology 2002 148:2967-2973. The genome of M. tuberculosis H37Rv is publicly available, for example at the Wellcome Trust Sanger Institute website (world wide web sanger.ac.uk/Projects/M_tuberculosis/) and elsewhere.
[0205] Suitably the mycobacterial antigen contains 2500 amino acid residues or fewer, such as 1500 amino acid residues or fewer, in particular 1200 amino acid residues or fewer, especially 1000 amino acid residues or fewer, typically 800 amino acid residues or fewer.
[0206] T cell epitopes are short contiguous stretches of amino acids which are recognised by T cells (e.g. CD4+ or CD8+ T cells). Identification of T cell epitopes may be achieved through epitope mapping experiments which are known to the person skilled in the art (see, for example, Paul, Fundamental Immunology, 3rd ed., 243-247 (1993); Beißbarth et al Bioinformatics 2005 21(Suppl. 1):i29-i37). In a diverse out-bred population, such as humans, different HLA types mean that particular epitopes may not be recognised by all members of the population. As a result of the crucial involvement of the T cell response in tuberculosis, to maximise the level of recognition and scale of immune response, an immunogenic derivative of a reference sequence is desirably one which contains the majority (or suitably all) T cell epitopes intact. Mortier et al BMC Immunology 2015 16:63 undertake sequence conservation analysis and in silico human leukocyte antigen-peptide binding predictions for Mtb72f and M72 tuberculosis candidate vaccine antigens.
[0207] The skilled person will recognise that individual substitutions, deletions or additions to a protein which alter, add or delete a single amino acid or a small percentage of amino acids yield an "immunogenic derivative" where the alteration(s) results in the substitution of an amino acid with a functionally similar amino acid or the substitution/deletion/addition of residues which do not substantially impact the immunogenic function.
[0208] Conservative substitution tables providing functionally similar amino acids are well known in the art. In general, such conservative substitutions will fall within one of the amino-acid groupings specified below, though in some circumstances other substitutions may be possible without substantially affecting the immunogenic properties of the antigen. The following eight groups each contain amino acids that are typically conservative substitutions for one another:
[0209] 1) Alanine (A), Glycine (G);
[0210] 2) Aspartic acid (D), Glutamic acid (E);
[0211] 3) Asparagine (N), Glutamine (Q);
[0212] 4) Arginine (R), Lysine (K);
[0213] 5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V);
[0214] 6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W);
[0215] 7) Serine (S), Threonine (T); and
[0216] 8) Cysteine (C), Methionine (M) (see, e.g., Creighton, Proteins 1984).
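The eight groupings above can be encoded as a small lookup, sketched below for illustration. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical; the groups use standard one-letter codes, and methionine appears in two groups exactly as in the list. The sketch only tests group membership and does not assess any effect on immunogenic properties.

```python
# One-letter codes for the eight groupings listed above.
CONSERVATIVE_GROUPS = (
    frozenset("AG"),    # 1) Alanine, Glycine
    frozenset("DE"),    # 2) Aspartic acid, Glutamic acid
    frozenset("NQ"),    # 3) Asparagine, Glutamine
    frozenset("RK"),    # 4) Arginine, Lysine
    frozenset("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    frozenset("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    frozenset("ST"),    # 7) Serine, Threonine
    frozenset("CM"),    # 8) Cysteine, Methionine
)


def is_conservative(original, replacement):
    """True if both residues differ and share at least one grouping."""
    a, b = original.upper(), replacement.upper()
    if len(a) != 1 or len(b) != 1:
        raise ValueError("expected single one-letter residue codes")
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)
```

Because methionine belongs to groups 5 and 8, `is_conservative("M", "L")` and `is_conservative("M", "C")` both hold, whereas `is_conservative("C", "L")` does not.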
[0217] Suitably such substitutions do not occur in the region of an epitope, and do not therefore have a significant impact on the immunogenic properties of the antigen.
[0218] Immunogenic derivatives may also include those wherein additional amino acids are inserted compared to the reference sequence. Suitably such insertions do not occur in the region of an epitope, and do not therefore have a significant impact on the immunogenic properties of the antigen. One example of insertions includes a short stretch of histidine residues (e.g. 2-6 residues) to aid expression and/or purification of the antigen in question.
[0219] Immunogenic derivatives include those wherein amino acids have been deleted compared to the reference sequence. Suitably such deletions do not occur in the region of an epitope, and do not therefore have a significant impact on the immunogenic properties of the antigen. The skilled person will recognise that a particular immunogenic derivative may comprise substitutions, deletions and additions (or any combination thereof).
[0220] The terms "identical" or percentage "identity," in the context of two or more antigen sequences, refer to two or more sequences or sub-sequences that are the same or have a specified percentage of amino acid residues that are the same, when compared and aligned for maximum correspondence over a comparison window, or designated region as measured using one of the following sequence comparison algorithms or by manual alignment and visual inspection. This definition also refers to the complement of a test sequence. Optionally, the identity exists over a region that is at least 200 amino acids in length, such as at least 300 amino acids or at least 400 amino acids. Suitably, the comparison is performed over a window corresponding to the entire length of the reference sequence (as opposed to the derivative sequence).
[0221] For sequence comparison, one sequence acts as the reference sequence, to which the test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are entered into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. Default program parameters can be used, or alternative parameters can be designated. The sequence comparison algorithm then calculates the percentage sequence identities for the test sequences relative to the reference sequence, based on the program parameters.
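As an illustration of calculating percentage identity over a window corresponding to the entire length of the reference sequence, the following minimal sketch takes two sequences that have already been aligned (for example by one of the algorithms discussed in this section) and counts identical residues relative to the reference length. The function name `percent_identity` and the use of "-" as the gap character are assumptions made for the example; real alignment programs may use a different denominator.

```python
def percent_identity(aligned_reference, aligned_test, gap="-"):
    """Percent of reference residues matched by the test sequence.

    Both inputs are rows of one pairwise alignment (equal length, gaps as
    `gap`). The denominator is the ungapped reference length, so deletions
    in the test sequence count against identity.
    """
    if len(aligned_reference) != len(aligned_test):
        raise ValueError("aligned sequences must have equal length")
    reference_length = sum(1 for r in aligned_reference if r != gap)
    if reference_length == 0:
        raise ValueError("reference contains no residues")
    matches = sum(
        1
        for r, t in zip(aligned_reference, aligned_test)
        if r != gap and r == t
    )
    return 100.0 * matches / reference_length
```

With the aligned pair `MKT-AYIA` (reference) and `MKTGAY-A` (test), six of the seven reference residues are matched, giving roughly 85.7% identity.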
[0222] A "comparison window", as used herein, refers to a segment in which a sequence may be compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. Methods of alignment of sequences for comparison are well-known in the art. Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerised implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by manual alignment and visual inspection (see, e.g., Current Protocols in Molecular Biology (Ausubel et al., eds. 1995 supplement)).
[0223] One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng & Doolittle, J. Mol. Evol. 35:351-360 (1987). The method used is similar to the method described by Higgins & Sharp, CABIOS 5:151-153 (1989). The program can align up to 300 sequences, each of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences. This cluster is then aligned to the next most related sequence or cluster of aligned sequences. Two clusters of sequences are aligned by a simple extension of the pairwise alignment of two individual sequences. The final alignment is achieved by a series of progressive, pairwise alignments. The program is run by designating specific sequences and their amino acid coordinates for regions of sequence comparison and by designating the program parameters. Using PILEUP, a reference sequence is compared to other test sequences to determine the percent sequence identity relationship using the following parameters: default gap weight (3.00), default gap length weight (0.10), and weighted end gaps. PILEUP can be obtained from the GCG sequence analysis software package, e.g., version 7.0 (Devereux et al. Nuc. Acids Res. 12:387-395 (1984)).
[0224] Other examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al., Nuc. Acids Res. 25:3389-3402 (1997) and Altschul et al., J. Mol. Biol. 215:403-410 (1990), respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (website at world wide web ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighbourhood word score threshold (Altschul et al., supra). These initial neighbourhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)), alignments (B) of 50, expectation (E) of 10, M=5, N=-4, and a comparison of both strands.
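The three halting conditions for extending a word hit can be illustrated with a simplified, ungapped extension in one direction for nucleotide sequences. This is a sketch under stated assumptions and not the BLAST implementation: the function name `extend_right` is hypothetical, the seed word is assumed to be an exact match, the default M=5 and N=-4 are the values given above, and the default X of 20 is an arbitrary illustrative value (the text does not specify one).

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=5, mismatch=-4, x_drop=20):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, best_length), where best_length counts aligned
    positions from the start of the word up to the best-scoring position.
    """
    score = match * word_len          # assumed exact-match seed word
    best_score, best_len = score, word_len
    i, j, length = q_start + word_len, s_start + word_len, word_len
    # Halting condition 3: stop at the end of either sequence.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        length += 1
        if score > best_score:
            best_score, best_len = score, length
        # Halting condition 1: score fell by X from its maximum.
        # Halting condition 2: cumulative score is zero or below.
        if best_score - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best_score, best_len
```

Extension to the left would follow the same logic with decreasing indices. A small X stops the search at the first mismatch cluster, while a larger X lets the extension continue through a mismatch when later matches recover the score, which is the sensitivity and speed trade-off described above.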
[0225] The BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.2, more preferably less than about 0.01, and most preferably less than about 0.001.
[0226] In any event, immunogenic derivatives of a polypeptide sequence will usually have essentially the same activity as the reference sequence. By essentially the same activity is meant at least 50%, suitably at least 75% and especially at least 90% activity of the reference sequence in an in vitro restimulation assay of PBMC, whole blood, lung tissue or bronchoalveolar lavage with specific antigens (e.g. restimulation for a period of between several hours to up to two weeks, such as up to one day, 1 day to 1 week or 1 to 2 weeks) that measures the activation of the cells via lymphoproliferation, production of cytokines in the supernatant of culture (measured by ELISA, CBA etc) or characterisation of T and B cell responses by intra and extracellular staining (e.g. using antibodies specific to immune markers, such as CD3, CD4, CD8, IL2, TNF-alpha, IFN-gamma, IL-17, CD40L, CD69 etc) followed by analysis with a flow cytometer. Suitably, by essentially the same activity is meant at least 50%, suitably at least 75% and especially at least 90% activity of the reference sequence in a T cell proliferation and/or IFN-gamma production assay.
[0227] In one embodiment the encoded antigen is a Rv1196 related antigen. The term `Rv1196 related antigen` refers to the Rv1196 protein provided in SEQ ID NO: 68 or an immunogenic derivative thereof. As used herein the term "derivative" refers to an antigen that is modified relative to the reference sequence. Immunogenic derivatives are sufficiently similar to the reference sequence to substantially retain the immunogenic properties of the reference sequence and remain capable of allowing an immune response to be raised against the reference sequence. An immunogenic derivative may, for example, comprise a modified version of the reference sequence or alternatively may consist of a modified version of the reference sequence.
[0228] The Rv1196 related antigen may for example contain 2500 amino acid residues or fewer, such as 1500 amino acid residues or fewer, in particular 1200 amino acid residues or fewer, especially 1000 amino acid residues or fewer, typically 800 amino acid residues or fewer.
[0229] Suitably the Rv1196 related antigen will comprise, such as consist of, a sequence having at least 70% identity to SEQ ID NO: 68, such as at least 80%, in particular at least 90%, especially at least 95%, for example at least 98%, such as at least 99%.
[0230] A specific example of an Rv1196 related antigen is Rv1196 from Mycobacterium tuberculosis strain H37Rv, as provided in SEQ ID NO: 68. Consequently, in one embodiment of the invention the Rv1196 related antigen is a protein comprising SEQ ID NO: 68. In a second embodiment of the invention the Rv1196 related antigen is a protein consisting of SEQ ID NO: 68.
[0231] Typical Rv1196 related antigens will comprise (such as consist of) an immunogenic derivative of SEQ ID NO: 68 having a small number of deletions, insertions and/or substitutions. Examples are those having deletions of up to 5 residues at 0-5 locations, insertions of up to 5 residues at 0-5 locations and substitution of up to 20 residues.
[0232] Other immunogenic derivatives of Rv1196 are those comprising (such as consisting of) a fragment of SEQ ID NO: 68 which is at least 200 amino acids in length, such as at least 250 amino acids in length, in particular at least 300 amino acids in length, especially at least 350 amino acids in length.
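The example limits on deletions, insertions and substitutions can be checked against a pairwise alignment of a derivative with its reference sequence, as sketched below. The sketch assumes one reading of the wording, namely that each deletion or insertion site contains at most 5 residues and that there are at most 5 such sites of each kind; the function names and the "-" gap character are hypothetical choices for illustration.

```python
import re


def summarize_changes(aligned_reference, aligned_derivative, gap="-"):
    """Return (deletion_runs, insertion_runs, substitutions).

    Runs are lists of gap-run lengths: gaps in the derivative row are
    deletions, gaps in the reference row are insertions.
    """
    if len(aligned_reference) != len(aligned_derivative):
        raise ValueError("aligned sequences must have equal length")
    gap_run = re.compile(re.escape(gap) + "+")
    deletion_runs = [len(m.group()) for m in gap_run.finditer(aligned_derivative)]
    insertion_runs = [len(m.group()) for m in gap_run.finditer(aligned_reference)]
    substitutions = sum(
        1
        for r, d in zip(aligned_reference, aligned_derivative)
        if r != gap and d != gap and r != d
    )
    return deletion_runs, insertion_runs, substitutions


def within_example_limits(aligned_reference, aligned_derivative,
                          max_run=5, max_locations=5, max_substitutions=20):
    """True if the derivative stays within the example limits (assumed reading)."""
    dels, ins, subs = summarize_changes(aligned_reference, aligned_derivative)
    return (
        len(dels) <= max_locations and all(n <= max_run for n in dels)
        and len(ins) <= max_locations and all(n <= max_run for n in ins)
        and subs <= max_substitutions
    )
```

The result depends on the alignment supplied, since different alignments of the same two sequences can place gaps differently.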
[0233] In one embodiment the polypeptide antigen and the encoded antigen are Rv0125 related antigens. The term `Rv0125 related antigen` refers to the Rv0125 protein provided in SEQ ID NO: 64 or an immunogenic derivative thereof. As used herein the term "derivative" refers to an antigen that is modified relative to the reference sequence. Immunogenic derivatives are sufficiently similar to the reference sequence to substantially retain the immunogenic properties of the reference sequence and remain capable of allowing an immune response to be raised against the reference sequence. An immunogenic derivative may, for example, comprise a modified version of the reference sequence or alternatively may consist of a modified version of the reference sequence.
[0234] The Rv0125 related antigen may for example contain 2500 amino acid residues or fewer, such as 1500 amino acid residues or fewer, in particular 1200 amino acid residues or fewer, especially 1000 amino acid residues or fewer, typically 800 amino acid residues or fewer.
[0235] Suitably the Rv0125 related antigen will comprise, such as consist of, a sequence having at least 70% identity to SEQ ID NO: 64, such as at least 80%, in particular at least 90%, especially at least 95%, for example at least 98%, such as at least 99%.
[0236] A specific example of an Rv0125 related antigen is Rv0125 from Mycobacterium tuberculosis strain H37Rv, as provided in SEQ ID NO: 64. Consequently, in one embodiment of the invention the Rv0125 related antigen is a protein comprising SEQ ID NO: 64. In a second embodiment of the invention the Rv0125 related antigen is a protein consisting of SEQ ID NO: 64.
[0237] Typical Rv0125 related antigens will comprise (such as consist of) an immunogenic derivative of SEQ ID NO: 64 having a small number of deletions, insertions and/or substitutions. Examples are those having deletions of up to 5 residues at 0-5 locations, insertions of up to 5 residues at 0-5 locations and substitution of up to 20 residues.
[0238] Other immunogenic derivatives of Rv0125 are those comprising (such as consisting of) a fragment of SEQ ID NO: 64 which is at least 150 amino acids in length, such as at least 200 amino acids in length, in particular at least 250 amino acids in length, especially at least 300 amino acids in length. Particular immunogenic derivatives of Rv0125 are those comprising (such as consisting of) the fragment of SEQ ID NO: 64 corresponding to residues 1-195 of SEQ ID NO: 64. Further immunogenic derivatives of Rv0125 are those comprising (such as consisting of) the fragment of SEQ ID NO: 64 corresponding to residues 192-323 of SEQ ID NO: 64.
[0239] Particularly preferred Rv0125 related antigens are derivatives of SEQ ID NO: 64 wherein at least one (for example one, two or even all three) of the catalytic triad have been substituted or deleted, such that the protease activity has been reduced and the protein more easily produced. The catalytic serine residue may be deleted or substituted (e.g. substituted with alanine) and/or the catalytic histidine residue may be deleted or substituted and/or the catalytic aspartic acid residue may be deleted or substituted. Especially of interest are derivatives of SEQ ID NO: 64 wherein the catalytic serine residue has been substituted (e.g. substituted with alanine). Also of interest are Rv0125 related antigens which comprise, such as consist of, a sequence having at least 70% identity to SEQ ID NO: 64, such as at least 80%, in particular at least 90%, especially at least 95%, for example at least 98%, such as at least 99% and wherein at least one of the catalytic triad have been substituted or deleted or those comprising, such as consisting of, a fragment of SEQ ID NO: 64 which is at least 150 amino acids in length, such as at least 200 amino acids in length, in particular at least 250 amino acids in length, especially at least 300 amino acids in length and wherein at least one of the catalytic triad have been substituted or deleted. Further immunogenic derivatives of Rv0125 are those comprising (such as consisting of) the fragment of SEQ ID NO: 64 corresponding to residues 192-323 of SEQ ID NO: 64 wherein at least one (for example one, two or even all three) of the catalytic triad have been substituted or deleted. Particular immunogenic derivatives of Rv0125 are those comprising (such as consisting of) the fragment of SEQ ID NO: 64 corresponding to residues 1-195 of SEQ ID NO: 64 wherein the catalytic serine residue (position 176 of SEQ ID NO: 64) has been substituted (e.g. substituted with alanine).
[0240] In certain embodiments the mycobacterial antigen is an Rv1196 and Rv0125 related antigen, such as an M72 related antigen. Particular derivatives of the M72 protein include those with additional His residues at the N-terminus (e.g. two His residues; or a polyhistidine tag of five or particularly six His residues, which may be used for nickel affinity purification). Mtb72f, which contains the original serine residue that has been mutated in M72, is a further derivative of M72, as are Mtb72f proteins with additional His residues at the N-terminus (e.g. two His residues; or a polyhistidine tag of five or particularly six His residues, which may be used for nickel affinity purification).
[0241] In some embodiments a single adenovirus may encode two distinct polypeptides, one being a Rv1196 related antigen and one being a Rv0125 related antigen.
[0242] Suitably an M72 related antigen will comprise, such as consist of, a sequence having at least 70% identity to SEQ ID NO. 70, such as at least 80%, in particular at least 90%, especially at least 95%, such as at least 98%, for example at least 99%.
[0243] Typical M72 related antigens will comprise, such as consist of, a derivative of SEQ ID NO: 70 having a small number of deletions, insertions and/or substitutions. Examples are those having deletions of up to 5 residues at 0-5 locations, insertions of up to 5 residues at 0-5 locations and substitution of up to 20 residues.
[0244] Other derivatives of M72 are those comprising, such as consisting of, a fragment of SEQ ID NO: 70 which is at least 450 amino acids in length, such as at least 500 amino acids in length, such as at least 550 amino acids in length, such as at least 600 amino acids in length, such as at least 650 amino acids in length or at least 700 amino acids in length. As M72 is a fusion protein derived from two individual antigens, any fragment of at least 450 residues will comprise a plurality of epitopes from the full length sequence (Skeiky et al J. Immunol. 2004 172:7618-7628; Skeiky Infect. Immun. 1999 67(8):3998-4007; Dillon Infect Immun. 1999 67(6):2941-2950).
[0245] In particular embodiments the M72 related antigen will comprise residues 2-723 of SEQ ID NO. 70, for example comprise (or consist of) SEQ ID NO. 70.
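As a purely illustrative sketch, the fragment-length and residue-range language above can be expressed in Python as follows. Residue numbering in the specification is 1-based and inclusive, whereas Python slices are 0-based and end-exclusive, so the helper converts between the two. The helper names and the short toy sequence are hypothetical; the toy sequence is a placeholder and is not SEQ ID NO: 70.

```python
def residues(seq, start, end):
    """Return residues start..end of seq using 1-based inclusive numbering."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("residue range lies outside the sequence")
    return seq[start - 1:end]


def is_qualifying_fragment(candidate, reference, min_length):
    """True if candidate is a contiguous stretch of reference that is
    at least min_length residues long."""
    return len(candidate) >= min_length and candidate in reference


# Placeholder 16-residue sequence used only to exercise the helpers.
toy_reference = "MHHTAASDNFQLSQGG"

print(residues(toy_reference, 2, 5))                          # residues 2-5
print(is_qualifying_fragment("TAASDNFQ", toy_reference, 8))   # 8-residue fragment
print(is_qualifying_fragment("TAASDNFQ", toy_reference, 9))   # too short for 9
```

With the real sequence, `residues(seq, 2, 723)` would correspond to the residue range recited above, and `min_length` would take the values 450, 500, 550, 600, 650 or 700.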
[0246] Adenovirus
[0247] Adenoviruses have a characteristic morphology with an icosahedral capsid comprising three major proteins, hexon (II), penton base (III) and a knobbed fiber (IV), along with a number of other minor proteins, VI, VIII, IX, IIIa and IVa2. The virus genome is a linear, double-stranded DNA. The virus DNA is intimately associated with the highly basic protein VII and a small peptide pX (formerly termed mu). Another protein, V, is packaged with this DNA-protein complex and provides a structural link to the capsid via protein VI. The virus also contains a virus-encoded protease, which is necessary for processing of some of the structural proteins to produce mature infectious virus.
[0248] The adenoviral genome is well characterized. There is general conservation in the overall organization of the adenoviral genome with respect to specific open reading frames being similarly positioned, e.g. the location of the E1A, E1B, E2A, E2B, E3, E4, L1, L2, L3, L4 and L5 genes of each virus. Each extremity of the adenoviral genome comprises a sequence known as an inverted terminal repeat (ITR), which is necessary for viral replication. The virus also comprises a virus-encoded protease, which is necessary for processing some of the structural proteins required to produce infectious virions. The structure of the adenoviral genome is described on the basis of the order in which the viral genes are expressed following host cell transduction. More specifically, the viral genes are referred to as early (E) or late (L) genes according to whether transcription occurs prior to or after onset of DNA replication. In the early phase of transduction, the E1A, E1B, E2A, E2B, E3 and E4 genes of adenovirus are expressed to prepare the host cell for viral replication. During the late phase of infection, expression of the late genes L1-L5, which encode the structural components of the virus particles, is activated.
[0249] Adenoviruses are species-specific and different serotypes, i.e., types of viruses that are not cross-neutralized by antibodies, have been isolated from a variety of mammalian species. For example, more than 50 serotypes have been isolated from humans which are divided into six subgroups (A-F: B is subdivided into B1 and B2) based on sequence homology and on their ability to agglutinate red blood cells (Tatsis and Ertl Molecular Therapy (2004) 10:616-629). Numerous adenoviruses have been isolated from nonhuman simians such as chimpanzees, bonobos, rhesus macaques and gorillas, and they are classified into the same human groups according to phylogenetic relationships based on hexon or fiber sequences (Colloca et al. (2012) Science Translational Medicine 4:1-9; Roy et al. (2004) Virology 324: 361-372; Roy et al. (2010) Journal of Gene Medicine 13:17-25).
[0250] Adenovirus Capsid Proteins Including the Fiber Protein and Polynucleotides Encoding These Proteins
[0251] As outlined above, the adenoviral capsid comprises three major proteins, hexon, penton and fiber. The hexon accounts for the majority of the structural components of the capsid, which consists of 240 trimeric hexon capsomeres and 12 penton bases. The hexon has three conserved double barrels, while the top has three towers, each tower containing a loop from each subunit that forms most of the capsid. The base of hexon is highly conserved between adenoviral serotypes, while the surface loops are variable (Tatsis and Ertl Molecular Therapy (2004) 10:616-629).
[0252] Penton is another adenoviral capsid protein that forms a pentameric base to which fiber attaches. The trimeric fiber protein protrudes from the penton base at each of the 12 vertices of the capsid and is a knobbed rod-like structure. A remarkable difference in the surface of adenovirus capsids compared to that of most other icosahedral viruses is the presence of the long, thin fiber protein. The primary role of the fiber protein is the tethering of the viral capsid to the cell surface via its interaction with a cellular receptor.
[0253] The fiber proteins of many adenovirus serotypes share a common architecture: an N-terminal tail, a central shaft made of repeating sequences, and a C-terminal globular knob domain (or "head"). The central shaft domain consists of a variable number of beta-repeats. The beta-repeats connect to form an elongated structure of three intertwined spiralling strands that is highly rigid and stable. The shaft connects the N-terminal tail with the globular knob structure, which is responsible for interaction with the target cellular receptor. The globular nature of the adenovirus knob domain presents large surfaces for binding the receptor laterally and apically. The effect of this architecture is to project the receptor-binding site far from the virus capsid, thus freeing the virus from steric constraints presented by the relatively flat capsid surface.
[0254] Although fibers of many adenovirus serotypes have the same overall architecture, they have variable amino acid sequences that influence their function as well as structure. For example, a number of exposed regions on the surface of the fiber knob present an easily adaptable receptor binding site. The globular shape of the fiber knob allows receptors to bind at the sides of the knob or on top of the fiber knob. These binding sites typically lie on surface-exposed loops connecting beta-strands that are poorly conserved among human adenoviruses. The exposed side chains on these loops give the knob a variety of surface features while preserving the tertiary and quaternary structure. For example, the electrostatic potential and charge distributions at the knob surfaces can vary due to the wide range of isoelectric points in the fiber knob sequences, from pI approximately 9 for Ad 8, Ad 19, and Ad 37 to approximately 5 for subgroup B adenoviruses. As a structurally complex virus ligand, the fiber protein allows the presentation of a variety of binding surfaces (knob) in a number of orientations and distances (shaft) from the viral capsid.
[0255] One of the most obvious variations between some serotypes is fiber length. Studies have shown that the length of the fiber shaft strongly influences the interaction of the knob and the virus with its target receptors. Further, fiber proteins between serotypes can also vary in their ability to bend. Although beta-repeats in the shaft form a highly stable and regular structure, electron microscopy (EM) studies have shown distinct hinges in the fiber. Analysis of the protein sequence from several adenovirus serotype fibers pinpoints a disruption in the repeating sequences of the shaft at the third beta-repeat from the N-terminal tail, which correlates strongly with one of the hinges in the shaft, as seen by EM. The hinges in the fiber allow the knob to adopt a variety of orientations relative to the virus capsid, which may circumvent steric hindrances to receptor engagement requiring the correct presentation of the receptor binding site on the knob. For example, the rigid fibers of subgroup D Ads thus require a flexible receptor or one prepositioned for virus attachment, as they are unable to bend themselves (Nicklin et al Molecular Therapy (2005) 12:384-393).
[0256] The identification of specific cell receptors for different Ad serotypes and the knowledge of how they contribute to tissue tropism have been achieved through the use of fiber pseudotyping technology. Although Ads of some subgroups use CAR as a primary receptor, it is becoming clear that many Ads use alternate primary receptors, leading to vastly different tropism in vitro and in vivo. The fibers of these serotypes show clear differences in their primary and tertiary structures, such as fiber shaft rigidity, the length of the fiber shaft, and the lack of a CAR binding site and/or the putative HSPG binding motif, together with the differences in net charge within the fiber knob. Pseudotyping Ad 5 particles with an alternate fiber shaft and knob therefore provides an opportunity to remove important cell binding domains and, in addition, may allow more efficient (and potentially more cell-selective) transgene delivery to defined cell types compared to that achieved with Ad 5. Neutralization of fiber-pseudotyped Ad particles may also be reduced if the fibers used are from Ads with lower seroprevalence in humans or experimental models, a situation that favours successful administration of the vector (Nicklin et al Molecular Therapy (2005) 12:384-393). Furthermore, full length fiber as well as isolated fiber knob regions, but not hexon or penton alone, are capable of inducing dendritic cell maturation and are associated with induction of a potent CD8+ T cell response (Molinier-Frenkel et al. J. Biol. Chem. (2003) 278:37175-37182). Taken together, adenoviral fiber plays an important role in at least receptor-binding and immunogenicity of adenoviral vectors.
[0257] Illustrating the differences between the fiber proteins of Group C simian adenoviruses is the alignment provided in FIG. 1. A striking feature is that the fiber sequences of these adenoviruses can be broadly grouped into having a long fiber, such as ChAd155, or a short fiber, such as ChAd3. This length differential is due to a 36 amino acid deletion at approximately position 321 in the short fiber relative to the long fiber. In addition, there are a number of amino acid substitutions that differ between the short versus long fiber subgroup yet are consistent within each subgroup. While the exact function of these differences has not yet been elucidated, given the function and immunogenicity of fiber, they are likely to be significant. It has been shown that one of the determinants of viral tropism is the length of the fiber shaft. It has been demonstrated that an Ad5 vector with a shorter shaft has a lower efficiency of binding to CAR receptor and a lower infectivity (Ambriović-Ristov A. et al.: Virology. (2003) 312(2):425-33). It has been speculated that this impairment is the result of an increased rigidity of the shorter fiber leading to a less efficient attachment to the cell receptor (Wu, E et al.: J Virol. (2003) 77(13): 7225-7235). These studies may explain the improved properties of ChAd155 carrying a longer and more flexible fiber in comparison with the previously described ChAd3 and PanAd3 carrying a fiber with a shorter shaft.
[0258] In one aspect of the invention there are provided isolated fiber, penton and hexon capsid polypeptides of chimp adenovirus ChAd155 and isolated polynucleotides encoding the fiber, penton and hexon capsid polypeptides of chimp adenovirus ChAd155.
[0259] All three capsid proteins are expected to contribute to low seroprevalence and can, thus, be used independently from each other or in combination to suppress the affinity of an adenovirus to preexisting neutralizing antibodies, e.g. to manufacture a recombinant adenovirus with a reduced seroprevalence. Such a recombinant adenovirus may be a chimeric adenovirus with capsid proteins from different serotypes with at least a fiber protein from ChAd155.
[0260] The ChAd155 fiber polypeptide sequence is provided in SEQ ID NO: 1.
[0261] The ChAd155 penton polypeptide sequence is provided in SEQ ID NO: 3.
[0262] The ChAd155 hexon polypeptide sequence is provided in SEQ ID NO: 5.
[0263] Recombinant Adenoviruses or Compositions Comprising Polypeptide Sequences of ChAd155 Fiber or a Functional Derivative Thereof
[0264] Suitably the recombinant adenovirus or composition of the invention comprises a polypeptide having the amino acid sequence according to SEQ ID NO: 1.
[0265] Suitably the recombinant adenovirus or composition of the invention comprises a polypeptide which is a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1, wherein the functional derivative has an amino acid sequence which is at least 80% identical over its entire length to the amino acid sequence of SEQ ID NO: 1. Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1 has an amino acid sequence which is at least 80% identical, such as at least 85.0% identical, such as at least 90% identical, such as at least 91.0% identical, such as at least 93.0% identical, such as at least 95.0% identical, such as at least 97.0% identical, such as at least 98.0% identical, such as at least 99.0% identical, such as at least 99.2% identical, such as at least 99.4% identical, such as 99.5% identical, such as at least 99.6% identical, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 1. Alternatively the functional derivative has no more than 130, more suitably no more than 120, more suitably no more than 110, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 1.
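For illustration, the alternative limit expressed as a maximum number of additions, deletions or substitutions can be sketched in Python using the Levenshtein edit distance, which counts the minimum number of single-residue changes needed to convert one sequence into another. Treating each single-residue change as one event is an assumption made for this example; the specification does not prescribe a counting method. The function names and toy sequences are hypothetical and are not taken from SEQ ID NO: 1.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-residue additions,
    deletions or substitutions needed to turn sequence a into sequence b."""
    previous = list(range(len(b) + 1))
    for i, residue_a in enumerate(a, start=1):
        current = [i]
        for j, residue_b in enumerate(b, start=1):
            cost = 0 if residue_a == residue_b else 1
            current.append(min(
                previous[j] + 1,         # delete residue_a
                current[j - 1] + 1,      # add residue_b
                previous[j - 1] + cost,  # keep or substitute
            ))
        previous = current
    return previous[-1]


def within_limit(candidate, reference, max_changes):
    """True if candidate differs from reference by at most max_changes."""
    return edit_distance(reference, candidate) <= max_changes


# Toy sequences: one substitution (A to T) and one deletion (one D removed).
toy_reference = "MKRARLEDDF"
toy_candidate = "MKRTRLEDF"
print(edit_distance(toy_reference, toy_candidate))
print(within_limit(toy_candidate, toy_reference, 5))
```

The same check applies to the limits recited for the other capsid polypeptides by substituting the relevant reference sequence and maximum count.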
[0266] Suitably the recombinant adenovirus or composition according to the invention further comprises:
[0267] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 3; or
[0268] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 3, wherein the functional derivative has an amino acid sequence which is at least 50.0% identical over its entire length to the amino acid sequence of SEQ ID NO: 3, and/or
[0269] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 5; or
[0270] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 5, wherein the functional derivative has an amino acid sequence which is at least 50% identical over its entire length to the amino acid sequence of SEQ ID NO: 5.
[0271] Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 3 has an amino acid sequence which is at least 60.0%, such as at least 70.0%, such as at least 80.0%, such as at least 85.0%, such as at least 90.0%, such as at least 91.0% identical, such as at least 93.0% identical, such as at least 95.0% identical, such as at least 97.0% identical, such as at least 98.0% identical, such as at least 99.0%, such as at least 99.2%, such as at least 99.4%, such as 99.5% identical, such as at least 99.6%, such as 99.7% identical, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 3. Alternatively the functional derivative has no more than 300, more suitably no more than 250, more suitably no more than 200, more suitably no more than 150, more suitably no more than 125, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 3.
[0272] Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 5 has an amino acid sequence which is at least 60.0%, such as at least 70.0%, such as at least 80.0%, such as at least 85.0%, such as at least 90.0%, such as at least 91.0% identical, such as at least 93.0% identical, such as at least 95.0% identical, such as at least 97.0% identical, such as at least 98.0% identical, such as at least 99.0%, such as at least 99.2%, such as at least 99.4%, such as 99.5% identical, such as at least 99.6%, such as 99.7% identical, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 5. Alternatively the functional derivative has no more than 500, more suitably no more than 450, more suitably no more than 400, more suitably no more than 300, more suitably no more than 250, more suitably no more than 200, more suitably no more than 150, more suitably no more than 125, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 5.
[0273] Recombinant Adenoviruses or Compositions Comprising Polypeptide Sequences of ChAd155 Penton
[0274] Suitably the recombinant adenovirus or composition of the invention comprises a polypeptide having the amino acid sequence according to SEQ ID NO: 3.
[0275] Suitably the recombinant adenovirus or composition of the invention further comprises:
[0276] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 1; or
[0277] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1, wherein the functional derivative has an amino acid sequence which is at least 80% identical over its entire length to the amino acid sequence of SEQ ID NO: 1 and/or
[0278] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 5; or
[0279] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 5, wherein the functional derivative has an amino acid sequence which is at least 60% identical over its entire length to the amino acid sequence of SEQ ID NO: 5.
[0280] Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1 has an amino acid sequence which is at least 60.0% identical, such as at least 70.0% identical, such as at least 80.0% identical, such as at least 85.0% identical, such as at least 87.0% identical, such as at least 89.0% identical, such as at least 91.0% identical, such as at least 93.0% identical, such as at least 95.0% identical, such as at least 97.0% identical, such as at least 98.0% identical, such as at least 99.0% identical, such as at least 99.2%, such as at least 99.4%, such as 99.5% identical, such as at least 99.6%, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 1. Alternatively the functional derivative has no more than 130, more suitably no more than 120, more suitably no more than 110, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 1.
[0281] Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 5 has an amino acid sequence which is at least 60.0%, such as at least 70.0%, such as at least 80.0%, such as at least 85.0%, such as at least 90.0%, such as at least 95.0%, such as at least 97.0%, such as at least 99.0%, such as at least 99.2%, such as at least 99.4%, such as 99.5% identical, such as at least 99.6%, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 5. Alternatively the functional derivative has no more than 500, more suitably no more than 450, more suitably no more than 400, more suitably no more than 300, more suitably no more than 250, more suitably no more than 200, more suitably no more than 150, more suitably no more than 125, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 5.
[0282] Recombinant Adenoviruses or Compositions comprising Polynucleotides Encoding ChAd155 Fiber or a Functional Derivative Thereof
[0283] Suitably the recombinant adenovirus or composition of the invention comprises a polynucleotide which encodes a polypeptide having the amino acid sequence according to SEQ ID NO: 1. Suitably the polynucleotide has a sequence according to SEQ ID NO: 2.
[0284] Alternatively, the recombinant adenovirus or composition of the invention comprises a polynucleotide which encodes a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1, wherein the functional derivative has an amino acid sequence which is at least 80% identical over its entire length to the amino acid sequence of SEQ ID NO: 1. Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1 has an amino acid sequence which is at least 80% identical, such as at least 85.0% identical, such as at least 90% identical, such as at least 91.0% identical, such as at least 93.0% identical, such as at least 95.0% identical, such as at least 97.0% identical, such as at least 98.0% identical, such as at least 99.0% identical, such as at least 99% identical, such as at least 99.4% identical, such as at least 99.6% identical, such as at least 99.8% identical over its entire length to the amino acid sequence of SEQ ID NO: 1. Alternatively the functional derivative has no more than 130, more suitably no more than 120, more suitably no more than 110, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 1.
[0285] Suitably the recombinant adenovirus or composition of the invention further comprises a polynucleotide encoding:
[0286] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 3; or
[0287] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 3, wherein the functional derivative has an amino acid sequence which is at least 50.0% identical over its entire length to the amino acid sequence of SEQ ID NO: 3, and/or
[0288] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 5; or
[0289] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 5, wherein the functional derivative has an amino acid sequence which is at least 50% identical over its entire length to the amino acid sequence of SEQ ID NO: 5.
[0290] Suitably the functional derivative of the polypeptide having the amino acid sequence according to SEQ ID NO: 3 has an amino acid sequence which is at least 60.0%, such as at least 70.0%, such as at least 80.0%, such as at least 85.0%, such as at least 90.0%, such as at least 91.0% identical, such as at least 93.0% identical, such as at least 95.0% identical, such as at least 97.0% identical, such as at least 98.0% identical, such as at least 99.0%, such as at least 99%, such as at least 99.4%, such as at least 99.6%, such as at least 99.8% identical over its entire length to the amino acid sequence of SEQ ID NO: 3. Alternatively the functional derivative has no more than 300, more suitably no more than 250, more suitably no more than 200, more suitably no more than 150, more suitably no more than 125, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 3.
[0291] Suitably the functional derivative of the polypeptide having the amino acid sequence according to SEQ ID NO: 5 has an amino acid sequence which is at least 60.0%, such as at least 70.0%, such as at least 80.0%, such as at least 85.0%, such as at least 90.0%, such as at least 95.0%, such as at least 97.0%, such as at least 98.0%, such as at least 99.0%, such as at least 99.2%, such as at least 99.4%, such as 99.5% identical, such as at least 99.6%, such as 99.7% identical, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 5. Alternatively the functional derivative has no more than 500, more suitably no more than 450, more suitably no more than 400, more suitably no more than 300, more suitably no more than 250, more suitably no more than 200, more suitably no more than 150, more suitably no more than 125, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 5.
[0292] Recombinant Adenoviruses or Compositions comprising Polynucleotides Encoding ChAd155 Penton
[0293] Suitably the recombinant adenovirus or composition of the invention comprises a polynucleotide which encodes a polypeptide having the amino acid sequence according to SEQ ID NO: 3. Suitably the polynucleotide has a sequence according to SEQ ID NO: 4.
[0294] Suitably the recombinant adenovirus or composition of the invention further comprises a polynucleotide encoding:
[0295] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 1; or
[0296] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1, wherein the functional derivative has an amino acid sequence which is at least 50% identical over its entire length to the amino acid sequence of SEQ ID NO: 1 and/or
[0297] (a) a polypeptide having the amino acid sequence according to SEQ ID NO: 5; or
[0298] (b) a functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 5, wherein the functional derivative has an amino acid sequence which is at least 50% identical over its entire length to the amino acid sequence of SEQ ID NO: 5.
[0299] Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 1 has an amino acid sequence which is at least 60.0% identical, such as at least 70.0% identical, such as at least 80.0% identical, such as at least 85.0% identical, such as at least 87.0% identical, such as at least 89.0% identical, such as at least 91.0% identical, such as at least 93.0% identical, such as at least 95.0% identical, such as at least 97.0% identical, such as at least 98.0% identical, such as at least 99.0%, such as at least 99.2%, such as at least 99.4%, such as 99.5% identical, such as at least 99.6%, such as 99.7% identical, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 1. Alternatively the functional derivative has no more than 130, more suitably no more than 120, more suitably no more than 110, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 1.
[0300] Suitably the functional derivative of a polypeptide having the amino acid sequence according to SEQ ID NO: 5 has an amino acid sequence which is at least 60.0%, such as at least 70.0%, such as at least 80.0%, such as at least 85.0%, such as at least 90.0%, such as at least 95.0%, such as at least 97.0%, such as at least 98.0%, such as at least 99.0%, such as at least 99.2%, such as at least 99.4%, such as 99.5% identical, such as at least 99.6%, such as 99.7% identical, such as at least 99.8% identical, such as 99.9% identical over its entire length to the amino acid sequence of SEQ ID NO: 5. Alternatively the functional derivative has no more than 500, more suitably no more than 450, more suitably no more than 400, more suitably no more than 300, more suitably no more than 250, more suitably no more than 200, more suitably no more than 150, more suitably no more than 125, more suitably no more than 100, more suitably no more than 90, more suitably no more than 80, more suitably no more than 70, more suitably no more than 60, more suitably no more than 50, more suitably no more than 40, more suitably no more than 30, more suitably no more than 20, more suitably no more than 10, more suitably no more than 5, more suitably no more than 4, more suitably no more than 3, more suitably no more than 2, more suitably no more than 1 addition(s), deletion(s) or substitution(s) compared to SEQ ID NO: 5.
[0301] ChAd155 Backbones
[0302] The present application describes isolated polynucleotide sequences of chimp adenovirus ChAd155, including that of wild type, unmodified ChAd155 (SEQ ID NO: 10) and modified backbone constructs of ChAd155. These modified backbone constructs include ChAd155#1434 (SEQ ID NO: 7), ChAd155#1390 (SEQ ID NO: 8) and ChAd155#1375 (SEQ ID NO: 9). ChAd155 backbones may be used in the construction of recombinant replication-competent or replication-incompetent adenoviruses for the delivery of transgenes.
[0303] Annotation of the ChAd155 wild type sequence (SEQ ID NO: 10) is provided below.
TABLE-US-00001
LOCUS ChAd155 37830 bp DNA linear 10-JUN-2015
DEFINITION Chimp adenovirus 155, complete genome.
COMMENT Annotation according to alignment of ChAd155 against the human Adenovirus 2 reference strain NC_001405
Two putative ORFs in the E3 region added manually
FEATURES Location/Qualifiers
source 1..37830 /organism="Chimpanzee adenovirus 155" /mol_type="genomic DNA" /acronym="ChAd155"
repeat_region 1..101 /standard_name="ITR" /rpt_type=inverted
gene 466..1622 /gene="E1A"
TATA_signal 466..471 /gene="E1A"
prim_transcript 497..1622 /gene="E1A"
CDS join(577..1117,1231..1532) /gene="E1A" /product="E1A_280R"
CDS join(577..979,1231..1532) /gene="E1A" /product="E1A_243R"
polyA_signal 1600..1605 /gene="E1A"
gene 1662..4131 /gene="E1B"
TATA_signal 1662..1667 /gene="E1B"
prim_transcript 1692..4131 /gene="E1B"
CDS 1704..2267 /gene="E1B" /product="E1B_19K"
CDS 2009..3532 /gene="E1B" /product="E1B_55K"
gene 3571..4131 /gene="IX"
TATA_signal 3571..3576 /gene="IX"
prim_transcript 3601..4131 /gene="IX"
CDS 3628..4092 /gene="IX" /product="IX"
polyA_signal 4097..4102 /note="E1B, IX"
gene complement(4117..27523) /gene="E2B"
prim_transcript complement(4117..27494) /gene="E2B"
gene complement(4117..5896) /gene="IVa2"
prim_transcript complement(4117..5896) /gene="IVa2"
CDS complement(join(4151..5487,5766..5778)) /gene="IVa2" /product="E2B_IVa2"
polyA_signal complement(4150..4155) /note="IVa2, E2B"
CDS complement(join(5257..8838,14209..14217)) /gene="E2B" /product="E2B_polymerase"
gene 6078..34605 /gene="L5"
gene 6078..28612 /gene="L4"
gene 6078..22658 /gene="L3"
gene 6078..18164 /gene="L2"
gene 6078..14216 /gene="L1"
TATA_signal 6078..6083 /note="L"
prim_transcript 6109..34605 /gene="L5"
prim_transcript 6109..28612 /gene="L4"
prim_transcript 6109..22658 /gene="L3"
prim_transcript 6109..18164 /gene="L2"
prim_transcript 6109..14216 /gene="L1"
CDS join(8038..8457,9722..9742) /gene="L1" /product="L1_13.6K"
CDS complement(join(8637..10640,14209..14217)) /gene="E2B" /product="E2B_pTP"
gene 10671..10832 /gene="VAI"
misc_RNA 10671..10832 /gene="VAI" /product="VAI"
gene 10902..11072 /gene="VAII"
misc_RNA 10902..11072 /gene="VAII" /product="VAII"
CDS 11093..12352 /gene="L1" /product="L1_52K"
CDS 12376..14157 /gene="L1" /product="L1_pIIIa"
polyA_signal 14197..14202 /gene="L1"
CDS 14254..16035 /gene="L2" /product="L2_penton"
CDS 16050..16646 /gene="L2" /product="L2_pVII"
CDS 16719..17834 /gene="L2" /product="L2_V"
CDS 17859..18104 /gene="L2" /product="L2_pX"
polyA_signal 18143..18148 /gene="L2"
CDS 18196..18951 /gene="L3" /product="L3_pVI"
CDS 19063..21945 /gene="L3" /product="L3_hexon"
CDS 21975..22604 /gene="L3" /product="L3_protease"
polyA_signal 22630..22635 /gene="L3"
gene complement(22632..27523) /gene="E2A"
prim_transcript complement(22632..27494) /gene="E2A"
gene complement(22632..26357) /gene="E2A-L"
prim_transcript complement(22632..26328) /gene="E2A-L"
polyA_signal complement(22649..22654) /note="E2A, E2A-L"
CDS complement(22715..24367) /gene="E2A" /note="DBP; genus-common; DBP family" /codon_start=1 /product="E2A"
CDS 24405..26915 /gene="L4" /product="L4_100k"
TATA_signal complement(26352..26357) /gene="E2A-L"
CDS join(26602..26941,27147..27529) /gene="L4" /product="L4_33K"
CDS 26602..27207 /gene="L4" /product="L4_22K"
TATA_signal complement(27518..27523) /note="E2A, E2B; nominal"
CDS 27604..28287 /gene="L4" /product="L4_pVIII"
gene 27969..32686 /gene="E3B"
gene 27969..31611 /gene="E3A"
TATA_signal 27969..27974 /note="E3A, E3B"
prim_transcript 27998..32686 /gene="E3B"
prim_transcript 27998..31611 /gene="E3A"
CDS 28288..28605 /gene="E3A" /product="E3 ORF1"
polyA_signal 28594..28599 /gene="L4"
CDS 29103..29303 /gene="E3A" /product="E3 ORF2"
CDS 29300..29797 /gene="E3A" /product="E3 ORF3"
CDS 29826..30731 /gene="E3A" /product="E3 ORF4"
CDS 30728..31579 /gene="E3A" /product="E3 ORF5"
CDS 31283..31579 /gene="E3A" /product="E3 ORF6"
polyA_signal 31578..31584 /gene="E3A"
CDS 31591..31863 /gene="E3B" /product="E3 ORF7"
CDS 31866..32264 /gene="E3B" /product="E3 ORF8"
CDS 32257..32643 /gene="E3B" /product="E3 ORF9"
polyA_signal 32659..32664 /gene="E3B"
gene complement(<32678..32838) /gene="U"
CDS complement(<32678..32838) /gene="U" /note="exon encoding C terminus unidentified; genus-common" /product="protein U"
CDS 32849..34585 /gene="L5" /product="L5_fiber"
polyA_signal 34581..34586 /gene="L5"
gene complement(34611..37520) /gene="E4"
prim_transcript complement(34611..37490) /gene="E4"
polyA_signal complement(34625..34630) /gene="E4"
CDS complement(join(34794..35069,35781..35954)) /gene="E4" /product="E4 ORF7"
CDS complement(35070..35954) /gene="E4" /product="E4 ORF6"
CDS complement(35875..36219) /gene="E4" /product="E4 ORF4"
CDS complement(36235..36582) /gene="E4" /product="E4 ORF3"
CDS complement(36579..36971) /gene="E4" /product="E4 ORF2"
CDS complement(37029..37415) /gene="E4" /product="E4 ORF1"
TATA_signal complement(37515..37520) /gene="E4"
repeat_region 37740..37830 /standard_name="ITR" /rpt_type=inverted
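By way of illustration only, the location strings used in the annotation above (for example `join(...)` and `complement(...)`) can be read programmatically. The following is a minimal sketch, not part of the disclosed methods; the function names `parse_location` and `location_length` are hypothetical and introduced here purely for illustration. It assumes that coordinates are 1-based and inclusive, as is conventional for this feature-table style.

```python
import re


def parse_location(location: str):
    """Return (strand, segments) for a feature-table location string.

    Hypothetical helper: strand is "-" when the location is wrapped in
    complement(...), otherwise "+"; segments is a list of (start, end) pairs.
    """
    strand = "-" if location.startswith("complement(") else "+"
    # Each segment looks like 577..1117; an optional '<' or '>' marks a partial end.
    segments = [(int(a), int(b))
                for a, b in re.findall(r"<?(\d+)\.\.>?(\d+)", location)]
    return strand, segments


def location_length(location: str) -> int:
    """Total number of bases covered, assuming 1-based inclusive coordinates."""
    _, segments = parse_location(location)
    return sum(end - start + 1 for start, end in segments)


# Location strings copied from the annotation above.
examples = {
    "E1A_280R": "join(577..1117,1231..1532)",
    "L3_hexon": "19063..21945",
    "E4 ORF6": "complement(35070..35954)",
}
for product, loc in examples.items():
    strand, _ = parse_location(loc)
    print(product, strand, location_length(loc))
```

Under these assumptions the spliced E1A_280R coding sequence spans 843 bases across its two segments, and the E4 ORF6 entry is reported on the complementary strand.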
[0304] Definitions
[0305] Recombinant means that the polynucleotide is the product of at least one of cloning, restriction or ligation steps, or other procedures that result in a polynucleotide that is distinct from a polynucleotide found in nature. A recombinant adenovirus is an adenovirus comprising a recombinant polynucleotide.
[0306] Typically, "heterologous" means derived from a genotypically distinct entity from that of the rest of the entity to which it is being compared. A heterologous nucleic acid sequence refers to any nucleic acid sequence that is not isolated from, derived from, or based upon a naturally occurring nucleic acid sequence of the adenoviral vector. "Naturally occurring" means a sequence found in nature and not synthetically prepared or modified. A sequence is "derived" from a source when it is isolated from a source but modified (e.g., by deletion, substitution (mutation), insertion, or other modification), suitably so as not to disrupt the normal function of the source gene.
[0307] A "functional derivative" of a polypeptide suitably refers to a modified version of a polypeptide, e.g. wherein one or more amino acids of the polypeptide may be deleted, inserted, modified and/or substituted. A derivative of an unmodified adenoviral capsid protein is considered functional if, for example:
[0308] (a) an adenovirus comprising the derivative capsid protein within its capsid retains substantially the same or a lower seroprevalence compared to an adenovirus comprising the unmodified capsid protein and/or
[0309] (b) an adenovirus comprising the derivative capsid protein within its capsid retains substantially the same or a higher host cell infectivity compared to an adenovirus comprising the unmodified capsid protein and/or
[0310] (c) an adenovirus comprising the derivative capsid protein within its capsid retains substantially the same or a higher immunogenicity compared to an adenovirus comprising the unmodified capsid protein and/or
[0311] (d) an adenovirus comprising the derivative capsid protein within its capsid retains substantially the same or a higher level of transgene productivity compared to an adenovirus comprising the unmodified capsid protein.
[0312] Properties (a)-(d) above may suitably be measured using the methods described in the Examples section below.
[0313] Suitably, the recombinant adenovirus has a low seroprevalence in a human population. "Low seroprevalence" may mean having a reduced pre-existing neutralizing antibody level as compared to human adenovirus 5 (Ad5). Similarly or alternatively, "low seroprevalence" may mean less than about 20% seroprevalence, less than about 15% seroprevalence, less than about 10% seroprevalence, less than about 5% seroprevalence, less than about 4% seroprevalence, less than about 3% seroprevalence, less than about 2% seroprevalence, less than about 1% seroprevalence or no detectable seroprevalence. Seroprevalence can be measured as the percentage of individuals having a clinically relevant neutralizing titre (defined as a 50% neutralisation titer >200) using methods as described in Aste-Amezaga et al., Hum. Gene Ther. (2004) 15(3):293-304.
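As a purely illustrative sketch of the calculation described in the preceding paragraph, seroprevalence can be expressed as the percentage of individuals whose 50% neutralisation titer exceeds 200. The function names and the titer values below are hypothetical (the values are made-up numbers, not data), and the 20% cutoff is simply the largest of the thresholds listed above.

```python
def seroprevalence(titers, threshold=200):
    """Percentage of individuals whose 50% neutralisation titer exceeds threshold."""
    if not titers:
        raise ValueError("at least one titer is required")
    positive = sum(1 for t in titers if t > threshold)
    return 100.0 * positive / len(titers)


def is_low_seroprevalence(percent, cutoff=20.0):
    """True when the seroprevalence is below the chosen cutoff (assumed: 20%)."""
    return percent < cutoff


# Hypothetical titers for ten individuals (illustrative values only).
hypothetical_titers = [18, 18, 40, 250, 18, 90, 18, 120, 18, 18]
pct = seroprevalence(hypothetical_titers)
print(pct, is_low_seroprevalence(pct))
```

In this made-up panel one of ten individuals exceeds the titer threshold, which would fall below the 20% cutoff. A stricter cutoff can be passed via the `cutoff` argument.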
[0314] The terms polypeptide, peptide and protein are used interchangeably herein.
[0315] The term "simian" is typically meant to encompass nonhuman primates, for example Old World monkeys, New World monkeys, apes and gibbons. In particular, simian may refer to nonhuman apes such as chimpanzees (Pan troglodytes), bonobos (Pan paniscus) and gorillas (genus Gorilla). Non-ape simians may include rhesus macaques (Macaca mulatta).
[0316] Adenovirus Sequence Comparison
[0317] For the purposes of comparing two closely-related polynucleotide or polypeptide sequences, the "% identity" between a first sequence and a second sequence may be calculated using an alignment program, such as BLAST.RTM. (available at blast.ncbi.nlm.nih.gov, last accessed 9 Mar. 2015) using standard settings. The % identity is the number of identical residues divided by the number of residues in the reference sequence, multiplied by 100. The % identity figures referred to above and in the claims are percentages calculated by this methodology. An alternative definition of % identity is the number of identical residues divided by the number of aligned residues, multiplied by 100. Alternative methods include using a gapped method in which gaps in the alignment, for example deletions in one sequence relative to the other sequence, are accounted for in a gap score or a gap cost in the scoring parameter. For more information, see the BLAST.RTM. fact sheet available at ftp.ncbi.nlm.nih.gov/pub/factsheets/HowTo_BLASTGuide.pdf, last accessed on 9 Mar. 2015.
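For clarity, the two definitions of % identity given in the preceding paragraph may be written out as follows. The symbols are introduced here for illustration only and do not appear elsewhere in the text: N_identical is the number of identical residues, L_reference is the number of residues in the reference sequence, and N_aligned is the number of aligned residues.

```latex
\[
\%\,\text{identity}_{\text{reference}}
  = \frac{N_{\text{identical}}}{L_{\text{reference}}} \times 100
\qquad\text{and}\qquad
\%\,\text{identity}_{\text{aligned}}
  = \frac{N_{\text{identical}}}{N_{\text{aligned}}} \times 100
\]
```

The first expression is the one stated above to underlie the % identity figures in the claims; the second is the stated alternative.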
[0318] Sequences that preserve the functionality of the polynucleotide or a polypeptide encoded thereby are likely to be more closely identical. Polypeptide or polynucleotide sequences are said to be the same as or identical to other polypeptide or polynucleotide sequences, if they share 100% sequence identity over their entire length.
[0319] A "difference" between sequences refers to an insertion, deletion or substitution of a single amino acid residue in a position of the second sequence, compared to the first sequence. Two polypeptide sequences can contain one, two or more such amino acid differences. Insertions, deletions or substitutions in a second sequence which is otherwise identical (100% sequence identity) to a first sequence result in reduced percent sequence identity. For example, if the identical sequences are 9 amino acid residues long, one substitution in the second sequence results in a sequence identity of 88.9%. If the identical sequences are 17 amino acid residues long, two substitutions in the second sequence result in a sequence identity of 88.2%. If the identical sequences are 7 amino acid residues long, three substitutions in the second sequence result in a sequence identity of 57.1%. If first and second polypeptide sequences are 9 amino acid residues long and share 6 identical residues, the first and second polypeptide sequences share greater than 66% identity (the first and second polypeptide sequences share 66.7% identity). If first and second polypeptide sequences are 17 amino acid residues long and share 16 identical residues, the first and second polypeptide sequences share greater than 94% identity (the first and second polypeptide sequences share 94.1% identity). If first and second polypeptide sequences are 7 amino acid residues long and share 3 identical residues, the first and second polypeptide sequences share greater than 42% identity (the first and second polypeptide sequences share 42.9% identity).
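The worked figures in the preceding paragraph can be reproduced with a short calculation. The sketch below is illustrative only: it assumes two sequences of equal length compared position by position (no gaps), and the function names and the example peptide strings are hypothetical.

```python
def identity_from_counts(length: int, identical: int) -> float:
    """Percent identity from a sequence length and a count of identical residues."""
    if length <= 0 or not 0 <= identical <= length:
        raise ValueError("need 0 <= identical <= length and length > 0")
    return round(100.0 * identical / length, 1)


def percent_identity(first: str, second: str) -> float:
    """Position-by-position percent identity of two equal-length sequences."""
    if len(first) != len(second):
        raise ValueError("this sketch only handles equal-length sequences")
    identical = sum(a == b for a, b in zip(first, second))
    return identity_from_counts(len(first), identical)


# (length, identical residues) pairs taken from the examples in the text.
for length, identical in [(9, 8), (17, 15), (7, 4), (9, 6), (17, 16), (7, 3)]:
    print(length, identical, identity_from_counts(length, identical))

# Hypothetical 9-residue peptides differing by one substitution (K -> R).
print(percent_identity("ACDEFGHIK", "ACDEFGHIR"))
```

Rounded to one decimal place, the six pairs give 88.9, 88.2, 57.1, 66.7, 94.1 and 42.9, matching the percentages stated above.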
[0320] Alternatively, for the purposes of comparing a first, reference polypeptide sequence to a second, comparison polypeptide sequence, the number of additions, substitutions and/or deletions made to the first sequence to produce the second sequence may be ascertained. An addition is the addition of one amino acid residue into the sequence of the first polypeptide (including addition at either terminus of the first polypeptide). A substitution is the substitution of one amino acid residue in the sequence of the first polypeptide with one different amino acid residue. A deletion is the deletion of one amino acid residue from the sequence of the first polypeptide (including deletion at either terminus of the first polypeptide).
[0321] For the purposes of comparing a first, reference polynucleotide sequence to a second, comparison polynucleotide sequence, the number of additions, substitutions and/or deletions made to the first sequence to produce the second sequence may be ascertained. An addition is the addition of one nucleotide residue into the sequence of the first polynucleotide (including addition at either terminus of the first polynucleotide). A substitution is the substitution of one nucleotide residue in the sequence of the first polynucleotide with one different nucleotide residue. A deletion is the deletion of one nucleotide residue from the sequence of the first polynucleotide (including deletion at either terminus of the first polynucleotide).
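One way to ascertain such counts, offered here only as an illustrative sketch and not as the method used in this document, is a standard dynamic-programming edit-distance calculation followed by a traceback. The function name `count_edits` and the example sequences are hypothetical. Where several minimal edit scripts exist, the breakdown reported is for one of them; the total number of edits is the same for all.

```python
def count_edits(first: str, second: str) -> dict:
    """Count single-residue additions, substitutions and deletions in one
    minimal edit script that turns `first` into `second`."""
    n, m = len(first), len(second)
    # dist[i][j] = minimal number of edits turning first[:i] into second[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # delete all i residues
    for j in range(m + 1):
        dist[0][j] = j  # add all j residues
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if first[i - 1] == second[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,         # deletion
                             dist[i][j - 1] + 1,         # addition
                             dist[i - 1][j - 1] + cost)  # match or substitution

    counts = {"additions": 0, "substitutions": 0, "deletions": 0}
    i, j = n, m
    while i > 0 or j > 0:  # trace one optimal path back to the origin
        mismatch = i > 0 and j > 0 and first[i - 1] != second[j - 1]
        if i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + int(mismatch):
            if mismatch:
                counts["substitutions"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletions"] += 1
            i -= 1
        else:
            counts["additions"] += 1
            j -= 1
    return counts


# Hypothetical peptide and nucleotide examples.
print(count_edits("ACDEFGHIK", "ACDEFGHIR"))
print(count_edits("ATGCC", "ATGCCA"))
```

The same function applies to polypeptide and polynucleotide sequences, since it treats each sequence simply as a string of single-letter residues.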
[0322] Suitably substitutions in the sequences of the present invention may be conservative substitutions. A conservative substitution comprises the substitution of an amino acid with another amino acid having a chemical property similar to the amino acid that is substituted (see, for example, Stryer et al, Biochemistry, 5.sup.th Edition 2002, pages 44-49). Preferably, the conservative substitution is a substitution selected from the group consisting of: (i) a substitution of a basic amino acid with another, different basic amino acid; (ii) a substitution of an acidic amino acid with another, different acidic amino acid; (iii) a substitution of an aromatic amino acid with another, different aromatic amino acid; (iv) a substitution of a non-polar, aliphatic amino acid with another, different non-polar, aliphatic amino acid; and (v) a substitution of a polar, uncharged amino acid with another, different polar, uncharged amino acid. A basic amino acid is preferably selected from the group consisting of arginine, histidine, and lysine. An acidic amino acid is preferably aspartate or glutamate. An aromatic amino acid is preferably selected from the group consisting of phenylalanine, tyrosine and tryptophan. A non-polar, aliphatic amino acid is preferably selected from the group consisting of glycine, alanine, valine, leucine, methionine and isoleucine. A polar, uncharged amino acid is preferably selected from the group consisting of serine, threonine, cysteine, proline, asparagine and glutamine. In contrast to a conservative amino acid substitution, a non-conservative amino acid substitution is the exchange of one amino acid with any amino acid that does not fall under the above-outlined conservative substitutions (i) through (v).
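The grouping in the preceding paragraph lends itself to a simple lookup. The following sketch is illustrative only; it encodes groups (i) to (v) using standard one-letter amino acid codes, and the names `CONSERVATIVE_GROUPS`, `substitution_group` and `is_conservative` are hypothetical.

```python
# Groups (i)-(v) from the text, written with standard one-letter codes.
CONSERVATIVE_GROUPS = {
    "basic": set("RHK"),                   # arginine, histidine, lysine
    "acidic": set("DE"),                   # aspartate, glutamate
    "aromatic": set("FYW"),                # phenylalanine, tyrosine, tryptophan
    "non-polar aliphatic": set("GAVLMI"),  # Gly, Ala, Val, Leu, Met, Ile
    "polar uncharged": set("STCPNQ"),      # Ser, Thr, Cys, Pro, Asn, Gln
}


def substitution_group(original: str, replacement: str):
    """Return the shared group name for a conservative substitution, else None."""
    if original == replacement:
        raise ValueError("a substitution requires two different amino acids")
    for name, members in CONSERVATIVE_GROUPS.items():
        if original in members and replacement in members:
            return name
    return None


def is_conservative(original: str, replacement: str) -> bool:
    """True when both residues fall in the same group (i)-(v)."""
    return substitution_group(original, replacement) is not None


print(is_conservative("K", "R"))      # lysine -> arginine
print(is_conservative("K", "E"))      # lysine -> glutamate
print(substitution_group("L", "I"))   # leucine -> isoleucine
```

Any substitution for which `substitution_group` returns `None` would be non-conservative under the definition above.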
[0323] Recombinant Adenovirus
[0324] The ChAd155 sequences are useful as therapeutic agents and in construction of a variety of vector systems, recombinant adenovirus and host cells. Suitably the term "vector" refers to a nucleic acid that has been substantially altered (e.g., a gene or functional region that has been deleted and/or inactivated) relative to a wild type sequence and/or incorporates a heterologous sequence, i.e. nucleic acid obtained from a different source (also called an "insert"), and replicating and/or expressing the inserted polynucleotide sequence, when introduced into a cell (e.g., a host cell). For example, the insert may be all or part of the ChAd155 sequences described herein. In addition or alternatively, a ChAd155 vector may be a ChAd155 adenovirus comprising one or more deletions or inactivations of viral genes, such as E1 or other viral gene or functional region described herein. Such a ChAd155, which may or may not comprise a heterologous sequence, is often called a "backbone" and may be used as is or as a starting point for additional modifications to the vector.
[0325] A vector may be any suitable nucleic acid molecule including naked DNA, a plasmid, a virus, a cosmid, phage vector such as lambda vector, an artificial chromosome such as a BAC (bacterial artificial chromosome), or an episome. Alternatively, a vector may be a transcription and/or expression unit for cell-free in vitro transcription or expression, such as a T7-compatible system. The vectors may be used alone or in combination with other adenoviral sequences or fragments, or in combination with elements from non-adenoviral sequences. The ChAd155 sequences are also useful in antisense delivery vectors, gene therapy vectors, or vaccine vectors. Thus, further provided are gene delivery vectors, and host cells which contain the ChAd155 sequences.
[0326] The term "replication-competent" adenovirus refers to an adenovirus which can replicate in a host cell in the absence of any recombinant helper proteins comprised in the cell. Suitably, a "replication-competent" adenovirus comprises the following intact or functional essential early genes: E1A, E1B, E2A, E2B, E3 and E4. Wild type adenoviruses isolated from a particular animal will be replication competent in that animal.
[0327] The term "replication-incompetent" or "replication-defective" adenovirus refers to an adenovirus which is incapable of replication because it has been engineered to comprise at least a functional deletion (or "loss-of-function" mutation), i.e. a deletion or mutation which impairs the function of a gene without removing it entirely, e.g. introduction of artificial stop codons, deletion or mutation of active sites or interaction domains, mutation or deletion of a regulatory sequence of a gene etc., or a complete removal of a gene encoding a gene product that is essential for viral replication, such as one or more of the adenoviral genes selected from E1A, E1B, E2A, E2B, E3 and E4 (such as E3 ORF1, E3 ORF2, E3 ORF3, E3 ORF4, E3 ORF5, E3 ORF6, E3 ORF7, E3 ORF8, E3 ORF9, E4 ORF7, E4 ORF6, E4 ORF4, E4 ORF3, E4 ORF2 and/or E4 ORF1). Particularly suitably E1 and optionally E3 and/or E4 are deleted. If deleted, the aforementioned deleted gene region will suitably not be considered in the alignment when determining % identity with respect to another sequence.
[0328] The present invention provides recombinant adenoviruses that deliver a mycobacterial antigen to cells, either for therapeutic or vaccine purposes. A vector may include any genetic element including naked DNA, a phage, transposon, cosmid, episome, plasmid, or a virus. Such vectors contain DNA of ChAd155 as disclosed herein and a minigene. By "minigene" (or "expression cassette") is meant the combination of a selected heterologous gene (transgene) and the other regulatory elements necessary to drive translation, transcription and/or expression of the gene product in a host cell.
[0329] Typically, a ChAd155-derived adenoviral vector is designed such that the minigene is located in a nucleic acid molecule which contains other adenoviral sequences in the region native to a selected adenoviral gene. The minigene may be inserted into an existing gene region to disrupt the function of that region, if desired. Alternatively, the minigene may be inserted into the site of a partially or fully deleted adenoviral gene. For example, the minigene may be located in the site of a mutation, insertion or deletion which renders non-functional at least one gene of a genomic region selected from the group consisting of E1A, E1B, E2A, E2B, E3 and E4. The term "renders non-functional" means that a sufficient amount of the gene region is removed or otherwise disrupted, so that the gene region is no longer capable of producing functional products of gene expression. If desired, the entire gene region may be removed (and suitably replaced with the minigene).
[0330] For example, for a production vector useful for generation of a recombinant virus, the vector may contain the minigene and either the 5' end of the adenoviral genome or the 3' end of the adenoviral genome, or both the 5' and 3' ends of the adenoviral genome. The 5' end of the adenoviral genome contains the 5' cis-elements necessary for packaging and replication; i.e., the 5' ITR sequences (which function as origins of replication) and the native 5' packaging enhancer domains (that contain sequences necessary for packaging linear Ad genomes and enhancer elements for the E1 promoter). The 3' end of the adenoviral genome includes the 3' cis-elements (including the ITRs) necessary for packaging and encapsidation. Suitably, a recombinant adenovirus contains both 5' and 3' adenoviral cis-elements and the minigene (suitably containing a transgene) is located between the 5' and 3' adenoviral sequences. A ChAd155-based adenoviral vector may also contain additional adenoviral sequences.
[0331] Suitably, ChAd155-based vectors contain one or more adenoviral elements derived from the adenoviral ChAd155 genome. In one embodiment, the vectors contain adenoviral ITRs from ChAd155 and additional adenoviral sequences from the same adenoviral serotype. In another embodiment, the vectors contain adenoviral sequences that are derived from a different adenoviral serotype than that which provides the ITRs.
[0332] As defined herein, a pseudotyped adenovirus refers to an adenovirus in which the capsid proteins of the adenovirus are from a different adenovirus than the adenovirus which provides the ITRs.
[0333] Further, chimeric or hybrid adenoviruses may be constructed using the adenoviruses described herein using techniques known to those of skill in the art (e.g., U.S. Pat. No. 7,291,498).
[0334] ITRs and any other adenoviral sequences present in the vector may be obtained from many sources. A variety of adenovirus strains are available from the American Type Culture Collection, Manassas, Va., or available by request from a variety of commercial and institutional sources. Further, the sequences of many such strains are available from a variety of databases including, e.g., PubMed and GenBank. Homologous adenovirus vectors prepared from other chimp or from human adenoviruses are described in the published literature (for example, U.S. Pat. No. 5,240,846). The DNA sequences of a number of adenovirus types are available from GenBank, including type Ad5 (GenBank Accession Number M73370). The adenovirus sequences may be obtained from any known adenovirus serotype, such as serotypes 2, 3, 4, 7, 12 and 40, and further including any of the presently identified human types. Similarly adenoviruses known to infect nonhuman animals (e.g., simians) may also be employed in the vector constructs of this invention (e.g., U.S. Pat. No. 6,083,716). The viral sequences, helper viruses (if needed), and recombinant viral particles, and other vector components and sequences employed in the construction of the vectors described herein may be obtained as described below.
[0335] Sequence, Vector and Adenovirus Production
[0336] The sequences may be produced by any suitable means, including recombinant production, chemical synthesis, or other synthetic means. Suitable production techniques are well known to those of skill in the art. Alternatively, peptides can also be synthesized by well known solid phase peptide synthesis methods.
[0337] The adenoviral plasmids (or other vectors) may be used to produce adenoviral vectors. In one embodiment, the adenoviral vectors are adenoviral particles which are replication-incompetent. In one embodiment, the adenoviral particles are rendered replication-incompetent by deletions in the E1A and/or E1B genes. Alternatively, the adenoviruses are rendered replication-incompetent by another means, optionally while retaining the E1A and/or E1B genes. Similarly, in some embodiments, reduction of an immune response to the vector may be accomplished by deletions in the E2B and/or DNA polymerase genes. The adenoviral vectors can also contain other mutations to the adenoviral genome, e.g., temperature-sensitive mutations or deletions in other genes. In other embodiments, it is desirable to retain an intact E1A and/or E1B region in the adenoviral vectors. Such an intact E1 region may be located in its native location in the adenoviral genome or placed in the site of a deletion in the native adenoviral genome (e.g., in the E3 region).
[0338] In the construction of adenovirus vectors for delivery of a gene to a mammalian (such as human) cell, a range of modified adenovirus nucleic acid sequences can be employed in the vectors. For example, all or a portion of the adenovirus delayed early gene E3 may be eliminated from the adenovirus sequence which forms a part of the recombinant virus. The function of E3 is believed to be irrelevant to the function and production of the recombinant virus particle. Adenovirus vectors may also be constructed having a deletion of at least the ORF6 region of the E4 gene, and more desirably because of the redundancy in the function of this region, the entire E4 region. Still another vector of the invention contains a deletion in the delayed early gene E2A. Deletions may also be made in any of the late genes L1 to L5 of the adenovirus genome. Similarly, deletions in the intermediate genes IX and IVa2 may be useful for some purposes. Other deletions may be made in the other structural or non-structural adenovirus genes. The above discussed deletions may be used individually, i.e., an adenovirus sequence for use as described herein may contain deletions in only a single region. Alternatively, deletions of entire genes or portions thereof effective to destroy their biological activity may be used in any combination. For example, in one exemplary vector, the adenovirus sequence may have deletions of the E1 genes and the E4 gene, or of the E1, E2A and E3 genes, or of the E1 and E3 genes, or of E1, E2A and E4 genes, with or without deletion of E3, and so on. Any one or more of the E genes may suitably be replaced with an E gene (or one or more E gene open reading frames) sourced from a different strain of adenovirus. Particularly suitably the ChAd155 E1 and E3 genes are deleted and the ChAd155 E4 gene is replaced with E4Ad5orf6. As discussed above, such deletions and/or substitutions may be used in combination with other mutations, such as temperature-sensitive mutations, to achieve a desired result.
[0339] An adenoviral vector lacking one or more essential adenoviral sequences (e.g., E1A, E1B, E2A, E2B, E4 ORF6, L1, L2, L3, L4 and L5) may be cultured in the presence of the missing adenoviral gene products which are required for viral infectivity and propagation of an adenoviral particle. These helper functions may be provided by culturing the adenoviral vector in the presence of one or more helper constructs (e.g., a plasmid or virus) or a packaging host cell.
[0340] Complementation of Replication-Incompetent Vectors
[0341] To generate recombinant adenoviruses deleted in any of the genes described above, the function of the deleted gene region, if essential to the replication and infectivity of the virus, must be supplied to the recombinant virus by a helper virus or cell line, i.e., a complementation or packaging cell line.
[0342] Helper Viruses
[0343] Depending upon the adenovirus gene content of the viral vectors employed to carry the minigene, a helper adenovirus or non-replicating virus fragment may be used to provide sufficient adenovirus gene sequences necessary to produce an infective recombinant viral particle containing the minigene. Useful helper viruses contain selected adenovirus gene sequences not present in the adenovirus vector construct and/or not expressed by the packaging cell line in which the vector is transfected. In one embodiment, the helper virus is replication-defective and contains adenovirus genes in addition, suitably, to one or more of the sequences described herein. Such a helper virus is suitably used in combination with an E1 expressing (and optionally additionally E3 expressing) cell line.
[0344] A helper virus may optionally contain a reporter gene. A number of such reporter genes are known to the art as well as described herein. The presence of a reporter gene on the helper virus which is different from the transgene on the adenovirus vector allows both the adenoviral vector and the helper virus to be independently monitored. This reporter is used to enable separation between the resulting recombinant virus and the helper virus upon purification.
[0345] Complementation Cell Lines
[0346] In many circumstances, a cell line expressing the one or more missing genes which are essential to the replication and infectivity of the virus, such as human E1, can be used to transcomplement a chimp adenoviral vector. This is particularly advantageous because, due to the diversity between the chimp adenovirus sequences of the invention and the human adenovirus sequences found in currently available packaging cells, the use of the current human E1-containing cells prevents the generation of replication-competent adenoviruses during the replication and production process.
[0347] Alternatively, if desired, one may utilize the sequences provided herein to generate a packaging cell or cell line that expresses, at a minimum, the E1 gene from ChAd155 under the transcriptional control of a promoter for expression in a selected parent cell line. Inducible or constitutive promoters may be employed for this purpose. Examples of such promoters are described in detail elsewhere in this document. A parent cell is selected for the generation of a novel cell line expressing any desired ChAd155 gene. Without limitation, such a parent cell line may be HeLa [ATCC Accession No. CCL 2], A549 [ATCC Accession No. CCL 185], HEK 293, KB [CCL 17], Detroit [e.g., Detroit 510, CCL 72] and WI-38 [CCL 75] cells, among others. These cell lines are all available from the American Type Culture Collection, 10801 University Boulevard, Manassas, Va. 20110-2209.
[0348] Such E1-expressing cell lines are useful in the generation of recombinant adenovirus E1 deleted vectors. Additionally, or alternatively, cell lines that express one or more adenoviral gene products, e.g., E1A, E1B, E2A, E3 and/or E4, can be constructed using essentially the same procedures as used in the generation of recombinant viral vectors. Such cell lines can be utilised to transcomplement adenovirus vectors deleted in the essential genes that encode those products, or to provide helper functions necessary for packaging of a helper-dependent virus (e.g., adeno-associated virus). The preparation of a host cell involves techniques such as assembly of selected DNA sequences.
[0349] In another alternative, the essential adenoviral gene products are provided in trans by the adenoviral vector and/or helper virus. In such an instance, a suitable host cell can be selected from any biological organism, including prokaryotic (e.g., bacterial) cells, and eukaryotic cells, including, insect cells, yeast cells and mammalian cells.
[0350] Host cells may be selected from among any mammalian species, including, without limitation, cells such as A549, WEHI, 3T3, 10T1/2, HEK 293 cells or Per.C6 (both of which express functional adenoviral E1) [Fallaux, F J et al, (1998), Hum Gene Ther, 9:1909-1917], Saos, C2C12, L cells, HT1080, HepG2 and primary fibroblast, hepatocyte and myoblast cells derived from mammals including human, monkey, mouse, rat, rabbit, and hamster.
[0351] A particularly suitable complementation cell line is the Procell92 cell line. The Procell92 cell line is based on HEK 293 cells which express adenoviral E1 genes, transfected with the Tet repressor under control of the human phosphoglycerate kinase-1 (PGK) promoter, and the G418-resistance gene (Vitelli et al. PLOS One (2013) 8(e55435):1-9). Procell92.S is adapted for growth in suspension conditions and is useful for producing adenoviral vectors expressing toxic proteins (world wide web okairos.com/e/inners.php?m=00084, last accessed 13 Apr. 2015).
[0352] Assembly of a Viral Particle and Transfection of a Cell Line
[0353] Generally, when delivering the vector comprising the minigene by transfection, the vector is delivered in an amount from about 5 .mu.g to about 100 .mu.g DNA, and preferably about 10 to about 50 .mu.g DNA to about 1.times.10.sup.4 cells to about 1.times.10.sup.13 cells, and preferably about 10.sup.5 cells. However, the relative amounts of vector DNA to host cells may be adjusted, taking into consideration such factors as the selected vector, the delivery method and the host cells selected.
[0354] Introduction into the host cell of the vector may be achieved by any means known in the art, including transfection, and infection. One or more of the adenoviral genes may be stably integrated into the genome of the host cell, stably expressed as episomes, or expressed transiently. The gene products may all be expressed transiently, on an episome or stably integrated, or some of the gene products may be expressed stably while others are expressed transiently.
[0355] Introduction of vectors into the host cell may also be accomplished using techniques known to the skilled person. Suitably, standard transfection techniques are used, e.g., CaPO.sub.4 transfection or electroporation.
[0356] Assembly of the selected DNA sequences of the adenovirus (as well as the transgene and other vector elements) into various intermediate plasmids, and the use of the plasmids and vectors to produce a recombinant viral particle are all achieved using conventional techniques. Such techniques include conventional cloning techniques of cDNA, use of overlapping oligonucleotide sequences of the adenovirus genomes, polymerase chain reaction, and any suitable method which provides the desired nucleotide sequence. Standard transfection and co-transfection techniques are employed, e.g., CaPO.sub.4 precipitation techniques. Other conventional methods employed include homologous recombination of the viral genomes, plaquing of viruses in agar overlay, methods of measuring signal generation, and the like.
[0357] For example, following the construction and assembly of the desired minigene-containing viral vector, the vector is transfected in vitro in the presence of a helper virus into the packaging cell line. Homologous recombination occurs between the helper and the vector sequences, which permits the adenovirus-transgene sequences in the vector to be replicated and packaged into virion capsids, resulting in the recombinant viral vector particles. The resulting recombinant adenoviruses are useful in transferring a selected transgene to a selected cell. In in vivo experiments with the recombinant virus grown in the packaging cell lines, the E1-deleted recombinant adenoviral vectors of the invention demonstrate utility in transferring a transgene to a non-simian mammal, preferably a human, cell.
[0358] Transgenes
[0359] The transgene is a nucleic acid sequence, heterologous to the vector sequences flanking the transgene, which encodes a protein of interest. The nucleic acid coding sequence is operatively linked to regulatory components in a manner which permits transgene transcription, translation, and/or expression in a host cell.
[0360] The composition of the transgene sequence will depend upon the use to which the resulting vector will be put. For example, the transgene may be a therapeutic transgene or an immunogenic transgene. Alternatively, a transgene sequence may include a reporter sequence, which upon expression produces a detectable signal. Such reporter sequences include, without limitation, DNA sequences encoding .beta.-lactamase, .beta.-galactosidase (LacZ), alkaline phosphatase, thymidine kinase, green fluorescent protein (GFP), chloramphenicol acetyltransferase (CAT), luciferase, membrane bound proteins including, for example, CD2, CD4, CD8, the influenza hemagglutinin protein, and others well known in the art, to which high affinity antibodies directed thereto exist or can be produced by conventional means, and fusion proteins comprising a membrane bound protein appropriately fused to an antigen tag domain from, among others, hemagglutinin or Myc. These coding sequences, when associated with regulatory elements which drive their expression, provide signals detectable by conventional means, including enzymatic, radiographic, colorimetric, fluorescence or other spectrographic assays, fluorescence-activated cell sorting assays and immunological assays, including enzyme linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and immunohistochemistry.
[0361] In one embodiment, the transgene is a non-marker sequence encoding a product which is useful in biology and medicine, such as a therapeutic transgene or an immunogenic transgene such as proteins, RNA, enzymes, or catalytic RNAs. Desirable RNA molecules include tRNA, dsRNA, ribosomal RNA, catalytic RNAs, and antisense RNAs. One example of a useful RNA sequence is a sequence which extinguishes expression of a targeted nucleic acid sequence in the treated animal.
[0362] The transgene may be used for treatment, e.g., as a vaccine, for induction of an immune response, and/or for prophylactic vaccine purposes. As used herein, induction of an immune response refers to the ability of a protein to induce a T cell and/or a humoral immune response to the protein.
[0363] Regulatory Elements
[0364] In addition to the transgene the vector also includes conventional control elements which are operably linked to the transgene in a manner that permits its transcription, translation and/or expression in a cell transfected with the plasmid vector or infected with the virus produced by the invention. As used herein, "operably linked" sequences include both expression control sequences that are contiguous with the gene of interest and expression control sequences that act in trans or at a distance to control the gene of interest.
[0365] Expression control sequences include appropriate transcription initiation, termination, promoter and enhancer sequences; efficient RNA processing signals such as splicing and polyadenylation (poly A) signals including rabbit beta-globin polyA; sequences that stabilize cytoplasmic mRNA; sequences that enhance translation efficiency (e.g., Kozak consensus sequence); sequences that enhance protein stability; and when desired, sequences that enhance secretion of the encoded product. Among other sequences, chimeric introns may be used.
[0366] In some embodiments, the Woodchuck Hepatitis Virus Posttranscriptional Regulatory Element (WPRE) (Zufferey et al. (1999) J Virol; 73(4):2886-9) may be operably linked to the transgene. An exemplary WPRE is provided in SEQ ID NO: 26.
[0367] A "promoter" is a nucleotide sequence that permits binding of RNA polymerase and directs the transcription of a gene. Typically, a promoter is located in the 5' non-coding region of a gene, proximal to the transcriptional start site of the gene. Sequence elements within promoters that function in the initiation of transcription are often characterized by consensus nucleotide sequences. Examples of promoters include, but are not limited to, promoters from bacteria, yeast, plants, viruses, and mammals (including humans). A great number of expression control sequences, including promoters which are internal, native, constitutive, inducible and/or tissue-specific, are known in the art and may be utilized.
[0368] Examples of constitutive promoters include, without limitation, the TBG promoter, the retroviral Rous sarcoma virus LTR promoter (optionally with the enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer, see, e.g., Boshart et al, Cell, 41:521-530 (1985)), the CASI promoter, the SV40 promoter, the dihydrofolate reductase promoter, the .beta.-actin promoter, the phosphoglycerate kinase (PGK) promoter, and the EF1a promoter (Invitrogen).
[0369] In some embodiments, the promoter is a CASI promoter (see, for example, WO2012/115980). The CASI promoter is a synthetic promoter which contains a portion of the CMV enhancer, a portion of the chicken beta-actin promoter, and a portion of the UBC enhancer. In some embodiments, the CASI promoter can include a nucleic acid sequence having at least about 90%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, or more, sequence identity to SEQ ID NO: 12. In some embodiments, the promoter comprises or consists of a nucleic acid sequence of SEQ ID NO: 12.
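The percent-identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length by some alignment tool, and it assumes one particular definition of identity (matching columns divided by all columns that are not gap-only); other definitions exist and the applicable one should be taken from the specification. The function names and the short example sequences are hypothetical and are not SEQ ID NO: 12.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-computed pairwise alignment.

    Assumption: identity = matching columns / columns that are not gap-only.
    Gaps are written as '-'. This is one common convention, not the only one.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    # Drop columns where both sequences carry a gap.
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column matches only if both residues are identical and not a gap.
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 10-nucleotide example with one mismatch.
reference = "ACGTACGTAC"
candidate = "ACGTACGTAA"
print(percent_identity(reference, candidate))
print(meets_identity_threshold(reference, candidate, threshold=95.0))
```

In practice the alignment step (and its gap penalties) largely determines the result, so the sketch covers only the final counting step.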
[0370] Inducible promoters allow regulation of gene expression and can be regulated by exogenously supplied compounds, environmental factors such as temperature, or the presence of a specific physiological state, e.g., acute phase, a particular differentiation state of the cell, or in replicating cells only. Inducible promoters and inducible systems are available from a variety of commercial sources, including, without limitation, Invitrogen, Clontech and Ariad. Many other systems have been described and can be readily selected by one of skill in the art. For example, inducible promoters include the zinc-inducible sheep metallothionein (MT) promoter and the dexamethasone (Dex)-inducible mouse mammary tumor virus (MMTV) promoter. Other inducible systems include the T7 polymerase promoter system (WO 98/10088); the ecdysone insect promoter (No et al, Proc. Natl. Acad. Sci. USA, 93:3346-3351 (1996)), the tetracycline-repressible system (Gossen et al, Proc. Natl. Acad. Sci. USA, 89:5547-5551 (1992)), the tetracycline-inducible system (Gossen et al, Science, 268:1766-1769 (1995), see also Harvey et al, Curr. Opin. Chem. Biol., 2:512-518 (1998)). Other systems include the FK506 dimer, VP16 or p65 using castradiol, diphenol murislerone, the RU486-inducible system (Wang et al, Nat. Biotech., 15:239-243 (1997) and Wang et al, Gene Ther., 4:432-441 (1997)) and the rapamycin-inducible system (Magari et al, J. Clin. Invest., 100:2865-2872 (1997)). The effectiveness of some inducible promoters increases over time. In such cases one can enhance the effectiveness of such systems by inserting multiple repressors in tandem, e.g., TetR linked to a TetR by an IRES.
[0371] In another embodiment, the native promoter for the transgene will be used. The native promoter may be preferred when it is desired that expression of the transgene should mimic the native expression. The native promoter may be used when expression of the transgene must be regulated temporally or developmentally, or in a tissue-specific manner, or in response to specific transcriptional stimuli. In a further embodiment, other native expression control elements, such as enhancer elements, polyadenylation sites or Kozak consensus sequences may also be used to mimic the native expression.
[0372] The transgene may be operably linked to a tissue-specific promoter. For instance, if expression in skeletal muscle is desired, a promoter active in muscle should be used. These include the promoters from genes encoding skeletal .beta.-actin, myosin light chain 2A, dystrophin, muscle creatine kinase, as well as synthetic muscle promoters with activities higher than naturally occurring promoters (see Li et al, Nat. Biotech., 17:241-245 (1999)). Examples of tissue-specific promoters are known for liver (albumin, Miyatake et al, J. Virol., 71:5124-32 (1997); hepatitis B virus core promoter, Sandig et al, Gene Ther., 3:1002-9 (1996); alpha-fetoprotein (AFP), Arbuthnot et al., Hum. Gene Ther., 7:1503-14 (1996)), bone osteocalcin (Stein et al, Mol. Biol. Rep., 24:185-96 (1997)); bone sialoprotein (Chen et al., J. Bone Miner. Res., 11:654-64 (1996)), lymphocytes (CD2, Hansel et al, J. Immunol., 161:1063-8 (1998); immunoglobulin heavy chain; T cell receptor chain), neuronal such as neuron-specific enolase (NSE) promoter (Andersen et al, Cell. Mol. Neurobiol., 13:503-15 (1993)), neurofilament light-chain gene (Piccioli et al, Proc. Natl. Acad. Sci. USA, 88:5611-5 (1991)), and the neuron-specific vgf gene (Piccioli et al, Neuron, 15:373-84 (1995)), among others.
[0373] Optionally, vectors carrying transgenes encoding therapeutically useful or immunogenic products may also include selectable markers or reporter genes which may include sequences encoding geneticin, hygromycin or puromycin resistance, among others. Such selectable reporters or marker genes (preferably located outside the viral genome to be packaged into a viral particle) can be used to signal the presence of the plasmids in bacterial cells, such as ampicillin resistance. Other components of the vector may include an origin of replication.
[0374] These vectors are generated using the techniques and sequences provided herein, in conjunction with techniques known to those of skill in the art. Such techniques include conventional cloning techniques of cDNA such as those described in texts, use of overlapping oligonucleotide sequences of the adenovirus genomes, polymerase chain reaction, and any suitable method which provides the desired nucleotide sequence.
[0375] Therapeutics and Prophylaxis
[0376] The recombinant ChAd155-based vectors are useful for gene transfer to a human or non-simian mammal in vitro, ex vivo, and in vivo.
[0377] The recombinant adenovirus vectors described herein can be used as expression vectors for the production of the products encoded by the heterologous transgenes in vitro. For example, the recombinant replication-incompetent adenovirus containing a transgene may be transfected into a complementation cell line as described above.
[0378] A ChAd155-derived recombinant adenoviral vector provides an efficient gene transfer vehicle that can deliver a selected transgene to a selected host cell in vivo or ex vivo even where the organism has neutralizing antibodies to one or more adenovirus serotypes. In one embodiment, the vector and the cells are mixed ex vivo; the infected cells are cultured using conventional methodologies; and the transduced cells are re-infused into the patient. These techniques are particularly well suited to gene delivery for therapeutic purposes and for immunisation, including inducing protective immunity.
[0379] Immunogenic Transgenes
[0380] The recombinant ChAd155 vectors may also be administered in immunogenic compositions. An immunogenic composition as described herein is a composition comprising one or more recombinant ChAd155 vectors capable of inducing an immune response, for example a humoral (e.g., antibody) and/or cell-mediated (e.g., a cytotoxic T cell) response, against a transgene product delivered by the vector following delivery to a mammal, suitably a human. A recombinant adenovirus may comprise (suitably in any of its gene deletions) a gene encoding a desired immunogen and may therefore be used in a vaccine.
[0381] Such vaccine or other immunogenic compositions may be formulated in a suitable delivery vehicle. Generally, doses for the immunogenic compositions are in the range defined below under `Delivery Methods and Dosage`. The levels of immunity of the selected gene can be monitored to determine the need, if any, for boosters. Following an assessment of antibody titers in the serum, optional booster immunizations may be desired.
[0382] Optionally, a vaccine or immunogenic composition of the invention may be formulated to contain other components, including, e.g., adjuvants, stabilizers, pH adjusters, preservatives and the like. Examples of suitable adjuvants are provided below under `Adjuvants`. Such an adjuvant can be administered with a priming DNA vaccine encoding an antigen to enhance the antigen-specific immune response compared with the immune response generated upon priming with a DNA vaccine encoding the antigen only. Alternatively, such an adjuvant can be administered with a polypeptide antigen which is administered in an administration regimen involving the ChAd155 vectors of the invention (as described below under `Administration Regimens`).
[0383] The recombinant adenoviruses are administered in an immunogenic amount, that is, an amount of recombinant adenovirus that is effective in a route of administration to transfect the desired target cells and provide sufficient levels of expression of the selected gene to induce an immune response. Where protective immunity is provided, the recombinant adenoviruses are considered to be vaccine compositions useful in preventing infection and/or recurrent disease.
[0384] The recombinant vectors described herein are expected to be highly efficacious at inducing cytolytic T cells and antibodies directed to the inserted heterologous antigenic protein expressed by the vector.
[0385] Adjuvants
[0386] An "adjuvant" as used herein refers to a composition that enhances the immune response to an immunogen. Examples of such adjuvants include but are not limited to inorganic adjuvants (e.g. inorganic metal salts such as aluminium phosphate or aluminium hydroxide), organic adjuvants (e.g. saponins, such as QS21, or squalene), oil-based adjuvants (e.g. Freund's complete adjuvant and Freund's incomplete adjuvant), cytokines (e.g. IL-1.beta., IL-2, IL-7, IL-12, IL-18, GM-CSF, and IFN-.gamma.), particulate adjuvants (e.g. immuno-stimulatory complexes (ISCOMS), liposomes, or biodegradable microspheres), virosomes, bacterial adjuvants (e.g. monophosphoryl lipid A, such as 3-de-O-acylated monophosphoryl lipid A (3D-MPL), or muramyl peptides), synthetic adjuvants (e.g. non-ionic block copolymers, muramyl peptide analogues, or synthetic lipid A), synthetic polynucleotides adjuvants (e.g. polyarginine or polylysine) and immunostimulatory oligonucleotides containing unmethylated CpG dinucleotides ("CpG").
[0387] One suitable adjuvant is monophosphoryl lipid A (MPL), in particular 3-de-O-acylated monophosphoryl lipid A (3D-MPL). Chemically it is often supplied as a mixture of 3-de-O-acylated monophosphoryl lipid A with either 4, 5, or 6 acylated chains. It can be purified and prepared by the methods taught in GB 2122204B, which reference also discloses the preparation of diphosphoryl lipid A, and 3-O-deacylated variants thereof. Other purified and synthetic lipopolysaccharides have been described (U.S. Pat. No. 6,005,099 and EP 0 729 473 B1; Hilgers et al., 1986, Int. Arch. Allergy Immunol., 79(4):392-6; Hilgers et al., 1987, Immunology, 60(1):141-6; and EP 0 549 074 B1).
[0388] Saponins are also suitable adjuvants (see Lacaille-Dubois, M and Wagner H, A review of the biological and pharmacological activities of saponins. Phytomedicine vol 2 pp 363-386 (1996)). For example, the saponin Quil A (derived from the bark of the South American tree Quillaja Saponaria Molina), and fractions thereof, are described in U.S. Pat. No. 5,057,540 and Kensil, Crit. Rev. Ther. Drug Carrier Syst., 1996, 12:1-55; and EP 0 362 279 B1. Purified fractions of Quil A are also known as immunostimulants, such as QS21 and QS17; methods of their production are disclosed in U.S. Pat. No. 5,057,540 and EP 0 362 279 B1. Also described in these references is QS7 (a non-haemolytic fraction of Quil-A). Use of QS21 is further described in Kensil et al. (1991, J. Immunology, 146:431-437). Combinations of QS21 and polysorbate or cyclodextrin are also known (WO 99/10008). Particulate adjuvant systems comprising fractions of QuilA, such as QS21 and QS7, are described in WO 96/33739 and WO 96/11711.
[0389] Another adjuvant is an immunostimulatory oligonucleotide containing unmethylated CpG dinucleotides ("CpG") (Krieg, Nature 374:546 (1995)). CpG is an abbreviation for cytosine-guanosine dinucleotide motifs present in DNA. CpG is known as an adjuvant when administered by both systemic and mucosal routes (WO 96/02555, EP 468520, Davis et al, J. Immunol, 1998, 160:870-876; McCluskie and Davis, J. Immunol., 1998, 161:4463-6). CpG, when formulated into vaccines, may be administered in free solution together with free antigen (WO 96/02555) or covalently conjugated to an antigen (WO 98/16247), or formulated with a carrier such as aluminium hydroxide (Brazolot-Millan et al., Proc. Natl. Acad. Sci., USA, 1998, 95:15553-8).
[0390] Adjuvants such as those described above may be formulated together with carriers, such as liposomes, oil in water emulsions, and/or metallic salts (including aluminum salts such as aluminum hydroxide). For example, 3D-MPL may be formulated with aluminum hydroxide (EP 0 689 454) or oil in water emulsions (WO 95/17210); QS21 may be formulated with cholesterol containing liposomes (WO 96/33739), oil in water emulsions (WO 95/17210) or alum (WO 98/15287); CpG may be formulated with alum (Brazolot-Millan, supra) or with other cationic carriers.
[0391] Combinations of adjuvants may be utilized in the present invention, in particular a combination of a monophosphoryl lipid A and a saponin derivative (see, e.g., WO 94/00153; WO 95/17210; WO 96/33739; WO 98/56414; WO 99/12565; WO 99/11241), more particularly the combination of QS21 and 3D-MPL as disclosed in WO 94/00153, or a composition where the QS21 is quenched in cholesterol-containing liposomes (DQ) as disclosed in WO 96/33739. Alternatively, a combination of CpG plus a saponin such as QS21 is an adjuvant suitable for use in the present invention. A potent adjuvant formulation involving QS21, 3D-MPL & tocopherol in an oil in water emulsion is described in WO 95/17210 and is another formulation for use in the present invention. Saponin adjuvants may be formulated in a liposome and combined with an immunostimulatory oligonucleotide. Thus, suitable adjuvant systems include, for example, a combination of monophosphoryl lipid A, preferably 3D-MPL, together with an aluminium salt (e.g. as described in WO00/23105). A further exemplary adjuvant comprises QS21 and/or MPL and/or CpG. QS21 may be quenched in cholesterol-containing liposomes as disclosed in WO 96/33739.
[0392] Other suitable adjuvants include alkyl Glucosaminide phosphates (AGPs) such as those disclosed in WO9850399 or U.S. Pat. No. 6,303,347 (processes for preparation of AGPs are also disclosed), or pharmaceutically acceptable salts of AGPs as disclosed in U.S. Pat. No. 6,764,840. Some AGPs are TLR4 agonists, and some are TLR4 antagonists. Both are thought to be useful as adjuvants.
[0393] It has been found (WO 2007/062656, which published as US 2011/0293704 and is incorporated by reference for the purpose of disclosing invariant chain sequences) that the fusion of the invariant chain to an antigen which is comprised by an expression system used for vaccination increases the immune response against said antigen, if it is administered with an adenovirus. Accordingly, in one embodiment of the invention, the immunogenic transgene may be co-expressed with invariant chain in a recombinant ChAd155 viral vector.
[0394] In another embodiment, the invention provides the use of the capsid of ChAd155 (optionally an intact or recombinant viral particle or an empty capsid is used) to induce an immunomodulatory effect response, or to enhance or adjuvant a cytotoxic T cell response to another active agent by delivering a ChAd155 capsid to a subject. The ChAd155 capsid can be delivered alone or in a combination regimen with an active agent to enhance the immune response thereto. Advantageously, the desired effect can be accomplished without infecting the host with an adenovirus.
[0395] Administration Regimens
[0396] Commonly, the ChAd155 recombinant adenoviral vectors will be utilized for delivery of therapeutic or immunogenic molecules (such as proteins). It will be readily understood for both applications, that the recombinant adenoviral vectors of the invention are particularly well suited for use in regimens involving repeat delivery of recombinant adenoviral vectors. Such regimens typically involve delivery of a series of viral vectors in which the viral capsids are alternated. The viral capsids may be changed for each subsequent administration, or after a pre-selected number of administrations of a particular serotype capsid (e.g. one, two, three, four or more). Thus, a regimen may involve delivery of a recombinant adenovirus with a first capsid, delivery with a recombinant adenovirus with a second capsid, and delivery with a recombinant adenovirus with a third capsid. A variety of other regimens which use the adenovirus capsids of the invention alone, in combination with one another, or in combination with other adenoviruses (which are preferably immunologically non-crossreactive) will be apparent to those of skill in the art. Optionally, such a regimen may involve administration of recombinant adenovirus with capsids of other non-human primate adenoviruses, human adenoviruses, or artificial sequences such as are described herein.
[0397] The adenoviral vectors of the invention are particularly well suited for therapeutic regimens in which multiple adenoviral-mediated deliveries of transgenes are desired, e.g., in regimens involving redelivery of the same transgene or in combination regimens involving delivery of other transgenes. Such regimens may involve administration of a ChAd155 adenoviral vector, followed by re-administration with a vector from the same serotype adenovirus. Particularly desirable regimens involve administration of a ChAd155 adenoviral vector, in which the source of the adenoviral capsid sequences of the vector delivered in the first administration differs from the source of adenoviral capsid sequences of the viral vector utilized in one or more of the subsequent administrations. For example, a therapeutic regimen involves administration of a ChAd155 vector and repeat administration with one or more adenoviral vectors of the same or different serotypes.
[0398] In another example, a therapeutic regimen involves administration of an adenoviral vector followed by repeat administration with a ChAd155 vector which has a capsid which differs from the source of the capsid in the first delivered adenoviral vector, and optionally further administration with another vector which is the same or, preferably, differs from the source of the adenoviral capsid of the vector in the prior administration steps. These regimens are not limited to delivery of adenoviral vectors constructed using the ChAd155 sequences. Rather, these regimens can readily utilize other adenoviral sequences, including, without limitation, other adenoviral sequences including other non-human primate adenoviral sequences, or human adenoviral sequences, in combination with the ChAd155 vectors.
[0399] In a further example, a therapeutic regimen may involve either simultaneous (such as co-administration) or sequential (such as a prime-boost) delivery of (i) one or more ChAd155 adenoviral vectors and (ii) a further component such as non-adenoviral vectors, non-viral vectors, and/or a variety of other therapeutically useful compounds or molecules such as antigenic proteins optionally simultaneously administered with adjuvant. Examples of co-administration include homo-lateral co-administration and contra-lateral co-administration (further described below under `Delivery Methods and Dosage`).
[0400] Suitable non-adenoviral vectors for use in simultaneous or particularly in sequential delivery (such as prime-boost) with one or more ChAd155 adenoviral vectors include one or more poxviral vectors. Suitably, the poxviral vector belongs to the subfamily chordopoxvirinae, more suitably to a genus in said subfamily selected from the group consisting of orthopox, parapox, avipox (suitably canarypox (ALVAC) or fowlpox (FPV)) and molluscipox. Even more suitably, the poxviral vector belongs to the orthopox and is selected from the group consisting of vaccinia virus, NYVAC (derived from the Copenhagen strain of vaccinia), Modified Vaccinia Ankara (MVA), cowpoxvirus and monkeypox virus. Most suitably, the poxviral vector is MVA.
[0401] "Simultaneous" administration suitably refers to the same ongoing immune response. Preferably both components are administered at the same time (such as simultaneous administration of both DNA and protein); however, one component could be administered within a few minutes (for example, at the same medical appointment or doctor's visit) or within a few hours of the other. Such administration is also referred to as co-administration. In some embodiments, co-administration may refer to the administration of an adenoviral vector, an adjuvant and a protein component. In other embodiments, co-administration refers to the administration of an adenoviral vector and another viral vector, for example a second adenoviral vector or a poxvirus such as MVA. In other embodiments, co-administration refers to the administration of an adenoviral vector and a protein component, which is optionally adjuvanted.
[0402] A prime-boost regimen may be used. Prime-boost refers to two separate immune responses: (i) an initial priming of the immune system followed by (ii) a secondary or boosting of the immune system many weeks or months after the primary immune response has been established.
[0403] Such a regimen may involve the administration of a recombinant ChAd155 vector to prime the immune system to second, booster, administration with a traditional antigen, such as a protein (optionally co-administered with adjuvant), or a recombinant virus carrying the sequences encoding such an antigen (e.g., WO 00/11140). Alternatively, an immunization regimen may involve the administration of a recombinant ChAd155 vector to boost the immune response to a vector (either viral or DNA-based) encoding an antigen. In another alternative, an immunization regimen involves administration of a protein followed by booster with a recombinant ChAd155 vector encoding the antigen. In one example, the prime-boost regimen can provide a protective immune response to the virus, bacteria or other organism from which the antigen is derived. In another embodiment, the prime-boost regimen provides a therapeutic effect that can be measured using conventional assays for detection of the presence of the condition for which therapy is being administered.
[0404] Preferably, a boosting composition is administered about 2 to about 27 weeks after administering the priming composition to the subject. The administration of the boosting composition is accomplished using an effective amount of a boosting composition containing or capable of delivering the same antigen or a different antigen as administered by the priming vaccine. The boosting composition may be composed of a recombinant viral vector derived from the same viral source or from another source. Alternatively, the boosting composition can be a composition containing the same antigen as encoded in the priming vaccine, but in the form of a protein, which composition induces an immune response in the host. The primary requirements of the boosting composition are that the antigen of the composition is the same antigen, or a cross-reactive antigen, as that encoded by the priming composition.
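The preferred interval between prime and boost stated above can be expressed as a simple date calculation. The following is a minimal sketch only: it treats "about 2 to about 27 weeks" as hard bounds, which is an assumption made for illustration, and the function name `boost_window` and the example date are hypothetical.

```python
from datetime import date, timedelta


def boost_window(prime_date: date, min_weeks: int = 2, max_weeks: int = 27):
    """Return the (earliest, latest) dates for the boosting composition.

    Assumption: the 'about 2 to about 27 weeks' range is applied as
    exact bounds counted from the date of the priming composition.
    """
    earliest = prime_date + timedelta(weeks=min_weeks)
    latest = prime_date + timedelta(weeks=max_weeks)
    return earliest, latest


# Hypothetical priming date.
earliest, latest = boost_window(date(2021, 1, 1))
print(earliest.isoformat(), latest.isoformat())
```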
[0405] Delivery Methods and Dosage
[0406] The vector may be prepared for administration by being suspended or dissolved in a pharmaceutically or physiologically acceptable carrier such as isotonic saline; isotonic salts solution or other formulations that will be apparent to those skilled in the art. The appropriate carrier will be evident to those skilled in the art and will depend in large part upon the route of administration. The compositions described herein may be administered to a mammal in a sustained release formulation using a biodegradable biocompatible polymer, or by on-site delivery using micelles, gels and liposomes.
[0407] In some embodiments, the recombinant adenovirus of the invention is administered to a subject by intramuscular injection, intravaginal injection, intravenous injection, intraperitoneal injection, subcutaneous injection, epicutaneous administration, intradermal administration, nasal administration or oral administration. Of particular interest in the context of tuberculosis are intramuscular injection, subcutaneous injection, intradermal administration, nasal administration or aerosol administration, especially intramuscular injection, nasal administration or aerosol administration.
[0408] If the therapeutic regimen involves co-administration of one or more ChAd155 adenoviral vectors and a further component, each formulated in different compositions, they are favourably administered co-locationally at or near the same site. For example, the components can be administered (e.g. via an administration route selected from intramuscular, transdermal, intradermal, sub-cutaneous) to the same side or extremity ("co-lateral" administration) or to opposite sides or extremities ("contra-lateral" administration).
[0409] Dosages of the viral vector will depend primarily on factors such as the condition being treated, the age, weight and health of the patient, and may thus vary among patients. For example, a therapeutically effective adult human or veterinary dosage of the viral vector generally contains 1.times.10.sup.5 to 1.times.10.sup.15 viral particles, such as from 1.times.10.sup.8 to 1.times.10.sup.12 (e.g., 1.times.10.sup.8, 2.5.times.10.sup.8, 5.times.10.sup.8, 1.times.10.sup.9, 1.5.times.10.sup.9, 2.5.times.10.sup.9, 5.times.10.sup.9, 1.times.10.sup.10, 1.5.times.10.sup.10, 2.5.times.10.sup.10, 5.times.10.sup.10, 1.times.10.sup.11, 1.5.times.10.sup.11, 2.5.times.10.sup.11, 5.times.10.sup.11, 1.times.10.sup.12 particles). Alternatively, a viral vector can be administered at a dose that is typically from 1.times.10.sup.5 to 1.times.10.sup.10 plaque forming units (PFU), such as 1.times.10.sup.5 PFU, 2.5.times.10.sup.5 PFU, 5.times.10.sup.5 PFU, 1.times.10.sup.6 PFU, 2.5.times.10.sup.6 PFU, 5.times.10.sup.6 PFU, 1.times.10.sup.7 PFU, 2.5.times.10.sup.7 PFU, 5.times.10.sup.7 PFU, 1.times.10.sup.8 PFU, 2.5.times.10.sup.8 PFU, 5.times.10.sup.8 PFU, 1.times.10.sup.9 PFU, 2.5.times.10.sup.9 PFU, 5.times.10.sup.9 PFU, or 1.times.10.sup.10 PFU. Dosages will vary depending upon the size of the animal and the route of administration. For example, a suitable human or veterinary dosage (for about an 80 kg animal) for intramuscular injection is in the range of about 1.times.10.sup.9 to about 5.times.10.sup.12 particles per mL, for a single site. Optionally, multiple sites of administration may be used. In another example, a suitable human or veterinary dosage may be in the range of about 1.times.10.sup.11 to about 1.times.10.sup.15 particles for an oral formulation.
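The relationship between a particle concentration, an injected volume and the resulting particle dose can be sketched as follows. This is an illustrative calculation only, not a dosing recommendation: the function names are hypothetical, the default bounds simply restate the 1.times.10.sup.8 to 1.times.10.sup.12 particle range given above, and the example concentration and volume are assumed values.

```python
def total_particles(concentration_vp_per_ml: float, volume_ml: float) -> float:
    """Particle dose delivered by one injection (particles = vp/mL x mL)."""
    if concentration_vp_per_ml < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    return concentration_vp_per_ml * volume_ml


def dose_in_range(dose_vp: float, low: float = 1e8, high: float = 1e12) -> bool:
    """Check a dose against a range; defaults restate the range in the text."""
    return low <= dose_vp <= high


# Assumed example: 1e11 particles per mL injected in a 0.5 mL volume.
dose = total_particles(1e11, 0.5)
print(f"{dose:.1e} particles, within range: {dose_in_range(dose)}")
```

Where multiple sites of administration are used, the per-site results would be summed to give the total dose.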
[0410] The viral vector can be quantified by Quantitative PCR Analysis (Q-PCR), for example with primers and probe designed on CMV promoter region using as standard curve serial dilution of plasmid DNA containing the vector genome with expression cassette including HCMV promoter. The copy number in the test sample is determined by the parallel line analysis method. Alternative methods for vector particle quantification can be analytical HPLC or spectrophotometric method based on A.sub.260 nm.
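By way of illustration only, the standard-curve step of such a Q-PCR quantification can be sketched as follows. The sketch uses a plain least-squares fit of Ct against log10 copy number and interpolates the test sample from it; this is a simplification and is not the parallel line analysis method referred to above. All Ct values and copy numbers are invented for the example, and the function names are hypothetical.

```python
import math

def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate the copy number of a test sample."""
    return 10 ** ((ct - intercept) / slope)

# Hypothetical serial dilution of plasmid standard (copies per reaction)
# and the Ct measured for each dilution.
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.0, 26.7, 23.4, 20.1, 16.8]

slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Hypothetical Ct for a test sample.
sample_copies = copies_from_ct(21.75, slope, intercept)
```

The estimated copy number per reaction would then be scaled by the dilution factor and reaction volume to give a titer in vp/ml.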
[0411] An immunologically effective amount of a nucleic acid may suitably be between 1 ng and 100 mg. For example, a suitable amount can be from 1 .mu.g to 100 mg. An appropriate amount of the particular nucleic acid (e.g., vector) can readily be determined by those of skill in the art. Exemplary effective amounts of a nucleic acid component can be between 1 ng and 100 .mu.g, such as between 1 ng and 1 .mu.g (e.g., 100 ng-1 .mu.g), or between 1 .mu.g and 100 .mu.g, such as 10 ng, 50 ng, 100 ng, 150 ng, 200 ng, 250 ng, 500 ng, 750 ng, or 1 .mu.g. Effective amounts of a nucleic acid can also include from 1 .mu.g to 500 .mu.g, such as between 1 .mu.g and 200 .mu.g, such as between 10 and 100 .mu.g, for example 1 .mu.g, 2 .mu.g, 5 .mu.g, 10 .mu.g, 20 .mu.g, 50 .mu.g, 75 .mu.g, 100 .mu.g, 150 .mu.g, or 200 .mu.g. Alternatively, an exemplary effective amount of a nucleic acid can be between 100 .mu.g and 1 mg, such as from 100 .mu.g to 500 .mu.g, for example, 100 .mu.g, 150 .mu.g, 200 .mu.g, 250 .mu.g, 300 .mu.g, 400 .mu.g, 500 .mu.g, 600 .mu.g, 700 .mu.g, 800 .mu.g, 900 .mu.g or 1 mg.
[0412] Generally a human dose will be in a volume of between 0.1 ml and 2 ml. Thus the composition described herein can be formulated in a volume of, for example 0.1, 0.15, 0.2, 0.5, 1.0, 1.5 or 2.0 ml human dose per individual or combined immunogenic components.
[0413] One of skill in the art may adjust these doses, depending on the route of administration and the therapeutic or vaccine application for which the recombinant vector is employed. The levels of expression of the transgene, or for an adjuvant, the level of circulating antibody, can be monitored to determine the frequency of dosage administration.
[0414] If one or more priming and/or boosting steps are used, this step may include a single dose that is administered hourly, daily, weekly or monthly, or yearly. As an example, mammals may receive one or two doses containing between about 10 .mu.g to about 50 .mu.g of plasmid in carrier. The amount or site of delivery is desirably selected based upon the identity and condition of the mammal.
[0415] The therapeutic levels of, or level of immune response against, the protein encoded by the selected transgene can be monitored to determine the need, if any, for boosters. Following an assessment of CD8+ T cell response, or optionally, antibody titers, in the serum, optional booster immunizations may be desired. Optionally, the recombinant ChAd155 vectors may be delivered in a single administration or in various combination regimens, e.g., in combination with a regimen or course of treatment involving other active ingredients or in a prime-boost regimen.
[0416] The present invention will now be further described by means of the following non-limiting examples.
EXAMPLES
Example 1: Isolation of ChAd155
[0417] Wild type chimpanzee adenovirus type 155 (ChAd155) was isolated from a healthy young chimpanzee housed at the New Iberia Research Center facility (New Iberia Research Center; The University of Louisiana at Lafayette) using standard procedures as described in Colloca et al. (2012) and WO2010086189, which is hereby incorporated by reference for the purpose of describing adenoviral isolation and characterization techniques.
Example 2: ChAd155 Vector Construction
[0418] The ChAd155 viral genome was then cloned in a plasmid or in a BAC vector and subsequently modified (FIG. 2) to carry the following modifications in different regions of the ChAd155 viral genome:
[0419] a) deletion of the E1 region (from bp 449 to bp 3529) of the viral genome;
[0420] b) deletion of the E4 region (from bp 34731 to bp 37449) of the viral genome;
[0421] c) insertion of the E4orf6 derived from human Ad5.
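As a worked illustration of the coordinates listed above, the size of each deletion can be computed from its end points. The sketch assumes that the quoted base-pair ranges are 1-based and inclusive, which the text does not state explicitly; the helper name is hypothetical.

```python
def span_length(start_bp, end_bp):
    """Length of a coordinate range, assuming 1-based inclusive end points."""
    return end_bp - start_bp + 1

# Deletion coordinates as quoted for the ChAd155 wild type genome.
deletions = {
    "E1": (449, 3529),
    "E4": (34731, 37449),
}

deleted_bp = {name: span_length(*coords) for name, coords in deletions.items()}
total_deleted_bp = sum(deleted_bp.values())
```

Under that assumption the E1 deletion removes 3081 bp and the E4 deletion removes 2719 bp, for a total of 5800 bp before any insertion is taken into account.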
[0422] 2.1: Deletion of E1 Region: Construction of BAC/ChAd155 .DELTA.E1_TetO hCMV RpsL-Kana #1375
[0423] The ChAd155 viral genome was cloned into a BAC vector by homologous recombination in E. coli strain BJ5183 electroporation competent cells (Stratagene catalog no. 2000154) co-transformed with ChAd155 viral DNA and Subgroup C BAC Shuttle (#1365). As shown in the schematic of FIG. 3, the Subgroup C Shuttle is a BAC vector derived from pBeloBAC11 (GenBank U51113, NEB) and which is dedicated to the cloning of ChAd belonging to species C and therefore contains the pIX gene and DNA fragments derived from right and left ends (including right and left ITRs) of species C ChAd viruses.
[0424] The Species C BAC Shuttle also contains a RpsL-Kana cassette inserted between left end and the pIX gene. In addition, an Amp-LacZ-SacB selection cassette, flanked by ISceI restriction sites, is present between the pIX gene and right end of the viral genome. In particular, the BAC Shuttle comprised the following features: Left ITR: bp 27 to 139, hCMV(tetO) RpsL-Kana cassette: bp 493 to 3396, pIX gene: bp 3508 to 3972, ISceI restriction sites: bp 3990 and 7481, Amp-LacZ-SacB selection cassette: bp 4000 to 7471, Right ITR: bp 7805 to 7917.
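The feature coordinates listed above can be held in a simple map and checked for internal consistency. The following sketch is illustrative only: it assumes 1-based inclusive coordinates, and the variable names are hypothetical. It confirms that the two restriction sites lie immediately outside the Amp-LacZ-SacB cassette and that the listed features do not overlap.

```python
# Feature coordinates of the BAC Shuttle as quoted in the text.
features = {
    "Left ITR": (27, 139),
    "hCMV(tetO) RpsL-Kana cassette": (493, 3396),
    "pIX gene": (3508, 3972),
    "Amp-LacZ-SacB selection cassette": (4000, 7471),
    "Right ITR": (7805, 7917),
}
isce_sites = (3990, 7481)

# Length of every feature, assuming inclusive end points.
lengths = {name: end - start + 1 for name, (start, end) in features.items()}

# The selection cassette should sit between the two restriction sites.
cassette_start, cassette_end = features["Amp-LacZ-SacB selection cassette"]
cassette_is_flanked = isce_sites[0] < cassette_start and cassette_end < isce_sites[1]

# Consecutive features, sorted by start coordinate, should not overlap.
ordered = sorted(features.values())
no_overlap = all(
    prev_end < next_start
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
)
```

With these coordinates both ITRs come out at 113 bp, and digestion at the two flanking sites would release the selection cassette and leave the pIX gene and the right end exposed at the ends of the linearized shuttle.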
[0425] BJ5183 cells were co-transformed by electroporation with ChAd155 purified viral DNA and Subgroup C BAC Shuttle vector digested with ISceI restriction enzyme and then purified from gel. Homologous recombination occurring between pIX gene and right ITR sequences (present at the ends of Species C BAC Shuttle, linearized DNA) and homologous sequences present in ChAd155 viral DNA led to the insertion of ChAd155 viral genomic DNA in the BAC shuttle vector. At the same time, the viral E1 region was deleted and substituted by the RpsL-Kana cassette, generating BAC/ChAd155 .DELTA.E1/TetO hCMV RpsL-Kana #1375.
[0426] 2.2: Plasmid Construction by Homologous Recombination in E. coli BJ5183
[0427] 2.2.1: Deletion of E4 Region--Construction of pChAd155 .DELTA.E1, E4_Ad5E4orf6/TetO hCMV RpsL-Kana (#1434)
[0428] To improve propagation of the vector, a deletion of the E4 region spanning from nucleotide 34731-37449 (ChAd155 wild type sequence) was introduced in the vector backbone by replacing the native E4 region with Ad5 E4orf6 coding sequence using a strategy involving several steps of cloning and homologous recombination in E. coli. The E4 coding region was completely deleted while the E4 native promoter and polyadenylation signal were conserved. To this end, a shuttle vector was constructed to allow the insertion of Ad5orf6 by replacing the ChAd155 native E4 region by homologous recombination in E. coli BJ5183 as detailed below.
[0429] Construction of pARS SpeciesC Ad5E4orf6-1
[0430] A DNA fragment containing Ad5orf6 was obtained by PCR using Ad5 DNA as template, with the oligonucleotides 5'-ATACGGACTAGTGGAGAAGTACTCGCCTACATG-3' (SEQ ID NO: 13) and 5'-ATACGGAAGATCTAAGACTTCAGGAAATATGACTAC-3' (SEQ ID NO: 14). The PCR fragment was digested with BglII and SpeI and cloned into Species C RLD-EGFP shuttle digested with BglII and SpeI, generating the plasmid pARS Species C Ad5orf6-1. Details regarding the shuttle can be found in Colloca et al, Sci. Transl. Med. (2012) 4:115ra.
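The two oligonucleotides quoted above each carry one of the restriction sites used for cloning, which can be confirmed with a simple string search. The sketch below is illustrative; it relies on the standard recognition sequences ACTAGT (SpeI) and AGATCT (BglII), both of which are palindromic, so searching one strand is sufficient. The function and variable names are hypothetical.

```python
# Standard recognition sequences for the two enzymes used in this step.
RECOGNITION_SITES = {"SpeI": "ACTAGT", "BglII": "AGATCT"}

# Primer sequences as quoted in the text (5' to 3').
primers = {
    "SEQ ID NO: 13": "ATACGGACTAGTGGAGAAGTACTCGCCTACATG",
    "SEQ ID NO: 14": "ATACGGAAGATCTAAGACTTCAGGAAATATGACTAC",
}

def find_sites(sequence, sites=RECOGNITION_SITES):
    """Return {enzyme: 0-based position} for each site found in the sequence."""
    hits = {}
    for enzyme, site in sites.items():
        position = sequence.find(site)
        if position != -1:
            hits[enzyme] = position
    return hits

sites_by_primer = {name: find_sites(seq) for name, seq in primers.items()}
```

Each primer contains exactly one of the two sites close to its 5' end, preceded by a few extra bases, so that the PCR product can be cut at both ends for directional cloning.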
[0431] Construction of pARS SpeciesC Ad5E4orf6-2
[0432] To delete the E4 region, a 177 bp DNA fragment spanning bp 34586 to bp 34730 of the ChAd155 wt sequence (SEQ ID NO: 10) was amplified by PCR using the plasmid BAC/ChAd155 .DELTA.E1_TetO hCMV RpsL-Kana (#1375) as a template with the following oligonucleotides: 5'-ATTCAGTGTACAGGCGCGCCAAGCATGACGCTGTTGATTTGATTC-3' (SEQ ID NO: 15) and 5'-ACTAGGACTAGTTATAAGCTAGAATGGGGCTTTGC-3' (SEQ ID NO: 16). The PCR fragment was digested with BsrGI and SpeI and cloned into pARS SubGroupC Ad5orf6-1 digested with BsrGI and SpeI, generating the plasmid pARS SpeciesC Ad5orf6-2 (#1490). A schematic diagram of this shuttle plasmid is provided in FIG. 4. In particular, the shuttle plasmid comprised the following features: Left ITR: bp 1 to 113, Species C first 460 bp: bp 1 to 460, ChAd155 wt (bp 34587 to bp 34724 of SEQ ID NO:10): bp 516 to 650, Ad5 orf6: bp 680 to 1561, Species C last 393 bp: bp 1567 to 1969, Right ITR: bp 1857 to 1969.
[0433] Construction of pChAd155 .DELTA.E1, E4_Ad5E4orf6/TetO hCMV RpsL-Kana (#1434)
[0434] The resulting plasmid pARS SubGroupC Ad5orf6-2 was then used to replace the E4 region within the ChAd155 backbone with Ad5orf6. To this end the plasmid BAC/ChAd155 .DELTA.E1_TetO hCMV RpsL-Kana (#1375) was digested with PacI/PmeI and co-transformed into BJ5183 cells with the plasmid pARS SubGroupC Ad5orf6-2 digested with BsrGI/AscI, to obtain the pChAd155 .DELTA.E1, E4_Ad5E4orf6/TetO hCMV RpsL-Kana (#1434) pre-adeno plasmid.
[0435] 2.2.2: Insertion of RSV Expression Cassette--Construction of pChAd155 .DELTA.E1, E4_Ad5E4orf6/TetO hCMV RSV
[0436] An RSV cassette was cloned into a linearized pre-adeno acceptor vector via homologous recombination in E. coli by exploiting the homology existing between HCMV promoter and BGH polyA sequences. The plasmid pvjTetOhCMV-bghpolyA_RSV was cleaved with SfiI and SpeI to excise the 4.65 Kb fragment containing the HCMV promoter with tetO, RSV and BGHpolyA sequence. The resulting RSV 4.65 Kb fragment was cloned by homologous recombination into the pChAd155 .DELTA.E1, E4_Ad5E4orf6/TetO hCMV RpsL-Kana (#1434) acceptor vector carrying the RpsL-Kana selection cassette under control of HCMV and BGHpA. The acceptor pre-adeno plasmid was linearized with the restriction endonuclease SnaBI. The resulting construct was the pChAd155 .DELTA.E1, E4_Ad5E4orf6/TetO hCMV RSV vector (FIG. 5).
[0437] 2.3: BAC Vector Construction by Recombineering
[0438] 2.3.1: Deletion of E4 Region--Construction of BAC/ChAd155 .DELTA.E1, E4_Ad5E4orf6/TetO hCMV RpsL-Kana #1390
[0439] A deletion of the E4 region spanning from nucleotide 34731-37449 of the ChAd155 wt sequence was introduced in the vector backbone by replacing this native E4 region with the Ad5 E4orf6 coding sequence using a strategy involving two different steps of recombineering in E. coli SW102 competent cells.
[0440] The first step resulted in insertion of a selection cassette including the suicide gene SacB, ampicillin--R gene and lacZ (Amp-LacZ-SacB selection cassette) in the E4 region of ChAd155, for the purpose of positive/negative selection of recombinants.
[0441] First step--Substitution of ChAd155 Native E4 Region with Amp-LacZ-SacB Selection Cassette
[0442] The Amp-LacZ-SacB selection cassette was amplified by PCR using the oligonucleotides provided below containing E4 flanking sequences to allow homologous recombination: 1021-FW E4 Del Step1 (5'-TTAATAGACACAGTAGCTTAATAGACCCAGTAGTGCAAAGCCCCATTCTAGCTTATAACCCCTATTTGTTTATTTTTCT-3') (SEQ ID NO: 17) and 1022-RW E4 Del Step1 (5'-ATATATACTCTCTCGGCACTTGGCCTTTTACACTGCGAAGTGTTGGTGCTGGTGCTGCGTTGAGAGATCTTTATTTGTTAACTGTTAATTGTC-3') (SEQ ID NO: 18).
[0443] The PCR product was used to transform E. coli SW102 competent cells containing the pAdeno plasmid BAC/ChAd155 (DE1) tetO hCMV--RpsLKana #1375. The transformation of SW102 cells allowed the insertion of the selection cassette in the E4 region of ChAd155 via lambda (.lamda.) Red-mediated homologous recombination, thus obtaining BAC/ChAd155 (DE1) TetOhCMV--RpsL Kana #1379 (including Amp-LacZ-SacB cassette by substituting ChAd155 native E4 region).
[0444] Second step--Substitution of Amp-lacZ-SacB Selection Cassette with Ad5E4orf6 Region
[0445] The resulting plasmid BAC/ChAd155 (DE1) TetOhCMV--RpsL Kana #1379 (with Amp-LacZ-SacB cassette in place of ChAd155 E4 region) was then manipulated to replace the Amp-lacZ-SacB selection cassette with Ad5orf6 within the ChAd155 backbone. To this end, a DNA fragment containing the Ad5orf6 region was obtained by PCR, using the oligonucleotides 1025-FW E4 Del Step2 (5'-TTAATAGACACAGTAGCTTAATA-3') (SEQ ID NO: 19) and 1026-RW E4 Del Step2 (5'-GGAAGGGAGTGTCTAGTGTT-3') (SEQ ID NO: 20). The resulting DNA fragment was introduced into E. coli SW102 competent cells containing the pAdeno plasmid BAC/ChAd155 (DE1) TetOhCMV--RpsL Kana #1379, resulting in a final plasmid BAC/ChAd155 (.DELTA.E1, E4 Ad5E4orf6) TetOhCMV--RpsL Kana #1390 containing Ad5orf6 substituting the native ChAd155 E4 region.
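A comparison of the forward primers of the two recombineering steps shows that the Step2 forward primer (SEQ ID NO: 19) is identical to the 5' end of the Step1 forward primer (SEQ ID NO: 17). The sketch below simply checks this with a prefix comparison. The interpretation that both primers therefore target the same left-hand E4 junction is an inference and is not stated in the text; the variable names are hypothetical.

```python
# Step1 forward primer (SEQ ID NO: 17), written without internal spaces.
seq17_1021_fw = (
    "TTAATAGACACAGTAGCTTAATAGACCCAGTAGTGCAAAGCCCCATTCTAGCTTATAA"
    "CCCCTATTTGTTTATTTTTCT"
)
# Step2 forward primer (SEQ ID NO: 19).
seq19_1025_fw = "TTAATAGACACAGTAGCTTAATA"

# True when the Step2 primer matches the 5' end of the Step1 primer.
shares_5prime_end = seq17_1021_fw.startswith(seq19_1025_fw)
shared_length = len(seq19_1025_fw)
```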
[0446] 2.3.2: Insertion of RSV Expression Cassette: Construction of BAC/ChAd155 .DELTA.E1, E4_Ad5E4orf6/TetOhCMV RSV #1393
[0447] An RSV transgene was cloned into the BAC/ChAd155 .DELTA.E1, E4_Ad5E4orf6/TetOhCMV RSV #1393 vector by substituting the RpsL-Kana selection cassette. The construction strategy was based on two different steps of recombineering in E. coli SW102 competent cells.
[0448] First Step--Substitution of RpsL-Kana Cassette with Amp-LacZ-SacB Selection Cassette
[0449] The Amp-LacZ-SacB selection cassette was obtained from plasmid BAC/ChAd155 (DE1) TetO hCMV Amp-LacZ-SacB #1342 by PCR using the oligonucleotides 91-SubMonte FW (5'-CAATGGGCGTGGATAGCGGTTTGAC-3') (SEQ ID NO: 21) and 890-BghPolyA RW (5'-CAGCATGCCTGCTATTGTC-3') (SEQ ID NO: 22). The product was transformed into E. coli SW102 competent cells containing the pAdeno plasmid BAC/ChAd155 (DE1, E4 Ad5E4orf6) TetOhCMV--RpsL Kana #1390, resulting in BAC/ChAd155 (DE1, E4 Ad5E4orf6) TetOhCMV--Amp-LacZ-SacB #1386.
[0450] Second Step--Substitution of Amp-lacZ-SacB Selection Cassette with RSV Transgene
[0451] The RSV transgene was inserted in plasmid BAC/ChAd155 (DE1, E4 Ad5E4orf6) TetOhCMV--Amp-LacZ-SacB #1386 by replacing the Amp-lacZ-SacB selection cassette by homologous recombination. To this end, the plasmid pvjTetOhCMV-bghpolyA_RSV #1080 (containing an RSV expression cassette) was cleaved with SpeI and SfiI to excise the 4.4 Kb fragment including the HCMV promoter, RSV and BGHpolyA. The resulting RSV 4.4 Kb fragment was transformed into E. coli SW102 competent cells containing the pAdeno plasmid BAC/ChAd155 (DE1, E4 Ad5E4orf6) TetOhCMV--Amp-LacZ-SacB #1386, resulting in the final plasmid BAC/ChAd155 .DELTA.E1, E4_Ad5E4orf6/TetOhCMV RSV #1393. The structure of the BAC carrying ChAd155/RSV (SEQ ID NO: 11) is illustrated in FIG. 6. In particular, ChAd155/RSV comprised the following features: Species C Left ITR: bp 1 to 113, hCMV(tetO): bp 467 to 1311, RSV gene: bp 1348 to 4785, bghpolyA: bp 4815 to 5032, Ad5E4orf6: bp 36270 to 37151, Species C Right ITR: bp 37447 to 37559.
Example 3: Vector Production
[0452] The productivity of ChAd155 was evaluated in comparison to ChAd3 and PanAd3 in the Procell 92 cell line.
[0453] 3.1: Production of Vectors Comprising an HIV Gag Transgene
[0454] Vectors expressing the HIV Gag protein were prepared as described above (ChAd155/GAG) or previously (ChAd3/GAG Colloca et al, Sci. Transl. Med. (2012) 4:115ra). ChAd3/GAG and ChAd155/GAG were rescued and amplified in Procell 92 up to passage 3 (P3); P3 lysates were used to infect 2 T75 flasks of Procell 92 cultivated in monolayer with each vector. A multiplicity of infection (MOI) of 100 vp/cell was used for both infection experiments. The infected cells were harvested when full CPE was evident (72 hours post-infection) and pooled; the viruses were released from the infected cells by 3 cycles of freeze/thaw (-70.degree./37.degree. C.) then the lysate was clarified by centrifugation. The clarified lysates were quantified by Quantitative PCR Analysis with primers and probe complementary to the CMV promoter region. The oligonucleotide sequences are the following: CMV for 5'-CATCTACGTATTAGTCATCGCTATTACCA-3' (SEQ ID NO: 23), CMV rev 5'-GACTTGGAAATCCCCGTGAGT-3' (SEQ ID NO: 24), CMV FAM-TAMRA probe 5'-ACATCAATGGGCGTGGATAGCGGTT-3' (SEQ ID NO: 25) (QPCRs were run on ABI Prism 7900 Sequence detector--Applied Biosystems). The resulting volumetric titers (vp/ml) measured on clarified lysates and the specific productivity expressed in virus particles per cell (vp/cell) are provided in Table 1 below and illustrated in FIG. 7.
TABLE 1 Vector productivity from P3 lysates.
Vector         vp/ml       Total vp (20 ml conc.)   vp/cell
ChAd3/GAG      9.82E+09    1.96E+11                 6.61E+03
ChAd155/GAG    1.11E+10    2.22E+11                 7.46E+03
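The columns of Table 1 are related by simple arithmetic, which the following sketch reproduces. It assumes that the total vp column is the volumetric titer multiplied by the 20 ml lysate volume, and it back-calculates the number of cells implied by the vp/cell column; that cell number is not stated in the text and is therefore only an inferred figure. The variable names are hypothetical.

```python
LYSATE_VOLUME_ML = 20  # lysate volume given in the Table 1 column header

# Values as reported in Table 1.
titer_vp_per_ml = {"ChAd3/GAG": 9.82e9, "ChAd155/GAG": 1.11e10}
reported_vp_per_cell = {"ChAd3/GAG": 6.61e3, "ChAd155/GAG": 7.46e3}

# Total particles = volumetric titer x lysate volume.
total_vp = {v: t * LYSATE_VOLUME_ML for v, t in titer_vp_per_ml.items()}

# Cell number implied by the reported specific productivity (inferred).
implied_cells = {v: total_vp[v] / reported_vp_per_cell[v] for v in total_vp}

# Relative specific productivity of ChAd155/GAG versus ChAd3/GAG.
fold_vp_per_cell = (
    reported_vp_per_cell["ChAd155/GAG"] / reported_vp_per_cell["ChAd3/GAG"]
)
```

Both vectors give an implied harvest of roughly 3.times.10.sup.7 cells, and the specific productivity of ChAd155/GAG is about 1.13 times that of ChAd3/GAG in this experiment.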
[0455] To confirm the higher productivity of the ChAd155 vector expressing HIV Gag transgene, a second experiment was performed by using purified viruses as inoculum. To this end, Procell 92 cells were seeded in a T25 Flask and infected with ChAd3/GAG and ChAd155/GAG when the confluence of the cells was about 80%, using a MOI=100 vp/cell of infection. The infected cells were harvested when full CPE was evident; the viruses were released from the infected cells by freeze/thaw and clarified by centrifugation. The clarified lysates were quantified by Quantitative PCR Analysis by using following primers and probe: CMV for 5'-CATCTACGTATTAGTCATCGCTATTACCA-3' (SEQ ID NO: 23), CMV rev 5'-GACTTGGAAATCCCCGTGAGT-3' (SEQ ID NO: 24), CMV FAM-TAMRA probe 5'-ACATCAATGGGCGTGGATAGCGGTT-3' (SEQ ID NO: 25) complementary to the CMV promoter region (samples were analysed on an ABI Prism 7900 Sequence detector--Applied Biosystems). The resulting volumetric titers (vp/ml) measured on clarified lysates and the specific productivity expressed in virus particles per cell (vp/cell) are provided in Table 2 below and illustrated in FIG. 8.
TABLE 2 Vector productivity from purified viruses.
Vector         vp/ml       Total vp/T25 flask (5 ml of lysate)   vp/cell
ChAd3/GAG      1.00E+10    5.00E+10                              1.67E+04
ChAd155/GAG    1.21E+10    6.05E+10                              2.02E+04
[0456] 3.2: Production of Vectors Comprising an RSV Transgene
[0457] A different set of experiments was performed to evaluate the productivity of RSV vaccine vectors in Procell 92.S cultivated in suspension. The experiment compared PanAd3/RSV (described in WO2012/089833) and ChAd155/RSV in parallel by infecting Procell 92.S at a cell density of 5.times.10.sup.5 cells/ml. The infected cells were harvested 3 days post infection; the virus was released from the infected cells by 3 cycles of freeze/thaw and the lysate was clarified by centrifugation. The clarified lysates were then quantified by Quantitative PCR Analysis as reported above. The volumetric productivity and the cell specific productivity are provided in Table 3 below and illustrated in FIG. 9.
TABLE 3
Virus          Volumetric productivity (vp/ml)   Total vp    Cell specific productivity (vp/cell)
PanAd3/RSV     5.82E+09                          2.91E+11    1.16E+04
ChAd155/RSV    3.16E+10                          1.58E+12    6.31E+04
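For the suspension experiment, the cell specific productivity in Table 3 appears to follow from dividing the volumetric productivity by the cell density at infection (5.times.10.sup.5 cells/ml). The sketch below works through that arithmetic. The relationship is an inference from the reported numbers rather than a stated method, and the culture volume is back-calculated, not given in the text.

```python
CELL_DENSITY_PER_ML = 5e5  # cell density at infection, as stated in the text

# Values as reported in Table 3.
titer_vp_per_ml = {"PanAd3/RSV": 5.82e9, "ChAd155/RSV": 3.16e10}
reported_total_vp = {"PanAd3/RSV": 2.91e11, "ChAd155/RSV": 1.58e12}

# Specific productivity = particles per ml / cells per ml (assumed relation).
vp_per_cell = {v: t / CELL_DENSITY_PER_ML for v, t in titer_vp_per_ml.items()}

# Culture volume implied by total vp / titer (inferred, not stated).
implied_volume_ml = {
    v: reported_total_vp[v] / titer_vp_per_ml[v] for v in titer_vp_per_ml
}

# Relative volumetric productivity of ChAd155/RSV versus PanAd3/RSV.
fold_titer = titer_vp_per_ml["ChAd155/RSV"] / titer_vp_per_ml["PanAd3/RSV"]
```

The computed values (about 1.16.times.10.sup.4 and 6.32.times.10.sup.4 vp/cell) agree with the tabulated ones to within rounding, the implied culture volume is 50 ml for both vectors, and the ChAd155/RSV titer is roughly 5.4 times that of PanAd3/RSV.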
Example 4: Transgene Expression Levels
[0458] 4.1: Expression Level of HIV Gag Transgene
[0459] Expression levels were compared in parallel experiments by infecting HeLa cells with ChAd3 and ChAd155 vectors comprising an HIV Gag transgene. HeLa cells were seeded in 24 well plates and infected in duplicate with ChAd3/GAG and ChAd155/GAG purified viruses using a MOI=250 vp/cell. The supernatants of HeLa infected cells were harvested 48 hours post-infection, and the production of secreted HIV GAG protein was quantified by using a commercial ELISA Kit (HIV-1 p24 ELISA Kit, PerkinElmer Life Science). The quantification was performed according to the manufacturer's instruction by using an HIV-1 p24 antigen standard curve. The results, expressed in pg/ml of GAG protein, are illustrated in FIG. 10.
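As an illustration of how an infection at a fixed MOI is set up, the volume of purified virus stock to add per well follows from the MOI, the number of cells in the well and the stock titer. In the sketch below only the MOI of 250 vp/cell comes from the text; the cell number per well and the stock titer are invented for the example, and the function name is hypothetical.

```python
def inoculum_volume_ul(moi_vp_per_cell, cells_per_well, stock_titer_vp_per_ml):
    """Volume of virus stock (in microlitres) needed to reach the target MOI."""
    particles_needed = moi_vp_per_cell * cells_per_well
    volume_ml = particles_needed / stock_titer_vp_per_ml
    return volume_ml * 1000.0

# MOI from the text; cell number and stock titer are hypothetical.
volume_ul = inoculum_volume_ul(
    moi_vp_per_cell=250,
    cells_per_well=1e5,
    stock_titer_vp_per_ml=1e10,
)
```

With these example inputs, 2.5.times.10.sup.7 particles are required per well, corresponding to 2.5 .mu.l of stock; in practice the stock would typically be pre-diluted so that a conveniently pipettable volume is added.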
[0460] 4.2: Expression Level of RSV F Transgene
[0461] Expression levels were compared in parallel experiments by infecting HeLa cells with the above-described PanAd3 and ChAd155 vectors comprising an RSV F transgene. To this end, HeLa cells were seeded in 6 well plates and infected in duplicate with PanAd3/RSV and ChAd155/RSV purified viruses using a MOI=500 vp/cell. The supernatants were harvested 48 hours post-infection, and the production of secreted RSV F protein was quantified by ELISA. Five different dilutions of the supernatants were transferred to microplate wells coated with a commercial mouse anti-RSV F monoclonal antibody. The captured antigen was revealed using a secondary anti-RSV F rabbit antiserum followed by Biotin-conjugated anti-rabbit IgG, then by adding Streptavidin-AP conjugate (BD Pharmingen cat. 554065). The quantification was performed by using an RSV F protein (Sino Biological cat. 11049-V08B) standard curve. The results obtained, expressed as .mu.g/ml of RSV F protein, are provided in Table 4 below.
TABLE 4
Sample         .mu.g/ml RSV F protein
ChAd155/RSV    5.9
PanAd3/RSV     4
[0462] A western blot analysis was also performed to confirm the higher level of transgene expression provided by the ChAd155 RSV vector relative to the PanAd3 RSV vector. HeLa cells plated in 6 well plates were infected with PanAd3/RSV and ChAd155/RSV purified viruses using MOI=250 and 500 vp/cell. The supernatants of HeLa infected cells were harvested and the production of secreted RSV F protein was analysed by non-reducing SDS gel electrophoresis followed by Western Blot analysis. Equivalent quantities of supernatants were loaded on a non-reducing SDS gel; after electrophoretic separation, the proteins were transferred to a nitrocellulose membrane to be probed with an anti-RSV F mouse monoclonal antibody (clone RSV-F-3, catalog no: ABIN308230, available at antibodies-online.com (last accessed 13 Apr. 2015)). After the incubation with primary antibody, the membrane was washed and then incubated with anti-mouse HRP conjugate secondary antibody. Finally, the assay was developed by electrochemiluminescence using standard techniques (ECL detection reagents Pierce catalog no W3252282). The Western Blot results are shown in FIG. 11. A band of about 170 kD indicated by the arrow was revealed by monoclonal antibody mAb 13 raised against the F protein, which corresponds to the expected weight of trimeric F protein. It can be seen that the ChAd155 RSV vector produced a darker band at both MOI=250 and 500 vp/cell.
Example 5: Evaluation of Immunological Potency by Mouse Immunization Experiments
[0463] 5.1: Immunogenicity of Vectors Comprising the HIV Gag Transgene
[0464] The immunogenicity of the ChAd155/GAG vector was evaluated in parallel with the ChAd3/GAG vector in BALB/c mice (5 per group). The experiment was performed by injecting 10.sup.6 viral particles intramuscularly. T-cell response was measured 3 weeks after the immunization by ex vivo IFN-gamma enzyme-linked immunospot (ELISpot) using a GAG CD8+ T cell epitope mapped in BALB/c mice. The results are shown in FIG. 12, expressed as IFN-gamma Spot Forming Cells (SFC) per million of splenocytes. Each dot represents the response in a single mouse, and the line corresponds to the mean for each dose group. Injected dose in number of virus particles and frequency of positive mice to the CD8 immunodominant peptide are shown on the x axis.
[0465] 5.2 Immunogenicity of Vectors Comprising the RSV Transgene
[0466] The immunological potency of the PanAd3/RSV and ChAd155/RSV vectors was evaluated in BALB/c mice. Both vectors were injected intramuscularly at doses of 10.sup.8, 10.sup.7 and 3.times.10.sup.6 vp. Three weeks after vaccination the splenocytes of immunized mice were isolated and analyzed by IFN-gamma-ELISpot using as antigens immunodominant peptide F and M epitopes mapped in BALB/c mice. The levels of immune-responses were reduced in line with decreasing dosage (as expected) but immune responses were clearly higher in the groups of mice immunized with ChAd155/RSV vector compared to the equivalent groups of mice immunized with PanAd3/RSV vaccine (FIG. 13). In FIG. 13, symbols show individual mouse data, expressed as IFN-gamma Spot Forming Cells (SFC)/million splenocytes, calculated as the sum of responses to the three immunodominant epitopes (F.sub.51-66, F.sub.85-93 and M2-1.sub.282-290) and corrected for background. Horizontal lines represent the mean number of IFN-gamma SFC/million splenocytes for each dose group.
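The ELISpot read-out described above (sum of the responses to the three epitopes, expressed per million splenocytes and corrected for background) can be illustrated with the following sketch. The spot counts and the number of splenocytes per well are invented, the function names are hypothetical, and the choice to floor negative background-corrected values at zero is an assumption that is not taken from the text.

```python
def sfc_per_million(spots, cells_per_well):
    """Scale a raw spot count to spot forming cells per million splenocytes."""
    return spots * 1_000_000 / cells_per_well

def summed_response(epitope_spots, background_spots, cells_per_well):
    """Background-correct each epitope response and sum across epitopes."""
    background = sfc_per_million(background_spots, cells_per_well)
    corrected = {
        epitope: max(sfc_per_million(spots, cells_per_well) - background, 0.0)
        for epitope, spots in epitope_spots.items()
    }
    return sum(corrected.values()), corrected

# Hypothetical raw spot counts for one mouse (per well).
spots = {"F 51-66": 42, "F 85-93": 120, "M2-1 282-290": 15}

total_sfc, per_epitope = summed_response(
    spots, background_spots=3, cells_per_well=2e5
)
```

For this invented mouse the background-corrected responses are 195, 585 and 60 SFC/million splenocytes, giving a summed response of 840 SFC/million splenocytes, which would be plotted as one symbol in a figure of the kind described.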
[0467] Conclusion
[0468] Taken together, the results reported above demonstrated that ChAd155 is an improved adenoviral vector in comparison to ChAd3 and PanAd3 vectors. ChAd155 was shown to be more productive, therefore facilitating the manufacturing process, able to express higher levels of transgene in vitro, and also able in vivo to provide a stronger T-cell response against the expressed antigens in animal models.
Sequence CWU
1
1
701578PRTSimian adenovirus 1Met Lys Arg Thr Lys Thr Ser Asp Glu Ser Phe
Asn Pro Val Tyr Pro1 5 10
15Tyr Asp Thr Glu Ser Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro
20 25 30Phe Val Ser Pro Asp Gly Phe
Gln Glu Ser Pro Pro Gly Val Leu Ser 35 40
45Leu Asn Leu Ala Glu Pro Leu Val Thr Ser His Gly Met Leu Ala
Leu 50 55 60Lys Met Gly Ser Gly Leu
Ser Leu Asp Asp Ala Gly Asn Leu Thr Ser65 70
75 80Gln Asp Ile Thr Thr Ala Ser Pro Pro Leu Lys
Lys Thr Lys Thr Asn 85 90
95Leu Ser Leu Glu Thr Ser Ser Pro Leu Thr Val Ser Thr Ser Gly Ala
100 105 110Leu Thr Val Ala Ala Ala
Ala Pro Leu Ala Val Ala Gly Thr Ser Leu 115 120
125Thr Met Gln Ser Glu Ala Pro Leu Thr Val Gln Asp Ala Lys
Leu Thr 130 135 140Leu Ala Thr Lys Gly
Pro Leu Thr Val Ser Glu Gly Lys Leu Ala Leu145 150
155 160Gln Thr Ser Ala Pro Leu Thr Ala Ala Asp
Ser Ser Thr Leu Thr Val 165 170
175Ser Ala Thr Pro Pro Leu Ser Thr Ser Asn Gly Ser Leu Gly Ile Asp
180 185 190Met Gln Ala Pro Ile
Tyr Thr Thr Asn Gly Lys Leu Gly Leu Asn Phe 195
200 205Gly Ala Pro Leu His Val Val Asp Ser Leu Asn Ala
Leu Thr Val Val 210 215 220Thr Gly Gln
Gly Leu Thr Ile Asn Gly Thr Ala Leu Gln Thr Arg Val225
230 235 240Ser Gly Ala Leu Asn Tyr Asp
Thr Ser Gly Asn Leu Glu Leu Arg Ala 245
250 255Ala Gly Gly Met Arg Val Asp Ala Asn Gly Gln Leu
Ile Leu Asp Val 260 265 270Ala
Tyr Pro Phe Asp Ala Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln 275
280 285Gly Pro Leu Phe Val Asn Ser Ala His
Asn Leu Asp Val Asn Tyr Asn 290 295
300Arg Gly Leu Tyr Leu Phe Thr Ser Gly Asn Thr Lys Lys Leu Glu Val305
310 315 320Asn Ile Lys Thr
Ala Lys Gly Leu Ile Tyr Asp Asp Thr Ala Ile Ala 325
330 335Ile Asn Ala Gly Asp Gly Leu Gln Phe Asp
Ser Gly Ser Asp Thr Asn 340 345
350Pro Leu Lys Thr Lys Leu Gly Leu Gly Leu Asp Tyr Asp Ser Ser Arg
355 360 365Ala Ile Ile Ala Lys Leu Gly
Thr Gly Leu Ser Phe Asp Asn Thr Gly 370 375
380Ala Ile Thr Val Gly Asn Lys Asn Asp Asp Lys Leu Thr Leu Trp
Thr385 390 395 400Thr Pro
Asp Pro Ser Pro Asn Cys Arg Ile Tyr Ser Glu Lys Asp Ala
405 410 415Lys Phe Thr Leu Val Leu Thr
Lys Cys Gly Ser Gln Val Leu Ala Ser 420 425
430Val Ser Val Leu Ser Val Lys Gly Ser Leu Ala Pro Ile Ser
Gly Thr 435 440 445Val Thr Ser Ala
Gln Ile Val Leu Arg Phe Asp Glu Asn Gly Val Leu 450
455 460Leu Ser Asn Ser Ser Leu Asp Pro Gln Tyr Trp Asn
Tyr Arg Lys Gly465 470 475
480Asp Leu Thr Glu Gly Thr Ala Tyr Thr Asn Ala Val Gly Phe Met Pro
485 490 495Asn Leu Thr Ala Tyr
Pro Lys Thr Gln Ser Gln Thr Ala Lys Ser Asn 500
505 510Ile Val Ser Gln Val Tyr Leu Asn Gly Asp Lys Ser
Lys Pro Met Thr 515 520 525Leu Thr
Ile Thr Leu Asn Gly Thr Asn Glu Thr Gly Asp Ala Thr Val 530
535 540Ser Thr Tyr Ser Met Ser Phe Ser Trp Asn Trp
Asn Gly Ser Asn Tyr545 550 555
560Ile Asn Glu Thr Phe Gln Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala
565 570 575Gln
Glu21734DNASimian adenovirus 2atgaagcgca ccaaaacgtc tgacgagagc ttcaaccccg
tgtaccccta tgacacggaa 60agcggccctc cctccgtccc tttcctcacc cctcccttcg
tgtctcccga tggattccaa 120gaaagtcccc ccggggtcct gtctctgaac ctggccgagc
ccctggtcac ttcccacggc 180atgctcgccc tgaaaatggg aagtggcctc tccctggacg
acgctggcaa cctcacctct 240caagatatca ccaccgctag ccctcccctc aaaaaaacca
agaccaacct cagcctagaa 300acctcatccc ccctaactgt gagcacctca ggcgccctca
ccgtagcagc cgccgctccc 360ctggcggtgg ccggcacctc cctcaccatg caatcagagg
cccccctgac agtacaggat 420gcaaaactca ccctggccac caaaggcccc ctgaccgtgt
ctgaaggcaa actggccttg 480caaacatcgg ccccgctgac ggccgctgac agcagcaccc
tcacagtcag tgccacacca 540ccccttagca caagcaatgg cagcttgggt attgacatgc
aagcccccat ttacaccacc 600aatggaaaac taggacttaa ctttggcgct cccctgcatg
tggtagacag cctaaatgca 660ctgactgtag ttactggcca aggtcttacg ataaacggaa
cagccctaca aactagagtc 720tcaggtgccc tcaactatga cacatcagga aacctagaat
tgagagctgc agggggtatg 780cgagttgatg caaatggtca acttatcctt gatgtagctt
acccatttga tgcacaaaac 840aatctcagcc ttaggcttgg acagggaccc ctgtttgtta
actctgccca caacttggat 900gttaactaca acagaggcct ctacctgttc acatctggaa
ataccaaaaa gctagaagtt 960aatatcaaaa cagccaaggg tctcatttat gatgacactg
ctatagcaat caatgcgggt 1020gatgggctac agtttgactc aggctcagat acaaatccat
taaaaactaa acttggatta 1080ggactggatt atgactccag cagagccata attgctaaac
tgggaactgg cctaagcttt 1140gacaacacag gtgccatcac agtaggcaac aaaaatgatg
acaagcttac cttgtggacc 1200acaccagacc catcccctaa ctgtagaatc tattcagaga
aagatgctaa attcacactt 1260gttttgacta aatgcggcag tcaggtgttg gccagcgttt
ctgttttatc tgtaaaaggt 1320agccttgcgc ccatcagtgg cacagtaact agtgctcaga
ttgtcctcag atttgatgaa 1380aatggagttc tactaagcaa ttcttccctt gaccctcaat
actggaacta cagaaaaggt 1440gaccttacag agggcactgc atataccaac gcagtgggat
ttatgcccaa cctcacagca 1500tacccaaaaa cacagagcca aactgctaaa agcaacattg
taagtcaggt ttacttgaat 1560ggggacaaat ccaaacccat gaccctcacc attaccctca
atggaactaa tgaaacagga 1620gatgccacag taagcactta ctccatgtca ttctcatgga
actggaatgg aagtaattac 1680attaatgaaa cgttccaaac caactccttc accttctcct
acatcgccca agaa 17343593PRTSimian adenovirus 3Met Arg Arg Ala Ala
Met Tyr Gln Glu Gly Pro Pro Pro Ser Tyr Glu1 5
10 15Ser Val Val Gly Ala Ala Ala Ala Ala Pro Ser
Ser Pro Phe Ala Ser 20 25
30Gln Leu Leu Glu Pro Pro Tyr Val Pro Pro Arg Tyr Leu Arg Pro Thr
35 40 45Gly Gly Arg Asn Ser Ile Arg Tyr
Ser Glu Leu Ala Pro Leu Phe Asp 50 55
60Thr Thr Arg Val Tyr Leu Val Asp Asn Lys Ser Ala Asp Val Ala Ser65
70 75 80Leu Asn Tyr Gln Asn
Asp His Ser Asn Phe Leu Thr Thr Val Ile Gln 85
90 95Asn Asn Asp Tyr Ser Pro Ser Glu Ala Ser Thr
Gln Thr Ile Asn Leu 100 105
110Asp Asp Arg Ser His Trp Gly Gly Asp Leu Lys Thr Ile Leu His Thr
115 120 125Asn Met Pro Asn Val Asn Glu
Phe Met Phe Thr Asn Lys Phe Lys Ala 130 135
140Arg Val Met Val Ser Arg Ser His Thr Lys Glu Asp Arg Val Glu
Leu145 150 155 160Lys Tyr
Glu Trp Val Glu Phe Glu Leu Pro Glu Gly Asn Tyr Ser Glu
165 170 175Thr Met Thr Ile Asp Leu Met
Asn Asn Ala Ile Val Glu His Tyr Leu 180 185
190Lys Val Gly Arg Gln Asn Gly Val Leu Glu Ser Asp Ile Gly
Val Lys 195 200 205Phe Asp Thr Arg
Asn Phe Arg Leu Gly Leu Asp Pro Val Thr Gly Leu 210
215 220Val Met Pro Gly Val Tyr Thr Asn Glu Ala Phe His
Pro Asp Ile Ile225 230 235
240Leu Leu Pro Gly Cys Gly Val Asp Phe Thr Tyr Ser Arg Leu Ser Asn
245 250 255Leu Leu Gly Ile Arg
Lys Arg Gln Pro Phe Gln Glu Gly Phe Arg Ile 260
265 270Thr Tyr Glu Asp Leu Glu Gly Gly Asn Ile Pro Ala
Leu Leu Asp Val 275 280 285Glu Ala
Tyr Gln Asp Ser Leu Lys Glu Asn Glu Ala Gly Gln Glu Asp 290
295 300Thr Ala Pro Ala Ala Ser Ala Ala Ala Glu Gln
Gly Glu Asp Ala Ala305 310 315
320Asp Thr Ala Ala Ala Asp Gly Ala Glu Ala Asp Pro Ala Met Val Val
325 330 335Glu Ala Pro Glu
Gln Glu Glu Asp Met Asn Asp Ser Ala Val Arg Gly 340
345 350Asp Thr Phe Val Thr Arg Gly Glu Glu Lys Gln
Ala Glu Ala Glu Ala 355 360 365Ala
Ala Glu Glu Lys Gln Leu Ala Ala Ala Ala Ala Ala Ala Ala Leu 370
375 380Ala Ala Ala Glu Ala Glu Ser Glu Gly Thr
Lys Pro Ala Lys Glu Pro385 390 395
400Val Ile Lys Pro Leu Thr Glu Asp Ser Lys Lys Arg Ser Tyr Asn
Leu 405 410 415Leu Lys Asp
Ser Thr Asn Thr Ala Tyr Arg Ser Trp Tyr Leu Ala Tyr 420
425 430Asn Tyr Gly Asp Pro Ser Thr Gly Val Arg
Ser Trp Thr Leu Leu Cys 435 440
445Thr Pro Asp Val Thr Cys Gly Ser Glu Gln Val Tyr Trp Ser Leu Pro 450
455 460Asp Met Met Gln Asp Pro Val Thr
Phe Arg Ser Thr Arg Gln Val Ser465 470
475 480Asn Phe Pro Val Val Gly Ala Glu Leu Leu Pro Val
His Ser Lys Ser 485 490
495Phe Tyr Asn Asp Gln Ala Val Tyr Ser Gln Leu Ile Arg Gln Phe Thr
500 505 510Ser Leu Thr His Val Phe
Asn Arg Phe Pro Glu Asn Gln Ile Leu Ala 515 520
525Arg Pro Pro Ala Pro Thr Ile Thr Thr Val Ser Glu Asn Val
Pro Ala 530 535 540Leu Thr Asp His Gly
Thr Leu Pro Leu Arg Asn Ser Ile Gly Gly Val545 550
555 560Gln Arg Val Thr Val Thr Asp Ala Arg Arg
Arg Thr Cys Pro Tyr Val 565 570
575Tyr Lys Ala Leu Gly Ile Val Ser Pro Arg Val Leu Ser Ser Arg Thr
580 585 590Phe41779DNASimian
adenovirus 4atgcggcgcg cggcgatgta ccaggaggga cctcctccct cttacgagag
cgtggtgggc 60gcggcggcgg cggcgccctc ttctcccttt gcgtcgcagc tgctggagcc
gccgtacgtg 120cctccgcgct acctgcggcc tacggggggg agaaacagca tccgttactc
ggagctggcg 180cccctgttcg acaccacccg ggtgtacctg gtggacaaca agtcggcgga
cgtggcctcc 240ctgaactacc agaacgacca cagcaatttt ttgaccacgg tcatccagaa
caatgactac 300agcccgagcg aggccagcac ccagaccatc aatctggatg accggtcgca
ctggggcggc 360gacctgaaaa ccatcctgca caccaacatg cccaacgtga acgagttcat
gttcaccaat 420aagttcaagg cgcgggtgat ggtgtcgcgc tcgcacacca aggaagaccg
ggtggagctg 480aagtacgagt gggtggagtt cgagctgcca gagggcaact actccgagac
catgaccatt 540gacctgatga acaacgcgat cgtggagcac tatctgaaag tgggcaggca
gaacggggtc 600ctggagagcg acatcggggt caagttcgac accaggaact tccgcctggg
gctggacccc 660gtgaccgggc tggttatgcc cggggtgtac accaacgagg ccttccatcc
cgacatcatc 720ctgctgcccg gctgcggggt ggacttcact tacagccgcc tgagcaacct
cctgggcatc 780cgcaagcggc agcccttcca ggagggcttc aggatcacct acgaggacct
ggaggggggc 840aacatccccg cgctcctcga tgtggaggcc taccaggata gcttgaagga
aaatgaggcg 900ggacaggagg ataccgcccc cgccgcctcc gccgccgccg agcagggcga
ggatgctgct 960gacaccgcgg ccgcggacgg ggcagaggcc gaccccgcta tggtggtgga
ggctcccgag 1020caggaggagg acatgaatga cagtgcggtg cgcggagaca ccttcgtcac
ccggggggag 1080gaaaagcaag cggaggccga ggccgcggcc gaggaaaagc aactggcggc
agcagcggcg 1140gcggcggcgt tggccgcggc ggaggctgag tctgagggga ccaagcccgc
caaggagccc 1200gtgattaagc ccctgaccga agatagcaag aagcgcagtt acaacctgct
caaggacagc 1260accaacaccg cgtaccgcag ctggtacctg gcctacaact acggcgaccc
gtcgacgggg 1320gtgcgctcct ggaccctgct gtgcacgccg gacgtgacct gcggctcgga
gcaggtgtac 1380tggtcgctgc ccgacatgat gcaagacccc gtgaccttcc gctccacgcg
gcaggtcagc 1440aacttcccgg tggtgggcgc cgagctgctg cccgtgcact ccaagagctt
ctacaacgac 1500caggccgtct actcccagct catccgccag ttcacctctc tgacccacgt
gttcaatcgc 1560tttcctgaga accagattct ggcgcgcccg cccgccccca ccatcaccac
cgtcagtgaa 1620aacgttcctg ctctcacaga tcacgggacg ctaccgctgc gcaacagcat
cggaggagtc 1680cagcgagtga ccgttactga cgccagacgc cgcacctgcc cctacgttta
caaggccttg 1740ggcatagtct cgccgcgcgt cctttccagc cgcactttt
1779
SEQ ID NO: 5 | Length: 964 | Type: PRT | Organism: Simian adenovirus
Met Ala Thr Pro Ser Met Met Pro
Gln Trp Ser Tyr Met His Ile Ser1 5 10
15Gly Gln Asp Ala Ser Glu Tyr Leu Ser Pro Gly Leu Val Gln
Phe Ala 20 25 30Arg Ala Thr
Asp Ser Tyr Phe Ser Leu Ser Asn Lys Phe Arg Asn Pro 35
40 45Thr Val Ala Pro Thr His Asp Val Thr Thr Asp
Arg Ser Gln Arg Leu 50 55 60Thr Leu
Arg Phe Ile Pro Val Asp Arg Glu Asp Thr Ala Tyr Ser Tyr65
70 75 80Lys Ala Arg Phe Thr Leu Ala
Val Gly Asp Asn Arg Val Leu Asp Met 85 90
95Ala Ser Thr Tyr Phe Asp Ile Arg Gly Val Leu Asp Arg
Gly Pro Thr 100 105 110Phe Lys
Pro Tyr Ser Gly Thr Ala Tyr Asn Ser Leu Ala Pro Lys Gly 115
120 125Ala Pro Asn Ser Cys Glu Trp Glu Gln Glu
Glu Thr Gln Thr Ala Glu 130 135 140Glu
Ala Gln Asp Glu Glu Glu Asp Glu Ala Glu Ala Glu Glu Glu Met145
150 155 160Pro Gln Glu Glu Gln Ala
Pro Val Lys Lys Thr His Val Tyr Ala Gln 165
170 175Ala Pro Leu Ser Gly Glu Lys Ile Thr Lys Asp Gly
Leu Gln Ile Gly 180 185 190Thr
Asp Ala Thr Ala Thr Glu Gln Lys Pro Ile Tyr Ala Asp Pro Thr 195
200 205Phe Gln Pro Glu Pro Gln Ile Gly Glu
Ser Gln Trp Asn Glu Ala Asp 210 215
220Ala Ser Val Ala Gly Gly Arg Val Leu Lys Lys Thr Thr Pro Met Lys225
230 235 240Pro Cys Tyr Gly
Ser Tyr Ala Arg Pro Thr Asn Ala Asn Gly Gly Gln 245
250 255Gly Val Leu Val Glu Lys Asp Gly Gly Lys
Met Glu Ser Gln Val Asp 260 265
270Met Gln Phe Phe Ser Thr Ser Glu Asn Ala Arg Asn Glu Ala Asn Asn
275 280 285Ile Gln Pro Lys Leu Val Leu
Tyr Ser Glu Asp Val His Met Glu Thr 290 295
300Pro Asp Thr His Ile Ser Tyr Lys Pro Ala Lys Ser Asp Asp Asn
Ser305 310 315 320Lys Val
Met Leu Gly Gln Gln Ser Met Pro Asn Arg Pro Asn Tyr Ile
325 330 335Gly Phe Arg Asp Asn Phe Ile
Gly Leu Met Tyr Tyr Asn Ser Thr Gly 340 345
350Asn Met Gly Val Leu Ala Gly Gln Ala Ser Gln Leu Asn Ala
Val Val 355 360 365Asp Leu Gln Asp
Arg Asn Thr Glu Leu Ser Tyr Gln Leu Leu Leu Asp 370
375 380Ser Met Gly Asp Arg Thr Arg Tyr Phe Ser Met Trp
Asn Gln Ala Val385 390 395
400Asp Ser Tyr Asp Pro Asp Val Arg Ile Ile Glu Asn His Gly Thr Glu
405 410 415Asp Glu Leu Pro Asn
Tyr Cys Phe Pro Leu Gly Gly Ile Gly Val Thr 420
425 430Asp Thr Tyr Gln Ala Ile Lys Thr Asn Gly Asn Gly
Asn Gly Gly Gly 435 440 445Asn Thr
Thr Trp Thr Lys Asp Glu Thr Phe Ala Asp Arg Asn Glu Ile 450
455 460Gly Val Gly Asn Asn Phe Ala Met Glu Ile Asn
Leu Ser Ala Asn Leu465 470 475
480Trp Arg Asn Phe Leu Tyr Ser Asn Val Ala Leu Tyr Leu Pro Asp Lys
485 490 495Leu Lys Tyr Asn
Pro Ser Asn Val Glu Ile Ser Asp Asn Pro Asn Thr 500
505 510Tyr Asp Tyr Met Asn Lys Arg Val Val Ala Pro
Gly Leu Val Asp Cys 515 520 525Tyr
Ile Asn Leu Gly Ala Arg Trp Ser Leu Asp Tyr Met Asp Asn Val 530
535 540Asn Pro Phe Asn His His Arg Asn Ala Gly
Leu Arg Tyr Arg Ser Met545 550 555
560Leu Leu Gly Asn Gly Arg Tyr Val Pro Phe His Ile Gln Val Pro
Gln 565 570 575Lys Phe Phe
Ala Ile Lys Asn Leu Leu Leu Leu Pro Gly Ser Tyr Thr 580
585 590Tyr Glu Trp Asn Phe Arg Lys Asp Val Asn
Met Val Leu Gln Ser Ser 595 600
605Leu Gly Asn Asp Leu Arg Val Asp Gly Ala Ser Ile Lys Phe Glu Ser 610
615 620Ile Cys Leu Tyr Ala Thr Phe Phe
Pro Met Ala His Asn Thr Ala Ser625 630
635 640Thr Leu Glu Ala Met Leu Arg Asn Asp Thr Asn Asp
Gln Ser Phe Asn 645 650
655Asp Tyr Leu Ser Ala Ala Asn Met Leu Tyr Pro Ile Pro Ala Asn Ala
660 665 670Thr Asn Val Pro Ile Ser
Ile Pro Ser Arg Asn Trp Ala Ala Phe Arg 675 680
685Gly Trp Ala Phe Thr Arg Leu Lys Thr Lys Glu Thr Pro Ser
Leu Gly 690 695 700Ser Gly Phe Asp Pro
Tyr Tyr Thr Tyr Ser Gly Ser Ile Pro Tyr Leu705 710
715 720Asp Gly Thr Phe Tyr Leu Asn His Thr Phe
Lys Lys Val Ser Val Thr 725 730
735Phe Asp Ser Ser Val Ser Trp Pro Gly Asn Asp Arg Leu Leu Thr Pro
740 745 750Asn Glu Phe Glu Ile
Lys Arg Ser Val Asp Gly Glu Gly Tyr Asn Val 755
760 765Ala Gln Cys Asn Met Thr Lys Asp Trp Phe Leu Ile
Gln Met Leu Ala 770 775 780Asn Tyr Asn
Ile Gly Tyr Gln Gly Phe Tyr Ile Pro Glu Ser Tyr Lys785
790 795 800Asp Arg Met Tyr Ser Phe Phe
Arg Asn Phe Gln Pro Met Ser Arg Gln 805
810 815Val Val Asp Glu Thr Lys Tyr Lys Asp Tyr Gln Gln
Val Gly Ile Ile 820 825 830His
Gln His Asn Asn Ser Gly Phe Val Gly Tyr Leu Ala Pro Thr Met 835
840 845Arg Glu Gly Gln Ala Tyr Pro Ala Asn
Phe Pro Tyr Pro Leu Ile Gly 850 855
860Lys Thr Ala Val Asp Ser Val Thr Gln Lys Lys Phe Leu Cys Asp Arg865
870 875 880Thr Leu Trp Arg
Ile Pro Phe Ser Ser Asn Phe Met Ser Met Gly Ala 885
890 895Leu Thr Asp Leu Gly Gln Asn Leu Leu Tyr
Ala Asn Ser Ala His Ala 900 905
910Leu Asp Met Thr Phe Glu Val Asp Pro Met Asp Glu Pro Thr Leu Leu
915 920 925Tyr Val Leu Phe Glu Val Phe
Asp Val Val Arg Val His Gln Pro His 930 935
940Arg Gly Val Ile Glu Thr Val Tyr Leu Arg Thr Pro Phe Ser Ala
Gly945 950 955 960Asn Ala
Thr Thr
SEQ ID NO: 6 | Length: 2880 | Type: DNA | Organism: Simian adenovirus
atggcgaccc catcgatgat gccgcagtgg
tcgtacatgc acatctcggg ccaggacgcc 60tcggagtacc tgagccccgg gctggtgcag
ttcgcccgcg ccaccgagag ctacttcagc 120ctgagtaaca agtttaggaa ccccacggtg
gcgcccacgc acgatgtgac caccgaccgg 180tctcagcgcc tgacgctgcg gttcattccc
gtggaccgcg aggacaccgc gtactcgtac 240aaggcgcggt tcaccctggc cgtgggcgac
aaccgcgtgc tggacatggc ctccacctac 300tttgacatcc gcggggtgct ggaccggggt
cccactttca agccctactc tggcaccgcc 360tacaactccc tggcccccaa gggcgctccc
aactcctgcg agtgggagca agaggaaact 420caggcagttg aagaagcagc agaagaggaa
gaagaagatg ctgacggtca agctgaggaa 480gagcaagcag ctaccaaaaa gactcatgta
tatgctcagg ctcccctttc tggcgaaaaa 540attagtaaag atggtctgca aataggaacg
gacgctacag ctacagaaca aaaacctatt 600tatgcagacc ctacattcca gcccgaaccc
caaatcgggg agtcccagtg gaatgaggca 660gatgctacag tcgccggcgg tagagtgcta
aagaaatcta ctcccatgaa accatgctat 720ggttcctatg caagacccac aaatgctaat
ggaggtcagg gtgtactaac ggcaaatgcc 780cagggacagc tagaatctca ggttgaaatg
caattctttt caacttctga aaacgcccgt 840aacgaggcta acaacattca gcccaaattg
gtgctgtata gtgaggatgt gcacatggag 900accccggata cgcacctttc ttacaagccc
gcaaaaagcg atgacaattc aaaaatcatg 960ctgggtcagc agtccatgcc caacagacct
aattacatcg gcttcagaga caactttatc 1020ggcctcatgt attacaatag cactggcaac
atgggagtgc ttgcaggtca ggcctctcag 1080ttgaatgcag tggtggactt gcaagacaga
aacacagaac tgtcctacca gctcttgctt 1140gattccatgg gtgacagaac cagatacttt
tccatgtgga atcaggcagt ggacagttat 1200gacccagatg ttagaattat tgaaaatcat
ggaactgaag acgagctccc caactattgt 1260ttccctctgg gtggcatagg ggtaactgac
acttaccagg ctgttaaaac caacaatggc 1320aataacgggg gccaggtgac ttggacaaaa
gatgaaactt ttgcagatcg caatgaaata 1380ggggtgggaa acaatttcgc tatggagatc
aacctcagtg ccaacctgtg gagaaacttc 1440ctgtactcca acgtggcgct gtacctacca
gacaagctta agtacaaccc ctccaatgtg 1500gacatctctg acaaccccaa cacctacgat
tacatgaaca agcgagtggt ggccccgggg 1560ctggtggact gctacatcaa cctgggcgcg
cgctggtcgc tggactacat ggacaacgtc 1620aaccccttca accaccaccg caatgcgggc
ctgcgctacc gctccatgct cctgggcaac 1680gggcgctacg tgcccttcca catccaggtg
ccccagaagt tctttgccat caagaacctc 1740ctcctcctgc cgggctccta cacctacgag
tggaacttca ggaaggatgt caacatggtc 1800ctccagagct ctctgggtaa cgatctcagg
gtggacgggg ccagcatcaa gttcgagagc 1860atctgcctct acgccacctt cttccccatg
gcccacaaca cggcctccac gctcgaggcc 1920atgctcagga acgacaccaa cgaccagtcc
ttcaatgact acctctccgc cgccaacatg 1980ctctacccca tacccgccaa cgccaccaac
gtccccatct ccatcccctc gcgcaactgg 2040gcggccttcc gcggctgggc cttcacccgc
ctcaagacca aggagacccc ctccctgggc 2100tcgggattcg acccctacta cacctactcg
ggctccattc cctacctgga cggcaccttc 2160tacctcaacc acactttcaa gaaggtctcg
gtcaccttcg actcctcggt cagctggccg 2220ggcaacgacc gtctgctcac ccccaacgag
ttcgagatca agcgctcggt cgacggggag 2280ggctacaacg tggcccagtg caacatgacc
aaggactggt tcctggtcca gatgctggcc 2340aactacaaca tcggctacca gggcttctac
atcccagaga gctacaagga caggatgtac 2400tccttcttca ggaacttcca gcccatgagc
cggcaggtgg tggaccagac caagtacaag 2460gactaccagg aggtgggcat catccaccag
cacaacaact cgggcttcgt gggctacctc 2520gcccccacca tgcgcgaggg acaggcctac
cccgccaact tcccctatcc gctcataggc 2580aagaccgcgg tcgacagcat cacccagaaa
aagttcctct gcgaccgcac cctctggcgc 2640atccccttct ccagcaactt catgtccatg
ggtgcgctct cggacctggg ccagaacttg 2700ctctacgcca actccgccca cgccctcgac
atgaccttcg aggtcgaccc catggacgag 2760cccacccttc tctatgttct gttcgaagtc
tttgacgtgg tccgggtcca ccagccgcac 2820cgcggcgtca tcgagaccgt gtacctgcgt
acgcccttct cggccggcaa cgccaccacc 2880
SEQ ID NO: 7 | Length: 37912 | Type: DNA | Organism: Simian adenovirus
catcatcaat aatatacctt attttggatt gaagccaata tgataatgag atgggcggcg
60cggggcgggg cgcggggcgg gaggcgggtt tgggggcggg ccggcgggcg gggcggtgtg
120gcggaagtgg actttgtaag tgtggcggat gtgacttgct agtgccgggc gcggtaaaag
180tgacgttttc cgtgcgcgac aacgcccccg ggaagtgaca tttttcccgc ggtttttacc
240ggatgttgta gtgaatttgg gcgtaaccaa gtaagatttg gccattttcg cgggaaaact
300gaaacgggga agtgaaatct gattaatttt gcgttagtca taccgcgtaa tatttgtcta
360gggccgaggg actttggccg attacgtgga ggactcgccc aggtgttttt tgaggtgaat
420ttccgcgttc cgggtcaaag tctgcgtttt attattatag gatatcccat tgcatacgtt
480gtatccatat cataatatgt acatttatat tggctcatgt ccaacattac cgccatgttg
540acattgatta ttgactagtt attaatagta atcaattacg gggtcattag ttcatagccc
600atatatggag ttccgcgtta cataacttac ggtaaatggc ccgcctggct gaccgcccaa
660cgacccccgc ccattgacgt caataatgac gtatgttccc atagtaacgc caatagggac
720tttccattga cgtcaatggg tggagtattt acggtaaact gcccacttgg cagtacatca
780agtgtatcat atgccaagta cgccccctat tgacgtcaat gacggtaaat ggcccgcctg
840gcattatgcc cagtacatga ccttatggga ctttcctact tggcagtaca tctacgtatt
900agtcatcgct attaccatgg tgatgcggtt ttggcagtac atcaatgggc gtggatagcg
960gtttgactca cggggatttc caagtctcca ccccattgac gtcaatggga gtttgttttg
1020gcaccaaaat caacgggact ttccaaaatg tcgtaacaac tccgccccat tgacgcaaat
1080gggcggtagg cgtgtacggt gggaggtcta tataagcaga gctctcccta tcagtgatag
1140agatctccct atcagtgata gagatcgtcg acgagctcgt ttagtgaacc gtcagatcgc
1200ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc gatccagcct
1260ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg agatcttccg
1320tttatctagg taccgggccc cccctcgagg tcgacggtat cgataagctt cacgctgccg
1380caagcactca gggcgcaagg gctgctaaag gaagcggaac acgtagaaag ccagtccgca
1440gaaacggtgc tgaccccgga tgaatgtcag ctactgggct atctggacaa gggaaaacgc
1500aagcgcaaag agaaagcagg tagcttgcag tgggcttaca tggcgatagc tagactgggc
1560ggttttatgg acagcaagcg aaccggaatt gccagctggg gcgccctctg gtaaggttgg
1620gaagccctgc aaagtaaact ggatggcttt cttgccgcca aggatctgat ggcgcagggg
1680atcaagatct aaccaggagc tatttaatgg caacagttaa ccagctggta cgcaaaccac
1740gtgctcgcaa agttgcgaaa agcaacgtgc ctgcgctgga agcatgcccg caaaaacgtg
1800gcgtatgtac tcgtgtatat actaccactc ctaaaaaacc gaactccgcg ctgcgtaaag
1860tatgccgtgt tcgtctgact aacggtttcg aagtgacttc ctacatcggt ggtgaaggtc
1920acaacctgca ggagcactcc gtgatcctga tccgtggcgg tcgtgttaaa gacctcccgg
1980gtgttcgtta ccacaccgta cgtggtgcgc ttgactgctc cggcgttaaa gaccgtaagc
2040aggctcgttc caagtatggc gtgaagcgtc ctaaggctta atggtagatc tgatcaagag
2100acaggatgac ggtcgtttcg catgcttgaa caagatggat tgcacgcagg ttctccggcc
2160gcttgggtgg agaggctatt cggctatgac tgggcacaac agacaatcgg ctgctctgat
2220gccgccgtgt tccggctgtc agcgcagggg cgcccggttc tttttgtcaa gaccgacctg
2280tccggtgccc tgaatgaact gcaggacgag gcagcgcggc tatcgtggct ggccacgacg
2340ggcgttcctt gcgcagctgt gctcgacgtt gtcactgaag cgggaaggga ctggctgcta
2400ttgggcgaag tgccggggca ggatctcctg tcatctcacc ttgctcctgc cgagaaagta
2460tccatcatgg ctgatgcaat gcggcggctg catacgcttg atccggctac ctgcccattc
2520gaccaccaag cgaaacatcg catcgagcga gcacgtactc ggatggaagc cggtcttgtc
2580gatcaggatg atctggacga agagcatcag gggctcgcgc cagccgaact gttcgccagg
2640ctcaaggcgc gcatgcccga cggcgaggat ctcgtcgtga cccatggcga tgcctgcttg
2700ccgaatatca tggtggaaaa tggccgcttt tctggattca tcgactgtgg ccggctgggt
2760gtggcggacc gctatcagga catagcgttg gctacccgtg atattgctga agagcttggc
2820ggcgaatggg ctgaccgctt cctcgtgctt tacggtatcg ccgctcccga ttcgcagcgc
2880atcgccttct atcgccttct tgacgagttc ttctgagcgg gactctgggg ttcgaaatga
2940ccgaccaagc gacgcccaac ctgccatcac gagatttcga ttccaccgcc gccttctatg
3000aaaggttggg cttcggaatc gttttccggg acgccggctg gatgatcctc cagcgcgggg
3060atctcatgct ggagttcttc gcccaccccg ggctcgatcc cctcgggggg aatcagaatt
3120cagtcgacag cggccgcgat ctgctgtgcc ttctagttgc cagccatctg ttgtttgccc
3180ctcccccgtg ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa
3240tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg
3300gcaggacagc aagggggagg attgggaaga caatagcagg catgctgggg atgcggtggg
3360ctctatggcc gatcagcgat cgctgaggtg ggtgagtggg cgtggcctgg ggtggtcatg
3420aaaatatata agttgggggt cttagggtct ctttatttgt gttgcagaga ccgccggagc
3480catgagcggg agcagcagca gcagcagtag cagcagcgcc ttggatggca gcatcgtgag
3540cccttatttg acgacgcgga tgccccactg ggccggggtg cgtcagaatg tgatgggctc
3600cagcatcgac ggccgacccg tcctgcccgc aaattccgcc acgctgacct atgcgaccgt
3660cgcggggacg ccgttggacg ccaccgccgc cgccgccgcc accgcagccg cctcggccgt
3720gcgcagcctg gccacggact ttgcattcct gggaccactg gcgacagggg ctacttctcg
3780ggccgctgct gccgccgttc gcgatgacaa gctgaccgcc ctgctggcgc agttggatgc
3840gcttactcgg gaactgggtg acctttctca gcaggtcatg gccctgcgcc agcaggtctc
3900ctccctgcaa gctggcggga atgcttctcc cacaaatgcc gtttaagata aataaaacca
3960gactctgttt ggattaaaga aaagtagcaa gtgcattgct ctctttattt cataattttc
4020cgcgcgcgat aggccctaga ccagcgttct cggtcgttga gggtgcggtg tatcttctcc
4080aggacgtggt agaggtggct ctggacgttg agatacatgg gcatgagccc gtcccggggg
4140tggaggtagc accactgcag agcttcatgc tccggggtgg tgttgtagat gatccagtcg
4200tagcaggagc gctgggcatg gtgcctaaaa atgtccttca gcagcaggcc gatggccagg
4260gggaggccct tggtgtaagt gtttacaaaa cggttaagtt gggaagggtg cattcgggga
4320gagatgatgt gcatcttgga ctgtattttt agattggcga tgtttccgcc cagatccctt
4380ctgggattca tgttgtgcag gaccaccagt acagtgtatc cggtgcactt ggggaatttg
4440tcatgcagct tagagggaaa agcgtggaag aacttggaga cgcctttgtg gcctcccaga
4500ttttccatgc attcgtccat gatgatggca atgggcccgc gggaggcagc ttgggcaaag
4560atatttctgg ggtcgctgac gtcgtagttg tgttccaggg tgaggtcgtc ataggccatt
4620tttacaaagc gcgggcggag ggtgcccgac tgggggatga tggtcccctc tggccctggg
4680gcgtagttgc cctcgcagat ctgcatttcc caggccttaa tctcggaggg gggaatcata
4740tccacctgcg gggcgatgaa gaaaacggtt tccggagccg gggagattaa ctgggatgag
4800agcaggtttc taagcagctg tgattttcca caaccggtgg gcccataaat aacacctata
4860accggttgca gctggtagtt tagagagctg cagctgccgt cgtcccggag gaggggggcc
4920acctcgttga gcatgtccct gacgcgcatg ttctccccga ccagatccgc cagaaggcgc
4980tcgccgccca gggacagcag ctcttgcaag gaagcaaagt ttttcagcgg cttgaggccg
5040tccgccgtgg gcatgttttt cagggtctgg ctcagcagct ccaggcggtc ccagagctcg
5100gtgacgtgct ctacggcatc tctatccagc atatctcctc gtttcgcggg ttggggcgac
5160tttcgctgta gggcaccaag cggtggtcgt ccagcggggc cagagtcatg tccttccatg
5220ggcgcagggt cctcgtcagg gtggtctggg tcacggtgaa ggggtgcgct ccgggctgag
5280cgcttgccaa ggtgcgcttg aggctggttc tgctggtgct gaagcgctgc cggtcttcgc
5340cctgcgcgtc ggccaggtag catttgacca tggtgtcata gtccagcccc tccgcggcgt
5400gtcccttggc gcgcagcttg cccttggagg tggcgccgca cgaggggcag agcaggctct
5460tgagcgcgta gagcttgggg gcgaggaaga ccgattcggg ggagtaggcg tccgcgccgc
5520agaccccgca cacggtctcg cactccacca gccaggtgag ctcggggcgc gccgggtcaa
5580aaaccaggtt tcccccatgc tttttgatgc gtttcttacc tcgggtctcc atgaggtggt
5640gtccccgctc ggtgacgaag aggctgtccg tgtctccgta gaccgacttg aggggtcttt
5700tctccagggg ggtccctcgg tcttcctcgt agaggaactc ggaccactct gagacgaagg
5760cccgcgtcca ggccaggacg aaggaggcta tgtgggaggg gtagcggtcg ttgtccacta
5820gggggtccac cttctccaag gtgtgaagac acatgtcgcc ttcctcggcg tccaggaagg
5880tgattggctt gtaggtgtag gccacgtgac cgggggttcc tgacgggggg gtataaaagg
5940gggtgggggc gcgctcgtcg tcactctctt ccgcatcgct gtctgcgagg gccagctgct
6000ggggtgagta ttccctctcg aaggcgggca tgacctccgc gctgaggttg tcagtttcca
6060aaaacgagga ggatttgatg ttcacctgtc ccgaggtgat acctttgagg gtacccgcgt
6120ccatctggtc agaaaacacg atctttttat tgtccagctt ggtggcgaac gacccgtaga
6180gggcgttgga gagcagcttg gcgatggagc gcagggtctg gttcttgtcc ctgtcggcgc
6240gctccttggc cgcgatgttg agctgcacgt actcgcgcgc gacgcagcgc cactcgggga
6300agacggtggt gcgctcgtcg ggcaccaggc gcacgcgcca gccgcggttg tgcagggtga
6360ccaggtccac gctggtggcg acctcgccgc gcaggcgctc gttggtccag cagagacggc
6420cgcccttgcg cgagcagaag gggggcaggg ggtcgagctg ggtctcgtcc ggggggtccg
6480cgtccacggt gaaaaccccg gggcgcaggc gcgcgtcgaa gtagtctatc ttgcaacctt
6540gcatgtccag cgcctgctgc cagtcgcggg cggcgagcgc gcgctcgtag gggttgagcg
6600gcgggcccca gggcatgggg tgggtgagtg cggaggcgta catgccgcag atgtcataga
6660cgtagagggg ctcccgcagg accccgatgt aggtggggta gcagcggccg ccgcggatgc
6720tggcgcgcac gtagtcatac agctcgtgcg agggggcgag gaggtcgggg cccaggttgg
6780tgcgggcggg gcgctccgcg cggaagacga tctgcctgaa gatggcatgc gagttggaag
6840agatggtggg gcgctggaag acgttgaagc tggcgtcctg caggccgacg gcgtcgcgca
6900cgaaggaggc gtaggagtcg cgcagcttgt gtaccagctc ggcggtgacc tgcacgtcga
6960gcgcgcagta gtcgagggtc tcgcggatga tgtcatattt agcctgcccc ttctttttcc
7020acagctcgcg gttgaggaca aactcttcgc ggtctttcca gtactcttgg atcgggaaac
7080cgtccggttc cgaacggtaa gagcctagca tgtagaactg gttgacggcc tggtaggcgc
7140agcagccctt ctccacgggg agggcgtagg cctgcgcggc cttgcggagc gaggtgtggg
7200tcagggcgaa ggtgtccctg accatgactt tgaggtactg gtgcttgaag tcggagtcgt
7260cgcagccgcc ccgctcccag agcgagaagt cggtgcgctt cttggagcgg gggttgggca
7320gagcgaaggt gacatcgttg aagaggattt tgcccgcgcg gggcatgaag ttgcgggtga
7380tgcggaaggg ccccggcact tcagagcggt tgttgatgac ctgggcggcg agcacgatct
7440cgtcgaagcc gttgatgttg tggcccacga tgtagagttc caggaagcgg ggccggccct
7500ttacggtggg cagcttcttt agctcttcgt aggtgagctc ctcgggcgag gcgaggccgt
7560gctcggccag ggcccagtcc gcgaggtgcg ggttgtctct gaggaaggac ttccagaggt
7620cgcgggccag gagggtctgc aggcggtctc tgaaggtcct gaactggcgg cccacggcca
7680ttttttcggg ggtgatgcag tagaaggtga gggggtcttg ctgccagcgg tcccagtcga
7740gctgcagggc gaggtcgcgc gcggcggtga ccaggcgctc gtcgcccccg aatttcatga
7800ccagcatgaa gggcacgagc tgctttccga aggcccccat ccaagtgtag gtctctacat
7860cgtaggtgac aaagaggcgc tccgtgcgag gatgcgagcc gatcgggaag aactggatct
7920cccgccacca gttggaggag tggctgttga tgtggtggaa gtagaagtcc cgtcgccggg
7980ccgaacactc gtgctggctt ttgtaaaagc gagcgcagta ctggcagcgc tgcacgggct
8040gtacctcatg cacgagatgc acctttcgcc cgcgcacgag gaagccgagg ggaaatctga
8100gccccccgcc tggctcgcgg catggctggt tctcttctac tttggatgcg tgtccgtctc
8160cgtctggctc ctcgaggggt gttacggtgg agcggaccac cacgccgcgc gagccgcagg
8220tccagatatc ggcgcgcggc ggtcggagtt tgatgacgac atcgcgcagc tgggagctgt
8280ccatggtctg gagctcccgc ggcggcggca ggtcagccgg gagttcttgc aggttcacct
8340cgcagagtcg ggccagggcg cggggcaggt ctaggtggta cctgatctct aggggcgtgt
8400tggtggcggc gtcgatggct tgcaggagcc cgcagccccg gggggcgacg acggtgcccc
8460gcggggtggt ggtggtggtg gcggtgcagc tcagaagcgg tgccgcgggc gggcccccgg
8520aggtaggggg ggctccggtc ccgcgggcag gggcggcagc ggcacgtcgg cgtggagcgc
8580gggcaggagt tggtgctgtg cccggaggtt gctggcgaag gcgacgacgc ggcggttgat
8640ctcctggatc tggcgcctct gcgtgaagac gacgggcccg gtgagcttga acctgaaaga
8700gagttcgaca gaatcaatct cggtgtcatt gaccgcggcc tggcgcagga tctcctgcac
8760gtctcccgag ttgtcttggt aggcgatctc ggccatgaac tgctcgatct cttcctcctg
8820gaggtctccg cgtccggcgc gttccacggt ggccgccagg tcgttggaga tgcgccccat
8880gagctgcgag aaggcgttga gtccgccctc gttccagact cggctgtaga ccacgccccc
8940ctggtcatcg cgggcgcgca tgaccacctg cgcgaggttg agctccacgt gccgcgcgaa
9000gacggcgtag ttgcgcagac gctggaagag gtagttgagg gtggtggcgg tgtgctcggc
9060cacgaagaag ttcatgaccc agcggcgcaa cgtggattcg ttgatgtccc ccaaggcctc
9120cagccgttcc atggcctcgt agaagtccac ggcgaagttg aaaaactggg agttgcgcgc
9180cgacacggtc aactcctcct ccagaagacg gatgagctcg gcgacggtgt cgcgcacctc
9240gcgctcgaag gctatgggga tctcttcctc cgctagcatc accacctcct cctcttcctc
9300ctcttctggc acttccatga tggcttcctc ctcttcgggg ggtggcggcg gcggcggtgg
9360gggagggggc gctctgcgcc ggcggcggcg caccgggagg cggtccacga agcgcgcgat
9420catctccccg cggcggcggc gcatggtctc ggtgacggcg cggccgttct cccgggggcg
9480cagttggaag acgccgccgg acatctggtg ctggggcggg tggccgtgag gcagcgagac
9540ggcgctgacg atgcatctca acaattgctg cgtaggtacg ccgccgaggg acctgaggga
9600gtccatatcc accggatccg aaaacctttc gaggaaggcg tctaaccagt cgcagtcgca
9660aggtaggctg agcaccgtgg cgggcggcgg ggggtggggg gagtgtctgg cggaggtgct
9720gctgatgatg taattgaagt aggcggactt gacacggcgg atggtcgaca ggagcaccat
9780gtccttgggt ccggcctgct ggatgcggag gcggtcggct atgccccagg cttcgttctg
9840gcatcggcgc aggtccttgt agtagtcttg catgagcctt tccaccggca cctcttctcc
9900ttcctcttct gcttcttcca tgtctgcttc ggccctgggg cggcgccgcg cccccctgcc
9960ccccatgcgc gtgaccccga accccctgag cggttggagc agggccaggt cggcgacgac
10020gcgctcggcc aggatggcct gctgcacctg cgtgagggtg gtttggaagt catccaagtc
10080cacgaagcgg tggtaggcgc ccgtgttgat ggtgtaggtg cagttggcca tgacggacca
10140gttgacggtc tggtggcccg gttgcgacat ctcggtgtac ctgagtcgcg agtaggcgcg
10200ggagtcgaag acgtagtcgt tgcaagtccg caccaggtac tggtagccca ccaggaagtg
10260cggcggcggc tggcggtaga ggggccagcg cagggtggcg ggggctccgg gggccaggtc
10320ttccagcatg aggcggtggt aggcgtagat gtacctggac atccaggtga tacccgcggc
10380ggtggtggag gcgcgcggga agtcgcgcac ccggttccag atgttgcgca ggggcagaaa
10440gtgctccatg gtaggcgtgc tctgtccagt cagacgcgcg cagtcgttga tactctagac
10500cagggaaaac gaaagccggt cagcgggcac tcttccgtgg tctggtgaat agatcgcaag
10560ggtatcatgg cggagggcct cggttcgagc cccgggtccg ggccggacgg tccgccatga
10620tccacgcggt taccgcccgc gtgtcgaacc caggtgtgcg acgtcagaca acggtggagt
10680gttccttttg gcgtttttct ggccgggcgc cggcgccgcg taagagacta agccgcgaaa
10740gcgaaagcag taagtggctc gctccccgta gccggaggga tccttgctaa gggttgcgtt
10800gcggcgaacc ccggttcgaa tcccgtactc gggccggccg gacccgcggc taaggtgttg
10860gattggcctc cccctcgtat aaagaccccg cttgcggatt gactccggac acggggacga
10920gcccctttta tttttgcttt ccccagatgc atccggtgct gcggcagatg cgccccccgc
10980cccagcagca gcaacaacac cagcaagagc ggcagcaaca gcagcgggag tcatgcaggg
11040ccccctcacc caccctcggc gggccggcca cctcggcgtc cgcggccgtg tctggcgcct
11100gcggcggcgg cggggggccg gctgacgacc ccgaggagcc cccgcggcgc agggccagac
11160actacctgga cctggaggag ggcgagggcc tggcgcggct gggggcgccg tctcccgagc
11220gccacccgcg ggtgcagctg aagcgcgact cgcgcgaggc gtacgtgcct cggcagaacc
11280tgttcaggga ccgcgcgggc gaggagcccg aggagatgcg ggacaggagg ttcagcgcag
11340ggcgggagct gcggcagggg ctgaaccgcg agcggctgct gcgcgaggag gactttgagc
11400ccgacgcgcg gacggggatc agccccgcgc gcgcgcacgt ggcggccgcc gacctggtga
11460cggcgtacga gcagacggtg aaccaggaga tcaacttcca aaagagtttc aacaaccacg
11520tgcgcacgct ggtggcgcgc gaggaggtga ccatcgggct gatgcacctg tgggactttg
11580taagcgcgct ggtgcagaac cccaacagca agcctctgac ggcgcagctg ttcctgatag
11640tgcagcacag cagggacaac gaggcgttta gggacgcgct gctgaacatc accgagcccg
11700agggtcggtg gctgctggac ctgattaaca tcctgcagag catagtggtg caggagcgca
11760gcctgagcct ggccgacaag gtggcggcca tcaactactc gatgctgagc ctgggcaagt
11820tttacgcgcg caagatctac cagacgccgt acgtgcccat agacaaggag gtgaagatcg
11880acggttttta catgcgcatg gcgctgaagg tgctcaccct gagcgacgac ctgggcgtgt
11940accgcaacga gcgcatccac aaggccgtga gcgtgagccg gcggcgcgag ctgagcgacc
12000gcgagctgat gcacagcctg cagcgggcgc tggcgggcgc cggcagcggc gacagggagg
12060cggagtccta cttcgatgcg ggggcggacc tgcgctgggc gcccagccgg cgggccctgg
12120aggccgcggg ggtccgcgag gactatgacg aggacggcga ggaggatgag gagtacgagc
12180tagaggaggg cgagtacctg gactaaaccg cgggtggtgt ttccggtaga tgcaagaccc
12240gaacgtggtg gacccggcgc tgcgggcggc tctgcagagc cagccgtccg gccttaactc
12300ctcagacgac tggcgacagg tcatggaccg catcatgtcg ctgacggcgc gtaacccgga
12360cgcgttccgg cagcagccgc aggccaacag gctctccgcc atcctggagg cggtggtgcc
12420tgcgcgctcg aaccccacgc acgagaaggt gctggccata gtgaacgcgc tggccgagaa
12480cagggccatc cgcccggacg aggccgggct ggtgtacgac gcgctgctgc agcgcgtggc
12540ccgctacaac agcggcaacg tgcagaccaa cctggaccgg ctggtggggg acgtgcgcga
12600ggcggtggcg cagcgcgagc gcgcggatcg gcagggcaac ctgggctcca tggtggcgct
12660gaatgccttc ctgagcacgc agccggccaa cgtgccgcgg gggcaggaag actacaccaa
12720ctttgtgagc gcgctgcggc tgatggtgac cgagaccccc cagagcgagg tgtaccagtc
12780gggcccggac tacttcttcc agaccagcag acagggcctg cagacggtga acctgagcca
12840ggctttcaag aacctgcggg ggctgtgggg cgtgaaggcg cccaccggcg accgggcgac
12900ggtgtccagc ctgctgacgc ccaactcgcg cctgctgctg ctgctgatcg cgccgttcac
12960ggacagcggc agcgtgtccc gggacaccta cctggggcac ctgctgaccc tgtaccgcga
13020ggccatcggg caggcgcagg tggacgagca caccttccag gagatcacca gcgtgagccg
13080cgcgctgggg caggaggaca cgagcagcct ggaggcgact ctgaactacc tgctgaccaa
13140ccggcggcag aagattccct cgctgcacag cctgacctcc gaggaggagc gcatcttgcg
13200ctacgtgcag cagagcgtga gcctgaacct gatgcgcgac ggggtgacgc ccagcgtggc
13260gctggacatg accgcgcgca acatggaacc gggcatgtac gccgcgcacc ggccttacat
13320caaccgcctg atggactacc tgcatcgcgc ggcggccgtg aaccccgagt actttaccaa
13380cgccatcctg aacccgcact ggctcccgcc gcccgggttc tacagcgggg gcttcgaggt
13440cccggagacc aacgatggct tcctgtggga cgacatggac gacagcgtgt tctccccgcg
13500gccgcaggcg ctggcggaag cgtccctgct gcgtcccaag aaggaggagg aggaggaggc
13560gagtcgccgc cgcggcagca gcggcgtggc ttctctgtcc gagctggggg cggcagccgc
13620cgcgcgcccc gggtccctgg gcggcagccc ctttccgagc ctggtggggt ctctgcacag
13680cgagcgcacc acccgccctc ggctgctggg cgaggacgag tacctgaata actccctgct
13740gcagccggtg cgggagaaaa acctgcctcc cgccttcccc aacaacggga tagagagcct
13800ggtggacaag atgagcagat ggaagaccta tgcgcaggag cacagggacg cgcctgcgct
13860ccggccgccc acgcggcgcc agcgccacga ccggcagcgg gggctggtgt gggatgacga
13920ggactccgcg gacgatagca gcgtgctgga cctgggaggg agcggcaacc cgttcgcgca
13980cctgcgcccc cgcctgggga ggatgtttta aaaaaaaaaa aaaaaagcaa gaagcatgat
14040gcaaaaatta aataaaactc accaaggcca tggcgaccga gcgttggttt cttgtgttcc
14100cttcagtatg cggcgcgcgg cgatgtacca ggagggacct cctccctctt acgagagcgt
14160ggtgggcgcg gcggcggcgg cgccctcttc tccctttgcg tcgcagctgc tggagccgcc
14220gtacgtgcct ccgcgctacc tgcggcctac gggggggaga aacagcatcc gttactcgga
14280gctggcgccc ctgttcgaca ccacccgggt gtacctggtg gacaacaagt cggcggacgt
14340ggcctccctg aactaccaga acgaccacag caattttttg accacggtca tccagaacaa
14400tgactacagc ccgagcgagg ccagcaccca gaccatcaat ctggatgacc ggtcgcactg
14460gggcggcgac ctgaaaacca tcctgcacac caacatgccc aacgtgaacg agttcatgtt
14520caccaataag ttcaaggcgc gggtgatggt gtcgcgctcg cacaccaagg aagaccgggt
14580ggagctgaag tacgagtggg tggagttcga gctgccagag ggcaactact ccgagaccat
14640gaccattgac ctgatgaaca acgcgatcgt ggagcactat ctgaaagtgg gcaggcagaa
14700cggggtcctg gagagcgaca tcggggtcaa gttcgacacc aggaacttcc gcctggggct
14760ggaccccgtg accgggctgg ttatgcccgg ggtgtacacc aacgaggcct tccatcccga
14820catcatcctg ctgcccggct gcggggtgga cttcacttac agccgcctga gcaacctcct
14880gggcatccgc aagcggcagc ccttccagga gggcttcagg atcacctacg aggacctgga
14940ggggggcaac atccccgcgc tcctcgatgt ggaggcctac caggatagct tgaaggaaaa
15000tgaggcggga caggaggata ccgcccccgc cgcctccgcc gccgccgagc agggcgagga
15060tgctgctgac accgcggccg cggacggggc agaggccgac cccgctatgg tggtggaggc
15120tcccgagcag gaggaggaca tgaatgacag tgcggtgcgc ggagacacct tcgtcacccg
15180gggggaggaa aagcaagcgg aggccgaggc cgcggccgag gaaaagcaac tggcggcagc
15240agcggcggcg gcggcgttgg ccgcggcgga ggctgagtct gaggggacca agcccgccaa
15300ggagcccgtg attaagcccc tgaccgaaga tagcaagaag cgcagttaca acctgctcaa
15360ggacagcacc aacaccgcgt accgcagctg gtacctggcc tacaactacg gcgacccgtc
15420gacgggggtg cgctcctgga ccctgctgtg cacgccggac gtgacctgcg gctcggagca
15480ggtgtactgg tcgctgcccg acatgatgca agaccccgtg accttccgct ccacgcggca
15540ggtcagcaac ttcccggtgg tgggcgccga gctgctgccc gtgcactcca agagcttcta
15600caacgaccag gccgtctact cccagctcat ccgccagttc acctctctga cccacgtgtt
15660caatcgcttt cctgagaacc agattctggc gcgcccgccc gcccccacca tcaccaccgt
15720cagtgaaaac gttcctgctc tcacagatca cgggacgcta ccgctgcgca acagcatcgg
15780aggagtccag cgagtgaccg ttactgacgc cagacgccgc acctgcccct acgtttacaa
15840ggccttgggc atagtctcgc cgcgcgtcct ttccagccgc actttttgag caacaccacc
15900atcatgtcca tcctgatctc acccagcaat aactccggct ggggactgct gcgcgcgccc
15960agcaagatgt tcggaggggc gaggaagcgt tccgagcagc accccgtgcg cgtgcgcggg
16020cacttccgcg ccccctgggg agcgcacaaa cgcggccgcg cggggcgcac caccgtggac
16080gacgccatcg actcggtggt ggagcaggcg cgcaactaca ggcccgcggt ctctaccgtg
16140gacgcggcca tccagaccgt ggtgcggggc gcgcggcggt acgccaagct gaagagccgc
16200cggaagcgcg tggcccgccg ccaccgccgc cgacccgggg ccgccgccaa acgcgccgcc
16260gcggccctgc ttcgccgggc caagcgcacg ggccgccgcg ccgccatgag ggccgcgcgc
16320cgcttggccg ccggcatcac cgccgccacc atggcccccc gtacccgaag acgcgcggcc
16380gccgccgccg ccgccgccat cagtgacatg gccagcaggc gccggggcaa cgtgtactgg
16440gtgcgcgact cggtgaccgg cacgcgcgtg cccgtgcgct tccgcccccc gcggacttga
16500gatgatgtga aaaaacaaca ctgagtctcc tgctgttgtg tgtatcccag cggcggcggc
16560gcgcgcagcg tcatgtccaa gcgcaaaatc aaagaagaga tgctccaggt cgtcgcgccg
16620gagatctatg ggcccccgaa gaaggaagag caggattcga agccccgcaa gataaagcgg
16680gtcaaaaaga aaaagaaaga tgatgacgat gccgatgggg aggtggagtt cctgcgcgcc
16740acggcgccca ggcgcccggt gcagtggaag ggccggcgcg taaagcgcgt cctgcgcccc
16800ggcaccgcgg tggtcttcac gcccggcgag cgctccaccc ggactttcaa gcgcgtctat
16860gacgaggtgt acggcgacga agacctgctg gagcaggcca acgagcgctt cggagagttt
16920gcttacggga agcgtcagcg ggcgctgggg aaggaggacc tgctggcgct gccgctggac
16980cagggcaacc ccacccccag tctgaagccc gtgaccctgc agcaggtgct gccgagcagc
17040gcaccctccg aggcgaagcg gggtctgaag cgcgagggcg gcgacctggc gcccaccgtg
17100cagctcatgg tgcccaagcg gcagaggctg gaggatgtgc tggagaaaat gaaagtagac
17160cccggtctgc agccggacat cagggtccgc cccatcaagc aggtggcgcc gggcctcggc
17220gtgcagaccg tggacgtggt catccccacc ggcaactccc ccgccgccgc caccactacc
17280gctgcctcca cggacatgga gacacagacc gatcccgccg cagccgcagc cgcagccgcc
17340gccgcgacct cctcggcgga ggtgcagacg gacccctggc tgccgccggc gatgtcagct
17400ccccgcgcgc gtcgcgggcg caggaagtac ggcgccgcca acgcgctcct gcccgagtac
17460gccttgcatc cttccatcgc gcccaccccc ggctaccgag gctataccta ccgcccgcga
17520agagccaagg gttccacccg ccgtccccgc cgacgcgccg ccgccaccac ccgccgccgc
17580cgccgcagac gccagcccgc actggctcca gtctccgtga ggaaagtggc gcgcgacgga
17640cacaccctgg tgctgcccag ggcgcgctac caccccagca tcgtttaaaa gcctgttgtg
17700gttcttgcag atatggccct cacttgccgc ctccgtttcc cggtgccggg ataccgagga
17760ggaagatcgc gccgcaggag gggtctggcc ggccgcggcc tgagcggagg cagccgccgc
17820gcgcaccggc ggcgacgcgc caccagccga cgcatgcgcg gcggggtgct gcccctgtta
17880atccccctga tcgccgcggc gatcggcgcc gtgcccggga tcgcctccgt ggccttgcaa
17940gcgtcccaga ggcattgaca gacttgcaaa cttgcaaata tggaaaaaaa aaccccaata
18000aaaaagtcta gactctcacg ctcgcttggt cctgtgacta ttttgtagaa tggaagacat
18060caactttgcg tcgctggccc cgcgtcacgg ctcgcgcccg ttcctgggac actggaacga
18120tatcggcacc agcaacatga gcggtggcgc cttcagttgg ggctctctgt ggagcggcat
18180taaaagtatc gggtctgccg ttaaaaatta cggctcccgg gcctggaaca gcagcacggg
18240ccagatgttg agagacaagt tgaaagagca gaacttccag cagaaggtgg tggagggcct
18300ggcctccggc atcaacgggg tggtggacct ggccaaccag gccgtgcaga ataagatcaa
18360cagcagactg gacccccggc cgccggtgga ggaggtgccg ccggcgctgg agacggtgtc
18420ccccgatggg cgtggcgaga agcgcccgcg gcccgatagg gaagagacca ctctggtcac
18480gcagaccgat gagccgcccc cgtatgagga ggccctgaag caaggtctgc ccaccacgcg
18540gcccatcgcg cccatggcca ccggggtggt gggccgccac acccccgcca cgctggactt
18600gcctccgccc gccgatgtgc cgcagcagca gaaggcggca cagccgggcc cgcccgcgac
18660cgcctcccgt tcctccgccg gtcctctgcg ccgcgcggcc agcggccccc gcgggggggt
18720cgcgaggcac ggcaactggc agagcacgct gaacagcatc gtgggtctgg gggtgcggtc
18780cgtgaagcgc cgccgatgct actgaatagc ttagctaacg tgttgtatgt gtgtatgcgc
18840cctatgtcgc cgccagagga gctgctgagt cgccgccgtt cgcgcgccca ccaccaccgc
18900cactccgccc ctcaagatgg cgaccccatc gatgatgccg cagtggtcgt acatgcacat
18960ctcgggccag gacgcctcgg agtacctgag ccccgggctg gtgcagttcg cccgcgccac
19020cgagagctac ttcagcctga gtaacaagtt taggaacccc acggtggcgc ccacgcacga
19080tgtgaccacc gaccggtctc agcgcctgac gctgcggttc attcccgtgg accgcgagga
19140caccgcgtac tcgtacaagg cgcggttcac cctggccgtg ggcgacaacc gcgtgctgga
19200catggcctcc acctactttg acatccgcgg ggtgctggac cggggtccca ctttcaagcc
19260ctactctggc accgcctaca actccctggc ccccaagggc gctcccaact cctgcgagtg
19320ggagcaagag gaaactcagg cagttgaaga agcagcagaa gaggaagaag aagatgctga
19380cggtcaagct gaggaagagc aagcagctac caaaaagact catgtatatg ctcaggctcc
19440cctttctggc gaaaaaatta gtaaagatgg tctgcaaata ggaacggacg ctacagctac
19500agaacaaaaa cctatttatg cagaccctac attccagccc gaaccccaaa tcggggagtc
19560ccagtggaat gaggcagatg ctacagtcgc cggcggtaga gtgctaaaga aatctactcc
19620catgaaacca tgctatggtt cctatgcaag acccacaaat gctaatggag gtcagggtgt
19680actaacggca aatgcccagg gacagctaga atctcaggtt gaaatgcaat tcttttcaac
19740ttctgaaaac gcccgtaacg aggctaacaa cattcagccc aaattggtgc tgtatagtga
19800ggatgtgcac atggagaccc cggatacgca cctttcttac aagcccgcaa aaagcgatga
19860caattcaaaa atcatgctgg gtcagcagtc catgcccaac agacctaatt acatcggctt
19920cagagacaac tttatcggcc tcatgtatta caatagcact ggcaacatgg gagtgcttgc
19980aggtcaggcc tctcagttga atgcagtggt ggacttgcaa gacagaaaca cagaactgtc
20040ctaccagctc ttgcttgatt ccatgggtga cagaaccaga tacttttcca tgtggaatca
20100ggcagtggac agttatgacc cagatgttag aattattgaa aatcatggaa ctgaagacga
20160gctccccaac tattgtttcc ctctgggtgg cataggggta actgacactt accaggctgt
20220taaaaccaac aatggcaata acgggggcca ggtgacttgg acaaaagatg aaacttttgc
20280agatcgcaat gaaatagggg tgggaaacaa tttcgctatg gagatcaacc tcagtgccaa
20340cctgtggaga aacttcctgt actccaacgt ggcgctgtac ctaccagaca agcttaagta
20400caacccctcc aatgtggaca tctctgacaa ccccaacacc tacgattaca tgaacaagcg
20460agtggtggcc ccggggctgg tggactgcta catcaacctg ggcgcgcgct ggtcgctgga
20520ctacatggac aacgtcaacc ccttcaacca ccaccgcaat gcgggcctgc gctaccgctc
20580catgctcctg ggcaacgggc gctacgtgcc cttccacatc caggtgcccc agaagttctt
20640tgccatcaag aacctcctcc tcctgccggg ctcctacacc tacgagtgga acttcaggaa
20700ggatgtcaac atggtcctcc agagctctct gggtaacgat ctcagggtgg acggggccag
20760catcaagttc gagagcatct gcctctacgc caccttcttc cccatggccc acaacacggc
20820ctccacgctc gaggccatgc tcaggaacga caccaacgac cagtccttca atgactacct
20880ctccgccgcc aacatgctct accccatacc cgccaacgcc accaacgtcc ccatctccat
20940cccctcgcgc aactgggcgg ccttccgcgg ctgggccttc acccgcctca agaccaagga
21000gaccccctcc ctgggctcgg gattcgaccc ctactacacc tactcgggct ccattcccta
21060cctggacggc accttctacc tcaaccacac tttcaagaag gtctcggtca ccttcgactc
21120ctcggtcagc tggccgggca acgaccgtct gctcaccccc aacgagttcg agatcaagcg
21180ctcggtcgac ggggagggct acaacgtggc ccagtgcaac atgaccaagg actggttcct
21240ggtccagatg ctggccaact acaacatcgg ctaccagggc ttctacatcc cagagagcta
21300caaggacagg atgtactcct tcttcaggaa cttccagccc atgagccggc aggtggtgga
21360ccagaccaag tacaaggact accaggaggt gggcatcatc caccagcaca acaactcggg
21420cttcgtgggc tacctcgccc ccaccatgcg cgagggacag gcctaccccg ccaacttccc
21480ctatccgctc ataggcaaga ccgcggtcga cagcatcacc cagaaaaagt tcctctgcga
21540ccgcaccctc tggcgcatcc ccttctccag caacttcatg tccatgggtg cgctctcgga
21600cctgggccag aacttgctct acgccaactc cgcccacgcc ctcgacatga ccttcgaggt
21660cgaccccatg gacgagccca cccttctcta tgttctgttc gaagtctttg acgtggtccg
21720ggtccaccag ccgcaccgcg gcgtcatcga gaccgtgtac ctgcgtacgc ccttctcggc
21780cggcaacgcc accacctaaa gaagcaagcc gcagtcatcg ccgcctgcat gccgtcgggt
21840tccaccgagc aagagctcag ggccatcgtc agagacctgg gatgcgggcc ctattttttg
21900ggcaccttcg acaagcgctt ccctggcttt gtctccccac acaagctggc ctgcgccatc
21960gtcaacacgg ccggccgcga gaccgggggc gtgcactggc tggccttcgc ctggaacccg
22020cgctccaaaa catgcttcct ctttgacccc ttcggctttt cggaccagcg gctcaagcaa
22080atctacgagt tcgagtacga gggcttgctg cgtcgcagcg ccatcgcctc ctcgcccgac
22140cgctgcgtca ccctcgaaaa gtccacccag accgtgcagg ggcccgactc ggccgcctgc
22200ggtctcttct gctgcatgtt tctgcacgcc tttgtgcact ggcctcagag tcccatggac
22260cgcaacccca ccatgaactt gctgacgggg gtgcccaact ccatgctcca gagcccccag
22320gtcgagccca ccctgcgccg caaccaggag cagctctaca gcttcctgga gcgccactcg
22380ccttacttcc gccgccacag cgcacagatc aggagggcca cctccttctg ccacttgcaa
22440gagatgcaag aagggtaata acgatgtaca cacttttttt ctcaataaat ggcatctttt
22500tatttataca agctctctgg ggtattcatt tcccaccacc acccgccgtt gtcgccatct
22560ggctctattt agaaatcgaa agggttctgc cgggagtcgc cgtgcgccac gggcagggac
22620acgttgcgat actggtagcg ggtgccccac ttgaactcgg gcaccaccag gcgaggcagc
22680tcggggaagt tttcgctcca caggctgcgg gtcagcacca gcgcgttcat caggtcgggc
22740gccgagatct tgaagtcgca gttggggccg ccgccctgcg cgcgcgagtt gcggtacacc
22800gggttgcagc actggaacac caacagcgcc gggtgcttca cgctggccag cacgctgcgg
22860tcggagatca gctcggcgtc caggtcctcc gcgttgctca gcgcgaacgg ggtcatcttg
22920ggcacttgcc gccccaggaa gggcgcgtgc cccggtttcg agttgcagtc gcagcgcagc
22980gggatcagca ggtgcccgtg cccggactcg gcgttggggt acagcgcgcg catgaaggcc
23040tgcatctggc ggaaggccat ctgggccttg gcgccctccg agaagaacat gccgcaggac
23100ttgcccgaga actggtttgc ggggcagctg gcgtcgtgca ggcagcagcg cgcgtcggtg
23160ttggcgatct gcaccacgtt gcgcccccac cggttcttca cgatcttggc cttggacgat
23220tgctccttca gcgcgcgctg cccgttctcg ctggtcacat ccatctcgat cacatgttcc
23280ttgttcacca tgctgctgcc gtgcagacac ttcagctcgc cctccgtctc ggtgcagcgg
23340tgctgccaca gcgcgcagcc cgtgggctcg aaagacttgt aggtcacctc cgcgaaggac
23400tgcaggtacc cctgcaaaaa gcggcccatc atggtcacga aggtcttgtt gctgctgaag
23460gtcagctgca gcccgcggtg ctcctcgttc agccaggtct tgcacacggc cgccagcgcc
23520tccacctggt cgggcagcat cttgaagttc accttcagct cattctccac gtggtacttg
23580tccatcagcg tgcgcgccgc ctccatgccc ttctcccagg ccgacaccag cggcaggctc
23640acggggttct tcaccatcac cgtggccgcc gcctccgccg cgctttcgct ttccgccccg
23700ctgttctctt cctcttcctc ctcttcctcg ccgccgccca ctcgcagccc ccgcaccacg
23760gggtcgtctt cctgcaggcg ctgcaccttg cgcttgccgt tgcgcccctg cttgatgcgc
23820acgggcgggt tgctgaagcc caccatcacc agcgcggcct cttcttgctc gtcctcgctg
23880tccagaatga cctccgggga gggggggttg gtcatcctca gtaccgaggc acgcttcttt
23940ttcttcctgg gggcgttcgc cagctccgcg gctgcggccg ctgccgaggt cgaaggccga
24000gggctgggcg tgcgcggcac cagcgcgtcc tgcgagccgt cctcgtcctc ctcggactcg
24060agacggaggc gggcccgctt cttcgggggc gcgcggggcg gcggaggcgg cggcggcgac
24120ggagacgggg acgagacatc gtccagggtg ggtggacggc gggccgcgcc gcgtccgcgc
24180tcgggggtgg tctcgcgctg gtcctcttcc cgactggcca tctcccactg ctccttctcc
24240tataggcaga aagagatcat ggagtctctc atgcgagtcg agaaggagga ggacagccta
24300accgccccct ctgagccctc caccaccgcc gccaccaccg ccaatgccgc cgcggacgac
24360gcgcccaccg agaccaccgc cagtaccacc ctccccagcg acgcaccccc gctcgagaat
24420gaagtgctga tcgagcagga cccgggtttt gtgagcggag aggaggatga ggtggatgag
24480aaggagaagg aggaggtcgc cgcctcagtg ccaaaagagg ataaaaagca agaccaggac
24540gacgcagata aggatgagac agcagtcggg cgggggaacg gaagccatga tgctgatgac
24600ggctacctag acgtgggaga cgacgtgctg cttaagcacc tgcaccgcca gtgcgtcatc
24660gtctgcgacg cgctgcagga gcgctgcgaa gtgcccctgg acgtggcgga ggtcagccgc
24720gcctacgagc ggcacctctt cgcgccgcac gtgcccccca agcgccggga gaacggcacc
24780tgcgagccca acccgcgtct caacttctac ccggtcttcg cggtacccga ggtgctggcc
24840acctaccaca tctttttcca aaactgcaag atccccctct cctgccgcgc caaccgcacc
24900cgcgccgaca aaaccctgac cctgcggcag ggcgcccaca tacctgatat cgcctctctg
24960gaggaagtgc ccaagatctt cgagggtctc ggtcgcgacg agaaacgggc ggcgaacgct
25020ctgcacggag acagcgaaaa cgagagtcac tcgggggtgc tggtggagct cgagggcgac
25080aacgcgcgcc tggccgtact caagcgcagc atagaggtca cccactttgc ctacccggcg
25140ctcaacctgc cccccaaggt catgagtgtg gtcatgggcg agctcatcat gcgccgcgcc
25200cagcccctgg ccgcggatgc aaacttgcaa gagtcctccg aggaaggcct gcccgcggtc
25260agcgacgagc agctggcgcg ctggctggag acccgcgacc ccgcgcagct ggaggagcgg
25320cgcaagctca tgatggccgc ggtgctggtc accgtggagc tcgagtgtct gcagcgcttc
25380ttcgcggacc ccgagatgca gcgcaagctc gaggagaccc tgcactacac cttccgccag
25440ggctacgtgc gccaggcctg caagatctcc aacgtggagc tctgcaacct ggtctcctac
25500ctgggcatcc tgcacgagaa ccgcctcggg cagaacgtcc tgcactccac cctcaaaggg
25560gaggcgcgcc gcgactacat ccgcgactgc gcctacctct tcctctgcta cacctggcag
25620acggccatgg gggtctggca gcagtgcctg gaggagcgca acctcaagga gctggaaaag
25680ctcctcaagc gcaccctcag ggacctctgg acgggcttca acgagcgctc ggtggccgcc
25740gcgctggcgg acatcatctt tcccgagcgc ctgctcaaga ccctgcagca gggcctgccc
25800gacttcacca gccagagcat gctgcagaac ttcaggactt tcatcctgga gcgctcgggc
25860atcctgccgg ccacttgctg cgcgctgccc agcgacttcg tgcccatcaa gtacagggag
25920tgcccgccgc cgctctgggg ccactgctac ctcttccagc tggccaacta cctcgcctac
25980cactcggacc tcatggaaga cgtgagcggc gagggcctgc tcgagtgcca ctgccgctgc
26040aacctctgca cgccccaccg ctctctagtc tgcaacccgc agctgctcag cgagagtcag
26100attatcggta ccttcgagct gcagggtccc tcgcctgacg agaagtccgc ggctccaggg
26160ctgaaactca ctccggggct gtggacttcc gcctacctac gcaaatttgt acctgaggac
26220taccacgccc acgagatcag gttctacgaa gaccaatccc gcccgcccaa ggcggagctc
26280accgcctgcg tcatcaccca ggggcacatc ctgggccaat tgcaagccat caacaaagcc
26340cgccgagagt tcttgctgaa aaagggtcgg ggggtgtacc tggaccccca gtccggcgag
26400gagctaaacc cgctaccccc gccgccgccc cagcagcggg accttgcttc ccaggatggc
26460acccagaaag aagcagcagc cgccgccgcc gccgcagcca tacatgcttc tggaggaaga
26520ggaggaggac tgggacagtc aggcagagga ggtttcggac gaggagcagg aggagatgat
26580ggaagactgg gaggaggaca gcagcctaga cgaggaagct tcagaggccg aagaggtggc
26640agacgcaaca ccatcgccct cggtcgcagc cccctcgccg gggcccctga aatcctccga
26700acccagcacc agcgctataa cctccgctcc tccggcgccg gcgccacccg cccgcagacc
26760caaccgtaga tgggacacca caggaaccgg ggtcggtaag tccaagtgcc cgccgccgcc
26820accgcagcag cagcagcagc agcgccaggg ctaccgctcg tggcgcgggc acaagaacgc
26880catagtcgcc tgcttgcaag actgcggggg caacatctct ttcgcccgcc gcttcctgct
26940attccaccac ggggtcgcct ttccccgcaa tgtcctgcat tactaccgtc atctctacag
27000cccctactgc agcggcgacc cagaggcggc agcggcagcc acagcggcga ccaccaccta
27060ggaagatatc ctccgcgggc aagacagcgg cagcagcggc caggagaccc gcggcagcag
27120cggcgggagc ggtgggcgca ctgcgcctct cgcccaacga acccctctcg acccgggagc
27180tcagacacag gatcttcccc actttgtatg ccatcttcca acagagcaga ggccaggagc
27240aggagctgaa aataaaaaac agatctctgc gctccctcac ccgcagctgt ctgtatcaca
27300aaagcgaaga tcagcttcgg cgcacgctgg aggacgcgga ggcactcttc agcaaatact
27360gcgcgctcac tcttaaagac tagctccgcg cccttctcga atttaggcgg gagaaaacta
27420cgtcatcgcc ggccgccgcc cagcccgccc agccgagatg agcaaagaga ttcccacgcc
27480atacatgtgg agctaccagc cgcagatggg actcgcggcg ggagcggccc aggactactc
27540cacccgcatg aactacatga gcgcgggacc ccacatgatc tcacaggtca acgggatccg
27600cgcccagcga aaccaaatac tgctggaaca ggcggccatc accgccacgc cccgccataa
27660tctcaacccc cgaaattggc ccgccgccct cgtgtaccag gaaaccccct ccgccaccac
27720cgtactactt ccgcgtgacg cccaggccga agtccagatg actaactcag gggcgcagct
27780cgcgggcggc tttcgtcacg gggcgcggcc gctccgacca ggtataagac acctgatgat
27840cagaggccga ggtatccagc tcaacgacga gtcggtgagc tcttcgctcg gtctccgtcc
27900ggacggaact ttccagctcg ccggatccgg ccgctcttcg ttcacgcccc gccaggcgta
27960cctgactctg cagacctcgt cctcggagcc ccgctccggc ggcatcggaa ccctccagtt
28020cgtggaggag ttcgtgccct cggtctactt caaccccttc tcgggacctc ccggacgcta
28080ccccgaccag ttcattccga actttgacgc ggtgaaggac tcggcggacg gctacgactg
28140aatgtcaggt gtcgaggcag agcagcttcg cctgagacac ctcgagcact gccgccgcca
28200caagtgcttc gcccgcggtt ctggtgagtt ctgctacttt cagctacccg aggagcatac
28260cgaggggccg gcgcacggcg tccgcctgac cacccagggc gaggttacct gttccctcat
28320ccgggagttt accctccgtc ccctgctagt ggagcgggag cggggtccct gtgtcctaac
28380tatcgcctgc aactgcccta accctggatt acatcaagat ctttgctgtc atctctgtgc
28440tgagtttaat aaacgctgag atcagaatct actggggctc ctgtcgccat cctgtgaacg
28500ccaccgtctt cacccacccc gaccaggccc aggcgaacct cacctgcggt ctgcatcgga
28560gggccaagaa gtacctcacc tggtacttca acggcacccc ctttgtggtt tacaacagct
28620tcgacgggga cggagtctcc ctgaaagacc agctctccgg tctcagctac tccatccaca
28680agaacaccac cctccaactc ttccctccct acctgccggg aacctacgag tgcgtcaccg
28740gccgctgcac ccacctcacc cgcctgatcg taaaccagag ctttccggga acagataact
28800ccctcttccc cagaacagga ggtgagctca ggaaactccc cggggaccag ggcggagacg
28860taccttcgac ccttgtgggg ttaggatttt ttattaccgg gttgctggct cttttaatca
28920aagtttcctt gagatttgtt ctttccttct acgtgtatga acacctcaac ctccaataac
28980tctacccttt cttcggaatc aggtgacttc tctgaaatcg ggcttggtgt gctgcttact
29040ctgttgattt ttttccttat catactcagc cttctgtgcc tcaggctcgc cgcctgctgc
29100gcacacatct atatctactg ctggttgctc aagtgcaggg gtcgccaccc aagatgaaca
29160ggtacatggt cctatcgatc ctaggcctgc tggccctggc ggcctgcagc gccgccaaaa
29220aagagattac ctttgaggag cccgcttgca atgtaacttt caagcccgag ggtgaccaat
29280gcaccaccct cgtcaaatgc gttaccaatc atgagaggct gcgcatcgac tacaaaaaca
29340aaactggcca gtttgcggtc tatagtgtgt ttacgcccgg agacccctct aactactctg
29400tcaccgtctt ccagggcgga cagtctaaga tattcaatta cactttccct ttttatgagt
29460tatgcgatgc ggtcatgtac atgtcaaaac agtacaacct gtggcctccc tctccccagg
29520cgtgtgtgga aaatactggg tcttactgct gtatggcttt cgcaatcact acgctcgctc
29580taatctgcac ggtgctatac ataaaattca ggcagaggcg aatctttatc gatgaaaaga
29640aaatgccttg atcgctaaca ccggctttct atctgcagaa tgaatgcaat cacctcccta
29700ctaatcacca ccaccctcct tgcgattgcc catgggttga cacgaatcga agtgccagtg
29760gggtccaatg tcaccatggt gggccccgcc ggcaattcca ccctcatgtg ggaaaaattt
29820gtccgcaatc aatgggttca tttctgctct aaccgaatca gtatcaagcc cagagccatc
29880tgcgatgggc aaaatctaac tctgatcaat gtgcaaatga tggatgctgg gtactattac
29940gggcagcggg gagaaatcat taattactgg cgaccccaca aggactacat gctgcatgta
30000gtcgaggcac ttcccactac cacccccact accacctctc ccaccaccac caccactact
30060actactacta ctactactac tactactacc actaccgctg cccgccatac ccgcaaaagc
30120accatgatta gcacaaagcc ccctcgtgct cactcccacg ccggcgggcc catcggtgcg
30180acctcagaaa ccaccgagct ttgcttctgc caatgcacta acgccagcgc tcatgaactg
30240ttcgacctgg agaatgagga tgtccagcag agctccgctt gcctgaccca ggaggctgtg
30300gagcccgttg ccctgaagca gatcggtgat tcaataattg actcttcttc ttttgccact
30360cccgaatacc ctcccgattc tactttccac atcacgggta ccaaagaccc taacctctct
30420ttctacctga tgctgctgct ctgtatctct gtggtctctt ccgcgctgat gttactgggg
30480atgttctgct gcctgatctg ccgcagaaag agaaaagctc gctctcaggg ccaaccactg
30540atgcccttcc cctacccccc ggattttgca gataacaaga tatgagctcg ctgctgacac
30600taaccgcttt actagcctgc gctctaaccc ttgtcgcttg cgactcgaga ttccacaatg
30660tcacagctgt ggcaggagaa aatgttactt tcaactccac ggccgatacc cagtggtcgt
30720ggagtggctc aggtagctac ttaactatct gcaatagctc cacttccccc ggcatatccc
30780caaccaagta ccaatgcaat gccagcctgt tcaccctcat caacgcttcc accctggaca
30840atggactcta tgtaggctat gtaccctttg gtgggcaagg aaagacccac gcttacaacc
30900tggaagttcg ccagcccaga accactaccc aagcttctcc caccaccacc accaccacca
30960ccatcaccag cagcagcagc agcagcagcc acagcagcag cagcagatta ttgactttgg
31020ttttggccag ctcatctgcc gctacccagg ccatctacag ctctgtgccc gaaaccactc
31080agatccaccg cccagaaacg accaccgcca ccaccctaca cacctccagc gatcagatgc
31140cgaccaacat cacccccttg gctcttcaaa tgggacttac aagccccact ccaaaaccag
31200tggatgcggc cgaggtctcc gccctcgtca atgactgggc ggggctggga atgtggtggt
31260tcgccatagg catgatggcg ctctgcctgc ttctgctctg gctcatctgc tgcctccacc
31320gcaggcgagc cagacccccc atctatagac ccatcattgt cctgaacccc gataatgatg
31380ggatccatag attggatggc ctgaaaaacc tacttttttc ttttacagta tgataaattg
31440agacatgcct cgcattttct tgtacatgtt ccttctccca ccttttctgg ggtgttctac
31500gctggccgct gtgtctcacc tggaggtaga ctgcctctca cccttcactg tctacctgct
31560ttacggattg gtcaccctca ctctcatctg cagcctaatc acagtaatca tcgccttcat
31620ccagtgcatt gattacatct gtgtgcgcct cgcatacttc agacaccacc cgcagtaccg
31680agacaggaac attgcccaac ttctaagact gctctaatca tgcataagac tgtgatctgc
31740cttctgatcc tctgcatcct gcccaccctc acctcctgcc agtacaccac aaaatctccg
31800cgcaaaagac atgcctcctg ccgcttcacc caactgtgga atatacccaa atgctacaac
31860gaaaagagcg agctctccga agcttggctg tatggggtca tctgtgtctt agttttctgc
31920agcactgtct ttgccctcat aatctacccc tactttgatt tgggatggaa cgcgatcgat
31980gccatgaatt accccacctt tcccgcaccc gagataattc cactgcgaca agttgtaccc
32040gttgtcgtta atcaacgccc cccatcccct acgcccactg aaatcagcta ctttaaccta
32100acaggcggag atgactgacg ccctagatct agaaatggac ggcatcagta ccgagcagcg
32160tctcctagag aggcgcaggc aggcggctga gcaagagcgc ctcaatcagg agctccgaga
32220tctcgttaac ctgcaccagt gcaaaagagg catcttttgt ctggtaaagc aggccaaagt
32280cacctacgag aagaccggca acagccaccg cctcagttac aaattgccca cccagcgcca
32340gaagctggtg ctcatggtgg gtgagaatcc catcaccgtc acccagcact cggtagagac
32400cgaggggtgt ctgcactccc cctgtcgggg tccagaagac ctctgcaccc tggtaaagac
32460cctgtgcggt ctcagagatt tagtcccctt taactaatca aacactggaa tcaataaaaa
32520gaatcactta cttaaaatca gacagcaggt ctctgtccag tttattcagc agcacctcct
32580tcccctcctc ccaactctgg tactccaaac gccttctggc ggcaaacttc ctccacaccc
32640tgaagggaat gtcagattct tgctcctgtc cctccgcacc cactatcttc atgttgttgc
32700agatgaagcg caccaaaacg tctgacgaga gcttcaaccc cgtgtacccc tatgacacgg
32760aaagcggccc tccctccgtc cctttcctca cccctccctt cgtgtctccc gatggattcc
32820aagaaagtcc ccccggggtc ctgtctctga acctggccga gcccctggtc acttcccacg
32880gcatgctcgc cctgaaaatg ggaagtggcc tctccctgga cgacgctggc aacctcacct
32940ctcaagatat caccaccgct agccctcccc tcaaaaaaac caagaccaac ctcagcctag
33000aaacctcatc ccccctaact gtgagcacct caggcgccct caccgtagca gccgccgctc
33060ccctggcggt ggccggcacc tccctcacca tgcaatcaga ggcccccctg acagtacagg
33120atgcaaaact caccctggcc accaaaggcc ccctgaccgt gtctgaaggc aaactggcct
33180tgcaaacatc ggccccgctg acggccgctg acagcagcac cctcacagtc agtgccacac
33240caccccttag cacaagcaat ggcagcttgg gtattgacat gcaagccccc atttacacca
33300ccaatggaaa actaggactt aactttggcg ctcccctgca tgtggtagac agcctaaatg
33360cactgactgt agttactggc caaggtctta cgataaacgg aacagcccta caaactagag
33420tctcaggtgc cctcaactat gacacatcag gaaacctaga attgagagct gcagggggta
33480tgcgagttga tgcaaatggt caacttatcc ttgatgtagc ttacccattt gatgcacaaa
33540acaatctcag ccttaggctt ggacagggac ccctgtttgt taactctgcc cacaacttgg
33600atgttaacta caacagaggc ctctacctgt tcacatctgg aaataccaaa aagctagaag
33660ttaatatcaa aacagccaag ggtctcattt atgatgacac tgctatagca atcaatgcgg
33720gtgatgggct acagtttgac tcaggctcag atacaaatcc attaaaaact aaacttggat
33780taggactgga ttatgactcc agcagagcca taattgctaa actgggaact ggcctaagct
33840ttgacaacac aggtgccatc acagtaggca acaaaaatga tgacaagctt accttgtgga
33900ccacaccaga cccatcccct aactgtagaa tctattcaga gaaagatgct aaattcacac
33960ttgttttgac taaatgcggc agtcaggtgt tggccagcgt ttctgtttta tctgtaaaag
34020gtagccttgc gcccatcagt ggcacagtaa ctagtgctca gattgtcctc agatttgatg
34080aaaatggagt tctactaagc aattcttccc ttgaccctca atactggaac tacagaaaag
34140gtgaccttac agagggcact gcatatacca acgcagtggg atttatgccc aacctcacag
34200catacccaaa aacacagagc caaactgcta aaagcaacat tgtaagtcag gtttacttga
34260atggggacaa atccaaaccc atgaccctca ccattaccct caatggaact aatgaaacag
34320gagatgccac agtaagcact tactccatgt cattctcatg gaactggaat ggaagtaatt
34380acattaatga aacgttccaa accaactcct tcaccttctc ctacatcgcc caagaataaa
34440aagcatgacg ctgttgattt gattcaatgt gtttctgttt tattttcaag cacaacaaaa
34500tcattcaagt cattcttcca tcttagctta atagacacag tagcttaata gacccagtag
34560tgcaaagccc cattctagct tataactagt ggagaagtac tcgcctacat gggggtagag
34620tcataatcgt gcatcaggat agggcggtgg tgctgcagca gcgcgcgaat aaactgctgc
34680cgccgccgct ccgtcctgca ggaatacaac atggcagtgg tctcctcagc gatgattcgc
34740accgcccgca gcataaggcg ccttgtcctc cgggcacagc agcgcaccct gatctcactt
34800aaatcagcac agtaactgca gcacagcacc acaatattgt tcaaaatccc acagtgcaag
34860gcgctgtatc caaagctcat ggcggggacc acagaaccca cgtggccatc ataccacaag
34920cgcaggtaga ttaagtggcg acccctcata aacacgctgg acataaacat tacctctttt
34980ggcatgttgt aattcaccac ctcccggtac catataaacc tctgattaaa catggcgcca
35040tccaccacca tcctaaacca gctggccaaa acctgcccgc cggctataca ctgcagggaa
35100ccgggactgg aacaatgaca gtggagagcc caggactcgt aaccatggat catcatgctc
35160gtcatgatat caatgttggc acaacacagg cacacgtgca tacacttcct caggattaca
35220agctcctccc gcgttagaac catatcccag ggaacaaccc attcctgaat cagcgtaaat
35280cccacactgc agggaagacc tcgcacgtaa ctcacgttgt gcattgtcaa agtgttacat
35340tcgggcagca gcggatgatc ctccagtatg gtagcgcggg tttctgtctc aaaaggaggt
35400agacgatccc tactgtacgg agtgcgccga gacaaccgag atcgtgttgg tcgtagtgtc
35460atgccaaatg gaacgccgga cgtagtcata tttcctgaag tcttagatct ctcaacgcag
35520caccagcacc aacacttcgc agtgtaaaag gccaagtgcc gagagagtat atataggaat
35580aaaaagtgac gtaaacgggc aaagtccaaa aaacgcccag aaaaaccgca cgcgaaccta
35640cgccccgaaa cgaaagccaa aaaacactag acactccctt ccggcgtcaa cttccgcttt
35700cccacgctac gtcacttgcc ccagtcaaac aaactacata tcccgaactt ccaagtcgcc
35760acgcccaaaa caccgcctac acctccccgc ccgccggccc gcccccaaac ccgcctcccg
35820ccccgcgccc cgccccgcgc cgcccatctc attatcatat tggcttcaat ccaaaataag
35880gtatattatt gatgatggtt taaacggatc caattcttga agacgaaagg gcctcgtgat
35940acgcctattt ttataggtta atgtcatgat aataatggtt tcttagacgt caggtggcac
36000ttttcgggga aatgtgcgcg gaacccctat ttgtttattt ttctaaatac attcaaatat
36060gtatccgctc atgagacaat aaccctgata aatgcttcaa taatattgaa aaaggaagag
36120tatgagtatt caacatttcc gtgtcgccct tattcccttt tttgcggcat tttgccttcc
36180tgtttttgct cacccagaaa cgctggtgaa agtaaaagat gctgaagatc agttgggtgc
36240acgagtgggt tacatcgaac tggatctcaa cagcggtaag atccttgaga gttttcgccc
36300cgaagaacgt tttccaatga tgagcacttt taaagttctg ctatgtggcg cggtattatc
36360ccgtgttgac gccgggcaag agcaactcgg tcgccgcata cactattctc agaatgactt
36420ggttgagtac tcaccagtca cagaaaagca tcttacggat ggcatgacag taagagaatt
36480atgcagtgct gccataacca tgagtgataa cactgcggcc aacttacttc tgacaacgat
36540cggaggaccg aaggagctaa ccgctttttt gcacaacatg ggggatcatg taactcgcct
36600tgatcgttgg gaaccggagc tgaatgaagc cataccaaac gacgagcgtg acaccacgat
36660gcctgtagca atggcaacaa cgttgcgcaa actattaact ggcgaactac ttactctagc
36720ttcccggcaa caattaatag actggatgga ggcggataaa gttgcaggac cacttctgcg
36780ctcggccctt ccggctggct ggtttattgc tgataaatct ggagccggtg agcgtgggtc
36840tcgcggtatc attgcagcac tggggccaga tggtaagccc tcccgtatcg tagttatcta
36900cacgacgggg agtcaggcaa ctatggatga acgaaataga cagatcgctg agataggtgc
36960ctcactgatt aagcattggt aactgtcaga ccaagtttac tcatatatac tttagattga
37020tttaaaagga tctaggtgaa gatccttttt gataatctca tgaccaaaat cccttaacgt
37080gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat
37140cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg
37200gtttgtttgc cggatcaaga gctaccaact ctttttccga aggtaactgg cttcagcaga
37260gcgcagatac caaatactgt ccttctagtg tagccgtagt taggccacca cttcaagaac
37320tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt
37380ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag
37440cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc
37500gaactgagat acctacagcg tgagctatga gaaagcgcca cgcttcccga agggagaaag
37560gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca
37620gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt
37680cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc
37740tttttacggt tcctggcctt ttgctggcct tgaagctgtc cctgatggtc gtcatctacc
37800tgcctggaca gcatggcctg caacgcgggc atcccgatgc cgccggaagc gagaagaatc
37860ataatgggga aggccatcca gcctcgcgtc gcagatccga attcgtttaa ac
37912
SEQ ID NO: 8, length 43428, DNA, Simian adenovirus
catcatcaat aatatacctt attttggatt
gaagccaata tgataatgag atgggcggcg 60cggggcgggg cgcggggcgg gaggcgggtt
tgggggcggg ccggcgggcg gggcggtgtg 120gcggaagtgg actttgtaag tgtggcggat
gtgacttgct agtgccgggc gcggtaaaag 180tgacgttttc cgtgcgcgac aacgcccccg
ggaagtgaca tttttcccgc ggtttttacc 240ggatgttgta gtgaatttgg gcgtaaccaa
gtaagatttg gccattttcg cgggaaaact 300gaaacgggga agtgaaatct gattaatttt
gcgttagtca taccgcgtaa tatttgtcta 360gggccgaggg actttggccg attacgtgga
ggactcgccc aggtgttttt tgaggtgaat 420ttccgcgttc cgggtcaaag tctgcgtttt
attattatag gatatcccat tgcatacgtt 480gtatccatat cataatatgt acatttatat
tggctcatgt ccaacattac cgccatgttg 540acattgatta ttgactagtt attaatagta
atcaattacg gggtcattag ttcatagccc 600atatatggag ttccgcgtta cataacttac
ggtaaatggc ccgcctggct gaccgcccaa 660cgacccccgc ccattgacgt caataatgac
gtatgttccc atagtaacgc caatagggac 720tttccattga cgtcaatggg tggagtattt
acggtaaact gcccacttgg cagtacatca 780agtgtatcat atgccaagta cgccccctat
tgacgtcaat gacggtaaat ggcccgcctg 840gcattatgcc cagtacatga ccttatggga
ctttcctact tggcagtaca tctacgtatt 900agtcatcgct attaccatgg tgatgcggtt
ttggcagtac atcaatgggc gtggatagcg 960gtttgactca cggggatttc caagtctcca
ccccattgac gtcaatggga gtttgttttg 1020gcaccaaaat caacgggact ttccaaaatg
tcgtaacaac tccgccccat tgacgcaaat 1080gggcggtagg cgtgtacggt gggaggtcta
tataagcaga gctctcccta tcagtgatag 1140agatctccct atcagtgata gagatcgtcg
acgagctcgt ttagtgaacc gtcagatcgc 1200ctggagacgc catccacgct gttttgacct
ccatagaaga caccgggacc gatccagcct 1260ccgcggccgg gaacggtgca ttggaacgcg
gattccccgt gccaagagtg agatcttccg 1320tttatctagg taccgggccc cccctcgagg
tcgacggtat cgataagctt cacgctgccg 1380caagcactca gggcgcaagg gctgctaaag
gaagcggaac acgtagaaag ccagtccgca 1440gaaacggtgc tgaccccgga tgaatgtcag
ctactgggct atctggacaa gggaaaacgc 1500aagcgcaaag agaaagcagg tagcttgcag
tgggcttaca tggcgatagc tagactgggc 1560ggttttatgg acagcaagcg aaccggaatt
gccagctggg gcgccctctg gtaaggttgg 1620gaagccctgc aaagtaaact ggatggcttt
cttgccgcca aggatctgat ggcgcagggg 1680atcaagatct aaccaggagc tatttaatgg
caacagttaa ccagctggta cgcaaaccac 1740gtgctcgcaa agttgcgaaa agcaacgtgc
ctgcgctgga agcatgcccg caaaaacgtg 1800gcgtatgtac tcgtgtatat actaccactc
ctaaaaaacc gaactccgcg ctgcgtaaag 1860tatgccgtgt tcgtctgact aacggtttcg
aagtgacttc ctacatcggt ggtgaaggtc 1920acaacctgca ggagcactcc gtgatcctga
tccgtggcgg tcgtgttaaa gacctcccgg 1980gtgttcgtta ccacaccgta cgtggtgcgc
ttgactgctc cggcgttaaa gaccgtaagc 2040aggctcgttc caagtatggc gtgaagcgtc
ctaaggctta atggtagatc tgatcaagag 2100acaggatgac ggtcgtttcg catgcttgaa
caagatggat tgcacgcagg ttctccggcc 2160gcttgggtgg agaggctatt cggctatgac
tgggcacaac agacaatcgg ctgctctgat 2220gccgccgtgt tccggctgtc agcgcagggg
cgcccggttc tttttgtcaa gaccgacctg 2280tccggtgccc tgaatgaact gcaggacgag
gcagcgcggc tatcgtggct ggccacgacg 2340ggcgttcctt gcgcagctgt gctcgacgtt
gtcactgaag cgggaaggga ctggctgcta 2400ttgggcgaag tgccggggca ggatctcctg
tcatctcacc ttgctcctgc cgagaaagta 2460tccatcatgg ctgatgcaat gcggcggctg
catacgcttg atccggctac ctgcccattc 2520gaccaccaag cgaaacatcg catcgagcga
gcacgtactc ggatggaagc cggtcttgtc 2580gatcaggatg atctggacga agagcatcag
gggctcgcgc cagccgaact gttcgccagg 2640ctcaaggcgc gcatgcccga cggcgaggat
ctcgtcgtga cccatggcga tgcctgcttg 2700ccgaatatca tggtggaaaa tggccgcttt
tctggattca tcgactgtgg ccggctgggt 2760gtggcggacc gctatcagga catagcgttg
gctacccgtg atattgctga agagcttggc 2820ggcgaatggg ctgaccgctt cctcgtgctt
tacggtatcg ccgctcccga ttcgcagcgc 2880atcgccttct atcgccttct tgacgagttc
ttctgagcgg gactctgggg ttcgaaatga 2940ccgaccaagc gacgcccaac ctgccatcac
gagatttcga ttccaccgcc gccttctatg 3000aaaggttggg cttcggaatc gttttccggg
acgccggctg gatgatcctc cagcgcgggg 3060atctcatgct ggagttcttc gcccaccccg
ggctcgatcc cctcgggggg aatcagaatt 3120cagtcgacag cggccgcgat ctgctgtgcc
ttctagttgc cagccatctg ttgtttgccc 3180ctcccccgtg ccttccttga ccctggaagg
tgccactccc actgtccttt cctaataaaa 3240tgaggaaatt gcatcgcatt gtctgagtag
gtgtcattct attctggggg gtggggtggg 3300gcaggacagc aagggggagg attgggaaga
caatagcagg catgctgggg atgcggtggg 3360ctctatggcc gatcagcgat cgctgaggtg
ggtgagtggg cgtggcctgg ggtggtcatg 3420aaaatatata agttgggggt cttagggtct
ctttatttgt gttgcagaga ccgccggagc 3480catgagcggg agcagcagca gcagcagtag
cagcagcgcc ttggatggca gcatcgtgag 3540cccttatttg acgacgcgga tgccccactg
ggccggggtg cgtcagaatg tgatgggctc 3600cagcatcgac ggccgacccg tcctgcccgc
aaattccgcc acgctgacct atgcgaccgt 3660cgcggggacg ccgttggacg ccaccgccgc
cgccgccgcc accgcagccg cctcggccgt 3720gcgcagcctg gccacggact ttgcattcct
gggaccactg gcgacagggg ctacttctcg 3780ggccgctgct gccgccgttc gcgatgacaa
gctgaccgcc ctgctggcgc agttggatgc 3840gcttactcgg gaactgggtg acctttctca
gcaggtcatg gccctgcgcc agcaggtctc 3900ctccctgcaa gctggcggga atgcttctcc
cacaaatgcc gtttaagata aataaaacca 3960gactctgttt ggattaaaga aaagtagcaa
gtgcattgct ctctttattt cataattttc 4020cgcgcgcgat aggccctaga ccagcgttct
cggtcgttga gggtgcggtg tatcttctcc 4080aggacgtggt agaggtggct ctggacgttg
agatacatgg gcatgagccc gtcccggggg 4140tggaggtagc accactgcag agcttcatgc
tccggggtgg tgttgtagat gatccagtcg 4200tagcaggagc gctgggcatg gtgcctaaaa
atgtccttca gcagcaggcc gatggccagg 4260gggaggccct tggtgtaagt gtttacaaaa
cggttaagtt gggaagggtg cattcgggga 4320gagatgatgt gcatcttgga ctgtattttt
agattggcga tgtttccgcc cagatccctt 4380ctgggattca tgttgtgcag gaccaccagt
acagtgtatc cggtgcactt ggggaatttg 4440tcatgcagct tagagggaaa agcgtggaag
aacttggaga cgcctttgtg gcctcccaga 4500ttttccatgc attcgtccat gatgatggca
atgggcccgc gggaggcagc ttgggcaaag 4560atatttctgg ggtcgctgac gtcgtagttg
tgttccaggg tgaggtcgtc ataggccatt 4620tttacaaagc gcgggcggag ggtgcccgac
tgggggatga tggtcccctc tggccctggg 4680gcgtagttgc cctcgcagat ctgcatttcc
caggccttaa tctcggaggg gggaatcata 4740tccacctgcg gggcgatgaa gaaaacggtt
tccggagccg gggagattaa ctgggatgag 4800agcaggtttc taagcagctg tgattttcca
caaccggtgg gcccataaat aacacctata 4860accggttgca gctggtagtt tagagagctg
cagctgccgt cgtcccggag gaggggggcc 4920acctcgttga gcatgtccct gacgcgcatg
ttctccccga ccagatccgc cagaaggcgc 4980tcgccgccca gggacagcag ctcttgcaag
gaagcaaagt ttttcagcgg cttgaggccg 5040tccgccgtgg gcatgttttt cagggtctgg
ctcagcagct ccaggcggtc ccagagctcg 5100gtgacgtgct ctacggcatc tctatccagc
atatctcctc gtttcgcggg ttggggcgac 5160tttcgctgta gggcaccaag cggtggtcgt
ccagcggggc cagagtcatg tccttccatg 5220ggcgcagggt cctcgtcagg gtggtctggg
tcacggtgaa ggggtgcgct ccgggctgag 5280cgcttgccaa ggtgcgcttg aggctggttc
tgctggtgct gaagcgctgc cggtcttcgc 5340cctgcgcgtc ggccaggtag catttgacca
tggtgtcata gtccagcccc tccgcggcgt 5400gtcccttggc gcgcagcttg cccttggagg
tggcgccgca cgaggggcag agcaggctct 5460tgagcgcgta gagcttgggg gcgaggaaga
ccgattcggg ggagtaggcg tccgcgccgc 5520agaccccgca cacggtctcg cactccacca
gccaggtgag ctcggggcgc gccgggtcaa 5580aaaccaggtt tcccccatgc tttttgatgc
gtttcttacc tcgggtctcc atgaggtggt 5640gtccccgctc ggtgacgaag aggctgtccg
tgtctccgta gaccgacttg aggggtcttt 5700tctccagggg ggtccctcgg tcttcctcgt
agaggaactc ggaccactct gagacgaagg 5760cccgcgtcca ggccaggacg aaggaggcta
tgtgggaggg gtagcggtcg ttgtccacta 5820gggggtccac cttctccaag gtgtgaagac
acatgtcgcc ttcctcggcg tccaggaagg 5880tgattggctt gtaggtgtag gccacgtgac
cgggggttcc tgacgggggg gtataaaagg 5940gggtgggggc gcgctcgtcg tcactctctt
ccgcatcgct gtctgcgagg gccagctgct 6000ggggtgagta ttccctctcg aaggcgggca
tgacctccgc gctgaggttg tcagtttcca 6060aaaacgagga ggatttgatg ttcacctgtc
ccgaggtgat acctttgagg gtacccgcgt 6120ccatctggtc agaaaacacg atctttttat
tgtccagctt ggtggcgaac gacccgtaga 6180gggcgttgga gagcagcttg gcgatggagc
gcagggtctg gttcttgtcc ctgtcggcgc 6240gctccttggc cgcgatgttg agctgcacgt
actcgcgcgc gacgcagcgc cactcgggga 6300agacggtggt gcgctcgtcg ggcaccaggc
gcacgcgcca gccgcggttg tgcagggtga 6360ccaggtccac gctggtggcg acctcgccgc
gcaggcgctc gttggtccag cagagacggc 6420cgcccttgcg cgagcagaag gggggcaggg
ggtcgagctg ggtctcgtcc ggggggtccg 6480cgtccacggt gaaaaccccg gggcgcaggc
gcgcgtcgaa gtagtctatc ttgcaacctt 6540gcatgtccag cgcctgctgc cagtcgcggg
cggcgagcgc gcgctcgtag gggttgagcg 6600gcgggcccca gggcatgggg tgggtgagtg
cggaggcgta catgccgcag atgtcataga 6660cgtagagggg ctcccgcagg accccgatgt
aggtggggta gcagcggccg ccgcggatgc 6720tggcgcgcac gtagtcatac agctcgtgcg
agggggcgag gaggtcgggg cccaggttgg 6780tgcgggcggg gcgctccgcg cggaagacga
tctgcctgaa gatggcatgc gagttggaag 6840agatggtggg gcgctggaag acgttgaagc
tggcgtcctg caggccgacg gcgtcgcgca 6900cgaaggaggc gtaggagtcg cgcagcttgt
gtaccagctc ggcggtgacc tgcacgtcga 6960gcgcgcagta gtcgagggtc tcgcggatga
tgtcatattt agcctgcccc ttctttttcc 7020acagctcgcg gttgaggaca aactcttcgc
ggtctttcca gtactcttgg atcgggaaac 7080cgtccggttc cgaacggtaa gagcctagca
tgtagaactg gttgacggcc tggtaggcgc 7140agcagccctt ctccacgggg agggcgtagg
cctgcgcggc cttgcggagc gaggtgtggg 7200tcagggcgaa ggtgtccctg accatgactt
tgaggtactg gtgcttgaag tcggagtcgt 7260cgcagccgcc ccgctcccag agcgagaagt
cggtgcgctt cttggagcgg gggttgggca 7320gagcgaaggt gacatcgttg aagaggattt
tgcccgcgcg gggcatgaag ttgcgggtga 7380tgcggaaggg ccccggcact tcagagcggt
tgttgatgac ctgggcggcg agcacgatct 7440cgtcgaagcc gttgatgttg tggcccacga
tgtagagttc caggaagcgg ggccggccct 7500ttacggtggg cagcttcttt agctcttcgt
aggtgagctc ctcgggcgag gcgaggccgt 7560gctcggccag ggcccagtcc gcgaggtgcg
ggttgtctct gaggaaggac ttccagaggt 7620cgcgggccag gagggtctgc aggcggtctc
tgaaggtcct gaactggcgg cccacggcca 7680ttttttcggg ggtgatgcag tagaaggtga
gggggtcttg ctgccagcgg tcccagtcga 7740gctgcagggc gaggtcgcgc gcggcggtga
ccaggcgctc gtcgcccccg aatttcatga 7800ccagcatgaa gggcacgagc tgctttccga
aggcccccat ccaagtgtag gtctctacat 7860cgtaggtgac aaagaggcgc tccgtgcgag
gatgcgagcc gatcgggaag aactggatct 7920cccgccacca gttggaggag tggctgttga
tgtggtggaa gtagaagtcc cgtcgccggg 7980ccgaacactc gtgctggctt ttgtaaaagc
gagcgcagta ctggcagcgc tgcacgggct 8040gtacctcatg cacgagatgc acctttcgcc
cgcgcacgag gaagccgagg ggaaatctga 8100gccccccgcc tggctcgcgg catggctggt
tctcttctac tttggatgcg tgtccgtctc 8160cgtctggctc ctcgaggggt gttacggtgg
agcggaccac cacgccgcgc gagccgcagg 8220tccagatatc ggcgcgcggc ggtcggagtt
tgatgacgac atcgcgcagc tgggagctgt 8280ccatggtctg gagctcccgc ggcggcggca
ggtcagccgg gagttcttgc aggttcacct 8340cgcagagtcg ggccagggcg cggggcaggt
ctaggtggta cctgatctct aggggcgtgt 8400tggtggcggc gtcgatggct tgcaggagcc
cgcagccccg gggggcgacg acggtgcccc 8460gcggggtggt ggtggtggtg gcggtgcagc
tcagaagcgg tgccgcgggc gggcccccgg 8520aggtaggggg ggctccggtc ccgcgggcag
gggcggcagc ggcacgtcgg cgtggagcgc 8580gggcaggagt tggtgctgtg cccggaggtt
gctggcgaag gcgacgacgc ggcggttgat 8640ctcctggatc tggcgcctct gcgtgaagac
gacgggcccg gtgagcttga acctgaaaga 8700gagttcgaca gaatcaatct cggtgtcatt
gaccgcggcc tggcgcagga tctcctgcac 8760gtctcccgag ttgtcttggt aggcgatctc
ggccatgaac tgctcgatct cttcctcctg 8820gaggtctccg cgtccggcgc gttccacggt
ggccgccagg tcgttggaga tgcgccccat 8880gagctgcgag aaggcgttga gtccgccctc
gttccagact cggctgtaga ccacgccccc 8940ctggtcatcg cgggcgcgca tgaccacctg
cgcgaggttg agctccacgt gccgcgcgaa 9000gacggcgtag ttgcgcagac gctggaagag
gtagttgagg gtggtggcgg tgtgctcggc 9060cacgaagaag ttcatgaccc agcggcgcaa
cgtggattcg ttgatgtccc ccaaggcctc 9120cagccgttcc atggcctcgt agaagtccac
ggcgaagttg aaaaactggg agttgcgcgc 9180cgacacggtc aactcctcct ccagaagacg
gatgagctcg gcgacggtgt cgcgcacctc 9240gcgctcgaag gctatgggga tctcttcctc
cgctagcatc accacctcct cctcttcctc 9300ctcttctggc acttccatga tggcttcctc
ctcttcgggg ggtggcggcg gcggcggtgg 9360gggagggggc gctctgcgcc ggcggcggcg
caccgggagg cggtccacga agcgcgcgat 9420catctccccg cggcggcggc gcatggtctc
ggtgacggcg cggccgttct cccgggggcg 9480cagttggaag acgccgccgg acatctggtg
ctggggcggg tggccgtgag gcagcgagac 9540ggcgctgacg atgcatctca acaattgctg
cgtaggtacg ccgccgaggg acctgaggga 9600gtccatatcc accggatccg aaaacctttc
gaggaaggcg tctaaccagt cgcagtcgca 9660aggtaggctg agcaccgtgg cgggcggcgg
ggggtggggg gagtgtctgg cggaggtgct 9720gctgatgatg taattgaagt aggcggactt
gacacggcgg atggtcgaca ggagcaccat 9780gtccttgggt ccggcctgct ggatgcggag
gcggtcggct atgccccagg cttcgttctg 9840gcatcggcgc aggtccttgt agtagtcttg
catgagcctt tccaccggca cctcttctcc 9900ttcctcttct gcttcttcca tgtctgcttc
ggccctgggg cggcgccgcg cccccctgcc 9960ccccatgcgc gtgaccccga accccctgag
cggttggagc agggccaggt cggcgacgac 10020gcgctcggcc aggatggcct gctgcacctg
cgtgagggtg gtttggaagt catccaagtc 10080cacgaagcgg tggtaggcgc ccgtgttgat
ggtgtaggtg cagttggcca tgacggacca 10140gttgacggtc tggtggcccg gttgcgacat
ctcggtgtac ctgagtcgcg agtaggcgcg 10200ggagtcgaag acgtagtcgt tgcaagtccg
caccaggtac tggtagccca ccaggaagtg 10260cggcggcggc tggcggtaga ggggccagcg
cagggtggcg ggggctccgg gggccaggtc 10320ttccagcatg aggcggtggt aggcgtagat
gtacctggac atccaggtga tacccgcggc 10380ggtggtggag gcgcgcggga agtcgcgcac
ccggttccag atgttgcgca ggggcagaaa 10440gtgctccatg gtaggcgtgc tctgtccagt
cagacgcgcg cagtcgttga tactctagac 10500cagggaaaac gaaagccggt cagcgggcac
tcttccgtgg tctggtgaat agatcgcaag 10560ggtatcatgg cggagggcct cggttcgagc
cccgggtccg ggccggacgg tccgccatga 10620tccacgcggt taccgcccgc gtgtcgaacc
caggtgtgcg acgtcagaca acggtggagt 10680gttccttttg gcgtttttct ggccgggcgc
cggcgccgcg taagagacta agccgcgaaa 10740gcgaaagcag taagtggctc gctccccgta
gccggaggga tccttgctaa gggttgcgtt 10800gcggcgaacc ccggttcgaa tcccgtactc
gggccggccg gacccgcggc taaggtgttg 10860gattggcctc cccctcgtat aaagaccccg
cttgcggatt gactccggac acggggacga 10920gcccctttta tttttgcttt ccccagatgc
atccggtgct gcggcagatg cgccccccgc 10980cccagcagca gcaacaacac cagcaagagc
ggcagcaaca gcagcgggag tcatgcaggg 11040ccccctcacc caccctcggc gggccggcca
cctcggcgtc cgcggccgtg tctggcgcct 11100gcggcggcgg cggggggccg gctgacgacc
ccgaggagcc cccgcggcgc agggccagac 11160actacctgga cctggaggag ggcgagggcc
tggcgcggct gggggcgccg tctcccgagc 11220gccacccgcg ggtgcagctg aagcgcgact
cgcgcgaggc gtacgtgcct cggcagaacc 11280tgttcaggga ccgcgcgggc gaggagcccg
aggagatgcg ggacaggagg ttcagcgcag 11340ggcgggagct gcggcagggg ctgaaccgcg
agcggctgct gcgcgaggag gactttgagc 11400ccgacgcgcg gacggggatc agccccgcgc
gcgcgcacgt ggcggccgcc gacctggtga 11460cggcgtacga gcagacggtg aaccaggaga
tcaacttcca aaagagtttc aacaaccacg 11520tgcgcacgct ggtggcgcgc gaggaggtga
ccatcgggct gatgcacctg tgggactttg 11580taagcgcgct ggtgcagaac cccaacagca
agcctctgac ggcgcagctg ttcctgatag 11640tgcagcacag cagggacaac gaggcgttta
gggacgcgct gctgaacatc accgagcccg 11700agggtcggtg gctgctggac ctgattaaca
tcctgcagag catagtggtg caggagcgca 11760gcctgagcct ggccgacaag gtggcggcca
tcaactactc gatgctgagc ctgggcaagt 11820tttacgcgcg caagatctac cagacgccgt
acgtgcccat agacaaggag gtgaagatcg 11880acggttttta catgcgcatg gcgctgaagg
tgctcaccct gagcgacgac ctgggcgtgt 11940accgcaacga gcgcatccac aaggccgtga
gcgtgagccg gcggcgcgag ctgagcgacc 12000gcgagctgat gcacagcctg cagcgggcgc
tggcgggcgc cggcagcggc gacagggagg 12060cggagtccta cttcgatgcg ggggcggacc
tgcgctgggc gcccagccgg cgggccctgg 12120aggccgcggg ggtccgcgag gactatgacg
aggacggcga ggaggatgag gagtacgagc 12180tagaggaggg cgagtacctg gactaaaccg
cgggtggtgt ttccggtaga tgcaagaccc 12240gaacgtggtg gacccggcgc tgcgggcggc
tctgcagagc cagccgtccg gccttaactc 12300ctcagacgac tggcgacagg tcatggaccg
catcatgtcg ctgacggcgc gtaacccgga 12360cgcgttccgg cagcagccgc aggccaacag
gctctccgcc atcctggagg cggtggtgcc 12420tgcgcgctcg aaccccacgc acgagaaggt
gctggccata gtgaacgcgc tggccgagaa 12480cagggccatc cgcccggacg aggccgggct
ggtgtacgac gcgctgctgc agcgcgtggc 12540ccgctacaac agcggcaacg tgcagaccaa
cctggaccgg ctggtggggg acgtgcgcga 12600ggcggtggcg cagcgcgagc gcgcggatcg
gcagggcaac ctgggctcca tggtggcgct 12660gaatgccttc ctgagcacgc agccggccaa
cgtgccgcgg gggcaggaag actacaccaa 12720ctttgtgagc gcgctgcggc tgatggtgac
cgagaccccc cagagcgagg tgtaccagtc 12780gggcccggac tacttcttcc agaccagcag
acagggcctg cagacggtga acctgagcca 12840ggctttcaag aacctgcggg ggctgtgggg
cgtgaaggcg cccaccggcg accgggcgac 12900ggtgtccagc ctgctgacgc ccaactcgcg
cctgctgctg ctgctgatcg cgccgttcac 12960ggacagcggc agcgtgtccc gggacaccta
cctggggcac ctgctgaccc tgtaccgcga 13020ggccatcggg caggcgcagg tggacgagca
caccttccag gagatcacca gcgtgagccg 13080cgcgctgggg caggaggaca cgagcagcct
ggaggcgact ctgaactacc tgctgaccaa 13140ccggcggcag aagattccct cgctgcacag
cctgacctcc gaggaggagc gcatcttgcg 13200ctacgtgcag cagagcgtga gcctgaacct
gatgcgcgac ggggtgacgc ccagcgtggc 13260gctggacatg accgcgcgca acatggaacc
gggcatgtac gccgcgcacc ggccttacat 13320caaccgcctg atggactacc tgcatcgcgc
ggcggccgtg aaccccgagt actttaccaa 13380cgccatcctg aacccgcact ggctcccgcc
gcccgggttc tacagcgggg gcttcgaggt 13440cccggagacc aacgatggct tcctgtggga
cgacatggac gacagcgtgt tctccccgcg 13500gccgcaggcg ctggcggaag cgtccctgct
gcgtcccaag aaggaggagg aggaggaggc 13560gagtcgccgc cgcggcagca gcggcgtggc
ttctctgtcc gagctggggg cggcagccgc 13620cgcgcgcccc gggtccctgg gcggcagccc
ctttccgagc ctggtggggt ctctgcacag 13680cgagcgcacc acccgccctc ggctgctggg
cgaggacgag tacctgaata actccctgct 13740gcagccggtg cgggagaaaa acctgcctcc
cgccttcccc aacaacggga tagagagcct 13800ggtggacaag atgagcagat ggaagaccta
tgcgcaggag cacagggacg cgcctgcgct 13860ccggccgccc acgcggcgcc agcgccacga
ccggcagcgg gggctggtgt gggatgacga 13920ggactccgcg gacgatagca gcgtgctgga
cctgggaggg agcggcaacc cgttcgcgca 13980cctgcgcccc cgcctgggga ggatgtttta
aaaaaaaaaa aaaaaagcaa gaagcatgat 14040gcaaaaatta aataaaactc accaaggcca
tggcgaccga gcgttggttt cttgtgttcc 14100cttcagtatg cggcgcgcgg cgatgtacca
ggagggacct cctccctctt acgagagcgt 14160ggtgggcgcg gcggcggcgg cgccctcttc
tccctttgcg tcgcagctgc tggagccgcc 14220gtacgtgcct ccgcgctacc tgcggcctac
gggggggaga aacagcatcc gttactcgga 14280gctggcgccc ctgttcgaca ccacccgggt
gtacctggtg gacaacaagt cggcggacgt 14340ggcctccctg aactaccaga acgaccacag
caattttttg accacggtca tccagaacaa 14400tgactacagc ccgagcgagg ccagcaccca
gaccatcaat ctggatgacc ggtcgcactg 14460gggcggcgac ctgaaaacca tcctgcacac
caacatgccc aacgtgaacg agttcatgtt 14520caccaataag ttcaaggcgc gggtgatggt
gtcgcgctcg cacaccaagg aagaccgggt 14580ggagctgaag tacgagtggg tggagttcga
gctgccagag ggcaactact ccgagaccat 14640gaccattgac ctgatgaaca acgcgatcgt
ggagcactat ctgaaagtgg gcaggcagaa 14700cggggtcctg gagagcgaca tcggggtcaa
gttcgacacc aggaacttcc gcctggggct 14760ggaccccgtg accgggctgg ttatgcccgg
ggtgtacacc aacgaggcct tccatcccga 14820catcatcctg ctgcccggct gcggggtgga
cttcacttac agccgcctga gcaacctcct 14880gggcatccgc aagcggcagc ccttccagga
gggcttcagg atcacctacg aggacctgga 14940ggggggcaac atccccgcgc tcctcgatgt
ggaggcctac caggatagct tgaaggaaaa 15000tgaggcggga caggaggata ccgcccccgc
cgcctccgcc gccgccgagc agggcgagga 15060tgctgctgac accgcggccg cggacggggc
agaggccgac cccgctatgg tggtggaggc 15120tcccgagcag gaggaggaca tgaatgacag
tgcggtgcgc ggagacacct tcgtcacccg 15180gggggaggaa aagcaagcgg aggccgaggc
cgcggccgag gaaaagcaac tggcggcagc 15240agcggcggcg gcggcgttgg ccgcggcgga
ggctgagtct gaggggacca agcccgccaa 15300ggagcccgtg attaagcccc tgaccgaaga
tagcaagaag cgcagttaca acctgctcaa 15360ggacagcacc aacaccgcgt accgcagctg
gtacctggcc tacaactacg gcgacccgtc 15420gacgggggtg cgctcctgga ccctgctgtg
cacgccggac gtgacctgcg gctcggagca 15480ggtgtactgg tcgctgcccg acatgatgca
agaccccgtg accttccgct ccacgcggca 15540ggtcagcaac ttcccggtgg tgggcgccga
gctgctgccc gtgcactcca agagcttcta 15600caacgaccag gccgtctact cccagctcat
ccgccagttc acctctctga cccacgtgtt 15660caatcgcttt cctgagaacc agattctggc
gcgcccgccc gcccccacca tcaccaccgt 15720cagtgaaaac gttcctgctc tcacagatca
cgggacgcta ccgctgcgca acagcatcgg 15780aggagtccag cgagtgaccg ttactgacgc
cagacgccgc acctgcccct acgtttacaa 15840ggccttgggc atagtctcgc cgcgcgtcct
ttccagccgc actttttgag caacaccacc 15900atcatgtcca tcctgatctc acccagcaat
aactccggct ggggactgct gcgcgcgccc 15960agcaagatgt tcggaggggc gaggaagcgt
tccgagcagc accccgtgcg cgtgcgcggg 16020cacttccgcg ccccctgggg agcgcacaaa
cgcggccgcg cggggcgcac caccgtggac 16080gacgccatcg actcggtggt ggagcaggcg
cgcaactaca ggcccgcggt ctctaccgtg 16140gacgcggcca tccagaccgt ggtgcggggc
gcgcggcggt acgccaagct gaagagccgc 16200cggaagcgcg tggcccgccg ccaccgccgc
cgacccgggg ccgccgccaa acgcgccgcc 16260gcggccctgc ttcgccgggc caagcgcacg
ggccgccgcg ccgccatgag ggccgcgcgc 16320cgcttggccg ccggcatcac cgccgccacc
atggcccccc gtacccgaag acgcgcggcc 16380gccgccgccg ccgccgccat cagtgacatg
gccagcaggc gccggggcaa cgtgtactgg 16440gtgcgcgact cggtgaccgg cacgcgcgtg
cccgtgcgct tccgcccccc gcggacttga 16500gatgatgtga aaaaacaaca ctgagtctcc
tgctgttgtg tgtatcccag cggcggcggc 16560gcgcgcagcg tcatgtccaa gcgcaaaatc
aaagaagaga tgctccaggt cgtcgcgccg 16620gagatctatg ggcccccgaa gaaggaagag
caggattcga agccccgcaa gataaagcgg 16680gtcaaaaaga aaaagaaaga tgatgacgat
gccgatgggg aggtggagtt cctgcgcgcc 16740acggcgccca ggcgcccggt gcagtggaag
ggccggcgcg taaagcgcgt cctgcgcccc 16800ggcaccgcgg tggtcttcac gcccggcgag
cgctccaccc ggactttcaa gcgcgtctat 16860gacgaggtgt acggcgacga agacctgctg
gagcaggcca acgagcgctt cggagagttt 16920gcttacggga agcgtcagcg ggcgctgggg
aaggaggacc tgctggcgct gccgctggac 16980cagggcaacc ccacccccag tctgaagccc
gtgaccctgc agcaggtgct gccgagcagc 17040gcaccctccg aggcgaagcg gggtctgaag
cgcgagggcg gcgacctggc gcccaccgtg 17100cagctcatgg tgcccaagcg gcagaggctg
gaggatgtgc tggagaaaat gaaagtagac 17160cccggtctgc agccggacat cagggtccgc
cccatcaagc aggtggcgcc gggcctcggc 17220gtgcagaccg tggacgtggt catccccacc
ggcaactccc ccgccgccgc caccactacc 17280gctgcctcca cggacatgga gacacagacc
gatcccgccg cagccgcagc cgcagccgcc 17340gccgcgacct cctcggcgga ggtgcagacg
gacccctggc tgccgccggc gatgtcagct 17400ccccgcgcgc gtcgcgggcg caggaagtac
ggcgccgcca acgcgctcct gcccgagtac 17460gccttgcatc cttccatcgc gcccaccccc
ggctaccgag gctataccta ccgcccgcga 17520agagccaagg gttccacccg ccgtccccgc
cgacgcgccg ccgccaccac ccgccgccgc 17580cgccgcagac gccagcccgc actggctcca
gtctccgtga ggaaagtggc gcgcgacgga 17640cacaccctgg tgctgcccag ggcgcgctac
caccccagca tcgtttaaaa gcctgttgtg 17700gttcttgcag atatggccct cacttgccgc
ctccgtttcc cggtgccggg ataccgagga 17760ggaagatcgc gccgcaggag gggtctggcc
ggccgcggcc tgagcggagg cagccgccgc 17820gcgcaccggc ggcgacgcgc caccagccga
cgcatgcgcg gcggggtgct gcccctgtta 17880atccccctga tcgccgcggc gatcggcgcc
gtgcccggga tcgcctccgt ggccttgcaa 17940gcgtcccaga ggcattgaca gacttgcaaa
cttgcaaata tggaaaaaaa aaccccaata 18000aaaaagtcta gactctcacg ctcgcttggt
cctgtgacta ttttgtagaa tggaagacat 18060caactttgcg tcgctggccc cgcgtcacgg
ctcgcgcccg ttcctgggac actggaacga 18120tatcggcacc agcaacatga gcggtggcgc
cttcagttgg ggctctctgt ggagcggcat 18180taaaagtatc gggtctgccg ttaaaaatta
cggctcccgg gcctggaaca gcagcacggg 18240ccagatgttg agagacaagt tgaaagagca
gaacttccag cagaaggtgg tggagggcct 18300ggcctccggc atcaacgggg tggtggacct
ggccaaccag gccgtgcaga ataagatcaa 18360cagcagactg gacccccggc cgccggtgga
ggaggtgccg ccggcgctgg agacggtgtc 18420ccccgatggg cgtggcgaga agcgcccgcg
gcccgatagg gaagagacca ctctggtcac 18480gcagaccgat gagccgcccc cgtatgagga
ggccctgaag caaggtctgc ccaccacgcg 18540gcccatcgcg cccatggcca ccggggtggt
gggccgccac acccccgcca cgctggactt 18600gcctccgccc gccgatgtgc cgcagcagca
gaaggcggca cagccgggcc cgcccgcgac 18660cgcctcccgt tcctccgccg gtcctctgcg
ccgcgcggcc agcggccccc gcgggggggt 18720cgcgaggcac ggcaactggc agagcacgct
gaacagcatc gtgggtctgg gggtgcggtc 18780cgtgaagcgc cgccgatgct actgaatagc
ttagctaacg tgttgtatgt gtgtatgcgc 18840cctatgtcgc cgccagagga gctgctgagt
cgccgccgtt cgcgcgccca ccaccaccgc 18900cactccgccc ctcaagatgg cgaccccatc
gatgatgccg cagtggtcgt acatgcacat 18960ctcgggccag gacgcctcgg agtacctgag
ccccgggctg gtgcagttcg cccgcgccac 19020cgagagctac ttcagcctga gtaacaagtt
taggaacccc acggtggcgc ccacgcacga 19080tgtgaccacc gaccggtctc agcgcctgac
gctgcggttc attcccgtgg accgcgagga 19140caccgcgtac tcgtacaagg cgcggttcac
cctggccgtg ggcgacaacc gcgtgctgga 19200catggcctcc acctactttg acatccgcgg
ggtgctggac cggggtccca ctttcaagcc 19260ctactctggc accgcctaca actccctggc
ccccaagggc gctcccaact cctgcgagtg 19320ggagcaagag gaaactcagg cagttgaaga
agcagcagaa gaggaagaag aagatgctga 19380cggtcaagct gaggaagagc aagcagctac
caaaaagact catgtatatg ctcaggctcc 19440cctttctggc gaaaaaatta gtaaagatgg
tctgcaaata ggaacggacg ctacagctac 19500agaacaaaaa cctatttatg cagaccctac
attccagccc gaaccccaaa tcggggagtc 19560ccagtggaat gaggcagatg ctacagtcgc
cggcggtaga gtgctaaaga aatctactcc 19620catgaaacca tgctatggtt cctatgcaag
acccacaaat gctaatggag gtcagggtgt 19680actaacggca aatgcccagg gacagctaga
atctcaggtt gaaatgcaat tcttttcaac 19740ttctgaaaac gcccgtaacg aggctaacaa
cattcagccc aaattggtgc tgtatagtga 19800ggatgtgcac atggagaccc cggatacgca
cctttcttac aagcccgcaa aaagcgatga 19860caattcaaaa atcatgctgg gtcagcagtc
catgcccaac agacctaatt acatcggctt 19920cagagacaac tttatcggcc tcatgtatta
caatagcact ggcaacatgg gagtgcttgc 19980aggtcaggcc tctcagttga atgcagtggt
ggacttgcaa gacagaaaca cagaactgtc 20040ctaccagctc ttgcttgatt ccatgggtga
cagaaccaga tacttttcca tgtggaatca 20100ggcagtggac agttatgacc cagatgttag
aattattgaa aatcatggaa ctgaagacga 20160gctccccaac tattgtttcc ctctgggtgg
cataggggta actgacactt accaggctgt 20220taaaaccaac aatggcaata acgggggcca
ggtgacttgg acaaaagatg aaacttttgc 20280agatcgcaat gaaatagggg tgggaaacaa
tttcgctatg gagatcaacc tcagtgccaa 20340cctgtggaga aacttcctgt actccaacgt
ggcgctgtac ctaccagaca agcttaagta 20400caacccctcc aatgtggaca tctctgacaa
ccccaacacc tacgattaca tgaacaagcg 20460agtggtggcc ccggggctgg tggactgcta
catcaacctg ggcgcgcgct ggtcgctgga 20520ctacatggac aacgtcaacc ccttcaacca
ccaccgcaat gcgggcctgc gctaccgctc 20580catgctcctg ggcaacgggc gctacgtgcc
cttccacatc caggtgcccc agaagttctt 20640tgccatcaag aacctcctcc tcctgccggg
ctcctacacc tacgagtgga acttcaggaa 20700ggatgtcaac atggtcctcc agagctctct
gggtaacgat ctcagggtgg acggggccag 20760catcaagttc gagagcatct gcctctacgc
caccttcttc cccatggccc acaacacggc 20820ctccacgctc gaggccatgc tcaggaacga
caccaacgac cagtccttca atgactacct 20880ctccgccgcc aacatgctct accccatacc
cgccaacgcc accaacgtcc ccatctccat 20940cccctcgcgc aactgggcgg ccttccgcgg
ctgggccttc acccgcctca agaccaagga 21000gaccccctcc ctgggctcgg gattcgaccc
ctactacacc tactcgggct ccattcccta 21060cctggacggc accttctacc tcaaccacac
tttcaagaag gtctcggtca ccttcgactc 21120ctcggtcagc tggccgggca acgaccgtct
gctcaccccc aacgagttcg agatcaagcg 21180ctcggtcgac ggggagggct acaacgtggc
ccagtgcaac atgaccaagg actggttcct 21240ggtccagatg ctggccaact acaacatcgg
ctaccagggc ttctacatcc cagagagcta 21300caaggacagg atgtactcct tcttcaggaa
cttccagccc atgagccggc aggtggtgga 21360ccagaccaag tacaaggact accaggaggt
gggcatcatc caccagcaca acaactcggg 21420cttcgtgggc tacctcgccc ccaccatgcg
cgagggacag gcctaccccg ccaacttccc 21480ctatccgctc ataggcaaga ccgcggtcga
cagcatcacc cagaaaaagt tcctctgcga 21540ccgcaccctc tggcgcatcc ccttctccag
caacttcatg tccatgggtg cgctctcgga 21600cctgggccag aacttgctct acgccaactc
cgcccacgcc ctcgacatga ccttcgaggt 21660cgaccccatg gacgagccca cccttctcta
tgttctgttc gaagtctttg acgtggtccg 21720ggtccaccag ccgcaccgcg gcgtcatcga
gaccgtgtac ctgcgtacgc ccttctcggc 21780cggcaacgcc accacctaaa gaagcaagcc
gcagtcatcg ccgcctgcat gccgtcgggt 21840tccaccgagc aagagctcag ggccatcgtc
agagacctgg gatgcgggcc ctattttttg 21900ggcaccttcg acaagcgctt ccctggcttt
gtctccccac acaagctggc ctgcgccatc 21960gtcaacacgg ccggccgcga gaccgggggc
gtgcactggc tggccttcgc ctggaacccg 22020cgctccaaaa catgcttcct ctttgacccc
ttcggctttt cggaccagcg gctcaagcaa 22080atctacgagt tcgagtacga gggcttgctg
cgtcgcagcg ccatcgcctc ctcgcccgac 22140cgctgcgtca ccctcgaaaa gtccacccag
accgtgcagg ggcccgactc ggccgcctgc 22200ggtctcttct gctgcatgtt tctgcacgcc
tttgtgcact ggcctcagag tcccatggac 22260cgcaacccca ccatgaactt gctgacgggg
gtgcccaact ccatgctcca gagcccccag 22320gtcgagccca ccctgcgccg caaccaggag
cagctctaca gcttcctgga gcgccactcg 22380ccttacttcc gccgccacag cgcacagatc
aggagggcca cctccttctg ccacttgcaa 22440gagatgcaag aagggtaata acgatgtaca
cacttttttt ctcaataaat ggcatctttt 22500tatttataca agctctctgg ggtattcatt
tcccaccacc acccgccgtt gtcgccatct 22560ggctctattt agaaatcgaa agggttctgc
cgggagtcgc cgtgcgccac gggcagggac 22620acgttgcgat actggtagcg ggtgccccac
ttgaactcgg gcaccaccag gcgaggcagc 22680tcggggaagt tttcgctcca caggctgcgg
gtcagcacca gcgcgttcat caggtcgggc 22740gccgagatct tgaagtcgca gttggggccg
ccgccctgcg cgcgcgagtt gcggtacacc 22800gggttgcagc actggaacac caacagcgcc
gggtgcttca cgctggccag cacgctgcgg 22860tcggagatca gctcggcgtc caggtcctcc
gcgttgctca gcgcgaacgg ggtcatcttg 22920ggcacttgcc gccccaggaa gggcgcgtgc
cccggtttcg agttgcagtc gcagcgcagc 22980gggatcagca ggtgcccgtg cccggactcg
gcgttggggt acagcgcgcg catgaaggcc 23040tgcatctggc ggaaggccat ctgggccttg
gcgccctccg agaagaacat gccgcaggac 23100ttgcccgaga actggtttgc ggggcagctg
gcgtcgtgca ggcagcagcg cgcgtcggtg 23160ttggcgatct gcaccacgtt gcgcccccac
cggttcttca cgatcttggc cttggacgat 23220tgctccttca gcgcgcgctg cccgttctcg
ctggtcacat ccatctcgat cacatgttcc 23280ttgttcacca tgctgctgcc gtgcagacac
ttcagctcgc cctccgtctc ggtgcagcgg 23340tgctgccaca gcgcgcagcc cgtgggctcg
aaagacttgt aggtcacctc cgcgaaggac 23400tgcaggtacc cctgcaaaaa gcggcccatc
atggtcacga aggtcttgtt gctgctgaag 23460gtcagctgca gcccgcggtg ctcctcgttc
agccaggtct tgcacacggc cgccagcgcc 23520tccacctggt cgggcagcat cttgaagttc
accttcagct cattctccac gtggtacttg 23580tccatcagcg tgcgcgccgc ctccatgccc
ttctcccagg ccgacaccag cggcaggctc 23640acggggttct tcaccatcac cgtggccgcc
gcctccgccg cgctttcgct ttccgccccg 23700ctgttctctt cctcttcctc ctcttcctcg
ccgccgccca ctcgcagccc ccgcaccacg 23760gggtcgtctt cctgcaggcg ctgcaccttg
cgcttgccgt tgcgcccctg cttgatgcgc 23820acgggcgggt tgctgaagcc caccatcacc
agcgcggcct cttcttgctc gtcctcgctg 23880tccagaatga cctccgggga gggggggttg
gtcatcctca gtaccgaggc acgcttcttt 23940ttcttcctgg gggcgttcgc cagctccgcg
gctgcggccg ctgccgaggt cgaaggccga 24000gggctgggcg tgcgcggcac cagcgcgtcc
tgcgagccgt cctcgtcctc ctcggactcg 24060agacggaggc gggcccgctt cttcgggggc
gcgcggggcg gcggaggcgg cggcggcgac 24120ggagacgggg acgagacatc gtccagggtg
ggtggacggc gggccgcgcc gcgtccgcgc 24180tcgggggtgg tctcgcgctg gtcctcttcc
cgactggcca tctcccactg ctccttctcc 24240tataggcaga aagagatcat ggagtctctc
atgcgagtcg agaaggagga ggacagccta 24300accgccccct ctgagccctc caccaccgcc
gccaccaccg ccaatgccgc cgcggacgac 24360gcgcccaccg agaccaccgc cagtaccacc
ctccccagcg acgcaccccc gctcgagaat 24420gaagtgctga tcgagcagga cccgggtttt
gtgagcggag aggaggatga ggtggatgag 24480aaggagaagg aggaggtcgc cgcctcagtg
ccaaaagagg ataaaaagca agaccaggac 24540gacgcagata aggatgagac agcagtcggg
cgggggaacg gaagccatga tgctgatgac 24600ggctacctag acgtgggaga cgacgtgctg
cttaagcacc tgcaccgcca gtgcgtcatc 24660gtctgcgacg cgctgcagga gcgctgcgaa
gtgcccctgg acgtggcgga ggtcagccgc 24720gcctacgagc ggcacctctt cgcgccgcac
gtgcccccca agcgccggga gaacggcacc 24780tgcgagccca acccgcgtct caacttctac
ccggtcttcg cggtacccga ggtgctggcc 24840acctaccaca tctttttcca aaactgcaag
atccccctct cctgccgcgc caaccgcacc 24900cgcgccgaca aaaccctgac cctgcggcag
ggcgcccaca tacctgatat cgcctctctg 24960gaggaagtgc ccaagatctt cgagggtctc
ggtcgcgacg agaaacgggc ggcgaacgct 25020ctgcacggag acagcgaaaa cgagagtcac
tcgggggtgc tggtggagct cgagggcgac 25080aacgcgcgcc tggccgtact caagcgcagc
atagaggtca cccactttgc ctacccggcg 25140ctcaacctgc cccccaaggt catgagtgtg
gtcatgggcg agctcatcat gcgccgcgcc 25200cagcccctgg ccgcggatgc aaacttgcaa
gagtcctccg aggaaggcct gcccgcggtc 25260agcgacgagc agctggcgcg ctggctggag
acccgcgacc ccgcgcagct ggaggagcgg 25320cgcaagctca tgatggccgc ggtgctggtc
accgtggagc tcgagtgtct gcagcgcttc 25380ttcgcggacc ccgagatgca gcgcaagctc
gaggagaccc tgcactacac cttccgccag 25440ggctacgtgc gccaggcctg caagatctcc
aacgtggagc tctgcaacct ggtctcctac 25500ctgggcatcc tgcacgagaa ccgcctcggg
cagaacgtcc tgcactccac cctcaaaggg 25560gaggcgcgcc gcgactacat ccgcgactgc
gcctacctct tcctctgcta cacctggcag 25620acggccatgg gggtctggca gcagtgcctg
gaggagcgca acctcaagga gctggaaaag 25680ctcctcaagc gcaccctcag ggacctctgg
acgggcttca acgagcgctc ggtggccgcc 25740gcgctggcgg acatcatctt tcccgagcgc
ctgctcaaga ccctgcagca gggcctgccc 25800gacttcacca gccagagcat gctgcagaac
ttcaggactt tcatcctgga gcgctcgggc 25860atcctgccgg ccacttgctg cgcgctgccc
agcgacttcg tgcccatcaa gtacagggag 25920tgcccgccgc cgctctgggg ccactgctac
ctcttccagc tggccaacta cctcgcctac 25980cactcggacc tcatggaaga cgtgagcggc
gagggcctgc tcgagtgcca ctgccgctgc 26040aacctctgca cgccccaccg ctctctagtc
tgcaacccgc agctgctcag cgagagtcag 26100attatcggta ccttcgagct gcagggtccc
tcgcctgacg agaagtccgc ggctccaggg 26160ctgaaactca ctccggggct gtggacttcc
gcctacctac gcaaatttgt acctgaggac 26220taccacgccc acgagatcag gttctacgaa
gaccaatccc gcccgcccaa ggcggagctc 26280accgcctgcg tcatcaccca ggggcacatc
ctgggccaat tgcaagccat caacaaagcc 26340cgccgagagt tcttgctgaa aaagggtcgg
ggggtgtacc tggaccccca gtccggcgag 26400gagctaaacc cgctaccccc gccgccgccc
cagcagcggg accttgcttc ccaggatggc 26460acccagaaag aagcagcagc cgccgccgcc
gccgcagcca tacatgcttc tggaggaaga 26520ggaggaggac tgggacagtc aggcagagga
ggtttcggac gaggagcagg aggagatgat 26580ggaagactgg gaggaggaca gcagcctaga
cgaggaagct tcagaggccg aagaggtggc 26640agacgcaaca ccatcgccct cggtcgcagc
cccctcgccg gggcccctga aatcctccga 26700acccagcacc agcgctataa cctccgctcc
tccggcgccg gcgccacccg cccgcagacc 26760caaccgtaga tgggacacca caggaaccgg
ggtcggtaag tccaagtgcc cgccgccgcc 26820accgcagcag cagcagcagc agcgccaggg
ctaccgctcg tggcgcgggc acaagaacgc 26880catagtcgcc tgcttgcaag actgcggggg
caacatctct ttcgcccgcc gcttcctgct 26940attccaccac ggggtcgcct ttccccgcaa
tgtcctgcat tactaccgtc atctctacag 27000cccctactgc agcggcgacc cagaggcggc
agcggcagcc acagcggcga ccaccaccta 27060ggaagatatc ctccgcgggc aagacagcgg
cagcagcggc caggagaccc gcggcagcag 27120cggcgggagc ggtgggcgca ctgcgcctct
cgcccaacga acccctctcg acccgggagc 27180tcagacacag gatcttcccc actttgtatg
ccatcttcca acagagcaga ggccaggagc 27240aggagctgaa aataaaaaac agatctctgc
gctccctcac ccgcagctgt ctgtatcaca 27300aaagcgaaga tcagcttcgg cgcacgctgg
aggacgcgga ggcactcttc agcaaatact 27360gcgcgctcac tcttaaagac tagctccgcg
cccttctcga atttaggcgg gagaaaacta 27420cgtcatcgcc ggccgccgcc cagcccgccc
agccgagatg agcaaagaga ttcccacgcc 27480atacatgtgg agctaccagc cgcagatggg
actcgcggcg ggagcggccc aggactactc 27540cacccgcatg aactacatga gcgcgggacc
ccacatgatc tcacaggtca acgggatccg 27600cgcccagcga aaccaaatac tgctggaaca
ggcggccatc accgccacgc cccgccataa 27660tctcaacccc cgaaattggc ccgccgccct
cgtgtaccag gaaaccccct ccgccaccac 27720cgtactactt ccgcgtgacg cccaggccga
agtccagatg actaactcag gggcgcagct 27780cgcgggcggc tttcgtcacg gggcgcggcc
gctccgacca ggtataagac acctgatgat 27840cagaggccga ggtatccagc tcaacgacga
gtcggtgagc tcttcgctcg gtctccgtcc 27900ggacggaact ttccagctcg ccggatccgg
ccgctcttcg ttcacgcccc gccaggcgta 27960cctgactctg cagacctcgt cctcggagcc
ccgctccggc ggcatcggaa ccctccagtt 28020cgtggaggag ttcgtgccct cggtctactt
caaccccttc tcgggacctc ccggacgcta 28080ccccgaccag ttcattccga actttgacgc
ggtgaaggac tcggcggacg gctacgactg 28140aatgtcaggt gtcgaggcag agcagcttcg
cctgagacac ctcgagcact gccgccgcca 28200caagtgcttc gcccgcggtt ctggtgagtt
ctgctacttt cagctacccg aggagcatac 28260cgaggggccg gcgcacggcg tccgcctgac
cacccagggc gaggttacct gttccctcat 28320ccgggagttt accctccgtc ccctgctagt
ggagcgggag cggggtccct gtgtcctaac 28380tatcgcctgc aactgcccta accctggatt
acatcaagat ctttgctgtc atctctgtgc 28440tgagtttaat aaacgctgag atcagaatct
actggggctc ctgtcgccat cctgtgaacg 28500ccaccgtctt cacccacccc gaccaggccc
aggcgaacct cacctgcggt ctgcatcgga 28560gggccaagaa gtacctcacc tggtacttca
acggcacccc ctttgtggtt tacaacagct 28620tcgacgggga cggagtctcc ctgaaagacc
agctctccgg tctcagctac tccatccaca 28680agaacaccac cctccaactc ttccctccct
acctgccggg aacctacgag tgcgtcaccg 28740gccgctgcac ccacctcacc cgcctgatcg
taaaccagag ctttccggga acagataact 28800ccctcttccc cagaacagga ggtgagctca
ggaaactccc cggggaccag ggcggagacg 28860taccttcgac ccttgtgggg ttaggatttt
ttattaccgg gttgctggct cttttaatca 28920aagtttcctt gagatttgtt ctttccttct
acgtgtatga acacctcaac ctccaataac 28980tctacccttt cttcggaatc aggtgacttc
tctgaaatcg ggcttggtgt gctgcttact 29040ctgttgattt ttttccttat catactcagc
cttctgtgcc tcaggctcgc cgcctgctgc 29100gcacacatct atatctactg ctggttgctc
aagtgcaggg gtcgccaccc aagatgaaca 29160ggtacatggt cctatcgatc ctaggcctgc
tggccctggc ggcctgcagc gccgccaaaa 29220aagagattac ctttgaggag cccgcttgca
atgtaacttt caagcccgag ggtgaccaat 29280gcaccaccct cgtcaaatgc gttaccaatc
atgagaggct gcgcatcgac tacaaaaaca 29340aaactggcca gtttgcggtc tatagtgtgt
ttacgcccgg agacccctct aactactctg 29400tcaccgtctt ccagggcgga cagtctaaga
tattcaatta cactttccct ttttatgagt 29460tatgcgatgc ggtcatgtac atgtcaaaac
agtacaacct gtggcctccc tctccccagg 29520cgtgtgtgga aaatactggg tcttactgct
gtatggcttt cgcaatcact acgctcgctc 29580taatctgcac ggtgctatac ataaaattca
ggcagaggcg aatctttatc gatgaaaaga 29640aaatgccttg atcgctaaca ccggctttct
atctgcagaa tgaatgcaat cacctcccta 29700ctaatcacca ccaccctcct tgcgattgcc
catgggttga cacgaatcga agtgccagtg 29760gggtccaatg tcaccatggt gggccccgcc
ggcaattcca ccctcatgtg ggaaaaattt 29820gtccgcaatc aatgggttca tttctgctct
aaccgaatca gtatcaagcc cagagccatc 29880tgcgatgggc aaaatctaac tctgatcaat
gtgcaaatga tggatgctgg gtactattac 29940gggcagcggg gagaaatcat taattactgg
cgaccccaca aggactacat gctgcatgta 30000gtcgaggcac ttcccactac cacccccact
accacctctc ccaccaccac caccactact 30060actactacta ctactactac tactactacc
actaccgctg cccgccatac ccgcaaaagc 30120accatgatta gcacaaagcc ccctcgtgct
cactcccacg ccggcgggcc catcggtgcg 30180acctcagaaa ccaccgagct ttgcttctgc
caatgcacta acgccagcgc tcatgaactg 30240ttcgacctgg agaatgagga tgtccagcag
agctccgctt gcctgaccca ggaggctgtg 30300gagcccgttg ccctgaagca gatcggtgat
tcaataattg actcttcttc ttttgccact 30360cccgaatacc ctcccgattc tactttccac
atcacgggta ccaaagaccc taacctctct 30420ttctacctga tgctgctgct ctgtatctct
gtggtctctt ccgcgctgat gttactgggg 30480atgttctgct gcctgatctg ccgcagaaag
agaaaagctc gctctcaggg ccaaccactg 30540atgcccttcc cctacccccc ggattttgca
gataacaaga tatgagctcg ctgctgacac 30600taaccgcttt actagcctgc gctctaaccc
ttgtcgcttg cgactcgaga ttccacaatg 30660tcacagctgt ggcaggagaa aatgttactt
tcaactccac ggccgatacc cagtggtcgt 30720ggagtggctc aggtagctac ttaactatct
gcaatagctc cacttccccc ggcatatccc 30780caaccaagta ccaatgcaat gccagcctgt
tcaccctcat caacgcttcc accctggaca 30840atggactcta tgtaggctat gtaccctttg
gtgggcaagg aaagacccac gcttacaacc 30900tggaagttcg ccagcccaga accactaccc
aagcttctcc caccaccacc accaccacca 30960ccatcaccag cagcagcagc agcagcagcc
acagcagcag cagcagatta ttgactttgg 31020ttttggccag ctcatctgcc gctacccagg
ccatctacag ctctgtgccc gaaaccactc 31080agatccaccg cccagaaacg accaccgcca
ccaccctaca cacctccagc gatcagatgc 31140cgaccaacat cacccccttg gctcttcaaa
tgggacttac aagccccact ccaaaaccag 31200tggatgcggc cgaggtctcc gccctcgtca
atgactgggc ggggctggga atgtggtggt 31260tcgccatagg catgatggcg ctctgcctgc
ttctgctctg gctcatctgc tgcctccacc 31320gcaggcgagc cagacccccc atctatagac
ccatcattgt cctgaacccc gataatgatg 31380ggatccatag attggatggc ctgaaaaacc
tacttttttc ttttacagta tgataaattg 31440agacatgcct cgcattttct tgtacatgtt
ccttctccca ccttttctgg ggtgttctac 31500gctggccgct gtgtctcacc tggaggtaga
ctgcctctca cccttcactg tctacctgct 31560ttacggattg gtcaccctca ctctcatctg
cagcctaatc acagtaatca tcgccttcat 31620ccagtgcatt gattacatct gtgtgcgcct
cgcatacttc agacaccacc cgcagtaccg 31680agacaggaac attgcccaac ttctaagact
gctctaatca tgcataagac tgtgatctgc 31740cttctgatcc tctgcatcct gcccaccctc
acctcctgcc agtacaccac aaaatctccg 31800cgcaaaagac atgcctcctg ccgcttcacc
caactgtgga atatacccaa atgctacaac 31860gaaaagagcg agctctccga agcttggctg
tatggggtca tctgtgtctt agttttctgc 31920agcactgtct ttgccctcat aatctacccc
tactttgatt tgggatggaa cgcgatcgat 31980gccatgaatt accccacctt tcccgcaccc
gagataattc cactgcgaca agttgtaccc 32040gttgtcgtta atcaacgccc cccatcccct
acgcccactg aaatcagcta ctttaaccta 32100acaggcggag atgactgacg ccctagatct
agaaatggac ggcatcagta ccgagcagcg 32160tctcctagag aggcgcaggc aggcggctga
gcaagagcgc ctcaatcagg agctccgaga 32220tctcgttaac ctgcaccagt gcaaaagagg
catcttttgt ctggtaaagc aggccaaagt 32280cacctacgag aagaccggca acagccaccg
cctcagttac aaattgccca cccagcgcca 32340gaagctggtg ctcatggtgg gtgagaatcc
catcaccgtc acccagcact cggtagagac 32400cgaggggtgt ctgcactccc cctgtcgggg
tccagaagac ctctgcaccc tggtaaagac 32460cctgtgcggt ctcagagatt tagtcccctt
taactaatca aacactggaa tcaataaaaa 32520gaatcactta cttaaaatca gacagcaggt
ctctgtccag tttattcagc agcacctcct 32580tcccctcctc ccaactctgg tactccaaac
gccttctggc ggcaaacttc ctccacaccc 32640tgaagggaat gtcagattct tgctcctgtc
cctccgcacc cactatcttc atgttgttgc 32700agatgaagcg caccaaaacg tctgacgaga
gcttcaaccc cgtgtacccc tatgacacgg 32760aaagcggccc tccctccgtc cctttcctca
cccctccctt cgtgtctccc gatggattcc 32820aagaaagtcc ccccggggtc ctgtctctga
acctggccga gcccctggtc acttcccacg 32880gcatgctcgc cctgaaaatg ggaagtggcc
tctccctgga cgacgctggc aacctcacct 32940ctcaagatat caccaccgct agccctcccc
tcaaaaaaac caagaccaac ctcagcctag 33000aaacctcatc ccccctaact gtgagcacct
caggcgccct caccgtagca gccgccgctc 33060ccctggcggt ggccggcacc tccctcacca
tgcaatcaga ggcccccctg acagtacagg 33120atgcaaaact caccctggcc accaaaggcc
ccctgaccgt gtctgaaggc aaactggcct 33180tgcaaacatc ggccccgctg acggccgctg
acagcagcac cctcacagtc agtgccacac 33240caccccttag cacaagcaat ggcagcttgg
gtattgacat gcaagccccc atttacacca 33300ccaatggaaa actaggactt aactttggcg
ctcccctgca tgtggtagac agcctaaatg 33360cactgactgt agttactggc caaggtctta
cgataaacgg aacagcccta caaactagag 33420tctcaggtgc cctcaactat gacacatcag
gaaacctaga attgagagct gcagggggta 33480tgcgagttga tgcaaatggt caacttatcc
ttgatgtagc ttacccattt gatgcacaaa 33540acaatctcag ccttaggctt ggacagggac
ccctgtttgt taactctgcc cacaacttgg 33600atgttaacta caacagaggc ctctacctgt
tcacatctgg aaataccaaa aagctagaag 33660ttaatatcaa aacagccaag ggtctcattt
atgatgacac tgctatagca atcaatgcgg 33720gtgatgggct acagtttgac tcaggctcag
atacaaatcc attaaaaact aaacttggat 33780taggactgga ttatgactcc agcagagcca
taattgctaa actgggaact ggcctaagct 33840ttgacaacac aggtgccatc acagtaggca
acaaaaatga tgacaagctt accttgtgga 33900ccacaccaga cccatcccct aactgtagaa
tctattcaga gaaagatgct aaattcacac 33960ttgttttgac taaatgcggc agtcaggtgt
tggccagcgt ttctgtttta tctgtaaaag 34020gtagccttgc gcccatcagt ggcacagtaa
ctagtgctca gattgtcctc agatttgatg 34080aaaatggagt tctactaagc aattcttccc
ttgaccctca atactggaac tacagaaaag 34140gtgaccttac agagggcact gcatatacca
acgcagtggg atttatgccc aacctcacag 34200catacccaaa aacacagagc caaactgcta
aaagcaacat tgtaagtcag gtttacttga 34260atggggacaa atccaaaccc atgaccctca
ccattaccct caatggaact aatgaaacag 34320gagatgccac agtaagcact tactccatgt
cattctcatg gaactggaat ggaagtaatt 34380acattaatga aacgttccaa accaactcct
tcaccttctc ctacatcgcc caagaataaa 34440aagcatgacg ctgttgattt gattcaatgt
gtttctgttt tattttcaag cacaacaaaa 34500tcattcaagt cattcttcca tcttagctta
atagacacag tagcttaata gacccagtag 34560tgcaaagccc cattctagct tataactagt
ggagaagtac tcgcctacat gggggtagag 34620tcataatcgt gcatcaggat agggcggtgg
tgctgcagca gcgcgcgaat aaactgctgc 34680cgccgccgct ccgtcctgca ggaatacaac
atggcagtgg tctcctcagc gatgattcgc 34740accgcccgca gcataaggcg ccttgtcctc
cgggcacagc agcgcaccct gatctcactt 34800aaatcagcac agtaactgca gcacagcacc
acaatattgt tcaaaatccc acagtgcaag 34860gcgctgtatc caaagctcat ggcggggacc
acagaaccca cgtggccatc ataccacaag 34920cgcaggtaga ttaagtggcg acccctcata
aacacgctgg acataaacat tacctctttt 34980ggcatgttgt aattcaccac ctcccggtac
catataaacc tctgattaaa catggcgcca 35040tccaccacca tcctaaacca gctggccaaa
acctgcccgc cggctataca ctgcagggaa 35100ccgggactgg aacaatgaca gtggagagcc
caggactcgt aaccatggat catcatgctc 35160gtcatgatat caatgttggc acaacacagg
cacacgtgca tacacttcct caggattaca 35220agctcctccc gcgttagaac catatcccag
ggaacaaccc attcctgaat cagcgtaaat 35280cccacactgc agggaagacc tcgcacgtaa
ctcacgttgt gcattgtcaa agtgttacat 35340tcgggcagca gcggatgatc ctccagtatg
gtagcgcggg tttctgtctc aaaaggaggt 35400agacgatccc tactgtacgg agtgcgccga
gacaaccgag atcgtgttgg tcgtagtgtc 35460atgccaaatg gaacgccgga cgtagtcata
tttcctgaag tcttagatct ctcaacgcag 35520caccagcacc aacacttcgc agtgtaaaag
gccaagtgcc gagagagtat atataggaat 35580aaaaagtgac gtaaacgggc aaagtccaaa
aaacgcccag aaaaaccgca cgcgaaccta 35640cgccccgaaa cgaaagccaa aaaacactag
acactccctt ccggcgtcaa cttccgcttt 35700cccacgctac gtcacttgcc ccagtcaaac
aaactacata tcccgaactt ccaagtcgcc 35760acgcccaaaa caccgcctac acctccccgc
ccgccggccc gcccccaaac ccgcctcccg 35820ccccgcgccc cgccccgcgc cgcccatctc
attatcatat tggcttcaat ccaaaataag 35880gtatattatt gatgatggtt taaacggatc
ctctagagtc gacctgcagg catgcaagct 35940tgagtattct atagtgtcac ctaaatagct
tggcgtaatc atggtcatag ctgtttcctg 36000tgtgaaattg ttatccgctc acaattccac
acaacatacg agccggaagc ataaagtgta 36060aagcctgggg tgcctaatga gtgagctaac
tcacattaat tgcgttgcgc tcactgcccg 36120ctttccagtc gggaaacctg tcgtgccagc
tgcattaatg aatcggccaa cgcgaacccc 36180ttgcggccgc ccgggccgtc gaccaattct
catgtttgac agcttatcat cgaatttctg 36240ccattcatcc gcttattatc acttattcag
gcgtagcaac caggcgttta agggcaccaa 36300taactgcctt aaaaaaatta cgccccgccc
tgccactcat cgcagtactg ttgtaattca 36360ttaagcattc tgccgacatg gaagccatca
caaacggcat gatgaacctg aatcgccagc 36420ggcatcagca ccttgtcgcc ttgcgtataa
tatttgccca tggtgaaaac gggggcgaag 36480aagttgtcca tattggccac gtttaaatca
aaactggtga aactcaccca gggattggct 36540gagacgaaaa acatattctc aataaaccct
ttagggaaat aggccaggtt ttcaccgtaa 36600cacgccacat cttgcgaata tatgtgtaga
aactgccgga aatcgtcgtg gtattcactc 36660cagagcgatg aaaacgtttc agtttgctca
tggaaaacgg tgtaacaagg gtgaacacta 36720tcccatatca ccagctcacc gtctttcatt
gccatacgga attccggatg agcattcatc 36780aggcgggcaa gaatgtgaat aaaggccgga
taaaacttgt gcttattttt ctttacggtc 36840tttaaaaagg ccgtaatatc cagctgaacg
gtctggttat aggtacattg agcaactgac 36900tgaaatgcct caaaatgttc tttacgatgc
cattgggata tatcaacggt ggtatatcca 36960gtgatttttt tctccatttt agcttcctta
gctcctgaaa atctcgataa ctcaaaaaat 37020acgcccggta gtgatcttat ttcattatgg
tgaaagttgg aacctcttac gtgccgatca 37080acgtctcatt ttcgccaaaa gttggcccag
ggcttcccgg tatcaacagg gacaccagga 37140tttatttatt ctgcgaagtg atcttccgtc
acaggtattt attcgcgata agctcatgga 37200gcggcgtaac cgtcgcacag gaaggacaga
gaaagcgcgg atctgggaag tgacggacag 37260aacggtcagg acctggattg gggaggcggt
tgccgccgct gctgctgacg gtgtgacgtt 37320ctctgttccg gtcacaccac atacgttccg
ccattcctat gcgatgcaca tgctgtatgc 37380cggtataccg ctgaaagttc tgcaaagcct
gatgggacat aagtccatca gttcaacgga 37440agtctacacg aaggtttttg cgctggatgt
ggctgcccgg caccgggtgc agtttgcgat 37500gccggagtct gatgcggttg cgatgctgaa
acaattatcc tgagaataaa tgccttggcc 37560tttatatgga aatgtggaac tgagtggata
tgctgttttt gtctgttaaa cagagaagct 37620ggctgttatc cactgagaag cgaacgaaac
agtcgggaaa atctcccatt atcgtagaga 37680tccgcattat taatctcagg agcctgtgta
gcgtttatag gaagtagtgt tctgtcatga 37740tgcctgcaag cggtaacgaa aacgatttga
atatgccttc aggaacaata gaaatcttcg 37800tgcggtgtta cgttgaagtg gagcggatta
tgtcagcaat ggacagaaca acctaatgaa 37860cacagaacca tgatgtggtc tgtcctttta
cagccagtag tgctcgccgc agtcgagcga 37920cagggcgaag ccctcgagtg agcgaggaag
caccagggaa cagcacttat atattctgct 37980tacacacgat gcctgaaaaa acttcccttg
gggttatcca cttatccacg gggatatttt 38040tataattatt ttttttatag tttttagatc
ttctttttta gagcgccttg taggccttta 38100tccatgctgg ttctagagaa ggtgttgtga
caaattgccc tttcagtgtg acaaatcacc 38160ctcaaatgac agtcctgtct gtgacaaatt
gcccttaacc ctgtgacaaa ttgccctcag 38220aagaagctgt tttttcacaa agttatccct
gcttattgac tcttttttat ttagtgtgac 38280aatctaaaaa cttgtcacac ttcacatgga
tctgtcatgg cggaaacagc ggttatcaat 38340cacaagaaac gtaaaaatag cccgcgaatc
gtccagtcaa acgacctcac tgaggcggca 38400tatagtctct cccgggatca aaaacgtatg
ctgtatctgt tcgttgacca gatcagaaaa 38460tctgatggca ccctacagga acatgacggt
atctgcgaga tccatgttgc taaatatgct 38520gaaatattcg gattgacctc tgcggaagcc
agtaaggata tacggcaggc attgaagagt 38580ttcgcgggga aggaagtggt tttttatcgc
cctgaagagg atgccggcga tgaaaaaggc 38640tatgaatctt ttccttggtt tatcaaacgt
gcgcacagtc catccagagg gctttacagt 38700gtacatatca acccatatct cattcccttc
tttatcgggt tacagaaccg gtttacgcag 38760tttcggctta gtgaaacaaa agaaatcacc
aatccgtatg ccatgcgttt atacgaatcc 38820ctgtgtcagt atcgtaagcc ggatggctca
ggcatcgtct ctctgaaaat cgactggatc 38880atagagcgtt accagctgcc tcaaagttac
cagcgtatgc ctgacttccg ccgccgcttc 38940ctgcaggtct gtgttaatga gatcaacagc
agaactccaa tgcgcctctc atacattgag 39000aaaaagaaag gccgccagac gactcatatc
gtattttcct tccgcgatat cacttccatg 39060acgacaggat agtctgaggg ttatctgtca
cagatttgag ggtggttcgt cacatttgtt 39120ctgacctact gagggtaatt tgtcacagtt
ttgctgtttc cttcagcctg catggatttt 39180ctcatacttt ttgaactgta atttttaagg
aagccaaatt tgagggcagt ttgtcacagt 39240tgatttcctt ctctttccct tcgtcatgtg
acctgatatc gggggttagt tcgtcatcat 39300tgatgagggt tgattatcac agtttattac
tctgaattgg ctatccgcgt gtgtacctct 39360acctggagtt tttcccacgg tggatatttc
ttcttgcgct gagcgtaaga gctatctgac 39420agaacagttc ttctttgctt cctcgccagt
tcgctcgcta tgctcggtta cacggctgcg 39480gcgagcgcta gtgataataa gtgactgagg
tatgtgctct tcttatctcc ttttgtagtg 39540ttgctcttat tttaaacaac tttgcggttt
tttgatgact ttgcgatttt gttgttgctt 39600tgcagtaaat tgcaagattt aataaaaaaa
cgcaaagcaa tgattaaagg atgttcagaa 39660tgaaactcat ggaaacactt aaccagtgca
taaacgctgg tcatgaaatg acgaaggcta 39720tcgccattgc acagtttaat gatgacagcc
cggaagcgag gaaaataacc cggcgctgga 39780gaataggtga agcagcggat ttagttgggg
tttcttctca ggctatcaga gatgccgaga 39840aagcagggcg actaccgcac ccggatatgg
aaattcgagg acgggttgag caacgtgttg 39900gttatacaat tgaacaaatt aatcatatgc
gtgatgtgtt tggtacgcga ttgcgacgtg 39960ctgaagacgt atttccaccg gtgatcgggg
ttgctgccca taaaggtggc gtttacaaaa 40020cctcagtttc tgttcatctt gctcaggatc
tggctctgaa ggggctacgt gttttgctcg 40080tggaaggtaa cgacccccag ggaacagcct
caatgtatca cggatgggta ccagatcttc 40140atattcatgc agaagacact ctcctgcctt
tctatcttgg ggaaaaggac gatgtcactt 40200atgcaataaa gcccacttgc tggccggggc
ttgacattat tccttcctgt ctggctctgc 40260accgtattga aactgagtta atgggcaaat
ttgatgaagg taaactgccc accgatccac 40320acctgatgct ccgactggcc attgaaactg
ttgctcatga ctatgatgtc atagttattg 40380acagcgcgcc taacctgggt atcggcacga
ttaatgtcgt atgtgctgct gatgtgctga 40440ttgttcccac gcctgctgag ttgtttgact
acacctccgc actgcagttt ttcgatatgc 40500ttcgtgatct gctcaagaac gttgatctta
aagggttcga gcctgatgta cgtattttgc 40560ttaccaaata cagcaatagt aatggctctc
agtccccgtg gatggaggag caaattcggg 40620atgcctgggg aagcatggtt ctaaaaaatg
ttgtacgtga aacggatgaa gttggtaaag 40680gtcagatccg gatgagaact gtttttgaac
aggccattga tcaacgctct tcaactggtg 40740cctggagaaa tgctctttct atttgggaac
ctgtctgcaa tgaaattttc gatcgtctga 40800ttaaaccacg ctgggagatt agataatgaa
gcgtgcgcct gttattccaa aacatacgct 40860caatactcaa ccggttgaag atacttcgtt
atcgacacca gctgccccga tggtggattc 40920gttaattgcg cgcgtaggag taatggctcg
cggtaatgcc attactttgc ctgtatgtgg 40980tcgggatgtg aagtttactc ttgaagtgct
ccggggtgat agtgttgaga agacctctcg 41040ggtatggtca ggtaatgaac gtgaccagga
gctgcttact gaggacgcac tggatgatct 41100catcccttct tttctactga ctggtcaaca
gacaccggcg ttcggtcgaa gagtatctgg 41160tgtcatagaa attgccgatg ggagtcgccg
tcgtaaagct gctgcactta ccgaaagtga 41220ttatcgtgtt ctggttggcg agctggatga
tgagcagatg gctgcattat ccagattggg 41280taacgattat cgcccaacaa gtgcttatga
acgtggtcag cgttatgcaa gccgattgca 41340gaatgaattt gctggaaata tttctgcgct
ggctgatgcg gaaaatattt cacgtaagat 41400tattacccgc tgtatcaaca ccgccaaatt
gcctaaatca gttgttgctc ttttttctca 41460ccccggtgaa ctatctgccc ggtcaggtga
tgcacttcaa aaagccttta cagataaaga 41520ggaattactt aagcagcagg catctaacct
tcatgagcag aaaaaagctg gggtgatatt 41580tgaagctgaa gaagttatca ctcttttaac
ttctgtgctt aaaacgtcat ctgcatcaag 41640aactagttta agctcacgac atcagtttgc
tcctggagcg acagtattgt ataagggcga 41700taaaatggtg cttaacctgg acaggtctcg
tgttccaact gagtgtatag agaaaattga 41760ggccattctt aaggaacttg aaaagccagc
accctgatgc gaccacgttt tagtctacgt 41820ttatctgtct ttacttaatg tcctttgtta
caggccagaa agcataactg gcctgaatat 41880tctctctggg cccactgttc cacttgtatc
gtcggtctga taatcagact gggaccacgg 41940tcccactcgt atcgtcggtc tgattattag
tctgggacca cggtcccact cgtatcgtcg 42000gtctgattat tagtctggga ccacggtccc
actcgtatcg tcggtctgat aatcagactg 42060ggaccacggt cccactcgta tcgtcggtct
gattattagt ctgggaccat ggtcccactc 42120gtatcgtcgg tctgattatt agtctgggac
cacggtccca ctcgtatcgt cggtctgatt 42180attagtctgg aaccacggtc ccactcgtat
cgtcggtctg attattagtc tgggaccacg 42240gtcccactcg tatcgtcggt ctgattatta
gtctgggacc acgatcccac tcgtgttgtc 42300ggtctgatta tcggtctggg accacggtcc
cacttgtatt gtcgatcaga ctatcagcgt 42360gagactacga ttccatcaat gcctgtcaag
ggcaagtatt gacatgtcgt cgtaacctgt 42420agaacggagt aacctcggtg tgcggttgta
tgcctgctgt ggattgctgc tgtgtcctgc 42480ttatccacaa cattttgcgc acggttatgt
ggacaaaata cctggttacc caggccgtgc 42540cggcacgtta accgggctgc atccgatgca
agtgtgtcgc tgtcgacgag ctcgcgagct 42600cggacatgag gttgccccgt attcagtgtc
gctgatttgt attgtctgaa gttgttttta 42660cgttaagttg atgcagatca attaatacga
tacctgcgtc ataattgatt atttgacgtg 42720gtttgatggc ctccacgcac gttgtgatat
gtagatgata atcattatca ctttacgggt 42780cctttccggt gatccgacag gttacggggc
ggcgacctcg cgggttttcg ctatttatga 42840aaattttccg gtttaaggcg tttccgttct
tcttcgtcat aacttaatgt ttttatttaa 42900aataccctct gaaaagaaag gaaacgacag
gtgctgaaag cgagcttttt ggcctctgtc 42960gtttcctttc tctgtttttg tccgtggaat
gaacaatgga agtccgagct catcgctaat 43020aacttcgtat agcatacatt atacgaagtt
atattcgatg cggccgcaag gggttcgcgt 43080cagcgggtgt tggcgggtgt cggggctggc
ttaactatgc ggcatcagag cagattgtac 43140tgagagtgca ccatatgcgg tgtgaaatac
cgcacagatg cgtaaggaga aaataccgca 43200tcaggcgcca ttcgccattc aggctgcgca
actgttggga agggcgatcg gtgcgggcct 43260cttcgctatt acgccagctg gcgaaagggg
gatgtgctgc aaggcgatta agttgggtaa 43320cgccagggtt ttcccagtca cgacgttgta
aaacgacggc cagtgaattg taatacgact 43380cactataggg cgaattcgag ctcggtaccc
ggggatcctc gtttaaac 43428
SEQ ID NO: 9 (length: 45227; type: DNA; organism: Simian adenovirus)
catcatcaat aatatacctt attttggatt gaagccaata tgataatgag atgggcggcg
60cggggcgggg cgcggggcgg gaggcgggtt tgggggcggg ccggcgggcg gggcggtgtg
120gcggaagtgg actttgtaag tgtggcggat gtgacttgct agtgccgggc gcggtaaaag
180tgacgttttc cgtgcgcgac aacgcccccg ggaagtgaca tttttcccgc ggtttttacc
240ggatgttgta gtgaatttgg gcgtaaccaa gtaagatttg gccattttcg cgggaaaact
300gaaacgggga agtgaaatct gattaatttt gcgttagtca taccgcgtaa tatttgtcta
360gggccgaggg actttggccg attacgtgga ggactcgccc aggtgttttt tgaggtgaat
420ttccgcgttc cgggtcaaag tctgcgtttt attattatag gatatcccat tgcatacgtt
480gtatccatat cataatatgt acatttatat tggctcatgt ccaacattac cgccatgttg
540acattgatta ttgactagtt attaatagta atcaattacg gggtcattag ttcatagccc
600atatatggag ttccgcgtta cataacttac ggtaaatggc ccgcctggct gaccgcccaa
660cgacccccgc ccattgacgt caataatgac gtatgttccc atagtaacgc caatagggac
720tttccattga cgtcaatggg tggagtattt acggtaaact gcccacttgg cagtacatca
780agtgtatcat atgccaagta cgccccctat tgacgtcaat gacggtaaat ggcccgcctg
840gcattatgcc cagtacatga ccttatggga ctttcctact tggcagtaca tctacgtatt
900agtcatcgct attaccatgg tgatgcggtt ttggcagtac atcaatgggc gtggatagcg
960gtttgactca cggggatttc caagtctcca ccccattgac gtcaatggga gtttgttttg
1020gcaccaaaat caacgggact ttccaaaatg tcgtaacaac tccgccccat tgacgcaaat
1080gggcggtagg cgtgtacggt gggaggtcta tataagcaga gctctcccta tcagtgatag
1140agatctccct atcagtgata gagatcgtcg acgagctcgt ttagtgaacc gtcagatcgc
1200ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc gatccagcct
1260ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg agatcttccg
1320tttatctagg taccgggccc cccctcgagg tcgacggtat cgataagctt cacgctgccg
1380caagcactca gggcgcaagg gctgctaaag gaagcggaac acgtagaaag ccagtccgca
1440gaaacggtgc tgaccccgga tgaatgtcag ctactgggct atctggacaa gggaaaacgc
1500aagcgcaaag agaaagcagg tagcttgcag tgggcttaca tggcgatagc tagactgggc
1560ggttttatgg acagcaagcg aaccggaatt gccagctggg gcgccctctg gtaaggttgg
1620gaagccctgc aaagtaaact ggatggcttt cttgccgcca aggatctgat ggcgcagggg
1680atcaagatct aaccaggagc tatttaatgg caacagttaa ccagctggta cgcaaaccac
1740gtgctcgcaa agttgcgaaa agcaacgtgc ctgcgctgga agcatgcccg caaaaacgtg
1800gcgtatgtac tcgtgtatat actaccactc ctaaaaaacc gaactccgcg ctgcgtaaag
1860tatgccgtgt tcgtctgact aacggtttcg aagtgacttc ctacatcggt ggtgaaggtc
1920acaacctgca ggagcactcc gtgatcctga tccgtggcgg tcgtgttaaa gacctcccgg
1980gtgttcgtta ccacaccgta cgtggtgcgc ttgactgctc cggcgttaaa gaccgtaagc
2040aggctcgttc caagtatggc gtgaagcgtc ctaaggctta atggtagatc tgatcaagag
2100acaggatgac ggtcgtttcg catgcttgaa caagatggat tgcacgcagg ttctccggcc
2160gcttgggtgg agaggctatt cggctatgac tgggcacaac agacaatcgg ctgctctgat
2220gccgccgtgt tccggctgtc agcgcagggg cgcccggttc tttttgtcaa gaccgacctg
2280tccggtgccc tgaatgaact gcaggacgag gcagcgcggc tatcgtggct ggccacgacg
2340ggcgttcctt gcgcagctgt gctcgacgtt gtcactgaag cgggaaggga ctggctgcta
2400ttgggcgaag tgccggggca ggatctcctg tcatctcacc ttgctcctgc cgagaaagta
2460tccatcatgg ctgatgcaat gcggcggctg catacgcttg atccggctac ctgcccattc
2520gaccaccaag cgaaacatcg catcgagcga gcacgtactc ggatggaagc cggtcttgtc
2580gatcaggatg atctggacga agagcatcag gggctcgcgc cagccgaact gttcgccagg
2640ctcaaggcgc gcatgcccga cggcgaggat ctcgtcgtga cccatggcga tgcctgcttg
2700ccgaatatca tggtggaaaa tggccgcttt tctggattca tcgactgtgg ccggctgggt
2760gtggcggacc gctatcagga catagcgttg gctacccgtg atattgctga agagcttggc
2820ggcgaatggg ctgaccgctt cctcgtgctt tacggtatcg ccgctcccga ttcgcagcgc
2880atcgccttct atcgccttct tgacgagttc ttctgagcgg gactctgggg ttcgaaatga
2940ccgaccaagc gacgcccaac ctgccatcac gagatttcga ttccaccgcc gccttctatg
3000aaaggttggg cttcggaatc gttttccggg acgccggctg gatgatcctc cagcgcgggg
3060atctcatgct ggagttcttc gcccaccccg ggctcgatcc cctcgggggg aatcagaatt
3120cagtcgacag cggccgcgat ctgctgtgcc ttctagttgc cagccatctg ttgtttgccc
3180ctcccccgtg ccttccttga ccctggaagg tgccactccc actgtccttt cctaataaaa
3240tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct attctggggg gtggggtggg
3300gcaggacagc aagggggagg attgggaaga caatagcagg catgctgggg atgcggtggg
3360ctctatggcc gatcagcgat cgctgaggtg ggtgagtggg cgtggcctgg ggtggtcatg
3420aaaatatata agttgggggt cttagggtct ctttatttgt gttgcagaga ccgccggagc
3480catgagcggg agcagcagca gcagcagtag cagcagcgcc ttggatggca gcatcgtgag
3540cccttatttg acgacgcgga tgccccactg ggccggggtg cgtcagaatg tgatgggctc
3600cagcatcgac ggccgacccg tcctgcccgc aaattccgcc acgctgacct atgcgaccgt
3660cgcggggacg ccgttggacg ccaccgccgc cgccgccgcc accgcagccg cctcggccgt
3720gcgcagcctg gccacggact ttgcattcct gggaccactg gcgacagggg ctacttctcg
3780ggccgctgct gccgccgttc gcgatgacaa gctgaccgcc ctgctggcgc agttggatgc
3840gcttactcgg gaactgggtg acctttctca gcaggtcatg gccctgcgcc agcaggtctc
3900ctccctgcaa gctggcggga atgcttctcc cacaaatgcc gtttaagata aataaaacca
3960gactctgttt ggattaaaga aaagtagcaa gtgcattgct ctctttattt cataattttc
4020cgcgcgcgat aggccctaga ccagcgttct cggtcgttga gggtgcggtg tatcttctcc
4080aggacgtggt agaggtggct ctggacgttg agatacatgg gcatgagccc gtcccggggg
4140tggaggtagc accactgcag agcttcatgc tccggggtgg tgttgtagat gatccagtcg
4200tagcaggagc gctgggcatg gtgcctaaaa atgtccttca gcagcaggcc gatggccagg
4260gggaggccct tggtgtaagt gtttacaaaa cggttaagtt gggaagggtg cattcgggga
4320gagatgatgt gcatcttgga ctgtattttt agattggcga tgtttccgcc cagatccctt
4380ctgggattca tgttgtgcag gaccaccagt acagtgtatc cggtgcactt ggggaatttg
4440tcatgcagct tagagggaaa agcgtggaag aacttggaga cgcctttgtg gcctcccaga
4500ttttccatgc attcgtccat gatgatggca atgggcccgc gggaggcagc ttgggcaaag
4560atatttctgg ggtcgctgac gtcgtagttg tgttccaggg tgaggtcgtc ataggccatt
4620tttacaaagc gcgggcggag ggtgcccgac tgggggatga tggtcccctc tggccctggg
4680gcgtagttgc cctcgcagat ctgcatttcc caggccttaa tctcggaggg gggaatcata
4740tccacctgcg gggcgatgaa gaaaacggtt tccggagccg gggagattaa ctgggatgag
4800agcaggtttc taagcagctg tgattttcca caaccggtgg gcccataaat aacacctata
4860accggttgca gctggtagtt tagagagctg cagctgccgt cgtcccggag gaggggggcc
4920acctcgttga gcatgtccct gacgcgcatg ttctccccga ccagatccgc cagaaggcgc
4980tcgccgccca gggacagcag ctcttgcaag gaagcaaagt ttttcagcgg cttgaggccg
5040tccgccgtgg gcatgttttt cagggtctgg ctcagcagct ccaggcggtc ccagagctcg
5100gtgacgtgct ctacggcatc tctatccagc atatctcctc gtttcgcggg ttggggcgac
5160tttcgctgta gggcaccaag cggtggtcgt ccagcggggc cagagtcatg tccttccatg
5220ggcgcagggt cctcgtcagg gtggtctggg tcacggtgaa ggggtgcgct ccgggctgag
5280cgcttgccaa ggtgcgcttg aggctggttc tgctggtgct gaagcgctgc cggtcttcgc
5340cctgcgcgtc ggccaggtag catttgacca tggtgtcata gtccagcccc tccgcggcgt
5400gtcccttggc gcgcagcttg cccttggagg tggcgccgca cgaggggcag agcaggctct
5460tgagcgcgta gagcttgggg gcgaggaaga ccgattcggg ggagtaggcg tccgcgccgc
5520agaccccgca cacggtctcg cactccacca gccaggtgag ctcggggcgc gccgggtcaa
5580aaaccaggtt tcccccatgc tttttgatgc gtttcttacc tcgggtctcc atgaggtggt
5640gtccccgctc ggtgacgaag aggctgtccg tgtctccgta gaccgacttg aggggtcttt
5700tctccagggg ggtccctcgg tcttcctcgt agaggaactc ggaccactct gagacgaagg
5760cccgcgtcca ggccaggacg aaggaggcta tgtgggaggg gtagcggtcg ttgtccacta
5820gggggtccac cttctccaag gtgtgaagac acatgtcgcc ttcctcggcg tccaggaagg
5880tgattggctt gtaggtgtag gccacgtgac cgggggttcc tgacgggggg gtataaaagg
5940gggtgggggc gcgctcgtcg tcactctctt ccgcatcgct gtctgcgagg gccagctgct
6000ggggtgagta ttccctctcg aaggcgggca tgacctccgc gctgaggttg tcagtttcca
6060aaaacgagga ggatttgatg ttcacctgtc ccgaggtgat acctttgagg gtacccgcgt
6120ccatctggtc agaaaacacg atctttttat tgtccagctt ggtggcgaac gacccgtaga
6180gggcgttgga gagcagcttg gcgatggagc gcagggtctg gttcttgtcc ctgtcggcgc
6240gctccttggc cgcgatgttg agctgcacgt actcgcgcgc gacgcagcgc cactcgggga
6300agacggtggt gcgctcgtcg ggcaccaggc gcacgcgcca gccgcggttg tgcagggtga
6360ccaggtccac gctggtggcg acctcgccgc gcaggcgctc gttggtccag cagagacggc
6420cgcccttgcg cgagcagaag gggggcaggg ggtcgagctg ggtctcgtcc ggggggtccg
6480cgtccacggt gaaaaccccg gggcgcaggc gcgcgtcgaa gtagtctatc ttgcaacctt
6540gcatgtccag cgcctgctgc cagtcgcggg cggcgagcgc gcgctcgtag gggttgagcg
6600gcgggcccca gggcatgggg tgggtgagtg cggaggcgta catgccgcag atgtcataga
6660cgtagagggg ctcccgcagg accccgatgt aggtggggta gcagcggccg ccgcggatgc
6720tggcgcgcac gtagtcatac agctcgtgcg agggggcgag gaggtcgggg cccaggttgg
6780tgcgggcggg gcgctccgcg cggaagacga tctgcctgaa gatggcatgc gagttggaag
6840agatggtggg gcgctggaag acgttgaagc tggcgtcctg caggccgacg gcgtcgcgca
6900cgaaggaggc gtaggagtcg cgcagcttgt gtaccagctc ggcggtgacc tgcacgtcga
6960gcgcgcagta gtcgagggtc tcgcggatga tgtcatattt agcctgcccc ttctttttcc
7020acagctcgcg gttgaggaca aactcttcgc ggtctttcca gtactcttgg atcgggaaac
7080cgtccggttc cgaacggtaa gagcctagca tgtagaactg gttgacggcc tggtaggcgc
7140agcagccctt ctccacgggg agggcgtagg cctgcgcggc cttgcggagc gaggtgtggg
7200tcagggcgaa ggtgtccctg accatgactt tgaggtactg gtgcttgaag tcggagtcgt
7260cgcagccgcc ccgctcccag agcgagaagt cggtgcgctt cttggagcgg gggttgggca
7320gagcgaaggt gacatcgttg aagaggattt tgcccgcgcg gggcatgaag ttgcgggtga
7380tgcggaaggg ccccggcact tcagagcggt tgttgatgac ctgggcggcg agcacgatct
7440cgtcgaagcc gttgatgttg tggcccacga tgtagagttc caggaagcgg ggccggccct
7500ttacggtggg cagcttcttt agctcttcgt aggtgagctc ctcgggcgag gcgaggccgt
7560gctcggccag ggcccagtcc gcgaggtgcg ggttgtctct gaggaaggac ttccagaggt
7620cgcgggccag gagggtctgc aggcggtctc tgaaggtcct gaactggcgg cccacggcca
7680ttttttcggg ggtgatgcag tagaaggtga gggggtcttg ctgccagcgg tcccagtcga
7740gctgcagggc gaggtcgcgc gcggcggtga ccaggcgctc gtcgcccccg aatttcatga
7800ccagcatgaa gggcacgagc tgctttccga aggcccccat ccaagtgtag gtctctacat
7860cgtaggtgac aaagaggcgc tccgtgcgag gatgcgagcc gatcgggaag aactggatct
7920cccgccacca gttggaggag tggctgttga tgtggtggaa gtagaagtcc cgtcgccggg
7980ccgaacactc gtgctggctt ttgtaaaagc gagcgcagta ctggcagcgc tgcacgggct
8040gtacctcatg cacgagatgc acctttcgcc cgcgcacgag gaagccgagg ggaaatctga
8100gccccccgcc tggctcgcgg catggctggt tctcttctac tttggatgcg tgtccgtctc
8160cgtctggctc ctcgaggggt gttacggtgg agcggaccac cacgccgcgc gagccgcagg
8220tccagatatc ggcgcgcggc ggtcggagtt tgatgacgac atcgcgcagc tgggagctgt
8280ccatggtctg gagctcccgc ggcggcggca ggtcagccgg gagttcttgc aggttcacct
8340cgcagagtcg ggccagggcg cggggcaggt ctaggtggta cctgatctct aggggcgtgt
8400tggtggcggc gtcgatggct tgcaggagcc cgcagccccg gggggcgacg acggtgcccc
8460gcggggtggt ggtggtggtg gcggtgcagc tcagaagcgg tgccgcgggc gggcccccgg
8520aggtaggggg ggctccggtc ccgcgggcag gggcggcagc ggcacgtcgg cgtggagcgc
8580gggcaggagt tggtgctgtg cccggaggtt gctggcgaag gcgacgacgc ggcggttgat
8640ctcctggatc tggcgcctct gcgtgaagac gacgggcccg gtgagcttga acctgaaaga
8700gagttcgaca gaatcaatct cggtgtcatt gaccgcggcc tggcgcagga tctcctgcac
8760gtctcccgag ttgtcttggt aggcgatctc ggccatgaac tgctcgatct cttcctcctg
8820gaggtctccg cgtccggcgc gttccacggt ggccgccagg tcgttggaga tgcgccccat
8880gagctgcgag aaggcgttga gtccgccctc gttccagact cggctgtaga ccacgccccc
8940ctggtcatcg cgggcgcgca tgaccacctg cgcgaggttg agctccacgt gccgcgcgaa
9000gacggcgtag ttgcgcagac gctggaagag gtagttgagg gtggtggcgg tgtgctcggc
9060cacgaagaag ttcatgaccc agcggcgcaa cgtggattcg ttgatgtccc ccaaggcctc
9120cagccgttcc atggcctcgt agaagtccac ggcgaagttg aaaaactggg agttgcgcgc
9180cgacacggtc aactcctcct ccagaagacg gatgagctcg gcgacggtgt cgcgcacctc
9240gcgctcgaag gctatgggga tctcttcctc cgctagcatc accacctcct cctcttcctc
9300ctcttctggc acttccatga tggcttcctc ctcttcgggg ggtggcggcg gcggcggtgg
9360gggagggggc gctctgcgcc ggcggcggcg caccgggagg cggtccacga agcgcgcgat
9420catctccccg cggcggcggc gcatggtctc ggtgacggcg cggccgttct cccgggggcg
9480cagttggaag acgccgccgg acatctggtg ctggggcggg tggccgtgag gcagcgagac
9540ggcgctgacg atgcatctca acaattgctg cgtaggtacg ccgccgaggg acctgaggga
9600gtccatatcc accggatccg aaaacctttc gaggaaggcg tctaaccagt cgcagtcgca
9660aggtaggctg agcaccgtgg cgggcggcgg ggggtggggg gagtgtctgg cggaggtgct
9720gctgatgatg taattgaagt aggcggactt gacacggcgg atggtcgaca ggagcaccat
9780gtccttgggt ccggcctgct ggatgcggag gcggtcggct atgccccagg cttcgttctg
9840gcatcggcgc aggtccttgt agtagtcttg catgagcctt tccaccggca cctcttctcc
9900ttcctcttct gcttcttcca tgtctgcttc ggccctgggg cggcgccgcg cccccctgcc
9960ccccatgcgc gtgaccccga accccctgag cggttggagc agggccaggt cggcgacgac
10020gcgctcggcc aggatggcct gctgcacctg cgtgagggtg gtttggaagt catccaagtc
10080cacgaagcgg tggtaggcgc ccgtgttgat ggtgtaggtg cagttggcca tgacggacca
10140gttgacggtc tggtggcccg gttgcgacat ctcggtgtac ctgagtcgcg agtaggcgcg
10200ggagtcgaag acgtagtcgt tgcaagtccg caccaggtac tggtagccca ccaggaagtg
10260cggcggcggc tggcggtaga ggggccagcg cagggtggcg ggggctccgg gggccaggtc
10320ttccagcatg aggcggtggt aggcgtagat gtacctggac atccaggtga tacccgcggc
10380ggtggtggag gcgcgcggga agtcgcgcac ccggttccag atgttgcgca ggggcagaaa
10440gtgctccatg gtaggcgtgc tctgtccagt cagacgcgcg cagtcgttga tactctagac
10500cagggaaaac gaaagccggt cagcgggcac tcttccgtgg tctggtgaat agatcgcaag
10560ggtatcatgg cggagggcct cggttcgagc cccgggtccg ggccggacgg tccgccatga
10620tccacgcggt taccgcccgc gtgtcgaacc caggtgtgcg acgtcagaca acggtggagt
10680gttccttttg gcgtttttct ggccgggcgc cggcgccgcg taagagacta agccgcgaaa
10740gcgaaagcag taagtggctc gctccccgta gccggaggga tccttgctaa gggttgcgtt
10800gcggcgaacc ccggttcgaa tcccgtactc gggccggccg gacccgcggc taaggtgttg
10860gattggcctc cccctcgtat aaagaccccg cttgcggatt gactccggac acggggacga
10920gcccctttta tttttgcttt ccccagatgc atccggtgct gcggcagatg cgccccccgc
10980cccagcagca gcaacaacac cagcaagagc ggcagcaaca gcagcgggag tcatgcaggg
11040ccccctcacc caccctcggc gggccggcca cctcggcgtc cgcggccgtg tctggcgcct
11100gcggcggcgg cggggggccg gctgacgacc ccgaggagcc cccgcggcgc agggccagac
11160actacctgga cctggaggag ggcgagggcc tggcgcggct gggggcgccg tctcccgagc
11220gccacccgcg ggtgcagctg aagcgcgact cgcgcgaggc gtacgtgcct cggcagaacc
11280tgttcaggga ccgcgcgggc gaggagcccg aggagatgcg ggacaggagg ttcagcgcag
11340ggcgggagct gcggcagggg ctgaaccgcg agcggctgct gcgcgaggag gactttgagc
11400ccgacgcgcg gacggggatc agccccgcgc gcgcgcacgt ggcggccgcc gacctggtga
11460cggcgtacga gcagacggtg aaccaggaga tcaacttcca aaagagtttc aacaaccacg
11520tgcgcacgct ggtggcgcgc gaggaggtga ccatcgggct gatgcacctg tgggactttg
11580taagcgcgct ggtgcagaac cccaacagca agcctctgac ggcgcagctg ttcctgatag
11640tgcagcacag cagggacaac gaggcgttta gggacgcgct gctgaacatc accgagcccg
11700agggtcggtg gctgctggac ctgattaaca tcctgcagag catagtggtg caggagcgca
11760gcctgagcct ggccgacaag gtggcggcca tcaactactc gatgctgagc ctgggcaagt
11820tttacgcgcg caagatctac cagacgccgt acgtgcccat agacaaggag gtgaagatcg
11880acggttttta catgcgcatg gcgctgaagg tgctcaccct gagcgacgac ctgggcgtgt
11940accgcaacga gcgcatccac aaggccgtga gcgtgagccg gcggcgcgag ctgagcgacc
12000gcgagctgat gcacagcctg cagcgggcgc tggcgggcgc cggcagcggc gacagggagg
12060cggagtccta cttcgatgcg ggggcggacc tgcgctgggc gcccagccgg cgggccctgg
12120aggccgcggg ggtccgcgag gactatgacg aggacggcga ggaggatgag gagtacgagc
12180tagaggaggg cgagtacctg gactaaaccg cgggtggtgt ttccggtaga tgcaagaccc
12240gaacgtggtg gacccggcgc tgcgggcggc tctgcagagc cagccgtccg gccttaactc
12300ctcagacgac tggcgacagg tcatggaccg catcatgtcg ctgacggcgc gtaacccgga
12360cgcgttccgg cagcagccgc aggccaacag gctctccgcc atcctggagg cggtggtgcc
12420tgcgcgctcg aaccccacgc acgagaaggt gctggccata gtgaacgcgc tggccgagaa
12480cagggccatc cgcccggacg aggccgggct ggtgtacgac gcgctgctgc agcgcgtggc
12540ccgctacaac agcggcaacg tgcagaccaa cctggaccgg ctggtggggg acgtgcgcga
12600ggcggtggcg cagcgcgagc gcgcggatcg gcagggcaac ctgggctcca tggtggcgct
12660gaatgccttc ctgagcacgc agccggccaa cgtgccgcgg gggcaggaag actacaccaa
12720ctttgtgagc gcgctgcggc tgatggtgac cgagaccccc cagagcgagg tgtaccagtc
12780gggcccggac tacttcttcc agaccagcag acagggcctg cagacggtga acctgagcca
12840ggctttcaag aacctgcggg ggctgtgggg cgtgaaggcg cccaccggcg accgggcgac
12900ggtgtccagc ctgctgacgc ccaactcgcg cctgctgctg ctgctgatcg cgccgttcac
12960ggacagcggc agcgtgtccc gggacaccta cctggggcac ctgctgaccc tgtaccgcga
13020ggccatcggg caggcgcagg tggacgagca caccttccag gagatcacca gcgtgagccg
13080cgcgctgggg caggaggaca cgagcagcct ggaggcgact ctgaactacc tgctgaccaa
13140ccggcggcag aagattccct cgctgcacag cctgacctcc gaggaggagc gcatcttgcg
13200ctacgtgcag cagagcgtga gcctgaacct gatgcgcgac ggggtgacgc ccagcgtggc
13260gctggacatg accgcgcgca acatggaacc gggcatgtac gccgcgcacc ggccttacat
13320caaccgcctg atggactacc tgcatcgcgc ggcggccgtg aaccccgagt actttaccaa
13380cgccatcctg aacccgcact ggctcccgcc gcccgggttc tacagcgggg gcttcgaggt
13440cccggagacc aacgatggct tcctgtggga cgacatggac gacagcgtgt tctccccgcg
13500gccgcaggcg ctggcggaag cgtccctgct gcgtcccaag aaggaggagg aggaggaggc
13560gagtcgccgc cgcggcagca gcggcgtggc ttctctgtcc gagctggggg cggcagccgc
13620cgcgcgcccc gggtccctgg gcggcagccc ctttccgagc ctggtggggt ctctgcacag
13680cgagcgcacc acccgccctc ggctgctggg cgaggacgag tacctgaata actccctgct
13740gcagccggtg cgggagaaaa acctgcctcc cgccttcccc aacaacggga tagagagcct
13800ggtggacaag atgagcagat ggaagaccta tgcgcaggag cacagggacg cgcctgcgct
13860ccggccgccc acgcggcgcc agcgccacga ccggcagcgg gggctggtgt gggatgacga
13920ggactccgcg gacgatagca gcgtgctgga cctgggaggg agcggcaacc cgttcgcgca
13980cctgcgcccc cgcctgggga ggatgtttta aaaaaaaaaa aaaaaagcaa gaagcatgat
14040gcaaaaatta aataaaactc accaaggcca tggcgaccga gcgttggttt cttgtgttcc
14100cttcagtatg cggcgcgcgg cgatgtacca ggagggacct cctccctctt acgagagcgt
14160ggtgggcgcg gcggcggcgg cgccctcttc tccctttgcg tcgcagctgc tggagccgcc
14220gtacgtgcct ccgcgctacc tgcggcctac gggggggaga aacagcatcc gttactcgga
14280gctggcgccc ctgttcgaca ccacccgggt gtacctggtg gacaacaagt cggcggacgt
14340ggcctccctg aactaccaga acgaccacag caattttttg accacggtca tccagaacaa
14400tgactacagc ccgagcgagg ccagcaccca gaccatcaat ctggatgacc ggtcgcactg
14460gggcggcgac ctgaaaacca tcctgcacac caacatgccc aacgtgaacg agttcatgtt
14520caccaataag ttcaaggcgc gggtgatggt gtcgcgctcg cacaccaagg aagaccgggt
14580ggagctgaag tacgagtggg tggagttcga gctgccagag ggcaactact ccgagaccat
14640gaccattgac ctgatgaaca acgcgatcgt ggagcactat ctgaaagtgg gcaggcagaa
14700cggggtcctg gagagcgaca tcggggtcaa gttcgacacc aggaacttcc gcctggggct
14760ggaccccgtg accgggctgg ttatgcccgg ggtgtacacc aacgaggcct tccatcccga
14820catcatcctg ctgcccggct gcggggtgga cttcacttac agccgcctga gcaacctcct
14880gggcatccgc aagcggcagc ccttccagga gggcttcagg atcacctacg aggacctgga
14940ggggggcaac atccccgcgc tcctcgatgt ggaggcctac caggatagct tgaaggaaaa
15000tgaggcggga caggaggata ccgcccccgc cgcctccgcc gccgccgagc agggcgagga
15060tgctgctgac accgcggccg cggacggggc agaggccgac cccgctatgg tggtggaggc
15120tcccgagcag gaggaggaca tgaatgacag tgcggtgcgc ggagacacct tcgtcacccg
15180gggggaggaa aagcaagcgg aggccgaggc cgcggccgag gaaaagcaac tggcggcagc
15240agcggcggcg gcggcgttgg ccgcggcgga ggctgagtct gaggggacca agcccgccaa
15300ggagcccgtg attaagcccc tgaccgaaga tagcaagaag cgcagttaca acctgctcaa
15360ggacagcacc aacaccgcgt accgcagctg gtacctggcc tacaactacg gcgacccgtc
15420gacgggggtg cgctcctgga ccctgctgtg cacgccggac gtgacctgcg gctcggagca
15480ggtgtactgg tcgctgcccg acatgatgca agaccccgtg accttccgct ccacgcggca
15540ggtcagcaac ttcccggtgg tgggcgccga gctgctgccc gtgcactcca agagcttcta
15600caacgaccag gccgtctact cccagctcat ccgccagttc acctctctga cccacgtgtt
15660caatcgcttt cctgagaacc agattctggc gcgcccgccc gcccccacca tcaccaccgt
15720cagtgaaaac gttcctgctc tcacagatca cgggacgcta ccgctgcgca acagcatcgg
15780aggagtccag cgagtgaccg ttactgacgc cagacgccgc acctgcccct acgtttacaa
15840ggccttgggc atagtctcgc cgcgcgtcct ttccagccgc actttttgag caacaccacc
15900atcatgtcca tcctgatctc acccagcaat aactccggct ggggactgct gcgcgcgccc
15960agcaagatgt tcggaggggc gaggaagcgt tccgagcagc accccgtgcg cgtgcgcggg
16020cacttccgcg ccccctgggg agcgcacaaa cgcggccgcg cggggcgcac caccgtggac
16080gacgccatcg actcggtggt ggagcaggcg cgcaactaca ggcccgcggt ctctaccgtg
16140gacgcggcca tccagaccgt ggtgcggggc gcgcggcggt acgccaagct gaagagccgc
16200cggaagcgcg tggcccgccg ccaccgccgc cgacccgggg ccgccgccaa acgcgccgcc
16260gcggccctgc ttcgccgggc caagcgcacg ggccgccgcg ccgccatgag ggccgcgcgc
16320cgcttggccg ccggcatcac cgccgccacc atggcccccc gtacccgaag acgcgcggcc
16380gccgccgccg ccgccgccat cagtgacatg gccagcaggc gccggggcaa cgtgtactgg
16440gtgcgcgact cggtgaccgg cacgcgcgtg cccgtgcgct tccgcccccc gcggacttga
16500gatgatgtga aaaaacaaca ctgagtctcc tgctgttgtg tgtatcccag cggcggcggc
16560gcgcgcagcg tcatgtccaa gcgcaaaatc aaagaagaga tgctccaggt cgtcgcgccg
16620gagatctatg ggcccccgaa gaaggaagag caggattcga agccccgcaa gataaagcgg
16680gtcaaaaaga aaaagaaaga tgatgacgat gccgatgggg aggtggagtt cctgcgcgcc
16740acggcgccca ggcgcccggt gcagtggaag ggccggcgcg taaagcgcgt cctgcgcccc
16800ggcaccgcgg tggtcttcac gcccggcgag cgctccaccc ggactttcaa gcgcgtctat
16860gacgaggtgt acggcgacga agacctgctg gagcaggcca acgagcgctt cggagagttt
16920gcttacggga agcgtcagcg ggcgctgggg aaggaggacc tgctggcgct gccgctggac
16980cagggcaacc ccacccccag tctgaagccc gtgaccctgc agcaggtgct gccgagcagc
17040gcaccctccg aggcgaagcg gggtctgaag cgcgagggcg gcgacctggc gcccaccgtg
17100cagctcatgg tgcccaagcg gcagaggctg gaggatgtgc tggagaaaat gaaagtagac
17160cccggtctgc agccggacat cagggtccgc cccatcaagc aggtggcgcc gggcctcggc
17220gtgcagaccg tggacgtggt catccccacc ggcaactccc ccgccgccgc caccactacc
17280gctgcctcca cggacatgga gacacagacc gatcccgccg cagccgcagc cgcagccgcc
17340gccgcgacct cctcggcgga ggtgcagacg gacccctggc tgccgccggc gatgtcagct
17400ccccgcgcgc gtcgcgggcg caggaagtac ggcgccgcca acgcgctcct gcccgagtac
17460gccttgcatc cttccatcgc gcccaccccc ggctaccgag gctataccta ccgcccgcga
17520agagccaagg gttccacccg ccgtccccgc cgacgcgccg ccgccaccac ccgccgccgc
17580cgccgcagac gccagcccgc actggctcca gtctccgtga ggaaagtggc gcgcgacgga
17640cacaccctgg tgctgcccag ggcgcgctac caccccagca tcgtttaaaa gcctgttgtg
17700gttcttgcag atatggccct cacttgccgc ctccgtttcc cggtgccggg ataccgagga
17760ggaagatcgc gccgcaggag gggtctggcc ggccgcggcc tgagcggagg cagccgccgc
17820gcgcaccggc ggcgacgcgc caccagccga cgcatgcgcg gcggggtgct gcccctgtta
17880atccccctga tcgccgcggc gatcggcgcc gtgcccggga tcgcctccgt ggccttgcaa
17940gcgtcccaga ggcattgaca gacttgcaaa cttgcaaata tggaaaaaaa aaccccaata
18000aaaaagtcta gactctcacg ctcgcttggt cctgtgacta ttttgtagaa tggaagacat
18060caactttgcg tcgctggccc cgcgtcacgg ctcgcgcccg ttcctgggac actggaacga
18120tatcggcacc agcaacatga gcggtggcgc cttcagttgg ggctctctgt ggagcggcat
18180taaaagtatc gggtctgccg ttaaaaatta cggctcccgg gcctggaaca gcagcacggg
18240ccagatgttg agagacaagt tgaaagagca gaacttccag cagaaggtgg tggagggcct
18300ggcctccggc atcaacgggg tggtggacct ggccaaccag gccgtgcaga ataagatcaa
18360cagcagactg gacccccggc cgccggtgga ggaggtgccg ccggcgctgg agacggtgtc
18420ccccgatggg cgtggcgaga agcgcccgcg gcccgatagg gaagagacca ctctggtcac
18480gcagaccgat gagccgcccc cgtatgagga ggccctgaag caaggtctgc ccaccacgcg
18540gcccatcgcg cccatggcca ccggggtggt gggccgccac acccccgcca cgctggactt
18600gcctccgccc gccgatgtgc cgcagcagca gaaggcggca cagccgggcc cgcccgcgac
18660cgcctcccgt tcctccgccg gtcctctgcg ccgcgcggcc agcggccccc gcgggggggt
18720cgcgaggcac ggcaactggc agagcacgct gaacagcatc gtgggtctgg gggtgcggtc
18780cgtgaagcgc cgccgatgct actgaatagc ttagctaacg tgttgtatgt gtgtatgcgc
18840cctatgtcgc cgccagagga gctgctgagt cgccgccgtt cgcgcgccca ccaccaccgc
18900cactccgccc ctcaagatgg cgaccccatc gatgatgccg cagtggtcgt acatgcacat
18960ctcgggccag gacgcctcgg agtacctgag ccccgggctg gtgcagttcg cccgcgccac
19020cgagagctac ttcagcctga gtaacaagtt taggaacccc acggtggcgc ccacgcacga
19080tgtgaccacc gaccggtctc agcgcctgac gctgcggttc attcccgtgg accgcgagga
19140caccgcgtac tcgtacaagg cgcggttcac cctggccgtg ggcgacaacc gcgtgctgga
19200catggcctcc acctactttg acatccgcgg ggtgctggac cggggtccca ctttcaagcc
19260ctactctggc accgcctaca actccctggc ccccaagggc gctcccaact cctgcgagtg
19320ggagcaagag gaaactcagg cagttgaaga agcagcagaa gaggaagaag aagatgctga
19380cggtcaagct gaggaagagc aagcagctac caaaaagact catgtatatg ctcaggctcc
19440cctttctggc gaaaaaatta gtaaagatgg tctgcaaata ggaacggacg ctacagctac
19500agaacaaaaa cctatttatg cagaccctac attccagccc gaaccccaaa tcggggagtc
19560ccagtggaat gaggcagatg ctacagtcgc cggcggtaga gtgctaaaga aatctactcc
19620catgaaacca tgctatggtt cctatgcaag acccacaaat gctaatggag gtcagggtgt
19680actaacggca aatgcccagg gacagctaga atctcaggtt gaaatgcaat tcttttcaac
19740ttctgaaaac gcccgtaacg aggctaacaa cattcagccc aaattggtgc tgtatagtga
19800ggatgtgcac atggagaccc cggatacgca cctttcttac aagcccgcaa aaagcgatga
19860caattcaaaa atcatgctgg gtcagcagtc catgcccaac agacctaatt acatcggctt
19920cagagacaac tttatcggcc tcatgtatta caatagcact ggcaacatgg gagtgcttgc
19980aggtcaggcc tctcagttga atgcagtggt ggacttgcaa gacagaaaca cagaactgtc
20040ctaccagctc ttgcttgatt ccatgggtga cagaaccaga tacttttcca tgtggaatca
20100ggcagtggac agttatgacc cagatgttag aattattgaa aatcatggaa ctgaagacga
20160gctccccaac tattgtttcc ctctgggtgg cataggggta actgacactt accaggctgt
20220taaaaccaac aatggcaata acgggggcca ggtgacttgg acaaaagatg aaacttttgc
20280agatcgcaat gaaatagggg tgggaaacaa tttcgctatg gagatcaacc tcagtgccaa
20340cctgtggaga aacttcctgt actccaacgt ggcgctgtac ctaccagaca agcttaagta
20400caacccctcc aatgtggaca tctctgacaa ccccaacacc tacgattaca tgaacaagcg
20460agtggtggcc ccggggctgg tggactgcta catcaacctg ggcgcgcgct ggtcgctgga
20520ctacatggac aacgtcaacc ccttcaacca ccaccgcaat gcgggcctgc gctaccgctc
20580catgctcctg ggcaacgggc gctacgtgcc cttccacatc caggtgcccc agaagttctt
20640tgccatcaag aacctcctcc tcctgccggg ctcctacacc tacgagtgga acttcaggaa
20700ggatgtcaac atggtcctcc agagctctct gggtaacgat ctcagggtgg acggggccag
20760catcaagttc gagagcatct gcctctacgc caccttcttc cccatggccc acaacacggc
20820ctccacgctc gaggccatgc tcaggaacga caccaacgac cagtccttca atgactacct
20880ctccgccgcc aacatgctct accccatacc cgccaacgcc accaacgtcc ccatctccat
20940cccctcgcgc aactgggcgg ccttccgcgg ctgggccttc acccgcctca agaccaagga
21000gaccccctcc ctgggctcgg gattcgaccc ctactacacc tactcgggct ccattcccta
21060cctggacggc accttctacc tcaaccacac tttcaagaag gtctcggtca ccttcgactc
21120ctcggtcagc tggccgggca acgaccgtct gctcaccccc aacgagttcg agatcaagcg
21180ctcggtcgac ggggagggct acaacgtggc ccagtgcaac atgaccaagg actggttcct
21240ggtccagatg ctggccaact acaacatcgg ctaccagggc ttctacatcc cagagagcta
21300caaggacagg atgtactcct tcttcaggaa cttccagccc atgagccggc aggtggtgga
21360ccagaccaag tacaaggact accaggaggt gggcatcatc caccagcaca acaactcggg
21420cttcgtgggc tacctcgccc ccaccatgcg cgagggacag gcctaccccg ccaacttccc
21480ctatccgctc ataggcaaga ccgcggtcga cagcatcacc cagaaaaagt tcctctgcga
21540ccgcaccctc tggcgcatcc ccttctccag caacttcatg tccatgggtg cgctctcgga
21600cctgggccag aacttgctct acgccaactc cgcccacgcc ctcgacatga ccttcgaggt
21660cgaccccatg gacgagccca cccttctcta tgttctgttc gaagtctttg acgtggtccg
21720ggtccaccag ccgcaccgcg gcgtcatcga gaccgtgtac ctgcgtacgc ccttctcggc
21780cggcaacgcc accacctaaa gaagcaagcc gcagtcatcg ccgcctgcat gccgtcgggt
21840tccaccgagc aagagctcag ggccatcgtc agagacctgg gatgcgggcc ctattttttg
21900ggcaccttcg acaagcgctt ccctggcttt gtctccccac acaagctggc ctgcgccatc
21960gtcaacacgg ccggccgcga gaccgggggc gtgcactggc tggccttcgc ctggaacccg
22020cgctccaaaa catgcttcct ctttgacccc ttcggctttt cggaccagcg gctcaagcaa
22080atctacgagt tcgagtacga gggcttgctg cgtcgcagcg ccatcgcctc ctcgcccgac
22140cgctgcgtca ccctcgaaaa gtccacccag accgtgcagg ggcccgactc ggccgcctgc
22200ggtctcttct gctgcatgtt tctgcacgcc tttgtgcact ggcctcagag tcccatggac
22260cgcaacccca ccatgaactt gctgacgggg gtgcccaact ccatgctcca gagcccccag
22320gtcgagccca ccctgcgccg caaccaggag cagctctaca gcttcctgga gcgccactcg
22380ccttacttcc gccgccacag cgcacagatc aggagggcca cctccttctg ccacttgcaa
22440gagatgcaag aagggtaata acgatgtaca cacttttttt ctcaataaat ggcatctttt
22500tatttataca agctctctgg ggtattcatt tcccaccacc acccgccgtt gtcgccatct
22560ggctctattt agaaatcgaa agggttctgc cgggagtcgc cgtgcgccac gggcagggac
22620acgttgcgat actggtagcg ggtgccccac ttgaactcgg gcaccaccag gcgaggcagc
22680tcggggaagt tttcgctcca caggctgcgg gtcagcacca gcgcgttcat caggtcgggc
22740gccgagatct tgaagtcgca gttggggccg ccgccctgcg cgcgcgagtt gcggtacacc
22800gggttgcagc actggaacac caacagcgcc gggtgcttca cgctggccag cacgctgcgg
22860tcggagatca gctcggcgtc caggtcctcc gcgttgctca gcgcgaacgg ggtcatcttg
22920ggcacttgcc gccccaggaa gggcgcgtgc cccggtttcg agttgcagtc gcagcgcagc
22980gggatcagca ggtgcccgtg cccggactcg gcgttggggt acagcgcgcg catgaaggcc
23040tgcatctggc ggaaggccat ctgggccttg gcgccctccg agaagaacat gccgcaggac
23100ttgcccgaga actggtttgc ggggcagctg gcgtcgtgca ggcagcagcg cgcgtcggtg
23160ttggcgatct gcaccacgtt gcgcccccac cggttcttca cgatcttggc cttggacgat
23220tgctccttca gcgcgcgctg cccgttctcg ctggtcacat ccatctcgat cacatgttcc
23280ttgttcacca tgctgctgcc gtgcagacac ttcagctcgc cctccgtctc ggtgcagcgg
23340tgctgccaca gcgcgcagcc cgtgggctcg aaagacttgt aggtcacctc cgcgaaggac
23400tgcaggtacc cctgcaaaaa gcggcccatc atggtcacga aggtcttgtt gctgctgaag
23460gtcagctgca gcccgcggtg ctcctcgttc agccaggtct tgcacacggc cgccagcgcc
23520tccacctggt cgggcagcat cttgaagttc accttcagct cattctccac gtggtacttg
23580tccatcagcg tgcgcgccgc ctccatgccc ttctcccagg ccgacaccag cggcaggctc
23640acggggttct tcaccatcac cgtggccgcc gcctccgccg cgctttcgct ttccgccccg
23700ctgttctctt cctcttcctc ctcttcctcg ccgccgccca ctcgcagccc ccgcaccacg
23760gggtcgtctt cctgcaggcg ctgcaccttg cgcttgccgt tgcgcccctg cttgatgcgc
23820acgggcgggt tgctgaagcc caccatcacc agcgcggcct cttcttgctc gtcctcgctg
23880tccagaatga cctccgggga gggggggttg gtcatcctca gtaccgaggc acgcttcttt
23940ttcttcctgg gggcgttcgc cagctccgcg gctgcggccg ctgccgaggt cgaaggccga
24000gggctgggcg tgcgcggcac cagcgcgtcc tgcgagccgt cctcgtcctc ctcggactcg
24060agacggaggc gggcccgctt cttcgggggc gcgcggggcg gcggaggcgg cggcggcgac
24120ggagacgggg acgagacatc gtccagggtg ggtggacggc gggccgcgcc gcgtccgcgc
24180tcgggggtgg tctcgcgctg gtcctcttcc cgactggcca tctcccactg ctccttctcc
24240tataggcaga aagagatcat ggagtctctc atgcgagtcg agaaggagga ggacagccta
24300accgccccct ctgagccctc caccaccgcc gccaccaccg ccaatgccgc cgcggacgac
24360gcgcccaccg agaccaccgc cagtaccacc ctccccagcg acgcaccccc gctcgagaat
24420gaagtgctga tcgagcagga cccgggtttt gtgagcggag aggaggatga ggtggatgag
24480aaggagaagg aggaggtcgc cgcctcagtg ccaaaagagg ataaaaagca agaccaggac
24540gacgcagata aggatgagac agcagtcggg cgggggaacg gaagccatga tgctgatgac
24600ggctacctag acgtgggaga cgacgtgctg cttaagcacc tgcaccgcca gtgcgtcatc
24660gtctgcgacg cgctgcagga gcgctgcgaa gtgcccctgg acgtggcgga ggtcagccgc
24720gcctacgagc ggcacctctt cgcgccgcac gtgcccccca agcgccggga gaacggcacc
24780tgcgagccca acccgcgtct caacttctac ccggtcttcg cggtacccga ggtgctggcc
24840acctaccaca tctttttcca aaactgcaag atccccctct cctgccgcgc caaccgcacc
24900cgcgccgaca aaaccctgac cctgcggcag ggcgcccaca tacctgatat cgcctctctg
24960gaggaagtgc ccaagatctt cgagggtctc ggtcgcgacg agaaacgggc ggcgaacgct
25020ctgcacggag acagcgaaaa cgagagtcac tcgggggtgc tggtggagct cgagggcgac
25080aacgcgcgcc tggccgtact caagcgcagc atagaggtca cccactttgc ctacccggcg
25140ctcaacctgc cccccaaggt catgagtgtg gtcatgggcg agctcatcat gcgccgcgcc
25200cagcccctgg ccgcggatgc aaacttgcaa gagtcctccg aggaaggcct gcccgcggtc
25260agcgacgagc agctggcgcg ctggctggag acccgcgacc ccgcgcagct ggaggagcgg
25320cgcaagctca tgatggccgc ggtgctggtc accgtggagc tcgagtgtct gcagcgcttc
25380ttcgcggacc ccgagatgca gcgcaagctc gaggagaccc tgcactacac cttccgccag
25440ggctacgtgc gccaggcctg caagatctcc aacgtggagc tctgcaacct ggtctcctac
25500ctgggcatcc tgcacgagaa ccgcctcggg cagaacgtcc tgcactccac cctcaaaggg
25560gaggcgcgcc gcgactacat ccgcgactgc gcctacctct tcctctgcta cacctggcag
25620acggccatgg gggtctggca gcagtgcctg gaggagcgca acctcaagga gctggaaaag
25680ctcctcaagc gcaccctcag ggacctctgg acgggcttca acgagcgctc ggtggccgcc
25740gcgctggcgg acatcatctt tcccgagcgc ctgctcaaga ccctgcagca gggcctgccc
25800gacttcacca gccagagcat gctgcagaac ttcaggactt tcatcctgga gcgctcgggc
25860atcctgccgg ccacttgctg cgcgctgccc agcgacttcg tgcccatcaa gtacagggag
25920tgcccgccgc cgctctgggg ccactgctac ctcttccagc tggccaacta cctcgcctac
25980cactcggacc tcatggaaga cgtgagcggc gagggcctgc tcgagtgcca ctgccgctgc
26040aacctctgca cgccccaccg ctctctagtc tgcaacccgc agctgctcag cgagagtcag
26100attatcggta ccttcgagct gcagggtccc tcgcctgacg agaagtccgc ggctccaggg
26160ctgaaactca ctccggggct gtggacttcc gcctacctac gcaaatttgt acctgaggac
26220taccacgccc acgagatcag gttctacgaa gaccaatccc gcccgcccaa ggcggagctc
26280accgcctgcg tcatcaccca ggggcacatc ctgggccaat tgcaagccat caacaaagcc
26340cgccgagagt tcttgctgaa aaagggtcgg ggggtgtacc tggaccccca gtccggcgag
26400gagctaaacc cgctaccccc gccgccgccc cagcagcggg accttgcttc ccaggatggc
26460acccagaaag aagcagcagc cgccgccgcc gccgcagcca tacatgcttc tggaggaaga
26520ggaggaggac tgggacagtc aggcagagga ggtttcggac gaggagcagg aggagatgat
26580ggaagactgg gaggaggaca gcagcctaga cgaggaagct tcagaggccg aagaggtggc
26640agacgcaaca ccatcgccct cggtcgcagc cccctcgccg gggcccctga aatcctccga
26700acccagcacc agcgctataa cctccgctcc tccggcgccg gcgccacccg cccgcagacc
26760caaccgtaga tgggacacca caggaaccgg ggtcggtaag tccaagtgcc cgccgccgcc
26820accgcagcag cagcagcagc agcgccaggg ctaccgctcg tggcgcgggc acaagaacgc
26880catagtcgcc tgcttgcaag actgcggggg caacatctct ttcgcccgcc gcttcctgct
26940attccaccac ggggtcgcct ttccccgcaa tgtcctgcat tactaccgtc atctctacag
27000cccctactgc agcggcgacc cagaggcggc agcggcagcc acagcggcga ccaccaccta
27060ggaagatatc ctccgcgggc aagacagcgg cagcagcggc caggagaccc gcggcagcag
27120cggcgggagc ggtgggcgca ctgcgcctct cgcccaacga acccctctcg acccgggagc
27180tcagacacag gatcttcccc actttgtatg ccatcttcca acagagcaga ggccaggagc
27240aggagctgaa aataaaaaac agatctctgc gctccctcac ccgcagctgt ctgtatcaca
27300aaagcgaaga tcagcttcgg cgcacgctgg aggacgcgga ggcactcttc agcaaatact
27360gcgcgctcac tcttaaagac tagctccgcg cccttctcga atttaggcgg gagaaaacta
27420cgtcatcgcc ggccgccgcc cagcccgccc agccgagatg agcaaagaga ttcccacgcc
27480atacatgtgg agctaccagc cgcagatggg actcgcggcg ggagcggccc aggactactc
27540cacccgcatg aactacatga gcgcgggacc ccacatgatc tcacaggtca acgggatccg
27600cgcccagcga aaccaaatac tgctggaaca ggcggccatc accgccacgc cccgccataa
27660tctcaacccc cgaaattggc ccgccgccct cgtgtaccag gaaaccccct ccgccaccac
27720cgtactactt ccgcgtgacg cccaggccga agtccagatg actaactcag gggcgcagct
27780cgcgggcggc tttcgtcacg gggcgcggcc gctccgacca ggtataagac acctgatgat
27840cagaggccga ggtatccagc tcaacgacga gtcggtgagc tcttcgctcg gtctccgtcc
27900ggacggaact ttccagctcg ccggatccgg ccgctcttcg ttcacgcccc gccaggcgta
27960cctgactctg cagacctcgt cctcggagcc ccgctccggc ggcatcggaa ccctccagtt
28020cgtggaggag ttcgtgccct cggtctactt caaccccttc tcgggacctc ccggacgcta
28080ccccgaccag ttcattccga actttgacgc ggtgaaggac tcggcggacg gctacgactg
28140aatgtcaggt gtcgaggcag agcagcttcg cctgagacac ctcgagcact gccgccgcca
28200caagtgcttc gcccgcggtt ctggtgagtt ctgctacttt cagctacccg aggagcatac
28260cgaggggccg gcgcacggcg tccgcctgac cacccagggc gaggttacct gttccctcat
28320ccgggagttt accctccgtc ccctgctagt ggagcgggag cggggtccct gtgtcctaac
28380tatcgcctgc aactgcccta accctggatt acatcaagat ctttgctgtc atctctgtgc
28440tgagtttaat aaacgctgag atcagaatct actggggctc ctgtcgccat cctgtgaacg
28500ccaccgtctt cacccacccc gaccaggccc aggcgaacct cacctgcggt ctgcatcgga
28560gggccaagaa gtacctcacc tggtacttca acggcacccc ctttgtggtt tacaacagct
28620tcgacgggga cggagtctcc ctgaaagacc agctctccgg tctcagctac tccatccaca
28680agaacaccac cctccaactc ttccctccct acctgccggg aacctacgag tgcgtcaccg
28740gccgctgcac ccacctcacc cgcctgatcg taaaccagag ctttccggga acagataact
28800ccctcttccc cagaacagga ggtgagctca ggaaactccc cggggaccag ggcggagacg
28860taccttcgac ccttgtgggg ttaggatttt ttattaccgg gttgctggct cttttaatca
28920aagtttcctt gagatttgtt ctttccttct acgtgtatga acacctcaac ctccaataac
28980tctacccttt cttcggaatc aggtgacttc tctgaaatcg ggcttggtgt gctgcttact
29040ctgttgattt ttttccttat catactcagc cttctgtgcc tcaggctcgc cgcctgctgc
29100gcacacatct atatctactg ctggttgctc aagtgcaggg gtcgccaccc aagatgaaca
29160ggtacatggt cctatcgatc ctaggcctgc tggccctggc ggcctgcagc gccgccaaaa
29220aagagattac ctttgaggag cccgcttgca atgtaacttt caagcccgag ggtgaccaat
29280gcaccaccct cgtcaaatgc gttaccaatc atgagaggct gcgcatcgac tacaaaaaca
29340aaactggcca gtttgcggtc tatagtgtgt ttacgcccgg agacccctct aactactctg
29400tcaccgtctt ccagggcgga cagtctaaga tattcaatta cactttccct ttttatgagt
29460tatgcgatgc ggtcatgtac atgtcaaaac agtacaacct gtggcctccc tctccccagg
29520cgtgtgtgga aaatactggg tcttactgct gtatggcttt cgcaatcact acgctcgctc
29580taatctgcac ggtgctatac ataaaattca ggcagaggcg aatctttatc gatgaaaaga
29640aaatgccttg atcgctaaca ccggctttct atctgcagaa tgaatgcaat cacctcccta
29700ctaatcacca ccaccctcct tgcgattgcc catgggttga cacgaatcga agtgccagtg
29760gggtccaatg tcaccatggt gggccccgcc ggcaattcca ccctcatgtg ggaaaaattt
29820gtccgcaatc aatgggttca tttctgctct aaccgaatca gtatcaagcc cagagccatc
29880tgcgatgggc aaaatctaac tctgatcaat gtgcaaatga tggatgctgg gtactattac
29940gggcagcggg gagaaatcat taattactgg cgaccccaca aggactacat gctgcatgta
30000gtcgaggcac ttcccactac cacccccact accacctctc ccaccaccac caccactact
30060actactacta ctactactac tactactacc actaccgctg cccgccatac ccgcaaaagc
30120accatgatta gcacaaagcc ccctcgtgct cactcccacg ccggcgggcc catcggtgcg
30180acctcagaaa ccaccgagct ttgcttctgc caatgcacta acgccagcgc tcatgaactg
30240ttcgacctgg agaatgagga tgtccagcag agctccgctt gcctgaccca ggaggctgtg
30300gagcccgttg ccctgaagca gatcggtgat tcaataattg actcttcttc ttttgccact
30360cccgaatacc ctcccgattc tactttccac atcacgggta ccaaagaccc taacctctct
30420ttctacctga tgctgctgct ctgtatctct gtggtctctt ccgcgctgat gttactgggg
30480atgttctgct gcctgatctg ccgcagaaag agaaaagctc gctctcaggg ccaaccactg
30540atgcccttcc cctacccccc ggattttgca gataacaaga tatgagctcg ctgctgacac
30600taaccgcttt actagcctgc gctctaaccc ttgtcgcttg cgactcgaga ttccacaatg
30660tcacagctgt ggcaggagaa aatgttactt tcaactccac ggccgatacc cagtggtcgt
30720ggagtggctc aggtagctac ttaactatct gcaatagctc cacttccccc ggcatatccc
30780caaccaagta ccaatgcaat gccagcctgt tcaccctcat caacgcttcc accctggaca
30840atggactcta tgtaggctat gtaccctttg gtgggcaagg aaagacccac gcttacaacc
30900tggaagttcg ccagcccaga accactaccc aagcttctcc caccaccacc accaccacca
30960ccatcaccag cagcagcagc agcagcagcc acagcagcag cagcagatta ttgactttgg
31020ttttggccag ctcatctgcc gctacccagg ccatctacag ctctgtgccc gaaaccactc
31080agatccaccg cccagaaacg accaccgcca ccaccctaca cacctccagc gatcagatgc
31140cgaccaacat cacccccttg gctcttcaaa tgggacttac aagccccact ccaaaaccag
31200tggatgcggc cgaggtctcc gccctcgtca atgactgggc ggggctggga atgtggtggt
31260tcgccatagg catgatggcg ctctgcctgc ttctgctctg gctcatctgc tgcctccacc
31320gcaggcgagc cagacccccc atctatagac ccatcattgt cctgaacccc gataatgatg
31380ggatccatag attggatggc ctgaaaaacc tacttttttc ttttacagta tgataaattg
31440agacatgcct cgcattttct tgtacatgtt ccttctccca ccttttctgg ggtgttctac
31500gctggccgct gtgtctcacc tggaggtaga ctgcctctca cccttcactg tctacctgct
31560ttacggattg gtcaccctca ctctcatctg cagcctaatc acagtaatca tcgccttcat
31620ccagtgcatt gattacatct gtgtgcgcct cgcatacttc agacaccacc cgcagtaccg
31680agacaggaac attgcccaac ttctaagact gctctaatca tgcataagac tgtgatctgc
31740cttctgatcc tctgcatcct gcccaccctc acctcctgcc agtacaccac aaaatctccg
31800cgcaaaagac atgcctcctg ccgcttcacc caactgtgga atatacccaa atgctacaac
31860gaaaagagcg agctctccga agcttggctg tatggggtca tctgtgtctt agttttctgc
31920agcactgtct ttgccctcat aatctacccc tactttgatt tgggatggaa cgcgatcgat
31980gccatgaatt accccacctt tcccgcaccc gagataattc cactgcgaca agttgtaccc
32040gttgtcgtta atcaacgccc cccatcccct acgcccactg aaatcagcta ctttaaccta
32100acaggcggag atgactgacg ccctagatct agaaatggac ggcatcagta ccgagcagcg
32160tctcctagag aggcgcaggc aggcggctga gcaagagcgc ctcaatcagg agctccgaga
32220tctcgttaac ctgcaccagt gcaaaagagg catcttttgt ctggtaaagc aggccaaagt
32280cacctacgag aagaccggca acagccaccg cctcagttac aaattgccca cccagcgcca
32340gaagctggtg ctcatggtgg gtgagaatcc catcaccgtc acccagcact cggtagagac
32400cgaggggtgt ctgcactccc cctgtcgggg tccagaagac ctctgcaccc tggtaaagac
32460cctgtgcggt ctcagagatt tagtcccctt taactaatca aacactggaa tcaataaaaa
32520gaatcactta cttaaaatca gacagcaggt ctctgtccag tttattcagc agcacctcct
32580tcccctcctc ccaactctgg tactccaaac gccttctggc ggcaaacttc ctccacaccc
32640tgaagggaat gtcagattct tgctcctgtc cctccgcacc cactatcttc atgttgttgc
32700agatgaagcg caccaaaacg tctgacgaga gcttcaaccc cgtgtacccc tatgacacgg
32760aaagcggccc tccctccgtc cctttcctca cccctccctt cgtgtctccc gatggattcc
32820aagaaagtcc ccccggggtc ctgtctctga acctggccga gcccctggtc acttcccacg
32880gcatgctcgc cctgaaaatg ggaagtggcc tctccctgga cgacgctggc aacctcacct
32940ctcaagatat caccaccgct agccctcccc tcaaaaaaac caagaccaac ctcagcctag
33000aaacctcatc ccccctaact gtgagcacct caggcgccct caccgtagca gccgccgctc
33060ccctggcggt ggccggcacc tccctcacca tgcaatcaga ggcccccctg acagtacagg
33120atgcaaaact caccctggcc accaaaggcc ccctgaccgt gtctgaaggc aaactggcct
33180tgcaaacatc ggccccgctg acggccgctg acagcagcac cctcacagtc agtgccacac
33240caccccttag cacaagcaat ggcagcttgg gtattgacat gcaagccccc atttacacca
33300ccaatggaaa actaggactt aactttggcg ctcccctgca tgtggtagac agcctaaatg
33360cactgactgt agttactggc caaggtctta cgataaacgg aacagcccta caaactagag
33420tctcaggtgc cctcaactat gacacatcag gaaacctaga attgagagct gcagggggta
33480tgcgagttga tgcaaatggt caacttatcc ttgatgtagc ttacccattt gatgcacaaa
33540acaatctcag ccttaggctt ggacagggac ccctgtttgt taactctgcc cacaacttgg
33600atgttaacta caacagaggc ctctacctgt tcacatctgg aaataccaaa aagctagaag
33660ttaatatcaa aacagccaag ggtctcattt atgatgacac tgctatagca atcaatgcgg
33720gtgatgggct acagtttgac tcaggctcag atacaaatcc attaaaaact aaacttggat
33780taggactgga ttatgactcc agcagagcca taattgctaa actgggaact ggcctaagct
33840ttgacaacac aggtgccatc acagtaggca acaaaaatga tgacaagctt accttgtgga
33900ccacaccaga cccatcccct aactgtagaa tctattcaga gaaagatgct aaattcacac
33960ttgttttgac taaatgcggc agtcaggtgt tggccagcgt ttctgtttta tctgtaaaag
34020gtagccttgc gcccatcagt ggcacagtaa ctagtgctca gattgtcctc agatttgatg
34080aaaatggagt tctactaagc aattcttccc ttgaccctca atactggaac tacagaaaag
34140gtgaccttac agagggcact gcatatacca acgcagtggg atttatgccc aacctcacag
34200catacccaaa aacacagagc caaactgcta aaagcaacat tgtaagtcag gtttacttga
34260atggggacaa atccaaaccc atgaccctca ccattaccct caatggaact aatgaaacag
34320gagatgccac agtaagcact tactccatgt cattctcatg gaactggaat ggaagtaatt
34380acattaatga aacgttccaa accaactcct tcaccttctc ctacatcgcc caagaataaa
34440aagcatgacg ctgttgattt gattcaatgt gtttctgttt tattttcaag cacaacaaaa
34500tcattcaagt cattcttcca tcttagctta atagacacag tagcttaata gacccagtag
34560tgcaaagccc cattctagct tatagatcag acagtgataa ttaaccacca ccaccaccat
34620accttttgat tcaggaaatc atgatcatca caggatccta gtcttcaggc cgccccctcc
34680ctcccaagac acagaataca cagtcctctc cccccgactg gctttaaata acaccatctg
34740gttggtcaca gacatgttct taggggttat attccacacg gtctcctgcc gcgccaggcg
34800ctcgtcggtg atgttgataa actctcccgg cagctcgctc aagttcacgt cgctgtccag
34860cggctgaacc tccggctgac gcgataactg tgcgaccggc tgctggacga acggaggccg
34920cgcctacaag ggggtagagt cataatcctc ggtcaggata gggcggtgat gcagcagcag
34980cgagcgaaac atctgctgcc gccgccgctc cgtccggcag gaaaacaaca cgccggtggt
35040ctcctccgcg ataatccgca ccgcccgcag catcagcttc ctcgttctcc gcgcgcagca
35100cctcaccctt atctcgctca aatcggcgca gtaggtacag cacagcacca cgatgttatt
35160catgatccca cagtgcaggg cgctgtatcc aaagctcatg ccgggaacca ccgcccccac
35220gtggccatcg taccacaagc gcacgtaaat caagtgtcga cccctcatga acgcgctgga
35280cacaaacatt acttccttgg gcatgttgta attcaccacc tcccggtacc agataaacct
35340ctggttgaac agggcacctt ccaccaccat cctgaaccaa gaggccagaa cctgcccacc
35400ggctatgcac tgcagggaac ccgggttgga acaatgacaa tgcagactcc aaggctcgta
35460accgtggatc atccggctgc tgaaggcatc gatgttggca caacacagac acacgtgcat
35520gcactttctc atgattagca gctcttccct cgtcaggatc atatcccaag gaataaccca
35580ttcttgaatc aacgtaaaac ccacacagca gggaaggcct cgcacataac tcacgttgtg
35640catggtcagc gtgttgcatt ccggaaacag cggatgatcc tccagtatcg aggcgcgggt
35700ctccttctca cagggaggta aagggtccct gctgtacgga ctgcgccggg acgaccgaga
35760tcgtgttgag cgtagtgtca tggaaaaggg aacgccggac gtggtcatac ttcttgaagc
35820agaaccaggt tcgcgcgtgg caggcctcct tgcgtctgcg gtctcgccgt ctagctcgct
35880ccgtgtgata gttgtagtac agccactccc gcagagcgtc gaggcgcacc ctggcttccg
35940gatctatgta gactccgtct tgcaccgcgg ccctgataat atccaccacc gtagaataag
36000caacacccag ccaagcaata cactcgctct gcgagcggca gacaggagga gcgggcagag
36060atgggagaac catgataaaa aacttttttt aaagaatatt ttccaattct tcgaaagtaa
36120gatctatcaa gtggcagcgc tcccctccac tggcgcggtc aaactctacg gccaaagcac
36180agacaacggc atttctaaga tgttccttaa tggcgtccaa aagacacacc gctctcaagt
36240tgcagtaaac tatgaatgaa aacccatccg gctgattttc caatatagac gcgccggcag
36300cgtccaccaa acccagataa ttttcttctc tccagcggtt tacgatctgt ctaagcaaat
36360cccttatatc aagtccgacc atgccaaaaa tctgctcaag agcgccctcc accttcatgt
36420acaagcagcg catcatgatt gcaaaaattc aggttcttca gagacctgta taagattcaa
36480aatgggaaca ttaacaaaaa ttcctctgtc gcgcagatcc cttcgcaggg caagctgaac
36540ataatcagac aggtccgaac ggaccagtga ggccaaatcc ccaccaggaa ccagatccag
36600agaccctata ctgattatga cgcgcatact cggggctatg ctgaccagcg tagcgccgat
36660gtaggcgtgc tgcatgggcg gcgagataaa atgcaaagtg ctggttaaaa aatcaggcaa
36720agcctcgcgc aaaaaagcta acacatcata atcatgctca tgcaggtagt tgcaggtaag
36780ctcaggaacc aaaacggaat aacacacgat tttcctctca aacatgactt cgcggatact
36840gcgtaaaaca aaaaattata aataaaaaat taattaaata acttaaacat tggaagcctg
36900tctcacaaca ggaaaaacca ctttaatcaa cataagacgg gccacgggca tgccggcata
36960gccgtaaaaa aattggtccc cgtgattaac aagtaccaca gacagctccc cggtcatgtc
37020gggggtcatc atgtgagact ctgtatacac gtctggattg tgaacatcag acaaacaaag
37080aaatcgagcc acgtagcccg gaggtataat cacccgcagg cggaggtaca gcaaaacgac
37140ccccatagga ggaatcacaa aattagtagg agaaaaaaat acataaacac cagaaaaacc
37200ctgttgctga ggcaaaatag cgccctcccg atccaaaaca acataaagcg cttccacagg
37260agcagccata acaaagaccc gagtcttacc agtaaaagaa aaaagatctc tcaacgcagc
37320accagcacca acacttcgca gtgtaaaagg ccaagtgccg agagagtata tataggaata
37380aaaagtgacg taaacgggca aagtccaaaa aacgcccaga aaaaccgcac gcgaacctac
37440gccccgaaac gaaagccaaa aaacactaga cactcccttc cggcgtcaac ttccgctttc
37500ccacgctacg tcacttgccc cagtcaaaca aactacatat cccgaacttc caagtcgcca
37560cgcccaaaac accgcctaca cctccccgcc cgccggcccg cccccaaacc cgcctcccgc
37620cccgcgcccc gccccgcgcc gcccatctca ttatcatatt ggcttcaatc caaaataagg
37680tatattattg atgatggttt aaacggatcc tctagagtcg acctgcaggc atgcaagctt
37740gagtattcta tagtgtcacc taaatagctt ggcgtaatca tggtcatagc tgtttcctgt
37800gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca taaagtgtaa
37860agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct cactgcccgc
37920tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac gcgaacccct
37980tgcggccgcc cgggccgtcg accaattctc atgtttgaca gcttatcatc gaatttctgc
38040cattcatccg cttattatca cttattcagg cgtagcaacc aggcgtttaa gggcaccaat
38100aactgcctta aaaaaattac gccccgccct gccactcatc gcagtactgt tgtaattcat
38160taagcattct gccgacatgg aagccatcac aaacggcatg atgaacctga atcgccagcg
38220gcatcagcac cttgtcgcct tgcgtataat atttgcccat ggtgaaaacg ggggcgaaga
38280agttgtccat attggccacg tttaaatcaa aactggtgaa actcacccag ggattggctg
38340agacgaaaaa catattctca ataaaccctt tagggaaata ggccaggttt tcaccgtaac
38400acgccacatc ttgcgaatat atgtgtagaa actgccggaa atcgtcgtgg tattcactcc
38460agagcgatga aaacgtttca gtttgctcat ggaaaacggt gtaacaaggg tgaacactat
38520cccatatcac cagctcaccg tctttcattg ccatacggaa ttccggatga gcattcatca
38580ggcgggcaag aatgtgaata aaggccggat aaaacttgtg cttatttttc tttacggtct
38640ttaaaaaggc cgtaatatcc agctgaacgg tctggttata ggtacattga gcaactgact
38700gaaatgcctc aaaatgttct ttacgatgcc attgggatat atcaacggtg gtatatccag
38760tgattttttt ctccatttta gcttccttag ctcctgaaaa tctcgataac tcaaaaaata
38820cgcccggtag tgatcttatt tcattatggt gaaagttgga acctcttacg tgccgatcaa
38880cgtctcattt tcgccaaaag ttggcccagg gcttcccggt atcaacaggg acaccaggat
38940ttatttattc tgcgaagtga tcttccgtca caggtattta ttcgcgataa gctcatggag
39000cggcgtaacc gtcgcacagg aaggacagag aaagcgcgga tctgggaagt gacggacaga
39060acggtcagga cctggattgg ggaggcggtt gccgccgctg ctgctgacgg tgtgacgttc
39120tctgttccgg tcacaccaca tacgttccgc cattcctatg cgatgcacat gctgtatgcc
39180ggtataccgc tgaaagttct gcaaagcctg atgggacata agtccatcag ttcaacggaa
39240gtctacacga aggtttttgc gctggatgtg gctgcccggc accgggtgca gtttgcgatg
39300ccggagtctg atgcggttgc gatgctgaaa caattatcct gagaataaat gccttggcct
39360ttatatggaa atgtggaact gagtggatat gctgtttttg tctgttaaac agagaagctg
39420gctgttatcc actgagaagc gaacgaaaca gtcgggaaaa tctcccatta tcgtagagat
39480ccgcattatt aatctcagga gcctgtgtag cgtttatagg aagtagtgtt ctgtcatgat
39540gcctgcaagc ggtaacgaaa acgatttgaa tatgccttca ggaacaatag aaatcttcgt
39600gcggtgttac gttgaagtgg agcggattat gtcagcaatg gacagaacaa cctaatgaac
39660acagaaccat gatgtggtct gtccttttac agccagtagt gctcgccgca gtcgagcgac
39720agggcgaagc cctcgagtga gcgaggaagc accagggaac agcacttata tattctgctt
39780acacacgatg cctgaaaaaa cttcccttgg ggttatccac ttatccacgg ggatattttt
39840ataattattt tttttatagt ttttagatct tcttttttag agcgccttgt aggcctttat
39900ccatgctggt tctagagaag gtgttgtgac aaattgccct ttcagtgtga caaatcaccc
39960tcaaatgaca gtcctgtctg tgacaaattg cccttaaccc tgtgacaaat tgccctcaga
40020agaagctgtt ttttcacaaa gttatccctg cttattgact cttttttatt tagtgtgaca
40080atctaaaaac ttgtcacact tcacatggat ctgtcatggc ggaaacagcg gttatcaatc
40140acaagaaacg taaaaatagc ccgcgaatcg tccagtcaaa cgacctcact gaggcggcat
40200atagtctctc ccgggatcaa aaacgtatgc tgtatctgtt cgttgaccag atcagaaaat
40260ctgatggcac cctacaggaa catgacggta tctgcgagat ccatgttgct aaatatgctg
40320aaatattcgg attgacctct gcggaagcca gtaaggatat acggcaggca ttgaagagtt
40380tcgcggggaa ggaagtggtt ttttatcgcc ctgaagagga tgccggcgat gaaaaaggct
40440atgaatcttt tccttggttt atcaaacgtg cgcacagtcc atccagaggg ctttacagtg
40500tacatatcaa cccatatctc attcccttct ttatcgggtt acagaaccgg tttacgcagt
40560ttcggcttag tgaaacaaaa gaaatcacca atccgtatgc catgcgttta tacgaatccc
40620tgtgtcagta tcgtaagccg gatggctcag gcatcgtctc tctgaaaatc gactggatca
40680tagagcgtta ccagctgcct caaagttacc agcgtatgcc tgacttccgc cgccgcttcc
40740tgcaggtctg tgttaatgag atcaacagca gaactccaat gcgcctctca tacattgaga
40800aaaagaaagg ccgccagacg actcatatcg tattttcctt ccgcgatatc acttccatga
40860cgacaggata gtctgagggt tatctgtcac agatttgagg gtggttcgtc acatttgttc
40920tgacctactg agggtaattt gtcacagttt tgctgtttcc ttcagcctgc atggattttc
40980tcatactttt tgaactgtaa tttttaagga agccaaattt gagggcagtt tgtcacagtt
41040gatttccttc tctttccctt cgtcatgtga cctgatatcg ggggttagtt cgtcatcatt
41100gatgagggtt gattatcaca gtttattact ctgaattggc tatccgcgtg tgtacctcta
41160cctggagttt ttcccacggt ggatatttct tcttgcgctg agcgtaagag ctatctgaca
41220gaacagttct tctttgcttc ctcgccagtt cgctcgctat gctcggttac acggctgcgg
41280cgagcgctag tgataataag tgactgaggt atgtgctctt cttatctcct tttgtagtgt
41340tgctcttatt ttaaacaact ttgcggtttt ttgatgactt tgcgattttg ttgttgcttt
41400gcagtaaatt gcaagattta ataaaaaaac gcaaagcaat gattaaagga tgttcagaat
41460gaaactcatg gaaacactta accagtgcat aaacgctggt catgaaatga cgaaggctat
41520cgccattgca cagtttaatg atgacagccc ggaagcgagg aaaataaccc ggcgctggag
41580aataggtgaa gcagcggatt tagttggggt ttcttctcag gctatcagag atgccgagaa
41640agcagggcga ctaccgcacc cggatatgga aattcgagga cgggttgagc aacgtgttgg
41700ttatacaatt gaacaaatta atcatatgcg tgatgtgttt ggtacgcgat tgcgacgtgc
41760tgaagacgta tttccaccgg tgatcggggt tgctgcccat aaaggtggcg tttacaaaac
41820ctcagtttct gttcatcttg ctcaggatct ggctctgaag gggctacgtg ttttgctcgt
41880ggaaggtaac gacccccagg gaacagcctc aatgtatcac ggatgggtac cagatcttca
41940tattcatgca gaagacactc tcctgccttt ctatcttggg gaaaaggacg atgtcactta
42000tgcaataaag cccacttgct ggccggggct tgacattatt ccttcctgtc tggctctgca
42060ccgtattgaa actgagttaa tgggcaaatt tgatgaaggt aaactgccca ccgatccaca
42120cctgatgctc cgactggcca ttgaaactgt tgctcatgac tatgatgtca tagttattga
42180cagcgcgcct aacctgggta tcggcacgat taatgtcgta tgtgctgctg atgtgctgat
42240tgttcccacg cctgctgagt tgtttgacta cacctccgca ctgcagtttt tcgatatgct
42300tcgtgatctg ctcaagaacg ttgatcttaa agggttcgag cctgatgtac gtattttgct
42360taccaaatac agcaatagta atggctctca gtccccgtgg atggaggagc aaattcggga
42420tgcctgggga agcatggttc taaaaaatgt tgtacgtgaa acggatgaag ttggtaaagg
42480tcagatccgg atgagaactg tttttgaaca ggccattgat caacgctctt caactggtgc
42540ctggagaaat gctctttcta tttgggaacc tgtctgcaat gaaattttcg atcgtctgat
42600taaaccacgc tgggagatta gataatgaag cgtgcgcctg ttattccaaa acatacgctc
42660aatactcaac cggttgaaga tacttcgtta tcgacaccag ctgccccgat ggtggattcg
42720ttaattgcgc gcgtaggagt aatggctcgc ggtaatgcca ttactttgcc tgtatgtggt
42780cgggatgtga agtttactct tgaagtgctc cggggtgata gtgttgagaa gacctctcgg
42840gtatggtcag gtaatgaacg tgaccaggag ctgcttactg aggacgcact ggatgatctc
42900atcccttctt ttctactgac tggtcaacag acaccggcgt tcggtcgaag agtatctggt
42960gtcatagaaa ttgccgatgg gagtcgccgt cgtaaagctg ctgcacttac cgaaagtgat
43020tatcgtgttc tggttggcga gctggatgat gagcagatgg ctgcattatc cagattgggt
43080aacgattatc gcccaacaag tgcttatgaa cgtggtcagc gttatgcaag ccgattgcag
43140aatgaatttg ctggaaatat ttctgcgctg gctgatgcgg aaaatatttc acgtaagatt
43200attacccgct gtatcaacac cgccaaattg cctaaatcag ttgttgctct tttttctcac
43260cccggtgaac tatctgcccg gtcaggtgat gcacttcaaa aagcctttac agataaagag
43320gaattactta agcagcaggc atctaacctt catgagcaga aaaaagctgg ggtgatattt
43380gaagctgaag aagttatcac tcttttaact tctgtgctta aaacgtcatc tgcatcaaga
43440actagtttaa gctcacgaca tcagtttgct cctggagcga cagtattgta taagggcgat
43500aaaatggtgc ttaacctgga caggtctcgt gttccaactg agtgtataga gaaaattgag
43560gccattctta aggaacttga aaagccagca ccctgatgcg accacgtttt agtctacgtt
43620tatctgtctt tacttaatgt cctttgttac aggccagaaa gcataactgg cctgaatatt
43680ctctctgggc ccactgttcc acttgtatcg tcggtctgat aatcagactg ggaccacggt
43740cccactcgta tcgtcggtct gattattagt ctgggaccac ggtcccactc gtatcgtcgg
43800tctgattatt agtctgggac cacggtccca ctcgtatcgt cggtctgata atcagactgg
43860gaccacggtc ccactcgtat cgtcggtctg attattagtc tgggaccatg gtcccactcg
43920tatcgtcggt ctgattatta gtctgggacc acggtcccac tcgtatcgtc ggtctgatta
43980ttagtctgga accacggtcc cactcgtatc gtcggtctga ttattagtct gggaccacgg
44040tcccactcgt atcgtcggtc tgattattag tctgggacca cgatcccact cgtgttgtcg
44100gtctgattat cggtctggga ccacggtccc acttgtattg tcgatcagac tatcagcgtg
44160agactacgat tccatcaatg cctgtcaagg gcaagtattg acatgtcgtc gtaacctgta
44220gaacggagta acctcggtgt gcggttgtat gcctgctgtg gattgctgct gtgtcctgct
44280tatccacaac attttgcgca cggttatgtg gacaaaatac ctggttaccc aggccgtgcc
44340ggcacgttaa ccgggctgca tccgatgcaa gtgtgtcgct gtcgacgagc tcgcgagctc
44400ggacatgagg ttgccccgta ttcagtgtcg ctgatttgta ttgtctgaag ttgtttttac
44460gttaagttga tgcagatcaa ttaatacgat acctgcgtca taattgatta tttgacgtgg
44520tttgatggcc tccacgcacg ttgtgatatg tagatgataa tcattatcac tttacgggtc
44580ctttccggtg atccgacagg ttacggggcg gcgacctcgc gggttttcgc tatttatgaa
44640aattttccgg tttaaggcgt ttccgttctt cttcgtcata acttaatgtt tttatttaaa
44700ataccctctg aaaagaaagg aaacgacagg tgctgaaagc gagctttttg gcctctgtcg
44760tttcctttct ctgtttttgt ccgtggaatg aacaatggaa gtccgagctc atcgctaata
44820acttcgtata gcatacatta tacgaagtta tattcgatgc ggccgcaagg ggttcgcgtc
44880agcgggtgtt ggcgggtgtc ggggctggct taactatgcg gcatcagagc agattgtact
44940gagagtgcac catatgcggt gtgaaatacc gcacagatgc gtaaggagaa aataccgcat
45000caggcgccat tcgccattca ggctgcgcaa ctgttgggaa gggcgatcgg tgcgggcctc
45060ttcgctatta cgccagctgg cgaaaggggg atgtgctgca aggcgattaa gttgggtaac
45120gccagggttt tcccagtcac gacgttgtaa aacgacggcc agtgaattgt aatacgactc
45180actatagggc gaattcgagc tcggtacccg gggatcctcg tttaaac
45227
<210> 10
<211> 37830
<212> DNA
<213> Simian adenovirus
<400> 10
catcatcaat aatatacctt attttggatt
gaagccaata tgataatgag atgggcggcg 60cggggcggga ggcgggtccg ggggcgggcc
ggcgggcggg gcggtgtggc ggaagtggac 120tttgtaagtg tggcggatgt gacttgctag
tgccgggcgc ggtaaaagtg acgttttccg 180tgcgcgacaa cgcccacggg aagtgacatt
tttcccgcgg tttttaccgg atgttgtagt 240gaatttgggc gtaaccaagt aagatttggc
cattttcgcg ggaaaactga aacggggaag 300tgaaatctga ttaatttcgc gttagtcata
ccgcgtaata tttgtcgagg gccgagggac 360tttggccgat tacgtggagg actcgcccag
gtgttttttg aggtgaattt ccgcgttccg 420ggtcaaagtc tccgttttat tattatagtc
agctgacgcg gagtgtattt ataccctctg 480atctcgtcaa gtggccactc ttgagtgcca
gcgagtagag ttttctcctc tgccgctctc 540cgctccgctc cgctcggctc tgacaccggg
gaaaaaatga gacatttcac ctacgatggc 600ggtgtgctca ccggccagct ggctgctgaa
gtcctggaca ccctgatcga ggaggtattg 660gccgataatt atcctccctc gactcctttt
gagccaccta cacttcacga actctacgat 720ctggatgtgg tggggcccag cgatccgaac
gagcaggcgg tttccagttt ttttccagag 780tccatgttgt tggccagcca ggagggggtc
gaacttgaga cccctcctcc gatcgtggat 840tcccccgatc cgccgcagct gactaggcag
cccgagcgct gtgcgggacc tgagactatg 900ccccagctgc tacctgaggt gatcgatctc
acctgtaatg agtctggttt tccacccagc 960gaggatgagg acgaagaggg tgagcagttt
gtgttagatt ctgtggaaca acccgggcga 1020ggatgcaggt cttgtcaata tcaccggaaa
aacacaggag actcccagat tatgtgttct 1080ctgtgttata tgaagatgac ctgtatgttt
atttacagta agtttatcat ctgtgggcag 1140gtgggctata gtgtgggtgg tggtctttgg
ggggtttttt aatatatgtc aggggttatg 1200ctgaagactt ttttattgtg atttttaaag
gtccagtgtc tgagcccgag caagaacctg 1260aaccggagcc tgagccttct cgccccagga
gaaagcctgt aatcttaact agacccagcg 1320caccggtagc gagaggcctc agcagcgcgg
agaccaccga ctccggtgct tcctcatcac 1380ccccggagat tcaccccctg gtgcccctgt
gtcccgttaa gcccgttgcc gtgagagtca 1440gtgggcggcg gtctgctgtg gagtgcattg
aggacttgct ttttgattca caggaacctt 1500tggacttgag cttgaaacgc cccaggcatt
aaacctggtc acctggactg aatgagttga 1560cgcctatgtt tgcttttgaa tgacttaatg
tgtatagata ataaagagtg agataatgtt 1620ttaattgcat ggtgtgttta acttgggcgg
agtctgctgg gtatataagc ttccctgggc 1680taaacttggt tacacttgac ctcatggagg
cctgggagtg tttggagaac tttgccggag 1740ttcgtgcctt gctggacgag agctctaaca
atacctcttg gtggtggagg tatttgtggg 1800gctctcccca gggcaagtta gtttgtagaa
tcaaggagga ttacaagtgg gaatttgaag 1860agcttttgaa atcctgtggt gagctattgg
attctttgaa tctaggccac caggctctct 1920tccaggagaa ggtcatcagg actttggatt
tttccacacc ggggcgcatt gcagccgcgg 1980ttgcttttct agcttttttg aaggatagat
ggagcgaaga gacccacttg agttcgggct 2040acgtcctgga ttttctggcc atgcaactgt
ggagagcatg gatcagacac aagaacaggc 2100tgcaactgtt gtcttccgtc cgcccgttgc
tgattccggc ggaggagcaa caggccgggt 2160cagaggaccg ggcccgtcgg gatccggagg
agagggcacc gaggccgggc gagaggagcg 2220cgctgaacct gggaaccggg ctgagcggcc
atccacatcg ggagtgaatg tcgggcaggt 2280ggtggatctt tttccagaac tgcggcggat
tttgactatt agggaggatg ggcaatttgt 2340taagggtctt aagagggaga ggggggcttc
tgagcataac gaggaggcca gtaatttagc 2400ttttagcttg atgaccagac accgtccaga
gtgcatcact tttcagcaga ttaaggacaa 2460ttgtgccaat gagttggatc tgttgggtca
gaagtatagc atagagcagc tgaccactta 2520ctggctgcag ccgggtgatg atctggagga
agctattagg gtgtatgcta aggtggccct 2580gcggcccgat tgcaagtaca agctcaaggg
gctggtgaat atcaggaatt gttgctacat 2640ttctggcaac ggggcggagg tggagataga
gaccgaagac agggtggctt tcagatgcag 2700catgatgaat atgtggccgg gggtgctggg
catggacggg gtggtgatta tgaatgtgag 2760gttcacgggg cccaacttta acggcacggt
gtttttgggg aacaccaacc tggtcctgca 2820cggggtgagc ttctatgggt ttaacaacac
ctgtgtggag gcctggaccg atgtgaaggt 2880ccgcggttgc gccttttatg gatgttggaa
ggccatagtg agccgcccta agagcaggag 2940ttccattaag aaatgcttgt ttgagaggtg
caccttgggg atcctggccg agggcaactg 3000cagggtgcgc cacaatgtgg cctccgagtg
cggttgcttc atgctagtca agagcgtggc 3060ggtaatcaag cataatatgg tgtgcggcaa
cagcgaggac aaggcctcac agatgctgac 3120ctgcacggat ggcaactgcc acttgctgaa
gaccatccat gtaaccagcc acagccggaa 3180ggcctggccc gtgttcgagc acaacttgct
gacccgctgc tccttgcatc tgggcaacag 3240gcggggggtg ttcctgccct atcaatgcaa
ctttagtcac accaagatct tgctagagcc 3300cgagagcatg tccaaggtga acttgaacgg
ggtgtttgac atgaccatga agatctggaa 3360ggtgctgagg tacgacgaga ccaggtcccg
gtgcagaccc tgcgagtgcg ggggcaagca 3420tatgaggaac cagcccgtga tgctggatgt
gaccgaggag ctgaggacag accacttggt 3480tctggcctgc accagggccg agtttggttc
tagcgatgaa gacacagatt gaggtgggtg 3540agtgggcgtg gcctggggtg gtcatgaaaa
tatataagtt gggggtctta gggtctcttt 3600atttgtgttg cagagaccgc cggagccatg
agcgggagca gcagcagcag cagtagcagc 3660agcgccttgg atggcagcat cgtgagccct
tatttgacga cgcggatgcc ccactgggcc 3720ggggtgcgtc agaatgtgat gggctccagc
atcgacggcc gacccgtcct gcccgcaaat 3780tccgccacgc tgacctatgc gaccgtcgcg
gggacgccgt tggacgccac cgccgccgcc 3840gccgccaccg cagccgcctc ggccgtgcgc
agcctggcca cggactttgc attcctggga 3900ccactggcga caggggctac ttctcgggcc
gctgctgccg ccgttcgcga tgacaagctg 3960accgccctgc tggcgcagtt ggatgcgctt
actcgggaac tgggtgacct ttctcagcag 4020gtcatggccc tgcgccagca ggtctcctcc
ctgcaagctg gcgggaatgc ttctcccaca 4080aatgccgttt aagataaata aaaccagact
ctgtttggat taaagaaaag tagcaagtgc 4140attgctctct ttatttcata attttccgcg
cgcgataggc cctagaccag cgttctcggt 4200cgttgagggt gcggtgtatc ttctccagga
cgtggtagag gtggctctgg acgttgagat 4260acatgggcat gagcccgtcc cgggggtgga
ggtagcacca ctgcagagct tcatgctccg 4320gggtggtgtt gtagatgatc cagtcgtagc
aggagcgctg ggcatggtgc ctaaaaatgt 4380ccttcagcag caggccgatg gccaggggga
ggcccttggt gtaagtgttt acaaaacggt 4440taagttggga agggtgcatt cggggagaga
tgatgtgcat cttggactgt atttttagat 4500tggcgatgtt tccgcccaga tcccttctgg
gattcatgtt gtgcaggacc accagtacag 4560tgtatccggt gcacttgggg aatttgtcat
gcagcttaga gggaaaagcg tggaagaact 4620tggagacgcc tttgtggcct cccagatttt
ccatgcattc gtccatgatg atggcaatgg 4680gcccgcggga ggcagcttgg gcaaagatat
ttctggggtc gctgacgtcg tagttgtgtt 4740ccagggtgag gtcgtcatag gccattttta
caaagcgcgg gcggagggtg cccgactggg 4800ggatgatggt cccctctggc cctggggcgt
agttgccctc gcagatctgc atttcccagg 4860ccttaatctc ggagggggga atcatatcca
cctgcggggc gatgaagaaa acggtttccg 4920gagccgggga gattaactgg gatgagagca
ggtttctaag cagctgtgat tttccacaac 4980cggtgggccc ataaataaca cctataaccg
gttgcagctg gtagtttaga gagctgcagc 5040tgccgtcgtc ccggaggagg ggggccacct
cgttgagcat gtccctgacg cgcatgttct 5100ccccgaccag atccgccaga aggcgctcgc
cgcccaggga cagcagctct tgcaaggaag 5160caaagttttt cagcggcttg aggccgtccg
ccgtgggcat gtttttcagg gtctggctca 5220gcagctccag gcggtcccag agctcggtga
cgtgctctac ggcatctcta tccagcatat 5280ctcctcgttt cgcgggttgg ggcgactttc
gctgtagggc accaagcggt ggtcgtccag 5340cggggccaga gtcatgtcct tccatgggcg
cagggtcctc gtcagggtgg tctgggtcac 5400ggtgaagggg tgcgctccgg gctgagcgct
tgccaaggtg cgcttgaggc tggttctgct 5460ggtgctgaag cgctgccggt cttcgccctg
cgcgtcggcc aggtagcatt tgaccatggt 5520gtcatagtcc agcccctccg cggcgtgtcc
cttggcgcgc agcttgccct tggaggtggc 5580gccgcacgag gggcagagca ggctcttgag
cgcgtagagc ttgggggcga ggaagaccga 5640ttcgggggag taggcgtccg cgccgcagac
cccgcacacg gtctcgcact ccaccagcca 5700ggtgagctcg gggcgcgccg ggtcaaaaac
caggtttccc ccatgctttt tgatgcgttt 5760cttacctcgg gtctccatga ggtggtgtcc
ccgctcggtg acgaagaggc tgtccgtgtc 5820tccgtagacc gacttgaggg gtcttttctc
caggggggtc cctcggtctt cctcgtagag 5880gaactcggac cactctgaga cgaaggcccg
cgtccaggcc aggacgaagg aggctatgtg 5940ggaggggtag cggtcgttgt ccactagggg
gtccaccttc tccaaggtgt gaagacacat 6000gtcgccttcc tcggcgtcca ggaaggtgat
tggcttgtag gtgtaggcca cgtgaccggg 6060ggttcctgac gggggggtat aaaagggggt
gggggcgcgc tcgtcgtcac tctcttccgc 6120atcgctgtct gcgagggcca gctgctgggg
tgagtattcc ctctcgaagg cgggcatgac 6180ctccgcgctg aggttgtcag tttccaaaaa
cgaggaggat ttgatgttca cctgtcccga 6240ggtgatacct ttgagggtac ccgcgtccat
ctggtcagaa aacacgatct ttttattgtc 6300cagcttggtg gcgaacgacc cgtagagggc
gttggagagc agcttggcga tggagcgcag 6360ggtctggttc ttgtccctgt cggcgcgctc
cttggccgcg atgttgagct gcacgtactc 6420gcgcgcgacg cagcgccact cggggaagac
ggtggtgcgc tcgtcgggca ccaggcgcac 6480gcgccagccg cggttgtgca gggtgaccag
gtccacgctg gtggcgacct cgccgcgcag 6540gcgctcgttg gtccagcaga gacggccgcc
cttgcgcgag cagaaggggg gcagggggtc 6600gagctgggtc tcgtccgggg ggtccgcgtc
cacggtgaaa accccggggc gcaggcgcgc 6660gtcgaagtag tctatcttgc aaccttgcat
gtccagcgcc tgctgccagt cgcgggcggc 6720gagcgcgcgc tcgtaggggt tgagcggcgg
gccccagggc atggggtggg tgagtgcgga 6780ggcgtacatg ccgcagatgt catagacgta
gaggggctcc cgcaggaccc cgatgtaggt 6840ggggtagcag cggccgccgc ggatgctggc
gcgcacgtag tcatacagct cgtgcgaggg 6900ggcgaggagg tcggggccca ggttggtgcg
ggcggggcgc tccgcgcgga agacgatctg 6960cctgaagatg gcatgcgagt tggaagagat
ggtggggcgc tggaagacgt tgaagctggc 7020gtcctgcagg ccgacggcgt cgcgcacgaa
ggaggcgtag gagtcgcgca gcttgtgtac 7080cagctcggcg gtgacctgca cgtcgagcgc
gcagtagtcg agggtctcgc ggatgatgtc 7140atatttagcc tgccccttct ttttccacag
ctcgcggttg aggacaaact cttcgcggtc 7200tttccagtac tcttggatcg ggaaaccgtc
cggttccgaa cggtaagagc ctagcatgta 7260gaactggttg acggcctggt aggcgcagca
gcccttctcc acggggaggg cgtaggcctg 7320cgcggccttg cggagcgagg tgtgggtcag
ggcgaaggtg tccctgacca tgactttgag 7380gtactggtgc ttgaagtcgg agtcgtcgca
gccgccccgc tcccagagcg agaagtcggt 7440gcgcttcttg gagcgggggt tgggcagagc
gaaggtgaca tcgttgaaga ggattttgcc 7500cgcgcggggc atgaagttgc gggtgatgcg
gaagggcccc ggcacttcag agcggttgtt 7560gatgacctgg gcggcgagca cgatctcgtc
gaagccgttg atgttgtggc ccacgatgta 7620gagttccagg aagcggggcc ggccctttac
ggtgggcagc ttctttagct cttcgtaggt 7680gagctcctcg ggcgaggcga ggccgtgctc
ggccagggcc cagtccgcga ggtgcgggtt 7740gtctctgagg aaggacttcc agaggtcgcg
ggccaggagg gtctgcaggc ggtctctgaa 7800ggtcctgaac tggcggccca cggccatttt
ttcgggggtg atgcagtaga aggtgagggg 7860gtcttgctgc cagcggtccc agtcgagctg
cagggcgagg tcgcgcgcgg cggtgaccag 7920gcgctcgtcg cccccgaatt tcatgaccag
catgaagggc acgagctgct ttccgaaggc 7980ccccatccaa gtgtaggtct ctacatcgta
ggtgacaaag aggcgctccg tgcgaggatg 8040cgagccgatc gggaagaact ggatctcccg
ccaccagttg gaggagtggc tgttgatgtg 8100gtggaagtag aagtcccgtc gccgggccga
acactcgtgc tggcttttgt aaaagcgagc 8160gcagtactgg cagcgctgca cgggctgtac
ctcatgcacg agatgcacct ttcgcccgcg 8220cacgaggaag ccgaggggaa atctgagccc
cccgcctggc tcgcggcatg gctggttctc 8280ttctactttg gatgcgtgtc cgtctccgtc
tggctcctcg aggggtgtta cggtggagcg 8340gaccaccacg ccgcgcgagc cgcaggtcca
gatatcggcg cgcggcggtc ggagtttgat 8400gacgacatcg cgcagctggg agctgtccat
ggtctggagc tcccgcggcg gcggcaggtc 8460agccgggagt tcttgcaggt tcacctcgca
gagtcgggcc agggcgcggg gcaggtctag 8520gtggtacctg atctctaggg gcgtgttggt
ggcggcgtcg atggcttgca ggagcccgca 8580gccccggggg gcgacgacgg tgccccgcgg
ggtggtggtg gtggtggcgg tgcagctcag 8640aagcggtgcc gcgggcgggc ccccggaggt
agggggggct ccggtcccgc gggcaggggc 8700ggcagcggca cgtcggcgtg gagcgcgggc
aggagttggt gctgtgcccg gaggttgctg 8760gcgaaggcga cgacgcggcg gttgatctcc
tggatctggc gcctctgcgt gaagacgacg 8820ggcccggtga gcttgaacct gaaagagagt
tcgacagaat caatctcggt gtcattgacc 8880gcggcctggc gcaggatctc ctgcacgtct
cccgagttgt cttggtaggc gatctcggcc 8940atgaactgct cgatctcttc ctcctggagg
tctccgcgtc cggcgcgttc cacggtggcc 9000gccaggtcgt tggagatgcg ccccatgagc
tgcgagaagg cgttgagtcc gccctcgttc 9060cagactcggc tgtagaccac gcccccctgg
tcatcgcggg cgcgcatgac cacctgcgcg 9120aggttgagct ccacgtgccg cgcgaagacg
gcgtagttgc gcagacgctg gaagaggtag 9180ttgagggtgg tggcggtgtg ctcggccacg
aagaagttca tgacccagcg gcgcaacgtg 9240gattcgttga tgtcccccaa ggcctccagc
cgttccatgg cctcgtagaa gtccacggcg 9300aagttgaaaa actgggagtt gcgcgccgac
acggtcaact cctcctccag aagacggatg 9360agctcggcga cggtgtcgcg cacctcgcgc
tcgaaggcta tggggatctc ttcctccgct 9420agcatcacca cctcctcctc ttcctcctct
tctggcactt ccatgatggc ttcctcctct 9480tcggggggtg gcggcggcgg cggtggggga
gggggcgctc tgcgccggcg gcggcgcacc 9540gggaggcggt ccacgaagcg cgcgatcatc
tccccgcggc ggcggcgcat ggtctcggtg 9600acggcgcggc cgttctcccg ggggcgcagt
tggaagacgc cgccggacat ctggtgctgg 9660ggcgggtggc cgtgaggcag cgagacggcg
ctgacgatgc atctcaacaa ttgctgcgta 9720ggtacgccgc cgagggacct gagggagtcc
atatccaccg gatccgaaaa cctttcgagg 9780aaggcgtcta accagtcgca gtcgcaaggt
aggctgagca ccgtggcggg cggcgggggg 9840tggggggagt gtctggcgga ggtgctgctg
atgatgtaat tgaagtaggc ggacttgaca 9900cggcggatgg tcgacaggag caccatgtcc
ttgggtccgg cctgctggat gcggaggcgg 9960tcggctatgc cccaggcttc gttctggcat
cggcgcaggt ccttgtagta gtcttgcatg 10020agcctttcca ccggcacctc ttctccttcc
tcttctgctt cttccatgtc tgcttcggcc 10080ctggggcggc gccgcgcccc cctgcccccc
atgcgcgtga ccccgaaccc cctgagcggt 10140tggagcaggg ccaggtcggc gacgacgcgc
tcggccagga tggcctgctg cacctgcgtg 10200agggtggttt ggaagtcatc caagtccacg
aagcggtggt aggcgcccgt gttgatggtg 10260taggtgcagt tggccatgac ggaccagttg
acggtctggt ggcccggttg cgacatctcg 10320gtgtacctga gtcgcgagta ggcgcgggag
tcgaagacgt agtcgttgca agtccgcacc 10380aggtactggt agcccaccag gaagtgcggc
ggcggctggc ggtagagggg ccagcgcagg 10440gtggcggggg ctccgggggc caggtcttcc
agcatgaggc ggtggtaggc gtagatgtac 10500ctggacatcc aggtgatacc cgcggcggtg
gtggaggcgc gcgggaagtc gcgcacccgg 10560ttccagatgt tgcgcagggg cagaaagtgc
tccatggtag gcgtgctctg tccagtcaga 10620cgcgcgcagt cgttgatact ctagaccagg
gaaaacgaaa gccggtcagc gggcactctt 10680ccgtggtctg gtgaatagat cgcaagggta
tcatggcgga gggcctcggt tcgagccccg 10740ggtccgggcc ggacggtccg ccatgatcca
cgcggttacc gcccgcgtgt cgaacccagg 10800tgtgcgacgt cagacaacgg tggagtgttc
cttttggcgt ttttctggcc gggcgccggc 10860gccgcgtaag agactaagcc gcgaaagcga
aagcagtaag tggctcgctc cccgtagccg 10920gagggatcct tgctaagggt tgcgttgcgg
cgaaccccgg ttcgaatccc gtactcgggc 10980cggccggacc cgcggctaag gtgttggatt
ggcctccccc tcgtataaag accccgcttg 11040cggattgact ccggacacgg ggacgagccc
cttttatttt tgctttcccc agatgcatcc 11100ggtgctgcgg cagatgcgcc ccccgcccca
gcagcagcaa caacaccagc aagagcggca 11160gcaacagcag cgggagtcat gcagggcccc
ctcacccacc ctcggcgggc cggccacctc 11220ggcgtccgcg gccgtgtctg gcgcctgcgg
cggcggcggg gggccggctg acgaccccga 11280ggagcccccg cggcgcaggg ccagacacta
cctggacctg gaggagggcg agggcctggc 11340gcggctgggg gcgccgtctc ccgagcgcca
cccgcgggtg cagctgaagc gcgactcgcg 11400cgaggcgtac gtgcctcggc agaacctgtt
cagggaccgc gcgggcgagg agcccgagga 11460gatgcgggac aggaggttca gcgcagggcg
ggagctgcgg caggggctga accgcgagcg 11520gctgctgcgc gaggaggact ttgagcccga
cgcgcggacg gggatcagcc ccgcgcgcgc 11580gcacgtggcg gccgccgacc tggtgacggc
gtacgagcag acggtgaacc aggagatcaa 11640cttccaaaag agtttcaaca accacgtgcg
cacgctggtg gcgcgcgagg aggtgaccat 11700cgggctgatg cacctgtggg actttgtaag
cgcgctggtg cagaacccca acagcaagcc 11760tctgacggcg cagctgttcc tgatagtgca
gcacagcagg gacaacgagg cgtttaggga 11820cgcgctgctg aacatcaccg agcccgaggg
tcggtggctg ctggacctga ttaacatcct 11880gcagagcata gtggtgcagg agcgcagcct
gagcctggcc gacaaggtgg cggccatcaa 11940ctactcgatg ctgagcctgg gcaagtttta
cgcgcgcaag atctaccaga cgccgtacgt 12000gcccatagac aaggaggtga agatcgacgg
tttttacatg cgcatggcgc tgaaggtgct 12060caccctgagc gacgacctgg gcgtgtaccg
caacgagcgc atccacaagg ccgtgagcgt 12120gagccggcgg cgcgagctga gcgaccgcga
gctgatgcac agcctgcagc gggcgctggc 12180gggcgccggc agcggcgaca gggaggcgga
gtcctacttc gatgcggggg cggacctgcg 12240ctgggcgccc agccggcggg ccctggaggc
cgcgggggtc cgcgaggact atgacgagga 12300cggcgaggag gatgaggagt acgagctaga
ggagggcgag tacctggact aaaccgcggg 12360tggtgtttcc ggtagatgca agacccgaac
gtggtggacc cggcgctgcg ggcggctctg 12420cagagccagc cgtccggcct taactcctca
gacgactggc gacaggtcat ggaccgcatc 12480atgtcgctga cggcgcgtaa cccggacgcg
ttccggcagc agccgcaggc caacaggctc 12540tccgccatcc tggaggcggt ggtgcctgcg
cgctcgaacc ccacgcacga gaaggtgctg 12600gccatagtga acgcgctggc cgagaacagg
gccatccgcc cggacgaggc cgggctggtg 12660tacgacgcgc tgctgcagcg cgtggcccgc
tacaacagcg gcaacgtgca gaccaacctg 12720gaccggctgg tgggggacgt gcgcgaggcg
gtggcgcagc gcgagcgcgc ggatcggcag 12780ggcaacctgg gctccatggt ggcgctgaat
gccttcctga gcacgcagcc ggccaacgtg 12840ccgcgggggc aggaagacta caccaacttt
gtgagcgcgc tgcggctgat ggtgaccgag 12900accccccaga gcgaggtgta ccagtcgggc
ccggactact tcttccagac cagcagacag 12960ggcctgcaga cggtgaacct gagccaggct
ttcaagaacc tgcgggggct gtggggcgtg 13020aaggcgccca ccggcgaccg ggcgacggtg
tccagcctgc tgacgcccaa ctcgcgcctg 13080ctgctgctgc tgatcgcgcc gttcacggac
agcggcagcg tgtcccggga cacctacctg 13140gggcacctgc tgaccctgta ccgcgaggcc
atcgggcagg cgcaggtgga cgagcacacc 13200ttccaggaga tcaccagcgt gagccgcgcg
ctggggcagg aggacacgag cagcctggag 13260gcgactctga actacctgct gaccaaccgg
cggcagaaga ttccctcgct gcacagcctg 13320acctccgagg aggagcgcat cttgcgctac
gtgcagcaga gcgtgagcct gaacctgatg 13380cgcgacgggg tgacgcccag cgtggcgctg
gacatgaccg cgcgcaacat ggaaccgggc 13440atgtacgccg cgcaccggcc ttacatcaac
cgcctgatgg actacctgca tcgcgcggcg 13500gccgtgaacc ccgagtactt taccaacgcc
atcctgaacc cgcactggct cccgccgccc 13560gggttctaca gcgggggctt cgaggtcccg
gagaccaacg atggcttcct gtgggacgac 13620atggacgaca gcgtgttctc cccgcggccg
caggcgctgg cggaagcgtc cctgctgcgt 13680cccaagaagg aggaggagga ggaggcgagt
cgccgccgcg gcagcagcgg cgtggcttct 13740ctgtccgagc tgggggcggc agccgccgcg
cgccccgggt ccctgggcgg cagccccttt 13800ccgagcctgg tggggtctct gcacagcgag
cgcaccaccc gccctcggct gctgggcgag 13860gacgagtacc tgaataactc cctgctgcag
ccggtgcggg agaaaaacct gcctcccgcc 13920ttccccaaca acgggataga gagcctggtg
gacaagatga gcagatggaa gacctatgcg 13980caggagcaca gggacgcgcc tgcgctccgg
ccgcccacgc ggcgccagcg ccacgaccgg 14040cagcgggggc tggtgtggga tgacgaggac
tccgcggacg atagcagcgt gctggacctg 14100ggagggagcg gcaacccgtt cgcgcacctg
cgcccccgcc tggggaggat gttttaaaaa 14160aaaaaaaaaa aagcaagaag catgatgcaa
aaattaaata aaactcacca aggccatggc 14220gaccgagcgt tggtttcttg tgttcccttc
agtatgcggc gcgcggcgat gtaccaggag 14280ggacctcctc cctcttacga gagcgtggtg
ggcgcggcgg cggcggcgcc ctcttctccc 14340tttgcgtcgc agctgctgga gccgccgtac
gtgcctccgc gctacctgcg gcctacgggg 14400gggagaaaca gcatccgtta ctcggagctg
gcgcccctgt tcgacaccac ccgggtgtac 14460ctggtggaca acaagtcggc ggacgtggcc
tccctgaact accagaacga ccacagcaat 14520tttttgacca cggtcatcca gaacaatgac
tacagcccga gcgaggccag cacccagacc 14580atcaatctgg atgaccggtc gcactggggc
ggcgacctga aaaccatcct gcacaccaac 14640atgcccaacg tgaacgagtt catgttcacc
aataagttca aggcgcgggt gatggtgtcg 14700cgctcgcaca ccaaggaaga ccgggtggag
ctgaagtacg agtgggtgga gttcgagctg 14760ccagagggca actactccga gaccatgacc
attgacctga tgaacaacgc gatcgtggag 14820cactatctga aagtgggcag gcagaacggg
gtcctggaga gcgacatcgg ggtcaagttc 14880gacaccagga acttccgcct ggggctggac
cccgtgaccg ggctggttat gcccggggtg 14940tacaccaacg aggccttcca tcccgacatc
atcctgctgc ccggctgcgg ggtggacttc 15000acttacagcc gcctgagcaa cctcctgggc
atccgcaagc ggcagccctt ccaggagggc 15060ttcaggatca cctacgagga cctggagggg
ggcaacatcc ccgcgctcct cgatgtggag 15120gcctaccagg atagcttgaa ggaaaatgag
gcgggacagg aggataccgc ccccgccgcc 15180tccgccgccg ccgagcaggg cgaggatgct
gctgacaccg cggccgcgga cggggcagag 15240gccgaccccg ctatggtggt ggaggctccc
gagcaggagg aggacatgaa tgacagtgcg 15300gtgcgcggag acaccttcgt cacccggggg
gaggaaaagc aagcggaggc cgaggccgcg 15360gccgaggaaa agcaactggc ggcagcagcg
gcggcggcgg cgttggccgc ggcggaggct 15420gagtctgagg ggaccaagcc cgccaaggag
cccgtgatta agcccctgac cgaagatagc 15480aagaagcgca gttacaacct gctcaaggac
agcaccaaca ccgcgtaccg cagctggtac 15540ctggcctaca actacggcga cccgtcgacg
ggggtgcgct cctggaccct gctgtgcacg 15600ccggacgtga cctgcggctc ggagcaggtg
tactggtcgc tgcccgacat gatgcaagac 15660cccgtgacct tccgctccac gcggcaggtc
agcaacttcc cggtggtggg cgccgagctg 15720ctgcccgtgc actccaagag cttctacaac
gaccaggccg tctactccca gctcatccgc 15780cagttcacct ctctgaccca cgtgttcaat
cgctttcctg agaaccagat tctggcgcgc 15840ccgcccgccc ccaccatcac caccgtcagt
gaaaacgttc ctgctctcac agatcacggg 15900acgctaccgc tgcgcaacag catcggagga
gtccagcgag tgaccgttac tgacgccaga 15960cgccgcacct gcccctacgt ttacaaggcc
ttgggcatag tctcgccgcg cgtcctttcc 16020agccgcactt tttgagcaac accaccatca
tgtccatcct gatctcaccc agcaataact 16080ccggctgggg actgctgcgc gcgcccagca
agatgttcgg aggggcgagg aagcgttccg 16140agcagcaccc cgtgcgcgtg cgcgggcact
tccgcgcccc ctggggagcg cacaaacgcg 16200gccgcgcggg gcgcaccacc gtggacgacg
ccatcgactc ggtggtggag caggcgcgca 16260actacaggcc cgcggtctct accgtggacg
cggccatcca gaccgtggtg cggggcgcgc 16320ggcggtacgc caagctgaag agccgccgga
agcgcgtggc ccgccgccac cgccgccgac 16380ccggggccgc cgccaaacgc gccgccgcgg
ccctgcttcg ccgggccaag cgcacgggcc 16440gccgcgccgc catgagggcc gcgcgccgct
tggccgccgg catcaccgcc gccaccatgg 16500ccccccgtac ccgaagacgc gcggccgccg
ccgccgccgc cgccatcagt gacatggcca 16560gcaggcgccg gggcaacgtg tactgggtgc
gcgactcggt gaccggcacg cgcgtgcccg 16620tgcgcttccg ccccccgcgg acttgagatg
atgtgaaaaa acaacactga gtctcctgct 16680gttgtgtgta tcccagcggc ggcggcgcgc
gcagcgtcat gtccaagcgc aaaatcaaag 16740aagagatgct ccaggtcgtc gcgccggaga
tctatgggcc cccgaagaag gaagagcagg 16800attcgaagcc ccgcaagata aagcgggtca
aaaagaaaaa gaaagatgat gacgatgccg 16860atggggaggt ggagttcctg cgcgccacgg
cgcccaggcg cccggtgcag tggaagggcc 16920ggcgcgtaaa gcgcgtcctg cgccccggca
ccgcggtggt cttcacgccc ggcgagcgct 16980ccacccggac tttcaagcgc gtctatgacg
aggtgtacgg cgacgaagac ctgctggagc 17040aggccaacga gcgcttcgga gagtttgctt
acgggaagcg tcagcgggcg ctggggaagg 17100aggacctgct ggcgctgccg ctggaccagg
gcaaccccac ccccagtctg aagcccgtga 17160ccctgcagca ggtgctgccg agcagcgcac
cctccgaggc gaagcggggt ctgaagcgcg 17220agggcggcga cctggcgccc accgtgcagc
tcatggtgcc caagcggcag aggctggagg 17280atgtgctgga gaaaatgaaa gtagaccccg
gtctgcagcc ggacatcagg gtccgcccca 17340tcaagcaggt ggcgccgggc ctcggcgtgc
agaccgtgga cgtggtcatc cccaccggca 17400actcccccgc cgccgccacc actaccgctg
cctccacgga catggagaca cagaccgatc 17460ccgccgcagc cgcagccgca gccgccgccg
cgacctcctc ggcggaggtg cagacggacc 17520cctggctgcc gccggcgatg tcagctcccc
gcgcgcgtcg cgggcgcagg aagtacggcg 17580ccgccaacgc gctcctgccc gagtacgcct
tgcatccttc catcgcgccc acccccggct 17640accgaggcta tacctaccgc ccgcgaagag
ccaagggttc cacccgccgt ccccgccgac 17700gcgccgccgc caccacccgc cgccgccgcc
gcagacgcca gcccgcactg gctccagtct 17760ccgtgaggaa agtggcgcgc gacggacaca
ccctggtgct gcccagggcg cgctaccacc 17820ccagcatcgt ttaaaagcct gttgtggttc
ttgcagatat ggccctcact tgccgcctcc 17880gtttcccggt gccgggatac cgaggaggaa
gatcgcgccg caggaggggt ctggccggcc 17940gcggcctgag cggaggcagc cgccgcgcgc
accggcggcg acgcgccacc agccgacgca 18000tgcgcggcgg ggtgctgccc ctgttaatcc
ccctgatcgc cgcggcgatc ggcgccgtgc 18060ccgggatcgc ctccgtggcc ttgcaagcgt
cccagaggca ttgacagact tgcaaacttg 18120caaatatgga aaaaaaaacc ccaataaaaa
agtctagact ctcacgctcg cttggtcctg 18180tgactatttt gtagaatgga agacatcaac
tttgcgtcgc tggccccgcg tcacggctcg 18240cgcccgttcc tgggacactg gaacgatatc
ggcaccagca acatgagcgg tggcgccttc 18300agttggggct ctctgtggag cggcattaaa
agtatcgggt ctgccgttaa aaattacggc 18360tcccgggcct ggaacagcag cacgggccag
atgttgagag acaagttgaa agagcagaac 18420ttccagcaga aggtggtgga gggcctggcc
tccggcatca acggggtggt ggacctggcc 18480aaccaggccg tgcagaataa gatcaacagc
agactggacc cccggccgcc ggtggaggag 18540gtgccgccgg cgctggagac ggtgtccccc
gatgggcgtg gcgagaagcg cccgcggccc 18600gatagggaag agaccactct ggtcacgcag
accgatgagc cgcccccgta tgaggaggcc 18660ctgaagcaag gtctgcccac cacgcggccc
atcgcgccca tggccaccgg ggtggtgggc 18720cgccacaccc ccgccacgct ggacttgcct
ccgcccgccg atgtgccgca gcagcagaag 18780gcggcacagc cgggcccgcc cgcgaccgcc
tcccgttcct ccgccggtcc tctgcgccgc 18840gcggccagcg gcccccgcgg gggggtcgcg
aggcacggca actggcagag cacgctgaac 18900agcatcgtgg gtctgggggt gcggtccgtg
aagcgccgcc gatgctactg aatagcttag 18960ctaacgtgtt gtatgtgtgt atgcgcccta
tgtcgccgcc agaggagctg ctgagtcgcc 19020gccgttcgcg cgcccaccac caccgccact
ccgcccctca agatggcgac cccatcgatg 19080atgccgcagt ggtcgtacat gcacatctcg
ggccaggacg cctcggagta cctgagcccc 19140gggctggtgc agttcgcccg cgccaccgag
agctacttca gcctgagtaa caagtttagg 19200aaccccacgg tggcgcccac gcacgatgtg
accaccgacc ggtctcagcg cctgacgctg 19260cggttcattc ccgtggaccg cgaggacacc
gcgtactcgt acaaggcgcg gttcaccctg 19320gccgtgggcg acaaccgcgt gctggacatg
gcctccacct actttgacat ccgcggggtg 19380ctggaccggg gtcccacttt caagccctac
tctggcaccg cctacaactc cctggccccc 19440aagggcgctc ccaactcctg cgagtgggag
caagaggaaa ctcaggcagt tgaagaagca 19500gcagaagagg aagaagaaga tgctgacggt
caagctgagg aagagcaagc agctaccaaa 19560aagactcatg tatatgctca ggctcccctt
tctggcgaaa aaattagtaa agatggtctg 19620caaataggaa cggacgctac agctacagaa
caaaaaccta tttatgcaga ccctacattc 19680cagcccgaac cccaaatcgg ggagtcccag
tggaatgagg cagatgctac agtcgccggc 19740ggtagagtgc taaagaaatc tactcccatg
aaaccatgct atggttccta tgcaagaccc 19800acaaatgcta atggaggtca gggtgtacta
acggcaaatg cccagggaca gctagaatct 19860caggttgaaa tgcaattctt ttcaacttct
gaaaacgccc gtaacgaggc taacaacatt 19920cagcccaaat tggtgctgta tagtgaggat
gtgcacatgg agaccccgga tacgcacctt 19980tcttacaagc ccgcaaaaag cgatgacaat
tcaaaaatca tgctgggtca gcagtccatg 20040cccaacagac ctaattacat cggcttcaga
gacaacttta tcggcctcat gtattacaat 20100agcactggca acatgggagt gcttgcaggt
caggcctctc agttgaatgc agtggtggac 20160ttgcaagaca gaaacacaga actgtcctac
cagctcttgc ttgattccat gggtgacaga 20220accagatact tttccatgtg gaatcaggca
gtggacagtt atgacccaga tgttagaatt 20280attgaaaatc atggaactga agacgagctc
cccaactatt gtttccctct gggtggcata 20340ggggtaactg acacttacca ggctgttaaa
accaacaatg gcaataacgg gggccaggtg 20400acttggacaa aagatgaaac ttttgcagat
cgcaatgaaa taggggtggg aaacaatttc 20460gctatggaga tcaacctcag tgccaacctg
tggagaaact tcctgtactc caacgtggcg 20520ctgtacctac cagacaagct taagtacaac
ccctccaatg tggacatctc tgacaacccc 20580aacacctacg attacatgaa caagcgagtg
gtggccccgg ggctggtgga ctgctacatc 20640aacctgggcg cgcgctggtc gctggactac
atggacaacg tcaacccctt caaccaccac 20700cgcaatgcgg gcctgcgcta ccgctccatg
ctcctgggca acgggcgcta cgtgcccttc 20760cacatccagg tgccccagaa gttctttgcc
atcaagaacc tcctcctcct gccgggctcc 20820tacacctacg agtggaactt caggaaggat
gtcaacatgg tcctccagag ctctctgggt 20880aacgatctca gggtggacgg ggccagcatc
aagttcgaga gcatctgcct ctacgccacc 20940ttcttcccca tggcccacaa cacggcctcc
acgctcgagg ccatgctcag gaacgacacc 21000aacgaccagt ccttcaatga ctacctctcc
gccgccaaca tgctctaccc catacccgcc 21060aacgccacca acgtccccat ctccatcccc
tcgcgcaact gggcggcctt ccgcggctgg 21120gccttcaccc gcctcaagac caaggagacc
ccctccctgg gctcgggatt cgacccctac 21180tacacctact cgggctccat tccctacctg
gacggcacct tctacctcaa ccacactttc 21240aagaaggtct cggtcacctt cgactcctcg
gtcagctggc cgggcaacga ccgtctgctc 21300acccccaacg agttcgagat caagcgctcg
gtcgacgggg agggctacaa cgtggcccag 21360tgcaacatga ccaaggactg gttcctggtc
cagatgctgg ccaactacaa catcggctac 21420cagggcttct acatcccaga gagctacaag
gacaggatgt actccttctt caggaacttc 21480cagcccatga gccggcaggt ggtggaccag
accaagtaca aggactacca ggaggtgggc 21540atcatccacc agcacaacaa ctcgggcttc
gtgggctacc tcgcccccac catgcgcgag 21600ggacaggcct accccgccaa cttcccctat
ccgctcatag gcaagaccgc ggtcgacagc 21660atcacccaga aaaagttcct ctgcgaccgc
accctctggc gcatcccctt ctccagcaac 21720ttcatgtcca tgggtgcgct ctcggacctg
ggccagaact tgctctacgc caactccgcc 21780cacgccctcg acatgacctt cgaggtcgac
cccatggacg agcccaccct tctctatgtt 21840ctgttcgaag tctttgacgt ggtccgggtc
caccagccgc accgcggcgt catcgagacc 21900gtgtacctgc gtacgccctt ctcggccggc
aacgccacca cctaaagaag caagccgcag 21960tcatcgccgc ctgcatgccg tcgggttcca
ccgagcaaga gctcagggcc atcgtcagag 22020acctgggatg cgggccctat tttttgggca
ccttcgacaa gcgcttccct ggctttgtct 22080ccccacacaa gctggcctgc gccatcgtca
acacggccgg ccgcgagacc gggggcgtgc 22140actggctggc cttcgcctgg aacccgcgct
ccaaaacatg cttcctcttt gaccccttcg 22200gcttttcgga ccagcggctc aagcaaatct
acgagttcga gtacgagggc ttgctgcgtc 22260gcagcgccat cgcctcctcg cccgaccgct
gcgtcaccct cgaaaagtcc acccagaccg 22320tgcaggggcc cgactcggcc gcctgcggtc
tcttctgctg catgtttctg cacgcctttg 22380tgcactggcc tcagagtccc atggaccgca
accccaccat gaacttgctg acgggggtgc 22440ccaactccat gctccagagc ccccaggtcg
agcccaccct gcgccgcaac caggagcagc 22500tctacagctt cctggagcgc cactcgcctt
acttccgccg ccacagcgca cagatcagga 22560gggccacctc cttctgccac ttgcaagaga
tgcaagaagg gtaataacga tgtacacact 22620ttttttctca ataaatggca tctttttatt
tatacaagct ctctggggta ttcatttccc 22680accaccaccc gccgttgtcg ccatctggct
ctatttagaa atcgaaaggg ttctgccggg 22740agtcgccgtg cgccacgggc agggacacgt
tgcgatactg gtagcgggtg ccccacttga 22800actcgggcac caccaggcga ggcagctcgg
ggaagttttc gctccacagg ctgcgggtca 22860gcaccagcgc gttcatcagg tcgggcgccg
agatcttgaa gtcgcagttg gggccgccgc 22920cctgcgcgcg cgagttgcgg tacaccgggt
tgcagcactg gaacaccaac agcgccgggt 22980gcttcacgct ggccagcacg ctgcggtcgg
agatcagctc ggcgtccagg tcctccgcgt 23040tgctcagcgc gaacggggtc atcttgggca
cttgccgccc caggaagggc gcgtgccccg 23100gtttcgagtt gcagtcgcag cgcagcggga
tcagcaggtg cccgtgcccg gactcggcgt 23160tggggtacag cgcgcgcatg aaggcctgca
tctggcggaa ggccatctgg gccttggcgc 23220cctccgagaa gaacatgccg caggacttgc
ccgagaactg gtttgcgggg cagctggcgt 23280cgtgcaggca gcagcgcgcg tcggtgttgg
cgatctgcac cacgttgcgc ccccaccggt 23340tcttcacgat cttggccttg gacgattgct
ccttcagcgc gcgctgcccg ttctcgctgg 23400tcacatccat ctcgatcaca tgttccttgt
tcaccatgct gctgccgtgc agacacttca 23460gctcgccctc cgtctcggtg cagcggtgct
gccacagcgc gcagcccgtg ggctcgaaag 23520acttgtaggt cacctccgcg aaggactgca
ggtacccctg caaaaagcgg cccatcatgg 23580tcacgaaggt cttgttgctg ctgaaggtca
gctgcagccc gcggtgctcc tcgttcagcc 23640aggtcttgca cacggccgcc agcgcctcca
cctggtcggg cagcatcttg aagttcacct 23700tcagctcatt ctccacgtgg tacttgtcca
tcagcgtgcg cgccgcctcc atgcccttct 23760cccaggccga caccagcggc aggctcacgg
ggttcttcac catcaccgtg gccgccgcct 23820ccgccgcgct ttcgctttcc gccccgctgt
tctcttcctc ttcctcctct tcctcgccgc 23880cgcccactcg cagcccccgc accacggggt
cgtcttcctg caggcgctgc accttgcgct 23940tgccgttgcg cccctgcttg atgcgcacgg
gcgggttgct gaagcccacc atcaccagcg 24000cggcctcttc ttgctcgtcc tcgctgtcca
gaatgacctc cggggagggg gggttggtca 24060tcctcagtac cgaggcacgc ttctttttct
tcctgggggc gttcgccagc tccgcggctg 24120cggccgctgc cgaggtcgaa ggccgagggc
tgggcgtgcg cggcaccagc gcgtcctgcg 24180agccgtcctc gtcctcctcg gactcgagac
ggaggcgggc ccgcttcttc gggggcgcgc 24240ggggcggcgg aggcggcggc ggcgacggag
acggggacga gacatcgtcc agggtgggtg 24300gacggcgggc cgcgccgcgt ccgcgctcgg
gggtggtctc gcgctggtcc tcttcccgac 24360tggccatctc ccactgctcc ttctcctata
ggcagaaaga gatcatggag tctctcatgc 24420gagtcgagaa ggaggaggac agcctaaccg
ccccctctga gccctccacc accgccgcca 24480ccaccgccaa tgccgccgcg gacgacgcgc
ccaccgagac caccgccagt accaccctcc 24540ccagcgacgc acccccgctc gagaatgaag
tgctgatcga gcaggacccg ggttttgtga 24600gcggagagga ggatgaggtg gatgagaagg
agaaggagga ggtcgccgcc tcagtgccaa 24660aagaggataa aaagcaagac caggacgacg
cagataagga tgagacagca gtcgggcggg 24720ggaacggaag ccatgatgct gatgacggct
acctagacgt gggagacgac gtgctgctta 24780agcacctgca ccgccagtgc gtcatcgtct
gcgacgcgct gcaggagcgc tgcgaagtgc 24840ccctggacgt ggcggaggtc agccgcgcct
acgagcggca cctcttcgcg ccgcacgtgc 24900cccccaagcg ccgggagaac ggcacctgcg
agcccaaccc gcgtctcaac ttctacccgg 24960tcttcgcggt acccgaggtg ctggccacct
accacatctt tttccaaaac tgcaagatcc 25020ccctctcctg ccgcgccaac cgcacccgcg
ccgacaaaac cctgaccctg cggcagggcg 25080cccacatacc tgatatcgcc tctctggagg
aagtgcccaa gatcttcgag ggtctcggtc 25140gcgacgagaa acgggcggcg aacgctctgc
acggagacag cgaaaacgag agtcactcgg 25200gggtgctggt ggagctcgag ggcgacaacg
cgcgcctggc cgtactcaag cgcagcatag 25260aggtcaccca ctttgcctac ccggcgctca
acctgccccc caaggtcatg agtgtggtca 25320tgggcgagct catcatgcgc cgcgcccagc
ccctggccgc ggatgcaaac ttgcaagagt 25380cctccgagga aggcctgccc gcggtcagcg
acgagcagct ggcgcgctgg ctggagaccc 25440gcgaccccgc gcagctggag gagcggcgca
agctcatgat ggccgcggtg ctggtcaccg 25500tggagctcga gtgtctgcag cgcttcttcg
cggaccccga gatgcagcgc aagctcgagg 25560agaccctgca ctacaccttc cgccagggct
acgtgcgcca ggcctgcaag atctccaacg 25620tggagctctg caacctggtc tcctacctgg
gcatcctgca cgagaaccgc ctcgggcaga 25680acgtcctgca ctccaccctc aaaggggagg
cgcgccgcga ctacatccgc gactgcgcct 25740acctcttcct ctgctacacc tggcagacgg
ccatgggggt ctggcagcag tgcctggagg 25800agcgcaacct caaggagctg gaaaagctcc
tcaagcgcac cctcagggac ctctggacgg 25860gcttcaacga gcgctcggtg gccgccgcgc
tggcggacat catctttccc gagcgcctgc 25920tcaagaccct gcagcagggc ctgcccgact
tcaccagcca gagcatgctg cagaacttca 25980ggactttcat cctggagcgc tcgggcatcc
tgccggccac ttgctgcgcg ctgcccagcg 26040acttcgtgcc catcaagtac agggagtgcc
cgccgccgct ctggggccac tgctacctct 26100tccagctggc caactacctc gcctaccact
cggacctcat ggaagacgtg agcggcgagg 26160gcctgctcga gtgccactgc cgctgcaacc
tctgcacgcc ccaccgctct ctagtctgca 26220acccgcagct gctcagcgag agtcagatta
tcggtacctt cgagctgcag ggtccctcgc 26280ctgacgagaa gtccgcggct ccagggctga
aactcactcc ggggctgtgg acttccgcct 26340acctacgcaa atttgtacct gaggactacc
acgcccacga gatcaggttc tacgaagacc 26400aatcccgccc gcccaaggcg gagctcaccg
cctgcgtcat cacccagggg cacatcctgg 26460gccaattgca agccatcaac aaagcccgcc
gagagttctt gctgaaaaag ggtcgggggg 26520tgtacctgga cccccagtcc ggcgaggagc
taaacccgct acccccgccg ccgccccagc 26580agcgggacct tgcttcccag gatggcaccc
agaaagaagc agcagccgcc gccgccgccg 26640cagccataca tgcttctgga ggaagaggag
gaggactggg acagtcaggc agaggaggtt 26700tcggacgagg agcaggagga gatgatggaa
gactgggagg aggacagcag cctagacgag 26760gaagcttcag aggccgaaga ggtggcagac
gcaacaccat cgccctcggt cgcagccccc 26820tcgccggggc ccctgaaatc ctccgaaccc
agcaccagcg ctataacctc cgctcctccg 26880gcgccggcgc cacccgcccg cagacccaac
cgtagatggg acaccacagg aaccggggtc 26940ggtaagtcca agtgcccgcc gccgccaccg
cagcagcagc agcagcagcg ccagggctac 27000cgctcgtggc gcgggcacaa gaacgccata
gtcgcctgct tgcaagactg cgggggcaac 27060atctctttcg cccgccgctt cctgctattc
caccacgggg tcgcctttcc ccgcaatgtc 27120ctgcattact accgtcatct ctacagcccc
tactgcagcg gcgacccaga ggcggcagcg 27180gcagccacag cggcgaccac cacctaggaa
gatatcctcc gcgggcaaga cagcggcagc 27240agcggccagg agacccgcgg cagcagcggc
gggagcggtg ggcgcactgc gcctctcgcc 27300caacgaaccc ctctcgaccc gggagctcag
acacaggatc ttccccactt tgtatgccat 27360cttccaacag agcagaggcc aggagcagga
gctgaaaata aaaaacagat ctctgcgctc 27420cctcacccgc agctgtctgt atcacaaaag
cgaagatcag cttcggcgca cgctggagga 27480cgcggaggca ctcttcagca aatactgcgc
gctcactctt aaagactagc tccgcgccct 27540tctcgaattt aggcgggaga aaactacgtc
atcgccggcc gccgcccagc ccgcccagcc 27600gagatgagca aagagattcc cacgccatac
atgtggagct accagccgca gatgggactc 27660gcggcgggag cggcccagga ctactccacc
cgcatgaact acatgagcgc gggaccccac 27720atgatctcac aggtcaacgg gatccgcgcc
cagcgaaacc aaatactgct ggaacaggcg 27780gccatcaccg ccacgccccg ccataatctc
aacccccgaa attggcccgc cgccctcgtg 27840taccaggaaa ccccctccgc caccaccgta
ctacttccgc gtgacgccca ggccgaagtc 27900cagatgacta actcaggggc gcagctcgcg
ggcggctttc gtcacggggc gcggccgctc 27960cgaccaggta taagacacct gatgatcaga
ggccgaggta tccagctcaa cgacgagtcg 28020gtgagctctt cgctcggtct ccgtccggac
ggaactttcc agctcgccgg atccggccgc 28080tcttcgttca cgccccgcca ggcgtacctg
actctgcaga cctcgtcctc ggagccccgc 28140tccggcggca tcggaaccct ccagttcgtg
gaggagttcg tgccctcggt ctacttcaac 28200cccttctcgg gacctcccgg acgctacccc
gaccagttca ttccgaactt tgacgcggtg 28260aaggactcgg cggacggcta cgactgaatg
tcaggtgtcg aggcagagca gcttcgcctg 28320agacacctcg agcactgccg ccgccacaag
tgcttcgccc gcggttctgg tgagttctgc 28380tactttcagc tacccgagga gcataccgag
gggccggcgc acggcgtccg cctgaccacc 28440cagggcgagg ttacctgttc cctcatccgg
gagtttaccc tccgtcccct gctagtggag 28500cgggagcggg gtccctgtgt cctaactatc
gcctgcaact gccctaaccc tggattacat 28560caagatcttt gctgtcatct ctgtgctgag
tttaataaac gctgagatca gaatctactg 28620gggctcctgt cgccatcctg tgaacgccac
cgtcttcacc caccccgacc aggcccaggc 28680gaacctcacc tgcggtctgc atcggagggc
caagaagtac ctcacctggt acttcaacgg 28740cacccccttt gtggtttaca acagcttcga
cggggacgga gtctccctga aagaccagct 28800ctccggtctc agctactcca tccacaagaa
caccaccctc caactcttcc ctccctacct 28860gccgggaacc tacgagtgcg tcaccggccg
ctgcacccac ctcacccgcc tgatcgtaaa 28920ccagagcttt ccgggaacag ataactccct
cttccccaga acaggaggtg agctcaggaa 28980actccccggg gaccagggcg gagacgtacc
ttcgaccctt gtggggttag gattttttat 29040taccgggttg ctggctcttt taatcaaagt
ttccttgaga tttgttcttt ccttctacgt 29100gtatgaacac ctcaacctcc aataactcta
ccctttcttc ggaatcaggt gacttctctg 29160aaatcgggct tggtgtgctg cttactctgt
tgattttttt ccttatcata ctcagccttc 29220tgtgcctcag gctcgccgcc tgctgcgcac
acatctatat ctactgctgg ttgctcaagt 29280gcaggggtcg ccacccaaga tgaacaggta
catggtccta tcgatcctag gcctgctggc 29340cctggcggcc tgcagcgccg ccaaaaaaga
gattaccttt gaggagcccg cttgcaatgt 29400aactttcaag cccgagggtg accaatgcac
caccctcgtc aaatgcgtta ccaatcatga 29460gaggctgcgc atcgactaca aaaacaaaac
tggccagttt gcggtctata gtgtgtttac 29520gcccggagac ccctctaact actctgtcac
cgtcttccag ggcggacagt ctaagatatt 29580caattacact ttcccttttt atgagttatg
cgatgcggtc atgtacatgt caaaacagta 29640caacctgtgg cctccctctc cccaggcgtg
tgtggaaaat actgggtctt actgctgtat 29700ggctttcgca atcactacgc tcgctctaat
ctgcacggtg ctatacataa aattcaggca 29760gaggcgaatc tttatcgatg aaaagaaaat
gccttgatcg ctaacaccgg ctttctatct 29820gcagaatgaa tgcaatcacc tccctactaa
tcaccaccac cctccttgcg attgcccatg 29880ggttgacacg aatcgaagtg ccagtggggt
ccaatgtcac catggtgggc cccgccggca 29940attccaccct catgtgggaa aaatttgtcc
gcaatcaatg ggttcatttc tgctctaacc 30000gaatcagtat caagcccaga gccatctgcg
atgggcaaaa tctaactctg atcaatgtgc 30060aaatgatgga tgctgggtac tattacgggc
agcggggaga aatcattaat tactggcgac 30120cccacaagga ctacatgctg catgtagtcg
aggcacttcc cactaccacc cccactacca 30180cctctcccac caccaccacc actactacta
ctactactac tactactact actaccacta 30240ccgctgcccg ccatacccgc aaaagcacca
tgattagcac aaagccccct cgtgctcact 30300cccacgccgg cgggcccatc ggtgcgacct
cagaaaccac cgagctttgc ttctgccaat 30360gcactaacgc cagcgctcat gaactgttcg
acctggagaa tgaggatgtc cagcagagct 30420ccgcttgcct gacccaggag gctgtggagc
ccgttgccct gaagcagatc ggtgattcaa 30480taattgactc ttcttctttt gccactcccg
aataccctcc cgattctact ttccacatca 30540cgggtaccaa agaccctaac ctctctttct
acctgatgct gctgctctgt atctctgtgg 30600tctcttccgc gctgatgtta ctggggatgt
tctgctgcct gatctgccgc agaaagagaa 30660aagctcgctc tcagggccaa ccactgatgc
ccttccccta ccccccggat tttgcagata 30720acaagatatg agctcgctgc tgacactaac
cgctttacta gcctgcgctc taacccttgt 30780cgcttgcgac tcgagattcc acaatgtcac
agctgtggca ggagaaaatg ttactttcaa 30840ctccacggcc gatacccagt ggtcgtggag
tggctcaggt agctacttaa ctatctgcaa 30900tagctccact tcccccggca tatccccaac
caagtaccaa tgcaatgcca gcctgttcac 30960cctcatcaac gcttccaccc tggacaatgg
actctatgta ggctatgtac cctttggtgg 31020gcaaggaaag acccacgctt acaacctgga
agttcgccag cccagaacca ctacccaagc 31080ttctcccacc accaccacca ccaccaccat
caccagcagc agcagcagca gcagccacag 31140cagcagcagc agattattga ctttggtttt
ggccagctca tctgccgcta cccaggccat 31200ctacagctct gtgcccgaaa ccactcagat
ccaccgccca gaaacgacca ccgccaccac 31260cctacacacc tccagcgatc agatgccgac
caacatcacc cccttggctc ttcaaatggg 31320acttacaagc cccactccaa aaccagtgga
tgcggccgag gtctccgccc tcgtcaatga 31380ctgggcgggg ctgggaatgt ggtggttcgc
cataggcatg atggcgctct gcctgcttct 31440gctctggctc atctgctgcc tccaccgcag
gcgagccaga ccccccatct atagacccat 31500cattgtcctg aaccccgata atgatgggat
ccatagattg gatggcctga aaaacctact 31560tttttctttt acagtatgat aaattgagac
atgcctcgca ttttcttgta catgttcctt 31620ctcccacctt ttctggggtg ttctacgctg
gccgctgtgt ctcacctgga ggtagactgc 31680ctctcaccct tcactgtcta cctgctttac
ggattggtca ccctcactct catctgcagc 31740ctaatcacag taatcatcgc cttcatccag
tgcattgatt acatctgtgt gcgcctcgca 31800tacttcagac accacccgca gtaccgagac
aggaacattg cccaacttct aagactgctc 31860taatcatgca taagactgtg atctgccttc
tgatcctctg catcctgccc accctcacct 31920cctgccagta caccacaaaa tctccgcgca
aaagacatgc ctcctgccgc ttcacccaac 31980tgtggaatat acccaaatgc tacaacgaaa
agagcgagct ctccgaagct tggctgtatg 32040gggtcatctg tgtcttagtt ttctgcagca
ctgtctttgc cctcataatc tacccctact 32100ttgatttggg atggaacgcg atcgatgcca
tgaattaccc cacctttccc gcacccgaga 32160taattccact gcgacaagtt gtacccgttg
tcgttaatca acgcccccca tcccctacgc 32220ccactgaaat cagctacttt aacctaacag
gcggagatga ctgacgccct agatctagaa 32280atggacggca tcagtaccga gcagcgtctc
ctagagaggc gcaggcaggc ggctgagcaa 32340gagcgcctca atcaggagct ccgagatctc
gttaacctgc accagtgcaa aagaggcatc 32400ttttgtctgg taaagcaggc caaagtcacc
tacgagaaga ccggcaacag ccaccgcctc 32460agttacaaat tgcccaccca gcgccagaag
ctggtgctca tggtgggtga gaatcccatc 32520accgtcaccc agcactcggt agagaccgag
gggtgtctgc actccccctg tcggggtcca 32580gaagacctct gcaccctggt aaagaccctg
tgcggtctca gagatttagt cccctttaac 32640taatcaaaca ctggaatcaa taaaaagaat
cacttactta aaatcagaca gcaggtctct 32700gtccagttta ttcagcagca cctccttccc
ctcctcccaa ctctggtact ccaaacgcct 32760tctggcggca aacttcctcc acaccctgaa
gggaatgtca gattcttgct cctgtccctc 32820cgcacccact atcttcatgt tgttgcagat
gaagcgcacc aaaacgtctg acgagagctt 32880caaccccgtg tacccctatg acacggaaag
cggccctccc tccgtccctt tcctcacccc 32940tcccttcgtg tctcccgatg gattccaaga
aagtcccccc ggggtcctgt ctctgaacct 33000ggccgagccc ctggtcactt cccacggcat
gctcgccctg aaaatgggaa gtggcctctc 33060cctggacgac gctggcaacc tcacctctca
agatatcacc accgctagcc ctcccctcaa 33120aaaaaccaag accaacctca gcctagaaac
ctcatccccc ctaactgtga gcacctcagg 33180cgccctcacc gtagcagccg ccgctcccct
ggcggtggcc ggcacctccc tcaccatgca 33240atcagaggcc cccctgacag tacaggatgc
aaaactcacc ctggccacca aaggccccct 33300gaccgtgtct gaaggcaaac tggccttgca
aacatcggcc ccgctgacgg ccgctgacag 33360cagcaccctc acagtcagtg ccacaccacc
ccttagcaca agcaatggca gcttgggtat 33420tgacatgcaa gcccccattt acaccaccaa
tggaaaacta ggacttaact ttggcgctcc 33480cctgcatgtg gtagacagcc taaatgcact
gactgtagtt actggccaag gtcttacgat 33540aaacggaaca gccctacaaa ctagagtctc
aggtgccctc aactatgaca catcaggaaa 33600cctagaattg agagctgcag ggggtatgcg
agttgatgca aatggtcaac ttatccttga 33660tgtagcttac ccatttgatg cacaaaacaa
tctcagcctt aggcttggac agggacccct 33720gtttgttaac tctgcccaca acttggatgt
taactacaac agaggcctct acctgttcac 33780atctggaaat accaaaaagc tagaagttaa
tatcaaaaca gccaagggtc tcatttatga 33840tgacactgct atagcaatca atgcgggtga
tgggctacag tttgactcag gctcagatac 33900aaatccatta aaaactaaac ttggattagg
actggattat gactccagca gagccataat 33960tgctaaactg ggaactggcc taagctttga
caacacaggt gccatcacag taggcaacaa 34020aaatgatgac aagcttacct tgtggaccac
accagaccca tcccctaact gtagaatcta 34080ttcagagaaa gatgctaaat tcacacttgt
tttgactaaa tgcggcagtc aggtgttggc 34140cagcgtttct gttttatctg taaaaggtag
ccttgcgccc atcagtggca cagtaactag 34200tgctcagatt gtcctcagat ttgatgaaaa
tggagttcta ctaagcaatt cttcccttga 34260ccctcaatac tggaactaca gaaaaggtga
ccttacagag ggcactgcat ataccaacgc 34320agtgggattt atgcccaacc tcacagcata
cccaaaaaca cagagccaaa ctgctaaaag 34380caacattgta agtcaggttt acttgaatgg
ggacaaatcc aaacccatga ccctcaccat 34440taccctcaat ggaactaatg aaacaggaga
tgccacagta agcacttact ccatgtcatt 34500ctcatggaac tggaatggaa gtaattacat
taatgaaacg ttccaaacca actccttcac 34560cttctcctac atcgcccaag aataaaaagc
atgacgctgt tgatttgatt caatgtgttt 34620ctgttttatt ttcaagcaca acaaaatcat
tcaagtcatt cttccatctt agcttaatag 34680acacagtagc ttaatagacc cagtagtgca
aagccccatt ctagcttata gatcagacag 34740tgataattaa ccaccaccac caccatacct
tttgattcag gaaatcatga tcatcacagg 34800atcctagtct tcaggccgcc ccctccctcc
caagacacag aatacacagt cctctccccc 34860cgactggctt taaataacac catctggttg
gtcacagaca tgttcttagg ggttatattc 34920cacacggtct cctgccgcgc caggcgctcg
tcggtgatgt tgataaactc tcccggcagc 34980tcgctcaagt tcacgtcgct gtccagcggc
tgaacctccg gctgacgcga taactgtgcg 35040accggctgct ggacgaacgg aggccgcgcc
tacaaggggg tagagtcata atcctcggtc 35100aggatagggc ggtgatgcag cagcagcgag
cgaaacatct gctgccgccg ccgctccgtc 35160cggcaggaaa acaacacgcc ggtggtctcc
tccgcgataa tccgcaccgc ccgcagcatc 35220agcttcctcg ttctccgcgc gcagcacctc
acccttatct cgctcaaatc ggcgcagtag 35280gtacagcaca gcaccacgat gttattcatg
atcccacagt gcagggcgct gtatccaaag 35340ctcatgccgg gaaccaccgc ccccacgtgg
ccatcgtacc acaagcgcac gtaaatcaag 35400tgtcgacccc tcatgaacgc gctggacaca
aacattactt ccttgggcat gttgtaattc 35460accacctccc ggtaccagat aaacctctgg
ttgaacaggg caccttccac caccatcctg 35520aaccaagagg ccagaacctg cccaccggct
atgcactgca gggaacccgg gttggaacaa 35580tgacaatgca gactccaagg ctcgtaaccg
tggatcatcc ggctgctgaa ggcatcgatg 35640ttggcacaac acagacacac gtgcatgcac
tttctcatga ttagcagctc ttccctcgtc 35700aggatcatat cccaaggaat aacccattct
tgaatcaacg taaaacccac acagcaggga 35760aggcctcgca cataactcac gttgtgcatg
gtcagcgtgt tgcattccgg aaacagcgga 35820tgatcctcca gtatcgaggc gcgggtctcc
ttctcacagg gaggtaaagg gtccctgctg 35880tacggactgc gccgggacga ccgagatcgt
gttgagcgta gtgtcatgga aaagggaacg 35940ccggacgtgg tcatacttct tgaagcagaa
ccaggttcgc gcgtggcagg cctccttgcg 36000tctgcggtct cgccgtctag ctcgctccgt
gtgatagttg tagtacagcc actcccgcag 36060agcgtcgagg cgcaccctgg cttccggatc
tatgtagact ccgtcttgca ccgcggccct 36120gataatatcc accaccgtag aataagcaac
acccagccaa gcaatacact cgctctgcga 36180gcggcagaca ggaggagcgg gcagagatgg
gagaaccatg ataaaaaact ttttttaaag 36240aatattttcc aattcttcga aagtaagatc
tatcaagtgg cagcgctccc ctccactggc 36300gcggtcaaac tctacggcca aagcacagac
aacggcattt ctaagatgtt ccttaatggc 36360gtccaaaaga cacaccgctc tcaagttgca
gtaaactatg aatgaaaacc catccggctg 36420attttccaat atagacgcgc cggcagcgtc
caccaaaccc agataatttt cttctctcca 36480gcggtttacg atctgtctaa gcaaatccct
tatatcaagt ccgaccatgc caaaaatctg 36540ctcaagagcg ccctccacct tcatgtacaa
gcagcgcatc atgattgcaa aaattcaggt 36600tcttcagaga cctgtataag attcaaaatg
ggaacattaa caaaaattcc tctgtcgcgc 36660agatcccttc gcagggcaag ctgaacataa
tcagacaggt ccgaacggac cagtgaggcc 36720aaatccccac caggaaccag atccagagac
cctatactga ttatgacgcg catactcggg 36780gctatgctga ccagcgtagc gccgatgtag
gcgtgctgca tgggcggcga gataaaatgc 36840aaagtgctgg ttaaaaaatc aggcaaagcc
tcgcgcaaaa aagctaacac atcataatca 36900tgctcatgca ggtagttgca ggtaagctca
ggaaccaaaa cggaataaca cacgattttc 36960ctctcaaaca tgacttcgcg gatactgcgt
aaaacaaaaa attataaata aaaaattaat 37020taaataactt aaacattgga agcctgtctc
acaacaggaa aaaccacttt aatcaacata 37080agacgggcca cgggcatgcc ggcatagccg
taaaaaaatt ggtccccgtg attaacaagt 37140accacagaca gctccccggt catgtcgggg
gtcatcatgt gagactctgt atacacgtct 37200ggattgtgaa catcagacaa acaaagaaat
cgagccacgt agcccggagg tataatcacc 37260cgcaggcgga ggtacagcaa aacgaccccc
ataggaggaa tcacaaaatt agtaggagaa 37320aaaaatacat aaacaccaga aaaaccctgt
tgctgaggca aaatagcgcc ctcccgatcc 37380aaaacaacat aaagcgcttc cacaggagca
gccataacaa agacccgagt cttaccagta 37440aaagaaaaaa gatctctcaa cgcagcacca
gcaccaacac ttcgcagtgt aaaaggccaa 37500gtgccgagag agtatatata ggaataaaaa
gtgacgtaaa cgggcaaagt ccaaaaaacg 37560cccagaaaaa ccgcacgcga acctacgccc
cgaaacgaaa gccaaaaaac actagacact 37620cccttccggc gtcaacttcc gctttcccac
gctacgtcac ttcccccggt caaacaaact 37680acatatcccg aacttccaag tcgccacgcc
caaaacaccg cctacacctc cccgcccgcc 37740ggcccgcccc cggacccgcc tcccgccccg
cgccgcccat ctcattatca tattggcttc 37800aatccaaaat aaggtatatt attgatgatg
37830
11 37559 DNA Artificial Sequence
source /note="Description of Artificial Sequence Synthetic polynucleotide"
11 catcatcaat aatatacctt attttggatt gaagccaata tgataatgag
atgggcggcg 60cggggcgggg cgcggggcgg gaggcgggtt tgggggcggg ccggcgggcg
gggcggtgtg 120gcggaagtgg actttgtaag tgtggcggat gtgacttgct agtgccgggc
gcggtaaaag 180tgacgttttc cgtgcgcgac aacgcccccg ggaagtgaca tttttcccgc
ggtttttacc 240ggatgttgta gtgaatttgg gcgtaaccaa gtaagatttg gccattttcg
cgggaaaact 300gaaacgggga agtgaaatct gattaatttt gcgttagtca taccgcgtaa
tatttgtcta 360gggccgaggg actttggccg attacgtgga ggactcgccc aggtgttttt
tgaggtgaat 420ttccgcgttc cgggtcaaag tctgcgtttt attattatag gatatcccat
tgcatacgtt 480gtatccatat cataatatgt acatttatat tggctcatgt ccaacattac
cgccatgttg 540acattgatta ttgactagtt attaatagta atcaattacg gggtcattag
ttcatagccc 600atatatggag ttccgcgtta cataacttac ggtaaatggc ccgcctggct
gaccgcccaa 660cgacccccgc ccattgacgt caataatgac gtatgttccc atagtaacgc
caatagggac 720tttccattga cgtcaatggg tggagtattt acggtaaact gcccacttgg
cagtacatca 780agtgtatcat atgccaagta cgccccctat tgacgtcaat gacggtaaat
ggcccgcctg 840gcattatgcc cagtacatga ccttatggga ctttcctact tggcagtaca
tctacgtatt 900agtcatcgct attaccatgg tgatgcggtt ttggcagtac atcaatgggc
gtggatagcg 960gtttgactca cggggatttc caagtctcca ccccattgac gtcaatggga
gtttgttttg 1020gcaccaaaat caacgggact ttccaaaatg tcgtaacaac tccgccccat
tgacgcaaat 1080gggcggtagg cgtgtacggt gggaggtcta tataagcaga gctctcccta
tcagtgatag 1140agatctccct atcagtgata gagatcgtcg acgagctcgt ttagtgaacc
gtcagatcgc 1200ctggagacgc catccacgct gttttgacct ccatagaaga caccgggacc
gatccagcct 1260ccgcggccgg gaacggtgca ttggaacgcg gattccccgt gccaagagtg
agatcttccg 1320tttatctagg taccagatat cgccaccatg gaactgctga tcctgaaggc
caacgccatc 1380accaccatcc tgaccgccgt gaccttctgc ttcgccagcg gccagaacat
caccgaggaa 1440ttctaccaga gcacctgtag cgccgtgagc aagggctacc tgagcgccct
gagaaccggc 1500tggtacacca gcgtgatcac catcgagctg agcaacatca aagaaaacaa
gtgcaacggc 1560accgacgcca aagtgaagct gatcaagcag gaactggaca agtacaagaa
cgccgtgacc 1620gagctgcagc tgctgatgca gagcaccccc gccaccaaca accgggccag
acgggagctg 1680ccccggttca tgaactacac cctgaacaac gccaaaaaga ccaacgtgac
cctgagcaag 1740aagcggaagc ggcggttcct gggctttctg ctgggcgtgg gcagcgccat
tgccagcggc 1800gtggccgtgt ctaaggtgct gcacctggaa ggcgaagtga acaagatcaa
gagcgccctg 1860ctgagcacca acaaggccgt ggtgtccctg agcaacggcg tgagcgtgct
gaccagcaag 1920gtgctggatc tgaagaacta catcgacaag cagctgctgc ccatcgtgaa
caagcagagc 1980tgcagcatca gcaacatcga gacagtgatc gagttccagc agaagaacaa
ccggctgctg 2040gaaatcaccc gggagttcag cgtgaacgcc ggcgtgacca cccctgtgtc
cacctacatg 2100ctgaccaaca gcgagctgct gagcctgatc aacgacatgc ccatcaccaa
cgaccagaaa 2160aagctgatga gcaacaacgt gcagatcgtg cggcagcaga gctactccat
catgtccatc 2220atcaaagaag aggtgctggc ctacgtggtg cagctgcccc tgtacggcgt
gatcgacacc 2280ccctgctgga agctgcacac cagccccctg tgcaccacca acaccaaaga
gggcagcaac 2340atctgcctga cccggaccga cagaggctgg tactgcgaca acgccggcag
cgtgtcattc 2400tttccacagg ccgagacatg caaggtgcag agcaaccggg tgttctgcga
caccatgaac 2460agcctgaccc tgccctccga agtgaacctg tgcaacgtgg acatcttcaa
ccccaagtac 2520gactgcaaga tcatgacctc caagaccgac gtgtccagct ccgtgatcac
ctccctgggc 2580gccatcgtgt cctgctacgg caagaccaag tgcaccgcca gcaacaagaa
ccggggcatc 2640atcaagacct tcagcaacgg ctgcgactac gtgtccaaca agggggtgga
caccgtgtcc 2700gtgggcaaca ccctgtacta cgtgaacaaa caggaaggca agagcctgta
cgtgaagggc 2760gagcccatca tcaacttcta cgaccccctg gtgttcccca gcgacgagtt
cgacgccagc 2820atcagccagg tgaacgagaa gatcaaccag agcctggcct tcatccggaa
gtccgacgag 2880ctgctgcaca atgtgaatgc cggcaagtcc accaccaacc ggaagcggag
agcccctgtg 2940aagcagaccc tgaacttcga cctgctgaag ctggccggcg acgtggagag
caatcccggc 3000cctatggccc tgagcaaagt gaaactgaac gatacactga acaaggacca
gctgctgtcc 3060agcagcaagt acaccatcca gcggagcacc ggcgacagca tcgatacccc
caactacgac 3120gtgcagaagc acatcaacaa gctgtgcggc atgctgctga tcacagagga
cgccaaccac 3180aagttcaccg gcctgatcgg catgctgtac gccatgagcc ggctgggccg
ggaggacacc 3240atcaagatcc tgcgggacgc cggctaccac gtgaaggcca atggcgtgga
cgtgaccaca 3300caccggcagg acatcaacgg caaagaaatg aagttcgagg tgctgaccct
ggccagcctg 3360accaccgaga tccagatcaa tatcgagatc gagagccgga agtcctacaa
gaaaatgctg 3420aaagaaatgg gcgaggtggc ccccgagtac agacacgaca gccccgactg
cggcatgatc 3480atcctgtgta tcgccgccct ggtgatcaca aagctggccg ctggcgacag
atctggcctg 3540acagccgtga tcagacgggc caacaatgtg ctgaagaacg agatgaagcg
gtacaagggc 3600ctgctgccca aggacattgc caacagcttc tacgaggtgt tcgagaagta
cccccacttc 3660atcgacgtgt tcgtgcactt cggcattgcc cagagcagca ccagaggcgg
ctccagagtg 3720gagggcatct tcgccggcct gttcatgaac gcctacggcg ctggccaggt
gatgctgaga 3780tggggcgtgc tggccaagag cgtgaagaac atcatgctgg gccacgccag
cgtgcaggcc 3840gagatggaac aggtggtgga ggtgtacgag tacgcccaga agctgggcgg
agaggccggc 3900ttctaccaca tcctgaacaa ccctaaggcc tccctgctgt ccctgaccca
gttcccccac 3960ttctccagcg tggtgctggg aaatgccgcc ggactgggca tcatgggcga
gtaccggggc 4020acccccagaa accaggacct gtacgacgcc gccaaggcct acgccgagca
gctgaaagaa 4080aacggcgtga tcaactacag cgtgctggac ctgaccgctg aggaactgga
agccatcaag 4140caccagctga accccaagga caacgacgtg gagctgggag gcggaggatc
tggcggcgga 4200ggcatgagca gacggaaccc ctgcaagttc gagatccggg gccactgcct
gaacggcaag 4260cggtgccact tcagccacaa ctacttcgag tggccccctc atgctctgct
ggtgcggcag 4320aacttcatgc tgaaccggat cctgaagtcc atggacaaga gcatcgacac
cctgagcgag 4380atcagcggag ccgccgagct ggacagaacc gaggaatatg ccctgggcgt
ggtgggagtg 4440ctggaaagct acatcggctc catcaacaac atcacaaagc agagcgcctg
cgtggccatg 4500agcaagctgc tgacagagct gaacagcgac gacatcaaga agctgaggga
caacgaggaa 4560ctgaacagcc ccaagatccg ggtgtacaac accgtgatca gctacattga
gagcaaccgc 4620aagaacaaca agcagaccat ccatctgctg aagcggctgc ccgccgacgt
gctgaaaaag 4680accatcaaga acaccctgga catccacaag tccatcacca tcaacaatcc
caaagaaagc 4740accgtgtctg acaccaacga tcacgccaag aacaacgaca ccacctgatg
agcggccgcg 4800atctgctgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg
tgccttcctt 4860gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa
ttgcatcgca 4920ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca
gcaaggggga 4980ggattgggaa gacaatagca ggcatgctgg ggatgcggtg ggctctatgg
ccgatcagcg 5040atcgctgagg tgggtgagtg ggcgtggcct ggggtggtca tgaaaatata
taagttgggg 5100gtcttagggt ctctttattt gtgttgcaga gaccgccgga gccatgagcg
ggagcagcag 5160cagcagcagt agcagcagcg ccttggatgg cagcatcgtg agcccttatt
tgacgacgcg 5220gatgccccac tgggccgggg tgcgtcagaa tgtgatgggc tccagcatcg
acggccgacc 5280cgtcctgccc gcaaattccg ccacgctgac ctatgcgacc gtcgcgggga
cgccgttgga 5340cgccaccgcc gccgccgccg ccaccgcagc cgcctcggcc gtgcgcagcc
tggccacgga 5400ctttgcattc ctgggaccac tggcgacagg ggctacttct cgggccgctg
ctgccgccgt 5460tcgcgatgac aagctgaccg ccctgctggc gcagttggat gcgcttactc
gggaactggg 5520tgacctttct cagcaggtca tggccctgcg ccagcaggtc tcctccctgc
aagctggcgg 5580gaatgcttct cccacaaatg ccgtttaaga taaataaaac cagactctgt
ttggattaaa 5640gaaaagtagc aagtgcattg ctctctttat ttcataattt tccgcgcgcg
ataggcccta 5700gaccagcgtt ctcggtcgtt gagggtgcgg tgtatcttct ccaggacgtg
gtagaggtgg 5760ctctggacgt tgagatacat gggcatgagc ccgtcccggg ggtggaggta
gcaccactgc 5820agagcttcat gctccggggt ggtgttgtag atgatccagt cgtagcagga
gcgctgggca 5880tggtgcctaa aaatgtcctt cagcagcagg ccgatggcca gggggaggcc
cttggtgtaa 5940gtgtttacaa aacggttaag ttgggaaggg tgcattcggg gagagatgat
gtgcatcttg 6000gactgtattt ttagattggc gatgtttccg cccagatccc ttctgggatt
catgttgtgc 6060aggaccacca gtacagtgta tccggtgcac ttggggaatt tgtcatgcag
cttagaggga 6120aaagcgtgga agaacttgga gacgcctttg tggcctccca gattttccat
gcattcgtcc 6180atgatgatgg caatgggccc gcgggaggca gcttgggcaa agatatttct
ggggtcgctg 6240acgtcgtagt tgtgttccag ggtgaggtcg tcataggcca tttttacaaa
gcgcgggcgg 6300agggtgcccg actgggggat gatggtcccc tctggccctg gggcgtagtt
gccctcgcag 6360atctgcattt cccaggcctt aatctcggag gggggaatca tatccacctg
cggggcgatg 6420aagaaaacgg tttccggagc cggggagatt aactgggatg agagcaggtt
tctaagcagc 6480tgtgattttc cacaaccggt gggcccataa ataacaccta taaccggttg
cagctggtag 6540tttagagagc tgcagctgcc gtcgtcccgg aggagggggg ccacctcgtt
gagcatgtcc 6600ctgacgcgca tgttctcccc gaccagatcc gccagaaggc gctcgccgcc
cagggacagc 6660agctcttgca aggaagcaaa gtttttcagc ggcttgaggc cgtccgccgt
gggcatgttt 6720ttcagggtct ggctcagcag ctccaggcgg tcccagagct cggtgacgtg
ctctacggca 6780tctctatcca gcatatctcc tcgtttcgcg ggttggggcg actttcgctg
tagggcacca 6840agcggtggtc gtccagcggg gccagagtca tgtccttcca tgggcgcagg
gtcctcgtca 6900gggtggtctg ggtcacggtg aaggggtgcg ctccgggctg agcgcttgcc
aaggtgcgct 6960tgaggctggt tctgctggtg ctgaagcgct gccggtcttc gccctgcgcg
tcggccaggt 7020agcatttgac catggtgtca tagtccagcc cctccgcggc gtgtcccttg
gcgcgcagct 7080tgcccttgga ggtggcgccg cacgaggggc agagcaggct cttgagcgcg
tagagcttgg 7140gggcgaggaa gaccgattcg ggggagtagg cgtccgcgcc gcagaccccg
cacacggtct 7200cgcactccac cagccaggtg agctcggggc gcgccgggtc aaaaaccagg
tttcccccat 7260gctttttgat gcgtttctta cctcgggtct ccatgaggtg gtgtccccgc
tcggtgacga 7320agaggctgtc cgtgtctccg tagaccgact tgaggggtct tttctccagg
ggggtccctc 7380ggtcttcctc gtagaggaac tcggaccact ctgagacgaa ggcccgcgtc
caggccagga 7440cgaaggaggc tatgtgggag gggtagcggt cgttgtccac tagggggtcc
accttctcca 7500aggtgtgaag acacatgtcg ccttcctcgg cgtccaggaa ggtgattggc
ttgtaggtgt 7560aggccacgtg accgggggtt cctgacgggg gggtataaaa gggggtgggg
gcgcgctcgt 7620cgtcactctc ttccgcatcg ctgtctgcga gggccagctg ctggggtgag
tattccctct 7680cgaaggcggg catgacctcc gcgctgaggt tgtcagtttc caaaaacgag
gaggatttga 7740tgttcacctg tcccgaggtg atacctttga gggtacccgc gtccatctgg
tcagaaaaca 7800cgatcttttt attgtccagc ttggtggcga acgacccgta gagggcgttg
gagagcagct 7860tggcgatgga gcgcagggtc tggttcttgt ccctgtcggc gcgctccttg
gccgcgatgt 7920tgagctgcac gtactcgcgc gcgacgcagc gccactcggg gaagacggtg
gtgcgctcgt 7980cgggcaccag gcgcacgcgc cagccgcggt tgtgcagggt gaccaggtcc
acgctggtgg 8040cgacctcgcc gcgcaggcgc tcgttggtcc agcagagacg gccgcccttg
cgcgagcaga 8100aggggggcag ggggtcgagc tgggtctcgt ccggggggtc cgcgtccacg
gtgaaaaccc 8160cggggcgcag gcgcgcgtcg aagtagtcta tcttgcaacc ttgcatgtcc
agcgcctgct 8220gccagtcgcg ggcggcgagc gcgcgctcgt aggggttgag cggcgggccc
cagggcatgg 8280ggtgggtgag tgcggaggcg tacatgccgc agatgtcata gacgtagagg
ggctcccgca 8340ggaccccgat gtaggtgggg tagcagcggc cgccgcggat gctggcgcgc
acgtagtcat 8400acagctcgtg cgagggggcg aggaggtcgg ggcccaggtt ggtgcgggcg
gggcgctccg 8460cgcggaagac gatctgcctg aagatggcat gcgagttgga agagatggtg
gggcgctgga 8520agacgttgaa gctggcgtcc tgcaggccga cggcgtcgcg cacgaaggag
gcgtaggagt 8580cgcgcagctt gtgtaccagc tcggcggtga cctgcacgtc gagcgcgcag
tagtcgaggg 8640tctcgcggat gatgtcatat ttagcctgcc ccttcttttt ccacagctcg
cggttgagga 8700caaactcttc gcggtctttc cagtactctt ggatcgggaa accgtccggt
tccgaacggt 8760aagagcctag catgtagaac tggttgacgg cctggtaggc gcagcagccc
ttctccacgg 8820ggagggcgta ggcctgcgcg gccttgcgga gcgaggtgtg ggtcagggcg
aaggtgtccc 8880tgaccatgac tttgaggtac tggtgcttga agtcggagtc gtcgcagccg
ccccgctccc 8940agagcgagaa gtcggtgcgc ttcttggagc gggggttggg cagagcgaag
gtgacatcgt 9000tgaagaggat tttgcccgcg cggggcatga agttgcgggt gatgcggaag
ggccccggca 9060cttcagagcg gttgttgatg acctgggcgg cgagcacgat ctcgtcgaag
ccgttgatgt 9120tgtggcccac gatgtagagt tccaggaagc ggggccggcc ctttacggtg
ggcagcttct 9180ttagctcttc gtaggtgagc tcctcgggcg aggcgaggcc gtgctcggcc
agggcccagt 9240ccgcgaggtg cgggttgtct ctgaggaagg acttccagag gtcgcgggcc
aggagggtct 9300gcaggcggtc tctgaaggtc ctgaactggc ggcccacggc cattttttcg
ggggtgatgc 9360agtagaaggt gagggggtct tgctgccagc ggtcccagtc gagctgcagg
gcgaggtcgc 9420gcgcggcggt gaccaggcgc tcgtcgcccc cgaatttcat gaccagcatg
aagggcacga 9480gctgctttcc gaaggccccc atccaagtgt aggtctctac atcgtaggtg
acaaagaggc 9540gctccgtgcg aggatgcgag ccgatcggga agaactggat ctcccgccac
cagttggagg 9600agtggctgtt gatgtggtgg aagtagaagt cccgtcgccg ggccgaacac
tcgtgctggc 9660ttttgtaaaa gcgagcgcag tactggcagc gctgcacggg ctgtacctca
tgcacgagat 9720gcacctttcg cccgcgcacg aggaagccga ggggaaatct gagccccccg
cctggctcgc 9780ggcatggctg gttctcttct actttggatg cgtgtccgtc tccgtctggc
tcctcgaggg 9840gtgttacggt ggagcggacc accacgccgc gcgagccgca ggtccagata
tcggcgcgcg 9900gcggtcggag tttgatgacg acatcgcgca gctgggagct gtccatggtc
tggagctccc 9960gcggcggcgg caggtcagcc gggagttctt gcaggttcac ctcgcagagt
cgggccaggg 10020cgcggggcag gtctaggtgg tacctgatct ctaggggcgt gttggtggcg
gcgtcgatgg 10080cttgcaggag cccgcagccc cggggggcga cgacggtgcc ccgcggggtg
gtggtggtgg 10140tggcggtgca gctcagaagc ggtgccgcgg gcgggccccc ggaggtaggg
ggggctccgg 10200tcccgcgggc aggggcggca gcggcacgtc ggcgtggagc gcgggcagga
gttggtgctg 10260tgcccggagg ttgctggcga aggcgacgac gcggcggttg atctcctgga
tctggcgcct 10320ctgcgtgaag acgacgggcc cggtgagctt gaacctgaaa gagagttcga
cagaatcaat 10380ctcggtgtca ttgaccgcgg cctggcgcag gatctcctgc acgtctcccg
agttgtcttg 10440gtaggcgatc tcggccatga actgctcgat ctcttcctcc tggaggtctc
cgcgtccggc 10500gcgttccacg gtggccgcca ggtcgttgga gatgcgcccc atgagctgcg
agaaggcgtt 10560gagtccgccc tcgttccaga ctcggctgta gaccacgccc ccctggtcat
cgcgggcgcg 10620catgaccacc tgcgcgaggt tgagctccac gtgccgcgcg aagacggcgt
agttgcgcag 10680acgctggaag aggtagttga gggtggtggc ggtgtgctcg gccacgaaga
agttcatgac 10740ccagcggcgc aacgtggatt cgttgatgtc ccccaaggcc tccagccgtt
ccatggcctc 10800gtagaagtcc acggcgaagt tgaaaaactg ggagttgcgc gccgacacgg
tcaactcctc 10860ctccagaaga cggatgagct cggcgacggt gtcgcgcacc tcgcgctcga
aggctatggg 10920gatctcttcc tccgctagca tcaccacctc ctcctcttcc tcctcttctg
gcacttccat 10980gatggcttcc tcctcttcgg ggggtggcgg cggcggcggt gggggagggg
gcgctctgcg 11040ccggcggcgg cgcaccggga ggcggtccac gaagcgcgcg atcatctccc
cgcggcggcg 11100gcgcatggtc tcggtgacgg cgcggccgtt ctcccggggg cgcagttgga
agacgccgcc 11160ggacatctgg tgctggggcg ggtggccgtg aggcagcgag acggcgctga
cgatgcatct 11220caacaattgc tgcgtaggta cgccgccgag ggacctgagg gagtccatat
ccaccggatc 11280cgaaaacctt tcgaggaagg cgtctaacca gtcgcagtcg caaggtaggc
tgagcaccgt 11340ggcgggcggc ggggggtggg gggagtgtct ggcggaggtg ctgctgatga
tgtaattgaa 11400gtaggcggac ttgacacggc ggatggtcga caggagcacc atgtccttgg
gtccggcctg 11460ctggatgcgg aggcggtcgg ctatgcccca ggcttcgttc tggcatcggc
gcaggtcctt 11520gtagtagtct tgcatgagcc tttccaccgg cacctcttct ccttcctctt
ctgcttcttc 11580catgtctgct tcggccctgg ggcggcgccg cgcccccctg ccccccatgc
gcgtgacccc 11640gaaccccctg agcggttgga gcagggccag gtcggcgacg acgcgctcgg
ccaggatggc 11700ctgctgcacc tgcgtgaggg tggtttggaa gtcatccaag tccacgaagc
ggtggtaggc 11760gcccgtgttg atggtgtagg tgcagttggc catgacggac cagttgacgg
tctggtggcc 11820cggttgcgac atctcggtgt acctgagtcg cgagtaggcg cgggagtcga
agacgtagtc 11880gttgcaagtc cgcaccaggt actggtagcc caccaggaag tgcggcggcg
gctggcggta 11940gaggggccag cgcagggtgg cgggggctcc gggggccagg tcttccagca
tgaggcggtg 12000gtaggcgtag atgtacctgg acatccaggt gatacccgcg gcggtggtgg
aggcgcgcgg 12060gaagtcgcgc acccggttcc agatgttgcg caggggcaga aagtgctcca
tggtaggcgt 12120gctctgtcca gtcagacgcg cgcagtcgtt gatactctag accagggaaa
acgaaagccg 12180gtcagcgggc actcttccgt ggtctggtga atagatcgca agggtatcat
ggcggagggc 12240ctcggttcga gccccgggtc cgggccggac ggtccgccat gatccacgcg
gttaccgccc 12300gcgtgtcgaa cccaggtgtg cgacgtcaga caacggtgga gtgttccttt
tggcgttttt 12360ctggccgggc gccggcgccg cgtaagagac taagccgcga aagcgaaagc
agtaagtggc 12420tcgctccccg tagccggagg gatccttgct aagggttgcg ttgcggcgaa
ccccggttcg 12480aatcccgtac tcgggccggc cggacccgcg gctaaggtgt tggattggcc
tccccctcgt 12540ataaagaccc cgcttgcgga ttgactccgg acacggggac gagccccttt
tatttttgct 12600ttccccagat gcatccggtg ctgcggcaga tgcgcccccc gccccagcag
cagcaacaac 12660accagcaaga gcggcagcaa cagcagcggg agtcatgcag ggccccctca
cccaccctcg 12720gcgggccggc cacctcggcg tccgcggccg tgtctggcgc ctgcggcggc
ggcggggggc 12780cggctgacga ccccgaggag cccccgcggc gcagggccag acactacctg
gacctggagg 12840agggcgaggg cctggcgcgg ctgggggcgc cgtctcccga gcgccacccg
cgggtgcagc 12900tgaagcgcga ctcgcgcgag gcgtacgtgc ctcggcagaa cctgttcagg
gaccgcgcgg 12960gcgaggagcc cgaggagatg cgggacagga ggttcagcgc agggcgggag
ctgcggcagg 13020ggctgaaccg cgagcggctg ctgcgcgagg aggactttga gcccgacgcg
cggacgggga 13080tcagccccgc gcgcgcgcac gtggcggccg ccgacctggt gacggcgtac
gagcagacgg 13140tgaaccagga gatcaacttc caaaagagtt tcaacaacca cgtgcgcacg
ctggtggcgc 13200gcgaggaggt gaccatcggg ctgatgcacc tgtgggactt tgtaagcgcg
ctggtgcaga 13260accccaacag caagcctctg acggcgcagc tgttcctgat agtgcagcac
agcagggaca 13320acgaggcgtt tagggacgcg ctgctgaaca tcaccgagcc cgagggtcgg
tggctgctgg 13380acctgattaa catcctgcag agcatagtgg tgcaggagcg cagcctgagc
ctggccgaca 13440aggtggcggc catcaactac tcgatgctga gcctgggcaa gttttacgcg
cgcaagatct 13500accagacgcc gtacgtgccc atagacaagg aggtgaagat cgacggtttt
tacatgcgca 13560tggcgctgaa ggtgctcacc ctgagcgacg acctgggcgt gtaccgcaac
gagcgcatcc 13620acaaggccgt gagcgtgagc cggcggcgcg agctgagcga ccgcgagctg
atgcacagcc 13680tgcagcgggc gctggcgggc gccggcagcg gcgacaggga ggcggagtcc
tacttcgatg 13740cgggggcgga cctgcgctgg gcgcccagcc ggcgggccct ggaggccgcg
ggggtccgcg 13800aggactatga cgaggacggc gaggaggatg aggagtacga gctagaggag
ggcgagtacc 13860tggactaaac cgcgggtggt gtttccggta gatgcaagac ccgaacgtgg
tggacccggc 13920gctgcgggcg gctctgcaga gccagccgtc cggccttaac tcctcagacg
actggcgaca 13980ggtcatggac cgcatcatgt cgctgacggc gcgtaacccg gacgcgttcc
ggcagcagcc 14040gcaggccaac aggctctccg ccatcctgga ggcggtggtg cctgcgcgct
cgaaccccac 14100gcacgagaag gtgctggcca tagtgaacgc gctggccgag aacagggcca
tccgcccgga 14160cgaggccggg ctggtgtacg acgcgctgct gcagcgcgtg gcccgctaca
acagcggcaa 14220cgtgcagacc aacctggacc ggctggtggg ggacgtgcgc gaggcggtgg
cgcagcgcga 14280gcgcgcggat cggcagggca acctgggctc catggtggcg ctgaatgcct
tcctgagcac 14340gcagccggcc aacgtgccgc gggggcagga agactacacc aactttgtga
gcgcgctgcg 14400gctgatggtg accgagaccc cccagagcga ggtgtaccag tcgggcccgg
actacttctt 14460ccagaccagc agacagggcc tgcagacggt gaacctgagc caggctttca
agaacctgcg 14520ggggctgtgg ggcgtgaagg cgcccaccgg cgaccgggcg acggtgtcca
gcctgctgac 14580gcccaactcg cgcctgctgc tgctgctgat cgcgccgttc acggacagcg
gcagcgtgtc 14640ccgggacacc tacctggggc acctgctgac cctgtaccgc gaggccatcg
ggcaggcgca 14700ggtggacgag cacaccttcc aggagatcac cagcgtgagc cgcgcgctgg
ggcaggagga 14760cacgagcagc ctggaggcga ctctgaacta cctgctgacc aaccggcggc
agaagattcc 14820ctcgctgcac agcctgacct ccgaggagga gcgcatcttg cgctacgtgc
agcagagcgt 14880gagcctgaac ctgatgcgcg acggggtgac gcccagcgtg gcgctggaca
tgaccgcgcg 14940caacatggaa ccgggcatgt acgccgcgca ccggccttac atcaaccgcc
tgatggacta 15000cctgcatcgc gcggcggccg tgaaccccga gtactttacc aacgccatcc
tgaacccgca 15060ctggctcccg ccgcccgggt tctacagcgg gggcttcgag gtcccggaga
ccaacgatgg 15120cttcctgtgg gacgacatgg acgacagcgt gttctccccg cggccgcagg
cgctggcgga 15180agcgtccctg ctgcgtccca agaaggagga ggaggaggag gcgagtcgcc
gccgcggcag 15240cagcggcgtg gcttctctgt ccgagctggg ggcggcagcc gccgcgcgcc
ccgggtccct 15300gggcggcagc ccctttccga gcctggtggg gtctctgcac agcgagcgca
ccacccgccc 15360tcggctgctg ggcgaggacg agtacctgaa taactccctg ctgcagccgg
tgcgggagaa 15420aaacctgcct cccgccttcc ccaacaacgg gatagagagc ctggtggaca
agatgagcag 15480atggaagacc tatgcgcagg agcacaggga cgcgcctgcg ctccggccgc
ccacgcggcg 15540ccagcgccac gaccggcagc gggggctggt gtgggatgac gaggactccg
cggacgatag 15600cagcgtgctg gacctgggag ggagcggcaa cccgttcgcg cacctgcgcc
cccgcctggg 15660gaggatgttt taaaaaaaaa aaaaaaaagc aagaagcatg atgcaaaaat
taaataaaac 15720tcaccaaggc catggcgacc gagcgttggt ttcttgtgtt cccttcagta
tgcggcgcgc 15780ggcgatgtac caggagggac ctcctccctc ttacgagagc gtggtgggcg
cggcggcggc 15840ggcgccctct tctccctttg cgtcgcagct gctggagccg ccgtacgtgc
ctccgcgcta 15900cctgcggcct acggggggga gaaacagcat ccgttactcg gagctggcgc
ccctgttcga 15960caccacccgg gtgtacctgg tggacaacaa gtcggcggac gtggcctccc
tgaactacca 16020gaacgaccac agcaattttt tgaccacggt catccagaac aatgactaca
gcccgagcga 16080ggccagcacc cagaccatca atctggatga ccggtcgcac tggggcggcg
acctgaaaac 16140catcctgcac accaacatgc ccaacgtgaa cgagttcatg ttcaccaata
agttcaaggc 16200gcgggtgatg gtgtcgcgct cgcacaccaa ggaagaccgg gtggagctga
agtacgagtg 16260ggtggagttc gagctgccag agggcaacta ctccgagacc atgaccattg
acctgatgaa 16320caacgcgatc gtggagcact atctgaaagt gggcaggcag aacggggtcc
tggagagcga 16380catcggggtc aagttcgaca ccaggaactt ccgcctgggg ctggaccccg
tgaccgggct 16440ggttatgccc ggggtgtaca ccaacgaggc cttccatccc gacatcatcc
tgctgcccgg 16500ctgcggggtg gacttcactt acagccgcct gagcaacctc ctgggcatcc
gcaagcggca 16560gcccttccag gagggcttca ggatcaccta cgaggacctg gaggggggca
acatccccgc 16620gctcctcgat gtggaggcct accaggatag cttgaaggaa aatgaggcgg
gacaggagga 16680taccgccccc gccgcctccg ccgccgccga gcagggcgag gatgctgctg
acaccgcggc 16740cgcggacggg gcagaggccg accccgctat ggtggtggag gctcccgagc
aggaggagga 16800catgaatgac agtgcggtgc gcggagacac cttcgtcacc cggggggagg
aaaagcaagc 16860ggaggccgag gccgcggccg aggaaaagca actggcggca gcagcggcgg
cggcggcgtt 16920ggccgcggcg gaggctgagt ctgaggggac caagcccgcc aaggagcccg
tgattaagcc 16980cctgaccgaa gatagcaaga agcgcagtta caacctgctc aaggacagca
ccaacaccgc 17040gtaccgcagc tggtacctgg cctacaacta cggcgacccg tcgacggggg
tgcgctcctg 17100gaccctgctg tgcacgccgg acgtgacctg cggctcggag caggtgtact
ggtcgctgcc 17160cgacatgatg caagaccccg tgaccttccg ctccacgcgg caggtcagca
acttcccggt 17220ggtgggcgcc gagctgctgc ccgtgcactc caagagcttc tacaacgacc
aggccgtcta 17280ctcccagctc atccgccagt tcacctctct gacccacgtg ttcaatcgct
ttcctgagaa 17340ccagattctg gcgcgcccgc ccgcccccac catcaccacc gtcagtgaaa
acgttcctgc 17400tctcacagat cacgggacgc taccgctgcg caacagcatc ggaggagtcc
agcgagtgac 17460cgttactgac gccagacgcc gcacctgccc ctacgtttac aaggccttgg
gcatagtctc 17520gccgcgcgtc ctttccagcc gcactttttg agcaacacca ccatcatgtc
catcctgatc 17580tcacccagca ataactccgg ctggggactg ctgcgcgcgc ccagcaagat
gttcggaggg 17640gcgaggaagc gttccgagca gcaccccgtg cgcgtgcgcg ggcacttccg
cgccccctgg 17700ggagcgcaca aacgcggccg cgcggggcgc accaccgtgg acgacgccat
cgactcggtg 17760gtggagcagg cgcgcaacta caggcccgcg gtctctaccg tggacgcggc
catccagacc 17820gtggtgcggg gcgcgcggcg gtacgccaag ctgaagagcc gccggaagcg
cgtggcccgc 17880cgccaccgcc gccgacccgg ggccgccgcc aaacgcgccg ccgcggccct
gcttcgccgg 17940gccaagcgca cgggccgccg cgccgccatg agggccgcgc gccgcttggc
cgccggcatc 18000accgccgcca ccatggcccc ccgtacccga agacgcgcgg ccgccgccgc
cgccgccgcc 18060atcagtgaca tggccagcag gcgccggggc aacgtgtact gggtgcgcga
ctcggtgacc 18120ggcacgcgcg tgcccgtgcg cttccgcccc ccgcggactt gagatgatgt
gaaaaaacaa 18180cactgagtct cctgctgttg tgtgtatccc agcggcggcg gcgcgcgcag
cgtcatgtcc 18240aagcgcaaaa tcaaagaaga gatgctccag gtcgtcgcgc cggagatcta
tgggcccccg 18300aagaaggaag agcaggattc gaagccccgc aagataaagc gggtcaaaaa
gaaaaagaaa 18360gatgatgacg atgccgatgg ggaggtggag ttcctgcgcg ccacggcgcc
caggcgcccg 18420gtgcagtgga agggccggcg cgtaaagcgc gtcctgcgcc ccggcaccgc
ggtggtcttc 18480acgcccggcg agcgctccac ccggactttc aagcgcgtct atgacgaggt
gtacggcgac 18540gaagacctgc tggagcaggc caacgagcgc ttcggagagt ttgcttacgg
gaagcgtcag 18600cgggcgctgg ggaaggagga cctgctggcg ctgccgctgg accagggcaa
ccccaccccc 18660agtctgaagc ccgtgaccct gcagcaggtg ctgccgagca gcgcaccctc
cgaggcgaag 18720cggggtctga agcgcgaggg cggcgacctg gcgcccaccg tgcagctcat
ggtgcccaag 18780cggcagaggc tggaggatgt gctggagaaa atgaaagtag accccggtct
gcagccggac 18840atcagggtcc gccccatcaa gcaggtggcg ccgggcctcg gcgtgcagac
cgtggacgtg 18900gtcatcccca ccggcaactc ccccgccgcc gccaccacta ccgctgcctc
cacggacatg 18960gagacacaga ccgatcccgc cgcagccgca gccgcagccg ccgccgcgac
ctcctcggcg 19020gaggtgcaga cggacccctg gctgccgccg gcgatgtcag ctccccgcgc
gcgtcgcggg 19080cgcaggaagt acggcgccgc caacgcgctc ctgcccgagt acgccttgca
tccttccatc 19140gcgcccaccc ccggctaccg aggctatacc taccgcccgc gaagagccaa
gggttccacc 19200cgccgtcccc gccgacgcgc cgccgccacc acccgccgcc gccgccgcag
acgccagccc 19260gcactggctc cagtctccgt gaggaaagtg gcgcgcgacg gacacaccct
ggtgctgccc 19320agggcgcgct accaccccag catcgtttaa aagcctgttg tggttcttgc
agatatggcc 19380ctcacttgcc gcctccgttt cccggtgccg ggataccgag gaggaagatc
gcgccgcagg 19440aggggtctgg ccggccgcgg cctgagcgga ggcagccgcc gcgcgcaccg
gcggcgacgc 19500gccaccagcc gacgcatgcg cggcggggtg ctgcccctgt taatccccct
gatcgccgcg 19560gcgatcggcg ccgtgcccgg gatcgcctcc gtggccttgc aagcgtccca
gaggcattga 19620cagacttgca aacttgcaaa tatggaaaaa aaaaccccaa taaaaaagtc
tagactctca 19680cgctcgcttg gtcctgtgac tattttgtag aatggaagac atcaactttg
cgtcgctggc 19740cccgcgtcac ggctcgcgcc cgttcctggg acactggaac gatatcggca
ccagcaacat 19800gagcggtggc gccttcagtt ggggctctct gtggagcggc attaaaagta
tcgggtctgc 19860cgttaaaaat tacggctccc gggcctggaa cagcagcacg ggccagatgt
tgagagacaa 19920gttgaaagag cagaacttcc agcagaaggt ggtggagggc ctggcctccg
gcatcaacgg 19980ggtggtggac ctggccaacc aggccgtgca gaataagatc aacagcagac
tggacccccg 20040gccgccggtg gaggaggtgc cgccggcgct ggagacggtg tcccccgatg
ggcgtggcga 20100gaagcgcccg cggcccgata gggaagagac cactctggtc acgcagaccg
atgagccgcc 20160cccgtatgag gaggccctga agcaaggtct gcccaccacg cggcccatcg
cgcccatggc 20220caccggggtg gtgggccgcc acacccccgc cacgctggac ttgcctccgc
ccgccgatgt 20280gccgcagcag cagaaggcgg cacagccggg cccgcccgcg accgcctccc
gttcctccgc 20340cggtcctctg cgccgcgcgg ccagcggccc ccgcgggggg gtcgcgaggc
acggcaactg 20400gcagagcacg ctgaacagca tcgtgggtct gggggtgcgg tccgtgaagc
gccgccgatg 20460ctactgaata gcttagctaa cgtgttgtat gtgtgtatgc gccctatgtc
gccgccagag 20520gagctgctga gtcgccgccg ttcgcgcgcc caccaccacc gccactccgc
ccctcaagat 20580ggcgacccca tcgatgatgc cgcagtggtc gtacatgcac atctcgggcc
aggacgcctc 20640ggagtacctg agccccgggc tggtgcagtt cgcccgcgcc accgagagct
acttcagcct 20700gagtaacaag tttaggaacc ccacggtggc gcccacgcac gatgtgacca
ccgaccggtc 20760tcagcgcctg acgctgcggt tcattcccgt ggaccgcgag gacaccgcgt
actcgtacaa 20820ggcgcggttc accctggccg tgggcgacaa ccgcgtgctg gacatggcct
ccacctactt 20880tgacatccgc ggggtgctgg accggggtcc cactttcaag ccctactctg
gcaccgccta 20940caactccctg gcccccaagg gcgctcccaa ctcctgcgag tgggagcaag
aggaaactca 21000ggcagttgaa gaagcagcag aagaggaaga agaagatgct gacggtcaag
ctgaggaaga 21060gcaagcagct accaaaaaga ctcatgtata tgctcaggct cccctttctg
gcgaaaaaat 21120tagtaaagat ggtctgcaaa taggaacgga cgctacagct acagaacaaa
aacctattta 21180tgcagaccct acattccagc ccgaacccca aatcggggag tcccagtgga
atgaggcaga 21240tgctacagtc gccggcggta gagtgctaaa gaaatctact cccatgaaac
catgctatgg 21300ttcctatgca agacccacaa atgctaatgg aggtcagggt gtactaacgg
caaatgccca 21360gggacagcta gaatctcagg ttgaaatgca attcttttca acttctgaaa
acgcccgtaa 21420cgaggctaac aacattcagc ccaaattggt gctgtatagt gaggatgtgc
acatggagac 21480cccggatacg cacctttctt acaagcccgc aaaaagcgat gacaattcaa
aaatcatgct 21540gggtcagcag tccatgccca acagacctaa ttacatcggc ttcagagaca
actttatcgg 21600cctcatgtat tacaatagca ctggcaacat gggagtgctt gcaggtcagg
cctctcagtt 21660gaatgcagtg gtggacttgc aagacagaaa cacagaactg tcctaccagc
tcttgcttga 21720ttccatgggt gacagaacca gatacttttc catgtggaat caggcagtgg
acagttatga 21780cccagatgtt agaattattg aaaatcatgg aactgaagac gagctcccca
actattgttt 21840ccctctgggt ggcatagggg taactgacac ttaccaggct gttaaaacca
acaatggcaa 21900taacgggggc caggtgactt ggacaaaaga tgaaactttt gcagatcgca
atgaaatagg 21960ggtgggaaac aatttcgcta tggagatcaa cctcagtgcc aacctgtgga
gaaacttcct 22020gtactccaac gtggcgctgt acctaccaga caagcttaag tacaacccct
ccaatgtgga 22080catctctgac aaccccaaca cctacgatta catgaacaag cgagtggtgg
ccccggggct 22140ggtggactgc tacatcaacc tgggcgcgcg ctggtcgctg gactacatgg
acaacgtcaa 22200ccccttcaac caccaccgca atgcgggcct gcgctaccgc tccatgctcc
tgggcaacgg 22260gcgctacgtg cccttccaca tccaggtgcc ccagaagttc tttgccatca
agaacctcct 22320cctcctgccg ggctcctaca cctacgagtg gaacttcagg aaggatgtca
acatggtcct 22380ccagagctct ctgggtaacg atctcagggt ggacggggcc agcatcaagt
tcgagagcat 22440ctgcctctac gccaccttct tccccatggc ccacaacacg gcctccacgc
tcgaggccat 22500gctcaggaac gacaccaacg accagtcctt caatgactac ctctccgccg
ccaacatgct 22560ctaccccata cccgccaacg ccaccaacgt ccccatctcc atcccctcgc
gcaactgggc 22620ggccttccgc ggctgggcct tcacccgcct caagaccaag gagaccccct
ccctgggctc 22680gggattcgac ccctactaca cctactcggg ctccattccc tacctggacg
gcaccttcta 22740cctcaaccac actttcaaga aggtctcggt caccttcgac tcctcggtca
gctggccggg 22800caacgaccgt ctgctcaccc ccaacgagtt cgagatcaag cgctcggtcg
acggggaggg 22860ctacaacgtg gcccagtgca acatgaccaa ggactggttc ctggtccaga
tgctggccaa 22920ctacaacatc ggctaccagg gcttctacat cccagagagc tacaaggaca
ggatgtactc 22980cttcttcagg aacttccagc ccatgagccg gcaggtggtg gaccagacca
agtacaagga 23040ctaccaggag gtgggcatca tccaccagca caacaactcg ggcttcgtgg
gctacctcgc 23100ccccaccatg cgcgagggac aggcctaccc cgccaacttc ccctatccgc
tcataggcaa 23160gaccgcggtc gacagcatca cccagaaaaa gttcctctgc gaccgcaccc
tctggcgcat 23220ccccttctcc agcaacttca tgtccatggg tgcgctctcg gacctgggcc
agaacttgct 23280ctacgccaac tccgcccacg ccctcgacat gaccttcgag gtcgacccca
tggacgagcc 23340cacccttctc tatgttctgt tcgaagtctt tgacgtggtc cgggtccacc
agccgcaccg 23400cggcgtcatc gagaccgtgt acctgcgtac gcccttctcg gccggcaacg
ccaccaccta 23460aagaagcaag ccgcagtcat cgccgcctgc atgccgtcgg gttccaccga
gcaagagctc 23520agggccatcg tcagagacct gggatgcggg ccctattttt tgggcacctt
cgacaagcgc 23580ttccctggct ttgtctcccc acacaagctg gcctgcgcca tcgtcaacac
ggccggccgc 23640gagaccgggg gcgtgcactg gctggccttc gcctggaacc cgcgctccaa
aacatgcttc 23700ctctttgacc ccttcggctt ttcggaccag cggctcaagc aaatctacga
gttcgagtac 23760gagggcttgc tgcgtcgcag cgccatcgcc tcctcgcccg accgctgcgt
caccctcgaa 23820aagtccaccc agaccgtgca ggggcccgac tcggccgcct gcggtctctt
ctgctgcatg 23880tttctgcacg cctttgtgca ctggcctcag agtcccatgg accgcaaccc
caccatgaac 23940ttgctgacgg gggtgcccaa ctccatgctc cagagccccc aggtcgagcc
caccctgcgc 24000cgcaaccagg agcagctcta cagcttcctg gagcgccact cgccttactt
ccgccgccac 24060agcgcacaga tcaggagggc cacctccttc tgccacttgc aagagatgca
agaagggtaa 24120taacgatgta cacacttttt ttctcaataa atggcatctt tttatttata
caagctctct 24180ggggtattca tttcccacca ccacccgccg ttgtcgccat ctggctctat
ttagaaatcg 24240aaagggttct gccgggagtc gccgtgcgcc acgggcaggg acacgttgcg
atactggtag 24300cgggtgcccc acttgaactc gggcaccacc aggcgaggca gctcggggaa
gttttcgctc 24360cacaggctgc gggtcagcac cagcgcgttc atcaggtcgg gcgccgagat
cttgaagtcg 24420cagttggggc cgccgccctg cgcgcgcgag ttgcggtaca ccgggttgca
gcactggaac 24480accaacagcg ccgggtgctt cacgctggcc agcacgctgc ggtcggagat
cagctcggcg 24540tccaggtcct ccgcgttgct cagcgcgaac ggggtcatct tgggcacttg
ccgccccagg 24600aagggcgcgt gccccggttt cgagttgcag tcgcagcgca gcgggatcag
caggtgcccg 24660tgcccggact cggcgttggg gtacagcgcg cgcatgaagg cctgcatctg
gcggaaggcc 24720atctgggcct tggcgccctc cgagaagaac atgccgcagg acttgcccga
gaactggttt 24780gcggggcagc tggcgtcgtg caggcagcag cgcgcgtcgg tgttggcgat
ctgcaccacg 24840ttgcgccccc accggttctt cacgatcttg gccttggacg attgctcctt
cagcgcgcgc 24900tgcccgttct cgctggtcac atccatctcg atcacatgtt ccttgttcac
catgctgctg 24960ccgtgcagac acttcagctc gccctccgtc tcggtgcagc ggtgctgcca
cagcgcgcag 25020cccgtgggct cgaaagactt gtaggtcacc tccgcgaagg actgcaggta
cccctgcaaa 25080aagcggccca tcatggtcac gaaggtcttg ttgctgctga aggtcagctg
cagcccgcgg 25140tgctcctcgt tcagccaggt cttgcacacg gccgccagcg cctccacctg
gtcgggcagc 25200atcttgaagt tcaccttcag ctcattctcc acgtggtact tgtccatcag
cgtgcgcgcc 25260gcctccatgc ccttctccca ggccgacacc agcggcaggc tcacggggtt
cttcaccatc 25320accgtggccg ccgcctccgc cgcgctttcg ctttccgccc cgctgttctc
ttcctcttcc 25380tcctcttcct cgccgccgcc cactcgcagc ccccgcacca cggggtcgtc
ttcctgcagg 25440cgctgcacct tgcgcttgcc gttgcgcccc tgcttgatgc gcacgggcgg
gttgctgaag 25500cccaccatca ccagcgcggc ctcttcttgc tcgtcctcgc tgtccagaat
gacctccggg 25560gagggggggt tggtcatcct cagtaccgag gcacgcttct ttttcttcct
gggggcgttc 25620gccagctccg cggctgcggc cgctgccgag gtcgaaggcc gagggctggg
cgtgcgcggc 25680accagcgcgt cctgcgagcc gtcctcgtcc tcctcggact cgagacggag
gcgggcccgc 25740ttcttcgggg gcgcgcgggg cggcggaggc ggcggcggcg acggagacgg
ggacgagaca 25800tcgtccaggg tgggtggacg gcgggccgcg ccgcgtccgc gctcgggggt
ggtctcgcgc 25860tggtcctctt cccgactggc catctcccac tgctccttct cctataggca
gaaagagatc 25920atggagtctc tcatgcgagt cgagaaggag gaggacagcc taaccgcccc
ctctgagccc 25980tccaccaccg ccgccaccac cgccaatgcc gccgcggacg acgcgcccac
cgagaccacc 26040gccagtacca ccctccccag cgacgcaccc ccgctcgaga atgaagtgct
gatcgagcag 26100gacccgggtt ttgtgagcgg agaggaggat gaggtggatg agaaggagaa
ggaggaggtc 26160gccgcctcag tgccaaaaga ggataaaaag caagaccagg acgacgcaga
taaggatgag 26220acagcagtcg ggcgggggaa cggaagccat gatgctgatg acggctacct
agacgtggga 26280gacgacgtgc tgcttaagca cctgcaccgc cagtgcgtca tcgtctgcga
cgcgctgcag 26340gagcgctgcg aagtgcccct ggacgtggcg gaggtcagcc gcgcctacga
gcggcacctc 26400ttcgcgccgc acgtgccccc caagcgccgg gagaacggca cctgcgagcc
caacccgcgt 26460ctcaacttct acccggtctt cgcggtaccc gaggtgctgg ccacctacca
catctttttc 26520caaaactgca agatccccct ctcctgccgc gccaaccgca cccgcgccga
caaaaccctg 26580accctgcggc agggcgccca catacctgat atcgcctctc tggaggaagt
gcccaagatc 26640ttcgagggtc tcggtcgcga cgagaaacgg gcggcgaacg ctctgcacgg
agacagcgaa 26700aacgagagtc actcgggggt gctggtggag ctcgagggcg acaacgcgcg
cctggccgta 26760ctcaagcgca gcatagaggt cacccacttt gcctacccgg cgctcaacct
gccccccaag 26820gtcatgagtg tggtcatggg cgagctcatc atgcgccgcg cccagcccct
ggccgcggat 26880gcaaacttgc aagagtcctc cgaggaaggc ctgcccgcgg tcagcgacga
gcagctggcg 26940cgctggctgg agacccgcga ccccgcgcag ctggaggagc ggcgcaagct
catgatggcc 27000gcggtgctgg tcaccgtgga gctcgagtgt ctgcagcgct tcttcgcgga
ccccgagatg 27060cagcgcaagc tcgaggagac cctgcactac accttccgcc agggctacgt
gcgccaggcc 27120tgcaagatct ccaacgtgga gctctgcaac ctggtctcct acctgggcat
cctgcacgag 27180aaccgcctcg ggcagaacgt cctgcactcc accctcaaag gggaggcgcg
ccgcgactac 27240atccgcgact gcgcctacct cttcctctgc tacacctggc agacggccat
gggggtctgg 27300cagcagtgcc tggaggagcg caacctcaag gagctggaaa agctcctcaa
gcgcaccctc 27360agggacctct ggacgggctt caacgagcgc tcggtggccg ccgcgctggc
ggacatcatc 27420tttcccgagc gcctgctcaa gaccctgcag cagggcctgc ccgacttcac
cagccagagc 27480atgctgcaga acttcaggac tttcatcctg gagcgctcgg gcatcctgcc
ggccacttgc 27540tgcgcgctgc ccagcgactt cgtgcccatc aagtacaggg agtgcccgcc
gccgctctgg 27600ggccactgct acctcttcca gctggccaac tacctcgcct accactcgga
cctcatggaa 27660gacgtgagcg gcgagggcct gctcgagtgc cactgccgct gcaacctctg
cacgccccac 27720cgctctctag tctgcaaccc gcagctgctc agcgagagtc agattatcgg
taccttcgag 27780ctgcagggtc cctcgcctga cgagaagtcc gcggctccag ggctgaaact
cactccgggg 27840ctgtggactt ccgcctacct acgcaaattt gtacctgagg actaccacgc
ccacgagatc 27900aggttctacg aagaccaatc ccgcccgccc aaggcggagc tcaccgcctg
cgtcatcacc 27960caggggcaca tcctgggcca attgcaagcc atcaacaaag cccgccgaga
gttcttgctg 28020aaaaagggtc ggggggtgta cctggacccc cagtccggcg aggagctaaa
cccgctaccc 28080ccgccgccgc cccagcagcg ggaccttgct tcccaggatg gcacccagaa
agaagcagca 28140gccgccgccg ccgccgcagc catacatgct tctggaggaa gaggaggagg
actgggacag 28200tcaggcagag gaggtttcgg acgaggagca ggaggagatg atggaagact
gggaggagga 28260cagcagccta gacgaggaag cttcagaggc cgaagaggtg gcagacgcaa
caccatcgcc 28320ctcggtcgca gccccctcgc cggggcccct gaaatcctcc gaacccagca
ccagcgctat 28380aacctccgct cctccggcgc cggcgccacc cgcccgcaga cccaaccgta
gatgggacac 28440cacaggaacc ggggtcggta agtccaagtg cccgccgccg ccaccgcagc
agcagcagca 28500gcagcgccag ggctaccgct cgtggcgcgg gcacaagaac gccatagtcg
cctgcttgca 28560agactgcggg ggcaacatct ctttcgcccg ccgcttcctg ctattccacc
acggggtcgc 28620ctttccccgc aatgtcctgc attactaccg tcatctctac agcccctact
gcagcggcga 28680cccagaggcg gcagcggcag ccacagcggc gaccaccacc taggaagata
tcctccgcgg 28740gcaagacagc ggcagcagcg gccaggagac ccgcggcagc agcggcggga
gcggtgggcg 28800cactgcgcct ctcgcccaac gaacccctct cgacccggga gctcagacac
aggatcttcc 28860ccactttgta tgccatcttc caacagagca gaggccagga gcaggagctg
aaaataaaaa 28920acagatctct gcgctccctc acccgcagct gtctgtatca caaaagcgaa
gatcagcttc 28980ggcgcacgct ggaggacgcg gaggcactct tcagcaaata ctgcgcgctc
actcttaaag 29040actagctccg cgcccttctc gaatttaggc gggagaaaac tacgtcatcg
ccggccgccg 29100cccagcccgc ccagccgaga tgagcaaaga gattcccacg ccatacatgt
ggagctacca 29160gccgcagatg ggactcgcgg cgggagcggc ccaggactac tccacccgca
tgaactacat 29220gagcgcggga ccccacatga tctcacaggt caacgggatc cgcgcccagc
gaaaccaaat 29280actgctggaa caggcggcca tcaccgccac gccccgccat aatctcaacc
cccgaaattg 29340gcccgccgcc ctcgtgtacc aggaaacccc ctccgccacc accgtactac
ttccgcgtga 29400cgcccaggcc gaagtccaga tgactaactc aggggcgcag ctcgcgggcg
gctttcgtca 29460cggggcgcgg ccgctccgac caggtataag acacctgatg atcagaggcc
gaggtatcca 29520gctcaacgac gagtcggtga gctcttcgct cggtctccgt ccggacggaa
ctttccagct 29580cgccggatcc ggccgctctt cgttcacgcc ccgccaggcg tacctgactc
tgcagacctc 29640gtcctcggag ccccgctccg gcggcatcgg aaccctccag ttcgtggagg
agttcgtgcc 29700ctcggtctac ttcaacccct tctcgggacc tcccggacgc taccccgacc
agttcattcc 29760gaactttgac gcggtgaagg actcggcgga cggctacgac tgaatgtcag
gtgtcgaggc 29820agagcagctt cgcctgagac acctcgagca ctgccgccgc cacaagtgct
tcgcccgcgg 29880ttctggtgag ttctgctact ttcagctacc cgaggagcat accgaggggc
cggcgcacgg 29940cgtccgcctg accacccagg gcgaggttac ctgttccctc atccgggagt
ttaccctccg 30000tcccctgcta gtggagcggg agcggggtcc ctgtgtccta actatcgcct
gcaactgccc 30060taaccctgga ttacatcaag atctttgctg tcatctctgt gctgagttta
ataaacgctg 30120agatcagaat ctactggggc tcctgtcgcc atcctgtgaa cgccaccgtc
ttcacccacc 30180ccgaccaggc ccaggcgaac ctcacctgcg gtctgcatcg gagggccaag
aagtacctca 30240cctggtactt caacggcacc ccctttgtgg tttacaacag cttcgacggg
gacggagtct 30300ccctgaaaga ccagctctcc ggtctcagct actccatcca caagaacacc
accctccaac 30360tcttccctcc ctacctgccg ggaacctacg agtgcgtcac cggccgctgc
acccacctca 30420cccgcctgat cgtaaaccag agctttccgg gaacagataa ctccctcttc
cccagaacag 30480gaggtgagct caggaaactc cccggggacc agggcggaga cgtaccttcg
acccttgtgg 30540ggttaggatt ttttattacc gggttgctgg ctcttttaat caaagtttcc
ttgagatttg 30600ttctttcctt ctacgtgtat gaacacctca acctccaata actctaccct
ttcttcggaa 30660tcaggtgact tctctgaaat cgggcttggt gtgctgctta ctctgttgat
ttttttcctt 30720atcatactca gccttctgtg cctcaggctc gccgcctgct gcgcacacat
ctatatctac 30780tgctggttgc tcaagtgcag gggtcgccac ccaagatgaa caggtacatg
gtcctatcga 30840tcctaggcct gctggccctg gcggcctgca gcgccgccaa aaaagagatt
acctttgagg 30900agcccgcttg caatgtaact ttcaagcccg agggtgacca atgcaccacc
ctcgtcaaat 30960gcgttaccaa tcatgagagg ctgcgcatcg actacaaaaa caaaactggc
cagtttgcgg 31020tctatagtgt gtttacgccc ggagacccct ctaactactc tgtcaccgtc
ttccagggcg 31080gacagtctaa gatattcaat tacactttcc ctttttatga gttatgcgat
gcggtcatgt 31140acatgtcaaa acagtacaac ctgtggcctc cctctcccca ggcgtgtgtg
gaaaatactg 31200ggtcttactg ctgtatggct ttcgcaatca ctacgctcgc tctaatctgc
acggtgctat 31260acataaaatt caggcagagg cgaatcttta tcgatgaaaa gaaaatgcct
tgatcgctaa 31320caccggcttt ctatctgcag aatgaatgca atcacctccc tactaatcac
caccaccctc 31380cttgcgattg cccatgggtt gacacgaatc gaagtgccag tggggtccaa
tgtcaccatg 31440gtgggccccg ccggcaattc caccctcatg tgggaaaaat ttgtccgcaa
tcaatgggtt 31500catttctgct ctaaccgaat cagtatcaag cccagagcca tctgcgatgg
gcaaaatcta 31560actctgatca atgtgcaaat gatggatgct gggtactatt acgggcagcg
gggagaaatc 31620attaattact ggcgacccca caaggactac atgctgcatg tagtcgaggc
acttcccact 31680accaccccca ctaccacctc tcccaccacc accaccacta ctactactac
tactactact 31740actactacta ccactaccgc tgcccgccat acccgcaaaa gcaccatgat
tagcacaaag 31800ccccctcgtg ctcactccca cgccggcggg cccatcggtg cgacctcaga
aaccaccgag 31860ctttgcttct gccaatgcac taacgccagc gctcatgaac tgttcgacct
ggagaatgag 31920gatgtccagc agagctccgc ttgcctgacc caggaggctg tggagcccgt
tgccctgaag 31980cagatcggtg attcaataat tgactcttct tcttttgcca ctcccgaata
ccctcccgat 32040tctactttcc acatcacggg taccaaagac cctaacctct ctttctacct
gatgctgctg 32100ctctgtatct ctgtggtctc ttccgcgctg atgttactgg ggatgttctg
ctgcctgatc 32160tgccgcagaa agagaaaagc tcgctctcag ggccaaccac tgatgccctt
cccctacccc 32220ccggattttg cagataacaa gatatgagct cgctgctgac actaaccgct
ttactagcct 32280gcgctctaac ccttgtcgct tgcgactcga gattccacaa tgtcacagct
gtggcaggag 32340aaaatgttac tttcaactcc acggccgata cccagtggtc gtggagtggc
tcaggtagct 32400acttaactat ctgcaatagc tccacttccc ccggcatatc cccaaccaag
taccaatgca 32460atgccagcct gttcaccctc atcaacgctt ccaccctgga caatggactc
tatgtaggct 32520atgtaccctt tggtgggcaa ggaaagaccc acgcttacaa cctggaagtt
cgccagccca 32580gaaccactac ccaagcttct cccaccacca ccaccaccac caccatcacc
agcagcagca 32640gcagcagcag ccacagcagc agcagcagat tattgacttt ggttttggcc
agctcatctg 32700ccgctaccca ggccatctac agctctgtgc ccgaaaccac tcagatccac
cgcccagaaa 32760cgaccaccgc caccacccta cacacctcca gcgatcagat gccgaccaac
atcaccccct 32820tggctcttca aatgggactt acaagcccca ctccaaaacc agtggatgcg
gccgaggtct 32880ccgccctcgt caatgactgg gcggggctgg gaatgtggtg gttcgccata
ggcatgatgg 32940cgctctgcct gcttctgctc tggctcatct gctgcctcca ccgcaggcga
gccagacccc 33000ccatctatag acccatcatt gtcctgaacc ccgataatga tgggatccat
agattggatg 33060gcctgaaaaa cctacttttt tcttttacag tatgataaat tgagacatgc
ctcgcatttt 33120cttgtacatg ttccttctcc caccttttct ggggtgttct acgctggccg
ctgtgtctca 33180cctggaggta gactgcctct cacccttcac tgtctacctg ctttacggat
tggtcaccct 33240cactctcatc tgcagcctaa tcacagtaat catcgccttc atccagtgca
ttgattacat 33300ctgtgtgcgc ctcgcatact tcagacacca cccgcagtac cgagacagga
acattgccca 33360acttctaaga ctgctctaat catgcataag actgtgatct gccttctgat
cctctgcatc 33420ctgcccaccc tcacctcctg ccagtacacc acaaaatctc cgcgcaaaag
acatgcctcc 33480tgccgcttca cccaactgtg gaatataccc aaatgctaca acgaaaagag
cgagctctcc 33540gaagcttggc tgtatggggt catctgtgtc ttagttttct gcagcactgt
ctttgccctc 33600ataatctacc cctactttga tttgggatgg aacgcgatcg atgccatgaa
ttaccccacc 33660tttcccgcac ccgagataat tccactgcga caagttgtac ccgttgtcgt
taatcaacgc 33720cccccatccc ctacgcccac tgaaatcagc tactttaacc taacaggcgg
agatgactga 33780cgccctagat ctagaaatgg acggcatcag taccgagcag cgtctcctag
agaggcgcag 33840gcaggcggct gagcaagagc gcctcaatca ggagctccga gatctcgtta
acctgcacca 33900gtgcaaaaga ggcatctttt gtctggtaaa gcaggccaaa gtcacctacg
agaagaccgg 33960caacagccac cgcctcagtt acaaattgcc cacccagcgc cagaagctgg
tgctcatggt 34020gggtgagaat cccatcaccg tcacccagca ctcggtagag accgaggggt
gtctgcactc 34080cccctgtcgg ggtccagaag acctctgcac cctggtaaag accctgtgcg
gtctcagaga 34140tttagtcccc tttaactaat caaacactgg aatcaataaa aagaatcact
tacttaaaat 34200cagacagcag gtctctgtcc agtttattca gcagcacctc cttcccctcc
tcccaactct 34260ggtactccaa acgccttctg gcggcaaact tcctccacac cctgaaggga
atgtcagatt 34320cttgctcctg tccctccgca cccactatct tcatgttgtt gcagatgaag
cgcaccaaaa 34380cgtctgacga gagcttcaac cccgtgtacc cctatgacac ggaaagcggc
cctccctccg 34440tccctttcct cacccctccc ttcgtgtctc ccgatggatt ccaagaaagt
ccccccgggg 34500tcctgtctct gaacctggcc gagcccctgg tcacttccca cggcatgctc
gccctgaaaa 34560tgggaagtgg cctctccctg gacgacgctg gcaacctcac ctctcaagat
atcaccaccg 34620ctagccctcc cctcaaaaaa accaagacca acctcagcct agaaacctca
tcccccctaa 34680ctgtgagcac ctcaggcgcc ctcaccgtag cagccgccgc tcccctggcg
gtggccggca 34740cctccctcac catgcaatca gaggcccccc tgacagtaca ggatgcaaaa
ctcaccctgg 34800ccaccaaagg ccccctgacc gtgtctgaag gcaaactggc cttgcaaaca
tcggccccgc 34860tgacggccgc tgacagcagc accctcacag tcagtgccac accacccctt
agcacaagca 34920atggcagctt gggtattgac atgcaagccc ccatttacac caccaatgga
aaactaggac 34980ttaactttgg cgctcccctg catgtggtag acagcctaaa tgcactgact
gtagttactg 35040gccaaggtct tacgataaac ggaacagccc tacaaactag agtctcaggt
gccctcaact 35100atgacacatc aggaaaccta gaattgagag ctgcaggggg tatgcgagtt
gatgcaaatg 35160gtcaacttat ccttgatgta gcttacccat ttgatgcaca aaacaatctc
agccttaggc 35220ttggacaggg acccctgttt gttaactctg cccacaactt ggatgttaac
tacaacagag 35280gcctctacct gttcacatct ggaaatacca aaaagctaga agttaatatc
aaaacagcca 35340agggtctcat ttatgatgac actgctatag caatcaatgc gggtgatggg
ctacagtttg 35400actcaggctc agatacaaat ccattaaaaa ctaaacttgg attaggactg
gattatgact 35460ccagcagagc cataattgct aaactgggaa ctggcctaag ctttgacaac
acaggtgcca 35520tcacagtagg caacaaaaat gatgacaagc ttaccttgtg gaccacacca
gacccatccc 35580ctaactgtag aatctattca gagaaagatg ctaaattcac acttgttttg
actaaatgcg 35640gcagtcaggt gttggccagc gtttctgttt tatctgtaaa aggtagcctt
gcgcccatca 35700gtggcacagt aactagtgct cagattgtcc tcagatttga tgaaaatgga
gttctactaa 35760gcaattcttc ccttgaccct caatactgga actacagaaa aggtgacctt
acagagggca 35820ctgcatatac caacgcagtg ggatttatgc ccaacctcac agcataccca
aaaacacaga 35880gccaaactgc taaaagcaac attgtaagtc aggtttactt gaatggggac
aaatccaaac 35940ccatgaccct caccattacc ctcaatggaa ctaatgaaac aggagatgcc
acagtaagca 36000cttactccat gtcattctca tggaactgga atggaagtaa ttacattaat
gaaacgttcc 36060aaaccaactc cttcaccttc tcctacatcg cccaagaata aaaagcatga
cgctgttgat 36120ttgattcaat gtgtttctgt tttattttca agcacaacaa aatcattcaa
gtcattcttc 36180catcttagct taatagacac agtagcttaa tagacccagt agtgcaaagc
cccattctag 36240cttataacta gtggagaagt actcgcctac atgggggtag agtcataatc
gtgcatcagg 36300atagggcggt ggtgctgcag cagcgcgcga ataaactgct gccgccgccg
ctccgtcctg 36360caggaataca acatggcagt ggtctcctca gcgatgattc gcaccgcccg
cagcataagg 36420cgccttgtcc tccgggcaca gcagcgcacc ctgatctcac ttaaatcagc
acagtaactg 36480cagcacagca ccacaatatt gttcaaaatc ccacagtgca aggcgctgta
tccaaagctc 36540atggcgggga ccacagaacc cacgtggcca tcataccaca agcgcaggta
gattaagtgg 36600cgacccctca taaacacgct ggacataaac attacctctt ttggcatgtt
gtaattcacc 36660acctcccggt accatataaa cctctgatta aacatggcgc catccaccac
catcctaaac 36720cagctggcca aaacctgccc gccggctata cactgcaggg aaccgggact
ggaacaatga 36780cagtggagag cccaggactc gtaaccatgg atcatcatgc tcgtcatgat
atcaatgttg 36840gcacaacaca ggcacacgtg catacacttc ctcaggatta caagctcctc
ccgcgttaga 36900accatatccc agggaacaac ccattcctga atcagcgtaa atcccacact
gcagggaaga 36960cctcgcacgt aactcacgtt gtgcattgtc aaagtgttac attcgggcag
cagcggatga 37020tcctccagta tggtagcgcg ggtttctgtc tcaaaaggag gtagacgatc
cctactgtac 37080ggagtgcgcc gagacaaccg agatcgtgtt ggtcgtagtg tcatgccaaa
tggaacgccg 37140gacgtagtca tatttcctga agtcttagat ctctcaacgc agcaccagca
ccaacacttc 37200gcagtgtaaa aggccaagtg ccgagagagt atatatagga ataaaaagtg
acgtaaacgg 37260gcaaagtcca aaaaacgccc agaaaaaccg cacgcgaacc tacgccccga
aacgaaagcc 37320aaaaaacact agacactccc ttccggcgtc aacttccgct ttcccacgct
acgtcacttg 37380ccccagtcaa acaaactaca tatcccgaac ttccaagtcg ccacgcccaa
aacaccgcct 37440acacctcccc gcccgccggc ccgcccccaa acccgcctcc cgccccgcgc
cccgccccgc 37500gccgcccatc tcattatcat attggcttca atccaaaata aggtatatta
ttgatgatg 37559
SEQ ID NO: 12; length: 1109; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic polynucleotide"
ggagttccgc
gttacataac ttacggtaaa tggcccgcct ggctgaccgc ccaacgaccc 60ccgcccattg
acgtcaataa tgacgtatgt tcccatagta acgccaatag ggactttcca 120ttgacgtcaa
tgggtggagt atttacggta aactgcccac ttggcagtac atcaagtgta 180tcatatgcca
agtacgcccc ctattgacgt caatgacggt aaatggcccg cctggcatta 240tgcccagtac
atgaccttat gggactttcc tacttggcag tacatctacg tattagtcat 300cgctattacc
atggtcgagg tgagccccac gttctgcttc actctcccca tctccccccc 360ctccccaccc
ccaattttgt atttatttat tttttaatta ttttgtgcag cgatgggggc 420gggggggggg
gggggcgcgc gccaggcggg gcggggcggg gcgaggggcg gggcggggcg 480aggcggagag
gtgcggcggc agccaatcag agcggcgcgc tccgaaagtt tccttttatg 540gcgaggcggc
ggcggcggcg gccctataaa aagcgaagcg ctccctatca gtgatagaga 600tctccctatc
agtgatagag atcgtcgacg agctcgcggc gggcgggagt cgctgcgcgc 660tgccttcgcc
ccgtgccccg ctccgccgcc gcctcgcgcc gcccgccccg gctctgactg 720accgcgttac
taaaacaggt aagtccggcc tccgcgccgg gttttggcgc ctcccgcggg 780cgcccccctc
ctcacggcga gcgctgccac gtcagacgaa gggcgcagcg agcgtcctga 840tccttccgcc
cggacgctca ggacagcggc ccgctgctca taagactcgg ccttagaacc 900ccagtatcag
cagaaggaca ttttaggacg ggacttgggt gactctaggg cactggtttt 960ctttccagag
agcggaacag gcgaggaaaa gtagtccctt ctcggcgatt ctgcggaggg 1020atctccgtgg
ggcggtgaac gccgatgatg cctctactaa ccatgttcat gttttctttt 1080tttttctaca
ggtcctgggt gacgaacag
1109
SEQ ID NO: 13; length: 33; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
atacggacta gtggagaagt actcgcctac atg 33
SEQ ID NO: 14; length: 36; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
atacggaaga tctaagactt caggaaatat gactac 36
SEQ ID NO: 15; length: 46; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
attcagtgta caggcgcgcc aaagcatgac gctgttgatt tgattc 46
SEQ ID NO: 16; length: 35; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
actaggacta gttataagct agaatggggc tttgc 35
SEQ ID NO: 17; length: 79; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
ttaatagaca cagtagctta atagacccag tagtgcaaag ccccattcta gcttataacc 60
cctatttgtt tatttttct 79
SEQ ID NO: 18; length: 93; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
atatatactc tctcggcact tggcctttta cactgcgaag tgttggtgct ggtgctgcgt 60
tgagagatct ttatttgtta actgttaatt gtc 93
SEQ ID NO: 19; length: 23; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
ttaatagaca cagtagctta ata 23
SEQ ID NO: 20; length: 20; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
ggaagggagt gtctagtgtt 20
SEQ ID NO: 21; length: 25; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
caatgggcgt ggatagcggt ttgac 25
SEQ ID NO: 22; length: 19; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
cagcatgcct gctattgtc 19
SEQ ID NO: 23; length: 29; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
catctacgta ttagtcatcg ctattacca 29
SEQ ID NO: 24; length: 21; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic primer"
gacttggaaa tccccgtgag t 21
SEQ ID NO: 25; length: 25; type: DNA; organism: Artificial Sequence
source /note="Description of Artificial Sequence Synthetic probe"
acatcaatgg gcgtggatag cggtt 25
SEQ ID NO: 26; length: 592; type: DNA; organism: Woodchuck hepatitis virus
taatcaacct ctggattaca aaatttgtga aagattgact ggtattctta actatgttgc 60
tccttttacg ctatgtggat acgctgcttt aatgcctttg tatcatgcta ttgcttcccg 120
tatggctttc attttctcct ccttgtataa atcctggttg ctgtctcttt atgaggagtt 180
gtggcccgtt gtcaggcaac gtggcgtggt gtgcactgtg tttgctgacg caacccccac 240
tggttggggc attgccacca cctgtcagct cctttccggg actttcgctt tccccctccc 300
tattgccacg gcggaactca tcgccgcctg ccttgcccgc tgctggacag gggctcggct 360
gttgggcact gacaattccg tggtgttgtc ggggaaatca tcgtcctttc cttggctgct 420
cgcctgtgtt gccacctgga ttctgcgcgg gacgtccttc tgctacgtcc cttcggccct 480
caatccagcg gaccttcctt cccgcggcct gctgccggct ctgcggcctc ttccgcgtct 540
tcgccttcgc cctcagacga gtcggatctc cctttgggcc gcctccccgc ct
592
SEQ ID NO: 27; length: 543; type: PRT; organism: Simian adenovirus
Met Lys Arg Thr Lys Thr Ser Asp Glu Ser Phe Asn Pro Val Tyr Pro 16
Tyr Asp Thr Glu Ser Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro 32
Phe Val Ser Pro Asp Gly Phe Gln Glu Ser Pro Pro Gly Val Leu Ser 48
Leu Asn Leu Ala Glu Pro Leu Val Thr Ser His Gly Met Leu Ala Leu 64
Lys Met Gly Ser Gly Leu Ser Leu Asp Asp Ala Gly Asn Leu Thr Ser 80
Gln Asp Ile Thr Thr Ala Ser Pro Pro Leu Lys Lys Thr Lys Thr Asn 96
Leu Ser Leu Glu Thr Ser Ser Pro Leu Thr Val Ser Thr Ser Gly Ala 112
Leu Thr Val Ala Ala Ala Ala Pro Leu Ala Val Ala Gly Thr Ser Leu 128
Thr Met Gln Ser Glu Ala Pro Leu Thr Val Gln Asp Ala Lys Leu Thr 144
Leu Ala Thr Lys Gly Pro Leu Thr Val Ser Glu Gly Lys Leu Ala Leu 160
Gln Thr Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser Thr Leu Thr Val 176
Ser Ala Thr Pro Pro Ile Asn Val Ser Ser Gly Ser Leu Gly Leu Asp 192
Met Glu Asp Pro Met Tyr Thr His Asp Gly Lys Leu Gly Ile Arg Ile 208
Gly Gly Pro Leu Arg Val Val Asp Ser Leu His Thr Leu Thr Val Val 224
Thr Gly Asn Gly Leu Thr Val Asp Asn Asn Ala Leu Gln Thr Arg Val 240
Thr Gly Ala Leu Gly Tyr Asp Thr Ser Gly Asn Leu Gln Leu Arg Ala 256
Ala Gly Gly Met Arg Ile Asp Ala Asn Gly Gln Leu Ile Leu Asn Val 272
Ala Tyr Pro Phe Asp Ala Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln 288
Gly Pro Leu Tyr Ile Asn Thr Asp His Asn Leu Asp Leu Asn Cys Asn 304
Arg Gly Leu Thr Thr Thr Thr Thr Asn Asn Thr Lys Lys Leu Glu Thr 320
Lys Ile Ser Ser Gly Leu Asp Tyr Asp Thr Asn Gly Ala Val Ile Ile 336
Lys Leu Gly Thr Gly Leu Ser Phe Asp Asn Thr Gly Ala Leu Thr Val 352
Gly Asn Thr Gly Asp Asp Lys Leu Thr Leu Trp Thr Thr Pro Asp Pro 368
Ser Pro Asn Cys Arg Ile His Ser Asp Lys Asp Cys Lys Phe Thr Leu 384
Val Leu Thr Lys Cys Gly Ser Gln Ile Leu Ala Ser Val Ala Ala Leu 400
Ala Val Ser Gly Asn Leu Ala Ser Ile Thr Gly Thr Val Ala Ser Val 416
Thr Ile Phe Leu Arg Phe Asp Gln Asn Gly Val Leu Met Glu Asn Ser 432
Ser Leu Asp Arg Gln Tyr Trp Asn Phe Arg Asn Gly Asn Ser Thr Asn 448
Ala Ala Pro Tyr Thr Asn Ala Val Gly Phe Met Pro Asn Leu Ala Ala 464
Tyr Pro Lys Thr Gln Ser Gln Thr Ala Lys Asn Asn Ile Val Ser Gln 480
Val Tyr Leu Asn Gly Asp Lys Ser Lys Pro Met Thr Leu Thr Ile Thr 496
Leu Asn Gly Thr Asn Glu Ser Ser Glu Thr Ser Gln Val Ser His Tyr 512
Ser Met Ser Phe Thr Trp Ala Trp Glu Ser Gly Gln Tyr Ala Thr Glu 528
Thr Phe Ala Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala Glu Gln 543
SEQ ID NO: 28; length: 541; type: PRT; organism: Simian adenovirus
Met Lys Arg Ala
Lys Thr Ser Asp Glu Thr Phe Asn Pro Val Tyr Pro1 5
10 15Tyr Asp Thr Glu Asn Gly Pro Pro Ser Val
Pro Phe Leu Thr Pro Pro 20 25
30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser Pro Pro Gly Val Leu Ser
35 40 45Leu Arg Leu Ser Glu Pro Leu Val
Thr Ser His Gly Met Leu Ala Leu 50 55
60Lys Met Gly Asn Gly Leu Ser Leu Asp Asp Ala Gly Asn Leu Thr Ser65
70 75 80Gln Asp Val Thr Thr
Val Thr Pro Pro Leu Lys Lys Thr Lys Thr Asn 85
90 95Leu Ser Leu Gln Thr Ser Ala Pro Leu Thr Val
Ser Ser Gly Ser Leu 100 105
110Thr Val Ala Ala Ala Ala Pro Leu Ala Val Ala Gly Thr Ser Leu Thr
115 120 125Met Gln Ser Gln Ala Pro Leu
Thr Val Gln Asp Ala Lys Leu Gly Leu 130 135
140Ala Thr Gln Gly Pro Leu Thr Val Ser Glu Gly Lys Leu Thr Leu
Gln145 150 155 160Thr Ser
Ala Pro Leu Thr Ala Ala Asp Ser Ser Thr Leu Thr Val Gly
165 170 175Thr Thr Pro Pro Ile Ser Val
Ser Ser Gly Ser Leu Gly Leu Asp Met 180 185
190Glu Asp Pro Met Tyr Thr His Asp Gly Lys Leu Gly Ile Arg
Ile Gly 195 200 205Gly Pro Leu Gln
Val Val Asp Ser Leu His Thr Leu Thr Val Val Thr 210
215 220Gly Asn Gly Ile Thr Val Ala Asn Asn Ala Leu Gln
Thr Lys Val Ala225 230 235
240Gly Ala Leu Gly Tyr Asp Ser Ser Gly Asn Leu Glu Leu Arg Ala Ala
245 250 255Gly Gly Met Arg Ile
Asn Thr Gly Gly Gln Leu Ile Leu Asp Val Ala 260
265 270Tyr Pro Phe Asp Ala Gln Asn Asn Leu Ser Leu Arg
Leu Gly Gln Gly 275 280 285Pro Leu
Tyr Val Asn Thr Asn His Asn Leu Asp Leu Asn Cys Asn Arg 290
295 300Gly Leu Thr Thr Thr Thr Ser Ser Asn Thr Thr
Lys Leu Glu Thr Lys305 310 315
320Ile Asp Ser Gly Leu Asp Tyr Asn Ala Asn Gly Ala Ile Ile Ala Lys
325 330 335Leu Gly Thr Gly
Leu Thr Phe Asp Asn Thr Gly Ala Ile Thr Val Gly 340
345 350Asn Thr Gly Asp Asp Lys Leu Thr Leu Trp Thr
Thr Pro Asp Pro Ser 355 360 365Pro
Asn Cys Arg Ile His Ala Asp Lys Asp Lys Phe Thr Leu Val Leu 370
375 380Thr Lys Cys Gly Ser Gln Ile Leu Ala Ser
Val Ala Ala Leu Ala Val385 390 395
400Ser Gly Asn Leu Ser Ser Met Thr Gly Thr Val Ser Ser Val Thr
Ile 405 410 415Phe Leu Arg
Phe Asp Gln Asn Gly Val Leu Met Glu Asn Ser Ser Leu 420
425 430Asp Lys Glu Tyr Trp Asn Phe Arg Asn Gly
Asn Ser Thr Asn Ala Thr 435 440
445Pro Tyr Thr Asn Ala Val Gly Phe Met Pro Asn Leu Ser Ala Tyr Pro 450
455 460Lys Thr Gln Ser Gln Thr Ala Lys
Asn Asn Ile Val Ser Glu Val Tyr465 470
475 480Leu His Gly Asp Lys Ser Lys Pro Met Ile Leu Thr
Ile Thr Leu Asn 485 490
495Gly Thr Asn Glu Ser Ser Glu Thr Ser Gln Val Ser His Tyr Ser Met
500 505 510Ser Phe Thr Trp Ser Trp
Asp Ser Gly Lys Tyr Ala Thr Glu Thr Phe 515 520
525Ala Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala Glu Gln
530 535 540
SEQ ID NO: 29, length 543, PRT, Simian adenovirus
Met Lys Arg Thr Lys Thr Ser Asp Glu Ser Phe Asn Pro Val Tyr Pro1
5 10 15Tyr Asp Thr Glu Ser Gly
Pro Pro Ser Val Pro Phe Leu Thr Pro Pro 20 25
30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser Pro Pro Gly
Val Leu Ser 35 40 45Leu Asn Leu
Ala Glu Pro Leu Val Thr Ser His Gly Met Leu Ala Leu 50
55 60Lys Met Gly Ser Gly Leu Ser Leu Asp Asp Ala Gly
Asn Leu Thr Ser65 70 75
80Gln Asp Ile Thr Ser Thr Thr Pro Pro Leu Lys Lys Thr Lys Thr Asn
85 90 95Leu Ser Leu Glu Thr Ser
Ser Pro Leu Thr Val Ser Thr Ser Gly Ala 100
105 110Leu Thr Val Ala Ala Ala Ala Pro Leu Ala Val Ala
Gly Thr Ser Leu 115 120 125Thr Met
Gln Ser Glu Ala Pro Leu Ala Val Gln Asp Ala Lys Leu Thr 130
135 140Leu Ala Thr Lys Gly Pro Leu Thr Val Ser Glu
Gly Lys Leu Ala Leu145 150 155
160Gln Thr Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser Thr Leu Thr Val
165 170 175Ser Ser Thr Pro
Pro Ile Ser Val Ser Ser Gly Ser Leu Gly Leu Asp 180
185 190Met Glu Asp Pro Met Tyr Thr His Asp Gly Lys
Leu Gly Ile Arg Ile 195 200 205Gly
Gly Pro Leu Arg Val Val Asp Ser Leu His Thr Leu Thr Val Val 210
215 220Thr Gly Asn Gly Leu Thr Val Asp Asn Asn
Ala Leu Gln Thr Arg Val225 230 235
240Thr Gly Ala Leu Gly Tyr Asp Thr Ser Gly Asn Leu Gln Leu Arg
Ala 245 250 255Ala Gly Gly
Met Arg Ile Asp Ala Asn Gly Gln Leu Ile Leu Asp Val 260
265 270Ala Tyr Pro Phe Asp Ala Gln Asn Asn Leu
Ser Leu Arg Leu Gly Gln 275 280
285Gly Pro Leu Tyr Val Asn Thr Asp His Asn Leu Asp Leu Asn Cys Asn 290
295 300Arg Gly Leu Thr Thr Thr Thr Thr
Asn Asn Thr Lys Lys Leu Glu Thr305 310
315 320Lys Ile Ser Ser Gly Leu Asp Tyr Asp Thr Asn Gly
Ala Val Ile Ile 325 330
335Lys Leu Gly Thr Gly Leu Ser Phe Asp Asn Thr Gly Ala Leu Thr Val
340 345 350Gly Asn Thr Gly Asp Asp
Lys Leu Thr Leu Trp Thr Thr Pro Asp Pro 355 360
365Ser Pro Asn Cys Arg Ile His Ser Asp Lys Asp Cys Lys Phe
Thr Leu 370 375 380Val Leu Thr Lys Cys
Gly Ser Gln Ile Leu Ala Ser Val Ala Ala Leu385 390
395 400Ala Val Ser Gly Asn Leu Ala Ser Ile Thr
Gly Thr Val Ala Ser Val 405 410
415Thr Ile Phe Leu Arg Phe Asp Gln Asn Gly Val Leu Met Glu Asn Ser
420 425 430Ser Leu Asp Lys Gln
Tyr Trp Asn Phe Arg Asn Gly Asn Ser Thr Asn 435
440 445Ala Ala Pro Tyr Thr Asn Ala Val Gly Phe Met Pro
Asn Leu Ala Ala 450 455 460Tyr Pro Lys
Thr Gln Ser Gln Thr Ala Lys Asn Asn Ile Val Ser Gln465
470 475 480Val Tyr Leu Asn Gly Asp Lys
Ser Lys Pro Met Thr Leu Thr Ile Thr 485
490 495Leu Asn Gly Thr Asn Glu Ser Ser Glu Thr Ser Gln
Val Ser His Tyr 500 505 510Ser
Met Ser Phe Thr Trp Ala Trp Glu Ser Gly Gln Tyr Ala Thr Glu 515
520 525Thr Phe Ala Thr Asn Ser Phe Thr Phe
Ser Tyr Ile Ala Glu Gln 530 535
540
SEQ ID NO: 30, length 543, PRT, Simian adenovirus
Met Lys Arg Thr Lys Thr Ser Asp Lys Ser
Phe Asn Pro Val Tyr Pro1 5 10
15Tyr Asp Thr Glu Asn Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro
20 25 30Phe Val Ser Pro Asp Gly
Phe Gln Glu Ser Pro Pro Gly Val Leu Ser 35 40
45Leu Asn Leu Ala Glu Pro Leu Val Thr Ser His Gly Met Leu
Ala Leu 50 55 60Lys Met Gly Ser Gly
Leu Ser Leu Asp Asp Ala Gly Asn Leu Thr Ser65 70
75 80Gln Asp Val Thr Thr Thr Thr Pro Pro Leu
Lys Lys Thr Lys Thr Asn 85 90
95Leu Ser Leu Glu Thr Ser Ala Pro Leu Thr Val Ser Thr Ser Gly Ala
100 105 110Leu Thr Leu Ala Ala
Ala Ala Pro Leu Ala Val Ala Gly Thr Ser Leu 115
120 125Thr Met Gln Ser Glu Ala Pro Leu Thr Val Gln Asp
Ala Lys Leu Thr 130 135 140Leu Ala Thr
Lys Gly Pro Leu Thr Val Ser Glu Gly Lys Leu Ala Leu145
150 155 160Gln Thr Ser Ala Pro Leu Thr
Ala Ala Asp Ser Ser Thr Leu Thr Val 165
170 175Ser Ala Thr Pro Pro Ile Ser Val Ser Ser Gly Ser
Leu Gly Leu Asp 180 185 190Met
Glu Asp Pro Met Tyr Thr His Asp Gly Lys Leu Gly Ile Arg Ile 195
200 205Gly Gly Pro Leu Arg Val Val Asp Ser
Leu His Thr Leu Thr Val Val 210 215
220Thr Gly Asn Gly Ile Ala Val Asp Asn Asn Ala Leu Gln Thr Arg Val225
230 235 240Thr Gly Ala Leu
Gly Tyr Asp Thr Ser Gly Asn Leu Gln Leu Arg Ala 245
250 255Ala Gly Gly Met Arg Ile Asp Ala Asn Gly
Gln Leu Ile Leu Asp Val 260 265
270Ala Tyr Pro Phe Asp Ala Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln
275 280 285Gly Pro Leu Tyr Val Asn Thr
Asp His Asn Leu Asp Leu Asn Cys Asn 290 295
300Arg Gly Leu Thr Thr Thr Thr Thr Asn Asn Thr Lys Lys Leu Glu
Thr305 310 315 320Lys Ile
Gly Ser Gly Leu Asp Tyr Asp Thr Asn Gly Ala Val Ile Ile
325 330 335Lys Leu Gly Thr Gly Val Ser
Phe Asp Ser Thr Gly Ala Leu Ser Val 340 345
350Gly Asn Thr Gly Asp Asp Lys Leu Thr Leu Trp Thr Thr Pro
Asp Pro 355 360 365Ser Pro Asn Cys
Arg Ile His Ser Asp Lys Asp Cys Lys Phe Thr Leu 370
375 380Val Leu Thr Lys Cys Gly Ser Gln Ile Leu Ala Ser
Val Ala Ala Leu385 390 395
400Ala Val Ser Gly Asn Leu Ala Ser Ile Thr Gly Thr Val Ser Ser Val
405 410 415Thr Ile Phe Leu Arg
Phe Asp Gln Asn Gly Val Leu Met Glu Asn Ser 420
425 430Ser Leu Asp Lys Gln Tyr Trp Asn Phe Arg Asn Gly
Asn Ser Thr Asn 435 440 445Ala Thr
Pro Tyr Thr Asn Ala Val Gly Phe Met Pro Asn Leu Ala Ala 450
455 460Tyr Pro Lys Thr Gln Ser Gln Thr Ala Lys Asn
Asn Ile Val Ser Gln465 470 475
480Val Tyr Leu Asn Gly Asp Lys Ser Lys Pro Met Thr Leu Thr Ile Thr
485 490 495Leu Asn Gly Thr
Asn Glu Ser Ser Glu Thr Ser Gln Val Ser His Tyr 500
505 510Ser Met Ser Phe Thr Trp Ala Trp Glu Ser Gly
Gln Tyr Ala Thr Glu 515 520 525Thr
Phe Ala Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala Glu Gln 530
535 540
SEQ ID NO: 31, length 543, PRT, Simian adenovirus
Met Lys Arg Thr
Lys Thr Ser Asp Glu Ser Phe Asn Pro Val Tyr Pro1 5
10 15Tyr Asp Thr Glu Asn Gly Pro Pro Ser Val
Pro Phe Leu Thr Pro Pro 20 25
30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser Pro Pro Gly Val Leu Ser
35 40 45Leu Asn Leu Ala Glu Pro Leu Val
Thr Ser His Gly Met Leu Ala Leu 50 55
60Lys Met Gly Ser Gly Leu Ser Leu Asp Asp Ala Gly Asn Leu Thr Ser65
70 75 80Gln Asp Val Thr Thr
Thr Thr Pro Pro Leu Lys Lys Thr Lys Thr Asn 85
90 95Leu Ser Leu Glu Thr Ser Ala Pro Leu Thr Val
Ser Thr Ser Gly Ala 100 105
110Leu Thr Leu Ala Ala Ala Ala Pro Leu Ala Val Ala Gly Thr Ser Leu
115 120 125Thr Met Gln Ser Glu Ala Pro
Leu Thr Val Gln Asp Ala Lys Leu Thr 130 135
140Leu Ala Thr Lys Gly Pro Leu Thr Val Ser Glu Gly Lys Leu Ala
Leu145 150 155 160Gln Thr
Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser Thr Leu Thr Val
165 170 175Ser Ala Thr Pro Pro Ile Asn
Val Ser Ser Gly Ser Leu Gly Leu Asp 180 185
190Met Glu Asn Pro Met Tyr Thr His Asp Gly Lys Leu Gly Ile
Arg Ile 195 200 205Gly Gly Pro Leu
Arg Val Val Asp Ser Leu His Thr Leu Thr Val Val 210
215 220Thr Gly Asn Gly Ile Ala Val Asp Asn Asn Ala Leu
Gln Thr Arg Val225 230 235
240Thr Gly Ala Leu Gly Tyr Asp Thr Ser Gly Asn Leu Gln Leu Arg Ala
245 250 255Ala Gly Gly Met Arg
Ile Asp Ala Asn Gly Gln Leu Ile Leu Asp Val 260
265 270Ala Tyr Pro Phe Asp Ala Gln Asn Asn Leu Ser Leu
Arg Leu Gly Gln 275 280 285Gly Pro
Leu Tyr Val Asn Thr Asp His Asn Leu Asp Leu Asn Cys Asn 290
295 300Arg Gly Leu Thr Thr Thr Thr Thr Asn Asn Thr
Lys Lys Leu Glu Thr305 310 315
320Lys Ile Gly Ser Gly Leu Asp Tyr Asp Thr Asn Gly Ala Val Ile Ile
325 330 335Lys Leu Gly Thr
Gly Val Ser Phe Asp Ser Thr Gly Ala Leu Ser Val 340
345 350Gly Asn Thr Gly Asp Asp Lys Leu Thr Leu Trp
Thr Thr Pro Asp Pro 355 360 365Ser
Pro Asn Cys Arg Ile His Ser Asp Lys Asp Cys Lys Phe Thr Leu 370
375 380Val Leu Thr Lys Cys Gly Ser Gln Ile Leu
Ala Ser Val Ala Ala Leu385 390 395
400Ala Val Ser Gly Asn Leu Ala Ser Ile Thr Gly Thr Val Ser Ser
Val 405 410 415Thr Ile Phe
Leu Arg Phe Asp Gln Asn Gly Val Leu Met Glu Asn Ser 420
425 430Ser Leu Asp Lys Gln Tyr Trp Asn Phe Arg
Asn Gly Asn Ser Thr Asn 435 440
445Ala Thr Pro Tyr Thr Asn Ala Val Gly Phe Met Pro Asn Leu Ala Ala 450
455 460Tyr Pro Lys Thr Gln Ser Gln Thr
Ala Lys Asn Asn Ile Val Ser Gln465 470
475 480Val Tyr Leu Asn Gly Asp Lys Ser Lys Pro Met Ile
Leu Thr Ile Thr 485 490
495Leu Asn Gly Thr Asn Glu Ser Ser Glu Thr Ser Gln Val Ser His Tyr
500 505 510Ser Met Ser Phe Thr Trp
Ala Trp Glu Ser Gly Gln Tyr Ala Thr Glu 515 520
525Thr Phe Ala Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala Glu
Gln 530 535 540
SEQ ID NO: 32, length 578, PRT, Simian adenovirus
Met Lys Arg Thr Lys Thr Ser Asp Glu Ser Phe Asn Pro Val Tyr
Pro1 5 10 15Tyr Asp Thr
Glu Asn Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro 20
25 30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser
Pro Pro Gly Val Leu Ser 35 40
45Leu Asn Leu Ala Glu Pro Leu Val Thr Ser His Gly Met Leu Ala Leu 50
55 60Lys Met Gly Ser Gly Leu Ser Leu Asp
Asp Ala Gly Asn Leu Thr Ser65 70 75
80Gln Asp Val Thr Thr Thr Thr Pro Pro Leu Lys Lys Thr Lys
Thr Asn 85 90 95Leu Ser
Leu Glu Thr Ser Ala Pro Leu Thr Val Ser Thr Ser Gly Ala 100
105 110Leu Thr Leu Ala Ala Ala Val Pro Leu
Ala Val Ala Gly Thr Ser Leu 115 120
125Thr Met Gln Ser Glu Ala Pro Leu Thr Val Gln Asp Ala Lys Leu Thr
130 135 140Leu Ala Thr Lys Gly Pro Leu
Thr Val Ser Glu Gly Lys Leu Ala Leu145 150
155 160Gln Thr Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser
Thr Leu Thr Ile 165 170
175Ser Ala Thr Pro Pro Leu Ser Thr Ser Asn Gly Ser Leu Gly Ile Asp
180 185 190Met Gln Ala Pro Ile Tyr
Thr Thr Asn Gly Lys Leu Gly Leu Asn Phe 195 200
205Gly Ala Pro Leu His Val Val Asp Ser Leu Asn Ala Leu Thr
Val Val 210 215 220Thr Gly Gln Gly Leu
Thr Ile Asn Gly Thr Ala Leu Gln Thr Arg Val225 230
235 240Ser Gly Ala Leu Asn Tyr Asp Ser Ser Gly
Asn Leu Glu Leu Arg Ala 245 250
255Ala Gly Gly Met Arg Val Asp Ala Asn Gly Lys Leu Ile Leu Asp Val
260 265 270Ala Tyr Pro Phe Asp
Ala Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln 275
280 285Gly Pro Leu Phe Val Asn Ser Ala His Asn Leu Asp
Val Asn Tyr Asn 290 295 300Arg Gly Leu
Tyr Leu Phe Thr Ser Gly Asn Thr Lys Lys Leu Glu Val305
310 315 320Asn Ile Lys Thr Ala Lys Gly
Leu Ile Tyr Asp Asp Thr Ala Ile Ala 325
330 335Ile Asn Pro Gly Asp Gly Leu Glu Phe Gly Ser Gly
Ser Asp Thr Asn 340 345 350Pro
Leu Lys Thr Lys Leu Gly Leu Gly Leu Glu Tyr Asp Ser Ser Arg 355
360 365Ala Ile Ile Ala Lys Leu Gly Thr Gly
Leu Ser Phe Asp Asn Thr Gly 370 375
380Ala Ile Thr Val Gly Asn Lys Asn Asp Asp Lys Leu Thr Leu Trp Thr385
390 395 400Thr Pro Asp Pro
Ser Pro Asn Cys Arg Ile Tyr Ser Glu Lys Asp Ala 405
410 415Lys Phe Thr Leu Val Leu Thr Lys Cys Gly
Ser Gln Val Leu Ala Ser 420 425
430Val Ser Val Leu Ser Val Lys Gly Ser Leu Ala Pro Ile Ser Gly Thr
435 440 445Val Thr Ser Ala Gln Ile Ile
Leu Arg Phe Asp Glu Asn Gly Val Leu 450 455
460Leu Ser Asn Ser Ser Leu Asp Pro Gln Tyr Trp Asn Tyr Arg Lys
Gly465 470 475 480Asp Leu
Thr Glu Gly Thr Ala Tyr Thr Asn Ala Val Gly Phe Met Pro
485 490 495Asn Leu Thr Ala Tyr Pro Lys
Thr Gln Ser Gln Thr Ala Lys Ser Asn 500 505
510Ile Val Ser Gln Val Tyr Leu Asn Gly Asp Lys Ser Lys Pro
Met Ile 515 520 525Leu Thr Ile Thr
Leu Asn Gly Thr Asn Glu Thr Gly Asp Ala Thr Val 530
535 540Ser Thr Tyr Ser Met Ser Phe Ser Trp Asn Trp Asn
Gly Ser Asn Tyr545 550 555
560Ile Asn Glu Thr Phe Gln Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala
565 570 575Gln Glu
SEQ ID NO: 33, length 578, PRT, Simian adenovirus
Met Lys Arg Thr Lys Thr Ser Asp Glu Ser Phe Asn Pro Val Tyr
Pro1 5 10 15Tyr Asp Thr
Glu Ser Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro 20
25 30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser
Pro Pro Gly Val Leu Ser 35 40
45Leu Asn Leu Ala Glu Pro Leu Val Thr Ser His Gly Met Leu Ala Leu 50
55 60Lys Met Gly Ser Gly Leu Ser Leu Asp
Asp Ala Gly Asn Leu Thr Ser65 70 75
80Gln Asp Ile Thr Thr Ala Ser Pro Pro Leu Lys Lys Thr Lys
Thr Asn 85 90 95Leu Ser
Leu Glu Thr Ser Ser Pro Leu Thr Val Ser Thr Ser Gly Ala 100
105 110Leu Thr Val Ala Ala Ala Ala Pro Leu
Ala Val Ala Gly Thr Ser Leu 115 120
125Thr Met Gln Ser Glu Ala Pro Leu Thr Val Gln Asp Ala Lys Leu Thr
130 135 140Leu Ala Thr Lys Gly Pro Leu
Thr Val Ser Glu Gly Lys Leu Ala Leu145 150
155 160Gln Thr Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser
Thr Leu Thr Val 165 170
175Ser Ala Thr Pro Pro Leu Ser Thr Ser Asn Gly Ser Leu Gly Ile Asp
180 185 190Met Gln Ala Pro Ile Tyr
Thr Thr Asn Gly Lys Leu Gly Leu Asn Phe 195 200
205Gly Ala Pro Leu His Val Val Asp Ser Leu Asn Ala Leu Thr
Val Val 210 215 220Thr Gly Gln Gly Leu
Thr Ile Asn Gly Thr Ala Leu Gln Thr Arg Val225 230
235 240Ser Gly Ala Leu Asn Tyr Asp Thr Ser Gly
Asn Leu Glu Leu Arg Ala 245 250
255Ala Gly Gly Met Arg Val Asp Ala Asn Gly Gln Leu Ile Leu Asp Val
260 265 270Ala Tyr Pro Phe Asp
Ala Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln 275
280 285Gly Pro Leu Phe Val Asn Ser Ala His Asn Leu Asp
Val Asn Tyr Asn 290 295 300Arg Gly Leu
Tyr Leu Phe Thr Ser Gly Asn Thr Lys Lys Leu Glu Val305
310 315 320Asn Ile Lys Thr Ala Lys Gly
Leu Ile Tyr Asp Asp Thr Ala Ile Ala 325
330 335Ile Asn Ala Gly Asp Gly Leu Gln Phe Asp Ser Gly
Ser Asp Thr Asn 340 345 350Pro
Leu Lys Thr Lys Leu Gly Leu Gly Leu Asp Tyr Asp Ser Ser Arg 355
360 365Ala Ile Ile Ala Lys Leu Gly Thr Gly
Leu Ser Phe Asp Asn Thr Gly 370 375
380Ala Ile Thr Val Gly Asn Lys Asn Asp Asp Lys Leu Thr Leu Trp Thr385
390 395 400Thr Pro Asp Pro
Ser Pro Asn Cys Arg Ile Tyr Ser Glu Lys Asp Ala 405
410 415Lys Phe Thr Leu Val Leu Thr Lys Cys Gly
Ser Gln Val Leu Ala Ser 420 425
430Val Ser Val Leu Ser Val Lys Gly Ser Leu Ala Pro Ile Ser Gly Thr
435 440 445Val Thr Ser Ala Gln Ile Val
Leu Arg Phe Asp Glu Asn Gly Val Leu 450 455
460Leu Ser Asn Ser Ser Leu Asp Pro Gln Tyr Trp Asn Tyr Arg Lys
Gly465 470 475 480Asp Leu
Thr Glu Gly Thr Ala Tyr Thr Asn Ala Val Gly Phe Met Pro
485 490 495Asn Leu Thr Ala Tyr Pro Lys
Thr Gln Ser Gln Thr Ala Lys Ser Asn 500 505
510Ile Val Ser Gln Val Tyr Leu Asn Gly Asp Lys Ser Lys Pro
Met Thr 515 520 525Leu Thr Ile Thr
Leu Asn Gly Thr Asn Glu Thr Gly Asp Ala Thr Val 530
535 540Ser Thr Tyr Ser Met Ser Phe Ser Trp Asn Trp Asn
Gly Ser Asn Tyr545 550 555
560Ile Asn Glu Thr Phe Gln Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala
565 570 575Gln Glu
SEQ ID NO: 34, length 578, PRT, Simian adenovirus
Met Lys Arg Thr Lys Thr Ser Asp Glu Ser Phe Asn Pro Val Tyr
Pro1 5 10 15Tyr Asp Thr
Glu Ser Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro 20
25 30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser
Pro Pro Gly Val Leu Ser 35 40
45Leu Asn Leu Ala Glu Pro Leu Val Thr Ser His Gly Met Leu Ala Leu 50
55 60Lys Met Gly Ser Gly Leu Ser Leu Asp
Asp Ala Gly Asn Leu Thr Ser65 70 75
80Gln Asp Ile Thr Thr Ala Ser Pro Pro Leu Lys Lys Thr Lys
Thr Asn 85 90 95Leu Ser
Leu Glu Thr Ser Ser Pro Leu Thr Val Ser Thr Ser Gly Ala 100
105 110Leu Thr Val Ala Ala Ala Ala Pro Leu
Ala Val Ala Gly Thr Ser Leu 115 120
125Thr Met Gln Ser Glu Ala Pro Leu Thr Val Gln Asp Ala Lys Leu Thr
130 135 140Leu Ala Thr Lys Gly Pro Leu
Thr Val Ser Glu Gly Lys Leu Ala Leu145 150
155 160Gln Thr Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser
Thr Leu Thr Val 165 170
175Ser Ala Thr Pro Pro Leu Ser Thr Ser Asn Gly Ser Leu Gly Ile Asp
180 185 190Met Gln Ala Pro Ile Tyr
Thr Thr Asn Gly Lys Leu Gly Leu Asn Phe 195 200
205Gly Ala Pro Leu His Val Val Asp Ser Leu Asn Ala Leu Thr
Val Val 210 215 220Thr Gly Gln Gly Leu
Thr Ile Asn Gly Thr Ala Leu Gln Thr Arg Val225 230
235 240Ser Gly Ala Leu Asn Tyr Asp Thr Ser Gly
Asn Leu Glu Leu Arg Ala 245 250
255Ala Gly Gly Met Arg Val Asp Ala Asn Gly Gln Leu Ile Leu Asp Val
260 265 270Ala Tyr Pro Phe Asp
Ala Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln 275
280 285Gly Pro Leu Phe Val Asn Ser Ala His Asn Leu Asp
Val Asn Tyr Asn 290 295 300Arg Gly Leu
Tyr Leu Phe Thr Ser Gly Asn Thr Lys Lys Leu Glu Val305
310 315 320Asn Ile Lys Thr Ala Lys Gly
Leu Ile Tyr Asp Asp Thr Ala Ile Ala 325
330 335Ile Asn Ala Gly Asp Gly Leu Gln Phe Asp Ser Gly
Ser Asp Thr Asn 340 345 350Pro
Leu Lys Thr Lys Leu Gly Leu Gly Leu Asp Tyr Asp Ser Ser Arg 355
360 365Ala Ile Ile Ala Lys Leu Gly Thr Gly
Leu Ser Phe Asp Asn Thr Gly 370 375
380Ala Ile Thr Val Gly Asn Lys Asn Asp Asp Lys Leu Thr Leu Trp Thr385
390 395 400Thr Pro Asp Pro
Ser Pro Asn Cys Arg Ile Tyr Ser Glu Lys Asp Ala 405
410 415Lys Phe Thr Leu Val Leu Thr Lys Cys Gly
Ser Gln Val Leu Ala Ser 420 425
430Val Ser Val Leu Ser Val Lys Gly Ser Leu Ala Pro Ile Ser Gly Thr
435 440 445Val Thr Ser Ala Gln Ile Val
Leu Arg Phe Asp Glu Asn Gly Val Leu 450 455
460Leu Ser Asn Ser Ser Leu Asp Pro Gln Tyr Trp Asn Tyr Arg Lys
Gly465 470 475 480Asp Leu
Thr Glu Gly Thr Ala Tyr Thr Asn Ala Val Gly Phe Met Pro
485 490 495Asn Leu Thr Ala Tyr Pro Lys
Thr Gln Ser Gln Thr Ala Lys Ser Asn 500 505
510Ile Val Ser Gln Val Tyr Leu Asn Gly Asp Lys Ser Lys Pro
Met Thr 515 520 525Leu Thr Ile Thr
Leu Asn Gly Thr Asn Glu Thr Gly Asp Ala Thr Val 530
535 540Ser Thr Tyr Ser Met Ser Phe Ser Trp Asn Trp Asn
Gly Ser Asn Tyr545 550 555
560Ile Asn Glu Thr Phe Gln Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala
565 570 575Gln Glu
SEQ ID NO: 35, length 577, PRT, Simian adenovirus
Met Lys Arg Ala Lys Thr Ser Asp Glu Thr Phe Asn Pro Val Tyr
Pro1 5 10 15Tyr Asp Thr
Glu Asn Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro 20
25 30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser
Pro Pro Gly Val Leu Ser 35 40
45Leu Arg Leu Ser Glu Pro Leu Val Thr Ser His Gly Met Leu Ala Leu 50
55 60Lys Met Gly Asn Gly Leu Ser Leu Asp
Asp Ala Gly Asn Leu Thr Ser65 70 75
80Gln Asp Val Thr Thr Val Thr Pro Pro Leu Lys Lys Thr Lys
Thr Asn 85 90 95Leu Ser
Leu Gln Thr Ser Ala Pro Leu Thr Val Ser Ser Gly Ser Leu 100
105 110Thr Val Ala Ala Ala Ala Pro Leu Ala
Val Ala Gly Thr Ser Leu Thr 115 120
125Met Gln Ser Gln Ala Pro Leu Thr Val Gln Asp Ala Lys Leu Gly Leu
130 135 140Ala Thr Gln Gly Pro Leu Thr
Val Ser Glu Gly Lys Leu Thr Leu Gln145 150
155 160Thr Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser Thr
Leu Thr Val Ser 165 170
175Ala Thr Pro Pro Leu Ser Thr Ser Asn Gly Ser Leu Ser Ile Asp Met
180 185 190Gln Ala Pro Ile Tyr Thr
Thr Asn Gly Lys Leu Ala Leu Asn Ile Gly 195 200
205Ala Pro Leu His Val Val Asp Thr Leu Asn Ala Leu Thr Val
Val Thr 210 215 220Gly Gln Gly Leu Thr
Ile Asn Gly Arg Ala Leu Gln Thr Arg Val Thr225 230
235 240Gly Ala Leu Ser Tyr Asp Thr Glu Gly Asn
Ile Gln Leu Gln Ala Gly 245 250
255Gly Gly Met Arg Ile Asp Asn Asn Gly Gln Leu Ile Leu Asn Val Ala
260 265 270Tyr Pro Phe Asp Ala
Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln Gly 275
280 285Pro Leu Ile Val Asn Ser Ala His Asn Leu Asp Leu
Asn Leu Asn Arg 290 295 300Gly Leu Tyr
Leu Phe Thr Ser Gly Asn Thr Lys Lys Leu Glu Val Asn305
310 315 320Ile Lys Thr Ala Lys Gly Leu
Phe Tyr Asp Gly Thr Ala Ile Ala Ile 325
330 335Asn Ala Gly Asp Gly Leu Gln Phe Gly Ser Gly Ser
Asp Thr Asn Pro 340 345 350Leu
Gln Thr Lys Leu Gly Leu Gly Leu Glu Tyr Asp Ser Asn Lys Ala 355
360 365Ile Ile Thr Lys Leu Gly Thr Gly Leu
Ser Phe Asp Asn Thr Gly Ala 370 375
380Ile Thr Val Gly Asn Lys Asn Asp Asp Lys Leu Thr Leu Trp Thr Thr385
390 395 400Pro Asp Pro Ser
Pro Asn Cys Arg Ile Asn Ser Glu Lys Asp Ala Lys 405
410 415Leu Thr Leu Val Leu Thr Lys Cys Gly Ser
Gln Val Leu Ala Ser Val 420 425
430Ser Val Leu Ser Val Lys Gly Ser Leu Ala Pro Ile Ser Gly Thr Val
435 440 445Thr Ser Ala Gln Ile Val Leu
Arg Phe Asp Glu Asn Gly Val Leu Leu 450 455
460Ser Asn Ser Ser Leu Asp Pro Gln Tyr Trp Asn Tyr Arg Lys Gly
Asp465 470 475 480Ser Thr
Glu Gly Thr Ala Tyr Thr Asn Ala Val Gly Phe Met Pro Asn
485 490 495Leu Thr Ala Tyr Pro Lys Thr
Gln Ser Gln Thr Ala Lys Ser Asn Ile 500 505
510Val Ser Gln Val Tyr Leu Asn Gly Asp Lys Thr Lys Pro Met
Thr Leu 515 520 525Thr Ile Thr Leu
Asn Gly Thr Asn Glu Thr Gly Asp Ala Thr Val Ser 530
535 540Thr Tyr Ser Met Ser Phe Ser Trp Asn Trp Asn Gly
Ser Asn Tyr Ile545 550 555
560Asn Asp Thr Phe Gln Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala Gln
565 570 575Glu
SEQ ID NO: 36, length 577, PRT, Simian adenovirus
Met Lys Arg Ala Lys Thr Ser Asp Glu Thr Phe Asn Pro Val Tyr
Pro1 5 10 15Tyr Asp Thr
Glu Asn Gly Pro Pro Ser Val Pro Phe Leu Thr Pro Pro 20
25 30Phe Val Ser Pro Asp Gly Phe Gln Glu Ser
Pro Pro Gly Val Leu Ser 35 40
45Leu Arg Leu Ser Glu Pro Leu Val Thr Ser His Gly Met Leu Ala Leu 50
55 60Lys Met Gly Asn Gly Leu Ser Leu Asp
Asp Ala Gly Asn Leu Thr Ser65 70 75
80Gln Asp Val Thr Thr Val Thr Pro Pro Leu Lys Lys Thr Lys
Thr Asn 85 90 95Leu Ser
Leu Gln Thr Ser Ala Pro Leu Thr Val Ser Ser Gly Ser Leu 100
105 110Thr Val Ala Ala Ala Ala Pro Leu Ala
Val Ala Gly Thr Ser Leu Thr 115 120
125Met Gln Ser Gln Ala Pro Leu Thr Val Gln Asp Ala Lys Leu Gly Leu
130 135 140Ala Thr Gln Gly Pro Leu Thr
Val Ser Glu Gly Lys Leu Thr Leu Gln145 150
155 160Thr Ser Ala Pro Leu Thr Ala Ala Asp Ser Ser Thr
Leu Thr Val Ser 165 170
175Ala Thr Pro Pro Leu Ser Thr Ser Asn Gly Ser Leu Ser Ile Asp Met
180 185 190Gln Ala Pro Ile Tyr Thr
Thr Asn Gly Lys Leu Ala Leu Asn Ile Gly 195 200
205Ala Pro Leu His Val Val Asp Thr Leu Asn Ala Leu Thr Val
Val Thr 210 215 220Gly Gln Gly Leu Thr
Ile Asn Gly Arg Ala Leu Gln Thr Arg Val Thr225 230
235 240Gly Ala Leu Ser Tyr Asp Thr Glu Gly Asn
Ile Gln Leu Gln Ala Gly 245 250
255Gly Gly Met Arg Ile Asp Asn Asn Gly Gln Leu Ile Leu Asn Val Ala
260 265 270Tyr Pro Phe Asp Ala
Gln Asn Asn Leu Ser Leu Arg Leu Gly Gln Gly 275
280 285Pro Leu Ile Val Asn Ser Ala His Asn Leu Asp Leu
Asn Leu Asn Arg 290 295 300Gly Leu Tyr
Leu Phe Thr Ser Gly Asn Thr Lys Lys Leu Glu Val Asn305
310 315 320Ile Lys Thr Ala Lys Gly Leu
Phe Tyr Asp Gly Thr Ala Ile Ala Ile 325
330 335Asn Ala Gly Asp Gly Leu Gln Phe Gly Ser Gly Ser
Asp Thr Asn Pro 340 345 350Leu
Gln Thr Lys Leu Gly Leu Gly Leu Glu Tyr Asp Ser Asn Lys Ala 355
360 365Ile Ile Thr Lys Leu Gly Thr Gly Leu
Ser Phe Asp Asn Thr Gly Ala 370 375
380Ile Thr Val Gly Asn Lys Asn Asp Asp Lys Leu Thr Leu Trp Thr Thr385
390 395 400Pro Asp Pro Ser
Pro Asn Cys Arg Ile Asn Ser Glu Lys Asp Ala Lys 405
410 415Leu Thr Leu Val Leu Thr Lys Cys Gly Ser
Gln Val Leu Ala Ser Val 420 425
430Ser Val Leu Ser Val Lys Gly Ser Leu Ala Pro Ile Ser Gly Thr Val
435 440 445Thr Ser Ala Gln Ile Val Leu
Arg Phe Asp Glu Asn Gly Val Leu Leu 450 455
460Ser Asn Ser Ser Leu Asp Pro Gln Tyr Trp Asn Tyr Arg Lys Gly
Asp465 470 475 480Ser Thr
Glu Gly Thr Ala Tyr Thr Asn Ala Val Gly Phe Met Pro Asn
485 490 495Leu Thr Ala Tyr Pro Lys Thr
Gln Ser Gln Thr Ala Lys Ser Asn Ile 500 505
510Val Ser Gln Val Tyr Leu Asn Gly Asp Lys Thr Lys Pro Met
Thr Leu 515 520 525Thr Ile Thr Leu
Asn Gly Thr Asn Glu Thr Gly Asp Ala Thr Val Ser 530
535 540Thr Tyr Ser Met Ser Phe Ser Trp Asn Trp Asn Gly
Ser Asn Tyr Ile545 550 555
560Asn Asp Thr Phe Gln Thr Asn Ser Phe Thr Phe Ser Tyr Ile Ala Gln
565 570
575Glu
SEQ ID NO: 37, length 1145, PRT, Respiratory syncytial virus
Met Glu Leu Leu Ile Leu Lys
Ala Asn Ala Ile Thr Thr Ile Leu Thr1 5 10
15Ala Val Thr Phe Cys Phe Ala Ser Gly Gln Asn Ile Thr
Glu Glu Phe 20 25 30Tyr Gln
Ser Thr Cys Ser Ala Val Ser Lys Gly Tyr Leu Ser Ala Leu 35
40 45Arg Thr Gly Trp Tyr Thr Ser Val Ile Thr
Ile Glu Leu Ser Asn Ile 50 55 60Lys
Glu Asn Lys Cys Asn Gly Thr Asp Ala Lys Val Lys Leu Ile Lys65
70 75 80Gln Glu Leu Asp Lys Tyr
Lys Asn Ala Val Thr Glu Leu Gln Leu Leu 85
90 95Met Gln Ser Thr Pro Ala Thr Asn Asn Arg Ala Arg
Arg Glu Leu Pro 100 105 110Arg
Phe Met Asn Tyr Thr Leu Asn Asn Ala Lys Lys Thr Asn Val Thr 115
120 125Leu Ser Lys Lys Arg Lys Arg Arg Phe
Leu Gly Phe Leu Leu Gly Val 130 135
140Gly Ser Ala Ile Ala Ser Gly Val Ala Val Ser Lys Val Leu His Leu145
150 155 160Glu Gly Glu Val
Asn Lys Ile Lys Ser Ala Leu Leu Ser Thr Asn Lys 165
170 175Ala Val Val Ser Leu Ser Asn Gly Val Ser
Val Leu Thr Ser Lys Val 180 185
190Leu Asp Leu Lys Asn Tyr Ile Asp Lys Gln Leu Leu Pro Ile Val Asn
195 200 205Lys Gln Ser Cys Ser Ile Ser
Asn Ile Glu Thr Val Ile Glu Phe Gln 210 215
220Gln Lys Asn Asn Arg Leu Leu Glu Ile Thr Arg Glu Phe Ser Val
Asn225 230 235 240Ala Gly
Val Thr Thr Pro Val Ser Thr Tyr Met Leu Thr Asn Ser Glu
245 250 255Leu Leu Ser Leu Asn Asp Met
Pro Ile Thr Asn Asp Gln Lys Lys Leu 260 265
270Met Ser Asn Asn Val Gln Ile Val Arg Gln Gln Ser Tyr Ser
Ile Met 275 280 285Ser Ile Ile Lys
Glu Glu Val Leu Ala Tyr Val Val Gln Leu Pro Leu 290
295 300Tyr Gly Val Ile Asp Thr Pro Cys Trp Lys Leu His
Thr Ser Pro Leu305 310 315
320Cys Thr Thr Asn Thr Lys Glu Gly Ser Asn Ile Cys Leu Thr Arg Thr
325 330 335Asp Arg Gly Trp Tyr
Cys Asp Asn Ala Gly Ser Val Ser Phe Phe Pro 340
345 350Gln Ala Glu Thr Cys Lys Val Gln Ser Asn Arg Val
Phe Cys Asp Thr 355 360 365Met Asn
Ser Leu Thr Leu Pro Ser Glu Val Asn Leu Cys Asn Val Asp 370
375 380Ile Phe Asn Pro Lys Tyr Asp Cys Lys Ile Met
Thr Ser Lys Thr Asp385 390 395
400Val Ser Ser Ser Val Ile Thr Ser Leu Gly Ala Ile Val Ser Cys Tyr
405 410 415Gly Lys Thr Lys
Cys Thr Ala Ser Asn Lys Asn Arg Gly Ile Ile Lys 420
425 430Thr Phe Ser Asn Gly Cys Asp Tyr Val Ser Asn
Lys Gly Val Asp Thr 435 440 445Val
Ser Val Gly Asn Thr Leu Tyr Tyr Val Asn Lys Gln Glu Gly Lys 450
455 460Ser Leu Tyr Val Lys Gly Glu Pro Ile Ile
Asn Phe Tyr Asp Pro Leu465 470 475
480Val Phe Pro Ser Asp Glu Phe Asp Ala Ser Ile Ser Gln Val Asn
Glu 485 490 495Lys Ile Asn
Gln Ser Leu Ala Phe Ile Arg Lys Ser Asp Glu Leu Leu 500
505 510His Asn Val Asn Ala Gly Lys Ser Thr Thr
Asn Arg Lys Arg Arg Ala 515 520
525Pro Val Lys Gln Thr Leu Asn Phe Asp Leu Leu Lys Leu Ala Gly Asp 530
535 540Val Glu Ser Asn Pro Gly Pro Met
Ala Leu Ser Lys Val Lys Leu Asn545 550
555 560Asp Thr Leu Asn Lys Asp Gln Leu Leu Ser Ser Ser
Lys Tyr Thr Ile 565 570
575Gln Arg Ser Thr Gly Asp Ser Ile Asp Thr Pro Asn Tyr Asp Val Gln
580 585 590Lys His Ile Asn Lys Leu
Cys Gly Met Leu Leu Ile Thr Glu Asp Ala 595 600
605Asn His Lys Phe Thr Gly Leu Ile Gly Met Leu Tyr Ala Met
Ser Arg 610 615 620Leu Gly Arg Glu Asp
Thr Ile Lys Ile Leu Arg Asp Ala Gly Tyr His625 630
635 640Val Lys Ala Asn Gly Val Asp Val Thr Thr
His Arg Gln Asp Ile Asn 645 650
655Gly Lys Glu Met Lys Phe Glu Val Leu Thr Leu Ala Ser Leu Thr Thr
660 665 670Glu Ile Gln Ile Asn
Ile Glu Ile Glu Ser Arg Lys Ser Tyr Lys Lys 675
680 685Met Leu Lys Glu Met Gly Glu Val Ala Pro Glu Tyr
Arg His Asp Ser 690 695 700Pro Asp Cys
Gly Met Ile Ile Leu Cys Ile Ala Ala Leu Val Ile Thr705
710 715 720Lys Leu Ala Ala Gly Asp Arg
Ser Gly Leu Thr Ala Val Ile Arg Arg 725
730 735Ala Asn Asn Val Leu Lys Asn Glu Met Lys Arg Tyr
Lys Gly Leu Leu 740 745 750Pro
Lys Asp Ile Ala Asn Ser Phe Tyr Glu Val Phe Glu Lys Tyr Pro 755
760 765His Phe Ile Asp Val Phe Val His Phe
Gly Ile Ala Gln Ser Ser Thr 770 775
780Arg Gly Gly Ser Arg Val Glu Gly Ile Phe Ala Gly Leu Phe Met Asn785
790 795 800Ala Tyr Gly Ala
Gly Gln Val Met Leu Arg Trp Gly Val Leu Ala Lys 805
810 815Ser Val Lys Asn Ile Met Leu Gly His Ala
Ser Val Gln Ala Glu Met 820 825
830Glu Gln Val Val Glu Val Tyr Glu Tyr Ala Gln Lys Leu Gly Gly Glu
835 840 845Ala Gly Phe Tyr His Ile Leu
Asn Asn Pro Lys Ala Ser Leu Leu Ser 850 855
860Leu Thr Gln Phe Pro His Phe Ser Ser Val Val Leu Gly Asn Ala
Ala865 870 875 880Gly Leu
Gly Ile Met Gly Glu Tyr Arg Gly Thr Pro Arg Asn Gln Asp
885 890 895Leu Tyr Asp Ala Ala Lys Ala
Tyr Ala Glu Gln Leu Lys Glu Asn Gly 900 905
910Val Ile Asn Tyr Ser Val Leu Asp Leu Thr Ala Glu Glu Leu
Glu Ala 915 920 925Ile Lys His Gln
Leu Asn Pro Lys Asp Asn Asp Val Glu Leu Gly Gly 930
935 940Gly Gly Ser Gly Gly Gly Gly Met Ser Arg Arg Asn
Pro Cys Lys Phe945 950 955
960Glu Ile Arg Gly His Cys Leu Asn Gly Lys Arg Cys His Phe Ser His
965 970 975Asn Tyr Phe Glu Trp
Pro Pro His Ala Leu Leu Val Arg Gln Asn Phe 980
985 990Met Leu Asn Arg Ile Leu Lys Ser Met Asp Lys Ser
Ile Asp Thr Leu 995 1000 1005Ser
Glu Ile Ser Gly Ala Ala Glu Leu Asp Arg Thr Glu Glu Tyr 1010
1015 1020Ala Leu Gly Val Val Gly Val Leu Glu
Ser Tyr Ile Gly Ser Ile 1025 1030
1035Asn Asn Ile Thr Lys Gln Ser Ala Cys Val Ala Met Ser Lys Leu
1040 1045 1050Leu Thr Glu Leu Asn Ser
Asp Asp Ile Lys Lys Leu Arg Asp Asn 1055 1060
1065Glu Glu Leu Asn Ser Pro Lys Ile Arg Val Tyr Asn Thr Val
Ile 1070 1075 1080Ser Tyr Ile Glu Ser
Asn Arg Lys Asn Asn Lys Gln Thr Ile His 1085 1090
1095Leu Leu Lys Arg Leu Pro Ala Asp Val Leu Lys Lys Thr
Ile Lys 1100 1105 1110Asn Thr Leu Asp
Ile His Lys Ser Ile Thr Ile Asn Asn Pro Lys 1115
1120 1125Glu Ser Thr Val Ser Asp Thr Asn Asp His Ala
Lys Asn Asn Asp 1130 1135 1140Thr Thr
1145
SEQ ID NO: 38, length 1502, DNA, Human immunodeficiency virus
atgggtgcta gggcttctgt gctgtctggt ggtgagctgg acaagtggga gaagatcagg 60
ctgaggcctg gtggcaagaa gaagtacaag ctaaagcaca ttgtgtgggc ctccagggag 120
ctggagaggt ttgctgtgaa ccctggcctg ctggagacct ctgaggggtg caggcagatc 180
ctgggccagc tccagccctc cctgcaaaca ggctctgagg agctgaggtc cctgtacaac 240
acagtggcta ccctgtactg tgtgcaccag aagattgatg tgaaggacac caaggaggcc 300
ctggagaaga ttgaggagga gcagaacaag tccaagaaga aggcccagca ggctgctgct 360
ggcacaggca actccagcca ggtgtcccag aactacccca ttgtgcagaa cctccagggc 420
cagatggtgc accaggccat ctccccccgg accctgaatg cctgggtgaa ggtggtggag 480
gagaggcctt ctcccctgag gtgatcccca tgttctctgc cctgtctgag ggtgccaccc 540
cccaggacct gaacaccatg ctgaacacag tggggggcca tcaggctgcc atgcagatgc 600
tgaaggagac catcaatgag gaggctgctg agtgggacag gctgcatcct gtgcacgctg 660
gccccattgc ccccggccag atgagggagc ccaggggctc tgacattgct ggcaccacct 720
ccaccctcca ggagcagatt ggctggatga ccaacaaccc ccccatccct gtgggggaaa 780
tctacaagag gtggatcatc ctgggcctga acaagattgt gaggatgtac tcccccacct 840
ccatcctgga catcaggcag ggccccaagg agcccttcag ggactatgtg gacaggttct 900
acaagaccct gagggctgag caggcctccc aggaggtgaa gaactggatg acagagaccc 960
tgctggtgca gaatgccaac cctgactgca agaccatcct gaaggccctg ggccctgctg 1020
ccaccctgga ggagatgatg acagcctgcc agggggtggg gggccctggt cacaaggcca 1080
gggtgctggc tgaggccatg tcccaggtga ccaactccgc caccatcatg atgcagaggg 1140
gcaacttcag gaaccagagg aagacagtga agtgcttcaa ctgtggcaag gtgggccaca 1200
ttgccaagaa ctgtagggcc cccaggaaga agggctgctg gaagtgtggc aaggagggcc 1260
accagatgaa ggactgcaat gagaggcagg ccaacttcct gggcaaaatc tggccctccc 1320
acaagggcag gcctggcaac ttcctccagt ccaggcctga gcccacagcc cctcccgagg 1380
agtccttcag gtttggggag gagaagacca cccccagcca gaagcaggag cccattgaca 1440
aggagctgta ccccctggcc tccctgaggt ccctgtttgg caacgacccc tcctcccagt 1500
aa 1502
SEQ ID NO: 39, length 110, PRT, Mycobacterium tuberculosis
Met Arg Leu Ser Leu Thr Ala Leu Ser Ala Gly Val Gly Ala
Val Ala1 5 10 15Met Ser
Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro Val Asp 20
25 30Ala Val Ile Asn Thr Thr Cys Asn Tyr
Gly Gln Val Val Ala Ala Leu 35 40
45Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser Pro Val 50
55 60Ala Gln Ser Tyr Leu Arg Asn Phe Leu
Ala Ala Pro Pro Pro Gln Arg65 70 75
80Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala
Gln Tyr 85 90 95Ile Gly
Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr 100
105 110
SEQ ID NO: 40, length 97, PRT, Mycobacterium tuberculosis
Met Ser
Leu Leu Asp Ala His Ile Pro Gln Leu Val Ala Ser Gln Ser1 5
10 15Ala Phe Ala Ala Lys Ala Gly Leu
Met Arg His Thr Ile Gly Gln Ala 20 25
30Glu Gln Ala Ala Met Ser Ala Gln Ala Phe His Gln Gly Glu Ser
Ser 35 40 45Ala Ala Phe Gln Ala
Ala His Ala Arg Phe Val Ala Ala Ala Ala Lys 50 55
60Val Asn Thr Leu Leu Asp Val Ala Gln Ala Asn Leu Gly Glu
Ala Ala65 70 75 80Gly
Thr Tyr Val Ala Ala Asp Ala Ala Ala Ala Ser Thr Tyr Thr Gly
85 90 95Phe
SEQ ID NO: 41, length 94, PRT, Mycobacterium tuberculosis
Met Thr Ile Asn Tyr Gln Phe Gly Asp Val Asp Ala His Gly
Ala Met1 5 10 15Ile Arg
Ala Gln Ala Ala Ser Leu Glu Ala Glu His Gln Ala Ile Val 20
25 30Arg Asp Val Leu Ala Ala Gly Asp Phe
Trp Gly Gly Ala Gly Ser Val 35 40
45Ala Cys Gln Glu Phe Ile Thr Gln Leu Gly Arg Asn Phe Gln Val Ile 50
55 60Tyr Glu Gln Ala Asn Ala His Gly Gln
Lys Val Gln Ala Ala Gly Asn65 70 75
80Asn Met Ala Gln Thr Asp Ser Ala Val Gly Ser Ser Trp Ala
85 90
SEQ ID NO: 42, length 423, PRT, Mycobacterium tuberculosis
Met Asp Phe Gly Leu Leu Pro Pro Glu Val Asn Ser Ser Arg Met Tyr1
5 10 15Ser Gly Pro Gly Pro Glu
Ser Met Leu Ala Ala Ala Ala Ala Trp Asp 20 25
30Gly Val Ala Ala Glu Leu Thr Ser Ala Ala Val Ser Tyr
Gly Ser Val 35 40 45Val Ser Thr
Leu Ile Val Glu Pro Trp Met Gly Pro Ala Ala Ala Ala 50
55 60Met Ala Ala Ala Ala Thr Pro Tyr Val Gly Trp Leu
Ala Ala Thr Ala65 70 75
80Ala Leu Ala Lys Glu Thr Ala Thr Gln Ala Arg Ala Ala Ala Glu Ala
85 90 95Phe Gly Thr Ala Phe Ala
Met Thr Val Pro Pro Ser Leu Val Ala Ala 100
105 110Asn Arg Ser Arg Leu Met Ser Leu Val Ala Ala Asn
Ile Leu Gly Gln 115 120 125Asn Ser
Ala Ala Ile Ala Ala Thr Gln Ala Glu Tyr Ala Glu Met Trp 130
135 140Ala Gln Asp Ala Ala Val Met Tyr Ser Tyr Glu
Gly Ala Ser Ala Ala145 150 155
160Ala Ser Ala Leu Pro Pro Phe Thr Pro Pro Val Gln Gly Thr Gly Pro
165 170 175Ala Gly Pro Ala
Ala Ala Ala Ala Ala Thr Gln Ala Ala Gly Ala Gly 180
185 190Ala Val Ala Asp Ala Gln Ala Thr Leu Ala Gln
Leu Pro Pro Gly Ile 195 200 205Leu
Ser Asp Ile Leu Ser Ala Leu Ala Ala Asn Ala Asp Pro Leu Thr 210
215 220Ser Gly Leu Leu Gly Ile Ala Ser Thr Leu
Asn Pro Gln Val Gly Ser225 230 235
240Ala Gln Pro Ile Val Ile Pro Thr Pro Ile Gly Glu Leu Asp Val
Ile 245 250 255Ala Leu Tyr
Ile Ala Ser Ile Ala Thr Gly Ser Ile Ala Leu Ala Ile 260
265 270Thr Asn Thr Ala Arg Pro Trp His Ile Gly
Leu Tyr Gly Asn Ala Gly 275 280
285Gly Leu Gly Pro Thr Gln Gly His Pro Leu Ser Ser Ala Thr Asp Glu 290
295 300Pro Glu Pro His Trp Gly Pro Phe
Gly Gly Ala Ala Pro Val Ser Ala305 310
315 320Gly Val Gly His Ala Ala Leu Val Gly Ala Leu Ser
Val Pro His Ser 325 330
335Trp Thr Thr Ala Ala Pro Glu Ile Gln Leu Ala Val Gln Ala Thr Pro
340 345 350Thr Phe Ser Ser Ser Ala
Gly Ala Asp Pro Thr Ala Leu Asn Gly Met 355 360
365Pro Ala Gly Leu Leu Ser Gly Met Ala Leu Ala Ser Leu Ala
Ala Arg 370 375 380Gly Thr Thr Gly Gly
Gly Gly Thr Arg Ser Gly Thr Ser Thr Asp Gly385 390
395 400Gln Glu Asp Gly Arg Lys Pro Pro Val Val
Val Ile Arg Glu Gln Pro 405 410
415Pro Pro Gly Asn Pro Pro Arg 420
43 95 PRT Mycobacterium tuberculosis 43
Met Thr Glu Gln Gln Trp Asn Phe Ala Gly Ile Glu Ala Ala
Ala Ser1 5 10 15Ala Ile
Gln Gly Asn Val Thr Ser Ile His Ser Leu Leu Asp Glu Gly 20
25 30Lys Gln Ser Leu Thr Lys Leu Ala Ala
Ala Trp Gly Gly Ser Gly Ser 35 40
45Glu Ala Tyr Gln Gly Val Gln Gln Lys Trp Asp Ala Thr Ala Thr Glu 50
55 60Leu Asn Asn Ala Leu Gln Asn Leu Ala
Arg Thr Ile Ser Glu Ala Gly65 70 75
80Gln Ala Met Ala Ser Thr Glu Gly Asn Val Thr Gly Met Phe
Ala 85 90
95
44 338 PRT Mycobacterium tuberculosis 44
Met Gln Leu Val Asp Arg Val Arg
Gly Ala Val Thr Gly Met Ser Arg1 5 10
15Arg Leu Val Val Gly Ala Val Gly Ala Ala Leu Val Ser Gly
Leu Val 20 25 30Gly Ala Val
Gly Gly Thr Ala Thr Ala Gly Ala Phe Ser Arg Pro Gly 35
40 45Leu Pro Val Glu Tyr Leu Gln Val Pro Ser Pro
Ser Met Gly Arg Asp 50 55 60Ile Lys
Val Gln Phe Gln Ser Gly Gly Ala Asn Ser Pro Ala Leu Tyr65
70 75 80Leu Leu Asp Gly Leu Arg Ala
Gln Asp Asp Phe Ser Gly Trp Asp Ile 85 90
95Asn Thr Pro Ala Phe Glu Trp Tyr Asp Gln Ser Gly Leu
Ser Val Val 100 105 110Met Pro
Val Gly Gly Gln Ser Ser Phe Tyr Ser Asp Trp Tyr Gln Pro 115
120 125Ala Cys Gly Lys Ala Gly Cys Gln Thr Tyr
Lys Trp Glu Thr Phe Leu 130 135 140Thr
Ser Glu Leu Pro Gly Trp Leu Gln Ala Asn Arg His Val Lys Pro145
150 155 160Thr Gly Ser Ala Val Val
Gly Leu Ser Met Ala Ala Ser Ser Ala Leu 165
170 175Thr Leu Ala Ile Tyr His Pro Gln Gln Phe Val Tyr
Ala Gly Ala Met 180 185 190Ser
Gly Leu Leu Asp Pro Ser Gln Ala Met Gly Pro Thr Leu Ile Gly 195
200 205Leu Ala Met Gly Asp Ala Gly Gly Tyr
Lys Ala Ser Asp Met Trp Gly 210 215
220Pro Lys Glu Asp Pro Ala Trp Gln Arg Asn Asp Pro Leu Leu Asn Val225
230 235 240Gly Lys Leu Ile
Ala Asn Asn Thr Arg Val Trp Val Tyr Cys Gly Asn 245
250 255Gly Lys Pro Ser Asp Leu Gly Gly Asn Asn
Leu Pro Ala Lys Phe Leu 260 265
270Glu Gly Phe Val Arg Thr Ser Asn Ile Lys Phe Gln Asp Ala Tyr Asn
275 280 285Ala Gly Gly Gly His Asn Gly
Val Phe Asp Phe Pro Asp Ser Gly Thr 290 295
300His Ser Trp Glu Tyr Trp Gly Ala Gln Leu Asn Ala Met Lys Pro
Asp305 310 315 320Leu Gln
Arg Ala Leu Gly Ala Thr Pro Asn Thr Gly Pro Ala Pro Gln
325 330 335Gly Ala
45 325 PRT Mycobacterium tuberculosis 45
Met Thr Asp Val Ser Arg Lys Ile Arg Ala Trp Gly Arg Arg
Leu Met1 5 10 15Ile Gly
Thr Ala Ala Ala Val Val Leu Pro Gly Leu Val Gly Leu Ala 20
25 30Gly Gly Ala Ala Thr Ala Gly Ala Phe
Ser Arg Pro Gly Leu Pro Val 35 40
45Glu Tyr Leu Gln Val Pro Ser Pro Ser Met Gly Arg Asp Ile Lys Val 50
55 60Gln Phe Gln Ser Gly Gly Asn Asn Ser
Pro Ala Val Tyr Leu Leu Asp65 70 75
80Gly Leu Arg Ala Gln Asp Asp Tyr Asn Gly Trp Asp Ile Asn
Thr Pro 85 90 95Ala Phe
Glu Trp Tyr Tyr Gln Ser Gly Leu Ser Ile Val Met Pro Val 100
105 110Gly Gly Gln Ser Ser Phe Tyr Ser Asp
Trp Tyr Ser Pro Ala Cys Gly 115 120
125Lys Ala Gly Cys Gln Thr Tyr Lys Trp Glu Thr Phe Leu Thr Ser Glu
130 135 140Leu Pro Gln Trp Leu Ser Ala
Asn Arg Ala Val Lys Pro Thr Gly Ser145 150
155 160Ala Ala Ile Gly Leu Ser Met Ala Gly Ser Ser Ala
Met Ile Leu Ala 165 170
175Ala Tyr His Pro Gln Gln Phe Ile Tyr Ala Gly Ser Leu Ser Ala Leu
180 185 190Leu Asp Pro Ser Gln Gly
Met Gly Pro Ser Leu Ile Gly Leu Ala Met 195 200
205Gly Asp Ala Gly Gly Tyr Lys Ala Ala Asp Met Trp Gly Pro
Ser Ser 210 215 220Asp Pro Ala Trp Glu
Arg Asn Asp Pro Thr Gln Gln Ile Pro Lys Leu225 230
235 240Val Ala Asn Asn Thr Arg Leu Trp Val Tyr
Cys Gly Asn Gly Thr Pro 245 250
255Asn Glu Leu Gly Gly Ala Asn Ile Pro Ala Glu Phe Leu Glu Asn Phe
260 265 270Val Arg Ser Ser Asn
Leu Lys Phe Gln Asp Ala Tyr Asn Ala Ala Gly 275
280 285Gly His Asn Ala Val Phe Asn Phe Pro Pro Asn Gly
Thr His Ser Trp 290 295 300Glu Tyr Trp
Gly Ala Gln Leu Asn Ala Met Lys Gly Asp Leu Gln Ser305
310 315 320Ser Leu Gly Ala Gly
325
46 144 PRT Mycobacterium tuberculosis 46
Met Ala Thr Thr Leu Pro Val
Gln Arg His Pro Arg Ser Leu Phe Pro1 5 10
15Glu Phe Ser Glu Leu Phe Ala Ala Phe Pro Ser Phe Ala
Gly Leu Arg 20 25 30Pro Thr
Phe Asp Thr Arg Leu Met Arg Leu Glu Asp Glu Met Lys Glu 35
40 45Gly Arg Tyr Glu Val Arg Ala Glu Leu Pro
Gly Val Asp Pro Asp Lys 50 55 60Asp
Val Asp Ile Met Val Arg Asp Gly Gln Leu Thr Ile Lys Ala Glu65
70 75 80Arg Thr Glu Gln Lys Asp
Phe Asp Gly Arg Ser Glu Phe Ala Tyr Gly 85
90 95Ser Phe Val Arg Thr Val Ser Leu Pro Val Gly Ala
Asp Glu Asp Asp 100 105 110Ile
Lys Ala Thr Tyr Asp Lys Gly Ile Leu Thr Val Ser Val Ala Val 115
120 125Ser Glu Gly Lys Pro Thr Glu Lys His
Ile Gln Ile Arg Ser Thr Asn 130 135
140
47 228 PRT Mycobacterium tuberculosis 47
Met Arg Ile Lys Ile Phe Met Leu
Val Thr Ala Val Val Leu Leu Cys1 5 10
15Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu
Glu Leu 20 25 30Lys Gly Thr
Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser Asp Pro 35
40 45Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr
Tyr Pro Asp Gln Lys 50 55 60Ser Leu
Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu Ser Ala65
70 75 80Ala Thr Ser Ser Thr Pro Arg
Glu Ala Pro Tyr Glu Leu Asn Ile Thr 85 90
95Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr
Gln Ala Val 100 105 110Val Leu
Lys Val Tyr Gln Asn Ala Gly Gly Thr His Pro Thr Thr Thr 115
120 125Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr
Arg Lys Pro Ile Thr Tyr 130 135 140Asp
Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val Phe Pro145
150 155 160Ile Val Gln Gly Glu Leu
Ser Lys Gln Thr Gly Gln Gln Val Ser Ile 165
170 175Ala Pro Asn Ala Gly Leu Asp Pro Val Asn Tyr Gln
Asn Phe Ala Val 180 185 190Thr
Asn Asp Gly Val Ile Phe Phe Phe Asn Pro Gly Glu Leu Leu Pro 195
200 205Glu Ala Ala Gly Pro Thr Gln Val Leu
Val Pro Arg Ser Ala Ile Asp 210 215
220Ser Met Leu Ala 225
48 96 PRT Mycobacterium tuberculosis 48
Met Ser Gln Ile
Met Tyr Asn Tyr Pro Ala Met Leu Gly His Ala Gly1 5
10 15Asp Met Ala Gly Tyr Ala Gly Thr Leu Gln
Ser Leu Gly Ala Glu Ile 20 25
30Ala Val Glu Gln Ala Ala Leu Gln Ser Ala Trp Gln Gly Asp Thr Gly
35 40 45Ile Thr Tyr Gln Ala Trp Gln Ala
Gln Trp Asn Gln Ala Met Glu Asp 50 55
60Leu Val Arg Ala Tyr His Ala Met Ser Ser Thr His Glu Ala Asn Thr65
70 75 80Met Ala Met Met Ala
Arg Asp Thr Ala Glu Ala Ala Lys Trp Gly Gly 85
90 95
49 1053 PRT Mycobacterium tuberculosis 49
Met Asn
Phe Ser Val Leu Pro Pro Glu Ile Asn Ser Ala Leu Ile Phe1 5
10 15Ala Gly Ala Gly Pro Glu Pro Met
Ala Ala Ala Ala Thr Ala Trp Asp 20 25
30Gly Leu Ala Met Glu Leu Ala Ser Ala Ala Ala Ser Phe Gly Ser
Val 35 40 45Thr Ser Gly Leu Val
Gly Gly Ala Trp Gln Gly Ala Ser Ser Ser Ala 50 55
60Met Ala Ala Ala Ala Ala Pro Tyr Ala Ala Trp Leu Ala Ala
Ala Ala65 70 75 80Val
Gln Ala Glu Gln Thr Ala Ala Gln Ala Ala Ala Met Ile Ala Glu
85 90 95Phe Glu Ala Val Lys Thr Ala
Val Val Gln Pro Met Leu Val Ala Ala 100 105
110Asn Arg Ala Asp Leu Val Ser Leu Val Met Ser Asn Leu Phe
Gly Gln 115 120 125Asn Ala Pro Ala
Ile Ala Ala Ile Glu Ala Thr Tyr Glu Gln Met Trp 130
135 140Ala Ala Asp Val Ser Ala Met Ser Ala Tyr His Ala
Gly Ala Ser Ala145 150 155
160Ile Ala Ser Ala Leu Ser Pro Phe Ser Lys Pro Leu Gln Asn Leu Ala
165 170 175Gly Leu Pro Ala Trp
Leu Ala Ser Gly Ala Pro Ala Ala Ala Met Thr 180
185 190Ala Ala Ala Gly Ile Pro Ala Leu Ala Gly Gly Pro
Thr Ala Ile Asn 195 200 205Leu Gly
Ile Ala Asn Val Gly Gly Gly Asn Val Gly Asn Ala Asn Asn 210
215 220Gly Leu Ala Asn Ile Gly Asn Ala Asn Leu Gly
Asn Tyr Asn Phe Gly225 230 235
240Ser Gly Asn Phe Gly Asn Ser Asn Ile Gly Ser Ala Ser Leu Gly Asn
245 250 255Asn Asn Ile Gly
Phe Gly Asn Leu Gly Ser Asn Asn Val Gly Val Gly 260
265 270Asn Leu Gly Asn Leu Asn Thr Gly Phe Ala Asn
Thr Gly Leu Gly Asn 275 280 285Phe
Gly Phe Gly Asn Thr Gly Asn Asn Asn Ile Gly Ile Gly Leu Thr 290
295 300Gly Asn Asn Gln Ile Gly Ile Gly Gly Leu
Asn Ser Gly Thr Gly Asn305 310 315
320Phe Gly Leu Phe Asn Ser Gly Ser Gly Asn Val Gly Phe Phe Asn
Ser 325 330 335Gly Asn Gly
Asn Phe Gly Ile Gly Asn Ser Gly Asn Phe Asn Thr Gly 340
345 350Gly Trp Asn Ser Gly His Gly Asn Thr Gly
Phe Phe Asn Ala Gly Ser 355 360
365Phe Asn Thr Gly Met Leu Asp Val Gly Asn Ala Asn Thr Gly Ser Leu 370
375 380Asn Thr Gly Ser Tyr Asn Met Gly
Asp Phe Asn Pro Gly Ser Ser Asn385 390
395 400Thr Gly Thr Phe Asn Thr Gly Asn Ala Asn Thr Gly
Phe Leu Asn Ala 405 410
415Gly Asn Ile Asn Thr Gly Val Phe Asn Ile Gly His Met Asn Asn Gly
420 425 430Leu Phe Asn Thr Gly Asp
Met Asn Asn Gly Val Phe Tyr Arg Gly Val 435 440
445Gly Gln Gly Ser Leu Gln Phe Ser Ile Thr Thr Pro Asp Leu
Thr Leu 450 455 460Pro Pro Leu Gln Ile
Pro Gly Ile Ser Val Pro Ala Phe Ser Leu Pro465 470
475 480Ala Ile Thr Leu Pro Ser Leu Asn Ile Pro
Ala Ala Thr Thr Pro Ala 485 490
495Asn Ile Thr Val Gly Ala Phe Ser Leu Pro Gly Leu Thr Leu Pro Ser
500 505 510Leu Asn Ile Pro Ala
Ala Thr Thr Pro Ala Asn Ile Thr Val Gly Ala 515
520 525Phe Ser Leu Pro Gly Leu Thr Leu Pro Ser Leu Asn
Ile Pro Ala Ala 530 535 540Thr Thr Pro
Ala Asn Ile Thr Val Gly Ala Phe Ser Leu Pro Gly Leu545
550 555 560Thr Leu Pro Ser Leu Asn Ile
Pro Ala Ala Thr Thr Pro Ala Asn Ile 565
570 575Thr Val Gly Ala Phe Ser Leu Pro Gly Leu Thr Leu
Pro Ser Leu Asn 580 585 590Ile
Pro Ala Ala Thr Thr Pro Ala Asn Ile Thr Val Gly Ala Phe Ser 595
600 605Leu Pro Gly Leu Thr Leu Pro Ser Leu
Asn Ile Pro Ala Ala Thr Thr 610 615
620Pro Ala Asn Ile Thr Val Ser Gly Phe Gln Leu Pro Pro Leu Ser Ile625
630 635 640Pro Ser Val Ala
Ile Pro Pro Val Thr Val Pro Pro Ile Thr Val Gly 645
650 655Ala Phe Asn Leu Pro Pro Leu Gln Ile Pro
Glu Val Thr Ile Pro Gln 660 665
670Leu Thr Ile Pro Ala Gly Ile Thr Ile Gly Gly Phe Ser Leu Pro Ala
675 680 685Ile His Thr Gln Pro Ile Thr
Val Gly Gln Ile Gly Val Gly Gln Phe 690 695
700Gly Leu Pro Ser Ile Gly Trp Asp Val Phe Leu Ser Thr Pro Arg
Ile705 710 715 720Thr Val
Pro Ala Phe Gly Ile Pro Phe Thr Leu Gln Phe Gln Thr Asn
725 730 735Val Pro Ala Leu Gln Pro Pro
Gly Gly Gly Leu Ser Thr Phe Thr Asn 740 745
750Gly Ala Leu Ile Phe Gly Glu Phe Asp Leu Pro Gln Leu Val
Val His 755 760 765Pro Tyr Thr Leu
Thr Gly Pro Ile Val Ile Gly Ser Phe Phe Leu Pro 770
775 780Ala Phe Asn Ile Pro Gly Ile Asp Val Pro Ala Ile
Asn Val Asp Gly785 790 795
800Phe Thr Leu Pro Gln Ile Thr Thr Pro Ala Ile Thr Thr Pro Glu Phe
805 810 815Ala Ile Pro Pro Ile
Gly Val Gly Gly Phe Thr Leu Pro Gln Ile Thr 820
825 830Thr Gln Glu Ile Ile Thr Pro Glu Leu Thr Ile Asn
Ser Ile Gly Val 835 840 845Gly Gly
Phe Thr Leu Pro Gln Ile Thr Thr Pro Pro Ile Thr Thr Pro 850
855 860Pro Leu Thr Ile Asp Pro Ile Asn Leu Thr Gly
Phe Thr Leu Pro Gln865 870 875
880Ile Thr Thr Pro Pro Ile Thr Thr Pro Pro Leu Thr Ile Asp Pro Ile
885 890 895Asn Leu Thr Gly
Phe Thr Leu Pro Gln Ile Thr Thr Pro Pro Ile Thr 900
905 910Thr Pro Pro Leu Thr Ile Glu Pro Ile Gly Val
Gly Gly Phe Thr Thr 915 920 925Pro
Pro Leu Thr Val Pro Gly Ile His Leu Pro Ser Thr Thr Ile Gly 930
935 940Ala Phe Ala Ile Pro Gly Gly Pro Gly Tyr
Phe Asn Ser Ser Thr Ala945 950 955
960Pro Ser Ser Gly Phe Phe Asn Ser Gly Ala Gly Gly Asn Ser Gly
Phe 965 970 975Gly Asn Asn
Gly Ser Gly Leu Ser Gly Trp Phe Asn Thr Asn Pro Ala 980
985 990Gly Leu Leu Gly Gly Ser Gly Tyr Gln Asn
Phe Gly Gly Leu Ser Ser 995 1000
1005Gly Phe Ser Asn Leu Gly Ser Gly Val Ser Gly Phe Ala Asn Arg
1010 1015 1020Gly Ile Leu Pro Phe Ser
Val Ala Ser Val Val Ser Gly Phe Ala 1025 1030
1035Asn Ile Gly Thr Asn Leu Ala Gly Phe Phe Gln Gly Thr Thr
Ser 1040 1045
1050
50 450 PRT Mycobacterium tuberculosis 50
Met Ser Glu Leu Ser Val Ala Thr
Gly Ala Val Ser Thr Ala Ser Ser1 5 10
15Ser Ile Pro Met Pro Ala Gly Val Asn Pro Ala Asp Leu Ala
Ala Glu 20 25 30Leu Ala Ala
Val Val Thr Glu Ser Val Asp Glu Asp Tyr Leu Leu Tyr 35
40 45Glu Cys Asp Gly Gln Trp Val Leu Ala Ala Gly
Val Gln Ala Met Val 50 55 60Glu Leu
Asp Ser Asp Glu Leu Arg Val Ile Arg Asp Gly Val Thr Arg65
70 75 80Arg Gln Gln Trp Ser Gly Arg
Pro Gly Ala Ala Leu Gly Glu Ala Val 85 90
95Asp Arg Leu Leu Leu Glu Thr Asp Gln Ala Phe Gly Trp
Val Ala Phe 100 105 110Glu Phe
Gly Val His Arg Tyr Gly Leu Gln Gln Arg Leu Ala Pro His 115
120 125Thr Pro Leu Ala Arg Val Phe Ser Pro Arg
Thr Arg Ile Met Val Ser 130 135 140Glu
Lys Glu Ile Arg Leu Phe Asp Ala Gly Ile Arg His Arg Glu Ala145
150 155 160Ile Asp Arg Leu Leu Ala
Thr Gly Val Arg Glu Val Pro Gln Ser Arg 165
170 175Ser Val Asp Val Ser Asp Asp Pro Ser Gly Phe Arg
Arg Arg Val Ala 180 185 190Val
Ala Val Asp Glu Ile Ala Ala Gly Arg Tyr His Lys Val Ile Leu 195
200 205Ser Arg Cys Val Glu Val Pro Phe Ala
Ile Asp Phe Pro Leu Thr Tyr 210 215
220Arg Leu Gly Arg Arg His Asn Thr Pro Val Arg Ser Phe Leu Leu Gln225
230 235 240Leu Gly Gly Ile
Arg Ala Leu Gly Tyr Ser Pro Glu Leu Val Thr Ala 245
250 255Val Arg Ala Asp Gly Val Val Ile Thr Glu
Pro Leu Ala Gly Thr Arg 260 265
270Ala Leu Gly Arg Gly Pro Ala Ile Asp Arg Leu Ala Arg Asp Asp Leu
275 280 285Glu Ser Asn Ser Lys Glu Ile
Val Glu His Ala Ile Ser Val Arg Ser 290 295
300Ser Leu Glu Glu Ile Thr Asp Ile Ala Glu Pro Gly Ser Ala Ala
Val305 310 315 320Ile Asp
Phe Met Thr Val Arg Glu Arg Gly Ser Val Gln His Leu Gly
325 330 335Ser Thr Ile Arg Ala Arg Leu
Asp Pro Ser Ser Asp Arg Met Ala Ala 340 345
350Leu Glu Ala Leu Phe Pro Ala Val Thr Ala Ser Gly Ile Pro
Lys Ala 355 360 365Ala Gly Val Glu
Ala Ile Phe Arg Leu Asp Glu Cys Pro Arg Gly Leu 370
375 380Tyr Ser Gly Ala Val Val Met Leu Ser Ala Asp Gly
Gly Leu Asp Ala385 390 395
400Ala Leu Thr Leu Arg Ala Ala Tyr Gln Val Gly Gly Arg Thr Trp Leu
405 410 415Arg Ala Gly Ala Gly
Ile Ile Glu Glu Ser Glu Pro Glu Arg Glu Phe 420
425 430Glu Glu Thr Cys Glu Lys Leu Ser Thr Leu Thr Pro
Tyr Leu Val Ala 435 440 445Arg Gln
450
51 392 PRT Mycobacterium tuberculosis 51
Met Ser Arg Ala Phe Ile Ile
Asp Pro Thr Ile Ser Ala Ile Asp Gly1 5 10
15Leu Tyr Asp Leu Leu Gly Ile Gly Ile Pro Asn Gln Gly
Gly Ile Leu 20 25 30Tyr Ser
Ser Leu Glu Tyr Phe Glu Lys Ala Leu Glu Glu Leu Ala Ala 35
40 45Ala Phe Pro Gly Asp Gly Trp Leu Gly Ser
Ala Ala Asp Lys Tyr Ala 50 55 60Gly
Lys Asn Arg Asn His Val Asn Phe Phe Gln Glu Leu Ala Asp Leu65
70 75 80Asp Arg Gln Leu Ile Ser
Leu Ile His Asp Gln Ala Asn Ala Val Gln 85
90 95Thr Thr Arg Asp Ile Leu Glu Gly Ala Lys Lys Gly
Leu Glu Phe Val 100 105 110Arg
Pro Val Ala Val Asp Leu Thr Tyr Ile Pro Val Val Gly His Ala 115
120 125Leu Ser Ala Ala Phe Gln Ala Pro Phe
Cys Ala Gly Ala Met Ala Val 130 135
140Val Gly Gly Ala Leu Ala Tyr Leu Val Val Lys Thr Leu Ile Asn Ala145
150 155 160Thr Gln Leu Leu
Lys Leu Leu Ala Lys Leu Ala Glu Leu Val Ala Ala 165
170 175Ala Ile Ala Asp Ile Ile Ser Asp Val Ala
Asp Ile Ile Lys Gly Thr 180 185
190Leu Gly Glu Val Trp Glu Phe Ile Thr Asn Ala Leu Asn Gly Leu Lys
195 200 205Glu Leu Trp Asp Lys Leu Thr
Gly Trp Val Thr Gly Leu Phe Ser Arg 210 215
220Gly Trp Ser Asn Leu Glu Ser Phe Phe Ala Gly Val Pro Gly Leu
Thr225 230 235 240Gly Ala
Thr Ser Gly Leu Ser Gln Val Thr Gly Leu Phe Gly Ala Ala
245 250 255Gly Leu Ser Ala Ser Ser Gly
Leu Ala His Ala Asp Ser Leu Ala Ser 260 265
270Ser Ala Ser Leu Pro Ala Leu Ala Gly Ile Gly Gly Gly Ser
Gly Phe 275 280 285Gly Gly Leu Pro
Ser Leu Ala Gln Val His Ala Ala Ser Thr Arg Gln 290
295 300Ala Leu Arg Pro Arg Ala Asp Gly Pro Val Gly Ala
Ala Ala Glu Gln305 310 315
320Val Gly Gly Gln Ser Gln Leu Val Ser Ala Gln Gly Ser Gln Gly Met
325 330 335Gly Gly Pro Val Gly
Met Gly Gly Met His Pro Ser Ser Gly Ala Ser 340
345 350Lys Gly Thr Thr Thr Lys Lys Tyr Ser Glu Gly Ala
Ala Ala Gly Thr 355 360 365Glu Asp
Ala Glu Arg Ala Pro Val Glu Ala Asp Ala Gly Gly Gly Gln 370
375 380Lys Val Leu Val Arg Asn Val Val385
390
52 99 PRT Mycobacterium tuberculosis 52
Met Arg Ala Thr Val Gly Leu
Val Glu Ala Ile Gly Ile Arg Glu Leu1 5 10
15Arg Gln His Ala Ser Arg Tyr Leu Ala Arg Val Glu Ala
Gly Glu Glu 20 25 30Leu Gly
Val Thr Asn Lys Gly Arg Leu Val Ala Arg Leu Ile Pro Val 35
40 45Gln Ala Ala Glu Arg Ser Arg Glu Ala Leu
Ile Glu Ser Gly Val Leu 50 55 60Ile
Pro Ala Arg Arg Pro Gln Asn Leu Leu Asp Val Thr Ala Glu Pro65
70 75 80Ala Arg Gly Arg Lys Arg
Thr Leu Ser Asp Val Leu Asn Glu Met Arg 85
90 95Asp Glu Gln
53 75 PRT Mycobacterium tuberculosis 53
Val
Ile Ala Gly Val Asp Gln Ala Leu Ala Ala Thr Gly Gln Ala Ser1
5 10 15Gln Arg Ala Ala Gly Ala Ser
Gly Gly Val Thr Val Gly Val Gly Val 20 25
30Gly Thr Glu Gln Arg Asn Leu Ser Val Val Ala Pro Ser Gln
Phe Thr 35 40 45Phe Ser Ser Arg
Ser Pro Asp Phe Val Asp Glu Thr Ala Gly Gln Ser 50 55
60Trp Cys Ala Ile Leu Gly Leu Asn Gln Phe His65
70 75
54 580 PRT Mycobacterium tuberculosis 54
Met
Asn Phe Ala Val Leu Pro Pro Glu Val Asn Ser Ala Arg Ile Phe1
5 10 15Ala Gly Ala Gly Leu Gly Pro
Met Leu Ala Ala Ala Ser Ala Trp Asp 20 25
30Gly Leu Ala Glu Glu Leu His Ala Ala Ala Gly Ser Phe Ala
Ser Val 35 40 45Thr Thr Gly Leu
Ala Gly Asp Ala Trp His Gly Pro Ala Ser Leu Ala 50 55
60Met Thr Arg Ala Ala Ser Pro Tyr Val Gly Trp Leu Asn
Thr Ala Ala65 70 75
80Gly Gln Ala Ala Gln Ala Ala Gly Gln Ala Arg Leu Ala Ala Ser Ala
85 90 95Phe Glu Ala Thr Leu Ala
Ala Thr Val Ser Pro Ala Met Val Ala Ala 100
105 110Asn Arg Thr Arg Leu Ala Ser Leu Val Ala Ala Asn
Leu Leu Gly Gln 115 120 125Asn Ala
Pro Ala Ile Ala Ala Ala Glu Ala Glu Tyr Glu Gln Ile Trp 130
135 140Ala Gln Asp Val Ala Ala Met Phe Gly Tyr His
Ser Ala Ala Ser Ala145 150 155
160Val Ala Thr Gln Leu Ala Pro Ile Gln Glu Gly Leu Gln Gln Gln Leu
165 170 175Gln Asn Val Leu
Ala Gln Leu Ala Ser Gly Asn Leu Gly Ser Gly Asn 180
185 190Val Gly Val Gly Asn Ile Gly Asn Asp Asn Ile
Gly Asn Ala Asn Ile 195 200 205Gly
Phe Gly Asn Arg Gly Asp Ala Asn Ile Gly Ile Gly Asn Ile Gly 210
215 220Asp Arg Asn Leu Gly Ile Gly Asn Thr Gly
Asn Trp Asn Ile Gly Ile225 230 235
240Gly Ile Thr Gly Asn Gly Gln Ile Gly Phe Gly Lys Pro Ala Asn
Pro 245 250 255Asp Val Leu
Val Val Gly Asn Gly Gly Pro Gly Val Thr Ala Leu Val 260
265 270Met Gly Gly Thr Asp Ser Leu Leu Pro Leu
Pro Asn Ile Pro Leu Leu 275 280
285Glu Tyr Ala Ala Arg Phe Ile Thr Pro Val His Pro Gly Tyr Thr Ala 290
295 300Thr Phe Leu Glu Thr Pro Ser Gln
Phe Phe Pro Phe Thr Gly Leu Asn305 310
315 320Ser Leu Thr Tyr Asp Val Ser Val Ala Gln Gly Val
Thr Asn Leu His 325 330
335Thr Ala Ile Met Ala Gln Leu Ala Ala Gly Asn Glu Val Val Val Phe
340 345 350Gly Thr Ser Gln Ser Ala
Thr Ile Ala Thr Phe Glu Met Arg Tyr Leu 355 360
365Gln Ser Leu Pro Ala His Leu Arg Pro Gly Leu Asp Glu Leu
Ser Phe 370 375 380Thr Leu Thr Gly Asn
Pro Asn Arg Pro Asp Gly Gly Ile Leu Thr Arg385 390
395 400Phe Gly Phe Ser Ile Pro Gln Leu Gly Phe
Thr Leu Ser Gly Ala Thr 405 410
415Pro Ala Asp Ala Tyr Pro Thr Val Asp Tyr Ala Phe Gln Tyr Asp Gly
420 425 430Val Asn Asp Phe Pro
Lys Tyr Pro Leu Asn Val Phe Ala Thr Ala Asn 435
440 445Ala Ile Ala Gly Ile Leu Phe Leu His Ser Gly Leu
Ile Ala Leu Pro 450 455 460Pro Asp Leu
Ala Ser Gly Val Val Gln Pro Val Ser Ser Pro Asp Val465
470 475 480Leu Thr Thr Tyr Ile Leu Leu
Pro Ser Gln Asp Leu Pro Leu Leu Val 485
490 495Pro Leu Arg Ala Ile Pro Leu Leu Gly Asn Pro Leu
Ala Asp Leu Ile 500 505 510Gln
Pro Asp Leu Arg Val Leu Val Glu Leu Gly Tyr Asp Arg Thr Ala 515
520 525His Gln Asp Val Pro Ser Pro Phe Gly
Leu Phe Pro Asp Val Asp Trp 530 535
540Ala Glu Val Ala Ala Asp Leu Gln Gln Gly Ala Val Gln Gly Val Asn545
550 555 560Asp Ala Leu Ser
Gly Leu Gly Leu Pro Pro Pro Trp Gln Pro Ala Leu 565
570 575Pro Arg Leu Phe
580
55 94 PRT Mycobacterium tuberculosis 55
Met Thr Ile Asn Tyr Gln Phe Gly
Asp Val Asp Ala His Gly Ala Met1 5 10
15Ile Arg Ala Gln Ala Gly Ser Leu Glu Ala Glu His Gln Ala
Ile Ile 20 25 30Ser Asp Val
Leu Thr Ala Ser Asp Phe Trp Gly Gly Ala Gly Ser Ala 35
40 45Ala Cys Gln Gly Phe Ile Thr Gln Leu Gly Arg
Asn Phe Gln Val Ile 50 55 60Tyr Glu
Gln Ala Asn Ala His Gly Gln Lys Val Gln Ala Ala Gly Asn65
70 75 80Asn Met Ala Gln Thr Asp Ser
Ala Val Gly Ser Ser Trp Ala 85
90
56 98 PRT Mycobacterium tuberculosis 56
Met Thr Ser Arg Phe Met Thr Asp Pro
His Ala Met Arg Asp Met Ala1 5 10
15Gly Arg Phe Glu Val His Ala Gln Thr Val Glu Asp Glu Ala Arg
Arg 20 25 30Met Trp Ala Ser
Ala Gln Asn Ile Ser Gly Ala Gly Trp Ser Gly Met 35
40 45Ala Glu Ala Thr Ser Leu Asp Thr Met Thr Gln Met
Asn Gln Ala Phe 50 55 60Arg Asn Ile
Val Asn Met Leu His Gly Val Arg Asp Gly Leu Val Arg65 70
75 80Asp Ala Asn Asn Tyr Glu Gln Gln
Glu Gln Ala Ser Gln Gln Ile Leu 85 90
95Ser Ser
57 143 PRT Mycobacterium tuberculosis 57
Met Ile Thr
Asn Leu Arg Arg Arg Thr Ala Met Ala Ala Ala Gly Leu1 5
10 15Gly Ala Ala Leu Gly Leu Gly Ile Leu
Leu Val Pro Thr Val Asp Ala 20 25
30His Leu Ala Asn Gly Ser Met Ser Glu Val Met Met Ser Glu Ile Ala
35 40 45Gly Leu Pro Ile Pro Pro Ile
Ile His Tyr Gly Ala Ile Ala Tyr Ala 50 55
60Pro Ser Gly Ala Ser Gly Lys Ala Trp His Gln Arg Thr Pro Ala Arg65
70 75 80Ala Glu Gln Val
Ala Leu Glu Lys Cys Gly Asp Lys Thr Cys Lys Val 85
90 95Val Ser Arg Phe Thr Arg Cys Gly Ala Val
Ala Tyr Asn Gly Ser Lys 100 105
110Tyr Gln Gly Gly Thr Gly Leu Thr Arg Arg Ala Ala Glu Asp Asp Ala
115 120 125Val Asn Arg Leu Glu Gly Gly
Arg Ile Val Asn Trp Ala Cys Asn 130 135
140
58 362 PRT Mycobacterium tuberculosis 58
Met Leu Arg Leu Val Val Gly Ala
Leu Leu Leu Val Leu Ala Phe Ala1 5 10
15Gly Gly Tyr Ala Val Ala Ala Cys Lys Thr Val Thr Leu Thr
Val Asp 20 25 30Gly Thr Ala
Met Arg Val Thr Thr Met Lys Ser Arg Val Ile Asp Ile 35
40 45Val Glu Glu Asn Gly Phe Ser Val Asp Asp Arg
Asp Asp Leu Tyr Pro 50 55 60Ala Ala
Gly Val Gln Val His Asp Ala Asp Thr Ile Val Leu Arg Arg65
70 75 80Ser Arg Pro Leu Gln Ile Ser
Leu Asp Gly His Asp Ala Lys Gln Val 85 90
95Trp Thr Thr Ala Ser Thr Val Asp Glu Ala Leu Ala Gln
Leu Ala Met 100 105 110Thr Asp
Thr Ala Pro Ala Ala Ala Ser Arg Ala Ser Arg Val Pro Leu 115
120 125Ser Gly Met Ala Leu Pro Val Val Ser Ala
Lys Thr Val Gln Leu Asn 130 135 140Asp
Gly Gly Leu Val Arg Thr Val His Leu Pro Ala Pro Asn Val Ala145
150 155 160Gly Leu Leu Ser Ala Ala
Gly Val Pro Leu Leu Gln Ser Asp His Val 165
170 175Val Pro Ala Ala Thr Ala Pro Ile Val Glu Gly Met
Gln Ile Gln Val 180 185 190Thr
Arg Asn Arg Ile Lys Lys Val Thr Glu Arg Leu Pro Leu Pro Pro 195
200 205Asn Ala Arg Arg Val Glu Asp Pro Glu
Met Asn Met Ser Arg Glu Val 210 215
220Val Glu Asp Pro Gly Val Pro Gly Thr Gln Asp Val Thr Phe Ala Val225
230 235 240Ala Glu Val Asn
Gly Val Glu Thr Gly Arg Leu Pro Val Ala Asn Val 245
250 255Val Val Thr Pro Ala His Glu Ala Val Val
Arg Val Gly Thr Lys Pro 260 265
270Gly Thr Glu Val Pro Pro Val Ile Asp Gly Ser Ile Trp Asp Ala Ile
275 280 285Ala Gly Cys Glu Ala Gly Gly
Asn Trp Ala Ile Asn Thr Gly Asn Gly 290 295
300Tyr Tyr Gly Gly Val Gln Phe Asp Gln Gly Thr Trp Glu Ala Asn
Gly305 310 315 320Gly Leu
Arg Tyr Ala Pro Arg Ala Asp Leu Ala Thr Arg Glu Glu Gln
325 330 335Ile Ala Val Ala Glu Val Thr
Arg Leu Arg Gln Gly Trp Gly Ala Trp 340 345
350Pro Val Cys Ala Ala Arg Ala Gly Ala Arg 355
360
59 154 PRT Mycobacterium tuberculosis 59
Met Thr Pro Gly Leu Leu
Thr Thr Ala Gly Ala Gly Arg Pro Arg Asp1 5
10 15Arg Cys Ala Arg Ile Val Cys Thr Val Phe Ile Glu
Thr Ala Val Val 20 25 30Ala
Thr Met Phe Val Ala Leu Leu Gly Leu Ser Thr Ile Ser Ser Lys 35
40 45Ala Asp Asp Ile Asp Trp Asp Ala Ile
Ala Gln Cys Glu Ser Gly Gly 50 55
60Asn Trp Ala Ala Asn Thr Gly Asn Gly Leu Tyr Gly Gly Leu Gln Ile65
70 75 80Ser Gln Ala Thr Trp
Asp Ser Asn Gly Gly Val Gly Ser Pro Ala Ala 85
90 95Ala Ser Pro Gln Gln Gln Ile Glu Val Ala Asp
Asn Ile Met Lys Thr 100 105
110Gln Gly Pro Gly Ala Trp Pro Lys Cys Ser Ser Cys Ser Gln Gly Asp
115 120 125Ala Pro Leu Gly Ser Leu Thr
His Ile Leu Thr Phe Leu Ala Ala Glu 130 135
140Thr Gly Gly Cys Ser Gly Ser Arg Asp Asp145
150
60 143 PRT Mycobacterium tuberculosis 60
Met Thr Thr Ala Arg Asp Ile Met
Asn Ala Gly Val Thr Cys Val Gly1 5 10
15Glu His Glu Thr Leu Thr Ala Ala Ala Gln Tyr Met Arg Glu
His Asp 20 25 30Ile Gly Ala
Leu Pro Ile Cys Gly Asp Asp Asp Arg Leu His Gly Met 35
40 45Leu Thr Asp Arg Asp Ile Val Ile Lys Gly Leu
Ala Ala Gly Leu Asp 50 55 60Pro Asn
Thr Ala Thr Ala Gly Glu Leu Ala Arg Asp Ser Ile Tyr Tyr65
70 75 80Val Asp Ala Asn Ala Ser Ile
Gln Glu Met Leu Asn Val Met Glu Glu 85 90
95His Gln Val Arg Arg Val Pro Val Ile Ser Glu His Arg
Leu Val Gly 100 105 110Ile Val
Thr Glu Ala Asp Ile Ala Arg His Leu Pro Glu His Ala Ile 115
120 125Val Gln Phe Val Lys Ala Ile Cys Ser Pro
Met Ala Leu Ala Ser 130 135
140
61 210 PRT Mycobacterium tuberculosis 61
Met Ile Ala Thr Thr Arg Asp Arg
Glu Gly Ala Thr Met Ile Thr Phe1 5 10
15Arg Leu Arg Leu Pro Cys Arg Thr Ile Leu Arg Val Phe Ser
Arg Asn 20 25 30Pro Leu Val
Arg Gly Thr Asp Arg Leu Glu Ala Val Val Met Leu Leu 35
40 45Ala Val Thr Val Ser Leu Leu Thr Ile Pro Phe
Ala Ala Ala Ala Gly 50 55 60Thr Ala
Val Gln Asp Ser Arg Ser His Val Tyr Ala His Gln Ala Gln65
70 75 80Thr Arg His Pro Ala Thr Ala
Thr Val Ile Asp His Glu Gly Val Ile 85 90
95Asp Ser Asn Thr Thr Ala Thr Ser Ala Pro Pro Arg Thr
Lys Ile Thr 100 105 110Val Pro
Ala Arg Trp Val Val Asn Gly Ile Glu Arg Ser Gly Glu Val 115
120 125Asn Ala Lys Pro Gly Thr Lys Ser Gly Asp
Arg Val Gly Ile Trp Val 130 135 140Asp
Ser Ala Gly Gln Leu Val Asp Glu Pro Ala Pro Pro Ala Arg Ala145
150 155 160Ile Ala Asp Ala Ala Leu
Ala Ala Leu Gly Leu Trp Leu Ser Val Ala 165
170 175Ala Val Ala Gly Ala Leu Leu Ala Leu Thr Arg Ala
Ile Leu Ile Arg 180 185 190Val
Arg Asn Ala Ser Trp Gln His Asp Ile Asp Ser Leu Phe Cys Thr 195
200 205Gln Arg 210
62 380 PRT Mycobacterium tuberculosis 62
Met Asp Phe Ala Leu Leu Pro Pro Glu Val Asn Ser Ala Arg
Met Tyr1 5 10 15Thr Gly
Pro Gly Ala Gly Ser Leu Leu Ala Ala Ala Gly Gly Trp Asp 20
25 30Ser Leu Ala Ala Glu Leu Ala Thr Thr
Ala Glu Ala Tyr Gly Ser Val 35 40
45Leu Ser Gly Leu Ala Ala Leu His Trp Arg Gly Pro Ala Ala Glu Ser 50
55 60Met Ala Val Thr Ala Ala Pro Tyr Ile
Gly Trp Leu Tyr Thr Thr Ala65 70 75
80Glu Lys Thr Gln Gln Thr Ala Ile Gln Ala Arg Ala Ala Ala
Leu Ala 85 90 95Phe Glu
Gln Ala Tyr Ala Met Thr Leu Pro Pro Pro Val Val Ala Ala 100
105 110Asn Arg Ile Gln Leu Leu Ala Leu Ile
Ala Thr Asn Phe Phe Gly Gln 115 120
125Asn Thr Ala Ala Ile Ala Ala Thr Glu Ala Gln Tyr Ala Glu Met Trp
130 135 140Ala Gln Asp Ala Ala Ala Met
Tyr Gly Tyr Ala Thr Ala Ser Ala Ala145 150
155 160Ala Ala Leu Leu Thr Pro Phe Ser Pro Pro Arg Gln
Thr Thr Asn Pro 165 170
175Ala Gly Leu Thr Ala Gln Ala Ala Ala Val Ser Gln Ala Thr Asp Pro
180 185 190Leu Ser Leu Leu Ile Glu
Thr Val Thr Gln Ala Leu Gln Ala Leu Thr 195 200
205Ile Pro Ser Phe Ile Pro Glu Asp Phe Thr Phe Leu Asp Ala
Ile Phe 210 215 220Ala Gly Tyr Ala Thr
Val Gly Val Thr Gln Asp Val Glu Ser Phe Val225 230
235 240Ala Gly Thr Ile Gly Ala Glu Ser Asn Leu
Gly Leu Leu Asn Val Gly 245 250
255Asp Glu Asn Pro Ala Glu Val Thr Pro Gly Asp Phe Gly Ile Gly Glu
260 265 270Leu Val Ser Ala Thr
Ser Pro Gly Gly Gly Val Ser Ala Ser Gly Ala 275
280 285Gly Gly Ala Ala Ser Val Gly Asn Thr Val Leu Ala
Ser Val Gly Arg 290 295 300Ala Asn Ser
Ile Gly Gln Leu Ser Val Pro Pro Ser Trp Ala Ala Pro305
310 315 320Ser Thr Arg Pro Val Ser Ala
Leu Ser Pro Ala Gly Leu Thr Thr Leu 325
330 335Pro Gly Thr Asp Val Ala Glu His Gly Met Pro Gly
Val Pro Gly Val 340 345 350Pro
Val Ala Ala Gly Arg Ala Ser Gly Val Leu Pro Arg Tyr Gly Val 355
360 365Arg Leu Thr Val Met Ala His Pro Pro
Ala Ala Gly 370 375
380
63 199 PRT Mycobacterium tuberculosis 63
Met Ala Glu Asn Ser Asn Ile Asp
Asp Ile Lys Ala Pro Leu Leu Ala1 5 10
15Ala Leu Gly Ala Ala Asp Leu Ala Leu Ala Thr Val Asn Glu
Leu Ile 20 25 30Thr Asn Leu
Arg Glu Arg Ala Glu Glu Thr Arg Thr Asp Thr Arg Ser 35
40 45Arg Val Glu Glu Ser Arg Ala Arg Leu Thr Lys
Leu Gln Glu Asp Leu 50 55 60Pro Glu
Gln Leu Thr Glu Leu Arg Glu Lys Phe Thr Ala Glu Glu Leu65
70 75 80Arg Lys Ala Ala Glu Gly Tyr
Leu Glu Ala Ala Thr Ser Arg Tyr Asn 85 90
95Glu Leu Val Glu Arg Gly Glu Ala Ala Leu Glu Arg Leu
Arg Ser Gln 100 105 110Gln Ser
Phe Glu Glu Val Ser Ala Arg Ala Glu Gly Tyr Val Asp Gln 115
120 125Ala Val Glu Leu Thr Gln Glu Ala Leu Gly
Thr Val Ala Ser Gln Thr 130 135 140Arg
Ala Val Gly Glu Arg Ala Ala Lys Leu Val Gly Ile Glu Leu Pro145
150 155 160Lys Lys Ala Ala Pro Ala
Lys Lys Ala Ala Pro Ala Lys Lys Ala Ala 165
170 175Pro Ala Lys Lys Ala Ala Ala Lys Lys Ala Pro Ala
Lys Lys Ala Ala 180 185 190Ala
Lys Lys Val Thr Gln Lys 195
64 323 PRT Mycobacterium tuberculosis 64
Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu1
5 10 15Pro Leu Asp Pro Ser Ala
Met Val Ala Gln Val Gly Pro Gln Val Val 20 25
30Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly
Ala Gly Thr 35 40 45Gly Ile Val
Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val 50
55 60Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val
Gly Ser Gly Gln65 70 75
80Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala
85 90 95Val Leu Gln Leu Arg Gly
Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly 100
105 110Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met
Gly Asn Ser Gly 115 120 125Gly Gln
Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu 130
135 140Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr
Gly Ala Glu Glu Thr145 150 155
160Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser
165 170 175Gly Gly Pro Val
Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr 180
185 190Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly
Gly Gln Gly Phe Ala 195 200 205Ile
Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser Gly 210
215 220Gly Gly Ser Pro Thr Val His Ile Gly Pro
Thr Ala Phe Leu Gly Leu225 230 235
240Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val
Val 245 250 255Gly Ser Ala
Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val Ile 260
265 270Thr Ala Val Asp Gly Ala Pro Ile Asn Ser
Ala Thr Ala Met Ala Asp 275 280
285Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Thr Trp Gln 290
295 300Thr Lys Ser Gly Gly Thr Arg Thr
Gly Asn Val Thr Leu Ala Glu Gly305 310
315 320Pro Pro Ala
65 323 PRT Artificial Sequence Ser/Ala mutated mature Rv0125 65
Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp
Phe Pro Ala Leu1 5 10
15Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Gly Pro Gln Val Val
20 25 30Asn Ile Asn Thr Lys Leu Gly
Tyr Asn Asn Ala Val Gly Ala Gly Thr 35 40
45Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His
Val 50 55 60Ile Ala Gly Ala Thr Asp
Ile Asn Ala Phe Ser Val Gly Ser Gly Gln65 70
75 80Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg
Thr Gln Asp Val Ala 85 90
95Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly
100 105 110Gly Gly Val Ala Val Gly
Glu Pro Val Val Ala Met Gly Asn Ser Gly 115 120
125Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val
Ala Leu 130 135 140Gly Gln Thr Val Gln
Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr145 150
155 160Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala
Ile Gln Pro Gly Asp Ala 165 170
175Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr
180 185 190Ala Ala Ser Asp Asn
Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe Ala 195
200 205Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln
Ile Arg Ser Gly 210 215 220Gly Gly Ser
Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly Leu225
230 235 240Gly Val Val Asp Asn Asn Gly
Asn Gly Ala Arg Val Gln Arg Val Val 245
250 255Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr
Gly Asp Val Ile 260 265 270Thr
Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala Asp 275
280 285Ala Leu Asn Gly His His Pro Gly Asp
Val Ile Ser Val Thr Trp Gln 290 295
300Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly305
310 315 320Pro Pro
Ala
66 132 PRT Mycobacterium tuberculosis 66
Thr Ala Ala Ser Asp Asn Phe Gln
Leu Ser Gln Gly Gly Gln Gly Phe1 5 10
15Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile
Arg Ser 20 25 30Gly Gly Gly
Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly 35
40 45Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala
Arg Val Gln Arg Val 50 55 60Val Gly
Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val65
70 75 80Ile Thr Ala Val Asp Gly Ala
Pro Ile Asn Ser Ala Thr Ala Met Ala 85 90
95Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser
Val Thr Trp 100 105 110Gln Thr
Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu 115
120 125Gly Pro Pro Ala
130
67 195 PRT Mycobacterium tuberculosis 67
Ala Pro Pro Ala Leu Ser Gln Asp
Arg Phe Ala Asp Phe Pro Ala Leu1 5 10
15Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Gly Pro Gln
Val Val 20 25 30Asn Ile Asn
Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr 35
40 45Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu
Thr Asn Asn His Val 50 55 60Ile Ala
Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln65
70 75 80Thr Tyr Gly Val Asp Val Val
Gly Tyr Asp Arg Thr Gln Asp Val Ala 85 90
95Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala
Ala Ile Gly 100 105 110Gly Gly
Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly 115
120 125Gly Gln Gly Gly Thr Pro Arg Ala Val Pro
Gly Arg Val Val Ala Leu 130 135 140Gly
Gln Thr Val Gln Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr145
150 155 160Leu Asn Gly Leu Ile Gln
Phe Asp Ala Ala Ile Gln Pro Gly Asp Ser 165
170 175Gly Gly Pro Val Val Asn Gly Leu Gly Gln Val Val
Gly Met Asn Thr 180 185 190Ala
Ala Ser 195
68 391 PRT Mycobacterium tuberculosis
68 Met Val Asp Phe
Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met1 5
10 15Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val
Ala Ala Ala Gln Met Trp 20 25
30Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser
35 40 45Val Val Trp Gly Leu Thr Val Gly
Ser Trp Ile Gly Ser Ser Ala Gly 50 55
60Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr65
70 75 80Ala Gly Gln Ala Glu
Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 85
90 95Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro
Pro Pro Val Ile Ala 100 105
110Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly
115 120 125Gln Asn Thr Pro Ala Ile Ala
Val Asn Glu Ala Glu Tyr Gly Glu Met 130 135
140Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr
Ala145 150 155 160Thr Ala
Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr
165 170 175Ser Ala Gly Gly Leu Leu Glu
Gln Ala Ala Ala Val Glu Glu Ala Ser 180 185
190Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln
Ala Leu 195 200 205Gln Gln Leu Ala
Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu 210
215 220Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser
Pro Ile Ser Asn225 230 235
240Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val
245 250 255Ser Met Thr Asn Thr
Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala 260
265 270Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn
Gly Val Arg Ala 275 280 285Met Ser
Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly 290
295 300Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val
Gly Ser Leu Ser Val305 310 315
320Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg
325 330 335Ala Leu Pro Leu
Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly 340
345 350Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met
Gly Ala Arg Ala Gly 355 360 365Gly
Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met 370
375 380Pro His Ser Pro Ala Ala Gly385
390
69 723 PRT Artificial Sequence Mtb72f
69 Met Thr Ala Ala Ser Asp Asn
Phe Gln Leu Ser Gln Gly Gly Gln Gly1 5 10
15Phe Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly
Gln Ile Arg 20 25 30Ser Gly
Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu 35
40 45Gly Leu Gly Val Val Asp Asn Asn Gly Asn
Gly Ala Arg Val Gln Arg 50 55 60Val
Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp65
70 75 80Val Ile Thr Ala Val Asp
Gly Ala Pro Ile Asn Ser Ala Thr Ala Met 85
90 95Ala Asp Ala Leu Asn Gly His His Pro Gly Asp Val
Ile Ser Val Thr 100 105 110Trp
Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala 115
120 125Glu Gly Pro Pro Ala Glu Phe Met Val
Asp Phe Gly Ala Leu Pro Pro 130 135
140Glu Ile Asn Ser Ala Arg Met Tyr Ala Gly Pro Gly Ser Ala Ser Leu145
150 155 160Val Ala Ala Ala
Gln Met Trp Asp Ser Val Ala Ser Asp Leu Phe Ser 165
170 175Ala Ala Ser Ala Phe Gln Ser Val Val Trp
Gly Leu Thr Val Gly Ser 180 185
190Trp Ile Gly Ser Ser Ala Gly Leu Met Val Ala Ala Ala Ser Pro Tyr
195 200 205Val Ala Trp Met Ser Val Thr
Ala Gly Gln Ala Glu Leu Thr Ala Ala 210 215
220Gln Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu
Thr225 230 235 240Val Pro
Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu
245 250 255Ile Ala Thr Asn Leu Leu Gly
Gln Asn Thr Pro Ala Ile Ala Val Asn 260 265
270Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala
Met Phe 275 280 285Gly Tyr Ala Ala
Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe 290
295 300Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu
Leu Glu Gln Ala305 310 315
320Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gln Leu Met
325 330 335Asn Asn Val Pro Gln
Ala Leu Gln Gln Leu Ala Gln Pro Thr Gln Gly 340
345 350Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys
Thr Val Ser Pro 355 360 365His Arg
Ser Pro Ile Ser Asn Met Val Ser Met Ala Asn Asn His Met 370
375 380Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn
Thr Leu Ser Ser Met385 390 395
400Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala
405 410 415Ala Gln Asn Gly
Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly 420
425 430Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn
Leu Gly Arg Ala Ala 435 440 445Ser
Val Gly Ser Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln 450
455 460Ala Val Thr Pro Ala Ala Arg Ala Leu Pro
Leu Thr Ser Leu Thr Ser465 470 475
480Ala Ala Glu Arg Gly Pro Gly Gln Met Leu Gly Gly Leu Pro Val
Gly 485 490 495Gln Met Gly
Ala Arg Ala Gly Gly Gly Leu Ser Gly Val Leu Arg Val 500
505 510Pro Pro Arg Pro Tyr Val Met Pro His Ser
Pro Ala Ala Gly Asp Ile 515 520
525Ala Pro Pro Ala Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu 530
535 540Pro Leu Asp Pro Ser Ala Met Val
Ala Gln Val Gly Pro Gln Val Val545 550
555 560Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val
Gly Ala Gly Thr 565 570
575Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val
580 585 590Ile Ala Gly Ala Thr Asp
Ile Asn Ala Phe Ser Val Gly Ser Gly Gln 595 600
605Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp
Val Ala 610 615 620Val Leu Gln Leu Arg
Gly Ala Gly Gly Leu Pro Ser Ala Ala Ile Gly625 630
635 640Gly Gly Val Ala Val Gly Glu Pro Val Val
Ala Met Gly Asn Ser Gly 645 650
655Gly Gln Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu
660 665 670Gly Gln Thr Val Gln
Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr 675
680 685Leu Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln
Pro Gly Asp Ser 690 695 700Gly Gly Pro
Val Val Asn Gly Leu Gly Gln Val Val Gly Met Asn Thr705
710 715 720Ala Ala Ser
70 723 PRT Artificial Sequence M72
70 Met Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln
Gly1 5 10 15Phe Ala Ile
Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg 20
25 30Ser Gly Gly Gly Ser Pro Thr Val His Ile
Gly Pro Thr Ala Phe Leu 35 40
45Gly Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg 50
55 60Val Val Gly Ser Ala Pro Ala Ala Ser
Leu Gly Ile Ser Thr Gly Asp65 70 75
80Val Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr
Ala Met 85 90 95Ala Asp
Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Thr 100
105 110Trp Gln Thr Lys Ser Gly Gly Thr Arg
Thr Gly Asn Val Thr Leu Ala 115 120
125Glu Gly Pro Pro Ala Glu Phe Met Val Asp Phe Gly Ala Leu Pro Pro
130 135 140Glu Ile Asn Ser Ala Arg Met
Tyr Ala Gly Pro Gly Ser Ala Ser Leu145 150
155 160Val Ala Ala Ala Gln Met Trp Asp Ser Val Ala Ser
Asp Leu Phe Ser 165 170
175Ala Ala Ser Ala Phe Gln Ser Val Val Trp Gly Leu Thr Val Gly Ser
180 185 190Trp Ile Gly Ser Ser Ala
Gly Leu Met Val Ala Ala Ala Ser Pro Tyr 195 200
205Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr
Ala Ala 210 215 220Gln Val Arg Val Ala
Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr225 230
235 240Val Pro Pro Pro Val Ile Ala Glu Asn Arg
Ala Glu Leu Met Ile Leu 245 250
255Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn
260 265 270Glu Ala Glu Tyr Gly
Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe 275
280 285Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr
Leu Leu Pro Phe 290 295 300Glu Glu Ala
Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala305
310 315 320Ala Ala Val Glu Glu Ala Ser
Asp Thr Ala Ala Ala Asn Gln Leu Met 325
330 335Asn Asn Val Pro Gln Ala Leu Gln Gln Leu Ala Gln
Pro Thr Gln Gly 340 345 350Thr
Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro 355
360 365His Arg Ser Pro Ile Ser Asn Met Val
Ser Met Ala Asn Asn His Met 370 375
380Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met385
390 395 400Leu Lys Gly Phe
Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala 405
410 415Ala Gln Asn Gly Val Arg Ala Met Ser Ser
Leu Gly Ser Ser Leu Gly 420 425
430Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala
435 440 445Ser Val Gly Ser Leu Ser Val
Pro Gln Ala Trp Ala Ala Ala Asn Gln 450 455
460Ala Val Thr Pro Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr
Ser465 470 475 480Ala Ala
Glu Arg Gly Pro Gly Gln Met Leu Gly Gly Leu Pro Val Gly
485 490 495Gln Met Gly Ala Arg Ala Gly
Gly Gly Leu Ser Gly Val Leu Arg Val 500 505
510Pro Pro Arg Pro Tyr Val Met Pro His Ser Pro Ala Ala Gly
Asp Ile 515 520 525Ala Pro Pro Ala
Leu Ser Gln Asp Arg Phe Ala Asp Phe Pro Ala Leu 530
535 540Pro Leu Asp Pro Ser Ala Met Val Ala Gln Val Gly
Pro Gln Val Val545 550 555
560Asn Ile Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr
565 570 575Gly Ile Val Ile Asp
Pro Asn Gly Val Val Leu Thr Asn Asn His Val 580
585 590Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val
Gly Ser Gly Gln 595 600 605Thr Tyr
Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gln Asp Val Ala 610
615 620Val Leu Gln Leu Arg Gly Ala Gly Gly Leu Pro
Ser Ala Ala Ile Gly625 630 635
640Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly
645 650 655Gly Gln Gly Gly
Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu 660
665 670Gly Gln Thr Val Gln Ala Ser Asp Ser Leu Thr
Gly Ala Glu Glu Thr 675 680 685Leu
Asn Gly Leu Ile Gln Phe Asp Ala Ala Ile Gln Pro Gly Asp Ala 690
695 700Gly Gly Pro Val Val Asn Gly Leu Gly Gln
Val Val Gly Met Asn Thr705 710 715
720Ala Ala Ser