Patent application title: ANTIBODIES, VARIABLE DOMAINS & CHAINS TAILORED FOR HUMAN USE
Inventors:
Allan Bradley (Cambridge, GB)
Glenn Friedrich (Cambridge, GB)
E-Chiang Lee (Cambridge, GB)
Mark Strivens (Cambridge, GB)
Nicholas England (Cambridge, GB)
Assignees:
Kymab Limited
IPC8 Class: AA01K67027FI
USPC Class:
800 14
Class name: Nonhuman animal transgenic nonhuman animal (e.g., mollusks, etc.) mammal
Publication date: 2014-02-06
Patent application number: 20140041067
Abstract:
The invention relates to the provision of antibody therapeutics and
prophylactics that are tailored specifically for human use. The present
invention provides libraries, vertebrates and cells, such as transgenic
mice or rats or transgenic mouse or rat cells. Furthermore, the invention
relates to methods of using the vertebrates to isolate antibodies or
nucleotide sequences encoding antibodies. Antibodies, heavy chains,
polypeptides, nucleotide sequences, pharmaceutical compositions and uses
are also provided by the invention.
Claims:
1. A non-human vertebrate whose genome comprises an immunoglobulin heavy
chain locus comprising one or more VH gene segments, one or more D gene
segments, and one or more J segments comprising human gene segment
JH6*02, upstream of a constant region; wherein said gene segments in said
heavy chain locus are operably linked to said constant region, and said
vertebrate is functional to recombine a said VH gene segment, a said D
gene segment and said human JH6*02 to form a VH-D-JH6*02 variable region
gene.
2. The vertebrate of claim 1, wherein said vertebrate has been immunised with a target antigen and comprises an antibody heavy chain comprising a variable domain encoded by said VH-D-JH6*02 gene produced by recombination of said heavy chain gene segments with said human JH6*02, wherein said variable domain comprises an HCDR3 having a length of at least 20 amino acids.
3. A non-human vertebrate cell whose genome comprises an immunoglobulin heavy chain locus comprising one or more VH gene segments, one or more D gene segments and one or more J segments comprising human gene segment JH6*02, upstream of a constant region, wherein said gene segments of said heavy chain locus are operably linked to said constant region and said vertebrate is functional to recombine a said VH segment, a said D segment and said human JH6*02 gene segment to form a VH-D-JH6*02 variable region gene.
4. The cell of claim 3, which is an ES cell, wherein said ES cell is capable of developing into a mouse comprising said immunoglobulin heavy chain locus in its germ line, and wherein said mouse comprises an antibody-producing cell that expresses an antibody comprising said heavy chain.
5. The vertebrate of claim 1, wherein said heavy chain locus comprises a human JH6*02 recombination signal sequence (RSS) positioned upstream of and operably linked to said JH6*02 gene segment.
6. The cell of claim 3, wherein the heavy chain locus comprises a human JH6*02 recombination signal sequence (RSS) positioned upstream of and operably linked to said JH6*02 gene segment.
7. The vertebrate of claim 5, wherein said RSS is SEQ ID NO: 238 or, wherein said SEQ ID NO: 238 comprises a flanking 9mer, a central 22mer and a flanking 7mer, and said sequence comprises identical 9mer and 7mer flanking sequences and said central sequence is at least 70% identical to said 22mer central sequence of SEQ ID NO: 238.
8. The cell of claim 6, wherein the RSS is SEQ ID NO: 238 or, wherein said SEQ ID NO: 238 comprises a flanking 9mer, a central 22mer and a flanking 7mer, and said sequence comprises identical 9mer and 7mer flanking sequences and said central sequence is at least 70% identical to said 22mer central sequence of SEQ ID NO: 238.
9. The vertebrate of claim 6, wherein said RSS and JH6*02 are provided as SEQ ID NO: 237.
10. The cell of claim 8, wherein said RSS and JH6*02 are provided as SEQ ID NO: 237.
11. The vertebrate of claim 1, wherein said JH6*02 is the only JH6 gene segment in the genome.
12. The cell of claim 3, wherein said JH6*02 is the only JH6 gene segment in said genome.
13. The vertebrate of claim 1, wherein said J gene segments comprise a plurality of J gene segments comprising a 3' J gene segment positioned downstream in said plurality and upstream of said constant region in said locus, and said 3' J gene segment comprises said JH6*02 gene segment.
14. The cell of claim 3, wherein said J gene segments comprise a plurality of J gene segments comprising a 3' J gene segment positioned downstream in said plurality and upstream of said constant region in said locus, and said 3' J gene segment comprises said JH6*02 gene segment.
15. The vertebrate of claim 1, wherein said locus comprises one, more or all human D gene segments D3-9; D4-17; D3-10; D2-2; D5-24; D6-19; D3-22; D6-13; D5-12; D1-26; D1-20; D5-18; D3-16; D2-21; D1-14; D7-27; D1-1; D6-25; D2-14; and D4-23.
16. The cell of claim 3, wherein the locus comprises one, more or all human D gene segments D3-9; D4-17; D3-10; D2-2; D5-24; D6-19; D3-22; D6-13; D5-12; D1-26; D1-20; D5-18; D3-16; D2-21; D1-14; D7-27; D1-1; D6-25; D2-14; and D4-23.
17. The vertebrate of claim 15, wherein the locus comprises one, more or all human D gene segments D3-9, D3-10, D6-19, D4-17, D6-13, D3-22, D2-2, D2-25 and D3-3.
18. The cell of claim 16, wherein the locus comprises one, more or all human D gene segments D3-9, D3-10, D6-19, D4-17, D6-13, D3-22, D2-2, D2-25 and D3-3.
19. The vertebrate of claim 1, wherein the locus comprises a plurality of human D gene segments and said JH6*02 is in human germline configuration with respect to the 3'-most human D gene segment.
20. The cell of claim 3, wherein the locus comprises a plurality of human D gene segments and the JH6*02 is in human germline configuration with respect to the 3'-most human D gene segment.
21. The vertebrate of claim 1, wherein the locus comprises one, more or all of IGHV gene segments selected from V3-21, V3-13, V3-7, V6-1, V1-8, V1-2, V7-4-1, V1-3, V1-18, V4-4, V3-9, V3-23, V3-11 and V3-20.
22. The cell of claim 3, wherein the locus comprises one, more or all of IGHV gene segments selected from V3-21, V3-13, V3-7, V6-1, V1-8, V1-2, V7-4-1, V1-3, V1-18, V4-4, V3-9, V3-23, V3-11 and V3-20.
23. The vertebrate of claim 1, wherein the locus comprises one, more or all of human D3-9*01, D3-10*01, D6-19*01, D6-13*01, D1-26*01, IGHV1-8*01, IGHV4-61*01, IGHV6-1*01, IGHV4-4*02, IGHV1-3*01, IGHV3-66*03, IGHV3-7*01 and IGHV3-9*01.
24. The cell of claim 3, wherein the locus comprises one, more or all of human D3-9*01, D3-10*01, D6-19*01, D6-13*01, D1-26*01, IGHV1-8*01, IGHV4-61*01, IGHV6-1*01, IGHV4-4*02, IGHV1-3*01, IGHV3-66*03, IGHV3-7*01 and IGHV3-9*01.
25. An antibody-producing cell that is a progeny of the cell of claim 3, wherein the antibody-producing cell comprises a heavy chain locus comprising a rearranged variable region produced by recombination of human JH6*02 with a D gene segment and a VH gene segment.
26. The cell of claim 25, which is a B-cell or hybridoma that expresses a target antigen-specific antibody comprising a heavy chain that comprises a rearranged variable region encoded by a recombined Vh-D-human JH6*02 gene produced by recombination of said gene segments with said human JH6*02.
27. The vertebrate of claim 1, wherein the antibody heavy chain specifically binds a target antigen.
28. The cell of claim 3, wherein the antibody heavy chain specifically binds a target antigen.
29. The vertebrate of claim 1, wherein said antibody heavy chain comprises a variable region comprising a HCDR3 length of at least 20 amino acids.
30. The cell of claim 3, wherein said antibody heavy chain comprises a variable region comprising a HCDR3 length of at least 20 amino acids.
31. The vertebrate of claim 1, wherein said antibody heavy chain is a product of the recombination of JH6*02 with a human VH gene segment recited in claim 21 or 23 and/or a D gene segment recited in claim 15, 17 or 23.
32. The cell of claim 3, wherein said antibody heavy chain is a product of the recombination of JH6*02 with a human VH gene segment recited in claim 22 or 24 and/or a D gene segment recited in claim 16, 18 or 24.
33. The vertebrate of claim 1, wherein said genome comprises endogenous heavy chain variable region gene segments which are nonfunctional to produce endogenous Ig heavy chain.
34. The cell of claim 3, wherein said genome comprises endogenous heavy chain variable region gene segments which are nonfunctional to produce endogenous Ig heavy chain.
35. The vertebrate of claim 1, wherein the genome is homozygous for said heavy chain locus.
36. The cell of claim 3, wherein the genome is homozygous for said heavy chain locus.
37. The vertebrate of claim 1, wherein the constant region is an endogenous constant region.
38. The cell of claim 3, wherein the constant region is an endogenous constant region.
39. A method for producing an antibody specific to a target antigen, the method comprising immunizing the non-human vertebrate according to claim 1 with said antigen, thereby producing said antibody specific to said target antigen, wherein said antibody comprises a heavy chain variable region comprising an HCDR3, and wherein said heavy chain variable region is a product of the recombination of human JH6*02 with a VH and a D gene segment.
40. The method of claim 39, further comprising the step of isolating said antibody.
41. The method of claim 40, wherein the constant region is an endogenous constant region.
42. A method for producing a human antibody comprising carrying out the method of claim 40 or 41, wherein the constant region of the locus is a non-human vertebrate constant region, further comprising the step of replacing the non-human constant region of the isolated antibody with a human constant region.
43. The method of claim 39, further comprising the step of isolating a B cell from said mouse, wherein said B cell expresses said antibody.
44. The method of claim 43, further comprising the step of producing a hybridoma from said B cell.
45. The method of claim 44, further comprising the step of isolating said antibody from said hybridoma.
46. The method of claim 39, wherein the constant region is an endogenous constant region.
47. The method of claim 46, further comprising the step of isolating a B cell from said mouse, wherein said B cell expresses said antibody.
48. The method of claim 47, further comprising the step of producing a hybridoma from said B cell, wherein said hybridoma expresses said antibody.
49. The method of claim 48, further comprising the step of isolating said antibody from said hybridoma.
50. A method for producing a heavy chain or VH domain thereof, of an antibody specific to a target antigen, the method comprising immunizing a non-human vertebrate according to claim 1 with the antigen, thereby producing said heavy chain or VH domain thereof, of an antibody specific to said target antigen, wherein the heavy chain or VH domain thereof comprises an HCDR3, and wherein said VH domain is a product of the recombination of human JH6*02 with a VH and a D gene segment.
51. The method of claim 50, further comprising the step of isolating said heavy chain or VH domain thereof.
52. The method of claim 51, wherein the constant region is an endogenous constant region.
53. The method of claim 50, 51 or 52, wherein the heavy chain or VH domain thereof, comprises mouse AID-pattern somatic hypermutations and/or mouse dTd-pattern mutations.
54. The method of claim 39 or 46, wherein the heavy chain or VH domain thereof, of said antibody comprises mouse AID-pattern somatic hypermutations and/or mouse dTd-pattern mutations.
55. A method for producing a human antibody comprising carrying out the method of claim 40 or 41, wherein the constant region of the locus is a non-human vertebrate constant region, wherein the heavy chain or VH domain of said antibody comprises mouse AID-pattern somatic hypermutations and/or mouse dTd-pattern mutations, further comprising the step of replacing the non-human constant region of the isolated antibody with a human constant region.
56. The method of claim 50, further comprising the step of operably linking the VH domain with a human constant domain.
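The RSS variant criterion recited in claims 7 and 8 (identical 9mer and 7mer flanking sequences, and a central 22mer at least 70% identical to that of the reference) can be illustrated with a minimal sketch. The function name, the position-by-position identity measure and the 9mer/22mer/7mer left-to-right layout are assumptions made for illustration only; the reference sequence is supplied by the caller, and SEQ ID NO: 238 itself is not reproduced here.

```python
def rss_variant_ok(reference: str, candidate: str, min_identity: float = 0.70) -> bool:
    """Return True if `candidate` meets the flank/central criterion against `reference`.

    Assumed layout (hypothetical, following the order recited in the claims):
    positions 0-8 = 9mer, 9-30 = central 22mer, 31-37 = 7mer.
    """
    if len(reference) != 38 or len(candidate) != 38:
        return False  # only ungapped, equal-length comparison is sketched here

    ref, cand = reference.upper(), candidate.upper()

    # The flanking 9mer and 7mer must be identical to those of the reference.
    if ref[:9] != cand[:9] or ref[31:] != cand[31:]:
        return False

    # Central 22mer: fraction of positions carrying the same nucleotide.
    matches = sum(a == b for a, b in zip(ref[9:31], cand[9:31]))
    return matches / 22 >= min_identity
```

Under this position-wise measure, a candidate may differ from the reference at up to 6 of the 22 central positions (16/22 is about 72.7%), whereas 7 differences (15/22, about 68.2%) fall below the 70% threshold.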
Description:
CROSS REFERENCE
[0001] This is a Continuation Application of PCT/GB2012/052296 filed Sep. 18, 2012, which claims priority to GB 1116122.1 filed Sep. 19, 2011, GB 1116120.5 filed Sep. 19, 2011, GB 1203257.9 filed Feb. 24, 2012, GB 1204592.8 filed Mar. 15, 2012, GB 1205702.2 filed Mar. 29, 2012, GB 1208749.0 filed May 18, 2012 and GB 1211692.7 filed Jul. 2, 2012, all of which are hereby incorporated by reference.
FIELD OF THE INVENTION
[0002] The present invention relates to the provision of antibody therapeutics and prophylactics that are tailored specifically for human use.
[0003] The present invention provides libraries, vertebrates and cells, such as transgenic mice or rats or transgenic mouse or rat cells. Furthermore, the invention relates to methods of using the vertebrates to isolate antibodies or nucleotide sequences encoding antibodies. Antibodies, heavy chains, polypeptides, nucleotide sequences, pharmaceutical compositions and uses are also provided by the invention.
BACKGROUND
[0004] The state of the art provides non-human vertebrates (e.g., mice and rats) and cells comprising transgenic immunoglobulin loci, such loci comprising human variable (V), diversity (D) and/or joining (J) segments, and optionally human constant regions. Alternatively, endogenous constant regions of the host vertebrate (e.g., mouse or rat constant regions) are provided in the transgenic loci. Methods of constructing such transgenic vertebrates and use of these to generate antibodies and nucleic acids thereof following antigen immunisation are known in the art, e.g., see U.S. Pat. No. 7,501,552 (Medarex), U.S. Pat. No. 5,939,598 (Abgenix), U.S. Pat. No. 6,130,364 (Abgenix), WO02/066630 (Regeneron), WO2011004192 (Genome Research Limited), WO2009076464, WO2009143472 and WO2010039900 (Ablexis), the disclosures of which are explicitly incorporated herein. Such transgenic loci in the art include varying amounts of the human V(D)J repertoire. Existing transgenic immunoglobulin loci are based on a single human DNA source. The potential diversity of human antibody variable regions in non-human vertebrates bearing such transgenic loci is thus confined.
[0005] The inventors considered that it would be desirable to tailor the genomes of these transgenic non-human vertebrates (and thus antibody and antibody chain products of these) to address the variability, and commonality, in the natural antibody gene usage of humans. The inventors wanted to do this in order to better address human use of antibody-based therapeutic and prophylactic drugs.
[0006] It would be desirable also to provide for novel and potentially expanded repertoire and diversity of human variable regions in transgenic immunoglobulin loci and non-human vertebrates harbouring these, as well as in antibodies produced following immunisation of such animals.
SUMMARY OF THE INVENTION
[0007] The present invention has been developed from extensive bioinformatics analysis of natural antibody gene segment distributions across a myriad of different human populations and across more than two thousand samples from human individuals. The inventors have undertaken this huge task to more thoroughly understand and design non-human vertebrate systems and resultant antibodies to better address human medical therapeutics as a whole, as well as to enable rational design to address specific ethnic populations of humans. Using such rational design, the inventors have constructed transgenic non-human vertebrates and isolated antibodies, antibody chains and cells expressing these in a way that yields products that utilise gene segments that have been purposely included on the basis of the human bioinformatics analysis. The examples illustrate worked experiments where the inventors isolated many cells and antibodies to this effect.
[0008] The invention also relates to synthetically-extended & ethnically-diverse superhuman immunoglobulin gene repertoires. The present invention thus provides for novel and potentially expanded synthetic immunoglobulin diversities, thus providing a pool of diversity from which human antibody therapeutic leads can be selected. This expanded pool is useful when seeking to find antibodies with desirable characteristics, such as relatively high affinity to target antigen without the need for further affinity maturation (e.g., using laborious in vitro techniques such as phage or ribosome display), or improved biophysical characteristics, or to address targets and new epitopes that have previously been difficult to address with antibodies or are not reached by prior antibody binding sites.
[0009] The invention also provides for diversity that is potentially biased towards variable gene usage common to members of a specific human population, which is useful for generating antibodies for treating and/or preventing diseases or conditions within such population. This ability to bias the antibody repertoire allows one to tailor antibody therapeutics with the aim of more effectively treating and/or preventing disease or medical conditions in specific human populations.
[0010] The present inventors realised the possibility of providing immunoglobulin gene segments from disparate sources in transgenic loci, in order to provide for novel and potentially-expanded antibody diversities from which antibody therapeutics (and antibody tool reagents) could be generated. This opens up the potential of transgenic human-mouse/rat technologies to the possibility of interrogating different and possibly larger antibody sequence-spaces than has hitherto been possible.
[0011] In rationally designing transgenic antibody loci, as well as antibodies and antibody chains, the inventors also realised that a relatively long HCDR3 length (at least 20 amino acids) is often desirable to address epitopes. For example, naturally-occurring antibodies have been isolated from humans infected with infectious disease pathogens, such antibodies having a long HCDR3 length. Neutralizing antibodies have been found in this respect. A long HCDR3 length would be desirable to address other antigens (e.g., receptor clefts or enzyme active sites), not just limited to infectious disease pathogens, and thus the inventors realised the general desirability of the possibility of engineering transgenic loci to be able to produce long HCDR3 antibodies and heavy chains. The inventors, through laborious execution of bioinformatics on in excess of 2000 human DNA samples via the 1000 Genomes project together with rational sequence choices, identified that the inclusion of the specific human gene segment variant JH6*02 is desirable for producing long HCDR3 antibodies and chains.
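The HCDR3 length criterion discussed above (at least 20 amino acids) can be illustrated with a minimal sketch that screens a heavy chain variable domain amino acid sequence. The function names and the anchoring heuristic are assumptions for illustration: the HCDR3 is taken as the residues lying between the conserved cysteine closing framework 3 (matched here as Y/F-Y/F-C) and the W-G-x-G motif opening framework 4. This is not a full numbering scheme, and the delimitation convention actually applied to a given sequence may differ.

```python
import re
from typing import Optional

# Heuristic anchors (an assumption, not a numbering scheme): the first
# [YF][YF]C motif is treated as the end of framework 3 and the next
# W-G-x-G motif as the start of framework 4.
_HCDR3_PATTERN = re.compile(r"[YF][YF]C(.+?)WG.G")


def extract_hcdr3(vh: str) -> Optional[str]:
    """Return the residues between the anchor Cys and the anchor Trp, or None."""
    match = _HCDR3_PATTERN.search(vh.upper())
    return match.group(1) if match else None


def has_long_hcdr3(vh: str, min_len: int = 20) -> bool:
    """True if an HCDR3 is found and is at least `min_len` residues long."""
    cdr3 = extract_hcdr3(vh)
    return cdr3 is not None and len(cdr3) >= min_len
```

For example, a VH domain whose sequence ends `...DTAVYYC` + `AKDRGYCSSTSCYPYYYYYGMDV` + `WGQGTTVTVSS` (a made-up illustrative sequence, not one taken from the examples) would yield a 23-residue HCDR3 under this heuristic and so pass the 20-residue threshold. Cysteines inside the HCDR3 do not disturb the search because the leftmost framework 3 motif is matched first.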
[0012] Additional rational design and bioinformatics has led the inventors to realise that specific human constant region variants are conserved across many diverse human populations. The inventors realised that this opens up the possibility of making a choice to humanize antibodies, chains and variable domains by using such specific constant regions in products, rather than arbitrarily choosing the human constant region (or a synthetic version of a human constant region). This aspect of the invention also enables one to tailor antibody-based drugs to specific human ethnic populations, thereby more closely matching drug to patient (and thus disease setting) than has hitherto been performed. It can be a problem in the state of the art that antibodies are humanized with an arbitrary choice of human constant region (presumably derived from one (often unknown) ethnic population or non-naturally occurring) that does not function as well in patients of a different human ethnic population. This is important, since the constant region has the major role in providing antibody effector functions, e.g., for antibody recycling, cellular and complement recruitment and for cell killing.
[0013] To this end, in a first configuration of the invention, there is provided
First Configuration
[0014] A non-human vertebrate or vertebrate cell (optionally an ES cell or antibody-producing cell) comprising a genome having a superhuman immunoglobulin heavy chain human VH and/or D and/or J gene repertoire.
[0015] A non-human vertebrate or vertebrate cell (optionally an ES cell or antibody-producing cell) comprising a genome having a superhuman immunoglobulin light chain human VL gene repertoire; optionally wherein the vertebrate or cell is according to the first configuration.
[0016] A non-human vertebrate or vertebrate cell (optionally an ES cell or antibody-producing cell) whose genome comprises a transgenic immunoglobulin locus (e.g., a heavy chain locus or a light chain locus), said locus comprising immunoglobulin gene segments according to the first and second human immunoglobulin gene segments (optionally V segments) as mentioned below operably connected upstream of an immunoglobulin constant region; optionally wherein the genome is homozygous for said transgenic immunoglobulin locus;
optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional V gene segments; and/or optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional D gene segments; and/or optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional J gene segments.
[0017] A transgenic non-human vertebrate (e.g., a mouse or rat) or vertebrate cell (optionally an ES cell or antibody-producing cell) whose genome comprises a transgenic immunoglobulin locus comprising a plurality of human immunoglobulin gene segments operably connected upstream of a non-human vertebrate constant region for the production of a repertoire of chimaeric antibodies, or chimaeric light or heavy chains, having a non-human vertebrate constant region and a human variable region; wherein the transgenic locus comprises one or more human immunoglobulin V gene segments, one or more human J gene segments and optionally one or more human D gene segments, a first (optionally a V segment) of said gene segments and a second (optionally a V segment) of said gene segments being different and derived from the genomes of first and second human individuals respectively, wherein the individuals are different; and optionally not related; optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional V gene segments; and/or
optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional D gene segments; and/or optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional J gene segments.
[0018] A transgenic non-human vertebrate (e.g., a mouse or rat) or vertebrate cell (optionally an ES cell or antibody-producing cell) whose genome comprises first and second transgenic immunoglobulin loci, each locus comprising a plurality of human immunoglobulin gene segments operably connected upstream of a non-human vertebrate constant region for the production of a repertoire of chimaeric antibodies, or chimaeric light or heavy chains, having a non-human vertebrate constant region and a human variable region;
wherein (i) the first transgenic locus comprises one or more human immunoglobulin V gene segments, one or more human J gene segments and optionally one or more human D gene segments, (ii) the second transgenic locus comprises one or more human immunoglobulin V gene segments, one or more human J gene segments and optionally one or more human D gene segments; and (iii) wherein a first (optionally a V) gene segment of said first locus and a second (optionally a V) gene segment of said second gene locus are different and derived from the genomes of first and second human individuals respectively, wherein the individuals are different; and optionally not related; optionally wherein the first and second loci are on different chromosomes (optionally chromosomes with the same chromosome number) in said genome; optionally wherein each immunoglobulin locus comprises more than the natural human complement of functional V gene segments; and/or optionally wherein each immunoglobulin locus comprises more than the natural human complement of functional D gene segments; and/or optionally wherein each immunoglobulin locus comprises more than the natural human complement of functional J gene segments.
[0019] A method of constructing a cell (e.g., an ES cell) according to the invention, the method comprising
[0020] (a) identifying functional V and J (and optionally D) gene segments of the genome sequence of a (or said) first human individual;
[0021] (b) identifying one or more functional V and/or D and/or J gene segments of the genome sequence of a (or said) second human individual, wherein these additional gene segments are not found in the genome sequence of the first individual;
[0022] (c) and constructing a transgenic immunoglobulin locus in the cell, wherein the gene segments of (a) and (b) are provided in the locus operably connected upstream of a constant region.
[0023] In one embodiment, the gene segment(s) in step (b) are identified from an immunoglobulin gene database selected from the 1000 Genomes, Ensembl, Genbank and IMGT databases.
[0024] Throughout this text, Genbank is a reference to Genbank release number 185.0 or 191.0; the 1000 Genomes database is Phase 1, release v3, 16 Mar. 2012; the Ensembl database is assembly GRCh37.p8 (Oct. 4, 2012); the IMGT database is available at www.imgt.org.
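The selection logic of steps (a) to (c) above can be sketched as a simple set operation over gene segment records. Everything in this sketch is an illustrative assumption: the `GeneSegment` record, the `design_locus` function, and the choice to treat two segments as the same when their type and nucleotide sequence match. Retrieval of segments from the databases named above and the physical construction of the locus are outside its scope.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class GeneSegment:
    name: str               # allele label (hypothetical record field)
    kind: str               # "V", "D" or "J"
    sequence: str           # nucleotide sequence
    functional: bool = True


def design_locus(first: list[GeneSegment], second: list[GeneSegment]) -> list[GeneSegment]:
    """Combine the segments of step (a) with the additional segments of step (b)."""
    # Step (a): functional segments of the first individual.
    base = [seg for seg in first if seg.functional]

    # Every sequence present in the first genome, functional or not.
    seen = {(seg.kind, seg.sequence.upper()) for seg in first}

    # Step (b): functional segments of the second individual that are
    # not found in the genome sequence of the first individual.
    extra = []
    for seg in second:
        key = (seg.kind, seg.sequence.upper())
        if seg.functional and key not in seen:
            extra.append(seg)
            seen.add(key)

    # Step (c): the segments to be provided together in the transgenic locus.
    return base + extra
```

Comparing by sequence rather than by name is a design choice of this sketch: it keeps a second-individual segment out of the result when the same sequence already occurs in the first genome under a different label.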
[0025] In one embodiment, the first and second human individuals are members of first and second ethnic populations respectively, wherein the populations are different, optionally wherein the human immunoglobulin gene segment derived from the genome sequence of the second individual is low-frequency (optionally rare) within the second ethnic population.
[0026] This configuration of the invention also provides a method of making a transgenic non-human vertebrate (e.g., a mouse or rat), the method comprising
[0027] (a) constructing an ES cell (e.g., a mouse C57BL/6N, C57BL/6J, 129S5 or 129Sv strain ES cell) by carrying out the method above;
[0028] (b) injecting the ES cell into a donor non-human vertebrate blastocyst (e.g., a mouse C57BL/6N, C57BL/6J, 129S5 or 129Sv strain blastocyst);
[0029] (c) implanting the blastocyst into a foster non-human vertebrate mother (e.g., a C57BL/6N, C57BL/6J, 129S5 or 129Sv strain mouse); and
[0030] (d) obtaining a child from said mother, wherein the child genome comprises a transgenic immunoglobulin locus.
[0031] In one embodiment, the invention provides a method of isolating an antibody that binds a predetermined antigen (e.g., a bacterial or viral pathogen antigen), the method comprising immunizing a non-human vertebrate according to the invention.
Second Configuration
[0032] A library of antibody-producing transgenic cells whose genomes collectively encode a repertoire of antibodies, wherein
[0033] (a) a first transgenic cell expresses a first antibody having a chain encoded by a first immunoglobulin gene, the gene comprising a first variable domain nucleotide sequence produced following recombination of a first human unrearranged immunoglobulin gene segment;
[0034] (b) a second transgenic cell expresses a second antibody having a chain encoded by a second immunoglobulin gene, the second gene comprising a second variable domain nucleotide sequence produced following recombination of a second human unrearranged immunoglobulin gene segment, the first and second antibodies being non-identical;
[0035] (c) the first and second gene segments are different and derived from the genome sequences of first and second human individuals respectively, wherein the individuals are different; and optionally not related;
[0036] (d) wherein the cells are non-human vertebrate (e.g., mouse or rat) cells.
[0037] In one embodiment, the first and second human individuals are members of first and second ethnic populations respectively, wherein the populations are different; optionally wherein the ethnic populations are selected from those identified in the 1000 Genomes database.
[0038] In another embodiment, the second human immunoglobulin gene segment is a polymorphic variant of the first human immunoglobulin gene segment; optionally wherein the second gene segment is selected from the group consisting of a gene segment in any of Tables 1 to 7 and 9 to 14 below (e.g., selected from Table 13 or Table 14), e.g., the second gene segment is a polymorphic variant of VH1-69.
Third Configuration
[0039] An isolated antibody having
[0040] (a) a heavy chain encoded by a nucleotide sequence produced following recombination in a transgenic non-human vertebrate cell of an unrearranged human immunoglobulin V gene segment with a human D and human J segment, optionally with affinity maturation in said cell, wherein one of the gene segments is derived from the genome of an individual from a first human ethnic population; and the other two gene segments are derived from the genome of an individual from a second, different, human ethnic population, and wherein the antibody comprises heavy chain constant regions of said non-human vertebrate (e.g., rodent, mouse or rat heavy chain constant regions); and/or
[0041] (b) a light chain encoded by a nucleotide sequence produced following recombination in a transgenic non-human vertebrate cell of an unrearranged human immunoglobulin V gene segment with a human J segment, optionally with affinity maturation in said cell, wherein one of the gene segments is derived from the genome of an individual from a first human ethnic population (optionally the same as the first population in (a)); and the other gene segment is derived from the genome of an individual from a second, different, human ethnic population (optionally the same as the second population in (a)), and wherein the antibody comprises light chain constant regions of said non-human vertebrate (e.g., rodent, mouse or rat light chain constant regions);
[0042] (c) Optionally wherein each variable domain of the antibody is a human variable domain.
[0043] (d) Optionally wherein the heavy chain constant regions are gamma-type constant regions.
[0044] The invention also provides an isolated nucleotide sequence encoding the antibody, optionally wherein the sequence is provided in an antibody expression vector, optionally in a host cell.
[0045] The invention also provides a method of producing a human antibody, the method comprising replacing the non-human vertebrate constant regions of the antibody of the third configuration with human antibody constant regions.
[0046] The invention also provides a pharmaceutical composition comprising an antibody according to the third configuration, or an antibody produced according to the method above and a diluent, excipient or carrier; optionally wherein the composition is provided in a container connected to an IV needle or syringe or in an IV bag.
[0047] The invention also provides an antibody-producing cell that expresses the second antibody recited in any one of the configurations.
[0048] In an alternative configuration, the invention contemplates the combination of nucleotide sequences of first and second immunoglobulin gene segments (e.g., two or more polymorphic variants of a particular human germline VH or VL gene segment) to provide a synthetic gene segment. Such synthetic gene segment is used, in one embodiment, to build a transgenic immunoglobulin locus, wherein the synthetic gene segment is provided in combination with one or more human variable and J regions (and optionally one or more human D regions) operably connected upstream of a constant region. When provided in the genome of a non-human vertebrate or cell (e.g., mouse or rat cell, e.g., ES cell), the invention provides for superhuman gene segment diversity. The sequences to be combined can be selected from gene segments that have been observed to be commonly used in human antibodies raised against a particular antigen (e.g., a flu antigen, such as haemagglutinin). By combining the sequences, the synthetic gene segment may recombine in vivo to produce an antibody that is well suited to the treatment and/or prevention of a disease or condition (e.g., influenza) mediated by said antigen.
Fourth Configuration
[0049] A non-human vertebrate (optionally a mouse or a rat) or vertebrate cell whose genome comprises an immunoglobulin heavy chain locus comprising human gene segment JH6*02, one or more VH gene segments and one or more D gene segments upstream of a constant region; wherein the gene segments in the heavy chain locus are operably linked to the constant region thereof so that the vertebrate is capable of producing an antibody heavy chain produced by recombination of the human JH6*02 with a D segment and a VH segment.
[0050] A non-human vertebrate cell (optionally a mouse cell or a rat cell) whose genome comprises an immunoglobulin heavy chain locus comprising human gene segment JH6*02, one or more VH gene segments and one or more D gene segments upstream of a constant region; wherein the gene segments in the heavy chain locus are operably linked to the constant region thereof for producing (e.g., in a subsequent progeny cell) an antibody heavy chain produced by recombination of the human JH6*02 with a D segment and a VH segment.
[0051] A heavy chain (e.g., comprised by an antibody) isolated from a vertebrate of the invention wherein the heavy chain comprises a HCDR3 of at least 20 amino acids.
[0052] A method for producing a heavy chain, VH domain or an antibody specific to a target antigen, the method comprising immunizing a non-human vertebrate according to the invention with the antigen and isolating the heavy chain, VH domain or an antibody specific to a target antigen or a cell producing the heavy chain, VH domain or an antibody, wherein the heavy chain, VH domain or an antibody comprises a HCDR3 that is derived from the recombination of human JH6*02 with a VH gene segment and a D gene segment.
[0053] A heavy chain, VH domain or an antibody produced by the method.
[0054] A B-cell or hybridoma expressing a heavy chain VH domain that is identical to the VH domain of the heavy chain.
[0055] A nucleic acid encoding the VH domain of the heavy chain of claim 22, 23 or 28, or encoding the heavy chain.
[0056] A vector (e.g., a CHO cell or HEK293 cell vector) comprising the nucleic acid; optionally wherein the vector is in a host cell (e.g., a CHO cell or HEK293 cell).
[0057] A pharmaceutical composition comprising the antibody, heavy chain or VH domain (e.g., comprised by an antibody), together with a pharmaceutically-acceptable excipient, diluent or a medicament (e.g., a further antigen-specific variable domain, heavy chain or antibody).
[0058] The antibody, heavy chain or VH domain (e.g., comprised by an antibody) as above for use in medicine.
[0059] The use of an antibody, heavy chain or VH domain (e.g., comprised by an antibody) as above in the manufacture of a medicament for treating and/or preventing a medical condition in a human.
Fifth Configuration
[0060] A method of producing an antibody heavy chain, the method comprising
[0061] (a) providing an antigen-specific heavy chain variable domain; and
[0062] (b) combining the variable domain with a human heavy chain constant region to produce an antibody heavy chain comprising (in N- to C-terminal direction) the variable domain and the constant region; wherein the human heavy chain constant region is an IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref or IGHG4a constant region.
[0063] An antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region that is an IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref or IGHG4a constant region. Optionally, the variable domain comprises mouse-pattern AID somatic mutations.
[0064] A polypeptide comprising (in N- to C-terminal direction) a leader sequence, a human variable domain that is specific for an antigen and a human constant region that is an IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref or IGHG4a constant region wherein (i) the leader sequence is not the native human variable domain leader sequence; and/or (ii) the variable domain comprises mouse AID-pattern somatic mutations and/or mouse Terminal deoxynucleotidyl transferase (TdT)-pattern junctional mutations.
[0065] A nucleotide sequence encoding (in 5' to 3' direction) a leader sequence and a human antibody heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region that is an IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref or IGHG4a constant region; and the leader sequence being operable for expression of the heavy chain and wherein the leader sequence is not the native human variable domain leader sequence.
[0066] A nucleotide sequence encoding (in 5' to 3' direction) a promoter and a human antibody heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region that is an IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref or IGHG4a constant region; and the promoter being operable for expression of the heavy chain and wherein the promoter is not the native human promoter.
[0067] A vector (e.g., a CHO cell or HEK293 cell vector) comprising a IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref or IGHG4a constant region nucleotide sequence that is 3' of a cloning site for the insertion of a human antibody heavy chain variable domain nucleotide sequence, such that upon insertion of such a variable domain sequence the vector comprises (in 5' to 3' direction) a promoter, a leader sequence, the variable domain sequence and the constant region sequence so that the vector is capable of expressing a human antibody heavy chain when present in a host cell.
Sixth Configuration
[0068] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 3 human variable region gene segments of the same type (e.g., at least 3 human VH6-1 gene segments, at least 3 human JH6 gene segments, at least 3 human VK1-39 gene segments, at least 3 human D2-2 gene segments or at least 3 human JK1 gene segments), wherein at least two of the human gene segments are variants that are not identical to each other.
[0069] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) cis at the same Ig locus.
[0070] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) trans at the same Ig locus; and optionally a third human gene segment of the same type, wherein the third gene segment is cis with one of said 2 different gene segments.
[0071] A population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human variable region gene segments, wherein the repertoire comprises at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), a first of said different gene segments being provided in the genome of a first vertebrate of the population, and a second of said different gene segments being provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second gene segment.
[0072] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein the gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0073] A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 3 human variable region gene segments of the same type (e.g., at least 3 human VH6-1 gene segments, at least 3 human JH6 gene segments, at least 3 human VK1-39 gene segments, at least 3 human D2-2 gene segments or at least 3 human JK1 gene segments), wherein at least two of the human gene segments are variants that are not identical to each other.
[0074] A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) cis at the same Ig locus.
[0075] A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) trans at the same Ig locus; and optionally a third human gene segment of the same type, wherein the third gene segment is cis with one of said 2 different gene segments.
[0076] A method of providing an enhanced human immunoglobulin variable region gene segment repertoire, the method comprising providing a population of non-human vertebrates (e.g., a mouse or rat) comprising a repertoire of human variable region gene segments, wherein the method comprises providing at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein a first of said different gene segments is provided in the genome of a first vertebrate of the population, and a second of said different gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second gene segment.
[0077] A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein the gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0078] A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein the gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations; optionally wherein at least 2 or 3 of said different gene segments are provided at the same Ig locus in said genome.
[0079] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising first and second human Ig locus gene segments of the same type (e.g., first and second human JH6 gene segments; or first and second IgG2 gene segments; or first and second human Jλ7 gene segments), wherein the first gene segment is a gene segment selected from any one of Tables 1 and 9 to 14 (e.g., selected from Table 13 or Table 14) (e.g., IGHJ6-a) and the second gene segment is the corresponding reference sequence.
[0080] A population of non-human vertebrates (e.g., mice or rats) comprising first and second human Ig locus gene segments of the same type (e.g., first and second human JH6 gene segments; or first and second IgG2 gene segments; or first and second human Jλ7 gene segments), wherein the first gene segment is a gene segment selected from any one of Tables 1 and 9 to 14 (e.g., selected from Table 13 or Table 14) (e.g., IGHJ6-a) and the second gene segment is the corresponding reference sequence, wherein the first gene segment is provided in the genome of a first vertebrate of the population, and the second gene segment is provided in the genome of a second vertebrate of the population.
[0081] A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising first and second human Ig locus gene segments of the same type (e.g., first and second human JH6 gene segments; or first and second IgG2 gene segments; or first and second human Jλ7 gene segments), wherein the first gene segment is a gene segment selected from any one of Tables 1 and 9 to 14 (e.g., selected from Table 13 or Table 14) (e.g., IGHJ6-a) and the second gene segment is the corresponding reference sequence.
[0082] In one aspect of this configuration, the invention relates to human D gene segment variants as described further below.
[0083] In one aspect of this configuration, the invention relates to human V gene segment variants as described further below.
[0084] In one aspect of this configuration, the invention relates to human J gene segment variants as described further below.
BRIEF DESCRIPTION OF THE FIGURES
[0085] FIGS. 1 to 3: Schematic illustrating a protocol for producing recombineered BAC vectors to add V gene segments into a mouse genome;
[0086] FIG. 4: Schematic illustrating a protocol for adding V gene segments to a mouse genome using sequential recombinase mediated cassette exchange (sRMCE); and
[0087] FIG. 5 (in 4 parts): Alignment of 13 IGHV1-69 variants showing the variable (V) coding region only. Nucleotides that differ from VH1-69 variant *01 are indicated at the appropriate position whereas identical nucleotides are marked with a dash. Where nucleotide changes result in amino acid differences, the encoded amino acid is shown above the corresponding triplet. Boxed regions correspond to CDR1, CDR2 and CDR3 as indicated.
[0088] FIG. 6 is a schematic illustrating gene segment diversity and the effect of including variants in cis according to the invention:--
[0089] (a) Situation in a normal person: Recombination on the same chromosome limits combinations of variants; for instance, the antibody gene V4-4 can only be recombined within variant 1 to form, for instance, V4-4-D-J6 or V4-4-D-J2A. Similarly the variant V4-4A can't be recombined with either J6 or J2A from variant 1 and can only be joined with J-genes from variant 2 to form V4-4A-D-J6A and V4-4A-D-J2. V4-4-J2/J6 complexity=4.
[0090] (b) Situation in a transgenic mouse: Only one variant is provided so the genome is limited. V4-4-J6/J2 complexity=2.
[0091] (c) Supra mouse of the invention: The variants are added in cis and thus can be recombined in every combination, expanding the repertoire. For instance V4-4 can be combined with J6A, J6, J2A or J2 and similarly V4-4A can be recombined with these same J-genes. The V4-4-J6/J2 complexity=8, which in this simple example is double that of a person and 4× that of a mouse with a single variant.
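The complexity figures given for FIG. 6 can be reproduced by simple enumeration. The following Python sketch is illustrative only and forms no part of the figure: it assumes, as in parts (a) to (c) above, that a V gene segment can only be joined to J gene segments carried on the same chromosome, and the helper name `vj_combinations` and the data layout are hypothetical.

```python
from itertools import product


def vj_combinations(chromosomes):
    """Return the set of (V, J) pairs reachable when recombination is
    restricted to segments carried on the same chromosome (in cis)."""
    combos = set()
    for segments in chromosomes:
        combos.update(product(segments["V"], segments["J"]))
    return combos


# (a) Normal person: two chromosomes, each carrying one set of variants.
person = [
    {"V": ["V4-4"], "J": ["J6", "J2A"]},   # variant 1
    {"V": ["V4-4A"], "J": ["J6A", "J2"]},  # variant 2
]
# (b) Transgenic mouse provided with only one set of variants.
single_variant_mouse = [person[0]]
# (c) All variants provided in cis on one chromosome.
cis_mouse = [{"V": ["V4-4", "V4-4A"], "J": ["J6", "J2A", "J6A", "J2"]}]

for label, genome in [
    ("person", person),
    ("single-variant mouse", single_variant_mouse),
    ("cis mouse", cis_mouse),
]:
    print(label, len(vj_combinations(genome)))
```

Under these assumptions the sketch prints 4, 2 and 8 respectively, matching the complexities stated in (a), (b) and (c).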
[0092] FIG. 7: Alignment of human JH6*02 variants. Nucleotides that differ from JH6*01 are indicated at the appropriate position whereas identical nucleotides are marked with a dash. Where nucleotide changes result in amino acid differences, the encoded amino acid is shown above. Accession numbers (e.g., J00256) are shown to the left of the IMGT variant name.
[0093] FIG. 8: Alignment of JH sequences from various species.
[0094] FIG. 9: Codon Table
[0095] FIG. 10: BAC database extract
BRIEF DESCRIPTION OF THE TABLES
[0096] Table 1: Human IgH V Polymorphic Variants
[0097] Table 2: Human IgH D Polymorphic Variants
[0098] Table 3: Human IgH J Polymorphic Variants
[0099] Table 4: Human Ig Vk Polymorphic Variants
[0100] Table 5: Human Ig Vλ Polymorphic Variants
[0101] Table 6: Human IgH Jk Polymorphic Variants
[0102] Table 7: Human IgH Jλ Polymorphic Variants
[0103] Table 8: 1000 Genomes Project Human Populations
[0104] Table 9: Immunoglobulin Gene Usage in Human Antibody Responses to Infectious Disease Pathogens
[0105] Table 10A: Human IgH JH5 Variant Occurrences
[0106] Table 10B: Non-Synonymous Human IgH JH5 Variants
[0107] Table 11A: Human IgH JH6 Variant Occurrences
[0108] Table 11B: Non-Synonymous Human IgH JH6 Variants
[0109] Table 12A: Human IgH JH2 Variant Occurrences
[0110] Table 12B: Non-Synonymous Human IgH JH2 Variants
[0111] Table 13: Variant Frequency Analyses & Human Population Distributions
[0112] Table 14: Frequent Human Variant Distributions
[0113] Table 15: Human Gene Segment Usage: Heavy Chain Repertoires From Naive Non-Human Vertebrates
[0114] Table 16: Human Gene Segment Usage: Heavy Chain Repertoires From Immunised Non-Human Vertebrates
[0115] Table 17: Human Gene Segment Usage: Heavy Chain Repertoires From Antigen-Specific Hybridomas
[0116] Table 18: Sequence Correlation Table
[0117] Table 19: Summary Of Function Correlated With Human Gamma Constant Region Sub-Type
[0118] Table 20: Gene Segments Prevalent In Few Human Populations
[0119] Table 21: Genomic and sequence information
DETAILED DESCRIPTION OF THE INVENTION
[0120] A suitable source of JH6*02 and other human DNA sequences for use in the invention will be readily apparent to the skilled person. For example, it is possible to collect a DNA sample from a consenting human donor (e.g., a cheek swab sample as per the Example herein) from which can be obtained suitable DNA sequences for use in constructing a locus of the invention. Other sources of human DNA are commercially available, as will be known to the skilled person. Alternatively, the skilled person is able to construct a gene segment sequence by referring to one or more databases of human Ig gene segment sequences disclosed herein.
[0121] An example source for human V, D and J gene segments according to the invention is Bacterial Artificial Chromosomes (RPCI-11 BACs) obtained from Roswell Park Cancer Institute (RPCI)/Invitrogen. See http://bacpac.chori.org/hmale11.htm which describes the BACs as follows: --
"RPCI-11 Human Male BAC Library
[0122] The RPCI-11 Human Male BAC Library (Osoegawa et al., 2001) was constructed using improved cloning techniques (Osoegawa et al., 1998) developed by Kazutoyo Osoegawa. The library was generated by Kazutoyo Osoegawa. Construction was funded by a grant from the National Human Genome Research Institute (NHGRI, NIH) (#1R01RG01165-03). This library was generated according to the new NHGRI/DOE "Guidance on Human Subjects in Large-Scale DNA Sequencing . . . .
[0123] "Male blood was obtained via a double-blind selection protocol. Male blood DNA was isolated from one randomly chosen donor (out of 10 male donors)".
[0124] Osoegawa K, Mammoser A G, Wu C, Frengen E, Zeng C, Catanese J J, de Jong P J; Genome Res. 2001 March; 11(3):483-96; "A bacterial artificial chromosome library for sequencing the complete human genome";
[0125] Osoegawa, K., Woon, P. Y., Zhao, B., Frengen, E., Tateno, M., Catanese, J. J, and de Jong, P. J. (1998); "An Improved Approach for Construction of Bacterial Artificial Chromosome Libraries"; Genomics 52, 1-8.
Superhuman Immunoglobulin Gene Repertoires
[0126] The invention relates to synthetically-extended & ethnically-diverse superhuman immunoglobulin gene repertoires. The human immunoglobulin repertoires are beyond those found in nature (i.e., "Superhuman"); for example, they are more diverse than a natural human repertoire or they comprise combinations of human immunoglobulin gene segments from disparate sources in a way that is non-natural. Thus, the repertoires of the invention are "superhuman" immunoglobulin repertoires, and the invention relates to the application of these in transgenic cells and non-human vertebrates for utility in producing chimaeric antibodies (with the possibility of converting these into fully-human, isolated antibodies using recombinant DNA technology). The present invention thus provides for novel and potentially expanded synthetic immunoglobulin diversities, which provide a pool of diversity from which antibody therapeutic leads (antibody therapeutics and antibody tool reagents) can be selected. This opens up the potential of transgenic human-mouse/rat technologies to the possibility of interrogating different and possibly larger antibody sequence-spaces than has hitherto been possible. To this end, in one embodiment, the invention provides a SUPERHUMAN MOUSE® (aka SUPRA-MOUSE®) and a SUPERHUMAN RAT® (aka SUPRA-RAT®).
[0127] In developing this thinking, the present inventors have realised the possibility of mining the huge genetic resources now available to the skilled person thanks to efforts such as the HapMap Project, 1000 Genomes Project and sundry other immunoglobulin gene databases (see below for more details). Thus, in some embodiments, the inventors realised the application of these genome sequencing developments in the present invention to generate synthetically-produced and ethnically-diverse artificial immunoglobulin gene repertoires. In one aspect, the inventors realised that such repertoires are useful for the production of antibodies having improved affinity and/or biophysical characteristics, and/or wherein the range of epitope specificities produced by means of such repertoire is novel or provides for antibodies to epitopes that have hitherto been intractable, or difficult to address, using prior transgenic immunoglobulin loci.
[0128] The present invention provides libraries, vertebrates and cells, such as transgenic mice or rats or transgenic mouse or rat cells. Furthermore, the invention relates to methods of using the vertebrates to isolate antibodies or nucleotide sequences encoding antibodies. Antibodies, nucleotide sequences, pharmaceutical compositions and uses are also provided by the invention.
Variation Analysis
[0129] The present inventors have realized methods and antibody loci designs that harness the power of genetic variation analysis. The reference human genome provides a foundation for experimental work and genetic analysis of human samples. The reference human genome is a compilation of the genomes from a small number of individuals and for any one segment of the genome a high quality single reference genome for one of the two chromosomes is available. Because the reference genome was assembled from a series of very large insert clones, the identity of these clones is known. Accordingly, experimental work with human genomic DNA is usually conducted on the clones from which the reference sequence was derived.
[0130] Individual humans differ in their sequence and recently several individuals have had their genomes sequenced, for instance James Watson and Craig Venter. Comparison of the genome sequence of these individuals has revealed differences between their sequences and the reference genome in both coding and non-coding parts of the genome; approximately 1 in 1000 bases is different. Some variants will be significant and contribute to differences between individuals. In extreme cases these will result in genetic disease. Variation can be implicated in differing responses to drugs administered to human patients, e.g., yielding an undesirable lowering of patient response to treatment.
[0131] The 1000-Genomes Project has the objective of identifying the most frequent variations in the human genome. This public domain project involved sequencing the genomes of more than 1000 individuals from diverse ethnic groups, comparing these sequences to the reference and assembling a catalogue of variants. This has enabled the annotation of variants in coding regions, but because this sequence wasn't derived from large clones of DNA, the analysis of the sequence from diploid individuals can't discriminate the distribution of the variation between the maternally and paternally inherited chromosomes. Where more than one variant is identified in a protein coding gene, it is not possible to illuminate the distribution of the pattern of variants in each version of the protein. For example, if two variants are detected in different positions of the same protein in an individual, this could have resulted from one copy with two variants and none in the other or each copy could have just one variant. To illuminate the sequence of real proteins, the 1000-Genomes Project has sequenced mother-father-child trios. This allows one to "phase" the sequence variants, in other words identify blocks of sequence that are inherited from one or other parent and deconvolute the variants.
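The phasing ambiguity described above, and its resolution using a trio, can be illustrated with a short Python sketch. This is a simplified illustration under stated assumptions, not the method used by the 1000-Genomes Project: alleles are coded 0 (reference) and 1 (variant), genotypes are unordered allele pairs per site, and the function names `possible_phasings` and `phase_child` are hypothetical.

```python
from itertools import product


def possible_phasings(n_het_sites):
    """Enumerate the distinct haplotype pairs consistent with an individual
    who is heterozygous (one 0 allele, one 1 allele) at every site."""
    pairs = set()
    for hap_a in product((0, 1), repeat=n_het_sites):
        hap_b = tuple(1 - allele for allele in hap_a)
        pairs.add(frozenset((hap_a, hap_b)))
    return pairs


def phase_child(child, mother, father):
    """Phase a child's genotype using parental genotypes.

    Each argument is a list of unordered allele pairs, one pair per site.
    Returns (maternal_haplotype, paternal_haplotype); None marks a site
    that the trio cannot resolve."""
    maternal, paternal = [], []
    for (c1, c2), m, f in zip(child, mother, father):
        # Keep only the assignments where the mother can supply the first
        # allele and the father can supply the second.
        options = {(a, b) for a, b in ((c1, c2), (c2, c1)) if a in m and b in f}
        a, b = options.pop() if len(options) == 1 else (None, None)
        maternal.append(a)
        paternal.append(b)
    return tuple(maternal), tuple(paternal)


# Two heterozygous sites: either one copy carries both variants,
# or each copy carries one variant.
print(len(possible_phasings(2)))

child = [(0, 1), (0, 1)]
# Father carries both variants, mother carries none.
print(phase_child(child, mother=[(0, 0), (0, 0)], father=[(0, 1), (0, 1)]))
# Each parent can supply only one of the two variants.
print(phase_child(child, mother=[(0, 1), (0, 0)], father=[(0, 0), (0, 1)]))
```

In the first trio both variants are assigned to the paternal copy; in the second, each copy carries one variant. Sites at which both parents are heterozygous remain unresolved in this simple scheme.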
[0132] To further understand the variation within the 1000-genome set a tool has been developed that can identify the significant variants (defined as non-synonymous amino acid changes) from a region of DNA from the phased data in the 1000-genome data set. This tool has been made available online at http://www.1000genomes.org/variation-pattern-finder. This tool allows an investigator to download non-synonymous variation delimited between specific coordinates. The downloaded files are configured as individual genotypes, but the data is phased so the haplotype information and the frequencies of specific haplotypes in different populations can be extracted.
[0133] The inventors' analysis of the 1000-genome data for the individual human coding segments of the C, V, D and J genes from the heavy and light chains reveals that there is significant variation in these segments. Individuals will usually have two different heavy chain alleles and also different light chain alleles at both kappa and lambda loci. The repertoire of antibodies that can be generated from each allele will be different. This variation will contribute to a better or differing immune response to certain antigens.
[0134] Humanized mice that have hitherto been generated with immunoglobulin heavy and light chain loci contain just one type of immunoglobulin locus. Even if these mice contain a full human heavy chain locus, the variation will be less than contained in a typical human because only one set of C, V, D and J genes is available, while a typical human would have two sets.
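By way of illustration only, extraction of haplotype frequencies per population from phased genotypes might be sketched in Python as follows. The input layout is an assumption and is not the file format of the tool referred to above: each individual is represented by a population label and a list of phased calls written "a|b" (first copy, second copy), following the common VCF convention for phased genotypes. The population labels and the function name `haplotype_frequencies` are hypothetical.

```python
from collections import Counter, defaultdict


def haplotype_frequencies(individuals):
    """Return {population: {haplotype: frequency}} from phased genotypes.

    `individuals` is an iterable of (population, calls), where `calls` is a
    list of strings such as "0|1", one per variant site in the region."""
    counts = defaultdict(Counter)
    for population, calls in individuals:
        split_calls = [call.split("|") for call in calls]
        # Read off the alleles of each chromosome copy across all sites.
        for copy in (0, 1):
            haplotype = "".join(site[copy] for site in split_calls)
            counts[population][haplotype] += 1
    return {
        population: {hap: n / sum(c.values()) for hap, n in c.items()}
        for population, c in counts.items()
    }


# Hypothetical phased data for two variant sites in one gene segment.
example = [
    ("POP_A", ["0|1", "0|1"]),
    ("POP_A", ["0|0", "0|0"]),
    ("POP_B", ["1|0", "0|1"]),
]
print(haplotype_frequencies(example))
```

Each individual contributes two haplotypes, so the frequencies are computed over chromosome copies rather than over individuals.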
[0135] The inventors have devised ways to improve on this limitation when constructing transgenic non-human vertebrates and cells for human antibody and variable region production in vivo.
[0136] Mice can be generated with two different loci, each engineered to have a different repertoire of V, D and J segments. This could be in a single mouse or two or more separate mouse strains and would be analogous to or beyond the repertoire found in a normal human. The engineering of such a mouse would go beyond the repertoire described in humanized mice to date which only have one set of alleles.
[0137] However, the inventors also realized that this also has limitations, because the different loci would not normally interact to shuffle V, D and J variants between loci. This same limitation is also inherent in a human, thus this system does not utilize the advantage of recombining variants in all combinations.
[0138] To go beyond the normal repertoire in humans and take advantage of combinations of C, V, D and J variants the inventors decided, in one embodiment, to provide these on the same chromosome in cis. See FIG. 6. These loci would be characterized by having more than the normal number of J, D or V genes. For example, n=6 for the J genes, but including one J6 variant and one J2 variant would increase this to n=8. This could be combined with additional variants for the D and V genes, for example. By detailed analysis of the 1000-Genomes database, the inventors have devised a collection of candidate polymorphic human variant gene segments, e.g., JH gene segments (e.g., see the examples), that can be built into the design of transgenic heavy and light chain loci in mice for expressing increasingly diverse and new, synthetic repertoires of human variable regions. Moreover, by utilizing naturally-occurring human variant gene segments, as per embodiments of the invention, this addresses compatibility with human patients since the inventors' analysis has drawn out candidate variants that are naturally conserved and sometimes very prevalent amongst human ethnic populations. Additionally, this enables one to tailor the configurations of the invention to provide for antibody-based drugs that better address specific human ethnic populations.
[0139] In an example according to any configuration of the invention, loci (and cells and vertebrates comprising these) are provided in which gene segments from different human populations are used. This is desirable to increase antibody gene diversity to better address more diverse human patients. In an example, the gene segments are from first and second different human populations respectively, and thus the second gene segment is found in the second human population, but not so (or rarely) in the first human population. Rarely means, for example, that the gene segment is found in 5, 4, 3, 2, or 1 or zero individuals in the first population in the 1000 Genomes database. For example, the first gene segment may be shown as present in a first population by reference to Table 13 or 14 herein, the second gene segment may be shown as present in the second population by reference to Table 13 and not in the first population. Optionally, the first gene segment may also be shown as being present in the second population by reference to Table 13 or 14.
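The selection criterion described above can be sketched in Python for illustration. The threshold of 5 individuals is taken from the definition of "rarely" given above; the segment names, population labels, carrier counts and the function name `candidate_second_segments` are all hypothetical and are not taken from Table 13 or Table 14.

```python
RARE_MAX_INDIVIDUALS = 5  # "rarely": found in 5 or fewer individuals


def candidate_second_segments(carriers, first_pop, second_pop,
                              rare_max=RARE_MAX_INDIVIDUALS):
    """Return gene segments found in `second_pop` but absent from, or rare
    in, `first_pop`.

    `carriers` maps segment name -> {population: number of individuals}."""
    selected = []
    for segment, counts in carriers.items():
        in_first = counts.get(first_pop, 0)
        in_second = counts.get(second_pop, 0)
        if in_second > 0 and in_first <= rare_max:
            selected.append(segment)
    return sorted(selected)


# Hypothetical carrier counts per population.
carrier_counts = {
    "segment-1": {"POP_A": 120, "POP_B": 95},  # common in both populations
    "segment-2": {"POP_A": 2, "POP_B": 40},    # rare in POP_A
    "segment-3": {"POP_B": 12},                # absent from POP_A
    "segment-4": {"POP_A": 30},                # absent from POP_B
}
print(candidate_second_segments(carrier_counts, "POP_A", "POP_B"))
```

With these hypothetical counts, segment-2 and segment-3 are selected as candidate second gene segments. A stricter rule, for example a minimum prevalence in the second population, could be added but is not stated in the text.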
[0140] In any configuration or aspect of the invention, where a V gene segment is used, this may be used optionally with the native leader sequence. For example, use of genomic DNA (e.g., from BACs as in the examples) will mean that the native leader will be used for each V gene segment incorporated into the locus and genomes of the invention. In an alternative, the skilled person may wish to insert a non-native leader sequence together with one or more of the V gene segments. Similarly, in any configuration or aspect of the invention, where a V gene segment is used, this may be used optionally with the native 5' UTR sequence. For example, use of genomic DNA (e.g., from BACs as in the examples) will mean that the native 5' UTR sequence will be used for each V gene segment incorporated into the locus and genomes of the invention. In an alternative, the skilled person may wish to exclude the native 5' UTR sequence.
The Present Invention Provides, in a First Configuration
(a) Superhuman Heavy Chain Gene Repertoires
[0141] A non-human vertebrate or vertebrate cell (optionally an ES cell or antibody-producing cell) comprising a genome having a superhuman immunoglobulin heavy chain human VH and/or D and/or J gene repertoire.
[0142] In one aspect the cell of the invention is an embryonic stem cell. For example, the ES cell is derived from the mouse C57BL/6N, C57BL/6J, 129S5 or 129Sv strain. In one aspect the non-human vertebrate is a rodent, suitably a mouse, and cells of the invention are rodent cells or ES cells, suitably mouse ES cells. The ES cells of the present invention can be used to generate animals using techniques well known in the art, which comprise injection of the ES cell into a blastocyst followed by implantation of chimaeric blastocysts into females to produce offspring which can be bred and selected for homozygous recombinants having the required insertion. In one aspect the invention relates to a transgenic animal comprised of ES cell-derived tissue and host embryo-derived tissue. In one aspect the invention relates to genetically-altered subsequent generation animals, which include animals that are homozygous recombinants for the VDJ and/or VJ regions.
[0143] The natural human immunoglobulin gene segment repertoire consists of (see e.g., www.imgt.org):--
[0144] VH: total-125; functional-41 DH: total-27; functional-23 JH: total-8; functional-6
[0145] Vk: total-77; functional-38 Jk: total-5; functional-5
[0146] V lambda: total-75; functional-31
[0147] J lambda: total-7; functional-5
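As a non-limiting sketch, the functional counts listed above can be used as thresholds for deciding whether a repertoire contains more than the natural human functional repertoire. The Python below assumes a repertoire is given simply as a collection of distinct segment names; the dictionary keys, the function name and the placeholder segment names are hypothetical.

```python
from typing import Iterable

# Functional counts of the natural human repertoire, as listed in the text above.
NATURAL_FUNCTIONAL = {
    "VH": 41,
    "DH": 23,
    "JH": 6,
    "Vk": 38,
    "Jk": 5,
    "V lambda": 31,
    "J lambda": 5,
}


def exceeds_natural_repertoire(segment_type: str, segments: Iterable[str]) -> bool:
    """True if the number of distinct functional segments is greater than the
    natural human functional count for that segment type."""
    return len(set(segments)) > NATURAL_FUNCTIONAL[segment_type]


# Placeholder names: six JH segments from a first individual,
# supplemented with one further segment from a second individual.
first_individual_jh = {f"JH_first_{i}" for i in range(1, 7)}
supplemented_jh = first_individual_jh | {"JH_second_variant"}
```

Here the first individual's six placeholder JH segments do not exceed the natural functional count of 6, but the supplemented set of seven does.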
[0148] In one embodiment, the vertebrate or cell genome comprises a transgenic immunoglobulin heavy chain locus comprising a plurality of human immunoglobulin VH gene segments, one or more human D gene segments and one or more human J gene segments, wherein the plurality of VH gene segments consists of more than the natural human repertoire of functional VH gene segments; optionally wherein the genome is homozygous for said transgenic heavy chain locus.
[0149] In one embodiment of the vertebrate or cell, the VH gene repertoire consists of a plurality of VH gene segments derived from the genome sequence of a first human individual, supplemented with one or more different VH gene segments derived from the genome sequence of a second, different human individual. Optionally the D and J segments are derived from the genome sequence of the first human individual. Optionally the VH gene segments from the genome sequence of the second individual are selected from the VH gene segments listed in Table 1, 13 or 14. In this way, the locus provides a superhuman repertoire of VH gene segments.
[0150] Optionally the individuals are not related. Individuals are "not related" in the context of any configuration or aspect of the invention, for example, if one of the individuals does not appear in a family tree of the other individual in the same generation or going back one, two, three or four generations. Alternatively, individuals are not related, for example, if they do not share a common ancestor in the present generation or going back one, two, three or four generations.
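One possible reading of the "not related" test above can be sketched as follows. The sketch assumes a pedigree is available as a mapping from each person to that person's known parents, and it treats two individuals as not related if neither is an ancestor of the other and they share no ancestor within the stated number of generations. The pedigree layout, the function names and the identifiers are hypothetical, and this is only one way of applying the definition.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: person -> tuple of known parents.
Pedigree = Dict[str, Tuple[str, ...]]


def ancestors(pedigree: Pedigree, person: str, generations: int) -> Set[str]:
    """Ancestors of `person` going back at most `generations` generations."""
    found: Set[str] = set()
    frontier = {person}
    for _ in range(generations):
        frontier = {parent for child in frontier
                    for parent in pedigree.get(child, ())}
        found |= frontier
    return found


def not_related(pedigree: Pedigree, a: str, b: str, generations: int = 4) -> bool:
    """True if neither individual is an ancestor of the other and they share
    no common ancestor within `generations` generations."""
    anc_a = ancestors(pedigree, a, generations)
    anc_b = ancestors(pedigree, b, generations)
    return a not in anc_b and b not in anc_a and not (anc_a & anc_b)


# Placeholder pedigree: A and B are first cousins (shared grandparents G1, G2);
# C has no recorded link to either.
pedigree: Pedigree = {
    "A": ("A_parent1", "A_parent2"),
    "B": ("B_parent1", "B_parent2"),
    "A_parent1": ("G1", "G2"),
    "B_parent1": ("G1", "G2"),
    "C": ("C_parent1", "C_parent2"),
}
```

In this placeholder pedigree, A and B are related when two or more generations are considered but not when only one generation is considered, which shows why the number of generations chosen matters.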
[0151] In one embodiment of the vertebrate or cell, the transgenic locus comprises more than 41 functional human VH gene segment species, and thus more than the natural human functional repertoire. Optionally the locus comprises at least 42, 43, 44, 45, 46, 47, 48, 49 or 50 functional human VH gene segment species (e.g., wherein the locus comprises the full functional VH repertoire of said first individual supplemented with one or more VH gene segments derived from the genome sequence of the second human individual and optionally with one or more VH gene segments derived from the genome sequence of a third human individual). In this way, the locus provides a superhuman repertoire of VH gene segments that is useful for generating novel gene and antibody diversity for use in therapeutic and tool antibody selection.
[0152] In one embodiment of the vertebrate or cell, the transgenic locus comprises a first VH gene segment derived from the genome sequence of the first individual and a second VH gene segment derived from the genome sequence of the second individual, wherein the second VH gene segment is a polymorphic variant of the first VH gene segment. For example, the VH gene segments are polymorphic variants of VH1-69 as illustrated in the examples below. Optionally the locus comprises a further polymorphic variant of the first VH gene segment (e.g., a variant derived from the genome sequence of a third human individual). In this way, the locus provides a superhuman repertoire of VH gene segments.
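For illustration, polymorphic variants of the same gene segment can be identified from segment names written in the "gene*allele" style used elsewhere in this document (e.g., JH6*02). The minimal Python sketch below groups names by gene and reports any gene represented by more than one allele. The function names are hypothetical, and the allele labels given to VH1-69 in the sample locus are illustrative placeholders rather than a statement of which alleles the examples use.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Optional, Tuple


def split_allele(name: str) -> Tuple[str, Optional[str]]:
    """Split 'JH6*02' into ('JH6', '02'); names without '*' have no allele."""
    gene, sep, allele = name.partition("*")
    return gene, (allele if sep else None)


def polymorphic_variant_groups(segments: Iterable[str]) -> Dict[str, List[str]]:
    """Map each gene present as two or more distinct alleles to its sorted alleles."""
    groups = defaultdict(set)
    for name in segments:
        gene, allele = split_allele(name)
        if allele is not None:
            groups[gene].add(allele)
    return {gene: sorted(alleles)
            for gene, alleles in groups.items() if len(alleles) > 1}


# Illustrative locus content; allele labels for VH1-69 are placeholders.
locus = ["VH1-69*01", "VH1-69*02", "JH6*02"]
variants = polymorphic_variant_groups(locus)
```

A locus for which this mapping is non-empty contains at least one gene segment together with a polymorphic variant of it.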
[0153] In one embodiment of the vertebrate or cell, the genome (alternatively or additionally to the superhuman VH diversity) comprises a transgenic immunoglobulin heavy chain locus comprising a plurality of human immunoglobulin VH gene segments, a plurality of human D gene segments and one or more human J gene segments, wherein the plurality of D gene segments consists of more than the natural human repertoire of functional D gene segments. Optionally the genome is homozygous for said transgenic heavy chain locus.
[0154] In one embodiment of the vertebrate or cell, the D gene repertoire consists of a plurality of D gene segments derived from the genome sequence of a (or said) first human individual, supplemented with one or more different D gene segments derived from the genome sequence of a (or said) second, different human individual. Optionally the individuals are not related. Optionally the J segments are derived from the genome sequence of the first human individual. Optionally the D gene segments from the genome sequence of the second individual are selected from the D gene segments listed in Table 2, 13 or 14. In this way, the locus provides a superhuman repertoire of D gene segments.
[0155] In one embodiment of the vertebrate or cell, the transgenic locus comprises more than 23 functional human D gene segment species; optionally wherein the locus comprises at least 24, 25, 26, 27, 28, 29, 30 or 31 functional human D gene segment species (e.g., wherein the locus comprises the full functional D repertoire of said first individual supplemented with one or more D gene segments derived from the genome sequence of the second human individual and optionally with one or more D gene segments derived from the genome sequence of a third human individual). In this way, the locus provides a superhuman repertoire of D gene segments.
[0156] In one embodiment of the vertebrate or cell, the transgenic locus comprises a first D gene segment derived from the genome sequence of the first individual and a second D gene segment derived from the genome sequence of the second individual, wherein the second D gene segment is a polymorphic variant of the first D gene segment. Optionally the locus comprises a further polymorphic variant of the first D gene segment (e.g., a variant derived from the genome sequence of a third human individual). In this way, the locus provides a superhuman repertoire of D gene segments.
[0157] In one embodiment of the vertebrate or cell (alternatively or additionally to the superhuman VH and/or D diversity), the genome comprises a (or said) transgenic immunoglobulin heavy chain locus comprising a plurality of human immunoglobulin VH gene segments, one or more human D gene segments and a plurality of human JH gene segments, wherein the plurality of J gene segments consists of more than the natural human repertoire of functional J gene segments; optionally wherein the genome is homozygous for said transgenic heavy chain locus.
[0158] In one embodiment of the vertebrate or cell, the JH gene repertoire consists of a plurality of J gene segments derived from the genome sequence of a (or said) first human individual, supplemented with one or more different J gene segments derived from the genome sequence of a (or said) second, different human individual. Optionally the individuals are not related. Optionally the D segments are derived from the genome sequence of the first human individual. Optionally the J gene segments from the genome sequence of the second individual are selected from the J gene segments listed in Table 3, 13 or 14. In this way, the locus provides a superhuman repertoire of JH gene segments.
[0159] In one embodiment of the vertebrate or cell, the transgenic locus comprises more than 6 functional human JH gene segments. Optionally the locus comprises at least 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16 functional human JH gene segments (e.g., wherein the locus comprises the full functional JH repertoire of said first individual supplemented with one or more JH gene segments derived from the genome sequence of the second human individual and optionally with one or more JH gene segments derived from the genome sequence of a third human individual). In this way, the locus provides a superhuman repertoire of JH gene segments.
[0160] In one embodiment of the vertebrate or cell, the transgenic locus comprises a first JH gene segment derived from the genome sequence of the first individual and a second JH gene segment derived from the genome sequence of the second individual, wherein the second JH gene segment is a polymorphic variant of the first JH gene segment. Optionally the locus comprises a further polymorphic variant of the first JH gene segment (e.g., a variant derived from the genome sequence of a third human individual). In this way, the locus provides a superhuman repertoire of JH gene segments.
[0161] (b) Superhuman Light Chain Gene Repertoires
[0162] The first configuration of the invention also provides:--
[0163] A non-human vertebrate or vertebrate cell (optionally an ES cell or antibody-producing cell) comprising a genome having a superhuman immunoglobulin light chain human VL gene repertoire. Optionally the vertebrate or cell comprises a heavy chain transgene according to aspect (a) of the first configuration. Thus, superhuman diversity is provided in both the heavy and light chain immunoglobulin gene segments in the cell and vertebrate. For example, the genome of the cell or vertebrate is homozygous for the heavy and light chain transgenes and endogenous antibody expression is inactivated. Such a vertebrate is useful for immunisation with a predetermined antigen to produce one or more selected antibodies that bind the antigen and have human variable regions resulting from recombination within the superhuman gene segment repertoire. This provides potentially for a novel antibody and gene sequence space from which to select therapeutic, prophylactic and tool antibodies.
[0164] In one embodiment of aspect (b) of the first configuration, the vertebrate or cell genome comprises
[0165] (i) a transgenic immunoglobulin kappa light chain locus comprising a plurality of human immunoglobulin VK gene segments and one or more human J gene segments, wherein the plurality of VK gene segments consists of more than the natural human repertoire of functional VK gene segments; optionally wherein the genome is homozygous for said transgenic kappa light chain locus; and/or
[0166] (ii) a transgenic immunoglobulin lambda light chain locus comprising a plurality of human immunoglobulin Vλ gene segments and one or more human J gene segments, wherein the plurality of Vλ gene segments consists of more than the natural human repertoire of functional Vλ gene segments; optionally wherein the genome is homozygous for said transgenic lambda light chain locus.
[0167] In this way, the locus provides a superhuman repertoire of VL gene segments. In one embodiment of the vertebrate or cell,
[0168] (i) the VK gene repertoire consists of a plurality of VK gene segments derived from the genome sequence of a first human individual, supplemented with one or more VK gene segments derived from the genome sequence of a second, different human individual; optionally wherein the individuals are not related; optionally wherein the J segments are derived from the genome sequence of the first human individual; and optionally wherein the VK gene segments from the genome sequence of the second individual are selected from the VK gene segments listed in Table 4, 13 or 14; and
[0169] (ii) the Vλ gene repertoire consists of a plurality of Vλ gene segments derived from the genome sequence of a first human individual, supplemented with one or more Vλ gene segments derived from the genome sequence of a second, different human individual; optionally wherein the individuals are not related; optionally wherein the J segments are derived from the genome sequence of the first human individual; and optionally wherein the Vλ gene segments from the genome sequence of the second individual are selected from the Vλ gene segments listed in Table 5, 13 or 14.
[0170] In this way, the locus provides a superhuman repertoire of VL gene segments.
[0171] In one embodiment of the vertebrate or cell,
[0172] the kappa light transgenic locus comprises more than 38 functional human VK gene segment species; optionally wherein the locus comprises at least 39, 40, 41, 42, 43, 44, 45, 46, 47 or 48 functional human VK gene segment species (e.g., wherein the locus comprises the full functional VK repertoire of said first individual supplemented with one or more VK gene segments derived from the genome sequence of the second human individual and optionally with one or more VK gene segments derived from the genome sequence of a third human individual); and
[0173] the lambda light transgenic locus comprises more than 31 functional human Vλ gene segment species; optionally wherein the locus comprises at least 32, 33, 34, 35, 36, 37, 38, 39, 40 or 41 functional human Vλ gene segment species (e.g., wherein the locus comprises the full functional Vλ repertoire of said first individual supplemented with one or more Vλ gene segments derived from the genome sequence of the second human individual and optionally with one or more Vλ gene segments derived from the genome sequence of a third human individual).
[0174] In this way, the locus provides a superhuman repertoire of VL gene segments.
[0175] In one embodiment of the vertebrate or cell,
[0176] the kappa light transgenic locus comprises a first VK gene segment derived from the genome sequence of the first individual and a second VK gene segment derived from the genome sequence of the second individual, wherein the second VK gene segment is a polymorphic variant of the first VK gene segment; optionally wherein the locus comprises a further polymorphic variant of the first VK gene segment (e.g., a variant derived from the genome sequence of a third human individual); and
[0177] the lambda light transgenic locus comprises a first Vλ gene segment derived from the genome sequence of the first individual and a second Vλ gene segment derived from the genome sequence of the second individual, wherein the second Vλ gene segment is a polymorphic variant of the first Vλ gene segment; optionally wherein the locus comprises a further polymorphic variant of the first Vλ gene segment (e.g., a variant derived from the genome sequence of a third human individual).
[0178] In this way, the locus provides a superhuman repertoire of VL gene segments.
[0179] In one embodiment of the vertebrate or cell, the genome comprises a (or said) transgenic immunoglobulin light chain locus comprising a plurality of human immunoglobulin VL gene segments and a plurality of human JL gene segments, wherein the plurality of J gene segments consists of more than the natural human repertoire of functional J gene segments; optionally wherein the genome is homozygous for said transgenic light chain locus.
[0180] In one embodiment of the vertebrate or cell,
[0181] (i) the JK gene repertoire consists of a plurality of JK gene segments derived from the genome sequence of a (or said) first human individual, supplemented with one or more JK gene segments derived from the genome sequence of a (or said) second, different human individual; optionally wherein the individuals are not related; optionally wherein the VK segments are derived from the genome sequence of the first human individual; optionally wherein the JK gene segments from the genome sequence of the second individual are selected from the JK gene segments listed in Table 6, 13 or 14; and
[0182] (ii) the Jλ gene repertoire consists of a plurality of Jλ gene segments derived from the genome sequence of a (or said) first human individual, supplemented with one or more Jλ gene segments derived from the genome sequence of a (or said) second, different human individual; optionally wherein the individuals are not related; optionally wherein the Vλ segments are derived from the genome sequence of the first human individual; optionally wherein the Jλ gene segments from the genome sequence of the second individual are selected from the Jλ gene segments listed in Table 7, 13 or 14.
[0183] In this way, the locus provides a superhuman repertoire of JL gene segments. In one embodiment of the vertebrate or cell,
[0184] (i) the transgenic light chain locus comprises more than 5 functional human JK gene segment species; optionally wherein the locus comprises at least 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 functional human JK gene segment species (e.g., wherein the locus comprises the full functional JK repertoire of said first individual supplemented with one or more JK gene segments derived from the genome sequence of the second human individual and optionally with one or more JK gene segments derived from the genome sequence of a third human individual); and/or
[0185] (ii) the transgenic light chain locus comprises more than 5 functional human Jλ gene segment species; optionally wherein the locus comprises at least 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 functional human Jλ gene segment species (e.g., wherein the locus comprises the full functional Jλ repertoire of said first individual supplemented with one or more Jλ gene segments derived from the genome sequence of the second human individual and optionally with one or more Jλ gene segments derived from the genome sequence of a third human individual).
[0186] In this way, the locus provides a superhuman repertoire of JL gene segments.
[0187] In one embodiment of the vertebrate or cell,
[0188] (i) the kappa light transgenic locus comprises a first JK gene segment derived from the genome sequence of the first individual and a second JK gene segment derived from the genome sequence of the second individual, wherein the second JK gene segment is a polymorphic variant of the first JK gene segment; optionally wherein the locus comprises a further polymorphic variant of the first JK gene segment (e.g., a variant derived from the genome sequence of a third human individual); and
[0189] (ii) the lambda light transgenic locus comprises a first Jλ gene segment derived from the genome sequence of the first individual and a second Jλ gene segment derived from the genome sequence of the second individual, wherein the second Jλ gene segment is a polymorphic variant of the first Jλ gene segment; optionally wherein the locus comprises a further polymorphic variant of the first Jλ gene segment (e.g., a variant derived from the genome sequence of a third human individual).
[0190] In this way, the locus provides a superhuman repertoire of JL gene segments. Further aspects of the first configuration are described below.
The Present Invention Provides, in a Second Configuration
[0191] A library of antibody-producing transgenic cells whose genomes collectively encode a repertoire of antibodies, wherein
[0192] (a) a first transgenic cell expresses a first antibody having a chain (e.g., heavy chain) encoded by a first immunoglobulin gene, the gene comprising a first variable domain nucleotide sequence produced following recombination of a first human unrearranged immunoglobulin gene segment (e.g., a VH);
[0193] (b) a second transgenic cell expresses a second antibody having a chain (e.g., a heavy chain) encoded by a second immunoglobulin gene, the second gene comprising a second variable domain nucleotide sequence produced following recombination of a second human unrearranged immunoglobulin gene segment (e.g., a VH), the first and second antibodies being non-identical;
[0194] (c) the first and second gene segments are different and derived from the genome sequences of first and second human individuals respectively, wherein the individuals are different; and optionally not related;
[0195] (d) wherein the cells are non-human vertebrate (e.g., mouse or rat) cells (e.g., B-cells or hybridomas).
[0196] In one embodiment, the library is provided in vitro. In another embodiment, the library is provided in vivo by one or a plurality of transgenic non-human vertebrates. For example, the or each vertebrate is according to any aspect of the first configuration of the invention.
[0197] In one embodiment, the library encodes an antibody repertoire of from 10 to 10^9 antibodies, for example, 10, 20, 30, 40, 50, 100 or 1000 to 10^8; or 10, 20, 30, 40, 50, 100 or 1000 to 10^7; or 10, 20, 30, 40, 50, 100 or 1000 to 10^6; or 10, 20, 30, 40, 50, 100 or 1000 to 10^5; or 10, 20, 30, 40, 50, 100 or 1000 to 10^4 antibodies. In an example, the library encodes an antibody repertoire of at least 10^3, 10^4, 10^5, 10^6, 10^7, 10^8, 10^9, or 10^10 antibodies.
[0198] The first variable domain nucleotide sequence is produced following recombination of the first human unrearranged immunoglobulin gene segment with one or more other immunoglobulin gene segments (for example, human immunoglobulin gene segments). For example, where the first gene segment is a VH, the first variable domain nucleotide sequence (a VH domain) is produced following recombination of the VH with human D and JH segments in vivo, optionally with somatic hypermutation, in the first transgenic cell or an ancestor thereof. For example, where the first gene segment is a VL, the first variable domain nucleotide sequence (a VL domain) is produced following recombination of the VL with a human JL segment in vivo, optionally with somatic hypermutation, in the first transgenic cell or an ancestor thereof.
[0199] The second variable domain nucleotide sequence is produced following recombination of the second human unrearranged immunoglobulin gene segment with one or more other immunoglobulin gene segments (for example, human immunoglobulin gene segments). For example, where the second gene segment is a VH, the second variable domain nucleotide sequence (a VH domain) is produced following recombination of the VH with human D and JH segments in vivo, optionally with somatic hypermutation, in the second transgenic cell or an ancestor thereof. For example, where the second gene segment is a VL, the second variable domain nucleotide sequence (a VL domain) is produced following recombination of the VL with a human JL segment in vivo, optionally with somatic hypermutation, in the second transgenic cell or an ancestor thereof.
[0200] The first and second gene segments are respectively derived from genome sequences of first and second human individuals. In one example, such a gene segment is isolated or cloned from a sample cell taken from said individual using standard molecular biology techniques as known to the skilled person. The sequence of the gene segment may be mutated (e.g., by the introduction of up to 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 nucleotide changes) prior to use in the present invention. In another example, a gene segment is derived by identifying a candidate human immunoglobulin gene segment in a database (see guidance below) and a nucleotide sequence encoding a gene segment for use in the present invention is made by reference (e.g., to be identical or a mutant with up to 5, 6, 7, 8, 9, 10, 11, 12, 13, 14 or 15 nucleotide changes to the reference sequence) to the database sequence. The skilled person will be aware of methods of obtaining nucleotide sequences by reference to databases or by obtaining from cellular samples.
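As a simple illustration of comparing a candidate sequence against a database reference, the Python sketch below counts nucleotide changes as position-by-position substitutions between two sequences of equal length and checks the count against an allowed maximum. Treating "nucleotide changes" as substitutions only (no insertions or deletions) is an assumption of this sketch, and the function names and toy sequences are hypothetical.

```python
def nucleotide_changes(candidate: str, reference: str) -> int:
    """Count positions at which two equal-length sequences differ.

    Assumption of this sketch: changes are substitutions only, so the
    sequences must have the same length.
    """
    if len(candidate) != len(reference):
        raise ValueError("sequences must be the same length for this comparison")
    return sum(1 for c, r in zip(candidate.upper(), reference.upper()) if c != r)


def within_allowed_changes(candidate: str, reference: str,
                           max_changes: int = 15) -> bool:
    """True if the candidate is identical to the reference or differs from it
    by no more than `max_changes` substitutions."""
    return nucleotide_changes(candidate, reference) <= max_changes


# Toy sequences (not real gene segment sequences).
reference_seq = "ACGTACGTAC"
candidate_seq = "ACGTTCGTAA"
change_count = nucleotide_changes(candidate_seq, reference_seq)
```

The default maximum of 15 corresponds to the largest value recited in the text; a stricter limit, such as 5, can be passed as `max_changes`.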
[0201] In one embodiment of the vertebrate, cell or library of any configuration of the invention, the first and second human individuals are members of first and second ethnic populations respectively, wherein the populations are different. This, therefore, provides for superhuman gene diversity in transgenic loci, cells and vertebrates as per the invention.
Human Populations
[0202] Optionally the ethnic populations are selected from those identified in the 1000 Genomes Project database. In this respect, see Table 8 which provides details of the ethnic populations on which the 1000 Genomes database is based.
[0203] N A Rosenberg et al (Science 20 Dec. 2002: vol. 298 no. 5602 2342-2343) studied the genetic structure of human populations of differing geographical ancestry. In total, 52 populations were sampled, these being populations with:
African Ancestry
[0204] (Mbuti Pygmies, Biaka Pygmies, San peoples, and speakers of Niger-Kordofanian languages (Bantu, Yoruba or Mandenka populations)),
Eurasian Ancestry
[0205] (European ancestry (Orcadian, Adygei, Basque, French, Russians, Italians, Sardinian, Tuscan), Middle Eastern ancestry (Mozabite, Bedouin, Druze, Palestinians), Central/South Asian ancestry (Balochi, Brahui, Makrani, Sindhi, Pathan, Burusho, Hazara, Uygur, Kalash)),
East Asian Ancestry
[0206] (Han, Dai, Daur, Hezhen, Lahu, Miao, Oroqen, She, Tujia, Tu, Xibo, Yi, Mongola, Naxi, Cambodian, Japanese, Yakut), Oceanic ancestry (Melanesian, Papuan); or
Americas Ancestry
(Karitiana, Surui, Colombian, Maya, Pima).
[0207] The International HapMap Project, Nature, 2003 Dec. 18; 426(6968):789-96, discloses the goal of the HapMap Project: to determine the common patterns of DNA sequence variation in the human genome by determining the genotypes of one million or more sequence variants, their frequencies and the degree of association between them in DNA samples from populations with ancestry from parts of Africa, Asia and Europe. The relevant human populations of differing geographical ancestry include Yoruba, Japanese, Chinese, Northern European and Western European populations. More specifically:--
[0208] Utah population with Northern or Western European ancestry (samples collected in 1980 by the Centre d'Etude du Polymorphisme Humain (CEPH)); population with ancestry of Yoruba people from Ibadan, Nigeria; population with Japanese ancestry; and population with ancestry of Han Chinese from China.
[0209] The authors, citing earlier publications, suggest that ancestral geography is a reasonable basis for sampling human populations.
[0210] A suitable sample of human populations from which the populations used in the present invention are selected is as follows:--
[0211] (a) European ancestry
[0212] (b) Northern European ancestry; Western European ancestry; Toscani ancestry; British ancestry, Finnish ancestry or Iberian ancestry.
[0213] (c) More specifically, population of Utah residents with Northern and/or Western European ancestry; Toscani population in Italia; British population in England and/or Scotland; Finnish population in Finland; or Iberian population in Spain.
[0214] (a) East Asian ancestry
[0215] (b) Japanese ancestry; Chinese ancestry or Vietnamese ancestry.
[0216] (c) More specifically, Japanese population in Tokyo, Japan; Han Chinese population in Beijing, China; Chinese Dai population in Xishuangbanna; Kinh population in Ho Chi Minh City, Vietnam; or Chinese population in Denver, Colo., USA.
[0217] (a) West African ancestry
[0218] (b) Yoruba ancestry; Luhya ancestry; Gambian ancestry; or Malawian ancestry.
[0219] (c) More specifically, Yoruba population in Ibadan, Nigeria; Luhya population in Webuye, Kenya; Gambian population in Western Division, The Gambia; or Malawian population in Blantyre, Malawi.
[0220] (a) Population of The Americas
[0221] (b) Native American ancestry; Afro-Caribbean ancestry; Mexican ancestry; Puerto Rican ancestry; Colombian ancestry; or Peruvian ancestry.
[0222] (c) More specifically, population of African Ancestry in Southwest US; population of African American in Jackson, Miss.; population of African Caribbean in Barbados; population of Mexican Ancestry in Los Angeles, Calif.; population of Puerto Rican in Puerto Rico; population of Colombian in Medellin, Colombia; or population of Peruvian in Lima, Peru.
[0223] (a) South Asian ancestry
[0224] (b) Ahom ancestry; Kayadtha ancestry; Reddy ancestry; Maratha ancestry; or Punjabi ancestry.
[0225] (c) More specifically, Ahom population in the State of Assam, India; Kayadtha population in Calcutta, India; Reddy population in Hyderabad, India; Maratha population in Bombay, India; or Punjabi population in Lahore, Pakistan.
[0226] In any configuration of the invention, in one embodiment, each human population is selected from a population marked "(a)" above.
[0227] In any configuration of the invention, in another embodiment, each human population is selected from a population marked "(b)" above.
[0228] In any configuration of the invention, in another embodiment, each human population is selected from a population marked "(c)" above.
[0229] In one embodiment of the vertebrate, cell or library of the invention, the first and second ethnic populations are selected from the group consisting of an ethnic population with
[0230] European ancestry, an ethnic population with East Asian ancestry, an ethnic population with West African ancestry, an ethnic population with Americas ancestry and an ethnic population with South Asian ancestry.
[0231] In one embodiment of the vertebrate, cell or library of the invention, the first and second ethnic populations are selected from the group consisting of an ethnic population with Northern European ancestry; or an ethnic population with Western European ancestry; or an ethnic population with Toscani ancestry; or an ethnic population with British ancestry; or an ethnic population with Icelandic ancestry; or an ethnic population with Finnish ancestry; or an ethnic population with Iberian ancestry; or an ethnic population with Japanese ancestry; or an ethnic population with Chinese ancestry; or an ethnic population with Vietnamese ancestry; or an ethnic population with Yoruba ancestry; or an ethnic population with Luhya ancestry; or an ethnic population with Gambian ancestry; or an ethnic population with Malawian ancestry; or an ethnic population with Native American ancestry; or an ethnic population with Afro-Caribbean ancestry; or an ethnic population with Mexican ancestry; or an ethnic population with Puerto Rican ancestry; or an ethnic population with Colombian ancestry; or an ethnic population with Peruvian ancestry; or an ethnic population with Ahom ancestry; or an ethnic population with Kayadtha ancestry; or an ethnic population with Reddy ancestry; or an ethnic population with Maratha ancestry; or an ethnic population with Punjabi ancestry.
[0232] In one embodiment of any configuration of the vertebrate, cell or library of the invention, the human immunoglobulin gene segment derived from the genome sequence of the second individual is low-frequency (optionally rare) within the second ethnic population. Optionally the human immunoglobulin gene segment has a Minor Allele Frequency (MAF) (cumulative frequency) of between 0.5%-5%, optionally less than 0.5%, in the second human population, e.g., as in the 1000 Genomes database.
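The frequency bands in the preceding paragraph can be sketched as a simple classification, reading "low-frequency" as an MAF of 0.5% to 5% and "rare" as an MAF below 0.5%. In the Python below, the function names, the "common" label for anything above 5%, the inclusive treatment of the 0.5% and 5% boundaries, and the way the frequency is computed from allele counts are assumptions made for illustration only.

```python
def cumulative_frequency(allele_count: int, total_chromosomes: int) -> float:
    """Fraction of sampled chromosomes carrying the allele (assumed definition)."""
    if total_chromosomes <= 0:
        raise ValueError("total_chromosomes must be positive")
    return allele_count / total_chromosomes


def classify_by_maf(maf: float) -> str:
    """Classify an allele by Minor Allele Frequency given as a fraction (0.05 = 5%)."""
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must be a fraction between 0 and 1")
    if maf < 0.005:        # below 0.5%
        return "rare"
    if maf <= 0.05:        # 0.5% to 5%, boundaries assumed inclusive
        return "low-frequency"
    return "common"        # label assumed; not a term used in the text


# Placeholder counts: 3 copies observed among 2000 sampled chromosomes.
example_maf = cumulative_frequency(3, 2000)
example_class = classify_by_maf(example_maf)
```

A gene segment classified as "low-frequency" or "rare" in the second population by such a test would fall within the optional feature described above.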
[0233] In one embodiment of any configuration of the vertebrate, cell or library of the invention, the first variable region nucleotide sequence is produced by recombination of the first human immunoglobulin gene segment with a first J gene segment and optionally a first D gene segment, wherein the first human immunoglobulin gene segment is a V gene segment and the V, D and J segments are derived from the first human population, optionally from the genome of one individual of the first human population.
[0234] In one embodiment of the vertebrate, cell or library of the invention, the second variable region nucleotide sequence is produced by recombination of the second human immunoglobulin gene segment with a second J gene segment and optionally a second D gene segment, wherein the second human immunoglobulin gene segment is a V gene segment derived from the second population and the D and/or J segments are derived from the first human population, optionally the D and J gene segments being from the genome of one individual of the first human population.
[0235] In one embodiment of the vertebrate, cell or library of the invention, all of the D and J segments that have been recombined with the first and second V gene segments are D and J segments derived from the first human population, optionally the D and J gene segments being from the genome of one individual of the first human population.
[0236] In one embodiment of the library, the second human immunoglobulin gene segment is a polymorphic variant of the first human immunoglobulin gene segment; optionally wherein the second gene segment is selected from the group consisting of a gene segment in any of Tables 1 to 7 and 9 to 14 (e.g., selected from Table 13 or 14).
[0237] In one embodiment of the library, the first and second human immunoglobulin gene segments are both (i) VH gene segments; (ii) D segments; (iii) J segments (optionally both JH segments, both JK segments or both J segments); (iv) constant regions (optionally both a gamma constant region, optionally both a C gamma-1 constant region); (v) CH1 regions; (vi) CH2 regions; or (vii) CH3 regions.
[0238] The library is, for example, a naive library and optionally has a library size of from 10 or 10^2 to 10^9 cells. For example, from 10, 20, 30, 40, 50, 100 or 1000 to 10^8; or 10, 20, 30, 40, 50, 100 or 1000 to 10^7; or 10, 20, 30, 40, 50, 100 or 1000 to 10^6; or 10, 20, 30, 40, 50, 100 or 1000 to 10^5; or 10, 20, 30, 40, 50, 100 or 1000 to 10^4 cells.
[0239] The library has, for example, been selected against a predetermined antigen and optionally has a library size of from 10 or 10^2 to 10^9 cells. For example, from 10, 20, 30, 40, 50, 100 or 1000 to 10^8; or 10, 20, 30, 40, 50, 100 or 1000 to 10^7; or 10, 20, 30, 40, 50, 100 or 1000 to 10^6; or 10, 20, 30, 40, 50, 100 or 1000 to 10^5; or 10, 20, 30, 40, 50, 100 or 1000 to 10^4 cells.
[0240] In one embodiment of the library of the invention, said first and second cells are progeny of first and second ancestor non-human vertebrate cells respectively, wherein the first ancestor cell comprises a genome comprising said first human immunoglobulin gene segment; and the second ancestor cell comprises a genome comprising said second human immunoglobulin gene segment.
[0241] The invention further provides a library of antibody-producing transgenic cells whose genomes collectively encode a repertoire of antibodies, wherein the library comprises the first and second ancestor cells described above.
[0242] The invention further provides a library of hybridoma cells produced by fusion of the library of the invention (e.g., a B-cell library) with fusion partner cells and optionally has a library size of from 10 or 10^2 to 10^9 cells. For example, from 10, 20, 30, 40, 50, 100 or 1000 to 10^8; or 10, 20, 30, 40, 50, 100 or 1000 to 10^7; or 10, 20, 30, 40, 50, 100 or 1000 to 10^6; or 10, 20, 30, 40, 50, 100 or 1000 to 10^5; or 10, 20, 30, 40, 50, 100 or 1000 to 10^4 cells. Production of hybridomas is well known to the skilled person. Examples of fusion partners are SP2/0-Ag14 (obtainable from ECACC), P3X63-Ag8.653 (obtainable from LGC Standards; CRL-1580), NS1 and NS0 cells. PEG fusion or electrofusion can be carried out, as is conventional.
The Invention Provides, in a Third Configuration:--
[0243] An isolated antibody having
[0244] (a) a heavy chain encoded by a nucleotide sequence produced following recombination in a transgenic non-human vertebrate cell of an unrearranged human immunoglobulin V gene segment with a human D and human J segment, optionally with affinity maturation in said cell, wherein one of the gene segments (e.g., VH) is derived from the genome of an individual from a first human ethnic population; and the other two gene segments (e.g., D and JH) are derived from the genome of an individual from a second (e.g., a second and third respectively), different, human ethnic population, and wherein the antibody comprises heavy chain constant regions (e.g., C gamma) of said non-human vertebrate (e.g., rodent, mouse or rat heavy chain constant regions); and/or
[0245] (b) a light chain encoded by a nucleotide sequence produced following recombination in a transgenic non-human vertebrate cell of an unrearranged human immunoglobulin V gene segment with a human J segment, optionally with affinity maturation in said cell, wherein one of the gene segments (e.g., VL) is derived from the genome of an individual from a first human ethnic population (optionally the same as the first population in (a)); and the other gene segment (e.g., JL) is derived from the genome of an individual from a second, different, human ethnic population (optionally the same as the second population in (a)), and wherein the antibody comprises light chain constant regions of said non-human vertebrate (e.g., rodent, mouse or rat light chain constant regions);
[0246] (c) Optionally wherein each variable domain of the antibody is a human variable domain.
[0247] (d) Optionally wherein the heavy chain constant regions are mu- or gamma-type constant regions.
[0248] The invention also provides an isolated nucleotide sequence encoding the antibody of the third configuration, optionally wherein the sequence is provided in an antibody expression vector, optionally in a host cell. Suitable vectors are mammalian expression vectors (e.g., CHO cell vectors or HEK293 cell vectors), yeast vectors (e.g., a vector for expression in Pichia pastoris), or a bacterial expression vector, e.g., a vector for E. coli expression.
[0249] The invention also provides a method of producing a human antibody, the method comprising replacing the non-human vertebrate constant regions of the antibody of the third configuration with human antibody constant regions (e.g., a C variant disclosed in table 13 or 18). The skilled person will be aware of standard molecular biology techniques to do this. For example, see Harlow, E. & Lane, D. 1998, 5th edition, Antibodies: A Laboratory Manual, Cold Spring Harbor Lab. Press, Plainview, N.Y.; and Pasqualini and Arap, Proceedings of the National Academy of Sciences (2004) 101:257-259 for standard immunisation. Joining of the variable regions of an antibody to a human constant region can be effected by techniques readily available in the art, such as using conventional recombinant DNA and RNA technology as will be apparent to the skilled person. See e.g. Sambrook, J and Russell, D. (2001, 3rd edition) Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Lab. Press, Plainview, N.Y.).
[0250] In one embodiment, the method comprises further making a mutant or derivative of the antibody.
[0251] The invention also provides a pharmaceutical composition comprising an antibody according to the third configuration, or a human antibody of the invention and a diluent, excipient or carrier; optionally wherein the composition is provided in a container connected to an IV needle or syringe or in an IV bag.
[0252] The invention also provides an antibody-producing cell (e.g., a mammalian cell, e.g., CHO or HEK293; a yeast cell, e.g., P. pastoris; a bacterial cell, e.g., E. coli; a B-cell; or a hybridoma) that expresses the second antibody of the third configuration or the isolated antibody of the invention.
The First Configuration of the Invention Also Provides:--
[0253] A non-human vertebrate or vertebrate cell (optionally an ES cell or antibody-producing cell) whose genome comprises a transgenic immunoglobulin locus (e.g., a heavy chain locus or a light chain locus), said locus comprising immunoglobulin gene segments according to the first and second human immunoglobulin gene segments (optionally V segments) described above in connection with the third configuration. The gene segments are operably connected upstream of an immunoglobulin constant region; optionally wherein the genome is homozygous for said transgenic immunoglobulin locus.
[0254] Optionally the immunoglobulin locus comprises more than the natural human complement of functional V gene segments; and/or
[0255] Optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional D gene segments; and/or
[0256] Optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional J gene segments.
[0257] In this way, a superhuman immunoglobulin gene repertoire is provided in a transgenic non-human vertebrate or vertebrate cell according to the invention.
The First Configuration Also Provides:--
[0258] A transgenic non-human vertebrate (e.g., a mouse or rat) or vertebrate cell (optionally an ES cell or antibody-producing cell) whose genome comprises a transgenic immunoglobulin locus comprising a plurality of human immunoglobulin gene segments operably connected upstream of a non-human vertebrate constant region for the production of a repertoire of chimaeric antibodies, or chimaeric light or heavy chains, having a non-human vertebrate constant region and a human variable region; wherein the transgenic locus comprises one or more human immunoglobulin V gene segments, one or more human J gene segments and optionally one or more human D gene segments, a first (optionally a V segment) of said gene segments and a second (optionally a V segment) of said gene segments being different and derived from the genomes of first and second human individuals respectively, wherein the individuals are different; and optionally not related;
optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional V gene segments; and/or optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional D gene segments; and/or optionally wherein the immunoglobulin locus comprises more than the natural human complement of functional J gene segments.
[0259] In this way, a superhuman immunoglobulin gene repertoire is provided in a transgenic non-human vertebrate or vertebrate cell according to the invention.
The First Configuration Also Provides:--
[0260] A transgenic non-human vertebrate (e.g., a mouse or rat) or vertebrate cell (optionally an ES cell or antibody-producing cell) whose genome comprises first and second transgenic immunoglobulin loci, each locus comprising a plurality of human immunoglobulin gene segments operably connected upstream of a non-human vertebrate constant region for the production of a repertoire of chimaeric antibodies, or chimaeric light or heavy chains, having a non-human vertebrate constant region and a human variable region;
wherein (i) the first transgenic locus comprises one or more human immunoglobulin V gene segments, one or more human J gene segments and optionally one or more human D gene segments, (ii) the second transgenic locus comprises one or more human immunoglobulin V gene segments, one or more human J gene segments and optionally one or more human D gene segments; and (iii) wherein a first (optionally a V) gene segment of said first locus and a second (optionally a V) gene segment of said second gene locus are different and derived from the genomes of first and second human individuals respectively, wherein the individuals are different; and optionally not related; optionally wherein the first and second loci are on different chromosomes (optionally chromosomes with the same chromosome number) in said genome; optionally wherein each immunoglobulin locus comprises more than the natural human complement of functional V gene segments; and/or optionally wherein each immunoglobulin locus comprises more than the natural human complement of functional D gene segments; and/or optionally wherein each immunoglobulin locus comprises more than the natural human complement of functional J gene segments.
[0261] In this way, a superhuman immunoglobulin gene repertoire is provided in a transgenic non-human vertebrate or vertebrate cell according to the invention.
[0262] In these embodiments of the first configuration, the immunoglobulin gene segments are optionally as described for the third configuration.
[0263] In these embodiments of the first configuration, the genome optionally comprises a third immunoglobulin gene segment (optionally a V segment), the third gene segment being derived from a human individual that is different from the individual from which the first (and optionally also the second) gene segment is derived; optionally wherein the first, second and third gene segments are polymorphic variants of a human immunoglobulin gene segment (e.g., VH1-69--see the examples for further description).
[0264] In these embodiments of the first configuration, the genome of the vertebrate or cell is optionally homozygous for the first, second and optional third gene segment, wherein a copy of the first, second and optional third gene segments are provided together on the same chromosome operably connected upstream of a common non-human vertebrate constant region.
[0265] For example, each first, second and optional third gene segment is a V gene segment.
[0266] In one example, the library of the invention is provided by a collection of non-human vertebrates (optionally a collection of rodents, mice or rats); optionally, wherein a first member of said collection produces said first antibody but not said second antibody, and a second member of the collection produces said second antibody (but optionally not said first antibody). It is therefore contemplated to make non-human vertebrates where different human genomes have been used as a source for building the transgenic loci in the vertebrates. For example, a first vertebrate comprises a transgenic heavy chain locus having gene segments only from a first (and optionally a second) human population or individual; a second vertebrate comprises a transgenic heavy chain locus having gene segments only from a third (and optionally a fourth) human population or individual; and optionally third and more vertebrates can be built similarly based on unique or overlapping human population genomes. However, when provided as a mixed population of transgenic vertebrates, the mixed population provides a collective pool of human immunoglobulin genes that is greater than found in a natural human repertoire. This is useful to extend the antibody and gene sequence space beyond those possible with prior transgenic mice and rats bearing human immunoglobulin loci. As explained above, these have been based on a single human genome.
[0267] In one embodiment, the collection of non-human vertebrates bear human immunoglobulin genes confined to human populations that are together grouped under the same population genus "(a)" mentioned above. This provides for a gene repertoire that is biased to producing human antibody variable regions prevalent in the population genus (a) and thus useful for generating antibody therapeutics/prophylactics for members of said population. Alternatively, where gene segments from different human populations are provided in a single transgene according to the invention (not necessarily in a collection of vertebrates), the different human populations are for example together grouped under the same population genus "(a)" mentioned above.
[0268] The invention also provides a repertoire of antibodies expressed from a library of cells according to the invention.
[0269] In the non-human vertebrate or cell of any configuration of the invention, the constant region of the transgenic locus is, in one example, an endogenous constant region of said vertebrate (e.g., endogenous mouse or rat constant region, e.g., from the same strain of mouse or rat as the non-human vertebrate itself).
[0270] The invention also provides a method of constructing a cell (e.g., an ES cell) according to the invention, the method comprising
[0271] (a) identifying functional V and J (and optionally D) gene segments of the genome sequence of a (or said) first human individual;
[0272] (b) identifying one or more functional V and/or D and/or J gene segments of the genome sequence of a (or said) second human individual, wherein these additional gene segments are not found in the genome sequence of the first individual;
[0273] (c) and constructing a transgenic immunoglobulin locus in the cell, wherein the gene segments of (a) and (b) are provided in the locus operably connected upstream of a constant region.
[0274] Optionally the cell comprises a heavy chain locus constructed according to steps (a) to (c) and/or a light chain locus (kappa and/or lambda loci) constructed according to steps (a) to (c).
[0275] Optionally the cell is homozygous for the or each transgenic locus; optionally wherein antibody expression from loci endogenous to said cell has been inactivated. This is useful for confining the functional antibody gene repertoire, and thus antibody production, to antibodies bearing human variable regions.
[0276] Optionally the gene segment(s) in step (b) are identified from an immunoglobulin gene database selected from the 1000 Genomes, Ensembl, Genbank and IMGT databases.
[0277] Optionally the first and second human individuals are members of first and second ethnic populations respectively, wherein the populations are different, optionally wherein the human immunoglobulin gene segment derived from the genome sequence of the second individual is low-frequency (optionally rare) within the second ethnic population.
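Steps (a) to (c) of the construction method above can be illustrated with a minimal Python sketch. It treats each individual's functional gene segments as a collection of allele names and takes a set difference to find the additional segments of step (b). The function names and the example allele lists are hypothetical and for illustration only; they do not describe any actual individual or the inventors' own tooling.

```python
from typing import Iterable, List


def additional_segments(first_individual: Iterable[str],
                        second_individual: Iterable[str]) -> List[str]:
    """Step (b): segments of the second individual not found in the first."""
    return sorted(set(second_individual) - set(first_individual))


def locus_design(first_individual: Iterable[str],
                 second_individual: Iterable[str]) -> List[str]:
    """Step (c), as a list only: segments of (a) followed by those of (b).

    The physical order of segments in a real locus is not modelled here.
    """
    base = sorted(set(first_individual))
    return base + additional_segments(first_individual, second_individual)


if __name__ == "__main__":
    # Hypothetical allele lists, assumed purely for this example.
    first = ["VH1-69*01", "D3-3*01", "JH6*02"]
    second = ["VH1-69*02", "D3-3*01", "JH6*02"]
    print(additional_segments(first, second))
    print(locus_design(first, second))
```

In this sketch a polymorphic variant counts as an additional segment whenever its allele name differs, which is one possible reading of "not found in the genome sequence of the first individual".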
[0278] The invention also provides a method of making a transgenic non-human vertebrate (e.g., a mouse or rat), the method comprising
[0279] (a) constructing an ES cell (e.g., a mouse C57BL/6N, C57BL/6J, 129S5 or 129Sv strain ES cell) by carrying out the method above;
[0280] (b) injecting the ES cell into a donor non-human vertebrate blastocyst (e.g., a mouse C57BL/6N, C57BL/6J, 129S5 or 129Sv strain blastocyst);
[0281] (c) implanting the blastocyst into a foster non-human vertebrate mother (e.g., a C57BL/6N, C57BL/6J, 129S5 or 129Sv strain mouse); and
[0282] (d) obtaining a child from said mother, wherein the child genome comprises a transgenic immunoglobulin locus.
[0283] The invention provides a transgenic non-human vertebrate (e.g., a mouse or rat) made by the method or a progeny thereof. The invention also provides a population of such non-human vertebrates.
[0284] Microinjection of ES cells into blastocysts and generation of transgenic mice thereafter are conventional practices in the state of the art, and the skilled person is aware of techniques useful to effect this. C57BL/6N, C57BL/6J, 129S5 or 129Sv mouse strains and ES cells are readily and publicly available.
[0285] The invention also provides a method of isolating an antibody that binds a predetermined antigen (e.g., a bacterial or viral pathogen antigen), the method comprising
[0286] (a) providing a vertebrate (optionally a mouse or rat) according to the invention;
[0287] (b) immunizing (e.g., using a standard prime-boost method) said vertebrate with said antigen (optionally wherein the antigen is an antigen of an infectious disease pathogen);
[0288] (c) removing B lymphocytes from the vertebrate and selecting one or more B lymphocytes expressing antibodies that bind to the antigen;
[0289] (d) optionally immortalizing said selected B lymphocytes or progeny thereof, optionally by producing hybridomas therefrom; and
[0290] (e) isolating an antibody (e.g., an IgG-type antibody) expressed by the B lymphocytes; and
[0291] (f) optionally producing a derivative or variant of the antibody.
[0292] This method optionally further comprises after step (e) the step of isolating from said B lymphocytes nucleic acid encoding said antibody that binds said antigen; optionally exchanging the heavy chain constant region nucleotide sequence of the antibody with a nucleotide sequence encoding a human or humanized heavy chain constant region and optionally affinity maturing the variable region of said antibody; and optionally inserting said nucleic acid into an expression vector and optionally a host.
Bioinformatics Analysis & Selection of Immunoglobulin Gene Segments
[0293] See also the discussion on variation analysis above.
[0294] The skilled person will know of sources of human antibody gene sequences, such as IMGT (www.imgt.org) and GenBank (www.ncbi.nlm.nih.gov/genbank). Bioinformatics tools for database manipulation are also readily available and known to the skilled person, e.g., as publicly available from the 1000 Genomes Project/EBI (www.1000genomes.org).
[0295] As a source of antibody gene segment sequences, the skilled person will also be aware of the following available databases and resources (including updates thereof):--
[0296] 1.1. The Kabat Database (G. Johnson and T. T. Wu, 2002; http://www.kabatdatabase.com). Created by E. A. Kabat and T. T. Wu in 1966, the Kabat database publishes aligned sequences of antibodies, T-cell receptors, major histocompatibility complex (MHC) class I and II molecules, and other proteins of immunological interest. A searchable interface is provided by the SeqhuntII tool, and a range of utilities is available for sequence alignment, sequence subgroup classification, and the generation of variability plots. See also Kabat, E. A., Wu, T. T., Perry, H., Gottesman, K., and Foeller, C. (1991) Sequences of Proteins of Immunological Interest, 5th ed., NIH Publication No. 91-3242, Bethesda, Md., which is incorporated herein by reference, in particular with reference to human gene segments for use in the present invention.
[0297] 1.2. KabatMan (A. C. R. Martin, 2002; http://www.bioinf.org.uk/abs/simkab.html). This is a web interface to make simple queries to the Kabat sequence database.
[0298] 1.3. IMGT, the International ImMunoGeneTics Information System® (M.-P. Lefranc, 2002; http://imgt.cines.fr). IMGT is an integrated information system that specializes in antibodies, T cell receptors, and MHC molecules of all vertebrate species. It provides a common portal to standardized data that include nucleotide and protein sequences, oligonucleotide primers, gene maps, genetic polymorphisms, specificities, and two-dimensional (2D) and three-dimensional (3D) structures. IMGT includes three sequence databases (IMGT/LIGM-DB, IMGT/MHC-DB, IMGT/PRIMERDB), one genome database (IMGT/GENE-DB), one 3D structure database (IMGT/3Dstructure-DB), and a range of web resources ("IMGT Marie-Paule page") and interactive tools.
[0299] 1.4. V-BASE (I. M. Tomlinson, 2002; http://www.mrc-cpe.cam.ac.uk/vbase). V-BASE is a comprehensive directory of all human antibody germline variable region sequences compiled from more than one thousand published sequences. It includes a version of the alignment software DNAPLOT (developed by Hans-Helmar Althaus and Werner Muller) that allows the assignment of rearranged antibody V genes to their closest germline gene segments.
[0300] 1.5. Antibodies-Structure and Sequence (A. C. R. Martin, 2002; http://www.bioinf.org.uk/abs). This page summarizes useful information on antibody structure and sequence. It provides a query interface to the Kabat antibody sequence data, general information on antibodies, crystal structures, and links to other antibody-related information. It also distributes an automated summary of all antibody structures deposited in the Protein Databank (PDB). Of particular interest is a thorough description and comparison of the various numbering schemes for antibody variable regions.
[0301] 1.6. AAAAA--AHo's Amazing Atlas of Antibody Anatomy (A. Honegger, 2001; http://www.unizh.ch/~antibody). This resource includes tools for structural analysis, modeling, and engineering. It adopts a unifying scheme for comprehensive structural alignment of antibody and T-cell-receptor sequences, and includes Excel macros for antibody analysis and graphical representation.
[0302] 1.7. WAM--Web Antibody Modeling (N. Whitelegg and A. R. Rees, 2001; http://antibody.bath.ac.uk). Hosted by the Centre for Protein Analysis and Design at the University of Bath, United Kingdom. Based on the AbM package (formerly marketed by Oxford Molecular) to construct 3D models of antibody Fv sequences using a combination of established theoretical methods, this site also includes the latest antibody structural information.
[0303] 1.8. Mike's Immunoglobulin Structure/Function Page (M. R. Clark, 2001; http://www.path.cam.ac.uk/~mrc7/mikeimages.html). These pages provide educational materials on immunoglobulin structure and function, and are illustrated by many colour images, models, and animations. Additional information is available on antibody humanization and Mike Clark's Therapeutic Antibody Human Homology Project, which aims to correlate clinical efficacy and anti-immunoglobulin responses with variable region sequences of therapeutic antibodies.
[0304] 1.9. The Antibody Resource Page (The Antibody Resource Page, 2000; http://www.antibodyresource.com). This site describes itself as the "complete guide to antibody research and suppliers." Links to amino acid sequencing tools, nucleotide antibody sequencing tools, and hybridoma/cell-culture databases are provided.
[0305] 1.10. Humanization bY Design (J. Saldanha, 2000; http://people.cryst.bbk.ac.uk/~ubcg07s). This resource provides an overview on antibody humanization technology. The most useful feature is a searchable database (by sequence and text) of more than 40 published humanized antibodies including information on design issues, framework choice, framework back-mutations, and binding affinity of the humanized constructs.
[0306] See also Antibody Engineering Methods and Protocols, Ed. Benny K C Lo, Methods in Molecular Biology®, Humana Press. Also at http://www.blogsua.com/odf/antibody-engineering-methods-and-protocolsantibody-engineering-methods-and-protocols.pdf
[0307] As a source of genomic sequence variation data, the skilled person will also be aware of the following available databases and resources (including updates thereof):--
[0308] 1. HapMap (The International HapMap Consortium. 2003; http://hapmap.ncbi.nlm.nih.gov/index.html.en). The HapMap Project is an international project that aims to compare the genetic sequences of different individuals to identify chromosomal regions containing shared genetic variants. The HapMap www site provides tools to identify chromosomal regions and the variants therein, with options to drill down to population level frequency data.
[0309] 2. 1000 Genomes (The 1000 Genomes Project Consortium 2010; http://www.1000genomes.org/). This resource provides complete genomic sequence for 2500 unidentified individuals from one of 25 distinct population groups, with the aim of identifying genomic variants with a frequency of >1%. The site provides the ability to interrogate data utilizing online tools (e.g. `Variation Pattern Finder`) and to download variant data for individual population groups.
[0310] 3. Japanese SNP Database (H. Haga et al. 2002; http://snp.ims.u-tokyo.ac.jp/index.html). Based on a study identifying 190,562 human genetic variants this site catalogues genomic variants with useful features for searching and summarizing data.
[0311] It is possible to identify variants in immunoglobulin genes classed as low-frequency or rare variants that segregate with specific human ethnic populations. For the purpose of this analysis, a low-frequency immunoglobulin gene segment is classed as one with a `Minor Allele Frequency` (MAF) (cumulative frequency) of between 0.5% and 5%; rare variants are those classed as having a MAF of less than 0.5% in a particular human population.
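The classification in the preceding paragraph can be written as a short Python sketch. The thresholds (0.5% and 5%) come from the text; the label "common" for anything above 5%, and the treatment of the exact boundary values as low-frequency, are assumptions made for this example.

```python
def classify_maf(maf_percent: float) -> str:
    """Classify a variant by Minor Allele Frequency, given in percent.

    "rare" and "low-frequency" follow the definitions in the text;
    "common" is an assumed label for frequencies above 5%.
    """
    if not 0.0 <= maf_percent <= 100.0:
        raise ValueError("MAF must be a percentage between 0 and 100")
    if maf_percent < 0.5:
        return "rare"
    if maf_percent <= 5.0:
        # Boundary values 0.5% and 5% are treated as low-frequency (assumption).
        return "low-frequency"
    return "common"


if __name__ == "__main__":
    for maf in (0.2, 0.5, 3.0, 12.0):
        print(maf, classify_maf(maf))
```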
[0312] The following bioinformatics protocol is envisaged to identify human immunoglobulin gene segments for use in the present invention:
[0313] (a) Identify one or more genomic regions containing gene segments of interest (`target genomic regions`) and calculate the genomic coordinates, using coordinates that match the sequence assembly build used by either the 1000 Genomes project or International HapMap project (or another selected human gene database of choice).
[0314] (b) Identify genomic variants mapped to the genomic regions previously identified in (a). Retrieve frequencies for these variants for each super-population and preferably each sub-population where such data are available. Tools readily available on the HapMap WWW site and the VWC tools for the 1000 Genomes Project are useful for this step.
[0315] (c) Filter the list of genomic variants from target genomic regions to contain only variants classed as either `Non-synonymous` single nucleotide polymorphisms (SNPs) or genomic `insertions or deletions` (indels). Filter further to include those that are present in exonic sequences only.
[0316] (d) Correlate population frequency data for each of the identified variants for each of the super populations (for example `European Ancestry`, `East Asian ancestry`, `West African ancestry`, `Americas`, and `South Asian ancestry`) to identify those variants that segregate with fewer than two super-populations. Further correlate all identified variants with each of the sub-populations (for example, `European ancestry` super-population might be subdivided into groups such as `CEU--Utah residents with Northern or Western European ancestry`, `TSI Toscani in Italia` and `British from England and Scotland`) and produce a second score for rarity of variants within a super-population.
[0317] (e) Collect one or more gene segments that show segregation to specific sub-populations for construction of synthetic loci according to the invention.
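Steps (c) to (e) of the protocol above can be sketched in Python as follows. This is a minimal illustration operating on in-memory records, not a definitive implementation. The `Variant` record, its field names, the consequence labels, the super-population codes and all example values are hypothetical. "Segregates with fewer than two super-populations" is interpreted here as being observed (frequency above zero) in exactly one super-population, which is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Variant:
    variant_id: str                      # hypothetical identifier
    gene_segment: str                    # e.g. "JH6"
    consequence: str                     # e.g. "non_synonymous", "indel"
    exonic: bool                         # True if within exonic sequence
    freq_by_super_pop: Dict[str, float]  # frequency in percent per super-population


KEPT_CONSEQUENCES = {"non_synonymous", "indel"}


def filter_coding_variants(variants: List[Variant]) -> List[Variant]:
    """Step (c): keep exonic non-synonymous SNPs and indels only."""
    return [v for v in variants
            if v.consequence in KEPT_CONSEQUENCES and v.exonic]


def observed_super_pops(variant: Variant) -> List[str]:
    """Super-populations in which the variant has a non-zero frequency."""
    return [pop for pop, freq in variant.freq_by_super_pop.items() if freq > 0.0]


def select_population_specific(variants: List[Variant]) -> List[Tuple[Variant, str]]:
    """Steps (d) and (e): collect variants seen in exactly one super-population."""
    selected = []
    for v in filter_coding_variants(variants):
        pops = observed_super_pops(v)
        if len(pops) == 1:
            selected.append((v, pops[0]))
    return selected


if __name__ == "__main__":
    # Hypothetical records, assumed purely for this example.
    example = [
        Variant("var_A", "JH6", "non_synonymous", True,
                {"EUR": 0.0, "EAS": 1.2, "AFR": 0.0}),
        Variant("var_B", "JH6", "synonymous", True,
                {"EUR": 0.0, "EAS": 2.0, "AFR": 0.0}),
        Variant("var_C", "VH1-69", "indel", True,
                {"EUR": 3.0, "EAS": 0.4, "AFR": 0.0}),
    ]
    for variant, pop in select_population_specific(example):
        print(variant.variant_id, variant.gene_segment, pop)
```

The same per-population test could be repeated on sub-population frequencies to produce the second rarity score mentioned in step (d).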
[0318] In one embodiment throughout the present text, "germline" refers to the canonical germline gene segment sequence.
[0319] By detailed analysis of the 1000 Genomes database, the inventors have devised a collection of candidate polymorphic antibody gene segment variants, e.g., human variant JH gene segments (e.g., see Example 4), that can be built into the design of transgenic heavy chain loci in mice for expressing increasingly diverse and new, synthetic repertoires of human variable regions. To this end, the invention provides the following embodiments.
The Present Invention Provides in a Fourth Configuration--
Selection of Human JH6*02 Variant
Transgenic IgH Loci, Non-Human Vertebrates, Cells & Antibodies Based on Human JH6*02
[0320] As explained above, in designing transgenic Ig heavy chain loci the present inventors have considered the huge amount of data available from the 1000 Genomes project (see www.1000genomes.org) that analyses gene distributions amongst many human populations, and in particular data on Ig gene segments. The inventors were also aware of human gene segments disclosed in the IMGT database (see www.imgt.org) and in Ensembl (see www.ensembl.org). The inventors needed to make choices about which human gene segments to include amongst the large number of human gene segments presented in these databases and the other sources of human Ig gene segment information known in the art, including those other databases disclosed herein. When choosing human JH gene segments, the inventors were aware that human JH6 encodes a relatively long amino acid sequence, and thus the inventors thought it desirable to include this for increasing the chances of producing IgH chains with relatively long HCDR3 regions. Antibodies with long HCDR3 (at least 20 amino acids according to IMGT nomenclature) have been shown to neutralize a variety of pathogens effectively including HIV, Influenza virus, malaria and African trypanosomes. Reference is also made to naturally-occurring Camelid (e.g., llama or camel) heavy chain-only antibodies which bear long HCDR3s for reaching relatively inaccessible epitopes (see, e.g., EP0937140). Long HCDR3s can form unique stable subdomains with extended loop structure that towers above the antibody surface to confer fine specificity. In some cases, the long HCDR3 itself is sufficient for epitope binding and neutralization (Liu, L et al; Journal of Virology. 2011. 85: 8467-8476, incorporated herein by reference). The unique structure of the long HCDR3 allows it to bind to cognate epitopes within inaccessible structure or extensive glycosylation on a pathogen surface.
In human peripheral blood, around 3.5% of naive B-cell antibodies and 1.9% of memory B-cell IgG antibodies contain HCDR3s with lengths of more than 24 amino acids (PLoS One. 2012; 7(5):e36750. Epub 2012 May 9; "Human peripheral blood antibodies with long HCDR3s are established primarily at original recombination using a limited subset of germline genes"; Briney B S et al, incorporated herein by reference) (FIG. 1). The usage analysis indicates that these antibodies preferentially use human JH6 with human D2-2, D3-3 or D2-15 (Briney, B S et al, FIGS. 2-5). See also PLoS One. 2011 Mar. 30; 6(3):e16857; "Comparison of antibody repertoires produced by HIV-1 infection, other chronic and acute infections, and systemic autoimmune disease"; Breden F et al, incorporated herein by reference. Around 20% of all HCDR3s of antibodies use JH6. However, in those antibodies with an HCDR3 of more than 24 amino acids, 70% use JH6 (Briney, B S et al, FIG. 2).
[0321] There is a need in the art for genetically modified non-human vertebrates and cells that can make antibodies and heavy chains that have long human HCDR3s, as well as antibodies, chains and VH domains that can be selected from such vertebrates and cells wherein these can address target epitopes better accessed by long HCDR3s.
[0322] The inventors, therefore, chose in this configuration of the invention to include a human JH6 gene segment as a mandatory human gene segment in their IgH locus design. Several different naturally-occurring human JH6 variants are known (e.g., JH6*01 to *04 as well as others; IMGT nomenclature). The inventors considered this when deciding upon which human JH6 variant should be included in the transgenic IgH locus design. An alignment of some human JH6 variants is shown in FIG. 7 (from www.imgt.org; dashes indicate identical nucleotides; nucleotide changes versus the *01 variant are shown by underlined nucleotides and corresponding amino acid changes are shown by underlined amino acids; Genbank accession numbers (release 185.0) are shown prefixed by J, X, M or A). The inventors used sequencing of human genomic DNA samples, inspection of public IgH DNA databases as well as informed choices on the basis of variant sequences as means to arrive at a rational choice of which JH6 variant to use.
[0323] The 1000 Genomes database uses human JH6*03 as the reference sequence, which would be a possible choice for the skilled person wishing to construct a transgenic IgH locus. The inventors noticed (e.g., FIG. 7 herein) that position 6 in JH6*03 is a tyrosine (Y) encoded by a TAC codon, whereas some other naturally-occurring human variants have a glycine (G) encoded by a GGT codon (the glycine being present as a YYG motif, forming part of a larger YYGXDX motif). To understand the potential significance of this, the inventors carried out analysis of JH sequences from other vertebrate species. The inventors surprisingly noticed that YYG and YYGXDX motifs are conserved across many vertebrate species (see FIGS. 7 & 8). This suggested to the inventors, therefore, that preservation of this motif might be desirable, which could guide the choice of JH6 variant for use in the present invention.
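By way of illustration only, the motif check described in the preceding paragraph can be sketched in Python. The two amino acid fragments below are hypothetical strings built from the description above (glycine versus tyrosine at position 6, each followed by MDV); they are not copied from FIG. 7, and the function name is an assumption made for this sketch.

```python
import re

# YYGXDX written as a regular expression; X (any residue) becomes "."
YYGXDX = re.compile(r"YYG.D.")

def has_conserved_motif(protein):
    """Return True if the amino acid sequence contains a YYGXDX motif."""
    return YYGXDX.search(protein.upper()) is not None

# Hypothetical fragments based on the description in the text:
# glycine at position 6 versus tyrosine at position 6.
glycine_at_6 = "YYYYYGMDV"
tyrosine_at_6 = "YYYYYYMDV"

for label, fragment in [("G at position 6", glycine_at_6),
                        ("Y at position 6", tyrosine_at_6)]:
    print(label, has_conserved_motif(fragment))
```

Under these assumptions, only the fragment with glycine at position 6 carries the motif, which is the distinction the inventors drew between JH6*03 and the other variants.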
[0324] Another pointer arose when the inventors considered the TAC codon versus the GGT codon encoding Y or G respectively. The inventors considered the impact of these nucleotide sequences on the action of activation-induced cytidine deaminase (AID). The inventors knew that AID is believed to initiate Ig somatic hypermutation (SHM) in a multi-step mechanism and they addressed this activity when rationally designing the locus. AID catalyses the deamination of C to U in DNA, generating mutations at C bases. Cytidines located within hotspot motifs are preferentially deaminated. Certain motifs are hotspots for AID activity (DGYW, WRC, WRCY, WRCH, RGYW, AGY, TAC, WGCW, wherein W=A or T, Y=C or T, D=A, G or T, H=A or C or T, and R=A or G). The presence of a TAC codon encoding Y at position 6 in JH6*03 creates AID mutation hotspots (the cytidine being the substrate of AID), these hotspots being the underlined motifs in the previous sentence. The inventors considered the impact of this and in doing so they considered possible mutants created by AID activity at the cytidine. Reference is made to FIG. 9. The inventors noticed that a mutation at the third base of the TAC codon would yield 3 possible outcomes: Y, stop or stop. Thus, out of the three stop codons possible in the genetic code (the other being encoded by TGA--see FIG. 9), two of them would be provided by mutation of the cytidine in the TAC codon encoding position 6 in JH6*03. The inventors, therefore, considered that this might increase the chances of non-productive IgH variable region production in transgenic loci based on JH6*03. Moreover, the inventors noticed that provision of a GGT codon instead (as per the other human JH6 variants) seemed preferable since mutation of the third base would never yield a stop codon (see FIG. 9), and furthermore would retain coding, and thus conservation, of glycine at position 6, which the inventors also noticed is in the YYG and YYGXDX motifs conserved across species.
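The codon reasoning in the preceding paragraph can be reproduced with a short Python sketch. It uses the standard genetic code and enumerates the three possible substitutions at the third base of a codon; the function and variable names are assumptions made for illustration and do not form part of the disclosed method.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ("*" = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def third_base_mutants(codon):
    """Map each third-base substitution of `codon` to the residue it encodes."""
    codon = codon.upper()
    prefix, original = codon[:2], codon[2]
    return {prefix + b: CODON_TABLE[prefix + b] for b in BASES if b != original}

for codon in ("TAC", "GGT"):
    print(codon, CODON_TABLE[codon], third_base_mutants(codon))
```

Run as written, the sketch shows that two of the three substitutions at the third base of TAC give a stop codon (TAA and TAG), whereas every substitution at the third base of GGT still encodes glycine.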
[0325] Having decided against using JH6*03, the inventors needed to make a choice from other possible human variants. The MDV motif is at the C-terminus of HCDR3 based on human JH6, the adjacent framework 4 (FW4) starting with the WGQ motif (with reference to the sequence shown encoded by JH6*01; FIG. 7). In making their choices for locus design, the inventors wished to maximise conservation of this HCDR3/FW4 junction in product IgH chains and antibodies including these. The inventors believed this to be desirable for heavy chain variable domain functionality and conformation. The inventors thought that this might in some cases be desirable to minimise immunogenicity (suitable for human pharmaceutical use). Consistent with these considerations, the inventors wanted to make a choice that would minimise mutation around the HCDR3/FW4 junction as a result of SHM in vivo to conserve junction configuration. See Rogozin & Diaz; "Cutting Edge: DGYW/WRCH Is a Better Predictor of Mutability at G:C Bases in Ig Hypermutation Than the Widely Accepted RGYW/WRCY Motif and Probably Reflects a Two-Step Activation-Induced Cytidine Deaminase-Triggered Process"; Journal of Immunology; Mar. 15, 2004 vol. 172 no. 6 3382-3384. An example of a DGYW motif is GGCA. The inventors had this in mind when analysing the variant sequences.
[0326] With these considerations in mind, the inventors decided specifically to use human JH6*02 as the mandatory human JH6 for their IgH locus design. JH6*01 was rejected as the mandatory JH6 gene segment since the nucleotide sequence GGG CAA (encoding G and Q) contains a GGCA motif which is an AID recognition hotspot. The inventors realised that JH6*04 also contains such a motif due to the presence of the sequence GGC AAA encoding G and K (positions 11 and 12 respectively). The inventors also realised that the *02 variant has a C instead of a G that is in the *01 variant, the C desirably being a synonymous change (i.e., not changing the encoded amino acid sequence around the CDR3/FW4 junction) and also this does not provide a GGCA AID hotspot motif. The inventors, therefore, decided that the mandatory JH6 should have this C base and this too pointed them to using the human JH6*02 variant.
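The hotspot comparison in the preceding paragraph can be sketched as a simple motif scan in Python. The six-nucleotide fragments for the *01 and *04 variants are quoted from the text; the *02 fragment is an assumption derived from the stated G-to-C change. The scan covers only the strand given, and the helper name is hypothetical.

```python
import re

# Degenerate nucleotide codes as defined in the text (W, Y, D, H, R).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "W": "[AT]", "Y": "[CT]", "R": "[AG]",
         "D": "[AGT]", "H": "[ACT]"}

def motif_positions(motif, sequence):
    """Return 0-based start positions of `motif` in `sequence` (given strand only)."""
    pattern = "".join(IUPAC[ch] for ch in motif.upper())
    # A lookahead is used so that overlapping occurrences are all reported.
    return [m.start() for m in re.finditer(f"(?=({pattern}))", sequence.upper())]

fragments = {
    "JH6*01": "GGGCAA",  # GGG CAA, quoted in the text
    "JH6*04": "GGCAAA",  # GGC AAA, quoted in the text
    "JH6*02": "GGCCAA",  # assumed: the *01 fragment with C in place of G
}

for variant, fragment in fragments.items():
    print(variant, fragment, motif_positions("DGYW", fragment))
```

On these fragments the DGYW pattern (of which GGCA is an example) is found in the *01 and *04 fragments but not in the *02 fragment, in line with the reasoning given above.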
[0327] In one example of any configuration of the invention herein, the only JH6 species included in the locus or genome is human JH6*02.
[0328] The inventors obtained 9 anonymised DNA samples from cheek swabs of 9 consenting human adults. Sequencing was performed on IgH locus DNA to confirm natural JH6 variant usage. It was found that the genome of all 9 humans contained a JH6*02 variant gene segment. In 7 out of the 9 humans, the genome was homozygous for JH6*02 (i.e., each chromosome 14 had JH6*02 as its JH6 gene segment in the IgH locus). The inventors also inspected the publicly-available sequence information from the genomes of well-known scientists Craig Venter and Jim Watson. Both of these genomes contain JH6*02 too. This indicated to the inventors that this variant is common in humans.
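As a worked illustration of the figures in the preceding paragraph, the following Python sketch counts JH6*02 alleles in the 9 samples. It assumes two copies of chromosome 14 per donor and that the 2 donors who were not homozygous each carried exactly one JH6*02 allele; the function name is hypothetical.

```python
def allele_summary(carriers, homozygous):
    """Count copies of an allele when every sampled donor carries at least one.

    Assumes diploid donors: homozygous donors contribute two copies and the
    remaining carriers contribute one copy each.
    """
    heterozygous = carriers - homozygous
    copies = 2 * homozygous + heterozygous
    chromosomes = 2 * carriers
    return copies, chromosomes, copies / chromosomes

copies, chromosomes, frequency = allele_summary(carriers=9, homozygous=7)
print(f"{copies} of {chromosomes} chromosomes ({frequency:.1%})")
```

Under these assumptions, 16 of the 18 sampled chromosomes carry JH6*02, which is consistent with the statement that the variant is common in humans, although the sample is small.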
[0329] So, the inventors made a choice of human JH6*02 on the basis of
[0330] (i) Containing the YYG and YYGXDX motifs that are conserved across several vertebrate species;
[0331] (ii) Provision of one less TAC codon (an AID hotspot that risks stop codons) and a choice instead of a codon that preserves the YYG and YYGXDX motifs;
[0332] (iii) Avoidance of a GGCA AID hotspot in the region of the HCDR3/FW4 junction; and
[0333] (iv) Common occurrence (and thus conservation and acceptability) in humans of the JH6*02 variant.
[0334] This rationale was tested by the inventors in laboratory examples, in order to see if human JH6*02 could desirably participate in antibody gene segment recombination and heavy chain production in a foreign (non-human vertebrate) setting, and moreover to assess if long HCDR3s based on human JH6*02 could be produced in vivo (in naive and immunised settings) in such non-human systems. It was noted that in some non-human settings, such as a mouse, the YYG and YYGXDX motifs are not conserved, and thus the inventors decided that it was important to test whether or not JH6*02 (having the YYG and YYGXDX motifs) could function properly in such a foreign setting to participate in VDJ recombination and selection against antigen.
[0335] Thus, as explained further in the examples, the inventors constructed transgenic JH6*02-containing IgH loci in ES cells, generated transgenic non-human vertebrates from the ES cells (both naive and immunised with a range of different target antigen types), isolated antibodies and heavy chain sequences based on JH6*02 as well as B-cells expressing these and made hybridomas expressing antigen-specific antibodies that are based on the chosen JH6*02 variant. The inventors found that the JH6*02 variant was extensively used and could contribute to the production of HCDR3 of at least 20 amino acids in many different heavy chains (including antigen-specific heavy chains). The chosen variant was preferentially used over other JH gene segments in all settings (naive, immunised and antigen-specific) for the production of HCDR3 of at least 20 amino acids.
[0336] Thus, the present invention provides an IgH locus including human JH6*02 (IMGT nomenclature) as a mandatory JH gene segment. In one embodiment, the locus comprises non-human vertebrate (e.g., mouse or rat) constant region gene segments downstream (i.e., 3' of) the human JH6*02; and one or more VH gene segments (e.g., a plurality of human VH gene segments) and one or more D gene segments (e.g., a plurality of human D gene segments) upstream of (i.e., 5' of) the human JH6*02. For example, the locus is comprised by a vector (e.g., a DNA vector, e.g., a yeast artificial chromosome (YAC), BAC or PAC). Such a vector (e.g., YAC) can be introduced into a non-human vertebrate (e.g., mouse or rat) cell using standard techniques (e.g., pronuclear injection) so that the locus is integrated into the cell genome for expression of IgH chains comprising at least one chain whose variable domain is a product of the recombination of human JH6*02 with a VH and a D gene segment.
[0337] In another example, the locus (e.g., with a completely human, rat or mouse constant region, or a human/mouse chimaeric constant region) can be provided in the genome of a non-human vertebrate (e.g., mouse or rat) cell. For example, the cell is an ES cell or an antibody-producing cell (e.g., an isolated B-cell, an iPS cell or a hybridoma).
[0338] In another example, the invention provides a non-human vertebrate (e.g., a mouse or a rat) comprising an IgH locus of the invention which comprises a human JH6*02 gene segment, wherein the locus can express an IgH chain whose variable domain is a product of the recombination of human JH6*02 with a VH and a D gene segment. As shown in the examples, the inventors have successfully produced such mice which produce such IgH chains with VH domains based on human JH6*02. The inventors isolated and sequenced IgH chains from the mice before (naive) and after (immunised) exposure to a range of target antigens and confirmed by comparison to IMGT IgH gene segment sequences that the isolated chains (and antibodies containing these) were produced based on JH6*02. Such chains were found in naive mice, as well as in antigen-specific antibodies from immunised mice. B-cells were isolated from immunised mice, wherein the B-cells express antibodies based on JH6*02, and hybridomas were generated from the B-cells, the hybridomas expressing antigen-specific antibodies based on JH6*02. The inventors, therefore, provided the locus, vertebrate, cell and hybridoma of the invention based on the use of human JH6*02 and showed that antibodies based on JH6*02 and B-cells expressing these can be successfully produced and isolated following immunisation of the vertebrates, corresponding hybridomas being a good source of antibodies whose VH domains are based on JH6*02, e.g., for administration to a patient, e.g., for human medicine. Furthermore, it was found possible to produce and isolate antigen-specific antibodies whose VH domains are based on JH6*02 and which had a relatively long HCDR3 (e.g., 20 amino acids).
[0339] Thus, the present invention provides embodiments as in the following clauses:--
[0340] 1. A non-human vertebrate (optionally a mouse or a rat) or vertebrate cell whose genome comprises an immunoglobulin heavy chain locus comprising human gene segment JH6*02, one or more VH gene segments and one or more D gene segments upstream of a constant region; wherein the gene segments in the heavy chain locus are operably linked to the constant region thereof so that the mouse is capable of producing an antibody heavy chain produced by recombination of the human JH6*02 with a D segment and a VH segment.
[0341] In another example, the invention provides
[0342] A non-human vertebrate (optionally a mouse or a rat) or vertebrate cell whose genome comprises an immunoglobulin heavy chain locus comprising one, more or all of human IGHV gene segments selected from V3-21, V3-13, V3-7, V6-1, V1-8, V1-2, V7-4-1, V1-3, V1-18, V4-4, V3-9, V3-23, V3-11 and V3-20 (e.g., one, more or all of V3-21*03, V3-13*01, V3-7*01, V6-1*01, V1-8*01, V1-2*02, V7-4-1*01, V1-3*01, V1-18*01, V4-4*01, V3-9*01 and V3-23*04). These segments were found in naive repertoires to be productive to produce HCDR3s of at least 20 amino acids in length. In an embodiment, the locus comprises a human JH6, e.g., JH6*02.
[0343] The invention also provides a HCDR3, VH domain, antibody heavy chain or antibody having a HCDR3 size of at least 20 amino acids. Optionally, the HCDR3 or VH domain (or VH domain of the heavy chain or antibody) comprises mouse AID-pattern somatic hypermutations and/or mouse TdT-pattern mutations. This can be provided, for example, wherein the VH domain is produced in a mouse comprising mouse AID and/or mouse TdT (e.g., endogenous AID or TdT). See also Annu. Rev. Biochem. 2007. 76:1-22; Javier M. Di Noia and Michael S. Neuberger, "Molecular Mechanisms of Antibody Somatic Hypermutation" (in particular FIG. 1 and associated discussion on AID hotspots in mouse); and Curr Opin Immunol. 1995 April; 7(2):248-54, "Somatic hypermutation", Neuberger M S and Milstein C (in particular, discussion on hotspots in mouse), the disclosures of which are incorporated herein by reference.
[0344] These segments were found in naive repertoires to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0345] In an example, the vertebrate is naive. In another embodiment, the vertebrate instead is immunised with a target antigen.
[0346] In an example, the vertebrate or cell mentioned below is capable of so producing an antibody heavy chain upon immunisation with a target antigen. In an example, the vertebrate is an immunised vertebrate that produces antibody heavy chains specific for a target antigen and wherein the variable domains of the heavy chains are the product of recombination between a VH, D and JH6*02. For example, the D is selected from human D3-3, D2-15, D3-9; D4-17; D3-10; D2-2; D5-24; D6-19; D3-22; D6-13; D5-12; D1-26; D1-20; D5-18; D3-16; D2-21; D1-14; D7-27; D1-1; D6-25; D2-14 and D4-23 (e.g., selected from D3-9*01; D4-17*01; D3-10*01; D2-2*02; D5-24*01; D6-19*01; D3-22*01; D6-13*01; D5-12*01; D1-26*01; D1-20*01; D5-18*01; D3-16*02; D2-21*02; D1-14*01; D7-27*02; D1-1*01; D6-25*01; D2-15*01; and D4-23*01). For example, the D is human D3-9 or D3-10. In an example, the HCDR3 length is at least 20 amino acids (e.g., 20, 21, 23 or 24).
[0347] In an example of the vertebrate or cell, the genome comprises additional human JH gene segments (e.g., JH2, 3, 4 and 5 gene segments).
[0348] In an example of the vertebrate or cell, the genome comprises an immunoglobulin light chain locus comprising one or more human V gene segments and one or more human J gene segments upstream of a constant region (e.g., a human or a mouse lambda or kappa constant region).
[0349] For rearrangement and expression of heavy chains, the locus comprises control elements, such as an Eμ and Sμ between the J gene segment(s) and the constant region as is known by the skilled person. In one example, a mouse Eμ and Sμ are included in the heavy chain locus between the JH6*02 and the constant region (i.e., in 5' to 3' order the locus comprises the JH6*02, Eμ and Sμ and constant region). In an example, the Eμ and Sμ are Eμ and Sμ of a mouse 129-derived genome (e.g., a 129Sv-derived genome, e.g., 129Sv/EV (such as 129S7Sv/Ev (such as from AB2.1 or AB2.2 cells obtainable from Baylor College of Medicine, Texas, USA) or 129S6Sv/Ev))); in another example, the Eμ and Sμ are Eμ and Sμ of a mouse C57BL/6-derived genome. In this respect, the locus can be constructed in the IgH locus of the genome of a cell selected from AB2.1, AB2.2, VGF1, CJ7 and FH14. VGF1 cells were established and described in Auerbach W, Dunmore J H, Fairchild-Huntress V, et al; Establishment and chimera analysis of 129/SvEv- and C57BL/6-derived mouse embryonic stem cell lines. Biotechniques 2000; 29:1024-8, 30, 32, incorporated herein by reference.
[0351] Additionally or alternatively, the constant region (or at least a Cμ, or Cμ and gamma constant regions thereof) is a constant region (or Cμ, or Cμ and gamma constant regions thereof) of a genome described in the paragraph immediately above.
[0352] A suitable source of JH6*02 and other human DNA sequences will be readily apparent to the skilled person. For example, it is possible to collect a DNA sample from a consenting human donor (e.g., a cheek swab sample as per the Example herein) from which can be obtained suitable DNA sequences for use in constructing a locus of the invention. Other sources of human DNA are commercially available, as will be known to the skilled person. Alternatively, the skilled person is able to construct gene segment sequence by referring to one or more databases of human Ig gene segment sequences disclosed herein.
[0353] 2. The vertebrate of clause 1, wherein the vertebrate has been immunised with a target antigen and wherein the variable domain of the heavy chain is the product of recombination between a VH, D and JH6*02 and wherein the HCDR3 length is at least 20 amino acids (e.g., 20, 21, 23 or 24).
[0354] Optionally, the immunised vertebrate produces an antibody heavy chain specific for a target antigen and wherein the variable domain of the heavy chain is the product of recombination between a VH, D and JH6*02 and wherein the HCDR3 length is at least 20 amino acids (e.g., 20, 21, 23 or 24).
[0355] 3. A non-human vertebrate cell (optionally a mouse cell or a rat cell) whose genome comprises an immunoglobulin heavy chain locus comprising human gene segment JH6*02, one or more VH gene segments and one or more D gene segments upstream of a constant region; wherein the gene segments in the heavy chain locus are operably linked to the constant region thereof for producing (e.g., in a subsequent progeny cell) an antibody heavy chain produced by recombination of the human JH6*02 with a D segment and a VH segment.
[0356] 4. The cell of clause 3, which is an ES cell capable of differentiation into a progeny antibody-producing cell that expresses said heavy chain.
[0357] 5. The vertebrate or cell of any preceding clause, wherein the heavy chain locus comprises a human JH6*02 recombination signal sequence (RSS) operably connected 5' to the JH6*02 gene segment.
[0358] For example, the native RSS-JH6*02 sequence can be used to advantageously maintain the natural pairing between the RSS and this JH gene segment. In this respect, the following sequence is used:--
[0359] ggtttttgtggggtgaggatggacattctgccattgtgattactactactactacggtatggacgtctggggccaagggaccacggtcaccgtctcctcag (SEQ ID NO: 238)
[0360] RSSs have a common architecture: a 9mer (e.g., ggtttttgt, the first underlined sequence above) followed by a 22 bp spacer and then a 7mer (e.g., cattgtg, the second underlined sequence above). Spacers are normally 23 bp+/-1, while the 9mer and 7mer are more conserved.
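The 9mer, spacer and 7mer architecture described above can be illustrated with a short Python sketch that slices the listed sequence at the stated lengths. The sequence fragments are copied from the listing above and rejoined without the hyphen and spaces, which are assumed to be typesetting residue; the function name and the default lengths are assumptions made for this sketch.

```python
# Fragments copied from the sequence listing above and rejoined
# (the hyphen and spaces are assumed to be typesetting residue).
RSS_JH6 = "".join([
    "ggtttttgtggggtgaggatggacattctgccattgtgattactactactactacggtatggacgt",
    "ctggggccaagggaccac",
    "ggtcaccg",
    "tctcctcag",
])

def split_rss(seq, nonamer_len=9, spacer_len=22, heptamer_len=7):
    """Slice a 9mer-spacer-7mer-coding sequence at the given lengths."""
    a = nonamer_len
    b = a + spacer_len
    c = b + heptamer_len
    return {"nonamer": seq[:a], "spacer": seq[a:b],
            "heptamer": seq[b:c], "coding": seq[c:]}

parts = split_rss(RSS_JH6)
for name, value in parts.items():
    print(f"{name:9s} {len(value):3d} {value}")
```

With a 9mer at the 5' end and a 22 bp spacer, the slicing places the 7mer immediately upstream of the coding sequence, which begins attactac.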
[0361] 6. The vertebrate or cell of clause 5, wherein the RSS is SEQ ID NO: 238 or a sequence having an identical 9mer and 7mer sequence flanking a sequence that is at least 70% identical to the 22mer sequence of SEQ ID NO: 238.
[0362] 7. The vertebrate or cell of clause 6, wherein the RSS and JH6*02 are provided as SEQ ID NO: 237.
[0363] 8. The vertebrate or cell of any preceding clause, wherein the JH6*02 is the only JH6-type gene segment in the genome.
[0364] 9. The vertebrate or cell of any preceding clause, wherein the JH6*02 is the closest JH gene segment to the constant region in the locus.
[0365] 10. The vertebrate or cell of any preceding clause, wherein the locus comprises one, more or all human D gene segments D3-9; D4-17; D3-10; D2-2; D5-24; D6-19; D3-22; D6-13; D5-12; D1-26; D1-20; D5-18; D3-16; D2-21; D1-14; D7-27; D1-1; D6-25; D2-14; and D4-23.
[0366] For example, the locus comprises one, more or all of human D gene segments D3-9*01; D4-17*01; D3-10*01; D2-2*02; D5-24*01; D6-19*01; D3-22*01; D6-13*01; D5-12*01; D1-26*01; D1-20*01; D5-18*01; D3-16*02; D2-21*02; D1-14*01; D7-27*02; D1-1*01; D6-25*01; D2-15*01; and D4-23*01.
[0367] 11. The vertebrate or cell of clause 10, wherein the locus comprises one, more or all human D gene segments D3-9, D3-10, D6-19, D4-17, D6-13, D3-22, D2-2, D2-25 and D3-3.
[0368] These D segments were found to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0369] In an example, the locus comprises one, more or all human D gene segments D3-9, D3-10, D6-19, D4-17, D6-13 and D3-22 (for example one, more or all of D3-9*01, D3-10*01, D6-19*01, D4-17*01, D6-13*01 and D3-22*01). These D segments were found in naive repertoires to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0370] In an example, the locus comprises one, more or all human D gene segments D3-10, D6-19 and D1-26 (for example, one, more or all of D3-10*01, D6-19*01 and D1-26*01). These D segments were found in immunised repertoires to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0371] In an example, the locus comprises one, more or all human D gene segments D3-9 and D3-10 (for example, one, more or all of D3-9*01 and D3-10*01). These D segments were found in antigen-specific repertoires to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0372] 12. The vertebrate or cell of any preceding clause, wherein the locus comprises a plurality of human D gene segments and the JH6*02 is in human germline configuration with respect to the 3'-most human D gene segment (or all of the human D segments comprised by the locus).
[0373] In an example, the 3'-most D gene segment is D7-27. In an example, the locus comprises all of human D gene segments from D1-1 to D7-27 as present in a germline human IgH locus (e.g., as shown in the IMGT database).
[0374] Alternatively or additionally, the JH6*02 is in human germline configuration with respect to one, more or all of the Eμ, Sμ and constant region (e.g., Cμ).
[0375] 13. The vertebrate or cell of any preceding clause, wherein the locus comprises one, more or all of IGHV gene segments selected from V3-21, V3-13, V3-7, V6-1, V1-8, V1-2, V7-4-1, V1-3, V1-18, V4-4, V3-9, V3-23, V3-11 and V3-20.
[0376] In an example, the locus comprises one, more or all human IGHV gene segments V3-21, V3-13, V3-7, V6-1, V1-8, V1-2, V7-4-1, V1-3, V1-18, V4-4, V3-9, V3-23 (for example, one, more or all of V3-21*03, V3-13*01, V3-7*01, V6-1*01, V1-8*01, V1-2*02, V7-4-1*01, V1-3*01, V1-18*01, V4-4*01, V3-9*01 and V3-23*04). These segments were found in naive repertoires to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0377] In an example, the locus comprises one, more or all human IGHV gene segments V3-7, V3-11 and V4-4 (for example, one, more or all of V3-7*01, V3-11*01 and V4-4*02). These segments were found in immunised repertoires to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0378] In an example, the locus comprises one, more or all human IGHV gene segments V4-4, V1-8, V3-9, V3-11 and V3-20 (for example, one, more or all of V4-4*02, V1-8*01, V3-9*01, V3-11*01 and V3-20 (e.g., *d01)). These segments were found in antigen-specific repertoires to be productive in recombination with human JH6*02 to produce HCDR3s of at least 20 amino acids in length.
[0379] 14. The vertebrate or cell of any preceding clause, wherein the locus comprises one, more or all of human D3-9*01, D3-10*01, D6-19*01, D6-13*01, D1-26*01, IGHV1-8*01, IGHV4-61*01, IGHV6-1*01, IGHV4-4*02, IGHV1-3*01, IGHV3-66*03, IGHV3-7*01 and IGHV3-9*01.
[0380] These are gene segments that very frequently combine with JH6*02 to produce productive heavy chains and antibodies.
[0381] For example, the locus comprises one, more or all of human IGHV1-8*01, D3-9*01 and D3-10*01. These gene segments were productive with JH6*02 to produce HCDR3s of at least 20 amino acids in more than 10 antibodies.
[0382] 15. An antibody-producing cell (e.g., a B-cell) that is a progeny of the cell of any one of clauses 3 to 14, wherein the antibody-producing cell comprises a heavy chain locus comprising a rearranged variable region produced by recombination of human JH6*02 with a D segment and a VH segment (e.g., JH6*02 with human VH3-11 (e.g., VH3-11*01) and D3-9; VH3-20 (e.g., VH3-20*01) and D3-10; VH4-4 (e.g., VH4-4*02) and D3-10; VH3-9 (e.g., VH3-9*01) and D3-10; or VH1-8 (e.g., VH1-8*01) and D3-10).
[0383] Such a variable region would be the product of in vivo somatic hypermutation in a non-human vertebrate or cell of the invention.
[0384] 16. The cell of clause 15, which is a B-cell or hybridoma that expresses a target antigen-specific antibody comprising a heavy chain that comprises a rearranged variable region produced by recombination of human JH6*02 with a D segment and a VH segment (e.g., JH6*02 with human VH3-11 (e.g., VH3-11*01) and D3-9; VH3-20 (e.g., VH3-20*01) and D3-10; VH4-4 (e.g., VH4-4*02) and D3-10; VH3-9 (e.g., VH3-9*01) and D3-10; or VH1-8 (e.g., VH1-8*01) and D3-10).
[0385] Such a variable region would be the product of in vivo somatic hypermutation in a non-human vertebrate or cell of the invention.
[0386] 17. The vertebrate or cell of any preceding clause, wherein the antibody heavy chain specifically binds a target antigen.
[0387] 18. The vertebrate or cell of any preceding clause, wherein the antibody heavy chain has a HCDR3 length of at least 20 amino acids.
[0388] Optionally, the HCDR3 length is at least 21, 22, 23, 24, 25, 26, 27, 28, 29 or 30 amino acids. Additionally, in one example the length is no more than 35, 34, 33, 32 or 31 amino acids. For example, the HCDR3 length is 20, 21, 22, 23 or 24 amino acids.
[0389] 19. The vertebrate or cell of any preceding clause, wherein the antibody heavy chain is a product of the recombination of JH6*02 with a human VH gene segment recited in clause 13 or 14 and/or a D gene segment recited in clause 10, 11 or 14.
[0390] 20. The vertebrate or cell of any preceding clause, wherein all endogenous non-human vertebrate heavy chain variable region gene segments have been inactivated in the genome (e.g., by gene segment deletion or inversion).
[0391] 21. The vertebrate or cell of any preceding clause, wherein the genome is homozygous for said heavy chain locus.
[0392] 22. A heavy chain (e.g., comprised by an antibody) isolated from a vertebrate of any one of clauses 1, 2, 5 to 14 and 17 to 21 wherein the heavy chain comprises a HCDR3 of at least 20 amino acids.
[0393] 23. The heavy chain of clause 22, wherein the HCDR3 is the product of recombination of human JH6*02 with a human VH gene segment recited in clause 13 or 14 and/or a D gene segment recited in clause 10, 11 or 14.
[0394] In an example, the heavy chain is chimaeric where the C region is non-human. In an example, the heavy chain is human where the C region is human.
[0395] 24. A heavy chain (e.g., comprised by an antibody) whose VH variable domain is identical to the VH variable domain of the heavy chain of clause 22 or 23, and which comprises a human constant region or a human-mouse chimaeric constant region (e.g., CH1 is human and the other constant domains are mouse).
[0396] 25. The heavy chain of clause 22, 23 or 24, whose VH variable domain is specific for a target antigen.
[0397] 26. A method for producing a heavy chain, VH domain or an antibody specific to a target antigen, the method comprising immunizing a non-human vertebrate according to any one of clauses 1, 2, 5 to 14 and 17 to 21 with the antigen and isolating the heavy chain, VH domain or an antibody specific to a target antigen or a cell producing the heavy chain, VH domain or an antibody, wherein the heavy chain, VH domain or an antibody comprises a HCDR3 that is derived from the recombination of human JH6*02 with a VH gene segment and a D gene segment.
[0398] 27. A method for producing a human heavy chain or antibody comprising carrying out the method of clause 26, wherein the constant region of the locus is a non-human vertebrate (e.g., mouse or rat) constant region, and then replacing the non-human constant region of the isolated heavy chain or antibody with a human constant region (e.g., by engineering of the nucleic acid encoding the antibody).
[0399] 28. A heavy chain, VH domain or an antibody produced by the method of clause 26 or 27. Optionally the HCDR3 length is at least 20 amino acids as herein described.
[0400] 29. A B-cell or hybridoma expressing a heavy chain VH domain that is identical to the VH domain of the heavy chain of clause 22, 23 or 28.
[0401] 30. A nucleic acid encoding the VH domain of the heavy chain of clause 22, 23 or 28, or encoding the heavy chain of clause 22, 23, 24, 25 or 28.
[0402] 31. A vector (e.g., a CHO cell or HEK293 cell vector) comprising the nucleic acid of clause 30; optionally wherein the vector is in a host cell (e.g., a CHO cell or HEK293 cell).
[0403] 32. A pharmaceutical composition comprising the antibody, heavy chain or VH domain (e.g., comprised by an antibody) of any one of clauses 22 to 25 and 28, together with a pharmaceutically-acceptable excipient, diluent or a medicament (e.g., a further antigen-specific variable domain, heavy chain or antibody).
[0404] 33. The antibody, heavy chain or VH domain (e.g., comprised by an antibody) of any one of clauses 22 to 25 and 28 for use in medicine (e.g., human medicine).
[0405] For example, the locus comprises the following human VH gene segments
IGHV6-1
IGHV3-7
IGHV1-8
IGHV3-9
IGHV3-11
IGHV3-13
IGHV1-18
IGHV3-30
IGHV4-31
IGHV4-39 and
IGHV4-59
[0406] Optionally also (i) and/or (ii)
(i)
IGHV1-2 IGHV2-5 and IGHV3-21
[0407] (ii)
IGHV1-2 IGHV2-5 IGHV3-21 IGHV1-24
[0408] For example, the locus comprises the following human VH gene segment variants
IGHV6-1*01
IGHV3-7*01
IGHV1-8*01
IGHV3-9*01
IGHV3-11*01
IGHV3-13*01
IGHV1-18*01
IGHV3-30*18
IGHV4-31*03
IGHV4-39*01 and
IGHV4-59*01;
[0409] Optionally also (iii) or (iv)
(iii)
IGHV1-2*04 IGHV2-5*10 and IGHV3-21*03
[0410] (iv)
IGHV1-2*02 IGHV2-5*01 IGHV3-21*01 and IGHV1-24*01
[0411] For example, the locus comprises the following human JH gene segment variants
IGHJ2*01 IGHJ3*02
IGHJ4*02 IGHJ5*02 and IGHJ6*02
[0412] For example, the locus comprises the following human D gene segments
IGHD1-1
IGHD2-2
IGHD3-9
IGHD3-10
IGHD5-12
IGHD6-13
IGHD1-14
IGHD2-15
IGHD3-16
IGHD4-17
IGHD6-19
IGHD2-21
IGHD5-24
IGHD1-26 and
IGHD7-27
[0413] and optionally also (v) or (vi)
(v)
IGHD3-3
[0414] (vi)
IGHD3-3
IGHD4-4
IGHD5-5
IGHD6-6
IGHD1-7
IGHD2-8 and
IGHD2-8
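The gene segment names listed above follow a regular pattern (segment type, family number, optional position, optional allele after an asterisk). As a minimal sketch, assuming that pattern holds for every name, the following Python snippet splits such names into their parts and groups them by segment type. The regular expression, the `Segment` type and the function names are illustrative assumptions, not part of the disclosure and not an official nomenclature parser.

```python
import re
from typing import Dict, List, NamedTuple, Optional

# Assumed naming pattern: "IGH" + segment type (V, D or J) + family number,
# then an optional "-position" and an optional "*allele".
_NAME = re.compile(r"^IGH([VDJ])(\d+)(?:-(\d+))?(?:\*(\d+))?$")


class Segment(NamedTuple):
    kind: str                # "V", "D" or "J"
    family: int              # e.g. 6 in IGHV6-1
    position: Optional[int]  # e.g. 1 in IGHV6-1; None for J segments
    allele: Optional[str]    # e.g. "02" in IGHJ6*02; None if not given


def parse_segment(name: str) -> Segment:
    """Split a heavy chain gene segment name into its components."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised gene segment name: {name!r}")
    kind, family, position, allele = match.groups()
    return Segment(kind, int(family), int(position) if position else None, allele)


def group_by_kind(names: List[str]) -> Dict[str, List[str]]:
    """Group segment names into V, D and J lists, preserving input order."""
    groups: Dict[str, List[str]] = {"V": [], "D": [], "J": []}
    for name in names:
        groups[parse_segment(name).kind].append(name)
    return groups


# A small subset of the segments named in the text, for illustration only.
example_locus = ["IGHV6-1*01", "IGHV3-7*01", "IGHD1-1", "IGHD7-27", "IGHJ6*02"]
print(group_by_kind(example_locus))
```

A strict pattern of this kind also flags mis-typed allele names (for example an apostrophe in place of the asterisk), which can be useful when checking a list of variants before use.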
The Present Invention Provides in a Fifth Configuration--
Constant Regions Tailored to Human Use & Antibody Humanization
[0415] Additional rational design and bioinformatics have led the inventors to realise that specific human constant region variants are conserved across many diverse human populations. The inventors realised that this opens up the possibility of making a choice to humanize antibodies, chains and variable domains by using such specific constant regions in products, rather than arbitrarily choosing the human constant region (or a synthetic version of a human constant region). This aspect of the invention also enables one to tailor antibody-based drugs to specific human ethnic populations, thereby more closely matching drug to patient (and thus disease setting) than has hitherto been performed. It can be a problem in the state of the art that antibodies are humanized with an arbitrary choice of human constant region (presumably derived from one (often unknown) ethnic population or non-naturally occurring) that does not function as well in patients of a different human ethnic population. This is important, since the constant region has the major role in providing antibody effector functions, e.g., for antibody recycling, cellular and complement recruitment and for cell killing.
[0416] As discussed further in WO2011066501, human IgG sub-types IgG1, IgG2, IgG3 and IgG4 exhibit differential capacity to recruit immune functions, such as antibody-dependent cellular cytotoxicity (ADCC, e.g., IgG1 and IgG3), antibody-dependent cellular phagocytosis (ADCP, e.g., IgG1, IgG2, IgG3 and IgG4), and complement dependent cytotoxicity (CDC, e.g., IgG1, IgG3). Sub-type-specific engagement of such immune functions is based on selectivity for Fc receptors on distinct immune cells and the ability to bind C1q and activate the assembly of a membrane attack complex (MAC).
[0417] Among the various sub-types, relative affinity for Fcγ receptors (e.g., FcγRI, FcγRIIa/b/c, FcγRIIIa/b) is high for IgG1 and IgG3; however, IgG2 has minimal affinity (restricted to the FcγRIIa 131H polymorphism), and IgG4 only has measurable affinity for FcγRI. Using comparative sequence analysis and co-crystal structures, the key contact residues for receptor binding have been mapped to the amino acid residues spanning the lower hinge and CH2 region. Using standard protein engineering techniques, some success in enhancing or reducing the affinity of an antibody preparation for Fc receptors and the C1q component of complement has been achieved.
[0418] Among the isotypes, IgG2 is least capable of binding the family of Fc receptors. Using IgG2 as the starting point, efforts have been made to find a mutant with diminished effector functions but which retains FcRn binding, prolonged stability, and low immunogenicity. Improved mutants of this nature may provide improved antibody therapeutics with retained safety. Human IgG1 therapeutic antibodies that bind to cell surface targets are able to engage effector cells that may mediate cell lysis of the target cell by antibody-dependent cellular cytotoxicity (ADCC) or complement dependent cytotoxicity (CDC). These mechanisms occur through interaction of the CH2 region of the antibody Fc domain with FcγR receptors on immune effector cells or with C1q, the first component of the complement cascade. Table 19 shows the activities of different human gamma sub-types. The skilled person may choose accordingly to promote or dampen-down activity depending upon the disease setting in humans of interest. For example, use of a human gamma-1 constant region is desirable when one wishes to isolate totally human heavy chains and antibodies that have relatively high complement activation activity by the classical pathway and FcγRI recognition in human patients. See also Mol Immunol. 2003 December; 40(9):585-93; "Differential binding to human Fcgamma RIIa and Fcgamma RIIb receptors by human IgG wild type and mutant antibodies"; Armour K L et al, which is incorporated herein by reference.
[0419] IgG2 constant regions are well suited to producing antibodies and heavy chains according to the invention for binding to cytokines or soluble targets in humans, since IgG2 is essentially FcγRI,III-silent, FcγRIIa-active and has little complement activity.
[0420] IgG1 constant regions have wide utility for human therapeutics, since IgG1 antibodies and heavy chains are FcγRI,II,III-active and have complement activity. This can be enhanced by using a human gamma-1 constant region that has been activated by engineering as is known in the art.
[0421] The work of the inventors has therefore identified a collection of human constant regions of different isotypes from which an informed choice can be made when humanizing chimaeric antibody chains (or conjugating V domains, such as dAbs or Camelid VHH, to constant regions). The collection was identified on the basis of bioinformatics analysis of the 1000 Genomes database, the inventors selecting constant region variants that are frequently occurring across several human ethnic populations, as well as those that appear with relatively high frequency within individual populations (as assessed by the number of individuals whose genomes comprise the variant). By sorting through the myriad possible sequences on this basis, the inventors have provided a collection of human constant region variants that are naturally-occurring and which can be used when rationally designing antibodies, heavy chains and other antibody-based formats that bear a human constant region. In particular, this is useful when humanizing chimaeric heavy chains to produce totally human chains in which both the variable and constant regions are human. This is useful for compatibility with human patients receiving antibody-based drugs.
[0422] To this end, the invention provides the following aspects:--
[0423] 1. A method of producing an antibody heavy chain, the method comprising
[0424] (a) providing an antigen-specific heavy chain variable domain (e.g., VH (such as a human VH or dAb) or VHH or a humanized heavy chain variable domain); and
[0425] (b) combining the variable domain with a human heavy chain constant region to produce an antibody heavy chain comprising (in N- to C-terminal direction) the variable domain and the constant region;
[0426] wherein
[0427] the human heavy chain constant region is an IGHAref, IGHA1a, IGHA2a, IGHA2b, IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref, IGHG4a, IGHDref, IGHEref, IGHMref, IGHMa or IGHMb constant region.
[0428] Step (b) can be carried out, e.g., using recombinant DNA technology using the corresponding nucleotide sequences.
[0429] For the constant region according to any aspect of this configuration, either genomic DNA or equivalent (i.e., having introns and exons and optionally also 5' UTR sequences, e.g., with native or a non-native leader sequence) can be used, for example any of the "GENOMIC" sequences disclosed as SEQ ID NO: 365 onwards herein. Alternatively, an intronless sequence can be used, for example any of the "CDS" sequences disclosed as SEQ ID NO: 365 onwards herein (e.g., with native or a non-native leader sequence).
[0430] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHAref constant region.
[0431] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHA1a constant region.
[0432] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHA2a constant region.
[0433] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHA2b constant region.
[0434] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG1ref constant region.
[0435] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG2ref constant region.
[0436] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG2a constant region.
[0437] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG3ref constant region.
[0438] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG3a constant region.
[0439] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG3b constant region.
[0440] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG4ref constant region.
[0441] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHG4a constant region.
[0442] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHDref constant region.
[0443] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHEref constant region.
[0444] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHMref constant region.
[0445] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHMa constant region.
[0446] Optionally for any aspect of this configuration of the invention, the human heavy chain constant region is an IGHMb constant region.
[0447] Optionally, a derivative (e.g., a mutant or conjugate) of the heavy chain or an antibody containing the heavy chain is produced. For example, a toxic payload can be conjugated (e.g., for oncology applications). For example, one or more mutations can be introduced, as is known in the art, to inactivate or enhance Fc effector function.
[0448] 2. The method of aspect 1, wherein the variable domain is a human variable domain.
[0449] A human variable domain is, for example, the product of recombination in a transgenic non-human vertebrate of human VH, D and JH gene segments. Alternatively, the variable domain is identified using in vitro display technology from a human VH library, e.g., using phage display, ribosome display or yeast display, as is known in the art.
[0450] In another embodiment, the variable domain is a humanized variable domain, e.g., comprising human frameworks with non-human (e.g., mouse or rat) CDRs. Humanization technology is conventional in the art, and will be readily known to the skilled person.
[0451] 3. The method of any preceding aspect, wherein the variable domain has previously been selected from a non-human vertebrate that has been immunised with the antigen.
[0452] For example, the vertebrate (such as a mouse or rat) genome comprises a chimaeric heavy chain locus comprising a human variable region (human V, D and JH gene segments) operably connected upstream of a non-human vertebrate constant region so that the locus is able to rearrange for the expression of heavy chains comprising human variable domains and non-human vertebrate constant regions.
[0453] In alternative embodiments, the variable domain is selected using an in vitro technology such as phage display, ribosome display or yeast display. In this case the variable domain may be displayed with or without a constant region, provided that it is later combined with a human constant region as per the invention.
[0454] 4. The method of any preceding aspect, comprising providing an expression vector (e.g., a mammalian expression vector, such as a CHO or HEK293 vector) comprising a nucleotide sequence encoding the constant region; inserting a nucleotide sequence encoding the variable domain into the vector 5' of the constant region sequence; inserting the vector into a host cell and expressing the heavy chain by the host cell; the method further comprising isolating a heavy chain (e.g., as part of an antibody) comprising the variable domain and the human constant region.
[0455] The vector comprises regulatory elements sufficient to effect expression of the heavy chain when the vector is harboured by a host cell, e.g., a CHO or HEK293 cell.
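The in silico counterpart of placing a variable domain sequence 5' of a constant region sequence can be sketched as an in-frame concatenation of intronless coding sequences. The Python snippet below is a minimal illustration under stated assumptions: the function name is hypothetical, the checks are simplified, and the variable and constant fragments are short toy placeholders rather than real antibody sequences. Only the leader is the example leader sequence given in this specification.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def assemble_heavy_chain_cds(leader: str, variable: str, constant: str) -> str:
    """Join leader, variable and constant coding sequences 5' to 3', in frame.

    Assumes intronless (CDS-style) inputs. Raises ValueError if a part is
    empty, contains non-ACGT characters, would shift the reading frame, or
    if the joined sequence has a stop codon before its final codon.
    """
    parts = {"leader": leader, "variable": variable, "constant": constant}
    cleaned = []
    for label, seq in parts.items():
        seq = seq.replace(" ", "").upper()
        if not seq or set(seq) - set("ACGT"):
            raise ValueError(f"{label}: empty or non-ACGT sequence")
        if len(seq) % 3 != 0:
            raise ValueError(f"{label}: length {len(seq)} is not a multiple of 3")
        cleaned.append(seq)
    cds = "".join(cleaned)
    if not cds.startswith("ATG"):
        raise ValueError("leader does not begin with a start codon")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature in-frame stop codon")
    return cds


# Leader: the example leader sequence from this specification.
LEADER = "ATGGGCTGGTCCTGCATCATCCTGTTTCTGGTGGCCACCGCCACCGGCGTGCACAGC"
# Toy placeholders (hypothetical, far shorter than real domains).
TOY_VARIABLE = "GAGGTGCAGCTG"
TOY_CONSTANT = "GCCTCCACCAAGTGA"

cds = assemble_heavy_chain_cds(LEADER, TOY_VARIABLE, TOY_CONSTANT)
print(len(cds), cds[:9], cds[-3:])
```

A check of this kind does not replace sequence verification of the actual construct; it only catches frame and stop-codon mistakes before cloning is attempted.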
[0456] 5. The method of any preceding aspect, further comprising obtaining a nucleotide sequence encoding the heavy chain.
[0457] 6. An antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region that is an IGHAref, IGHA1a, IGHA2a, IGHA2b, IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref, IGHG4a, IGHDref, IGHEref, IGHMref, IGHMa or IGHMb constant region.
[0458] 7. A polypeptide comprising (in N- to C-terminal direction) a leader sequence, a human variable domain that is specific for an antigen and a human constant region that is an IGHAref, IGHA1a, IGHA2a, IGHA2b, IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref, IGHG4a, IGHDref, IGHEref, IGHMref, IGHMa or IGHMb constant region; wherein (i) the leader sequence is not the native human variable domain leader sequence (e.g., the leader sequence is another human leader sequence or a non-human leader sequence); and/or (ii) the variable domain comprises mouse AID-pattern somatic mutations or mouse terminal deoxynucleotidyl transferase (TdT)-pattern junctional mutations.
[0459] 8. A nucleotide sequence encoding (in 5' to 3' direction) a leader sequence and a human antibody heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region that is an IGHAref, IGHA1a, IGHA2a, IGHA2b, IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref, IGHG4a, IGHDref, IGHEref, IGHMref, IGHMa or IGHMb constant region; and the leader sequence being operable for expression (e.g., in a mammalian CHO or HEK293 cell) of the heavy chain and wherein the leader sequence is not the native human variable domain leader sequence (e.g., the leader sequence is another human leader sequence or a non-human leader sequence).
[0460] In an example, the leader sequence is
ATGGGCTGGTCCTGCATCATCCTGTTTCTGGTGGCCACCGCCACCGGCGTGCACAGC
[0461] which translates to
MGWSCIILFLVATATGVHS
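The stated translation of the example leader sequence can be checked with the standard genetic code. The following Python sketch builds the standard codon table and translates the 57-nucleotide leader; the function and variable names are illustrative assumptions.

```python
# Standard genetic code in the conventional T, C, A, G ordering
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, ignoring any spaces in the input."""
    dna = dna.replace(" ", "").upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Example leader sequence as given in the specification.
LEADER_DNA = "ATGGGCTGGTCCTGCATCATCCTGTTTCTGGTGGCCACCGCCACCGGCGTGCACAGC"
print(translate(LEADER_DNA))  # MGWSCIILFLVATATGVHS
```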
[0462] 9. A nucleotide sequence encoding (in 5' to 3' direction) a promoter and a human antibody heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region that is an IGHAref, IGHA1a, IGHA2a, IGHA2b, IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref, IGHG4a, IGHDref, IGHEref, IGHMref, IGHMa or IGHMb constant region; and the promoter being operable for expression (e.g., in a mammalian CHO or HEK293 cell) of the heavy chain and wherein the promoter is not the native human promoter.
[0463] In one embodiment, the promoter sequence is a human IGK 3-15 promoter.
[0464] 10. The antibody, polypeptide or nucleotide sequence of any one of aspects 6 to 9, wherein the variable domain comprises mouse AID-pattern somatic mutations and/or mouse terminal deoxynucleotidyl transferase (TdT)-pattern junctional mutations.
[0465] For example, one way, in any aspect of this configuration of the invention, to provide mouse AID-pattern somatic mutations and/or mouse terminal deoxynucleotidyl transferase (TdT)-pattern junctional mutations is to select a variable domain from a non-human vertebrate or cell. For example, a vertebrate or cell as disclosed herein.
[0466] 11. A vector (e.g., a CHO cell or HEK293 cell vector) comprising the nucleic acid of aspect 8, 9 or 10; optionally wherein the vector is in a host cell (e.g., a CHO cell or HEK293 cell).
[0467] 12. A pharmaceutical composition comprising the antibody or polypeptide of any one of aspects 6, 7 and 10, together with a pharmaceutically-acceptable excipient, diluent or a medicament (e.g., a further antigen-specific variable domain, antibody chain or antibody).
[0468] 13. The antibody or polypeptide of any one of aspects 6, 7 and 10 for use in treating and/or preventing a medical condition in a human patient.
[0469] 14. Use of the antibody or polypeptide of any one of aspects 6, 7 and 10 for the manufacture of a medicament for treating and/or preventing a medical condition in a human patient.
[0470] 15. The antibody, polypeptide or use of aspect 13 or 14, wherein the human is a member of a human population selected from population numbers 1-14, wherein the populations are numbered as follows (population labels being according to 1000 Genomes Project nomenclature)
1=ASW;
2=CEU;
3=CHB;
4=CHS;
5=CLM;
6=FIN;
7=GBR;
8=IBS;
9=JPT;
10=LWK;
11=MXL;
12=PUR;
13=TSI;
14=YRI.
[0471] 16. The antibody, polypeptide or use of aspect 15, wherein the constant region is a
[0472] (i) IGHA1a constant region and the human population is selected from any population number 1-14;
[0473] (ii) IGHA2a constant region and the human population is selected from any population number 1-14;
[0474] (iii) IGHA2b constant region and the human population is selected from any population number 1-14;
[0475] (iv) IGHG2a constant region and the human population is selected from any population number 1-9 and 11-13;
[0476] (v) IGHG3a constant region and the human population is selected from any population number 1-14;
[0477] (vi) IGHG3b constant region and the human population is selected from any population number 1-8 and 11-13;
[0478] (vii) IGHG4a constant region and the human population is selected from any population number 1-9 and 11-13;
[0479] (viii) IGHMa constant region and the human population is selected from any population number 1-14; or
[0480] (ix) IGHMb constant region and the human population is selected from any population number 1-14;
[0481] Wherein the populations are numbered as follows (population labels being according to 1000 Genomes Project nomenclature)
1=ASW;
2=CEU;
3=CHB;
4=CHS;
5=CLM;
6=FIN;
7=GBR;
8=IBS;
9=JPT;
10=LWK;
11=MXL;
12=PUR;
13=TSI;
14=YRI.
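The pairings of constant region variant and population number recited in aspect 16 can be held in a simple lookup table. The Python sketch below encodes items (i) to (ix) as written and returns the listed variants for a given 1000 Genomes population label. The function and variable names are illustrative assumptions, and the table covers only the variants named in aspect 16.

```python
# Population numbering as given in the text (1000 Genomes Project labels).
POPULATIONS = {
    1: "ASW", 2: "CEU", 3: "CHB", 4: "CHS", 5: "CLM", 6: "FIN", 7: "GBR",
    8: "IBS", 9: "JPT", 10: "LWK", 11: "MXL", 12: "PUR", 13: "TSI", 14: "YRI",
}


def _numbers(*spans):
    """Expand inclusive (low, high) spans into a set of population numbers."""
    return frozenset(n for low, high in spans for n in range(low, high + 1))


# Items (i) to (ix) of aspect 16.
PERMITTED = {
    "IGHA1a": _numbers((1, 14)),
    "IGHA2a": _numbers((1, 14)),
    "IGHA2b": _numbers((1, 14)),
    "IGHG2a": _numbers((1, 9), (11, 13)),
    "IGHG3a": _numbers((1, 14)),
    "IGHG3b": _numbers((1, 8), (11, 13)),
    "IGHG4a": _numbers((1, 9), (11, 13)),
    "IGHMa": _numbers((1, 14)),
    "IGHMb": _numbers((1, 14)),
}


def constant_regions_for(label: str):
    """Return the aspect 16 constant region variants listed for a population."""
    number = next((n for n, l in POPULATIONS.items() if l == label), None)
    if number is None:
        raise KeyError(f"unknown population label: {label!r}")
    return sorted(region for region, nums in PERMITTED.items() if number in nums)


print(constant_regions_for("JPT"))
print(constant_regions_for("LWK"))
```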
[0482] 17. A vector (e.g., a CHO cell or HEK293 cell vector) comprising an IGHG1ref, IGHG2ref, IGHG2a, IGHG3ref, IGHG3a, IGHG3b, IGHG4ref or IGHG4a constant region nucleotide sequence that is 3' of a cloning site for the insertion of a human antibody heavy chain variable domain nucleotide sequence, such that upon insertion of such a variable domain sequence the vector comprises (in 5' to 3' direction) a promoter, a leader sequence, the variable domain sequence and the constant region sequence so that the vector is capable of expressing a human antibody heavy chain when present in a host cell.
The Present Invention Provides in a Sixth Configuration--
Multiple Variants in the Same Genome Cis or Trans
[0483] The inventors' analysis has revealed groupings of naturally-occurring human antibody gene segment variants as set out in Table 13 and Table 14. This revealed the possibility of producing transgenic genomes in non-human vertebrates and cells wherein the genomes contain more than the natural human complement of specific human gene segments. In one example, this can be achieved by providing more than the natural human complement of a specific gene segment type on one or both of the respective Ig loci (e.g., one or both chromosomes harbouring IgH in a mouse genome or mouse cell genome).
[0484] To this end, this configuration of the invention provides the following (as set out in numbered paragraphs):--
[0485] 1. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 3 human variable region gene segments of the same type (e.g., at least 3 human VH6-1 gene segments, at least 3 human JH6 gene segments, at least 3 human VK1-39 gene segments, at least 3 human D2-2 gene segments or at least 3 human JK1 gene segments), wherein at least two of the human gene segments are variants that are not identical to each other.
[0486] For example, the genome comprises a variable region that comprises V, D and J gene segments (for the variable region of a heavy chain locus) or V and J gene segments (for the variable region of a light chain locus) upstream of a constant region for expression of heavy or light chains respectively.
[0487] In an alternative, the skilled person can choose to provide more than the wild type human complement of a specific gene segment type by providing several copies of one variant type of the human gene segment. Thus, there is provided a non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 3 human variable region gene segments of the same type (e.g., at least 3 human VH6-1 gene segments, at least 3 human JH6 gene segments, at least 3 human VK1-39 gene segments, at least 3 human D2-2 gene segments or at least 3 human JK1 gene segments), wherein the human gene segments are identical variants.
[0488] For example, the genome comprises a variable region that comprises V, D and J gene segments (for the variable region of a heavy chain locus) or V and J gene segments (for the variable region of a light chain locus) upstream of a constant region for expression of heavy or light chains respectively.
[0489] 2. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) cis at the same Ig locus.
[0490] In an alternative, the skilled person can choose to provide more than the wild type human complement of a specific gene segment type by providing several copies of one variant type of the human gene segment. Thus, there is provided
[0491] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 non-endogenous variable region gene segments of the same variant type (e.g., at least 2 human JH6*02 gene segments) cis at the same Ig locus.
[0492] 3. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) trans at the same Ig locus; and optionally a third human gene segment of the same type, wherein the third gene segment is cis with one of said 2 different gene segments.
[0493] In an alternative, the skilled person can choose to provide more than the wild type human complement of a specific gene segment type by providing several copies of one variant type of the human gene segment. Thus, there is provided a non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different human variable region gene segments of the same variant type (e.g., at least 2 human JH6*02 gene segments) trans at the same Ig locus; and optionally a third human gene segment of the same variant type, wherein the third gene segment is cis with one of said 2 different gene segments.
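The cis and trans arrangements described in paragraphs 1 to 3 can be illustrated with a small data model in which each chromosome copy of an Ig locus carries a list of gene segments. The Python sketch below is an illustration under assumptions: the chromosome labels, the function names and the use of JH6*01 as a second JH6 variant are chosen for the example and are not taken from the specification.

```python
from typing import Dict, List, Tuple

Segment = Tuple[str, str]          # (segment type, variant name)
Genome = Dict[str, List[Segment]]  # chromosome label -> segments at the Ig locus


def variants_on(genome: Genome, chromosome: str, seg_type: str) -> List[str]:
    """Variant names of one segment type carried on one chromosome."""
    return [variant for kind, variant in genome[chromosome] if kind == seg_type]


def at_least_three_with_two_distinct(genome: Genome, seg_type: str) -> bool:
    """Paragraph 1 style check: >= 3 segments of a type, >= 2 non-identical."""
    found = [v for c in genome for v in variants_on(genome, c, seg_type)]
    return len(found) >= 3 and len(set(found)) >= 2


def has_different_in_cis(genome: Genome, seg_type: str) -> bool:
    """True if one chromosome carries two different variants of the type."""
    return any(len(set(variants_on(genome, c, seg_type))) >= 2 for c in genome)


def has_different_in_trans(genome: Genome, seg_type: str) -> bool:
    """True if two chromosomes carry variants of the type that differ."""
    chromosomes = list(genome)
    for i, first in enumerate(chromosomes):
        for second in chromosomes[i + 1:]:
            a = set(variants_on(genome, first, seg_type))
            b = set(variants_on(genome, second, seg_type))
            if any(x != y for x in a for y in b):
                return True
    return False


# Illustrative genome: two JH6 variants in cis on one chromosome, one in trans.
mixed = {
    "chromosome_a": [("JH6", "JH6*02"), ("JH6", "JH6*01")],
    "chromosome_b": [("JH6", "JH6*02")],
}
# Alternative: three identical copies of one variant.
identical = {
    "chromosome_a": [("JH6", "JH6*02"), ("JH6", "JH6*02")],
    "chromosome_b": [("JH6", "JH6*02")],
}
print(at_least_three_with_two_distinct(mixed, "JH6"), has_different_in_cis(mixed, "JH6"))
print(at_least_three_with_two_distinct(identical, "JH6"), has_different_in_cis(identical, "JH6"))
```

The `identical` genome corresponds to the alternative in which several copies of one variant type are provided, so the "different variants" checks return False for it even though three copies are present.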
[0494] 4. A population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human variable region gene segments, wherein the repertoire comprises at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein a first of said different gene segments is provided in the genome of a first vertebrate of the population, and a second of said different gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second gene segment.
[0495] 5. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein the gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0496] 6. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 3 human variable region gene segments of the same type (e.g., at least 3 human VH6-1 gene segments, at least 3 human JH6 gene segments, at least 3 human VK1-39 gene segments, at least 3 human D2-2 gene segments or at least 3 human JK1 gene segments), wherein at least two of the human gene segments are variants that are not identical to each other.
[0497] 7. A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) cis at the same Ig locus.
[0498] 8. A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments) trans at the same Ig locus; and optionally a third human gene segment of the same type, wherein the third gene segment is cis with one of said 2 different gene segments.
[0499] 9. A method of providing an enhanced human immunoglobulin variable region gene segment repertoire, the method comprising providing a population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human variable region gene segments, wherein the method comprises providing at least 2 different human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein a first of said different gene segments is provided in the genome of a first vertebrate of the population, and a second of said different gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second gene segment.
[0500] 10. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein the gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0501] 11. The vertebrate, cell or method of any preceding paragraph, wherein at least 2 or 3 of said different gene segments are provided cis at the same Ig locus in said genome.
[0502] 12. The vertebrate, cell or method of any preceding paragraph, wherein the gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0503] 13. The vertebrate, cell or method of any preceding paragraph, wherein the gene segments are derived from the genome sequence of two or more different human individuals; optionally wherein the different human individuals are from different human populations.
[0504] 14. The vertebrate, cell or method of paragraph 13, wherein the individuals are not genetically related.
[0505] 15. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 human variable region gene segments of the same type (e.g., at least 2 human VH6-1 gene segments, at least 2 human JH6 gene segments, at least 2 human VK1-39 gene segments, at least 2 human D2-2 gene segments or at least 2 human JK1 gene segments), wherein the gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations; optionally wherein at least 2 or 3 of said different gene segments are provided at the same Ig locus in said genome.
[0506] 16. The method of paragraph 15, wherein the different human individuals are from different human populations.
[0507] 17. The method of paragraph 15, wherein the individuals are not genetically related.
[0508] 18. The vertebrate, cell or method of any preceding paragraph, wherein at least one of the different segments is a synthetic mutant of a human germline gene segment.
[0509] 19. The vertebrate, cell or method of any preceding paragraph, wherein each of said gene segments occurs in 10 or more different human populations.
[0510] 20. The vertebrate, cell or method of any preceding paragraph, wherein each of said gene segments has a human frequency of 5% or greater (e.g., 10, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, or 95% or greater).
[0511] In this respect, the skilled person can be guided by the information provided in Table 14. Frequency can, for example, be cumulative frequency in the 1000 Genomes database.
[0512] 21. The vertebrate, cell or method of paragraph 20, wherein each of said gene segments occurs in 10 or more different human populations.
[0513] 22. The vertebrate, cell or method of any preceding paragraph, wherein each of said gene segments occurs in the 1000 Genomes database in more than 50 individuals.
[0514] 23. The vertebrate, cell or method of any preceding paragraph, wherein each of said gene segments (i) has a human frequency of 5% or greater (e.g., 10, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, or 95% or greater); and (ii) occurs in 10 or more different human populations.
[0515] In this respect, the skilled person can be guided by the information provided in Table 14.
[0516] Frequency can, for example, be cumulative frequency in the 1000 Genomes database.
[0517] 24. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising first and second human Ig locus gene segments of the same type (e.g., first and second human JH6 gene segments; or first and second IgG2 gene segments; or first and second human JA7 gene segments), wherein the first gene segment is a gene segment selected from Table 14 (e.g., IGHJ6-a) and the second gene segment is the corresponding reference sequence (e.g., IGHJ6 ref; SEQ ID NO: 244).
[0518] Table 14 lists commonly-occurring natural human variants. It can be seen that these occur across many human populations and thus usefully have wide applicability for human antibody-based drugs.
[0519] For example, the gene segments are provided as targeted insertions into an endogenous non-human vertebrate Ig locus. Alternatively, random integration (e.g., using YACs) as is known in the art can be performed.
[0520] For example, the genome comprises a variable region that comprises V, D and J gene segments (for the variable region of a heavy chain locus) or V and J gene segments (for the variable region of a light chain locus) upstream of a constant region for expression of heavy or light chains respectively.
[0521] In another embodiment, the invention enables the skilled person to select two or more different naturally-occurring human gene segment variants for combination into the genome of a non-human vertebrate or cell. A reference sequence need not be included. It may be desirable to use one or more rare gene segments to increase diversity of the repertoire. Additionally or alternatively, it may be desirable to include a mixture of frequent and rare variants of the same type to provide repertoire diversity. The variants may be chosen additionally or alternatively to tailor the gene segment inclusion to one or more specific human populations as indicated by the information provided in Table 13 or Table 14.
[0522] Thus, the invention provides:
[0523] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising first and second human Ig locus gene segments of the same type (e.g., first and second human JH6 gene segments; or first and second IgG2 gene segments; or first and second human JA7 gene segments), wherein the gene segments are gene segments selected from Table 13 or Table 14; and optionally wherein one or more of the gene segments appears in Table 14 (e.g., IGHJ6-a) or is a reference sequence (e.g., IGHJ6 ref; SEQ ID NO: 244).
[0524] 25. The vertebrate or cell of paragraph 24, wherein the genome comprises a third human gene segment of said type, the third gene segment being different from the first and second gene segments.
[0525] 26. The vertebrate or cell of paragraph 24 or 25, wherein the first and second gene segments are cis on the same chromosome; and optionally the third gene segment is also cis on said chromosome.
[0526] 27. The vertebrate or cell of paragraph 26, wherein the gene segments are targeted insertions into an endogenous non-human Ig locus.
[0527] For example, the gene segments are heavy chain gene segments and the non-human locus is an IgH locus. For example, the gene segments are light chain (kappa or lambda) gene segments and the non-human locus is an IgL locus.
[0528] 28. The vertebrate or cell of paragraph 24 or 25, wherein the first and second gene segments are trans on different chromosomes.
[0529] Thus, the chromosomes are the same type (e.g., both mouse chromosome 6 or rat chromosome 4).
[0530] 29. The vertebrate or cell of any one of paragraphs 24 to 28, wherein the first gene segment is a gene segment selected from any one of Tables 1 to 7 and 9 to 14 (e.g., selected from Table 13 or 14) and the second gene segment is the corresponding reference sequence.
[0531] 30. A population of non-human vertebrates (e.g., mice or rats) comprising first and second human Ig locus gene segments of the same type (e.g., first and second human JH6 gene segments; or first and second IgG2 gene segments; or first and second human JA7 gene segments), wherein the first gene segment is a gene segment selected from any one of Tables 1 to 7 and 9 to 14 (e.g., Table 13 or 14) (e.g., IGHJ6-a) and the second gene segment is the corresponding reference sequence (e.g., SEQ ID NO: 7), wherein the first gene segment is provided in the genome of a first vertebrate of the population, and the second gene segment is provided in the genome of a second vertebrate of the population.
[0532] 31. The population of paragraph 30, wherein the genome of the first vertebrate does not comprise the second gene segment.
[0533] 32. The population of paragraph 30 or 31, wherein the population comprises a third human gene segment of said type, the third gene segment being different from the first and second gene segments and optionally wherein the first and third gene segments are present in the genome of the first vertebrate.
[0534] 33. The population of paragraph 30, 31 or 32, wherein the gene segments are targeted insertions into an endogenous non-human Ig locus in the respective genome.
[0535] For example, the gene segments are heavy chain gene segments and the non-human locus is an IgH locus. For example, the gene segments are light chain (kappa or lambda) gene segments and the non-human locus is an IgL locus.
[0536] 34. The population of any one of paragraphs 30 to 33, wherein the first gene segment is a gene segment selected from any one of Tables 1 to 7 and 9 to 14 (e.g., Table 13 or 14) and the second gene segment is the corresponding reference sequence.
[0537] 35. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising first and second human Ig locus gene segments of the same type (e.g., first and second human JH6 gene segments; or first and second IgG2 gene segments; or first and second human JA7 gene segments), wherein the first gene segment is a gene segment selected from any one of Tables 1 to 7 and 9 to 14 (e.g., Table 13 or 14) (e.g., IGHJ6-a) and the second gene segment is the corresponding reference sequence (e.g., SEQ ID NO: 7).
[0538] 36. A method of providing an enhanced human immunoglobulin gene segment repertoire, the method comprising providing a population according to any one of paragraphs 30 to 33.
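The frequency and population criteria recited in paragraphs 20 to 23 above (a human frequency of 5% or greater and occurrence in 10 or more different human populations) can be illustrated with a minimal Python sketch. The record type, the variant labels, the population codes and the frequency values below are hypothetical placeholders chosen only for illustration; they are not taken from Table 14 or from the 1000 Genomes database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantRecord:
    """Hypothetical summary of one gene segment variant."""
    name: str                    # illustrative label only
    cumulative_frequency: float  # fraction between 0.0 and 1.0
    populations: frozenset       # codes of populations in which the variant occurs


def meets_criteria(rec, min_frequency=0.05, min_populations=10):
    """True if the variant has a frequency of min_frequency or greater
    and occurs in min_populations or more different populations."""
    return (rec.cumulative_frequency >= min_frequency
            and len(rec.populations) >= min_populations)


# Made-up example records (assumed values, not real data).
records = [
    VariantRecord("variant-A", 0.26, frozenset(f"P{i:02d}" for i in range(12))),
    VariantRecord("variant-B", 0.02, frozenset(f"P{i:02d}" for i in range(11))),
    VariantRecord("variant-C", 0.30, frozenset({"P01", "P02"})),
]

selected = [r.name for r in records if meets_criteria(r)]
print(selected)  # ['variant-A']
```

Raising `min_frequency` (e.g., to 0.10 or 0.20) corresponds to the higher frequency thresholds given as examples in paragraphs 20 and 23.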
Variants Prevalent in Few Populations
[0539] In another aspect, it is of note that certain human gene segment variants may appear relatively frequently in one or a small number of populations, but are not found prevalently across many different human populations. It is thought that specific germline gene segment repertoires have evolved in individual human ethnic populations due to iterative exposure to antigens (e.g., disease pathogen antigens) to which the population is often exposed. Repeated exposure and mutation may have led to the evolution of gene segment variants that can provide an effective response to the antigen (pathogen) in the population, and this may explain the conservation of the gene segments in those populations (as opposed to other human ethnic populations that may not have frequently encountered the antigen). With this in mind, the inventors identified gene segment variants from their analysis that are relatively prevalent in a small number of human populations, and not across many populations. The inventors realized that inclusion of one or more of such gene segments in the configurations of the invention (e.g., in transgenic Ig loci, vertebrates and cells) would be useful for producing antibodies, Ig chains and variable domains that can address antigens (e.g., disease-causing antigens or pathogens) to which the small number of human populations may become exposed. Such products would be useful for treating and/or preventing disease or medical conditions in members of such a population. This aspect could also be useful for addressing infectious disease pathogens that may have been common in the small number of populations, but which in the future or relatively recently in evolution have become more prevalent disease-causing pathogens in other human populations (i.e., those not listed in Table 13 against the gene segment variant(s) in question). To this end, from the 1000 Genomes database the inventors have identified the gene segment variants listed in Table 20.
[0540] Thus, according to any configuration or aspect described herein, one, more or all of the gene segments used in the present invention can be a gene segment listed in Table 20A, 20B, 20C or 20D.
Multiple JH Gene Segment Variants
[0541] A specific application of this configuration is the provision of multiple human JH gene segments as follows.
[0542] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 3 human JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6), wherein at least two of the human JH gene segments are variants that are not identical to each other.
[0543] In an example, any cell of the invention is an isolated cell. An "isolated" cell is one that has been identified, separated and/or recovered from a component of its production environment (e.g., naturally or recombinantly). Preferably, the isolated cell is free of association with all other components from its production environment, e.g., so that the cell can produce an antibody to an FDA-approvable or approved standard. Contaminant components of its production environment, such as that resulting from recombinant transfected cells, are materials that would typically interfere with research, diagnostic or therapeutic uses for the resultant antibody, and may include enzymes, hormones, and other proteinaceous or non-proteinaceous solutes. In preferred embodiments, the polypeptide will be purified: (1) to greater than 95% by weight of antibody as determined by, for example, the Lowry method, and in some embodiments, to greater than 99% by weight; (2) to a degree sufficient to obtain at least 15 residues of N-terminal or internal amino acid sequence by use of a spinning cup sequenator, or (3) to homogeneity by SDS-PAGE under non-reducing or reducing conditions using Coomassie blue or, preferably, silver stain. Ordinarily, however, an isolated cell will be prepared by at least one purification step.
[0544] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous JH gene segments (e.g., human gene segments) of the same type (JH1, JH2, JH3, JH4, JH5 or JH6) cis at the same Ig (e.g., IgH, e.g., endogenous IgH, e.g., mouse or rat IgH) locus. In an example, the genome comprises a human VH, D and JH repertoire comprising said different JH gene segments. Optionally the non-endogenous JH gene segments are non-mouse or non-rat, e.g., human JH gene segments. In an example one or more or all of the non-endogenous gene segments are synthetic.
[0545] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different human JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6) trans at the same Ig (e.g., IgH, e.g., endogenous IgH, e.g., mouse or rat IgH) locus; and optionally a third human JH gene segment of the same type, wherein the third JH is cis with one of said 2 different JH gene segments.
[0546] A population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human JH gene segments, wherein the repertoire comprises at least 2 different human JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6), a first of said different JH gene segments is provided in the genome of a first vertebrate of the population, and a second of said different JH gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second JH gene segment.
[0547] A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous (e.g., human) JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6), wherein the JH gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations (e.g., 3, 4, 5 or 6 generations). Optionally the non-endogenous JH gene segments are human JH gene segments. In an example one or more or all of the non-endogenous gene segments are synthetic.
[0548] A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 3 human JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6), wherein at least two of the human JH gene segments are variants that are not identical to each other.
[0549] A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous (e.g., human) JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6) cis at the same Ig (e.g., IgH, e.g., endogenous IgH, e.g., mouse or rat IgH) locus. Optionally the non-endogenous JH gene segments are non-mouse or non-rat, e.g., human JH gene segments. In an example one or more or all of the non-endogenous gene segments are synthetic.
[0550] A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different human JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6) trans at the same Ig (e.g., IgH, e.g., endogenous IgH, e.g., mouse or rat IgH) locus; and optionally a third human JH gene segment of the same type, wherein the third JH is cis with one of said 2 different JH gene segments.
[0551] A method of providing an enhanced human immunoglobulin JH gene segment repertoire, the method comprising providing a population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human JH gene segments, wherein the method comprises providing at least 2 different human JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6), wherein a first of said different JH gene segments is provided in the genome of a first vertebrate of the population, and a second of said different JH gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second JH gene segment.
[0552] A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous (e.g., human) JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6), wherein the JH gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations (e.g., 3, 4, 5, or 6 generations). Optionally the non-endogenous JH gene segments are human JH gene segments. In an example one or more or all of the non-endogenous gene segments are synthetic.
[0553] In an example of the vertebrate or cell or the method of the invention at least 2 or 3 of said different gene segments are provided cis at the same Ig locus in said genome.
[0554] In an example of the vertebrate or cell or the method of the invention the JH gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations (e.g., 3, 4, 5, or 6 generations).
[0555] In an example of the vertebrate or cell or the method of the invention the JH gene segments are derived from the genome sequence of two or more different human individuals; optionally wherein the different human individuals are from different human populations.
[0556] In an example of the vertebrate or cell or the method of the invention the individuals are not genetically related (e.g., going back 3, 4, 5, or 6 generations).
[0557] In an example of the vertebrate or cell or the method of the invention at least one of the different JH segments is a synthetic mutant of a human germline JH gene segment.
[0558] The invention also provides a method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 human JH gene segments of the same type (JH1, JH2, JH3, JH4, JH5 or JH6), wherein the JH gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations (e.g., 3, 4, 5, or 6 generations); optionally wherein at least 2 or 3 of said different gene segments are provided at the same IgH locus in said genome.
[0559] In an example of the vertebrate or cell or the method of this embodiment of the invention the genome comprises a substantially complete functional repertoire of human JH gene segment types supplemented with one, two or more human JH gene segments, wherein said substantially complete functional repertoire and the supplementary JH gene segments are not found together in the germline genome of a human individual.
[0560] In an example of the population of the invention, the population comprises a substantially complete functional repertoire of human JH gene segment types supplemented with one, two or more human JH gene segments, wherein said substantially complete functional repertoire and the supplementary JH gene segments are not found together in the germline genome of a human individual.
[0561] A non-human vertebrate (e.g., a mouse or rat) or a non-human cell (e.g., an ES cell or a B-cell) having a genome comprising a substantially complete functional repertoire of human JH gene segment types supplemented with one, two or more human JH gene segments, wherein said substantially complete functional repertoire and the supplementary JH gene segments are not found together in the germline genome of a human individual.
[0562] A population of non-human vertebrates (e.g., mice or rats) comprising a substantially complete functional repertoire of human JH gene segment types supplemented with one, two or more human JH gene segments, wherein said substantially complete functional repertoire and the supplementary JH gene segments are not found together in the germline genome of a human individual.
[0563] In an example of the vertebrate or the population, at least one of said JH gene segments is SEQ ID NO: 1, 2, 3 or 4. For example, at least one of said JH gene segments is SEQ ID NO: 1 and at least one, two or more of said supplementary JH gene segments is a variant according to any example above. For example, at least one of said JH gene segments is SEQ ID NO: 2 and at least one, two or more of said supplementary JH gene segments is a variant according to any one of the examples above.
[0564] In an embodiment, the non-human vertebrate or vertebrate cell of the invention comprises a genome that comprises VH, D and JH gene repertoires comprising human gene segments, the JH gene repertoire (e.g., a human JH gene segment repertoire) comprising a plurality of JH1 gene segments provided by at least 2 different JH1 gene segments in cis at the same Ig locus in said genome;
a plurality of JH2 gene segments provided by at least 2 different JH2 gene segments in cis at the same Ig locus in said genome; a plurality of JH3 gene segments provided by at least 2 different JH3 gene segments in cis at the same Ig locus in said genome; a plurality of JH4 gene segments provided by at least 2 different JH4 gene segments in cis at the same Ig locus in said genome; a plurality of JH5 gene segments provided by at least 2 different JH5 gene segments in cis at the same Ig locus in said genome; and/or a plurality of JH6 gene segments provided by at least 2 different JH6 gene segments in cis at the same Ig locus in said genome; optionally wherein the JH gene segments are derived from the genome sequence of two or more different human individuals.
[0565] Optionally said at least 2 different JH gene segments are human gene segments or synthetic gene segments derived from human gene segments.
[0566] Optionally, the Ig locus is an IgH locus, e.g., an endogenous locus, e.g., a mouse or rat IgH locus.
[0567] In an embodiment, the non-human vertebrate or vertebrate cell of the invention comprises a genome that comprises VH, D and JH gene repertoires comprising human gene segments, the JH gene repertoire (e.g., a human JH gene segment repertoire) comprising a plurality of JH1 gene segments provided by at least 3 different JH1 gene segments; a plurality of JH2 gene segments provided by at least 3 different JH2 gene segments; a plurality of JH3 gene segments provided by at least 3 different JH3 gene segments; a plurality of JH4 gene segments provided by at least 3 different JH4 gene segments; a plurality of JH5 gene segments provided by at least 3 different JH5 gene segments; and/or a plurality of JH6 gene segments provided by at least 3 different JH6 gene segments; optionally wherein the JH gene segments are derived from the genome sequence of two or three different human individuals;
optionally wherein at least 2 or 3 of said different gene segments are provided in cis at the same Ig locus in said genome.
[0568] Optionally said at least 3 different JH gene segments are human gene segments or synthetic gene segments derived from human gene segments.
[0569] Optionally, the Ig locus is an IgH locus, e.g., an endogenous locus, e.g., a mouse or rat IgH locus.
[0570] Optionally in the vertebrate or cell the different human individuals are from different human populations.
[0571] Optionally in the vertebrate or cell the individuals are not genetically related (e.g., going back 3, 4, 5 or 6 generations).
[0572] Optionally in the vertebrate or cell at least one of the different JH segments is a synthetic mutant of a human germline JH gene segment.
[0573] In an embodiment of a non-human vertebrate or vertebrate cell (optionally an ES cell or B-cell) according to the invention, the vertebrate or cell genome comprises human VH, D and JH gene repertoires, the JH gene repertoire (e.g., a human JH gene repertoire) comprising a plurality of JH1 gene segments provided by at least 2 different human JH1 gene segments, optionally in cis at the same Ig locus in said genome;
a plurality of JH2 gene segments provided by at least 2 different human JH2 gene segments, optionally in cis at the same Ig locus in said genome; a plurality of JH3 gene segments provided by at least 2 different human JH3 gene segments, optionally in cis at the same Ig locus in said genome; a plurality of JH4 gene segments provided by at least 2 different human JH4 gene segments, optionally in cis at the same Ig locus in said genome; a plurality of JH5 gene segments provided by at least 2 different human JH5 gene segments, optionally in cis at the same Ig locus in said genome; and/or a plurality of JH6 gene segments provided by at least 2 different human JH6 gene segments, optionally in cis at the same Ig locus in said genome; wherein the JH gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations (e.g., 3, 4, 5 or 6 generations).
[0574] Optionally said at least 2 different JH gene segments are human gene segments or synthetic gene segments derived from human gene segments.
[0575] Optionally, the Ig locus is an IgH locus, e.g., an endogenous locus, e.g., a mouse or rat IgH locus. Optionally in the vertebrate or cell the human individuals are from different human populations.
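The cis and trans arrangements described in this section can be illustrated with a minimal Python sketch, under the assumption that a genome is modelled simply as a list of gene segment records, each noting the segment type, a variant label, the Ig locus and which of the two homologous chromosome copies carries it. The class, the function names and the variant labels are hypothetical and serve only to make the distinction concrete.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    seg_type: str         # "JH1" to "JH6"
    variant: str          # illustrative variant label
    locus: str            # e.g., "IgH"
    chromosome_copy: int  # 0 or 1: which homologous chromosome carries it


def _variants_by_copy(segments, seg_type, locus):
    """Map each chromosome copy to the set of variants of seg_type at locus."""
    copies = defaultdict(set)
    for s in segments:
        if s.seg_type == seg_type and s.locus == locus:
            copies[s.chromosome_copy].add(s.variant)
    return copies


def has_cis(segments, seg_type, locus="IgH"):
    """At least 2 different segments of the type on the same chromosome copy."""
    copies = _variants_by_copy(segments, seg_type, locus)
    return any(len(variants) >= 2 for variants in copies.values())


def has_trans(segments, seg_type, locus="IgH"):
    """At least 2 different segments of the type on different chromosome copies."""
    sets = list(_variants_by_copy(segments, seg_type, locus).values())
    return any(len(a | b) >= 2
               for i, a in enumerate(sets) for b in sets[i + 1:])


# Hypothetical genome: two JH6 variants in cis on copy 0, a third in trans on copy 1.
genome = [
    Segment("JH6", "JH6-ref", "IgH", 0),
    Segment("JH6", "JH6-a", "IgH", 0),
    Segment("JH6", "JH6-b", "IgH", 1),
    Segment("JH5", "JH5-ref", "IgH", 0),
]

print(has_cis(genome, "JH6"))    # True
print(has_trans(genome, "JH6"))  # True
print(has_cis(genome, "JH5"))    # False
```

In this model, the optional third JH gene segment that is cis with one of two trans segments corresponds to a genome for which both `has_cis` and `has_trans` return true for the same segment type.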
JH5
[0576] An embodiment provides a vertebrate, cell or population of the invention whose genome comprises a plurality of JH5 gene segments, wherein the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a nucleotide mutation at one or more positions corresponding to positions
106,330,024 106,330,027 106,330,032 106,330,041 106,330,044 106,330,045 106,330,062 106,330,063 106,330,065 106,330,066 106,330,067 106,330,068 and 106,330,071 on human chromosome 14.
[0577] In the vertebrate, cell or population optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a guanine at a position corresponding to position 106,330,067 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0578] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,330,071 on human chromosome 14 (optionally the additional mutation being a guanine); (ii) position 106,330,066 on human chromosome 14 (optionally the additional mutation being a guanine); and/or (iii) position 106,330,068 on human chromosome 14 (optionally the additional mutation being a thymine).
[0579] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a guanine at a position corresponding to position 106,330,071 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0580] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,330,063 on human chromosome 14 (optionally the additional mutation being an adenine); and/or (ii) position 106,330,067 on human chromosome 14 (optionally the additional mutation being a guanine).
[0581] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a cytosine at a position corresponding to position 106,330,045 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0582] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises an adenine at a position corresponding to position 106,330,044 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0583] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,330,066 on human chromosome 14 (optionally the additional mutation being a guanine); and/or (ii) position 106,330,068 on human chromosome 14 (optionally the additional mutation being a thymine).
[0584] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a guanine at a position corresponding to position 106,330,066 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0585] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,330,067 on human chromosome 14 (optionally the additional mutation being a guanine); and/or (ii) position 106,330,068 on human chromosome 14 (optionally the additional mutation being a thymine).
[0586] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a thymine at a position corresponding to position 106,330,068 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0587] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,330,067 on human chromosome 14 (optionally the additional mutation being a guanine); and/or (ii) position 106,330,066 on human chromosome 14 (optionally the additional mutation being a guanine).
[0588] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a cytosine at a position corresponding to position 106,330,027 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0589] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises an adenine at a position corresponding to position 106,330,024 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0590] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a thymine at a position corresponding to position 106,330,032 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0591] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a thymine at a position corresponding to position 106,330,041 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0592] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises an adenine or thymine at a position corresponding to position 106,330,063 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0593] Optionally the variant comprises additionally a mutation at a position corresponding to position 106,330,071 on human chromosome 14 (optionally the additional mutation being a guanine).
[0594] Optionally the plurality comprises a human JH5 gene variant of SEQ ID NO: 1, wherein the variant comprises a cytosine at a position corresponding to position 106,330,062 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 1.
[0595] Optionally the genome comprises SEQ ID NO:1; optionally in cis at the same Ig locus as one, two or more of the variants.
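The JH5 variant definitions above can be illustrated with a minimal Python sketch, assuming that a variant is recorded only as its differences from SEQ ID NO: 1, i.e., as a mapping from a position on human chromosome 14 to the base present at that position. No sequence data is reproduced here; the function names and the dictionary representation are hypothetical. The position set is the one listed in paragraph [0576].

```python
# Positions on human chromosome 14 listed for JH5 variants of SEQ ID NO: 1.
JH5_POSITIONS = frozenset({
    106_330_024, 106_330_027, 106_330_032, 106_330_041, 106_330_044,
    106_330_045, 106_330_062, 106_330_063, 106_330_065, 106_330_066,
    106_330_067, 106_330_068, 106_330_071,
})
BASES = frozenset("ACGT")


def has_listed_mutation(mutations):
    """True if the variant is mutated at one or more of the listed positions."""
    return any(pos in JH5_POSITIONS for pos in mutations)


def matches(mutations, required, no_further=False):
    """Check a variant (dict: position -> base) against a definition.

    required   -- dict of position -> base that the variant must comprise
    no_further -- if True, the variant may carry no other mutation
    """
    if any(base not in BASES for base in mutations.values()):
        raise ValueError("bases must be one of A, C, G, T")
    if any(mutations.get(pos) != base for pos, base in required.items()):
        return False
    return not no_further or set(mutations) == set(required)


# Guanine at 106,330,067, alone or with an additional guanine at 106,330,071.
single = {106_330_067: "G"}
double = {106_330_067: "G", 106_330_071: "G"}

print(matches(single, {106_330_067: "G"}, no_further=True))  # True
print(matches(double, {106_330_067: "G"}, no_further=True))  # False
print(matches(double, {106_330_067: "G"}))                   # True
```

The `no_further` flag mirrors the recurring option "and optionally no further mutation from the sequence of SEQ ID NO: 1"; the same representation could be used for the JH6 variants of SEQ ID NO: 2 with their own position list.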
JH6
[0596] An embodiment provides a vertebrate, cell or population of the invention whose genome comprises a plurality of JH6 gene segments, wherein the plurality comprises a human JH6 gene variant of SEQ ID NO: 2, wherein the variant comprises a nucleotide mutation at one or more positions corresponding to positions
106,329,411 106,329,413 106,329,414 106,329,417 106,329,419 106,329,426 106,329,434 106,329,435 and 106,329,468 on human chromosome 14.
[0597] Optionally the genome of the vertebrate, cell or population comprises a plurality of JH6 gene segments, wherein the plurality comprises a human JH6 gene variant of SEQ ID NO: 2, wherein the variant comprises a guanine at a position corresponding to position 106,329,435 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 2.
[0598] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,329,468 on human chromosome 14 (optionally the additional mutation being a guanine); (ii) position 106,329,419 on human chromosome 14 (optionally the additional mutation being an adenine); (iii) position 106,329,434 on human chromosome 14 (optionally the additional mutation being a cytosine) and/or position 106,329,414 on human chromosome 14 (optionally the additional mutation being a guanine); (iv) position 106,329,426 on human chromosome 14 (optionally the additional mutation being an adenine); (v) position 106,329,413 on human chromosome 14 (optionally the additional mutation being an adenine); (vi) position 106,329,417 on human chromosome 14 (optionally the additional mutation being a thymine); (vii) position 106,329,411 on human chromosome 14 (optionally the additional mutation being a thymine); (viii) position 106,329,451 on human chromosome 14 (optionally the additional mutation being an adenine); (ix) position 106,329,452 on human chromosome 14 (optionally the additional mutation being a cytosine); and/or (x) position 106,329,453 on human chromosome 14 (optionally the additional mutation being a cytosine).
[0599] Optionally the variant comprises additionally mutations at positions corresponding to position 106,329,451 on human chromosome 14, the additional mutation being an adenine; position 106,329,452 on human chromosome 14, the additional mutation being a cytosine; and position 106,329,453 on human chromosome 14, the additional mutation being a cytosine.
[0600] The vertebrate, cell or population optionally comprises a plurality of JH6 gene segments, wherein the plurality comprises a human JH6 gene variant of SEQ ID NO: 2, wherein the variant comprises a guanine at a position corresponding to position 106,329,468 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 2.
[0601] Optionally the variant comprises additionally a mutation at a position corresponding to position 106,329,435 on human chromosome 14 (optionally the additional mutation being a guanine).
[0602] Optionally the vertebrate, cell or population comprises a plurality of JH6 gene segments, wherein the plurality comprises a human JH6 gene variant of SEQ ID NO: 2, wherein the variant comprises a thymine at a position corresponding to position 106,329,417 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 2.
[0603] Optionally the variant comprises additionally a mutation at a position corresponding to position 106,329,435 on human chromosome 14 (optionally the additional mutation being a guanine).
[0604] Optionally the vertebrate, cell or population comprises a plurality of JH6 gene segments, wherein the plurality comprises a human JH6 gene variant of SEQ ID NO: 2, wherein the variant comprises a cytosine at a position corresponding to position 106,329,434 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 2.
[0605] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,329,414 on human chromosome 14 (optionally the additional mutation being a guanine); and/or (ii) position 106,329,435 on human chromosome 14 (optionally the additional mutation being a guanine).
[0606] Optionally the vertebrate, cell or population comprises a plurality of JH6 gene segments, wherein the plurality comprises a human JH6 gene variant of SEQ ID NO: 2, wherein the variant comprises a thymine at a position corresponding to position 106,329,411 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 2.
[0607] Optionally the variant comprises additionally a mutation at a position corresponding to position 106,329,435 on human chromosome 14 (optionally the additional mutation being a guanine).
[0608] Optionally the vertebrate, cell or population comprises a plurality of JH6 gene segments, wherein the plurality comprises a human JH6 gene variant that is an antisense sequence of a variant described above.
[0609] Optionally the genome comprises SEQ ID NO:2; optionally cis at the same Ig locus as one, two or more of the JH6 variants.
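By way of illustration only, the JH6 positions and optional bases recited in paragraphs [0597] to [0599] can be held in a simple lookup and used to sort the mutations of a candidate variant. This is a minimal sketch: the names `JH6_OPTIONAL_BASES` and `describe_variant` are hypothetical, and the sequence of SEQ ID NO: 2 is not reproduced or assumed here.

```python
# Chromosome 14 positions and the optional base recited for each JH6 position
# in paragraphs [0597]-[0599]. The mapping name is hypothetical.
JH6_OPTIONAL_BASES = {
    106_329_435: "G",
    106_329_468: "G",
    106_329_419: "A",
    106_329_434: "C",
    106_329_414: "G",
    106_329_426: "A",
    106_329_413: "A",
    106_329_417: "T",
    106_329_411: "T",
    106_329_451: "A",
    106_329_452: "C",
    106_329_453: "C",
}


def describe_variant(mutations):
    """Sort a {position: base} map into three groups.

    matching   -- listed position carrying the optionally recited base
    other_base -- listed position carrying a different base
    unlisted   -- position not recited in paragraphs [0597]-[0599]
    """
    matching, other_base, unlisted = {}, {}, {}
    for position, base in mutations.items():
        base = base.upper()
        if position not in JH6_OPTIONAL_BASES:
            unlisted[position] = base
        elif JH6_OPTIONAL_BASES[position] == base:
            matching[position] = base
        else:
            other_base[position] = base
    return matching, other_base, unlisted


# Example: a variant with guanine at both 106,329,435 and 106,329,468.
matching, other_base, unlisted = describe_variant(
    {106_329_435: "g", 106_329_468: "G"}
)
```

Because each recited base is optional in the text, an entry in `other_base` is reported rather than rejected.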
JH2
[0610] An embodiment provides a vertebrate, cell or population of the invention whose genome comprises a plurality of JH2 gene segments, wherein the plurality comprises a human JH2 gene variant of SEQ ID NO: 3, wherein the variant comprises a nucleotide mutation at one or more positions corresponding to positions
106,331,455; 106,331,453; and 106,331,409 on human chromosome 14.
[0611] Optionally the vertebrate, cell or population comprises said plurality of JH2 gene segments, wherein the plurality comprises a human JH2 gene variant of SEQ ID NO: 3, wherein the variant comprises a guanine at a position corresponding to position 106,331,455 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 3.
[0612] Optionally the variant comprises additionally a mutation at a position corresponding to (i) position 106,331,453 on human chromosome 14 (optionally the additional mutation being an adenine); (ii) position 106,331,409 on human chromosome 14 (optionally the additional mutation being an adenine); and/or (iii) position 106,329,434 on human chromosome 14 (optionally the additional mutation being an adenine).
[0613] Optionally the vertebrate, cell or population comprises a plurality of JH2 gene segments, wherein the plurality comprises a human JH2 gene variant of SEQ ID NO: 3, wherein the variant comprises an adenine at a position corresponding to position 106,331,453 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 3.
[0614] Optionally the variant comprises additionally a mutation at a position corresponding to position 106,331,409 on human chromosome 14 (optionally the additional mutation being an adenine).
[0615] Optionally the vertebrate, cell or population comprises a plurality of JH2 gene segments, wherein the plurality comprises a human JH2 gene variant of SEQ ID NO: 3, wherein the variant comprises an adenine at a position corresponding to position 106,331,409 on human chromosome 14; and optionally no further mutation from the sequence of SEQ ID NO: 3.
[0616] Optionally the vertebrate, cell or population comprises a plurality of JH2 gene segments, wherein the plurality comprises a human JH2 gene variant that is an antisense sequence of a variant described above.
[0617] Optionally the genome comprises SEQ ID NO:3; optionally cis at the same Ig locus as one, two or more of the JH2 variants.
[0618] Optionally the genome of the vertebrate, cell or population comprises two or more different JH gene segments selected from SEQ ID NOs: 1 to 3 and variants described above; optionally wherein said JH gene segments are cis at the same immunoglobulin (Ig) locus.
Multiple Human D Gene Segment Variants
[0619] A specific application of this configuration is the provision of multiple human D gene segments as follows (as set out in numbered clauses, starting at clause number 154).
[0620] 154. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 3 human D gene segments of the same type (e.g., D2-2 gene segments), wherein at least two of the human D gene segments are variants that are not identical to each other (e.g., D2-2ref and D2-2a).
[0621] In an example of any aspect of the sixth configuration of the invention (V, D, J or C), one or more or all of the variants are naturally-occurring human gene segments.
[0622] In an example of any aspect of the sixth configuration of the invention (V, D, J or C), one or more of the variants may be a synthetic variant of a human gene segment.
[0623] 155. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous D gene segments of the same type (e.g., D2-2ref and D2-2a) cis at the same Ig locus.
[0624] 156. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different human D gene segments of the same type (e.g., D2-2ref and D2-2a) trans at the same Ig locus; and optionally a third human D gene segment (e.g., D2-2ref, D2-2a or D2-2b) of the same type, wherein the third D is cis with one of said 2 different D gene segments.
[0625] 157. A population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human D gene segments, wherein the repertoire comprises at least 2 different human D gene segments of the same type (e.g., D2-2 gene segments), a first of said different D gene segments (e.g., D2-2ref) being provided in the genome of a first vertebrate of the population, and a second of said different D gene segments (e.g., D2-2a) being provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second D gene segment.
[0626] 158. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous D gene segments of the same type (e.g., human D2-2 gene segments), wherein the D gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0627] 159. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 3 human D gene segments of the same type (e.g., D2-2 gene segments), wherein at least two of the human D gene segments are variants that are not identical to each other (e.g., D2-2ref and D2-2a).
[0628] 160. A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous D gene segments of the same type (e.g., human D2-2 gene segments) cis at the same Ig locus.
[0629] 161. A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different human D gene segments of the same type (e.g., D2-2ref and D2-2a) trans at the same Ig locus; and optionally a third human D gene segment (e.g., D2-2ref, D2-2a or D2-2b) of the same type, wherein the third D is cis with one of said 2 different D gene segments.
[0630] 162. A method of providing an enhanced human immunoglobulin D gene segment repertoire, the method comprising providing a population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human D gene segments, wherein the method comprises providing at least 2 different human D gene segments of the same type (e.g., D2-2ref and D2-2a), wherein a first of said different D gene segments is provided in the genome of a first vertebrate of the population, and a second of said different D gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second D gene segment.
[0631] 163. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous D gene segments of the same type (e.g., D2-2ref and D2-2a), wherein the D gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0632] 164. The vertebrate or cell of clause 154, 156 or 158, or the method of clause 159, 161 or 163, wherein at least 2 or 3 of said different gene segments are provided cis at the same Ig locus in said genome.
[0633] 165. The vertebrate or cell of clause 154, 155 or 156, or the method of any one of clauses 159 to 162 and 164, wherein the D gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0634] 166. The vertebrate or cell of any one of clauses 154 to 157, or the method of any one of clauses 159 to 162 and 165, wherein the D gene segments are derived from the genome sequence of two or more different human individuals; optionally wherein the different human individuals are from different human populations.
[0635] 167. The vertebrate, cell or method of clause 166, wherein the individuals are not genetically related.
[0636] 168. The vertebrate, cell or method of any one of clauses 154 to 167, wherein at least one of the different D segments is a synthetic mutant of a human germline D gene segment.
[0637] 169. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 human D gene segments of the same type (e.g., D2-2ref and D2-2a), wherein the D gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations; optionally wherein at least 2 or 3 of said different gene segments are provided at the same IgH locus in said genome.
[0638] 170. The vertebrate or cell of any one of clauses 154 to 158 and 164 to 168, wherein the genome comprises a substantially complete functional repertoire of human D gene segment types supplemented with one, two or more variant human D gene segments, wherein said substantially complete functional repertoire and the supplementary D gene segments are not found together in the germline genome of a human individual.
[0639] 171. The population of clause 157, wherein the population comprises a substantially complete functional repertoire of human D gene segment types supplemented with one, two or more variant human D gene segments, wherein said substantially complete functional repertoire and the supplementary D gene segments are not found together in the germline genome of a human individual.
[0640] 172. A non-human vertebrate (e.g., a mouse or rat) or a non-human cell (e.g., an ES cell or a B-cell) having a genome comprising a substantially complete functional repertoire of human D gene segment types supplemented with one, two or more variant human D gene segments, wherein said substantially complete functional repertoire and the supplementary D gene segments are not found together in the germline genome of a human individual.
[0641] 173. A population of non-human vertebrates (e.g., mice or rats) comprising a substantially complete functional repertoire of human JH gene segment types supplemented with one, two or more variant human D gene segments, wherein said substantially complete functional repertoire and the supplementary D gene segments are not found together in the germline genome of a human individual.
[0642] 174. The vertebrate or cell of clause 172 or the population of clause 173, comprising first and second D gene segments selected from D2-2ref and D2-2a; or D2-21ref and D2-21a; or D3-10ref and D3-10a; or D3-16ref and D3-16a; or D2-8ref and D2-8a; or D3-3ref and D3-3a; or D4-23ref and D4-23a; or D6-13ref and D6-13a; or D3-9ref and D3-9a; or D4-4ref and D4-4a; or D7-27ref and D7-27a.
[0643] Optionally wherein the first and/or second D gene segment is present in two or more copies.
[0644] For example, there are provided two or three copies of the first gene segment, optionally with one, two or three copies of the second gene segment. Copies can be arranged in cis or trans.
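The cis and trans arrangements referred to in the clauses above can be illustrated with a small model. The following is a sketch only, under the assumption that "cis" means on the same chromosome copy of the Ig locus and "trans" means on different chromosome copies of that locus; each copy is represented as a list of segment names, and all function names are hypothetical.

```python
import re
from itertools import combinations


def segment_type(name):
    """Return the segment type of a variant name, e.g. 'D2-2a' -> 'D2-2'."""
    match = re.fullmatch(r"(D\d+-\d+)(ref|[a-z])?", name)
    if match is None:
        raise ValueError(f"unrecognised D segment name: {name!r}")
    return match.group(1)


def variants_of(copy, seg_type):
    """Set of distinct variants of seg_type carried on one chromosome copy."""
    return {s for s in copy if segment_type(s) == seg_type}


def has_different_variants_in_cis(locus_copies, seg_type):
    """True if a single copy carries at least 2 different variants."""
    return any(len(variants_of(copy, seg_type)) >= 2 for copy in locus_copies)


def has_different_variants_in_trans(locus_copies, seg_type):
    """True if two different copies carry variants that differ from each other."""
    for first, second in combinations(locus_copies, 2):
        a = variants_of(first, seg_type)
        b = variants_of(second, seg_type)
        if any(x != y for x in a for y in b):
            return True
    return False


# Two different D2-2 variants in trans, with a third variant (D2-2b)
# in cis with D2-2ref, in the manner of clause 156.
genome = [["D2-2ref", "D2-2b", "D3-3ref"], ["D2-2a", "D3-3ref"]]
in_cis = has_different_variants_in_cis(genome, "D2-2")
in_trans = has_different_variants_in_trans(genome, "D2-2")
```

The model ignores segment order and copy number within a chromosome; it only records which variants are present on which copy.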
[0645] 175. The vertebrate, cell or population of clause 174, comprising human gene segments D2-2ref and D2-2a; and D3-3ref and D3-3a; and optionally also D2-15.
[0646] In an example, the vertebrate, cell or population comprises one or more D segments selected from human D3-3, D2-15, D3-9, D4-17, D3-10, D2-2, D5-24, D6-19, D3-22, D6-13, D5-12, D1-26, D1-20, D5-18, D3-16, D2-21, D1-14, D7-27, D1-1, D6-25, D2-14 and D4-23 (e.g., selected from D3-9*01, D4-17*01, D3-10*01, D2-2*02, D5-24*01, D6-19*01, D3-22*01, D6-13*01, D5-12*01, D1-26*01, D1-20*01, D5-18*01, D3-16*02, D2-21*02, D1-14*01, D7-27*02, D1-1*01, D6-25*01, D2-15*01 and D4-23*01), together with the reference sequence(s) of said selected segment(s). These were found in variable domains having an HCDR3 length of at least 20 amino acids (see examples herein).
[0647] 176. A non-human vertebrate or vertebrate cell according to clause 155, comprising a genome that comprises VH, D and JH gene repertoires comprising human gene segments, the D gene repertoire comprising one or more of
[0648] a plurality of D2-2 gene segments provided by at least 2 different D2-2 gene segments in cis at the same Ig locus in said genome;
[0649] a plurality of D2-21 gene segments provided by at least 2 different D2-21 gene segments in cis at the same Ig locus in said genome;
[0650] a plurality of D3-10 gene segments provided by at least 2 different D3-10 gene segments in cis at the same Ig locus in said genome;
[0651] a plurality of D3-16 gene segments provided by at least 2 different D3-16 gene segments in cis at the same Ig locus in said genome;
[0652] a plurality of D2-8 gene segments provided by at least 2 different D2-8 gene segments in cis at the same Ig locus in said genome;
[0653] a plurality of D3-3 gene segments provided by at least 2 different D3-3 gene segments in cis at the same Ig locus in said genome;
[0654] a plurality of D4-23 gene segments provided by at least 2 different D4-23 gene segments in cis at the same Ig locus in said genome;
[0655] a plurality of D6-13 gene segments provided by at least 2 different D6-13 gene segments in cis at the same Ig locus in said genome;
[0656] a plurality of D3-9 gene segments provided by at least 2 different D3-9 gene segments in cis at the same Ig locus in said genome;
[0657] a plurality of D4-4 gene segments provided by at least 2 different D4-4 gene segments in cis at the same Ig locus in said genome; and
[0658] a plurality of D7-27 gene segments provided by at least 2 different D7-27 gene segments in cis at the same Ig locus in said genome;
[0659] optionally wherein the D gene segments are derived from the genome sequence of two or more different human individuals.
[0660] 177. A non-human vertebrate or vertebrate cell according to clause 155, comprising a genome that comprises VH, D and JH gene repertoires comprising human gene segments, the D gene repertoire comprising one or more of
[0661] a plurality of D2-2 gene segments provided by at least 2 different D2-2 gene segments in trans in said genome;
[0662] a plurality of D2-21 gene segments provided by at least 2 different D2-21 gene segments in trans in said genome;
[0663] a plurality of D3-10 gene segments provided by at least 2 different D3-10 gene segments in trans in said genome;
[0664] a plurality of D3-16 gene segments provided by at least 2 different D3-16 gene segments in trans in said genome;
[0665] a plurality of D2-8 gene segments provided by at least 2 different D2-8 gene segments in trans in said genome;
[0666] a plurality of D3-3 gene segments provided by at least 2 different D3-3 gene segments in trans in said genome;
[0667] a plurality of D4-23 gene segments provided by at least 2 different D4-23 gene segments in trans in said genome;
[0668] a plurality of D6-13 gene segments provided by at least 2 different D6-13 gene segments in trans in said genome;
[0669] a plurality of D3-9 gene segments provided by at least 2 different D3-9 gene segments in trans in said genome;
[0670] a plurality of D4-4 gene segments provided by at least 2 different D4-4 gene segments in trans in said genome; and
[0671] a plurality of D7-27 gene segments provided by at least 2 different D7-27 gene segments in trans in said genome;
[0672] optionally wherein the D gene segments are derived from the genome sequence of two or more different human individuals.
[0673] 178. A non-human vertebrate or vertebrate cell (optionally an ES cell or B-cell), according to clause 154, comprising a genome that comprises VH, D and JH gene repertoires comprising human gene segments, the D gene repertoire comprising one or more of a plurality of D2-2 gene segments provided by at least 3 different D2-2 gene segments; a plurality of D2-21 gene segments provided by at least 3 different D2-21 gene segments; a plurality of D3-10 gene segments provided by at least 3 different D3-10 gene segments; a plurality of D3-16 gene segments provided by at least 3 different D3-16 gene segments; a plurality of D2-8 gene segments provided by at least 3 different D2-8 gene segments; a plurality of D3-3 gene segments provided by at least 3 different D3-3 gene segments; a plurality of D4-23 gene segments provided by at least 3 different D4-23 gene segments; a plurality of D6-13 gene segments provided by at least 3 different D6-13 gene segments; a plurality of D3-9 gene segments provided by at least 3 different D3-9 gene segments; a plurality of D4-4 gene segments provided by at least 3 different D4-4 gene segments; and a plurality of D7-27 gene segments provided by at least 3 different D7-27 gene segments; optionally wherein the D gene segments are derived from the genome sequence of two or three different human individuals; optionally wherein at least 2 or 3 of said different gene segments are provided in cis at the same Ig locus in said genome.
[0674] 179. The vertebrate or cell of clause 176, 177 or 178, wherein the different human individuals are from different human populations.
[0675] 180. The vertebrate or cell of any one of clauses 176 to 179, wherein the individuals are not genetically related.
[0676] 181. The vertebrate or cell of any one of clauses 176 to 180, wherein at least one of the different D segments is a synthetic mutant of a human germline D gene segment.
[0677] 182. A non-human vertebrate or vertebrate cell (optionally an ES cell or B-cell) according to clause 158, comprising a genome comprising human VH, D and JH gene repertoires, the D gene repertoire comprising one or more of a plurality of D2-2 gene segments provided by at least 2 different D2-2 gene segments, optionally in cis in said genome;
[0678] a plurality of D2-21 gene segments provided by at least 2 different D2-21 gene segments, optionally in cis in said genome;
[0679] a plurality of D3-10 gene segments provided by at least 2 different D3-10 gene segments, optionally in cis in said genome;
[0680] a plurality of D3-16 gene segments provided by at least 2 different D3-16 gene segments, optionally in cis in said genome;
[0681] a plurality of D2-8 gene segments provided by at least 2 different D2-8 gene segments, optionally in cis in said genome;
[0682] a plurality of D3-3 gene segments provided by at least 2 different D3-3 gene segments, optionally in cis in said genome;
[0683] a plurality of D4-23 gene segments provided by at least 2 different D4-23 gene segments, optionally in cis in said genome;
[0684] a plurality of D6-13 gene segments provided by at least 2 different D6-13 gene segments, optionally in cis in said genome;
[0685] a plurality of D3-9 gene segments provided by at least 2 different D3-9 gene segments, optionally in cis in said genome;
[0686] a plurality of D4-4 gene segments provided by at least 2 different D4-4 gene segments, optionally in cis in said genome; and
[0687] a plurality of D7-27 gene segments provided by at least 2 different D7-27 gene segments, optionally in cis in said genome;
[0688] wherein the D gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0689] 183. The vertebrate or cell of clause 182, wherein the human individuals are from different human populations.
[0690] 184. The vertebrate, cell or population of any one of clauses 154 to 183, wherein one or more of the D gene segments is a variant of a human germline D gene segment, wherein the variant gene segment encodes an amino acid sequence that differs by 1, 2 or 3 amino acids from the corresponding amino acid sequence encoded by the human germline D gene segment, provided that said amino acid sequence encoded by the variant does not include a stop codon when said corresponding amino acid sequence does not include a stop codon.
[0691] Optionally, the variant and germline D gene segments encode the respective amino acid sequences in reading frame 2 (IMGT numbering). See Briney et al 2012.
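The condition of clause 184 can be illustrated as a check on two nucleotide sequences. The following is a minimal sketch: the function names are hypothetical, the standard genetic code is assumed, the reading frame is passed as a plain 0-based nucleotide offset (no mapping to IMGT reading-frame numbering is assumed), and the sequences in the example are toy sequences, not actual human D gene segments.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(seq, frame_offset=0):
    """Translate complete codons of seq, starting at a 0-based offset."""
    seq = seq.upper()[frame_offset:]
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def meets_clause_184(germline, variant, frame_offset=0):
    """Check the amino acid difference and stop codon conditions.

    Assumes both sequences have the same length in the chosen frame.
    """
    germline_aa = translate(germline, frame_offset)
    variant_aa = translate(variant, frame_offset)
    if len(germline_aa) != len(variant_aa):
        return False
    differences = sum(a != b for a, b in zip(germline_aa, variant_aa))
    if differences not in (1, 2, 3):
        return False
    # The variant must not introduce a stop codon where the germline has none.
    if "*" not in germline_aa and "*" in variant_aa:
        return False
    return True


# Toy example: one codon changes from TAT (Y) to TCT (S).
ok = meets_clause_184("GGTTATTGT", "GGTTCTTGT")
```

A silent nucleotide change gives zero amino acid differences and therefore does not meet the condition as modelled here.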
[0692] 185. The vertebrate, cell or population of clause 184, wherein said corresponding amino acid sequence encoded by the human germline D gene segment is a hydrophilic or hydrophobic sequence (according to J Mol. Biol. 1997 Jul. 25; 270(4):587-97; Corbett S J et al; Table 2).
[0693] 186. The vertebrate, cell or population of clause 184 or 185, comprising said variant and said germline human D gene segments; optionally wherein the variant and germline human D gene segments are cis on the same chromosome.
[0694] 187. The vertebrate, cell or population of any one of clauses 184 to 186, wherein the germline human D gene segment is a D2, D3, D5 or D6 family gene segment; optionally a D2-2, D2-15, D3-3, D3-9, D3-10, D3-22, D5-5, D5-18, D6-6, D6-13 or D6-19 gene segment.
[0695] These D segments are usable in all three reading frames.
[0696] Optionally a variant of 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 or all of these human germline D gene segments is used.
[0697] 188. The vertebrate, cell or population of any one of clauses 154 to 187, comprising a plurality of D2-2 gene segments, wherein the plurality comprises D2-2 gene segments that vary from each other at one or more nucleotide positions corresponding to positions 106,382,687 and 106,382,711
[0698] on human chromosome 14.
[0699] 189. The vertebrate, cell or population of clause 188, wherein the plurality comprises a human D2-2 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,382,687 on human chromosome 14; and optionally no further mutation from the sequence of D2-2ref.
[0700] 190. The vertebrate, cell or population of clause 188 or 189, wherein the plurality comprises a human D2-2 gene segment comprising a cytosine at a position corresponding to position 106,382,687 on human chromosome 14; and optionally no further mutation from the sequence of D2-2a.
[0701] 191. The vertebrate, cell or population of any one of clauses 188 to 190, wherein the plurality comprises a human D2-2 gene segment comprising an adenine at a position corresponding to position 106,382,711 on human chromosome 14; and optionally no further mutation from the sequence of D2-2b.
[0702] 192. The vertebrate, cell or population of any one of clauses 188 to 191, wherein the plurality comprises a human D2-2 gene segment comprising a thymine at a position corresponding to position 106,382,711 on human chromosome 14; and optionally no further mutation from the sequence of D2-2ref.
[0703] 193. The vertebrate, cell or population of any one of clauses 154 to 192, comprising a plurality of D7-27 gene segments, wherein the plurality comprises D7-27 gene segments that vary from each other at a nucleotide position corresponding to position 106,331,767 on human chromosome 14.
[0704] 194. The vertebrate, cell or population of clause 193, wherein the plurality comprises a human D7-27 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,331,767 on human chromosome 14; and optionally no further mutation from the sequence of D7-27ref.
[0705] 195. The vertebrate, cell or population of clause 193 or 194, wherein the plurality comprises a human D7-27 gene segment comprising a guanine at a position corresponding to position 106,331,767 on human chromosome 14; and optionally no further mutation from the sequence of D7-27a.
[0706] 196. The vertebrate, cell or population of any one of clauses 154 to 195, comprising a plurality of D4-23 gene segments, wherein the plurality comprises D4-23 gene segments that vary from each other at a nucleotide position corresponding to position 106,350,740 on human chromosome 14.
[0707] 197. The vertebrate, cell or population of clause 196, wherein the plurality comprises a human D4-23 gene segment (optionally two copies and/or in homozygous state) comprising an adenine at a position corresponding to position 106,350,740 on human chromosome 14; and optionally no further mutation from the sequence of D4-23ref.
[0708] 198. The vertebrate, cell or population of clause 196 or 197, wherein the plurality comprises a human D4-23 gene segment (optionally two copies and/or in homozygous state) comprising a guanine at a position corresponding to position 106,350,740 on human chromosome 14; and optionally no further mutation from the sequence of D4-23a.
[0709] 199. The vertebrate, cell or population of any one of clauses 154 to 197, comprising a plurality of D2-21 gene segments, wherein the plurality comprises D2-21 gene segments that vary from each other at a nucleotide position corresponding to position 106,354,418 on human chromosome 14.
[0710] 200. The vertebrate, cell or population of clause 199, wherein the plurality comprises a human D2-21 gene segment (optionally two copies and/or in homozygous state) comprising an adenine at a position corresponding to position 106,354,418 on human chromosome 14; and optionally no further mutation from the sequence of D2-21ref.
[0711] 201. The vertebrate, cell or population of clause 199 or 200, wherein the plurality comprises a human D2-21 gene segment (optionally two copies and/or in homozygous state) comprising a guanine at a position corresponding to position 106,354,418 on human chromosome 14; and optionally no further mutation from the sequence of D2-21a.
[0712] 202. The vertebrate, cell or population of any one of clauses 154 to 201, comprising a plurality of D3-16 gene segments, wherein the plurality comprises D3-16 gene segments that vary from each other at a nucleotide position corresponding to position 106,361,515 on human chromosome 14.
[0713] 203. The vertebrate, cell or population of clause 202, wherein the plurality comprises a human D3-16 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,361,515 on human chromosome 14; and optionally no further mutation from the sequence of D3-16ref.
[0714] 204. The vertebrate, cell or population of clause 202 or 203, wherein the plurality comprises a human D3-16 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,361,515 on human chromosome 14; and optionally no further mutation from the sequence of D3-16a.
[0715] 205. The vertebrate, cell or population of any one of clauses 154 to 204, comprising a plurality of D6-13 gene segments, wherein the plurality comprises D6-13 gene segments that vary from each other at a nucleotide position corresponding to position 106,367,013 on human chromosome 14.
[0716] 206. The vertebrate, cell or population of clause 205, wherein the plurality comprises a human D6-13 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,367,013 on human chromosome 14; and optionally no further mutation from the sequence of D6-13ref.
[0717] 207. The vertebrate, cell or population of clause 205 or 206, wherein the plurality comprises a human D6-13 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,367,013 on human chromosome 14; and optionally no further mutation from the sequence of D6-13a.
[0718] 208. The vertebrate, cell or population of any one of clauses 154 to 207, comprising a plurality of D3-10 gene segments, wherein the plurality comprises D3-10 gene segments that vary from each other at one or more nucleotide positions corresponding to positions
[0719] 106,370,370 and
[0720] 106,370,371
[0721] on human chromosome 14.
[0722] 209. The vertebrate, cell or population of clause 208, wherein the plurality comprises a human D3-10 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,370,370 on human chromosome 14; and optionally no further mutation from the sequence of D3-10ref.
[0723] 210. The vertebrate, cell or population of clause 208 or 209, wherein the plurality comprises a human D3-10 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,370,370 on human chromosome 14; and optionally no further mutation from the sequence of D3-10a.
[0724] 211. The vertebrate, cell or population of clause 208, 209 or 210 wherein the plurality comprises a human D3-10 gene segment (optionally two copies and/or in homozygous state) comprising an adenine at a position corresponding to position 106,370,371 on human chromosome 14; and optionally no further mutation from the sequence of D3-10ref.
[0725] 212. The vertebrate, cell or population of any one of clauses 208 to 211, wherein the plurality comprises a human D3-10 gene segment (optionally two copies and/or in homozygous state) comprising a guanine at a position corresponding to position 106,370,371 on human chromosome 14; and optionally no further mutation from the sequence of D3-10b.
[0726] 213. The vertebrate, cell or population of any one of clauses 154 to 212, comprising a plurality of D3-9 gene segments, wherein the plurality comprises D3-9 gene segments that vary from each other at a nucleotide position corresponding to position 106,370,567 on human chromosome 14.
[0727] 214. The vertebrate, cell or population of clause 213, wherein the plurality comprises a human D3-9 gene segment (optionally two copies and/or in homozygous state) comprising an adenine at a position corresponding to position 106,370,567 on human chromosome 14; and optionally no further mutation from the sequence of D3-9ref.
[0728] 215. The vertebrate, cell or population of clause 213 or 214, wherein the plurality comprises a human D3-9 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,370,567 on human chromosome 14; and optionally no further mutation from the sequence of D3-9a.
[0729] 216. The vertebrate, cell or population of any one of clauses 154 to 215, comprising a plurality of D2-8 gene segments, wherein the plurality comprises D2-8 gene segments that vary from each other at one or more nucleotide positions corresponding to positions
[0730] 106,373,085; 106,373,086 and 106,373,089 on human chromosome 14.
[0731] 217. The vertebrate, cell or population of clause 216, wherein the plurality comprises a human D2-8 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,373,085 on human chromosome 14.
[0732] 218. The vertebrate, cell or population of clause 216 or 217, wherein the plurality comprises a human D2-8 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,373,085 on human chromosome 14; and optionally no further mutation from the sequence of D2-8b.
[0733] 219. The vertebrate, cell or population of clause 216, 217 or 218 wherein the plurality comprises a human D2-8 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,373,086 on human chromosome 14; and
[0734] optionally no further mutation from the sequence of D2-8ref.
[0735] 220. The vertebrate, cell or population of any one of clauses 216 to 219, wherein the plurality comprises a human D2-8 gene segment comprising a thymine at a position corresponding to position 106,373,086 on human chromosome 14; and optionally no further mutation from the sequence of D2-8ref.
[0736] 221. The vertebrate, cell or population of any one of clauses 154 to 220, comprising a plurality of D4-4 gene segments, wherein the plurality comprises D4-4 gene segments that vary from each other at one or more nucleotide positions corresponding to positions
[0737] 106,379,086; and 106,379,089
[0738] on human chromosome 14.
[0739] 222. The vertebrate, cell or population of clause 221, wherein the plurality comprises a D4-4 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,379,086 on human chromosome 14; and optionally no further mutation from the sequence of D4-4ref.
[0740] 223. The vertebrate, cell or population of clause 221 or 222, wherein the plurality comprises a human D4-4 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,379,086 on human chromosome 14; and optionally no further mutation from the sequence of D4-4a.
[0741] 224. The vertebrate, cell or population of clause 221, 222 or 223 wherein the plurality comprises a human D4-4 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,379,089 on human chromosome 14; and optionally no further mutation from the sequence of D4-4ref or a cytosine at a position corresponding to position 106,379,086 on human chromosome 14.
[0742] 225. The vertebrate, cell or population of any one of clauses 221 to 224, wherein the plurality comprises a human D4-4 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,379,089 on human chromosome 14; and optionally no further mutation from the sequence of D4-4a.
[0743] 226. The vertebrate, cell or population of any one of clauses 154 to 225, comprising a plurality of D3-3 gene segments, wherein the plurality comprises D3-3 gene segments that vary from each other at one or more nucleotide positions corresponding to positions
[0744] 106,380,241; and 106,380,246
[0745] on human chromosome 14.
[0746] 227. The vertebrate, cell or population of clause 226, wherein the plurality comprises a D3-3 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,380,241 on human chromosome 14; and optionally no further mutation from the sequence of D3-3ref.
[0747] 228. The vertebrate, cell or population of clause 226 or 227, wherein the plurality comprises a human D3-3 gene segment (optionally two copies and/or in homozygous state) comprising a cytosine at a position corresponding to position 106,380,241 on human chromosome 14; and optionally no further mutation from the sequence of D3-3a.
[0748] 229. The vertebrate, cell or population of clause 226, 227 or 228 wherein the plurality comprises a human D3-3 gene segment (optionally two copies and/or in homozygous state) comprising an adenine at a position corresponding to position 106,380,246 on human chromosome 14; and optionally no further mutation from the sequence of D3-3ref.
[0749] 230. The vertebrate, cell or population of any one of clauses 226 to 229, wherein the plurality comprises a human D3-3 gene segment (optionally two copies and/or in homozygous state) comprising a thymine at a position corresponding to position 106,380,246 on human chromosome 14; and optionally no further mutation from the sequence of D3-3a.
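As a rough illustration of the condition recited in clauses 208 to 212, that a plurality of gene segments "vary from each other" at stated positions on human chromosome 14, the following Python sketch reports the positions at which a set of variants does not share a single nucleotide. The `SegmentVariant` record and the `varying_positions` helper are hypothetical names introduced only for this illustration. The nucleotide calls follow clauses 209 to 212 for D3-10; where a clause is silent on a position (e.g., position 106,370,371 of D3-10a), the reference base is assumed.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class SegmentVariant:
    """Hypothetical record: a named gene segment variant and its base calls."""
    name: str
    bases: Dict[int, str]  # chromosome 14 position -> nucleotide


def varying_positions(variants: Iterable[SegmentVariant],
                      positions: Iterable[int]) -> List[int]:
    """Return the positions at which the variants carry more than one base."""
    variants = list(variants)
    result = []
    for pos in positions:
        observed = {v.bases[pos] for v in variants if pos in v.bases}
        if len(observed) > 1:
            result.append(pos)
    return result


D3_10_POSITIONS = (106_370_370, 106_370_371)

# Base calls per clauses 209-212; unspecified positions assume the reference base.
d3_10_ref = SegmentVariant("D3-10ref", {106_370_370: "T", 106_370_371: "A"})
d3_10_a = SegmentVariant("D3-10a", {106_370_370: "C", 106_370_371: "A"})
d3_10_b = SegmentVariant("D3-10b", {106_370_370: "T", 106_370_371: "G"})

plurality = [d3_10_ref, d3_10_a, d3_10_b]
print(varying_positions(plurality, D3_10_POSITIONS))  # [106370370, 106370371]
```

A plurality holding only D3-10ref and D3-10a would, under the same assumptions, vary at position 106,370,370 alone.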
Multiple Human JL Gene Segment Variants
[0750] A specific application of this configuration is the provision of multiple human JL gene segments (JK and/or JA) as follows (as set out in numbered paragraphs, starting at paragraph number 80).
[0751] 80. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 3 human JL gene segments of the same type (e.g., JK1), wherein at least two of the human JL gene segments are variants that are not identical to each other.
[0752] 81. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous JL gene segments of the same type (e.g., JK1) cis at the same Ig locus.
[0753] 82. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different human JL gene segments of the same type (e.g., JK1) trans at the same Ig locus; and optionally a third human JL gene segment of the same type, wherein the third JL is cis with one of said 2 different JL gene segments.
[0754] 83. A population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human JL gene segments, wherein the repertoire comprises at least 2 different human JL gene segments of the same type (e.g., JK1), a first of said different JL gene segments is provided in the genome of a first vertebrate of the population, and a second of said different JL gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second JL gene segment.
[0755] 84. A non-human vertebrate (e.g., a mouse or rat) or a non-human vertebrate cell (e.g., an ES cell or a B-cell) having a genome comprising at least 2 different non-endogenous JL gene segments of the same type (e.g., JK1), wherein the JL gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0756] 85. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 3 human JL gene segments of the same type (e.g., JK1), wherein at least two of the human JL gene segments are variants that are not identical to each other.
[0757] 86. A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous JL gene segments of the same type (e.g., JK1) cis at the same Ig locus.
[0758] 87. A method of enhancing the immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different human JL gene segments of the same type (e.g., JK1) trans at the same Ig locus; and optionally a third human JL gene segment of the same type, wherein the third JL is cis with one of said 2 different JL gene segments.
[0759] 88. A method of providing an enhanced human immunoglobulin JL gene segment repertoire, the method comprising providing a population of non-human vertebrates (e.g., mice or rats) comprising a repertoire of human JL gene segments, wherein the method comprises providing at least 2 different human JL gene segments of the same type (e.g., JK1), wherein a first of said different JL gene segments is provided in the genome of a first vertebrate of the population, and a second of said different JL gene segments is provided in the genome of a second vertebrate of the population, wherein the genome of the first vertebrate does not comprise the second JL gene segment.
[0760] 89. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 different non-endogenous JL gene segments of the same type (e.g., JK1), wherein the JL gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0761] 90. The vertebrate or cell of paragraph 80, 82 or 84, or the method of paragraph 85, 87 or 89, wherein at least 2 or 3 of said different gene segments are provided cis at the same Ig locus in said genome.
[0762] 91. The vertebrate or cell of paragraph 80, 81 or 82, or the method of paragraph 85, 86 or 87, wherein the JL gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0763] 92. The vertebrate or cell of paragraph 80, 81 or 82, or the method of paragraph 85, 86 or 87, wherein the JL gene segments are derived from the genome sequence of two or more different human individuals; optionally wherein the different human individuals are from different human populations.
[0764] 93. The vertebrate, cell or method of paragraph 92, wherein the individuals are not genetically related.
[0765] 94. The vertebrate, cell or method of any one of paragraphs 80 to 93, wherein at least one of the different JL segments is a synthetic mutant of a human germline JL gene segment.
[0766] 95. A method of enhancing the human immunoglobulin gene diversity of a non-human vertebrate (e.g., a mouse or rat), the method comprising providing the vertebrate with a genome comprising at least 2 human JL gene segments of the same type (e.g., JK1),
[0767] wherein the JL gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations; optionally wherein at least 2 or 3 of said different gene segments are provided at the same IgL locus in said genome.
[0768] 96. The vertebrate or cell of any one of paragraphs 80 to 82 and 84, wherein the genome comprises a substantially complete functional repertoire of human JK and/or JA gene segment types supplemented with one, two or more human JK and/or JA gene segments respectively, wherein said substantially complete functional repertoire and the supplementary gene segments are not found together in the germline genome of a human individual.
[0769] 97. The population of paragraph 83, wherein the population comprises a substantially complete functional repertoire of human JL gene segment types supplemented with one, two or more human JK and/or JA gene segments respectively, wherein said substantially complete functional repertoire and the supplementary gene segments are not found together in the germline genome of a human individual.
[0770] 98. A non-human vertebrate (e.g., a mouse or rat) or a non-human cell (e.g., an ES cell or a B-cell) having a genome comprising a substantially complete functional repertoire of human JK and/or JA gene segment types supplemented with one, two or more human JK and/or JA gene segments respectively, wherein said substantially complete functional repertoire and the supplementary gene segments are not found together in the germline genome of a human individual.
[0771] 99. A population of non-human vertebrates (e.g., mice or rats) comprising a substantially complete functional repertoire of human JK and/or JA gene segment types supplemented with one, two or more human JK and/or JA gene segments respectively, wherein said substantially complete functional repertoire and the supplementary gene segments are not found together in the germline genome of a human individual.
[0772] 100. A non-human vertebrate or vertebrate cell according to paragraph 81, comprising a genome that comprises VL and JL gene repertoires comprising human gene segments, the JL gene repertoire comprising
[0773] a plurality of human JK1 gene segments provided by at least 2 different human JK1 gene segments in cis at the same Ig locus in said genome;
[0774] a plurality of human JK2 gene segments provided by at least 2 different human JK2 gene segments in cis at the same Ig locus in said genome;
[0775] a plurality of human JK3 gene segments provided by at least 2 different human JK3 gene segments in cis at the same Ig locus in said genome;
[0776] a plurality of human JK4 gene segments provided by at least 2 different human JK4 gene segments in cis at the same Ig locus in said genome;
[0777] a plurality of human JK5 gene segments provided by at least 2 different human JK5 gene segments in cis at the same Ig locus in said genome;
[0778] a plurality of human JA1 gene segments provided by at least 2 different human JA1 gene segments in cis at the same Ig locus in said genome;
[0779] a plurality of human JA2 gene segments provided by at least 2 different human JA2 gene segments in cis at the same Ig locus in said genome;
[0780] a plurality of human JA3 gene segments provided by at least 2 different human JA3 gene segments in cis at the same Ig locus in said genome;
[0781] a plurality of human JA4 gene segments provided by at least 2 different human JA4 gene segments in cis at the same Ig locus in said genome;
[0782] a plurality of human JA5 gene segments provided by at least 2 different human JA5 gene segments in cis at the same Ig locus in said genome;
[0783] a plurality of human JA6 gene segments provided by at least 2 different human JA6 gene segments in cis at the same Ig locus in said genome; or
[0784] a plurality of human JA7 gene segments provided by at least 2 different human JA7 gene segments in cis at the same Ig locus in said genome;
[0785] optionally wherein the JL gene segments are derived from the genome sequence of two or more different human individuals.
[0786] 101. A non-human vertebrate or vertebrate cell (optionally an ES cell or B-cell), according to paragraph 80, comprising a genome that comprises VL and JL gene repertoires comprising human gene segments, the JL gene repertoire comprising
[0787] a plurality of human JK1 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JK1 gene segments;
[0788] a plurality of human JK2 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JK2 gene segments;
[0789] a plurality of human JK3 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JK3 gene segments;
[0790] a plurality of human JK4 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JK4 gene segments;
[0791] a plurality of human JK5 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JK5 gene segments;
[0792] a plurality of human JA1 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JA1 gene segments;
[0793] a plurality of human JA2 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JA2 gene segments;
[0794] a plurality of human JA3 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JA3 gene segments;
[0795] a plurality of human JA4 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JA4 gene segments;
[0796] a plurality of human JA5 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JA5 gene segments;
[0797] a plurality of human JA6 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JA6 gene segments; or
[0798] a plurality of human JA7 gene segments provided by at least 3 (e.g., 3, 4, 5, 6, or 7) different human JA7 gene segments;
[0799] optionally wherein the JL gene segments are derived from the genome sequence of two or three
[0800] different human individuals;
[0801] optionally wherein at least 2 or 3 of said different gene segments are provided in cis at the same Ig locus in said genome.
[0802] 102. The vertebrate or cell of paragraph 100 or 101, wherein the different human individuals are from different human populations.
[0803] 103. The vertebrate or cell of any one of paragraphs 100 to 102, wherein the individuals are not genetically related.
[0804] 104. The vertebrate or cell of any one of paragraphs 100 to 103, wherein at least one of the different JL segments is a synthetic mutant of a human germline JL gene segment.
[0805] 105. A non-human vertebrate or vertebrate cell (optionally an ES cell or B-cell) according to paragraph 84, comprising a genome comprising human VL and JL gene repertoires, the JL gene repertoire comprising
[0806] a plurality of human JK1 gene segments provided by at least 2 different human JK1 gene segments, optionally in cis at the same Ig locus in said genome;
[0807] a plurality of human JK2 gene segments provided by at least 2 different human JK2 gene segments, optionally in cis at the same Ig locus in said genome;
[0808] a plurality of human JK3 gene segments provided by at least 2 different human JK3 gene segments, optionally in cis at the same Ig locus in said genome;
[0809] a plurality of human JK4 gene segments provided by at least 2 different human JK4 gene segments, optionally in cis at the same Ig locus in said genome;
[0810] a plurality of human JK5 gene segments provided by at least 2 different human JK5 gene segments, optionally in cis at the same Ig locus in said genome;
[0811] a plurality of human JA1 gene segments provided by at least 2 different human JA1 gene segments, optionally in cis at the same Ig locus in said genome;
[0812] a plurality of human JA2 gene segments provided by at least 2 different human JA2 gene segments, optionally in cis at the same Ig locus in said genome;
[0813] a plurality of human JA3 gene segments provided by at least 2 different human JA3 gene segments, optionally in cis at the same Ig locus in said genome;
[0814] a plurality of human JA4 gene segments provided by at least 2 different human JA4 gene segments, optionally in cis at the same Ig locus in said genome;
[0815] a plurality of human JA5 gene segments provided by at least 2 different human JA5 gene segments, optionally in cis at the same Ig locus in said genome;
[0816] a plurality of human JA6 gene segments provided by at least 2 different human JA6 gene segments, optionally in cis at the same Ig locus in said genome; or a plurality of human JA7 gene segments provided by at least 2 different human JA7 gene segments, optionally in cis at the same Ig locus in said genome;
[0817] wherein the JL gene segments are derived from the genome sequence of different human individuals that are not genetically related over at least 3 generations.
[0818] 106. The vertebrate or cell of paragraph 105, wherein the human individuals are from different human populations.
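The repertoire conditions of numbered paragraphs 80 to 82 can be sketched in Python as simple checks over a list of gene segments. This is an illustrative model only: the `JLSegment` record, the function names, the representation of the two copies of an Ig locus as `allele` 0 and 1, and the placeholder nucleotide strings (which are not real JK1 sequences) are all assumptions made for the example. Human segments in a non-human genome are treated as non-endogenous.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class JLSegment:
    """Hypothetical record for one JL gene segment in a genome."""
    seg_type: str       # segment type, e.g. "JK1"
    sequence: str       # nucleotide sequence (placeholders in this example)
    allele: int         # which copy of the Ig locus carries it: 0 or 1
    human: bool = True  # human segments are non-endogenous in a non-human host


def _human_of_type(genome: List[JLSegment], seg_type: str) -> List[JLSegment]:
    return [s for s in genome if s.human and s.seg_type == seg_type]


def meets_paragraph_80(genome: List[JLSegment], seg_type: str) -> bool:
    """At least 3 human segments of one type, at least two not identical."""
    segs = _human_of_type(genome, seg_type)
    return len(segs) >= 3 and len({s.sequence for s in segs}) >= 2


def has_cis_pair(genome: List[JLSegment], seg_type: str) -> bool:
    """At least 2 different segments of one type on the same locus copy."""
    segs = _human_of_type(genome, seg_type)
    for allele in (0, 1):
        if len({s.sequence for s in segs if s.allele == allele}) >= 2:
            return True
    return False


def has_trans_pair(genome: List[JLSegment], seg_type: str) -> bool:
    """At least 2 different segments of one type on opposite locus copies."""
    segs = _human_of_type(genome, seg_type)
    on_0 = {s.sequence for s in segs if s.allele == 0}
    on_1 = {s.sequence for s in segs if s.allele == 1}
    return any(a != b for a in on_0 for b in on_1)


# Placeholder sequences, NOT real JK1 alleles.
genome = [
    JLSegment("JK1", "AAAC", allele=0),
    JLSegment("JK1", "AAAT", allele=0),
    JLSegment("JK1", "AAAC", allele=1),
]

print(meets_paragraph_80(genome, "JK1"))  # True
print(has_cis_pair(genome, "JK1"))        # True
print(has_trans_pair(genome, "JK1"))      # True
```

In this toy genome the third JK1 segment is cis with one of the two different segments on the first locus copy, which corresponds to the optional feature of paragraph 82.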
[0819] The skilled person will realise that standard molecular biology techniques can be used to provide vectors comprising synthetic combinations of immunoglobulin gene segments (e.g., V, D and/or J) for use in the invention, such that the vectors can be used to build a transgenic immunoglobulin locus (e.g., using homologous recombination and/or recombinase mediated cassette exchange as known in the art; see, e.g., U.S. Pat. No. 7,501,552 (Medarex), U.S. Pat. No. 5,939,598 (Abgenix), U.S. Pat. No. 6,130,364 (Abgenix), WO02/066630 (Regeneron), WO2011004192 (Genome Research Limited), WO2009076464, WO2009143472 and WO2010039900 (Ablexis), the disclosures of which are explicitly incorporated herein). For example, such synthetic combinations of gene segments can be made using standard recombineering techniques in E. coli to construct BAC vectors harbouring the synthetic combination prior to insertion in embryonic stem cells using homologous recombination or RMCE (e.g., using cre/lox site-specific recombination). Details of recombineering can be found at www.genebridges.com and in EP1034260 and EP1204740, the disclosures of which are explicitly incorporated herein.
[0820] In one embodiment, it is useful to bias the immune response of the vertebrate (and thus resultant lead antibodies) to a predetermined gene segment, e.g., one known to be commonly used in natural human immune responses to antigens, such as antigens of infectious disease pathogens. For example, VH1-69 is commonly used to produce antibodies in humans against Influenza virus; it is possible, therefore, to include two or more polymorphic DNA versions of the VH segment VH1-69 in the locus of the invention. The examples below illustrate how such a transgenic locus can be constructed in which diversity is extended by extending the VH1-69 gene segment repertoire based on naturally-occurring VH1-69 polymorphic variants.
[0821] In one embodiment in any configuration of the invention, the genome has been modified to prevent or reduce the expression of fully-endogenous antibody. Examples of suitable techniques for doing this can be found in PCT/GB2010/051122, U.S. Pat. No. 7,501,552, U.S. Pat. No. 6,673,986, U.S. Pat. No. 6,130,364, WO2009/076464, EP1399559 and U.S. Pat. No. 6,586,251, the disclosures of which are incorporated herein by reference. In one embodiment, the non-human vertebrate VDJ region of the endogenous heavy chain immunoglobulin locus, and optionally VJ region of the endogenous light chain immunoglobulin loci (lambda and/or kappa loci), have been inactivated. For example, all or part of the non-human vertebrate VDJ region is inactivated by inversion in the endogenous heavy chain immunoglobulin locus of the mammal, optionally with the inverted region being moved upstream or downstream of the endogenous Ig locus (see, e.g., WO2011004192, the disclosure of which is incorporated herein by reference). For example, all or part of the non-human vertebrate VJ region is inactivated by inversion in the endogenous kappa chain immunoglobulin locus of the mammal, optionally with the inverted region being moved upstream or downstream of the endogenous Ig locus. For example, all or part of the non-human vertebrate VJ region is inactivated by inversion in the endogenous lambda chain immunoglobulin locus of the mammal, optionally with the inverted region being moved upstream or downstream of the endogenous Ig locus. In one embodiment the endogenous heavy chain locus is inactivated in this way as is one or both of the endogenous kappa and lambda loci.
[0822] Additionally or alternatively, the vertebrate has been generated in a genetic background which prevents the production of mature host B and T lymphocytes, optionally a RAG-1-deficient and/or RAG-2-deficient background. See U.S. Pat. No. 5,859,301 for techniques of generating RAG-1-deficient animals.
[0823] Thus, in one embodiment of any configuration or aspect of the invention herein, endogenous heavy and light chain expression has been inactivated.
[0824] In one embodiment each said locus constant region is a heavy chain endogenous non-human vertebrate (optionally host mouse or rat) constant region.
[0825] In one embodiment each said locus constant region is a light chain endogenous non-human vertebrate (optionally host mouse or rat) constant region.
[0826] The invention provides a monoclonal or polyclonal antibody composition prepared by immunisation of at least one vertebrate (e.g., mouse or rat) according to the invention, optionally wherein the antigen is an antigen of an infectious disease pathogen (e.g., a bacterial or viral pathogen antigen), optionally wherein the same antigen is used to immunize all the vertebrates; optionally wherein the antibody or antibodies are IgG-type (e.g., IgG1).
[0827] The invention also provides a monoclonal or polyclonal antibody mixture produced by the method of the invention or a derivative antibody or mixture thereof, e.g., where one or more constant region has been changed (e.g., replaced with a different constant region such as a human constant region; or mutated to enhance or ablate Fc effector function). In an aspect of the invention, the monoclonal or polyclonal antibody mixture is provided for therapy and/or prophylaxis of a disease or condition in a human, e.g., for the treatment and/or prevention of an infectious disease, optionally wherein each antibody binds an antigen of an infectious disease pathogen, preferably the same antigen.
[0828] In an aspect of the invention, there is provided the use of an isolated, monoclonal or polyclonal antibody according to the invention, or a mutant or derivative antibody thereof in the manufacture of a medicament for the treatment and/or prevention of a disease or condition in a human, e.g., an infectious disease, optionally wherein the infectious disease is a disease caused by a bacterial or viral pathogen.
[0829] An example of a mutant antibody is one that bears up to 15 or 10 amino acid mutations in its variable regions relative to an isolated antibody (e.g., IgG-type, such as IgG1-type, antibody) obtainable or obtained by the method of the invention. An example of a derivative is one that has been modified to replace a constant region with a different constant region such as a human constant region; or mutated to enhance or ablate Fc effector function.
[0830] Examples of infectious diseases are diseases caused or mediated by a bacterial or viral pathogen. For example, the infectious disease is a disease caused by a pathogen selected from the group consisting of Haemophilus influenzae, E. coli, Neisseria meningitidis, a herpes family virus, cytomegalovirus (CMV), HIV and influenza virus.
Tailoring V(D)J Incorporation into Immunoglobulin Loci for the Generation of Antibodies Against Infectious Disease
[0831] The inventors realised that it would be desirable to provide vertebrates, cells, methods, etc. for the production of therapeutic and/or prophylactic antibodies based on natural human immune responses to antigens, such as antigens of infectious disease pathogens. In this respect, the literature reports immunoglobulin gene segments that are frequently used to raise anti-infective responses in humans (Table 9).
[0832] In the various configurations, aspects, embodiments and examples above, the invention provides the skilled addressee with the possibility of choosing immunoglobulin gene segments in a way that tailors or biases the repertoire for application to generating antibodies to treat and/or prevent infectious diseases. The inventors have categorized the following groups of gene segments for use in the invention according to the desired application of resultant antibodies.
List A:
[0833] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by a Pathogen
[0834] (a) a VL gene segment selected from the group consisting of a VAII gene family member, VAVII 4A, VAII 2.1, VAVII 4A, a VA1 gene family member, a VA3 gene family member, IGLV1S2, VA3-cML70, Ialh2, Ialyl, Ia3h3, Kv325, a VKI gene family member, KI-15A (KL012), a VKII family member, a VKIII family member, a VKI gene family member, KI-15A (KL012), VKII A2 (optionally the A2a variant), VK A27 (Humkv325) and a gene segment at least 80% identical thereto.
[0835] (b) a VA gene segment selected from a VAII gene family member, VAVII 4A, VAII 2.1, VAVII 4A, a VA1 gene family member, a VA3 gene family member, IGLV1S2, VA3-cML70, Ialh2, Ialyl, Ia3h3 and a gene segment at least 80% identical thereto.
[0836] (c) a VK gene segment selected from Kv325, a VKI gene family member, KI-15A (KL012), a VKII family member, a VKIII family member, a VKI gene family member, KI-15A (KL012), VKII A2 (optionally the A2a variant), VK A27 (Humkv325) and a gene segment at least 80% identical thereto.
[0837] (d) a VH gene segment selected from a VHIII gene family member (optionally, a VHIIIa or VHIIIb family member), a VHIV gene family member, VHIII 9.1 (VH3-15), VHIII VH26 (VH3-23), VH3-21, LSG6.1, LSG12.1, DP77 (V3-21), VH H11, VH1GRR, ha3h2, VHI-ha1c, VHIII-VH2-1, VH4.18, ha4h3, Hv1051, 71-2, Hv1f10, VH4.11, 71-4, VH251, VH1-69 and a gene segment at least 80% identical thereto.
[0838] (e) a JA gene segment selected from JA2, JA3 and a gene segment at least 80% identical thereto.
[0839] (f) a D gene segment selected from Dk1, Dxp'1, Dn4r, D2r and a gene segment at least 80% identical thereto.
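Each of items (a) to (f) above admits "a gene segment at least 80% identical" to a named segment. This passage does not state how identity is to be computed, so the following Python sketch makes an explicit assumption: the two sequences have already been aligned to equal length (gaps written as "-"), and identity is the percentage of aligned columns carrying the same residue. The function names and the example sequences are hypothetical and are not real gene segment sequences.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned columns with matching residues.

    Assumes `a` and `b` are pre-aligned to the same length; a gap ("-")
    never counts as a match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


def at_least_80_percent_identical(a: str, b: str) -> bool:
    """True if the aligned sequences meet the 80% identity threshold."""
    return percent_identity(a, b) >= 80.0


# Placeholder sequences for illustration only.
reference = "ACGTACGTAC"
candidate_ok = "ACGTACGTTT"   # 8 of 10 columns match
candidate_low = "ACGTACTTTT"  # 7 of 10 columns match

print(percent_identity(reference, candidate_ok))                # 80.0
print(at_least_80_percent_identical(reference, candidate_ok))   # True
print(at_least_80_percent_identical(reference, candidate_low))  # False
```

Other identity conventions (for example, dividing by the length of the shorter unaligned sequence) would give different results near the threshold, so the convention should be fixed before segments are compared.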
List A1:
[0840] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by a Pathogen
[0841] (a) a VA gene segment selected from a VAII gene family member, VAVII 4A, VAII 2.1, VAVII 4A and a gene segment at least 80% identical thereto.
[0842] (b) a VK gene segment selected from a VKI gene family member, KI-15A (KL012), a VKII family member, a VKIII family member, a VKI gene family member, KI-15A (KL012), VKII A2 (optionally the A2a variant), VK A27 (Humkv325) and a gene segment at least 80% identical thereto.
[0843] (c) a VH gene segment selected from a VH3 gene family member (optionally, a VHIIIa or VHIIIb family member), VHIII 9.1 (VH3-15), VHIII VH26 (VH3-23), VH3-21, LSG6.1, LSG12.1, DP77 (V3-21), VH H11 and a gene segment at least 80% identical thereto.
[0844] (d) a JA gene segment selected from JA2, JA3 and a gene segment at least 80% identical thereto.
[0845] (e) a JH gene segment selected from JH2, JH3, JH4 and a gene segment at least 80% identical thereto.
List A1.1:
[0846] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by H. influenzae
[0847] (a) a VA gene segment selected from a VAII gene family member, VAVII 4A, VAII 2.1, VAVII 4A and a gene segment at least 80% identical thereto.
[0848] (b) a VK gene segment selected from a VKII family member, a VKIII family member, a VKI gene family member, KI-15A (KL012), VKII A2 (optionally the A2a variant), VK A27 (Humkv325) and a gene segment at least 80% identical thereto.
[0849] (c) a VH gene segment selected from a VH3 gene family member (optionally, a VHIIIb family member), VHIII 9.1 (VH3-15), VHIII VH26 (VH3-23), VH3-21, LSG6.1, LSG12.1, DP77 (V3-21) and a gene segment at least 80% identical thereto.
[0850] (d) a JA gene segment selected from JA2, JA3 and a gene segment at least 80% identical thereto.
List A1.2:
[0851] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by E. coli or Neisseria meningitidis
[0852] (a) a VH gene segment selected from a VH3 gene family member (optionally a VHIIIa or VHIIIb member), VHIII 9.1 (VH3-15), VH H11, VHIII VH26 (VH3-23) and a gene segment at least 80% identical thereto, e.g., VHIII 9.1 + JH3; or VH H11 + JH4; or VHIII VH26 + JH2.
[0853] (b) a VK gene segment selected from a VKI gene family member, KI-15A (KL012) and a gene segment at least 80% identical thereto.
[0854] (c) a Vλ gene segment selected from a VλII gene family member, VλII 2.1 and a gene segment at least 80% identical thereto.
[0855] (d) a JH gene segment selected from JH2, JH3, JH4 and a gene segment at least 80% identical thereto.
List A2:
[0856] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by a Viral Pathogen
[0857] (a) a VH gene segment selected from a VHIII gene family member, a VHIV gene family member, VHIII-VH26 (VH3-23), VH1GRR, ha3h2, VHI-ha1c1, VHIII-VH2-1, VH4.18, ha4h3, Hv1051, 71-2, Hv1f10, VH4.11, 71-4, VH251, VH1-69 and a gene segment at least 80% identical thereto.
[0858] (b) a Vλ gene segment selected from a Vλ1 gene family member, a Vλ3 gene family member, IGLV1S2, Vλ3-cML70, Ialh2, Ialyl, Ia3h3 and a gene segment at least 80% identical thereto.
[0859] (c) a Vk gene segment selected from Kv325 and a gene segment at least 80% identical thereto.
[0860] (d) a JH gene segment selected from JH3, JH5, JH6 and a gene segment at least 80% identical thereto.
[0861] (e) a D gene segment selected from Dk1, Dxp'1, Dn4r, D2r and a gene segment at least 80% identical thereto.
[0862] (f) a Jλ gene segment selected from Jλ2, Jλ3 and a gene segment at least 80% identical thereto.
List A2.1:
[0863] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by Herpes Virus Family (e.g., VZV or HSV)
[0864] (a) a VH gene segment selected from a VHIII gene family member, a VHIV gene family member, VHIII-VH26 (VH3-23), VH1GRR, ha3h2, VHI-ha1c1, VHIII-VH2-1, VH4.18, ha4h3, and a gene segment at least 80% identical thereto.
[0865] (b) a Vλ gene segment selected from a Vλ1 gene family member, a Vλ3 gene family member, IGLV1S2, Vλ3-cML70, Ialh2, Ialyl, Ia3h3 and a gene segment at least 80% identical thereto.
[0866] (c) a JH gene segment selected from JH3, JH5, JH6 and a gene segment at least 80% identical thereto.
[0867] (d) a D gene segment selected from Dk1, Dxp'1, Dn4r, D2r and a gene segment at least 80% identical thereto.
[0868] (e) a Jλ gene segment selected from Jλ2, Jλ3 and a gene segment at least 80% identical thereto.
List A2.2:
[0869] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by CMV
[0870] (a) a VH gene segment selected from Hv1051 and a gene segment at least 80% identical thereto.
[0871] (b) a Vk gene segment selected from Kv325 and a gene segment at least 80% identical thereto.
List A2.3:
Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by HIV
[0872] (a) a VH gene segment selected from 71-2, Hv1f10, VH4.11, 71-4, VH251, VH1-69 and a gene segment at least 80% identical thereto.
List A2.4:
[0873] Immunoglobulin Gene Segments for Antibodies that Bind an Antigen Expressed by Influenza Virus
[0874] (a) a VH gene segment selected from VH1-69 and a gene segment at least 80% identical thereto.
[0875] Thus,
[0876] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A1. Thus, for example in (a) of the first configuration of the invention, the recited heavy chain V gene segment is selected from the VH gene segments in List A, optionally with a D in that list.
[0877] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by a bacterial pathogen, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A1.
[0878] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by a viral pathogen, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A2.
[0879] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by H. influenzae, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A1.1.
[0880] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by E. coli or Neisseria meningitidis, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A1.2.
[0881] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by Herpes Virus Family (e.g., VZV or HSV), one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A2.1.
[0882] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by CMV, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A2.2.
[0883] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by HIV, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A2.3.
[0884] Where one wishes to generate an antibody or antibody mixture to treat and/or prevent an infectious disease caused or mediated by Influenza Virus, one or more or all V, D and/or J gene segments used in any configuration, aspect, method, example or embodiment of the invention can be selected from List A2.4.
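By way of illustration only, the correspondence between pathogen category and gene segment list recited in paragraphs [0877] to [0884] can be sketched as a simple lookup. The function name `list_for_pathogen`, the dictionary name and the exact key spellings are assumptions made for this sketch and form no part of the lists themselves.

```python
# Pathogen category -> list identifier, as recited in paragraphs [0877]-[0884].
# Key spellings and all identifiers below are hypothetical, for illustration only.
PATHOGEN_TO_LIST = {
    "bacterial pathogen": "A1",
    "viral pathogen": "A2",
    "H. influenzae": "A1.1",
    "E. coli": "A1.2",
    "Neisseria meningitidis": "A1.2",
    "Herpes Virus Family": "A2.1",
    "CMV": "A2.2",
    "HIV": "A2.3",
    "Influenza Virus": "A2.4",
}

# Case-insensitive view of the same mapping.
_NORMALISED = {name.casefold(): list_id for name, list_id in PATHOGEN_TO_LIST.items()}


def list_for_pathogen(pathogen: str) -> str:
    """Return the identifier of the list from which segments can be selected."""
    key = pathogen.strip().casefold()
    try:
        return _NORMALISED[key]
    except KeyError:
        raise KeyError(f"no list is recited for {pathogen!r}") from None
```

For example, `list_for_pathogen("CMV")` returns `"A2.2"`, the list recited in paragraph [0882].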
[0885] Optionally each VH segment in the locus of the invention is selected from List A1, A2, A1.1, A1.2, A2.1, A2.2, A2.3 or A2.4.
[0886] Optionally each VL segment in the locus of the invention is selected from List A1, A2, A1.1, A1.2, A2.1, A2.2, A2.3 or A2.4.
[0887] Optionally each D segment in the locus of the invention is selected from List A1, A2, A1.1, A1.2, A2.1, A2.2, A2.3 or A2.4.
[0888] Optionally each JL segment in the locus of the invention is selected from List A1, A2, A1.1, A1.2, A2.1, A2.2, A2.3 or A2.4.
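The lists above repeatedly recite "a gene segment at least 80% identical thereto". As a minimal sketch only, the threshold test can be illustrated as follows, assuming that the two nucleotide sequences have already been aligned to equal length (gaps written as "-") and that identity is counted over the full alignment length. The alignment method and denominator are assumptions of this sketch and are not specified in this section, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Assumption: the inputs are pre-aligned, equal-length strings; '-' marks a gap
    and a gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 80.0) -> bool:
    """True if the two aligned sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Made-up ten-base sequences, not real gene segments.
reference = "ACGTACGTAC"
candidate_1 = "ACGTACGTTT"  # 8 of 10 positions match
candidate_2 = "ACGTACCCCC"  # 7 of 10 positions match
```

Under these assumptions, `candidate_1` meets the 80% threshold against `reference` and `candidate_2` does not.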
Antibodies for Therapy & Prophylaxis of Patients of Specific Ancestry
[0889] The inventors, having undertaken the extensive bioinformatics analysis described herein, realised that the output of that analysis makes it possible to identify specific gene segments that are useful to produce antibody- and VH domain-based drugs that are tailored specifically to a patient's ancestry (i.e., genotype). That is, antibodies can be selected on the basis that they are made in vivo in a transgenic non-human vertebrate (e.g., mouse or rat with transgenic IgH loci) and, in particular, are derived from gene segments that are relatively prevalent in members of the patient's population, i.e., in individuals of the same human ancestry. Variant distributions differ across populations (see Table 13), which presumably reflects the effects of evolution, adaptation and conservation of useful variant gene types in those populations. Thus, by tailoring the antibody-based drugs according to the invention, it is possible to match the drug to the population gene biases, with the aim of making better drugs for that specific population of humans. Better can, for example, mean more efficacious, better neutralizing, higher target antigen affinity, less immunogenic, fewer patient reactions to the drug, etc. This can be determined empirically, as is standard in drug research and development processes.
[0890] Thus, the invention provides the following embodiments (numbered from clause 345 onwards):--
[0891] 345. An isolated antibody for administration to a Chinese patient, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the constant region is a human constant region selected from a constant region (e.g., an IGHG constant region) in Table 13 found in a Chinese population and with a cumulative frequency of at least 1 or 5%; and wherein
[0892] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); and/or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0893] In another embodiment, the invention provides
[0894] An isolated antibody for administration to a Chinese patient, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the constant region is a human constant region selected from a constant region (e.g., an IGHG constant region) present in a Chinese population with a cumulative frequency of at least 5%; and wherein
[0895] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); and/or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0896] In an example, the constant region is found in the 1000 Genomes database. In an example, the constant region is found in Table 13.
[0897] 346. The antibody of clause 345 wherein the constant region is an IGHG1a, IGHG2a, IGHG3a, IGHG3b or IGHG4a constant region.
[0898] 347. The antibody of clause 345 or 346, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH in Table 13 found in a Chinese population and with a cumulative frequency of at least 5%.
[0899] In another embodiment, the invention provides
[0900] The antibody of clause 345 or 346, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH present in a Chinese population with a cumulative frequency of at least 5%.
[0901] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0902] 348. The antibody of clause 345, 346 or 347, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D in Table 13 found in a Chinese population and with a cumulative frequency of at least 5%.
[0903] In another embodiment, the invention provides
[0904] The antibody of clause 345, 346 or 347, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D present in a Chinese population with a cumulative frequency of at least 5%.
[0905] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0906] 349. The antibody of clause 345, 346, 347 or 348 wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH in Table 13 found in a Chinese population and with a cumulative frequency of at least 5%.
[0907] In another embodiment, the invention provides
[0908] The antibody of clause 345, 346, 347 or 348 wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH present in a Chinese population with a cumulative frequency of at least 5%.
[0909] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0910] 350. An isolated VH domain identical to a variable domain as recited in any one of clauses 347 to 349, optionally fused at its C-terminus to a polypeptide (e.g., an antibody Fc).
[0911] In an embodiment, there is provided an isolated VH domain identical to a variable domain as recited in any one of clauses 347 to 349 which is part of a conjugate, conjugated with a label (e.g., for imaging in the patient) or a toxin (e.g., a radioactive toxic payload, such as for cancer treatment in the patient) or a half-life-extending moiety (e.g., PEG or human serum albumin).
[0912] 351. A pharmaceutical composition comprising the antibody or variable domain of any one of clauses 345 to 350 together with a pharmaceutically-acceptable excipient, diluent or a medicament (e.g., a further antigen-specific variable domain, antibody chain or antibody).
[0913] 352. An isolated antibody for administration to a Chinese patient, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH in Table 13 found in a Chinese population and with a cumulative frequency of at least 5%; and wherein
[0914] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); and/or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0915] In another embodiment, the invention provides
[0916] An isolated antibody for administration to a Chinese patient, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH present in a Chinese population with a cumulative frequency of at least 5%; and wherein
[0917] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); and/or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0918] 353. The antibody of clause 352, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D in Table 13 found in a Chinese population and with a cumulative frequency of at least 5%.
[0919] In another embodiment, the invention provides
[0920] The antibody of clause 352, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D present in a Chinese population with a cumulative frequency of at least 5%.
[0921] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0922] 354. The antibody of clause 352 or 353, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH in Table 13 found in a Chinese population and with a cumulative frequency of at least 5%.
[0923] In another embodiment, the invention provides
[0924] The antibody of clause 352 or 353, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH present in a Chinese population with a cumulative frequency of at least 5%.
[0925] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0926] 355. An isolated VH domain identical to a variable domain as recited in any one of clauses 352 to 354, optionally fused at its C-terminus to a polypeptide (e.g., an antibody Fc).
[0927] In an embodiment, there is provided a VH domain identical to a variable domain as recited in any one of clauses 352 to 354 which is part of a conjugate, conjugated with a label (e.g., for imaging in the patient) or a toxin (e.g., a radioactive toxic payload, such as for cancer treatment in the patient) or a half-life-extending moiety (e.g., PEG or human serum albumin).
[0928] 356. A pharmaceutical composition comprising the antibody or variable domain of any one of clauses 352 to 355 together with a pharmaceutically-acceptable excipient, diluent or a medicament (e.g., a further antigen-specific variable domain, antibody chain or antibody).
[0929] 357. An antibody heavy chain or VH domain (e.g., provided as part of an antibody) for therapy and/or prophylaxis of a disease or medical condition in a Chinese patient, wherein the heavy chain is a heavy chain produced by the following steps (or is a copy of such a heavy chain):--
[0930] (a) Selection of an antigen-specific antibody heavy chain or VH domain from a non-human vertebrate (e.g., a mouse or a rat), wherein the heavy chain or VH domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH in Table 13 found in a Chinese population and with a cumulative frequency of at least 5%;
[0931] (b) Optional humanization of the heavy chain by combining the variable domain of the heavy chain with a human constant region; or optional humanization of the selected VH domain by combining with a human constant region.
[0932] In another embodiment, the invention provides
[0933] An antibody heavy chain or VH domain (e.g., provided as part of an antibody) for therapy and/or prophylaxis of a disease or medical condition in a Chinese patient, wherein the heavy chain is a heavy chain produced by the following steps (or is a copy of such a heavy chain):--
[0934] (a) Selection of an antigen-specific antibody heavy chain or VH domain from a non-human vertebrate (e.g., a mouse or a rat), wherein the heavy chain or VH domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH present in a Chinese population with a cumulative frequency of at least 5%;
[0935] (b) Optional humanization of the heavy chain by combining the variable domain of the heavy chain with a human constant region; or optional humanization of the selected VH domain by combining with a human constant region.
[0936] In an example, the VH gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0937] 358. The antibody heavy chain or VH domain of clause 357, wherein the human constant region is as recited in clause 345 or 346.
[0938] 359. An antibody heavy chain or VH domain as recited in clause 357 or 358 for use in a medicament for therapy and/or prophylaxis of a disease or medical condition in a Chinese patient.
[0939] 360. A method of treating and/or preventing a disease or medical condition in a Chinese patient, the method comprising administering to the patient a therapeutically or prophylactically-effective amount of the antibody heavy chain or VH domain as recited in clause 357 or 358.
[0940] 361. An isolated antibody for administration to a patient of European, East Asian, West African, South Asian or Americas ancestry, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the constant region is a human constant region selected from a constant region (e.g., an IGHG constant region) in Table 13 found in a population of European, East Asian, West African, South Asian or Americas ancestry respectively and with a cumulative frequency of at least 1 or 5%; and wherein
[0941] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0942] In another embodiment, the invention provides
[0943] An isolated antibody for administration to a patient of European, East Asian, West African, South Asian or Americas ancestry, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the constant region is a human constant region selected from a constant region (e.g., an IGHG constant region) present in a population of European, East Asian, West African, South Asian or Americas ancestry respectively with a cumulative frequency of at least 1 or 5%; and wherein
[0944] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0945] In an example, the constant region is found in the 1000 Genomes database. In an example, the constant region is found in Table 13.
[0946] 362. The antibody of clause 361 wherein the constant region is an IGHG1a, IGHG2a, IGHG3a, IGHG3b or IGHG4a constant region and the patient is of European ancestry.
[0947] 363. The antibody of clause 361 or 362, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH in Table 13 found in said population and with a cumulative frequency of at least 1 or 5%.
[0948] In another embodiment, the invention provides
[0949] The antibody of clause 361 or 362, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH present in said population with a cumulative frequency of at least 1 or 5%.
[0950] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0951] 364. The antibody of clause 361, 362 or 363, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D in Table 13 found in said population and with a cumulative frequency of at least 1 or 5%.
[0952] In another embodiment, the invention provides
[0953] The antibody of clause 361, 362 or 363, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D present in said population with a cumulative frequency of at least 1 or 5%.
[0954] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0955] 365. The antibody of clause 361, 362, 363 or 364 wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH in Table 13 found in said population and with a cumulative frequency of at least 1 or 5%.
[0956] In another embodiment, the invention provides
[0957] The antibody of clause 361, 362, 363 or 364 wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH present in said population with a cumulative frequency of at least 1 or 5%.
[0958] In an example, the gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0959] 366. An isolated VH domain identical to a variable domain as recited in any one of clauses 363 to 365, optionally fused at its C-terminus to a polypeptide (e.g., an antibody Fc).
[0960] 367. A pharmaceutical composition comprising the antibody or variable domain of any one of clauses 361 to 366 together with a pharmaceutically-acceptable excipient, diluent or a medicament (e.g., a further antigen-specific variable domain, antibody chain or antibody).
[0961] 368. An isolated antibody for administration to a patient of European, East Asian, West African, South Asian or Americas ancestry, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH in Table 13 found in a population of European, East Asian, West African, South Asian or Americas ancestry respectively and with a cumulative frequency of at least 1 or 5%; and wherein
[0962] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0963] In another embodiment the invention provides:--
[0964] An isolated antibody for administration to a patient of European, East Asian, West African, South Asian or Americas ancestry, the antibody comprising a human heavy chain, the heavy chain comprising a variable domain that is specific for an antigen and a constant region, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH present in a population of European, East Asian, West African, South Asian or Americas ancestry respectively with a cumulative frequency of at least 1 or 5%; and wherein
[0965] (i) the variable domain is derived from the recombination of said human gene segments in a non-human vertebrate (e.g., in a mouse or a rat); or (ii) the variable domain comprises non-human vertebrate (e.g., mouse or rat) AID-pattern mutations and non-human vertebrate (e.g., mouse or rat) terminal deoxynucleotidyl transferase (TdT)-pattern mutations.
[0966] In an example, the VH gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0967] 369. The antibody of clause 368, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D in Table 13 found in said population and with a cumulative frequency of at least 1 or 5%.
[0968] In another example there is provided
[0969] The antibody of clause 368, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the D gene segment being selected from a D present in said population with a cumulative frequency of at least 1 or 5%.
[0970] In an example, the D gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0971] 370. The antibody of clause 368 or 369, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH in Table 13 found in said population and with a cumulative frequency of at least 1 or 5%.
[0972] In another example there is provided
[0973] The antibody of clause 368 or 369, wherein the variable domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the JH gene segment being selected from a JH present in said population and with a cumulative frequency of at least 1 or 5%.
[0974] In an example, the JH gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0975] 371. An isolated VH domain identical to a variable domain as recited in any one of clauses 368 to 370, optionally fused at its C-terminus to a polypeptide (e.g., an antibody Fc).
[0976] 372. A pharmaceutical composition comprising the antibody or variable domain of any one of clauses 368 to 371 together with a pharmaceutically-acceptable excipient, diluent or a medicament (e.g., a further antigen-specific variable domain, antibody chain or antibody).
[0977] 373. An antibody heavy chain or VH domain (e.g., provided as part of an antibody) for therapy and/or prophylaxis of a disease or medical condition in a patient of European, East Asian, West African, South Asian or Americas ancestry, wherein the heavy chain is a heavy chain produced by the following steps (or is a copy of such a heavy chain):--
[0978] (a) Selection of an antigen-specific antibody heavy chain or VH domain from a non-human vertebrate (e.g., a mouse or a rat), wherein the heavy chain or VH domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH in Table 13 found in said population and with a cumulative frequency of at least 1 or 5%;
[0979] (b) Optional humanization of the heavy chain by combining the variable domain of the heavy chain with a human constant region; or optional humanization of the selected VH domain by combining with a human constant region.
[0980] In another embodiment, there is provided:--
[0981] An antibody heavy chain or VH domain (e.g., provided as part of an antibody) for therapy and/or prophylaxis of a disease or medical condition in a patient of European, East Asian, West African, South Asian or Americas ancestry, wherein the heavy chain is a heavy chain produced by the following steps (or is a copy of such a heavy chain):--
[0982] (a) Selection of an antigen-specific antibody heavy chain or VH domain from a non-human vertebrate (e.g., a mouse or a rat), wherein the heavy chain or VH domain is derived from the recombination of a human VH gene segment with a human D gene segment and a human JH gene segment, the VH gene segment being selected from a VH present in said population with a cumulative frequency of at least 1 or 5%;
[0983] (b) Optional humanization of the heavy chain by combining the variable domain of the heavy chain with a human constant region; or optional humanization of the selected VH domain by combining with a human constant region.
[0984] In an example, the VH gene segment is found in the 1000 Genomes database. In an example, the gene segment is found in Table 13.
[0985] 374. The antibody heavy chain or VH domain of clause 373, wherein the human constant region is as recited in clause 361 or 362.
[0986] 375. An antibody heavy chain or VH domain as recited in clause 373 or 374 for use in a medicament for therapy and/or prophylaxis of a disease or medical condition in a patient of said ancestry.
[0987] 376. A method of treating and/or preventing a disease or medical condition in a patient of European, East Asian, West African, South Asian or Americas ancestry, the method comprising administering to the patient a therapeutically or prophylactically-effective amount of the antibody heavy chain or VH domain as recited in clause 373 or 374.
[0988] In embodiments herein, a Chinese patient can be a Han Chinese patient.
[0989] In embodiments herein, a patient of European ancestry can be a patient of Northern or Western European ancestry, Italian ancestry, British or Scottish ancestry, Finnish ancestry or Iberian ancestry.
[0990] In embodiments herein, a patient of East Asian ancestry can be a patient of Han Chinese ancestry, Japanese ancestry, Chinese Dai ancestry, Vietnamese ancestry or Kinh ancestry.
[0991] In embodiments herein, a patient of West African ancestry can be a patient of Yoruba ancestry, Luhya ancestry, Gambian ancestry or Malawian ancestry.
[0992] In embodiments herein, a patient of Americas ancestry can be a patient of African American ancestry, African Caribbean ancestry, Mexican ancestry, Puerto Rican ancestry, Colombian ancestry or Peruvian ancestry.
[0993] In embodiments herein, a patient of South Asian ancestry can be a patient of Ahom ancestry, Kayadtha ancestry, Reddy ancestry, Maratha ancestry, or Punjabi ancestry.
[0994] In an example of any aspect, the cumulative frequency is at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90 or 95%.
[0995] It will be understood that particular embodiments described herein are shown by way of illustration and not as limitations of the invention. The principal features of this invention can be employed in various embodiments without departing from the scope of the invention. Those skilled in the art will recognize, or be able to ascertain using no more than routine study, numerous equivalents to the specific procedures described herein. Such equivalents are considered to be within the scope of this invention and are covered by the claims. All publications and patent applications mentioned in the specification are indicative of the level of skill of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference. The use of the word "a" or "an" when used in conjunction with the term "comprising" in the claims and/or the specification may mean "one," but it is also consistent with the meaning of "one or more," "at least one," and "one or more than one." The use of the term "or" in the claims is used to mean "and/or" unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and "and/or." Throughout this application, the term "about" is used to indicate that a value includes the inherent variation of error for the device, the method being employed to determine the value, or the variation that exists among the study subjects.
[0996] As used in this specification and claim(s), the words "comprising" (and any form of comprising, such as "comprise" and "comprises"), "having" (and any form of having, such as "have" and "has"), "including" (and any form of including, such as "includes" and "include") or "containing" (and any form of containing, such as "contains" and "contain") are inclusive or open-ended and do not exclude additional, unrecited elements or method steps.
[0997] The term "or combinations thereof" as used herein refers to all permutations and combinations of the listed items preceding the term. For example, "A, B, C, or combinations thereof" is intended to include at least one of: A, B, C, AB, AC, BC, or ABC, and if order is important in a particular context, also BA, CA, CB, CBA, BCA, ACB, BAC, or CAB. Continuing with this example, expressly included are combinations that contain repeats of one or more item or term, such as BB, AAA, AAB, BBC, AAABCCCC, CBBAAA, CABABB, and so forth. The skilled artisan will understand that typically there is no limit on the number of items or terms in any combination, unless otherwise apparent from the context.
[0998] Any part of this disclosure may be read in combination with any other part of the disclosure, unless otherwise apparent from the context.
[0999] All of the compositions and/or methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the compositions and/or
[1000] methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.
[1001] The present invention is described in more detail in the following non-limiting prophetic Examples.
EXAMPLES
Example 1
Recombineered BAC Vectors to Add Polymorphic V-Regions to the Mouse Genome
[1002] FIGS. 1 through 3 depict recombineering methods (see references above) that can be used to introduce polymorphic V-gene regions into genomic DNA. In one embodiment, a genomic fragment from the human heavy chain region is inserted into a bacterial artificial chromosome (BAC) vector by standard techniques. Preferably, such a BAC, which can range in size from 20-kb to 200-kb or more, can be isolated from libraries of BACs by standard techniques including sequence searches of commercially available libraries or by hybridization to bacterial colonies containing BACs to identify those with a BAC of interest.
[1003] A BAC is chosen that has several VH gene segments; in FIG. 1, these are generically identified as VH[a] through VH[z] for example. One skilled in the art will readily identify appropriate genomic fragments, for example, an approximately 120-kb fragment from human VH5-78 through VH1-68 which includes 5 endogenous active VH gene segments and 7 VH pseudogenes. Using recombineering techniques, the endogenous VH gene segments can be replaced by polymorphic VH or VL gene segments. In this example, two steps are required. The first step replaces the V-region coding exon of an endogenous VH gene segment with a positive-negative selection operon, in this example, an operon encoding an ampicillin resistance gene (Amp) and a streptomycin-sensitizing ribosomal protein (rpsL). Certain strains of bacteria can be selected for the absence of the rpsL gene by resistance to streptomycin. Short stretches of DNA homologous to sequences flanking the endogenous VH gene exon are placed 5' and 3' of the rpsL-Amp operon. In the presence of appropriate recombination factors per standard recombineering techniques (see references above), recombination between the operon fragment and the BAC will result in replacement of the endogenous VH gene exon with the operon (FIG. 1a); BACs carrying the replacement are selected by resistance to ampicillin. The second step uses the same homologous sequences in order to replace the inserted operon with a desired polymorphic VH gene segment. In this example, a human VH1-69 gene is inserted (FIGS. 1b and 1c). In particular the *02 variant of VH1-69 is used [ref IMGT and FIG. 5]. Successful integrations of the polymorphic VH gene segment are selected in bacteria that become resistant to streptomycin due to the loss of the operon, specifically the rpsL portion.
[1004] In this example, the two step process as described can be repeated for each of the endogenous VH gene segments or for as many endogenous gene segments that one wishes to replace with polymorphic V gene segments (FIG. 1d).
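The two-step replacement described above can be pictured with a toy string model. The following Python sketch is illustrative only: the sequences, the homology arm strings and the helper names `replace_between_arms` and `phenotype` are hypothetical placeholders, and the drug phenotypes are a simplification that assumes a host strain in which streptomycin resistance is lost only while the rpsL-Amp operon is present.

```python
# Toy model of two-step recombineering; all sequences are hypothetical placeholders.

def replace_between_arms(bac: str, arm5: str, arm3: str, payload: str) -> str:
    """Return bac with the span between the 5' and 3' homology arms replaced by payload."""
    start = bac.index(arm5) + len(arm5)   # first base after the 5' arm
    end = bac.index(arm3, start)          # first base of the 3' arm
    return bac[:start] + payload + bac[end:]

def phenotype(bac: str, cassette: str) -> dict:
    """Expected drug phenotype in a suitable host strain (simplified assumption)."""
    has_cassette = cassette in bac
    return {"ampicillin_resistant": has_cassette,
            "streptomycin_resistant": not has_cassette}

ARM5, ARM3 = "GGATCC", "CTCGAG"      # stand-ins for the flanking homology
ENDOGENOUS_EXON = "ACGTACGT"         # stand-in for the endogenous VH exon
CASSETTE = "<rpsL-Amp>"              # stand-in for the selection operon
POLYMORPHIC_EXON = "ACGAACGA"        # stand-in for the polymorphic VH exon

bac = "AAAA" + ARM5 + ENDOGENOUS_EXON + ARM3 + "TTTT"

# Step 1: operon replaces the endogenous exon (select on ampicillin).
step1 = replace_between_arms(bac, ARM5, ARM3, CASSETTE)
# Step 2: polymorphic exon replaces the operon (select on streptomycin).
step2 = replace_between_arms(step1, ARM5, ARM3, POLYMORPHIC_EXON)

print(step1, phenotype(step1, CASSETTE))
print(step2, phenotype(step2, CASSETTE))
```

Because both steps reuse the same flanking homology, the same helper serves for each; repeating the pair of calls with different arms would model the replacement of further gene segments on the same BAC.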
[1005] As is apparent, any polymorphic V gene segment can be inserted in this manner and any endogenous V gene segment can act as a target, including pseudogenes. V gene segments in each of the heavy chain and two light chain loci can be replaced using this technique with appropriate genomic fragments available as BAC inserts.
[1006] FIG. 2 depicts another method for creating a genomic fragment encoding polymorphic V gene segments. In this example, polymorphic V gene segments are inserted into a region of genomic DNA devoid of other genes, control elements or other functions. Such `desert` regions can be selected based on sequence analysis and corresponding DNA fragments cloned into BACs or identified in existing BAC libraries. Starting with such a genomic fragment, recombineering techniques can be used to insert polymorphic V gene segments at intervals of, for example, 10-kb. In this example, a 150-kb genomic fragment might accommodate insertion of up to 15 polymorphic V gene segments. Insertion of the segments is a two-step process. The first recombineering step inserts the rpsL-Amp operon at a specific site. Sequences homologous to a specific site are used to flank the operon. These are used by the recombineering system to insert the element specifically into the BAC genomic fragment and positive events are selected by resistance to ampicillin (FIG. 2a). The second step replaces the operon in the genomic fragment with a polymorphic V gene segment by a similar recombineering step using the same sequence homology (FIG. 2b). In this example, both exons and promoter element of a polymorphic VH gene segment are inserted, resulting in replacement of the rpsL-Amp operon and therefore resistance to streptomycin (FIG. 2c).
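The spacing arithmetic in this example (a 150-kb fragment with insertions at 10-kb intervals accommodating up to 15 segments) can be checked with a short sketch. Placing each insertion at the midpoint of its 10-kb window is an assumption made purely for illustration, and `insertion_sites` is a hypothetical helper name.

```python
def insertion_sites(fragment_kb: int, spacing_kb: int) -> list:
    """Midpoints (in kb) of consecutive spacing_kb windows along the fragment."""
    n_sites = fragment_kb // spacing_kb          # whole windows that fit
    return [i * spacing_kb + spacing_kb / 2 for i in range(n_sites)]

sites = insertion_sites(150, 10)
print(len(sites), sites[0], sites[-1])
```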
[1007] The two step technique for inserting polymorphic V gene segments into a specific site on the genomic fragment can be repeated multiple times resulting in a BAC genomic fragment with several polymorphic gene segments, including their promoter elements. It is apparent that the examples shown in FIGS. 1 and 2 can be combined wherein the technique for insertion can be used to add extra polymorphic V gene segments to a BAC genomic fragment as depicted in FIG. 1. One might choose to add these extra segments to an IG genomic fragment since such a fragment would be more amenable to proper IG gene expression once inserted into a non-human mammal's genome. It is known that a genomic fragment can have elements such as enhancers or elements that contribute to certain chromatin conformations, both important in wild-type gene expression.
[1008] FIG. 3 depicts an additional method to create genomic fragments with polymorphic V gene segments. This method depends upon the efficiency with which short (around 50 to 150 bases, preferably 100 bases) single stranded DNA fragments recombine with a homologous sequence using recombineering (Nat Rev Genet. 2001 October; 2(10):769-79; Recombineering: a powerful new tool for mouse functional genomics; Copeland N G, Jenkins N A, Court D L). The recombinases used in recombineering preferentially bind and use such short single-stranded fragments of DNA as a substrate for initiating homologous recombination. The efficiency can be as high as 10^-2, that is, a positive event can be found in approximately 100 randomly picked (not selected) clones resulting from recombineering. A positive event in this example occurs when one or more single nucleotide changes introduced into the single-stranded fragment are transferred to the BAC insert containing V gene segments and surrounding genomic DNA, said nucleotide change or changes occurring at a homologous sequence on the BAC.
[1009] Polymorphic V gene segments can differ from endogenous V gene segments by only 1 or 2, or up to 10 or 15 nucleotide changes, for example. Examples of such nucleotide polymorphisms are depicted in FIG. 5. Short single stranded regions that encompass the polymorphic nucleotide changes can be chemically synthesized using standard techniques. The resulting single stranded DNA fragments are introduced into bacteria and via recombineering techniques approximately 1 in 100 BAC fragments will have incorporated the polymorphic nucleotides via homologous incorporation of the single stranded fragment (FIG. 3a). BACs with the desired nucleotide change can be identified by screening for example several hundred individual clones by polymerase chain reaction (PCR) amplification and sequencing, both by standard techniques. In the example, two nucleotide changes will convert a VH1-69*01 gene segment into a VH1-69*02 gene segment (FIG. 3b).
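To see why screening several hundred unselected clones is a reasonable scale at an efficiency of about 1 in 100, the following sketch computes the chance of recovering at least one positive clone. It assumes each clone is an independent trial with the same success probability, which is a simplification not stated in the text; the function names are hypothetical.

```python
import math

def prob_at_least_one(p: float, n: int) -> float:
    """Probability that at least one of n independently screened clones is positive."""
    return 1.0 - (1.0 - p) ** n

def clones_needed(p: float, confidence: float) -> int:
    """Smallest number of clones giving at least the requested chance of one positive."""
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))

EFFICIENCY = 1e-2   # approximately 1 positive per 100 clones, as in the text

for n in (100, 200, 300):
    print(n, round(prob_at_least_one(EFFICIENCY, n), 3))
print(clones_needed(EFFICIENCY, 0.95))
```

Under these assumptions, picking 100 clones gives roughly a 63% chance of at least one positive, and about 300 clones are needed for a 95% chance.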
[1010] It is clear that this process can be repeated for multiple endogenous V gene segments contained on a single BAC genomic fragment. In addition, the techniques depicted in FIG. 2 can be used to add additional polymorphic V gene segments by insertion into regions between existing V gene segments. As would be evident to one skilled in the art, a combination of these techniques can be used to create numerous variations of both polymorphic and endogenous human V gene segments. And it would be evident that several different genomic fragments with engineered polymorphic V gene segments and endogenous human V gene segments can be combined to create even more variations.
Example 2
Adding Polymorphic V-Regions to the Genome Using SRMCE of Modified BACs
[1011] Modified BACs with polymorphic V gene segments created using the methods described in Example 1 can be used to alter the genome of non-human mammals. These alterations can result in an intact IG locus in which normal immunoglobulin region recombination results in VDJ or VJ combinations which include the human V gene segments. An example of how such an animal can be created is by altering the genome of, for example, mouse embryonic stem (ES) cells using the strategy outlined in FIG. 4.
[1012] One technique to integrate modified BACs with polymorphic V gene segments into a genome is sequential recombinase mediated cassette exchange (SRMCE). The technique is described in WO2011004192 (Genome Research Limited), which is incorporated here in its entirety by reference.
[1013] SRMCE provides for a locus modified with a `landing pad` inserted at a specific location. This insertion can either be de novo via homologous recombination or as a consequence of a previous BAC insertion. In this example, the landing pad is inserted in the mouse IGH locus between the most 3' J gene segment and the Cμ gene segment and a previous BAC insertion via SRMCE techniques has resulted in the addition of 5 human V gene segments and 2 V region pseudogenes. The landing pad has elements as shown in FIG. 4 that will allow the selection of correct insertion of a second targeting BAC fragment. The specificity of this insertion is provided by cre recombinase-mediated exchange between permissive lox sites. A lox site is permissive for recombination only with a compatible lox site. In this example, the loxP site will only recombine with loxP and lox2272 will only recombine with lox2272. This provides directionality to the insertion of the BAC fragment as depicted in FIGS. 4b and 4c.
[1014] ES cell clones with correct insertions are selected from a pool of clones without insertions or with non-productive insertions by resistance to puromycin. Resistance to puromycin results from the juxtaposition of an active promoter element, PGK, with the puroTK coding region. Correct insertions are verified by standard techniques including PCR of junctions, PCR of internal elements, Southern blotting, comparative genomic hybridization (CGH), sequencing, etc. In the example, correct lox2272-lox2272 and loxP-loxP recombination also results in two intact sets of piggyBac elements that did not exist prior to insertion. An intact piggyBac element comprises a set of inverted repeats which are depicted in the figure by "PB5'" and "PB3'". An appropriately oriented set of piggyBac elements is the substrate of piggyBac transposase, which can catalyse recombination between the elements, resulting in deletion of intervening sequences as well as both elements. The DNA remaining after a piggyBac transposition is left intact and is lacking any remnant of the piggyBac element. In the example, ES cell clones with successful piggyBac transposition are selected by loss of the active puroTK element, which renders the cells resistant to the drug FIAU (FIGS. 4c and 4d).
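The pairing rule that gives the insertion its directionality can be expressed as a small check. This is a schematic sketch only: the 5'-to-3' order of the two lox sites on the landing pad is an assumption made for illustration (FIG. 4 governs the actual arrangement), and `can_recombine` and `exchange_permitted` are hypothetical helper names.

```python
def can_recombine(site_a: str, site_b: str) -> bool:
    """In this example a lox site recombines only with a site of the same type."""
    return site_a == site_b

def exchange_permitted(landing_pad: tuple, donor: tuple) -> bool:
    """Cassette exchange needs a compatible pair at both the 5' and the 3' end."""
    return (can_recombine(landing_pad[0], donor[0])
            and can_recombine(landing_pad[1], donor[1]))

LANDING_PAD = ("loxP", "lox2272")   # assumed 5'-to-3' order, for illustration

print(exchange_permitted(LANDING_PAD, ("loxP", "lox2272")))   # donor in matching order
print(exchange_permitted(LANDING_PAD, ("lox2272", "loxP")))   # donor in flipped order
```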
[1015] The final product of the SRMCE method in this example is an IGH locus with several polymorphic V gene segments inserted along with a set of endogenous unmodified VH gene segments between sequences of the mouse genome on the 5' side and the mouse IGH constant region gene segments on the 3' side. The polymorphic V gene segments are positioned such that they can participate in the recombination events associated with B cell maturation yielding VDJ gene segments. These gene segments can then be transcribed and spliced to the mouse constant region. Translation of these transcripts will result in the production of an antibody heavy chain encoded by the polymorphic V gene segment, a human DH gene segment, a human JH gene segment and a mouse constant heavy chain gene segment.
[1016] As is well known to those skilled in the art, an ES cell clone can be used to create a line of genetically modified mice via injection of said cells into a mouse blastocyst embryo, transferring the injected embryo to a suitable recipient and breeding the chimeric offspring that result. The modified gene locus can be propagated through breeding and made either heterozygous or homozygous depending on the genetic cross.
[1017] It is evident from the structure of the IGH locus provided in this example and by knowledge of the mechanisms involved in B cell receptor (BCR) and antibody gene rearrangements that a large set of different combinations of polymorphic V gene segments with various DH and JH gene segments will result and these can contribute to a large repertoire of functional antibody genes in a population of B cells in genetically modified animals. In this example, several different human VH1-69 polymorphs are incorporated to provide superhuman VH diversity. This particular VH gene segment is known to be prevalent in antibodies that bind infectious disease pathogens (such as influenza virus) and therefore the antibody repertoire of a mouse with the genetic modification of this example would be expected to produce antibodies with a bias in favour of those that bind infectious disease pathogens. The repertoire, in other words, would have a larger subset of antibodies with superior affinities for pathogen antigens. Examples of such pathogens include influenza virus, hepatitis C virus (HCV) and human immunodeficiency virus-1 (HIV-1) (see also table above).
Example 3
Alignment of 13 VH1-69 Alleles
[1018] Building a more diverse antibody repertoire by incorporating additional V gene segment polymorphs requires availability of polymorphic variants of V gene segments. One source of such variants is sequence databases. In this example, 13 distinct variants of the VH1-69 gene segment are provided.
[1019] These variant sequences and comparisons are drawn from the "IMmunoGeneTics" IMGT Information System (www.imgt.com) database. FIG. 5 is a diagram of the alignment of variants *02 through *13 with the *01 variant. The VH1-69*01 nucleotide and amino acid sequence is provided at the top of the figure. Where the remaining variants are identical to the *01 variant sequence a dash is inserted below the sequence. Nucleotide differences are noted alongside the appropriate variant and if the sequence change results in a protein coding change, the amino acid change is indicated above the triplet.
[1020] FIG. 5 depicts between 1 and 4 amino acid changes for each variant in comparison to the *01 variant. All of the amino acid changes occur in the part of the heavy chain protein encoding the complementarity determining regions (CDRs). These regions are responsible for antigen specificity and the affinity of the antibody for the antigen. It is evident that providing additional polymorphic CDRs in a repertoire of antibodies will increase the likelihood of there being an antibody with superior binding characteristics for various antigens. In several reports, it has been observed that the VH1-69-encoded variable region of the heavy chain is often found in antibodies that bind influenza virus, HCV and HIV-1 antigens (see table above). Therefore incorporating the polymorphic V gene segments of this example into a transgenic animal model using the methods of Examples 1 and 2 would likely result in an antibody repertoire in said transgenic animal with more antibodies that bind to antigens associated with these and other pathogens. And as is known in the art, a larger repertoire increases the probability of finding monoclonal antibodies using, for example, hybridoma technology, that bind with high affinity and specificity to a desired antigen.
[1021] This disclosure therefore describes in these examples a transgenic mouse model which can be immunized with pathogen or other antigens. Plasma B cells from such an immunized mouse can be used to make a hybridoma library that can be screened for production of antibodies that bind the pathogen antigens. This library will be superior to libraries from traditional transgenic mice for finding such antibodies given the addition of polymorphic VH1-69 gene segments to the IGH locus in said transgenic mouse.
[1022] These examples are not limiting to the human polymorphic V gene segments that can be chosen or to the methods used to introduce them into an animal model. The method can be used to construct a transgenic locus with immunoglobulin D and/or J segments. The V, D, J segments can be from a plurality of human sources (optionally more than one human ethnic population).
Example 4
Human IgH JH Gene Variants Selected from the 1000 Genomes Database
[1023] Data is presented for human JH2, 5 and 6 variants. In Tables 10A, 11A and 12A samples from humans from various populations are listed where the sequence analysis of the inventors has revealed the presence of polymorphisms in one or both IgH JH alleles. The population codes are explained in Table 8 above. The polymorphisms are nucleotide variants from JH2, 5 and 6 reference sequences (SEQ ID NOs: 1, 2 and 3 respectively; see below). All references are sequences taken from the Ensembl database (www.ensembl.org). The JH5 reference is human IgH J5-001 disclosed in that database. The JH6 reference is human IgH J6-001 disclosed in that database. The JH2 reference is human IgH J2-001 disclosed in that database.
[1024] The reference nucleotide and encoded amino acid sequences are shown on the next page. Alignments with encoded amino acid sequences are also provided, including the corresponding position numbers on human chromosome 14.
[1025] Variant Frequencies are shown in Tables 10A, 11A and 12A and these relate to the frequency of the variants in the 1000 Genomes Database (release current at October 2011).
[1026] Tables 10B, 11B and 12B show the non-synonymous nucleotide polymorphisms in the human JH variants, as sorted by the present inventors from the 1000 Genomes database. Position numbers corresponding to nucleotide positions on human chromosome 14 are shown for variant positions (chromosome 14 being the chromosome bearing the IgH locus in humans). Thus, for example, the first entry in Table 11B is "14:106330027:A/C" which refers to a position in a variant JH5 sequence wherein the position corresponds to position 106,330,027 on human chromosome 14, such position being A (adenine) in the reference sequence. The "C" indicates that the present inventors observed a mutation to cytosine at this position in the variants found in the 1000 Genomes database. This change leads to a change at the amino acid level of the encoded sequence (i.e., a "non-synonymous" change), in this case a change from a serine (found in the reference) to an alanine in the variant.
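The position notation used in Tables 10B, 11B and 12B can be read mechanically. The sketch below parses the example entry quoted above into its chromosome, position, reference base and variant base; the `Variant` type and `parse_variant` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple

class Variant(NamedTuple):
    chromosome: str   # e.g. "14"
    position: int     # nucleotide position on the chromosome
    ref: str          # base in the reference sequence
    alt: str          # base observed in the variant

def parse_variant(text: str) -> Variant:
    """Parse an entry of the form 'chromosome:position:ref/alt'."""
    chromosome, position, alleles = text.split(":")
    ref, alt = alleles.split("/")
    return Variant(chromosome, int(position), ref, alt)

v = parse_variant("14:106330027:A/C")
print(v)
```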
Example 5
Human Antibody Gene Segment Variant Identification & Population Analysis
[1027] The genomic coding region coordinates for each target gene for variant analysis were identified from the Ensembl WWW site (www.ensembl.org) using coordinates from the GRCh37.p8 Human Genome assembly (www.ncbi.nlm.nih.gov/projects/genome/assembly/grc). Using the collected gene location coordinates, variant data was extracted from the public ftp site of the 1000 Genomes Project using the Perl `Variant Pattern Finder` (VPF--www.1000genomes.org/variation-pattern-finder-api-documentation).
[1028] Data extracted by VPF was post-processed using software to extract all non-synonymous (NSS) variants with their associated genotype calls. Genotype calls were assembled to form unique haplotypes, representing groups of NSS variants associated with 1000 Genomes population groups and frequency of occurrence within those populations.
[1029] The analysis produces tables such as Table 13. The main body of the table describes each haplotype in turn, giving a unique ID for that gene (in the range a-z,aa-zz), the population frequencies, and the occurrence in individuals and unique population groups; one or more subsequent columns describe the DNA base calls at each location that form the haplotype, giving either the base from the reference sequence or the variant base call.
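The assembly of genotype calls into haplotypes with associated frequencies can be sketched as follows. This is not the inventors' software: the sample records are invented toy data, the calls are assumed to be phased (two haplotypes per individual), and frequency is taken here simply as the share of all sampled chromosomes carrying a haplotype, which may differ from the cumulative frequency definition used for Table 13.

```python
from collections import defaultdict

# Hypothetical phased calls: (individual, population, haplotype 1, haplotype 2),
# each haplotype being the bases observed at the same ordered variant positions.
SAMPLES = [
    ("ind1", "CHB", ("G", "A"), ("G", "A")),
    ("ind2", "CHB", ("G", "A"), ("C", "A")),
    ("ind3", "JPT", ("C", "A"), ("C", "T")),
    ("ind4", "TSI", ("G", "A"), ("G", "A")),
]

def summarise_haplotypes(samples):
    """Group identical haplotypes and report frequency, individuals and populations."""
    chromosomes = defaultdict(int)
    individuals = defaultdict(set)
    populations = defaultdict(set)
    for name, population, hap1, hap2 in samples:
        for hap in (hap1, hap2):
            key = "".join(hap)
            chromosomes[key] += 1
            individuals[key].add(name)
            populations[key].add(population)
    total = 2 * len(samples)   # two chromosomes per individual
    return {
        key: {
            "frequency": chromosomes[key] / total,
            "individuals": len(individuals[key]),
            "populations": sorted(populations[key]),
        }
        for key in chromosomes
    }

summary = summarise_haplotypes(SAMPLES)
for haplotype, row in summary.items():
    print(haplotype, row)
```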
[1030] Table 13 was constructed in this manner. The table can be read as follows:
[1031] The first four columns (left to right) consist of (1) the haplotype ID letter ('ref' indicates reference--the DNA base call at each genomic location from the GRCh37 Human Reference Assembly); (2) the observed cumulative frequency of the haplotype among the different populations; (3) the number of individuals in which a specific haplotype was observed; and (4) the number of unique population groups that the identified individuals belong to (the actual population group identifiers are displayed as a string of ID's in the most right hand column for each haplotype. For example haplotype `a` has a population ID string of `3,4,9,13`).
[1032] The populations are numbered as follows (population labels being according to 1000 Genomes Project nomenclature)
1=ASW;
2=CEU;
3=CHB;
4=CHS;
5=CLM;
6=FIN;
7=GBR;
8=IBS;
9=JPT;
10=LWK;
11=MXL;
12=PUR;
13=TSI;
14=YRI.
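A population ID string from Table 13 can be decoded against the numbering above with a simple lookup. The mapping is copied from the list; `decode_population_ids` is a hypothetical helper name.

```python
# Numbering of 1000 Genomes population labels as listed above.
POPULATIONS = {
    1: "ASW", 2: "CEU", 3: "CHB", 4: "CHS", 5: "CLM", 6: "FIN", 7: "GBR",
    8: "IBS", 9: "JPT", 10: "LWK", 11: "MXL", 12: "PUR", 13: "TSI", 14: "YRI",
}

def decode_population_ids(id_string: str) -> list:
    """Turn a string such as '3,4,9,13' into the corresponding population labels."""
    return [POPULATIONS[int(token)] for token in id_string.split(",")]

print(decode_population_ids("3,4,9,13"))
```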
[1033] Subsequent columns detail a single point variant and have the following format (top to bottom): (1) the human genomic location of the variant (format [chromosome number]: [location] e.g. `14:106204113`); (2) the identifier for the point variant as defined in dbSNP (www.ncbi.nlm.nih.gov/projects/SNP/); (3) one or more additional rows show the amino acid change resulting from the variant for a specific transcript (denoted by the Ensembl transcript ID in the most right-hand column for each row). The format is the amino acid in the reference sequence followed by `->` and the amino acid caused by the substitution of the variant in the reference sequence (e.g. `Gly->Arg` means that the translated reference sequence would result in a glycine at that location, whereas the substitution of the identified variant would result in translated protein containing arginine) using the IUPAC three letter amino acid codes (http://pac.iupac.org/publications/pac/pdf/1972/pdf/3104x0639.pdf). Subsequent rows (one per haplotype) show the DNA base at each location; bases matching the reference sequence are shown in black on a white background, and bases varying from the reference are shown as white text on a black background.
[1034] The most right-hand column contains the Ensembl transcript ID's (e.g. `ENST00000390542`) for each of the gene transcripts and relates to the amino acid changes to the left of this column.
[1035] Because the transcripts are of differing lengths, each variant position may or may not have an associated amino acid change at that position.
Example 6
Transgenic Mice, B-cells, Hybridomas, Antibodies & Heavy Chains Based on Human JH6*02
[1036] A functional human gene segment repertoire (from VH2-26 to JH6, see the IMGT database for the structure of the human IgH locus; http://www.imgt.org/IMGTrepertoire/index.php?section=LocusGenes&repertoire=locus&species=human&group=IGK) was sectored by the inventors to produce two different transgenic heavy chain alleles (denoted S2F and S3F) and corresponding mice. The transgenic alleles were expressed in the mice and the heavy chain repertoires were assessed at the RNA transcript level. Deep sequence analysis was carried out using bioinformatics methods to assess V, D and JH gene usage, including in variable domain sequences having a HCDR3 length of at least 20 amino acids. Endogenous, mouse variable region gene segments were inactivated by inversion (as per the method described in WO2011004192, this disclosure being incorporated herein by reference).
Sequencing of Human Donor DNA Samples: Identification of Conserved JH6*02 Variant
[1037] DNA samples from 9 anonymised consenting human donors were obtained by taking cheek swabs.
[1038] The samples were processed and DNA was extracted following the protocol of the QIAamp DNA Mini Kit (Cat. No. 51304, Qiagen).
[1039] PCR reactions were set up to amplify the JH6 region and PCR products were sequenced (PCR Oligos sequence: Fwd. 5'-AGGCCAGCAGAGGGTTCCATG-3' (SEQ ID NO: 444), Rev. 5'-GGCTCCCAGATCCTCAACCCAC-3' (SEQ ID NO: 445)).
[1040] Sequence analysis was carried out by comparing to the JH6 reference sequence from the IMGT annotated database (http://www.imgt.org/) and this identified that all 9 donor genomes contained the human JH6*02 variant, with this variant being in the homozygous state in 7 out of the 9 donors. The inventors also consulted the genomic sequences publicly available for Jim Watson and Craig Venter in the Ensembl human genome database [http://www.ensembl.org/]. These too contained the human JH6*02 variant. This confirmed to the inventors that human JH6*02 is a common, conserved variant in humans, and thus a good candidate for construction of a transgenic IgH locus as per the invention.
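The donor counts above imply an allele frequency for JH6*02 within this small sample. The sketch below works through that arithmetic, assuming diploid donors and that the 2 carriers who were not homozygous were heterozygous; the figure is a derived illustration for 9 donors and not a population estimate. `allele_frequency` is a hypothetical helper name.

```python
def allele_frequency(n_homozygous: int, n_heterozygous: int, n_donors: int) -> float:
    """Share of all sampled chromosomes that carry the allele (diploid donors)."""
    allele_copies = 2 * n_homozygous + n_heterozygous
    return allele_copies / (2 * n_donors)

N_DONORS = 9
N_HOMOZYGOUS = 7
N_HETEROZYGOUS = N_DONORS - N_HOMOZYGOUS   # all 9 donors carried the variant

freq = allele_frequency(N_HOMOZYGOUS, N_HETEROZYGOUS, N_DONORS)
print(f"{freq:.3f}")   # 16 of 18 chromosomes
```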
Identification of Suitable Human DNA Sequence BACs
[1041] A series of human bacterial artificial chromosome (BAC) clones were identified from Ensembl (http://www.ensembl.org/index.html) or UCSC (http://genome.ucsc.edu/) human database searches based on gene name (IGH) or location (chromosome 14: 106026574-107346185). Seven human RP11 BAC clones (see an extract of the UCSC database in FIG. 10, identified BACs being circled) were selected, with the RP11-1065N8 BAC carrying human JH6*02. In total, the following BACs were identified as sources of human IgH locus DNA: RP11-1065N8, RP11-659B19, RP11-14117, RP-112H5, RP11-101G24, RP11-12F16 and RP11-47P23.
[1042] With a similar approach, different BAC clones (e.g., different RP11 clone IDs or different sources from RP11) or genetically engineered BACs can be selected for insertion into the mouse IGH locus to provide different sets of human repertoires in the transgenic mouse.
Construction of Transgenic IgH Loci
[1043] Insertion of human heavy gene segments from a 1st IGH BAC (RP11-1065N8) into the IGH locus of mouse AB2.1 ES cells (Baylor College of Medicine) was performed to create a heavy chain allele denoted the S1 allele. The inserted human sequence corresponds to the sequence of human chromosome 14 from position 106494908 to position 106328951 and comprises functional heavy gene segments VH2-5, VH7-4-1, VH4-4, VH1-3, VH1-2, VH6-1, D1-1, D2-2, D3-9, D3-10, D4-11, D5-12, D6-13, D1-14, D2-15, D3-16, D4-17, D5-18, D6-19, D1-20, D2-21, D3-22, D4-23, D5-24, D6-25, D1-26, D7-27, JH1, JH2, JH3, JH4, JH5 and JH6 (in 5' to 3' order), wherein the JH6 was chosen to be the human JH6*02 variant. The insertion was made between positions 114666435 and 114666436 on mouse chromosome 12, which is upstream of the mouse C region. The mouse VH, D and J H gene segments were retained in the locus, immediately upstream of (5' of) the inserted human heavy chain DNA.
[1044] A second allele, S2, was constructed in which more human functional VH gene segments were inserted upstream (5') of the 5'-most VH inserted in the S1 allele by the sequential insertion of human DNA from a second BAC (BAC2). The inserted human sequence from BAC2 corresponds to the sequence of human chromosome 14 from position 106601551 to position 106494909 and comprises functional heavy chain gene segments VH3-13, VH3-11, VH3-9, VH1-8, VH3-7. The mouse VH, D and JH gene segments were retained in the locus, immediately upstream of (5' of) the inserted human heavy chain DNA. In a subsequent step, these were inverted to inactivate them, thereby producing S2F mice in which only the human heavy chain variable region gene segments are active.
[1045] A third allele, S3, was constructed in which more human functional VH gene segments were inserted upstream (5') of the 5'-most VH inserted in the S2 allele by the sequential insertion of human DNA from a third BAC (BAC3). The inserted sequence corresponds to the sequence of human chromosome 14 from position 106759988 to position 106609301, and comprises functional heavy chain gene segments VH2-26, VH1-24, VH3-23, VH3-21, VH3-20, VH1-18, and VH3-15. The mouse VH, D and JH gene segments were retained in the locus, immediately upstream of (5' of) the inserted human heavy chain DNA. In a subsequent step, these were inverted to inactivate them, thereby producing S3F mice in which only the human heavy chain variable region gene segments are active.
[1046] Mice bearing either the S2F or S3F insertion into an endogenous heavy chain locus were generated from the ES cells using standard procedures. The other endogenous heavy chain locus was inactivated in the mice by insertion of an inactivating sequence comprising neoR into the mouse JH-C intron (to produce the "HA" allele).
Immunisation Procedure
[1047] Transgenic mice of the S2F or S3F genotype were primed with 20-40 ug of recombinant protein, obtained commercially or produced in house: Antigen 1 (OVA (Sigma A7641)), Antigen 2 (a human infectious disease pathogen antigen) and Antigen 3 (a human antigen). Priming was via the ip route in complete Freund's adjuvant (Sigma F 5881) with 10 ug/animal CpG (CpG oligo; Invivogen, San Diego, Calif., USA). The mice were then boosted twice at intervals of about two weeks with about half the amount of antigen in incomplete Freund's adjuvant (Sigma F 5506) and 10 ug/animal CpG. Final boosts were administered two weeks later iv without any adjuvant and contained 5-10 ug protein in PBS.
Hybridoma Fusion Procedure
[1048] Spleens were taken 3 days after the final boost and splenocytes were treated with CpG (25 m final concentration) for and left until the following day. Cells were then fused with SP2/0-Ag14 myeloma cells (HPA Cultures Cat No 85072401) using a BTX ECM2001 electrofusion instrument. Fused cells were left to recover for 20 minutes, then seeded in a T75 flask until the next morning. The cells were then spun down, plated out by dilution series on 96-well culture plates and left for about 10 days before screening. Media was changed 1-3 times during this period.
Screening
[1049] Culture supernatants of the hybridoma wells above were screened using a homogeneous time resolved fluorescence assay (htrf) using Europium cryptate labelled anti-mouse IgG (Cisbio anti-mouse Ig Europium Cryptate) and a biotin tagged target antigen with a commercially available streptavidin conjugated donor (Cisbio; streptavidin conjugated D2) or by IgG-specific 384 well ELISA. Positive wells identified by htrf were scaled to 24-well plates or immediately counterscreened using an IgG-specific detection ELISA method. Positives identified by primary ELISA screen were immediately expanded to 24-well plates. Once cultures were expanded to 24-well stage and reached confluency, supernatants were re-tested using htrf or IgG-specific ELISA to confirm binding to target antigen. Supernatants of such confirmed cultures were then also analysed by surface plasmon resonance using a BioRad ProteOn XPR36 instrument. For this, antibody expressed in the hybridoma cultures was captured on a biosensor GLM chip (BioRad 176-512) which had an anti-mouse IgG (GE Healthcare BR-1008-38) covalently coupled to the biosensor chip surface. The antigen was then used as the analyte and passed over the captured hybridoma antibody surface. For Antigen 2 and Antigen 3, concentrations of 256 nM, 64 nM, 16 nM, 4 nM and 1 nM were typically used; for Antigen 1, concentrations of 1028 nM, 256 nM, 64 nM, 16 nM and 4 nM were typically used. Binding curves were double referenced using a 0 nM injection (i.e. buffer alone). Kinetics and overall affinities were determined using the 1:1 model inherent to the BioRad ProteOn XPR36 analysis software.
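By way of illustration only, a 1:1 binding model of the general kind referred to above can be sketched in Python as follows. This is the textbook 1:1 (Langmuir) interaction model and is not taken from the BioRad ProteOn XPR36 analysis software; the function names and the example rate constants are hypothetical and chosen purely for the sketch.

```python
import math


def dilution_series(top_nM, fold=4, points=5):
    """Analyte concentrations for a serial dilution, e.g. 256, 64, 16, 4, 1 nM."""
    return [top_nM / fold ** i for i in range(points)]


def equilibrium_dissociation_constant(ka_per_M_s, kd_per_s):
    """KD (in M) for a 1:1 interaction is the ratio of off-rate to on-rate."""
    return kd_per_s / ka_per_M_s


def association_response(t_s, conc_nM, ka_per_M_s, kd_per_s, rmax):
    """Response during the association phase of a 1:1 interaction.

    R(t) = Req * (1 - exp(-(ka*C + kd) * t)), with Req = Rmax * ka*C / (ka*C + kd).
    """
    conc_M = conc_nM * 1e-9
    kobs = ka_per_M_s * conc_M + kd_per_s      # observed rate constant
    req = rmax * ka_per_M_s * conc_M / kobs    # plateau response at this concentration
    return req * (1.0 - math.exp(-kobs * t_s))


if __name__ == "__main__":
    # Hypothetical rate constants, for illustration only.
    ka, kd = 1e5, 1e-3
    print("KD (M):", equilibrium_dissociation_constant(ka, kd))
    for c in dilution_series(256):
        print(c, "nM ->", association_response(300, c, ka, kd, rmax=100.0))
```

In such a model, the 0 nM (buffer alone) injection corresponds to a zero response, which is why it can serve as a reference curve.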
[1050] Any clones with confirmed binding activity were used to prepare total RNA, followed by PCR to recover the heavy chain variable region sequences. Standard 5'-RACE was carried out to analyse RNA transcripts from the transgenic heavy chain loci in the S2F and S3F mice. Additionally, deep sequence analysis of almost 2000 sequences produced by the mice was carried out.
Bioinformatics Analysis
[1051] Sequences for analysis were obtained from two different methods:
[1052] The first is from RNA extracted from the spleen: first-strand cDNA was synthesized using an oligo based on the Cmu region of the mouse IGH locus and used as a PCR template. PCR was performed using this oligo with an oligo dT-anchor primer. The PCR product was then cloned into the pDrive vector (Qiagen) and sequenced.
[1053] The second is from hybridomas generated through electro-fusion: total RNA was extracted from hybridoma lines of interest using standard Trizol methods and frozen at -80 °C for long term storage. cDNA was generated from 100 ng total RNA using standard Superscript III reverse transcriptase and a gene-specific reverse primer binding to all mouse IgG isotypes for heavy chain and a mouse kappa constant region primer for the light chain amplification. 2-3 ul of cDNA were then used as template in a PCR reaction using Pfu DNA polymerase and a panel of degenerate forward primers annealing to the leader sequence of the human immunoglobulin variable domain as well as one mouse pan-IgG reverse primer. PCR products were run on a 1% agarose gel and bands of approximately 350-450 basepairs were extracted and purified. DNA was then sequenced.
[1054] The sequences from the first method can either be from IgM from naive mice or IgG from immunised mice. The samples from the second method are all from IgG from immunised mice, and specific to the immunising antigen. Almost 2000 sequences were analysed.
[1055] The sequences were obtained as a pair of forward and reverse reads. These were first trimmed to remove low-quality base calls from the ends of the reads (trimmed from both ends until a 19 nucleotide window had an average quality score of 25 or more). The reads were combined together by taking the reverse complement of the reverse read, and aligning it against the forward read. The alignment scoring was 5 for a match, -4 for a mismatch, a gap open penalty of 10 and a gap extension penalty of 1. A consensus sequence was then produced by stepping through the alignment and comparing bases. When there was a disagreement the base with the highest quality value from sequencing was used.
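By way of illustration only, the trimming and consensus steps described above can be sketched in Python as follows. The function names are hypothetical and this is not the software actually used. The pairwise alignment itself (match 5, mismatch -4, gap open 10, gap extension 1) is assumed to have been carried out already, so the two reads are supplied as equal-length strings with "-" marking gaps. Taking the base opposite a gap, and preferring the forward read on a quality tie, are assumptions of the sketch.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq, quals):
    """Reverse-complement a read and reverse its quality scores to match."""
    rc = "".join(COMPLEMENT[b] for b in reversed(seq.upper()))
    return rc, list(reversed(quals))


def trim_read(seq, quals, window=19, min_avg=25.0):
    """Trim both ends until a `window`-nt window has mean quality >= min_avg."""
    n = len(seq)
    good = [i for i in range(n - window + 1)
            if sum(quals[i:i + window]) / window >= min_avg]
    if not good:
        return "", []                       # no acceptable window: discard the read
    start, end = good[0], good[-1] + window
    return seq[start:end], quals[start:end]


def consensus(fwd_aln, fwd_q, rev_aln, rev_q):
    """Step through two aligned reads; on disagreement keep the higher-quality base."""
    out = []
    for fb, fq, rb, rq in zip(fwd_aln, fwd_q, rev_aln, rev_q):
        if fb == rb:
            base = fb
        elif fb == "-":                     # assumption: take the base opposite a gap
            base = rb
        elif rb == "-":
            base = fb
        else:
            base = fb if fq >= rq else rb   # assumption: forward read wins a tie
        if base != "-":
            out.append(base)
    return "".join(out)


if __name__ == "__main__":
    read = "ACGT" * 10
    q = [10] * 5 + [30] * 30 + [10] * 5
    print(trim_read(read, q)[0])
    print(consensus("ACGTA", [30, 30, 10, 30, 30], "ACCTA", [30, 30, 40, 30, 30]))
```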
[1056] The BLAST+ (Basic Local Alignment Search Tool) (Camacho C., Coulouris G., Avagyan V., Ma N., Papadopoulos J., Bealer K., & Madden T. L. (2009) "BLAST+: architecture and applications." BMC Bioinformatics 10:421 http://www.ncbi.nlm.nih.gov/pubmed/20003500) program `blastn` was then used to find the germline J and V segments used in each sequence. A wordsize of 30 was used for V matching, and 15 for J matching. The database searched against was constructed from the NGS sequencing of the BACs which were used to generate the Kymouse.
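By way of illustration only, `blastn` command lines using the word sizes stated above might be assembled as in the following Python sketch. The file and database names are hypothetical, and the tabular output format (`-outfmt 6`) is an assumption of the sketch; the output format actually used is not stated. The sketch only builds the argument lists and does not run BLAST+.

```python
V_WORD_SIZE = 30   # word size stated for V segment matching
J_WORD_SIZE = 15   # word size stated for J segment matching


def blastn_command(query_fasta, database, word_size):
    """Build a blastn argument list (suitable for subprocess.run) for one search."""
    return [
        "blastn",
        "-query", query_fasta,
        "-db", database,
        "-word_size", str(word_size),
        "-outfmt", "6",            # assumption: tabular output for easy parsing
    ]


if __name__ == "__main__":
    # Hypothetical file and database names, for illustration only.
    print(" ".join(blastn_command("consensus_reads.fasta", "germline_V", V_WORD_SIZE)))
    print(" ".join(blastn_command("consensus_reads.fasta", "germline_J", J_WORD_SIZE)))
```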
[1057] If a sequence matched both a V and a J segment, the sequence between the two was then compared to a database of germline D segments in the mouse using `blastn` with a wordsize of 4 and the options `blastn-short` and `ungapped`. This was used to assign a D segment, if possible. The CDR3 was identified by searching for the conserved "TATTACTGT" sequence in the V segment, and the "CTGGGG" in the J segment. If these motifs were not found, then up to 4 mismatches were allowed. The IMGT definition of CDR3 was used, so the CDR3 length is calculated from after the "TGT" in the V to before the "TGG" in the J. Sequences with an out of frame junction (those which do not have a CDR3 nucleotide length divisible by 3) or which contained a stop codon ("TAA", "TAG" or "TGA") were excluded.
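By way of illustration only, the CDR3 identification and filtering described above can be sketched in Python as follows. The function names are hypothetical and this is not the software actually used. Where no exact motif is present, the sketch takes the position with the fewest mismatches (up to 4), and it checks for stop codons in frame within the CDR3; both choices are assumptions, since neither detail is specified above.

```python
V_MOTIF = "TATTACTGT"   # conserved 3' end of the V segment, ending in the TGT codon
J_MOTIF = "CTGGGG"      # conserved J motif; its TGG codon starts one base in
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_motif(seq, motif, start=0, max_mismatches=4):
    """Index of an exact match, else of the best match within max_mismatches, else None."""
    exact = seq.find(motif, start)
    if exact != -1:
        return exact
    best_pos, best_mm = None, max_mismatches + 1
    for i in range(start, len(seq) - len(motif) + 1):
        mm = sum(a != b for a, b in zip(seq[i:i + len(motif)], motif))
        if mm < best_mm:
            best_pos, best_mm = i, mm
    return best_pos


def extract_cdr3(seq):
    """Return the CDR3 nucleotides (after TGT, before TGG), or None if excluded."""
    seq = seq.upper()
    v_pos = find_motif(seq, V_MOTIF)
    if v_pos is None:
        return None
    cdr3_start = v_pos + len(V_MOTIF)              # first base after the TGT
    j_pos = find_motif(seq, J_MOTIF, start=cdr3_start)
    if j_pos is None:
        return None
    cdr3 = seq[cdr3_start:j_pos + 1]               # stop just before the TGG
    if len(cdr3) % 3 != 0:
        return None                                # out-of-frame junction
    codons = [cdr3[i:i + 3] for i in range(0, len(cdr3), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        return None                                # contains a stop codon
    return cdr3


if __name__ == "__main__":
    # Made-up example read, for illustration only.
    example = ("GACACGGCCGTGTATTACTGT"
               "GCGAGAGATTACTACTACTACTACGGTATGGACGTC"
               "TGGGGCCAAGGGACCACGGTCACC")
    cdr3 = extract_cdr3(example)
    print(cdr3, len(cdr3) // 3)
```

The CDR3 length in amino acids is the nucleotide length divided by 3, which is the value compared against the long-HCDR3 threshold in the downstream analysis.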
[1058] The identity of the matching V, J and D segments as well as the CDR3 length from this assignment were then saved as a table for downstream analysis. The proportion of sequences using IGHJ6*02 increased from the naive to the immunised mice, and was also enriched in the sub-population of sequences with a long HCDR3 (defined as consisting of 20 or more amino acids):
TABLE-US-00003
             All                        HCDR3 > 20
             JH6*02 %   Total Count     JH6*02 %   Total Count    % HCDR3 > 20
Naive        22.31%     1340            91.11%     45             3.36%
Immunised    37.50%     256             66.67%     9              3.52%
Hybridoma    36.13%     119             63.64%     11             9.24%
[1059] This shows that the JH6*02 gene segment is selected for by immunisation, as the proportion of JH6*02 usage increases after immunisation. JH6*02 is also used in the majority of antibodies with a long HCDR3 length, which is desirable for targets which are specifically bound by long HCDR3 length antibodies.
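By way of illustration only, the derived column of the above table can be recomputed from the reported counts, as in the following Python sketch. The class and method names are hypothetical. The implied number of JH6*02 sequences and the enrichment ratio are quantities derived here from the reported percentages and are not values stated in the table.

```python
from dataclasses import dataclass


@dataclass
class UsageRow:
    group: str
    total: int            # all sequences analysed in the group
    jh6_pct_all: float    # % of all sequences using JH6*02
    long_total: int       # sequences with a long HCDR3
    jh6_pct_long: float   # % of long-HCDR3 sequences using JH6*02

    def pct_long(self):
        """Percentage of sequences in the group that have a long HCDR3."""
        return round(100.0 * self.long_total / self.total, 2)

    def implied_jh6_long(self):
        """Number of long-HCDR3 sequences using JH6*02 implied by the percentage."""
        return round(self.jh6_pct_long / 100.0 * self.long_total)

    def enrichment(self):
        """JH6*02 usage among long-HCDR3 sequences relative to all sequences."""
        return self.jh6_pct_long / self.jh6_pct_all


ROWS = [
    UsageRow("Naive", 1340, 22.31, 45, 91.11),
    UsageRow("Immunised", 256, 37.50, 9, 66.67),
    UsageRow("Hybridoma", 119, 36.13, 11, 63.64),
]

if __name__ == "__main__":
    for row in ROWS:
        print(row.group, row.pct_long(), row.implied_jh6_long(), row.enrichment())
```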
TABLE-US-00004
SEQ ID NO: 1 (JH5 Reference)
TTGACCAAGCTGGGGACCCCGGTCCCTTGGGACCAGTGGCAGAGGAGTC
JH5 Alignment (IgH J5-001; position labels 106,330,072 to 106,330,024):
Top line (SEQ ID NO: 1): TTGACCAAGCTGGGGACCCCGGTCCCTTGGGACCAGTGGCAGAGGAGTC
Middle line (SEQ ID NO: 5): AACTGGTTCGACCCCTGGGGCCAGGGAACCCTGGTCACCGTCTCCTCAG
Bottom line (SEQ ID NO: 6): N W F D P W G Q G T L V T V S S
SEQ ID NO: 2 (JH6 Reference)
ATGATGATGATGATGATGTACCTGCAGACCCCGTTTCCCTGGTGCCAGTGGCAGAGGAGTC
JH6 Alignment (IgH J6-001; position labels 106,329,468 to 106,329,408):
Top line (SEQ ID NO: 2): ATGATGATGATGATGATGTACCTGCAGACCCCGTTTCCCTGGTGCCAGTGGCAGAGGAGTC
Middle line (SEQ ID NO: 7): TACTACTACTACTACTACATGGACGTCTGGGGCAAAGGGACCACGGTCACCGTCTCCTCAG
Bottom line (SEQ ID NO: 8): Y Y Y Y Y Y M D V W G K G T T V T V S S
SEQ ID NO: 3 (JH2 Reference)
ATGACCATGAAGCTAGAGACCCCGGCACCGTGGGACCAGTGACAGAGGAGTC
JH2 Alignment (IgH J5-001; position labels 106,331,460; 106,331,455; 106,331,453):
Top line (SEQ ID NO: 3): ATGACCATGAAGCTAGAGACCCCGGCACCGTGGGACCAGTGACAGAGGAGTC
Middle line (SEQ ID NO: 9): TACTGGTACTTCGATCTCTGGGGCCGTGGCACCCTGGTCACTGTCTCCTCAG
Bottom line (SEQ ID NO: 10): Y W Y F D L W G R G T L V T V S S
TABLES
[1060] In the tables, the notation is illustrated by the following example:
TABLE-US-00005 IGLV1-40 V1-40*02 X53936 g9>c|c10>g, L4>V|
[1061] Polymorphic variant IGV lambda V1-40*02 has Genbank Accession No. X53936 and, when compared to the *01 variant, the V1-40*02 variant has mutations at positions 9, 10 and 4. For example, at position 9, a "C" appears instead of the "G" that is present in the *01 variant. The "|" is simply a notation separator, and does not indicate any mutation. For example, the "g282|" notation indicates no change (i.e., position 282 is a g). "del#" means that the residue at that position is absent.
TABLE-US-00006 Lengthy table referenced here US20140041067A1-20140206-T00001 Please refer to the end of the specification for access instructions.
TABLE-US-00007 Lengthy table referenced here US20140041067A1-20140206-T00002 Please refer to the end of the specification for access instructions.
TABLE-US-00008 Lengthy table referenced here US20140041067A1-20140206-T00003 Please refer to the end of the specification for access instructions.
TABLE-US-00009 Lengthy table referenced here US20140041067A1-20140206-T00004 Please refer to the end of the specification for access instructions.
TABLE-US-00010 Lengthy table referenced here US20140041067A1-20140206-T00005 Please refer to the end of the specification for access instructions.
TABLE-US-00011 Lengthy table referenced here US20140041067A1-20140206-T00006 Please refer to the end of the specification for access instructions.
TABLE-US-00012 Lengthy table referenced here US20140041067A1-20140206-T00007 Please refer to the end of the specification for access instructions.
TABLE-US-00013 Lengthy table referenced here US20140041067A1-20140206-T00008 Please refer to the end of the specification for access instructions.
TABLE-US-00014 Lengthy table referenced here US20140041067A1-20140206-T00009 Please refer to the end of the specification for access instructions.
TABLE-US-00015 Lengthy table referenced here US20140041067A1-20140206-T00010 Please refer to the end of the specification for access instructions.
TABLE-US-00016 Lengthy table referenced here US20140041067A1-20140206-T00011 Please refer to the end of the specification for access instructions.
TABLE-US-00017 Lengthy table referenced here US20140041067A1-20140206-T00012 Please refer to the end of the specification for access instructions.
TABLE-US-00018 Lengthy table referenced here US20140041067A1-20140206-T00013 Please refer to the end of the specification for access instructions.
TABLE-US-00019 Lengthy table referenced here US20140041067A1-20140206-T00014 Please refer to the end of the specification for access instructions.
TABLE-US-00020 Lengthy table referenced here US20140041067A1-20140206-T00015 Please refer to the end of the specification for access instructions.
TABLE-US-00021 Lengthy table referenced here US20140041067A1-20140206-T00016 Please refer to the end of the specification for access instructions.
TABLE-US-00022 Lengthy table referenced here US20140041067A1-20140206-T00017 Please refer to the end of the specification for access instructions.
TABLE-US-00023 Lengthy table referenced here US20140041067A1-20140206-T00018 Please refer to the end of the specification for access instructions.
TABLE-US-00024 Lengthy table referenced here US20140041067A1-20140206-T00019 Please refer to the end of the specification for access instructions.
TABLE-US-00025 Lengthy table referenced here US20140041067A1-20140206-T00020 Please refer to the end of the specification for access instructions.
TABLE-US-00026 Lengthy table referenced here US20140041067A1-20140206-T00021 Please refer to the end of the specification for access instructions.
TABLE-US-00027 Lengthy table referenced here US20140041067A1-20140206-T00022 Please refer to the end of the specification for access instructions.
TABLE-US-00028 Lengthy table referenced here US20140041067A1-20140206-T00023 Please refer to the end of the specification for access instructions.
TABLE-US-00029 Lengthy table referenced here US20140041067A1-20140206-T00024 Please refer to the end of the specification for access instructions.
TABLE-US-00030 Lengthy table referenced here US20140041067A1-20140206-T00025 Please refer to the end of the specification for access instructions.
REFERENCES
[1062] 1. Nat. Biotechnol. 2005 September; 23(9):1117-25; Human antibodies from transgenic animals; Lonberg N.
[1063] 2. J Clin Invest. 1992 March; 89(3):729-38; Immunoglobulin light chain variable region gene sequences for human antibodies to Haemophilus influenzae type b capsular polysaccharide are dominated by a limited number of V kappa and V lambda segments and VJ combinations; Adderson E E, Shackelford P G, Insel R A, Quinn A, Wilson P M, Carroll W L.
[1064] 3. J. Immunol. 1993 Oct. 15; 151(8):4352-61; Clonal characterization of the human IgG antibody repertoire to Haemophilus influenzae type b polysaccharide. V. In vivo expression of individual antibody clones is dependent on Ig CH haplotypes and the categories of antigen; Chung G H, Scott M G, Kim K H, Kearney J, Siber G R, Ambrosino D M, Nahm M H.
[1065] 4. J. Immunol. 1998 Dec. 1; 161(11):6068-73; Decreased frequency of rearrangement due to the synergistic effect of nucleotide changes in the heptamer and nonamer of the recombination signal sequence of the V kappa gene A2b, which is associated with increased susceptibility of Navajos to Haemophilus influenzae type b disease; Nadel B, Tang A, Lugo G, Love V, Escuro G, Feeney A J.
[1066] 5. J Clin Invest. 1996 May 15; 97(10):2277-82; A defective Vkappa A2 allele in Navajos which may play a role in increased susceptibility to Haemophilus influenzae type b disease; Feeney A J, Atkinson M J, Cowan M J, Escuro G, Lugo G.
[1067] 6. Infect Immun. 1994 September; 62(9):3873-80; Variable region sequences of a protective human monoclonal antibody specific for the Haemophilus influenzae type b capsular polysaccharide; Lucas A H, Larrick J W, Reason D C.
[1068] 7. J Clin Invest. 1993 June; 91(6):2734-43; Restricted immunoglobulin VH usage and VDJ combinations in the human response to Haemophilus influenzae type b capsular polysaccharide. Nucleotide sequences of monospecific anti-Haemophilus antibodies and polyspecific antibodies cross-reacting with self antigens; Adderson E E, Shackelford P G, Quinn A, Wilson P M, Cunningham M W, Insel R A, Carroll W L.
[1069] 8. J Clin Invest. 1993 March; 91(3):788-96; Variable region expression in the antibody responses of infants vaccinated with Haemophilus influenzae type b polysaccharide-protein conjugates. Description of a new lambda light chain-associated idiotype and the relation between idiotype expression, avidity, and vaccine formulation. The Collaborative Vaccine Study Group; Granoff D M, Shackelford P G, Holmes S J, Lucas A H.
[1070] 9. Infect Immun. 1994 May; 62(5):1776-86; Variable region sequences and idiotypic expression of a protective human immunoglobulin M antibody to capsular polysaccharides of Neisseria meningitidis group B and Escherichia coli K1; Azmi F H, Lucas A H, Raff H V, Granoff D M.
[1071] 10. J Clin Invest. 1992 December; 90(6):2197-208; Sequence analyses of three immunoglobulin G anti-virus antibodies reveal their utilization of autoantibody-related immunoglobulin Vh genes, but not V lambda genes; Huang D F, Olee T, Masuho Y, Matsumoto Y, Carson D A, Chen P P.
[1072] 11. Science. 2011 Aug. 12; 333(6044):834-5; Biochemistry. Catching a moving target; Wang T T, Palese P.
[1073] 12. Science. 2009 Apr. 10; 324(5924):246-51. Epub 2009 Feb. 26; Antibody recognition of a highly conserved influenza virus epitope; Ekiert D C, Bhabha G, Elsliger M A, Friesen R H, Jongeneelen M, Throsby M, Goudsmit J, Wilson I A.
[1074] 13. PLoS One. 2008; 3(12):e3942. Epub 2008 Dec. 16; Heterosubtypic neutralizing monoclonal antibodies cross-protective against H5N1 and H1N1 recovered from human IgM+ memory B cells; Throsby M, van den Brink E, Jongeneelen M, Poon L L, Alard P, Cornelissen L, Bakker A, Cox F, van Deventer E, Guan Y, Cinatl J, ter Meulen J, Lasters I, Carsetti R, Peiris M, de Kruif J, Goudsmit J.
[1075] 14. Nat Struct Mol Biol. 2009 March; 16(3):265-73. Epub 2009 Feb. 22; Structural and functional bases for broad-spectrum neutralization of avian and human influenza A viruses; Sui J, Hwang W C, Perez S, Wei G, Aird D, Chen L M, Santelli E, Stec B, Cadwell G, Ali M, Wan H, Murakami A, Yammanuru A, Han T, Cox N J, Bankston L A, Donis R O, Liddington R C, Marasco W A.
[1076] 15. Science. 2011 Aug. 12; 333(6044):843-50. Epub 2011 Jul. 7; A highly conserved neutralizing epitope on group 2 influenza A viruses; Ekiert D C, Friesen R H, Bhabha G, Kwaks T, Jongeneelen M, Yu W, Ophorst C, Cox F, Korse H J, Brandenburg B, Vogels R, Brakenhoff J P, Kompier R, Koldijk M H, Cornelissen L A, Poon L L, Peiris M, Koudstaal W, Wilson I A, Goudsmit J.
TABLE-US-LTS-00001
[1076] LENGTHY TABLES The patent application contains a lengthy table section. A copy of the table is available in electronic form from the USPTO web site (http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20140041067A1). An electronic copy of the table will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).
Sequence CWU
1
1
445149DNAhomo sapiens; 1ttgaccaagc tggggacccc ggtcccttgg gaccagtggc
agaggagtc 49261DNAhomo sapiens; 2atgatgatga tgatgatgta
cctgcagacc ccgtttccct ggtgccagtg gcagaggagt 60c
61352DNAhomo sapiens;
3atgaccatga agctagagac cccggcaccg tgggaccagt gacagaggag tc
52461DNAhomo sapiens; 4atgatgatga tgatgccata cctgcagacc ccggttccct
ggtgccagtg gcagaggagt 60c
61549DNAhomo sapiens; 5aactggttcg acccctgggg
ccagggaacc ctggtcaccg tctcctcag 49616PRThomo sapiens;
6Asn Trp Phe Asp Pro Trp Gly Gln Gly Thr Leu Val Thr Val Ser Ser 1
5 10 15 761DNAhomo
sapiens; 7tactactact actactacat ggacgtctgg ggcaaaggga ccacggtcac
cgtctcctca 60g
61820PRThomo sapiens; 8Tyr Tyr Tyr Tyr Tyr Tyr Met Asp Val
Trp Gly Lys Gly Thr Thr Val 1 5 10
15 Thr Val Ser Ser 20 952DNAhomo sapiens;
9tactggtact tcgatctctg gggccgtggc accctggtca ctgtctcctc ag
521017PRThomo sapiens; 10Tyr Trp Tyr Phe Asp Leu Trp Gly Arg Gly Thr Leu
Val Thr Val Ser 1 5 10
15 Ser 11296DNAhomo sapiens; 11caggtgcagc tggtgcagtc tggggctgag
gtgaagaagc ctggggcctc agtgaaggtc 60tcctgcaagg cttctggata caccttcacc
ggctactata tgcactgggt gcgacaggcc 120cctggacaag ggcttgagtg gatgggacgg
atcaacccta acagtggtgg cacaaactat 180gcacagaagt ttcagggcag ggtcaccagt
accagggaca cgtccatcag cacagcctac 240atggagctga gcaggctgag atctgacgac
acggtcgtgt attactgtgc gagaga 29612296DNAhomo sapiens; 12caggtccagc
ttgtgcagtc tggggctgag gtgaagaagc ctggggcctc agtgaaggtt 60tcctgcaagg
cttctggata caccttcact agctatgcta tgcattgggt gcgccaggcc 120cccggacaaa
ggcttgagtg gatgggatgg atcaacgctg gcaatggtaa cacaaaatat 180tcacagaagt
tccagggcag agtcaccatt accagggaca catccgcgag cacagcctac 240atggagctga
gcagcctgag atctgaagac acggctgtgt attactgtgc gagaga 29613296DNAhomo
sapiens; 13caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc
agtgaaggtc 60tcctgcaagg cttctggata caccttcacc agttatgata tcaactgggt
gcgacaggcc 120actggacaag ggcttgagtg gatgggatgg atgaacccta acagtggtaa
cacaggctat 180gcacagaagt tccagggcag agtcaccatg accaggaaca cctccataag
cacagcctac 240atggagctga gcagcctgag atctgaggac acggccgtgt attactgtgc
gagagg 29614296DNAhomo sapiens; 14caggtccagc tggtacagtc tggggctgag
gtgaagaagc ctggggcctc agtgaaggtc 60tcctgcaagg tttccggata caccctcact
gaattatcca tgcactgggt gcgacaggct 120cctggaaaag ggcttgagtg gatgggaggt
tttgatcctg aagatggtga aacaatctac 180gcacagaagt tccagggcag agtcaccatg
accgaggaca catctacaga cacagcctac 240atggagctga gcagcctgag atctgaggac
acggccgtgt attactgtgc aacaga 29615296DNAhomo
sapiens;misc_feature(295)..(295)n is a, c, g, or t 15cagatgcagc
tggtgcagtc tggggctgag gtgaagaaga ctgggtcctc agtgaaggtt 60tcctgcaagg
cttccggata caccttcacc taccgctacc tgcactgggt gcgacaggcc 120cccggacaag
cgcttgagtg gatgggatgg atcacacctt tcaatggtaa caccaactac 180gcacagaaat
tccaggacag agtcaccatt actagggaca ggtctatgag cacagcctac 240atggagctga
gcagcctgag atctgaggac acagccatgt attactgtgc aagana 29616296DNAhomo
sapiens; 16caggtgcagc tggtgcagtc tggggctgag gtgaagaagc ctggggcctc
agtgaaggtt 60tcctgcaagg catctggata caccttcacc agctactata tgcactgggt
gcgacaggcc 120cctggacaag ggcttgagtg gatgggaata atcaacccta gtggtggtag
cacaagctac 180gcacagaagt tccagggcag agtcaccatg accagggaca cgtccacgag
cacagtctac 240atggagctga gcagcctgag atctgaggac acggccgtgt attactgtgc
gagaga 29617296DNAhomo sapiens; 17caaatgcagc tggtgcagtc tgggcctgag
gtgaagaagc ctgggacctc agtgaaggtc 60tcctgcaagg cttctggatt cacctttact
agctctgctg tgcagtgggt gcgacaggct 120cgtggacaac gccttgagtg gataggatgg
atcgtcgttg gcagtggtaa cacaaactac 180gcacagaagt tccaggaaag agtcaccatt
accagggaca tgtccacaag cacagcctac 240atggagctga gcagcctgag atccgaggac
acggccgtgt attactgtgc ggcaga 29618296DNAhomo sapiens; 18caggtgcagc
tggtgcagtc tggggctgag gtgaagaagc ctgggtcctc ggtgaaggtc 60tcctgcaagg
cttctggagg caccttcagc agctatgcta tcagctgggt gcgacaggcc 120cctggacaag
ggcttgagtg gatgggaggg atcatcccta tctttggtac agcaaactac 180gcacagaagt
tccagggcag agtcacgatt accgcggacg aatccacgag cacagcctac 240atggagctga
gcagcctgag atctgaggac acggccgtgt attactgtgc gagaga 29619294DNAhomo
sapiens; 19caggtccagc tggtgcagtc ttgggctgag gtgaggaagt ctggggcctc
agtgaaagtc 60tcctgtagtt tttctgggtt taccatcacc agctacggta tacattgggt
gcaacagtcc 120cctggacaag ggcttgagtg gatgggatgg atcaaccctg gcaatggtag
cccaagctat 180gccaagaagt ttcagggcag attcaccatg accagggaca tgtccacaac
cacagcctac 240acagacctga gcagcctgac atctgaggac atggctgtgt attactatgc
aaga 29420294DNAhomo sapiens; 20gaggtccagc tggtacagtc tggggctgag
gtgaagaagc ctggggctac agtgaaaatc 60tcctgcaagg tttctggata caccttcacc
gactactaca tgcactgggt gcaacaggcc 120cctggaaaag ggcttgagtg gatgggactt
gttgatcctg aagatggtga aacaatatac 180gcagagaagt tccagggcag agtcaccata
accgcggaca cgtctacaga cacagcctac 240atggagctga gcagcctgag atctgaggac
acggccgtgt attactgtgc aaca 29421302DNAhomo sapiens; 21cagatcacct
tgaaggagtc tggtcctacg ctggtgaaac ccacacagac cctcacgctg 60acctgcacct
tctctgggtt ctcactcagc actagtggag tgggtgtggg ctggatccgt 120cagcccccag
gaaaggccct ggagtggctt gcactcattt attggaatga tgataagcgc 180tacagcccat
ctctgaagag caggctcacc atcaccaagg acacctccaa aaaccaggtg 240gtccttacaa
tgaccaacat ggaccctgtg gacacagcca catattactg tgcacacaga 300cc
30222301DNAhomo
sapiens; 22caggtcacct tgaaggagtc tggtcctgtg ctggtgaaac ccacagagac
cctcacgctg 60acctgcaccg tctctgggtt ctcactcagc aatgctagaa tgggtgtgag
ctggatccgt 120cagcccccag ggaaggccct ggagtggctt gcacacattt tttcgaatga
cgaaaaatcc 180tacagcacat ctctgaagag caggctcacc atctccaagg acacctccaa
aagccaggtg 240gtccttacca tgaccaacat ggaccctgtg gacacagcca catattactg
tgcacggata 300c
30123301DNAhomo sapiens; 23caggtcacct tgagggagtc tggtcctgcg
ctggtgaaac ccacacagac cctcacactg 60acctgcacct tctctgggtt ctcactcagc
actagtggaa tgtgtgtgag ctggatccgt 120cagcccccag ggaaggccct ggagtggctt
gcactcattg attgggatga tgataaatac 180tacagcacat ctctgaagac caggctcacc
atctccaagg acacctccaa aaaccaggtg 240gtccttacaa tgaccaacat ggaccctgtg
gacacagcca cgtattactg tgcacggata 300c
30124296DNAhomo sapiens; 24gaggtgcagc
tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc 60tcctgtgcag
cctctggatt cacctttagt agctattgga tgagctgggt ccgccaggct 120ccagggaagg
ggctggagtg ggtggccaac ataaagcaag atggaagtga gaaatactat 180gtggactctg
tgaagggccg attcaccatc tccagagaca acgccaagaa ctcactgtat 240ctgcaaatga
acagcctgag agccgaggac acggctgtgt attactgtgc gagaga 29625298DNAhomo
sapiens; 25gaagtgcagc tggtggagtc tgggggaggc ttggtacagc ctggcaggtc
cctgagactc 60tcctgtgcag cctctggatt cacctttgat gattatgcca tgcactgggt
ccggcaagct 120ccagggaagg gcctggagtg ggtctcaggt attagttgga atagtggtag
cataggctat 180gcggactctg tgaagggccg attcaccatc tccagagaca acgccaagaa
ctccctgtat 240ctgcaaatga acagtctgag agctgaggac acggccttgt attactgtgc
aaaagata 29826296DNAhomo sapiens; 26caggtgcagc tggtggagtc tgggggaggc
ttggtcaagc ctggagggtc cctgagactc 60tcctgtgcag cctctggatt caccttcagt
gactactaca tgagctggat ccgccaggct 120ccagggaagg ggctggagtg ggtttcatac
attagtagta gtggtagtac catatactac 180gcagactctg tgaagggccg attcaccatc
tccagggaca acgccaagaa ctcactgtat 240ctgcaaatga acagcctgag agccgaggac
acggccgtgt attactgtgc gagaga 29627293DNAhomo sapiens; 27gaggtgcagc
tggtggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc 60tcctgtgcag
cctctggatt caccttcagt agctacgaca tgcactgggt ccgccaagct 120acaggaaaag
gtctggagtg ggtctcagct attggtactg ctggtgacac atactatcca 180ggctccgtga
agggccgatt caccatctcc agagaaaatg ccaagaactc cttgtatctt 240caaatgaaca
gcctgagagc cggggacacg gctgtgtatt actgtgcaag aga 29328302DNAhomo
sapiens; 28gaggtgcagc tggtggagtc tgggggaggc ttggtaaagc ctggggggtc
ccttagactc 60tcctgtgcag cctctggatt cactttcagt aacgcctgga tgagctgggt
ccgccaggct 120ccagggaagg ggctggagtg ggttggccgt attaaaagca aaactgatgg
tgggacaaca 180gactacgctg cacccgtgaa aggcagattc accatctcaa gagatgattc
aaaaaacacg 240ctgtatctgc aaatgaacag cctgaaaacc gaggacacag ccgtgtatta
ctgtaccaca 300ga
30229296DNAhomo sapiens; 29gaggtacaac tggtggagtc tgggggaggc
ttggtacagc ctggggggtc cctgagactc 60tcctgtgcag cctctggatt caccttcagt
aacagtgaca tgaactgggc ccgcaaggct 120ccaggaaagg ggctggagtg ggtatcgggt
gttagttgga atggcagtag gacgcactat 180gtggactccg tgaagcgccg attcatcatc
tccagagaca attccaggaa ctccctgtat 240ctgcaaaaga acagacggag agccgaggac
atggctgtgt attactgtgt gagaaa 29630296DNAhomo sapiens; 30acagtgcagc
tggtggagtc tgggggaggc ttggtagagc ctggggggtc cctgagactc 60tcctgtgcag
cctctggatt caccttcagt aacagtgaca tgaactgggt ccgccaggct 120ccaggaaagg
ggctggagtg ggtatcgggt gttagttgga atggcagtag gacgcactat 180gcagactctg
tgaagggccg attcatcatc tccagagaca attccaggaa cttcctgtat 240cagcaaatga
acagcctgag gcccgaggac atggctgtgt attactgtgt gagaaa 29631296DNAhomo
sapiens; 31gaggtgcagc tggtggagtc tgggggaggt gtggtacggc ctggggggtc
cctgagactc 60tcctgtgcag cctctggatt cacctttgat gattatggca tgagctgggt
ccgccaagct 120ccagggaagg ggctggagtg ggtctctggt attaattgga atggtggtag
cacaggttat 180gcagactctg tgaagggccg attcaccatc tccagagaca acgccaagaa
ctccctgtat 240ctgcaaatga acagtctgag agccgaggac acggccttgt atcactgtgc
gagaga 29632296DNAhomo sapiens; 32gaggtgcagc tggtggagtc tgggggaggc
ctggtcaagc ctggggggtc cctgagactc 60tcctgtgcag cctctggatt caccttcagt
agctatagca tgaactgggt ccgccaggct 120ccagggaagg ggctggagtg ggtctcatcc
attagtagta gtagtagtta catatactac 180gcagactcag tgaagggccg attcaccatc
tccagagaca acgccaagaa ctcactgtat 240ctgcaaatga acagcctgag agccgaggac
acggctgtgt attactgtgc gagaga 29633296DNAhomo sapiens; 33gaggtgcagc
tgttggagtc tgggggaggc ttggtacagc ctggggggtc cctgagactc 60tcctgtgcag
cctctggatt cacctttagc agctatgcca tgagctgggt ccgccaggct 120ccagggaagg
ggctggagtg ggtctcagct attagtggta gtggtggtag cacatactac 180gcagactccg
tgaagggccg gttcaccatc tccagagaca attccaagaa cacgctgtat 240ctgcaaatga
acagcctgag agccgaggac acggccgtat attactgtgc gaaaga 29634296DNAhomo
sapiens; 34caggtgcagc tggtggagtc tgggggaggc gtggtccagc ctgggaggtc
cctgagactc 60tcctgtgcag cctctggatt caccttcagt agctatgcta tgcactgggt
ccgccaggct 120ccaggcaagg ggctagagtg ggtggcagtt atatcatatg atggaagtaa
taaatactac 180gcagactccg tgaagggccg attcaccatc tccagagaca attccaagaa
cacgctgtat 240ctgcaaatga acagcctgag agctgaggac acggctgtgt attactgtgc
gagaga 29635294DNAhomo sapiens; 35caggtgcagc tggtggagtc tgggggaggc
gtggtccagc ctgggaggtc cctgagactc 60tcctgtgcag cctctggatt caccttcagt
agctatgcta tgcactgggt ccgccaggct 120ccaggcaagg ggctggagtg ggtggcagtt
atatcatatg atggaagcaa taaatactac 180gcagactccg tgaagggccg attcaccatc
tccagagaca attccaagaa cacgctgtat 240ctgcaaatga acagcctgag agctgaggac
acggctgtgt attactgtgc gaga 29436296DNAhomo sapiens; 36caggtgcagc
tggtggagtc tgggggaggc gtggtccagc ctgggaggtc cctgagactc 60tcctgtgcag
cgtctggatt caccttcagt agctatggca tgcactgggt ccgccaggct 120ccaggcaagg
ggctggagtg ggtggcagtt atatggtatg atggaagtaa taaatactat 180gcagactccg
tgaagggccg attcaccatc tccagagaca attccaagaa cacgctgtat 240ctgcaaatga
acagcctgag agccgaggac acggctgtgt attactgtgc gagaga 29637296DNAhomo
sapiens; 37gaggtgcagc tggtggagtc tgggggaggc ttggtacagc ctgggggatc
cctgagactc 60tcctgtgcag cctctggatt caccttcagt aacagtgaca tgaactgggt
ccatcaggct 120ccaggaaagg ggctggagtg ggtatcgggt gttagttgga atggcagtag
gacgcactat 180gcagactctg tgaagggccg attcatcatc tccagagaca attccaggaa
caccctgtat 240ctgcaaacga atagcctgag ggccgaggac acggctgtgt attactgtgt
gagaaa 29638292DNAhomo sapiens; 38gaggtgcagc tggtggagtc tgggggaggc
ttggtacagc ctagggggtc cctgagactc 60tcctgtgcag cctctggatt caccgtcagt
agcaatgaga tgagctggat ccgccaggct 120ccagggaagg ggctggagtg ggtctcatcc
attagtggtg gtagcacata ctacgcagac 180tccaggaagg gcagattcac catctccaga
gacaattcca agaacacgct gtatcttcaa 240atgaacaacc tgagagctga gggcacggcc
gcgtattact gtgccagata ta 29239298DNAhomo sapiens; 39gaagtgcagc
tggtggagtc tgggggagtc gtggtacagc ctggggggtc cctgagactc 60tcctgtgcag
cctctggatt cacctttgat gattatacca tgcactgggt ccgtcaagct 120ccggggaagg
gtctggagtg ggtctctctt attagttggg atggtggtag cacatactat 180gcagactctg
tgaagggccg attcaccatc tccagagaca acagcaaaaa ctccctgtat 240ctgcaaatga
acagtctgag aactgaggac accgccttgt attactgtgc aaaagata 29840291DNAhomo
sapiens; 40gaggatcagc tggtggagtc tgggggaggc ttggtacagc ctggggggtc
cctgcgaccc 60tcctgtgcag cctctggatt cgccttcagt agctatgctc tgcactgggt
tcgccgggct 120ccagggaagg gtctggagtg ggtatcagct attggtactg gtggtgatac
atactatgca 180gactccgtga tgggccgatt caccatctcc agagacaacg ccaagaagtc
cttgtatctt 240catatgaaca gcctgatagc tgaggacatg gctgtgtatt attgtgcaag a
29141296DNAhomo sapiens; 41gaggtgcagc tggtggagtc tgggggaggc
ttggtacagc ctggggggtc cctgagactc 60tcctgtgcag cctctggatt caccttcagt
agctatagca tgaactgggt ccgccaggct 120ccagggaagg ggctggagtg ggtttcatac
attagtagta gtagtagtac catatactac 180gcagactctg tgaagggccg attcaccatc
tccagagaca atgccaagaa ctcactgtat 240ctgcaaatga acagcctgag agccgaggac
acggctgtgt attactgtgc gagaga 29642302DNAhomo sapiens; 42gaggtgcagc
tggtggagtc tgggggaggc ttggtacagc cagggcggtc cctgagactc 60tcctgtacag
cttctggatt cacctttggt gattatgcta tgagctggtt ccgccaggct 120ccagggaagg
ggctggagtg ggtaggtttc attagaagca aagcttatgg tgggacaaca 180gaatacaccg
cgtctgtgaa aggcagattc accatctcaa gagatggttc caaaagcatc 240gcctatctgc
aaatgaacag cctgaaaacc gaggacacag ccgtgtatta ctgtactaga 300ga
30243293DNAhomo
sapiens; 43gaggtgcagc tggtggagtc tggaggaggc ttgatccagc ctggggggtc
cctgagactc 60tcctgtgcag cctctgggtt caccgtcagt agcaactaca tgagctgggt
ccgccaggct 120ccagggaagg ggctggagtg ggtctcagtt atttatagcg gtggtagcac
atactacgca 180gactccgtga agggccgatt caccatctcc agagacaatt ccaagaacac
gctgtatctt 240caaatgaaca gcctgagagc cgaggacacg gccgtgtatt actgtgcgag
aga 29344296DNAhomo sapiens; 44gaggtgcagc tggtggagtc tgggggaggc
ttggtccagc ctggggggtc cctgagactc 60tcctgtgcag cctctggatt caccttcagt
agctatgcta tgcactgggt ccgccaggct 120ccagggaagg gactggaata tgtttcagct
attagtagta atgggggtag cacatattat 180gcaaactctg tgaagggcag attcaccatc
tccagagaca attccaagaa cacgctgtat 240cttcaaatgg gcagcctgag agctgaggac
atggctgtgt attactgtgc gagaga 29645293DNAhomo sapiens; 45gaggtgcagc
tggtggagtc tgggggaggc ttggtccagc ctggggggtc cctgagactc 60tcctgtgcag
cctctggatt caccgtcagt agcaactaca tgagctgggt ccgccaggct 120ccagggaagg
ggctggagtg ggtctcagtt atttatagcg gtggtagcac atactacgca 180gactccgtga
agggcagatt caccatctcc agagacaatt ccaagaacac gctgtatctt 240caaatgaaca
gcctgagagc cgaggacacg gctgtgtatt actgtgcgag aga 29346302DNAhomo
sapiens; 46gaggtgcagc tggtggagtc tgggggaggc ttggtccagc ctggagggtc
cctgagactc 60tcctgtgcag cctctggatt caccttcagt gaccactaca tggactgggt
ccgccaggct 120ccagggaagg ggctggagtg ggttggccgt actagaaaca aagctaacag
ttacaccaca 180gaatacgccg cgtctgtgaa aggcagattc accatctcaa gagatgattc
aaagaactca 240ctgtatctgc aaatgaacag cctgaaaacc gaggacacgg ccgtgtatta
ctgtgctaga 300ga
30247302DNAhomo sapiens; 47gaggtgcagc tggtggagtc tgggggaggc
ttggtccagc ctggggggtc cctgaaactc 60tcctgtgcag cctctgggtt caccttcagt
ggctctgcta tgcactgggt ccgccaggct 120tccgggaaag ggctggagtg ggttggccgt
attagaagca aagctaacag ttacgcgaca 180gcatatgctg cgtcggtgaa aggcaggttc
accatctcca gagatgattc aaagaacacg 240gcgtatctgc aaatgaacag cctgaaaacc
gaggacacgg ccgtgtatta ctgtactaga 300ca
30248296DNAhomo sapiens; 48gaggtgcagc
tggtggagtc cgggggaggc ttagttcagc ctggggggtc cctgagactc 60tcctgtgcag
cctctggatt caccttcagt agctactgga tgcactgggt ccgccaagct 120ccagggaagg
ggctggtgtg ggtctcacgt attaatagtg atgggagtag cacaagctac 180gcggactccg
tgaagggccg attcaccatc tccagagaca acgccaagaa cacgctgtat 240ctgcaaatga
acagtctgag agccgaggac acggctgtgt attactgtgc aagaga 29649288DNAhomo
sapiens; 49gaggtgcagc tggtggagtc tcggggagtc ttggtacagc ctggggggtc
cctgagactc 60tcctgtgcag cctctggatt caccgtcagt agcaatgaga tgagctgggt
ccgccaggct 120ccagggaagg gtctggagtg ggtctcatcc attagtggtg gtagcacata
ctacgcagac 180tccaggaagg gcagattcac catctccaga gacaattcca agaacacgct
gcatcttcaa 240atgaacagcc tgagagctga ggacacggct gtgtattact gtaagaaa
28850293DNAhomo sapiens; 50gaggtgcagc tggtggagtc tgggggaggc
ttggtaaagc ctggggggtc cctgagactc 60tcctgtgcag cctctggatt caccttcagt
gactactaca tgaactgggt ccgccaggct 120ccagggaagg ggctggagtg ggtctcatcc
attagtagta gtagtaccat atactacgca 180gactctgtga agggccgatt caccatctcc
agagacaacg ccaagaactc actgtatctg 240caaatgaaca gcctgagagc cgaggacacg
gctgtgtatt actgtgcgag aga 29351296DNAhomo sapiens; 51caggtgcagc
tgcaggagtc gggcccagga ctggtgaagc ctccggggac cctgtccctc 60acctgcgctg
tctctggtgg ctccatcagc agtagtaact ggtggagttg ggtccgccag 120cccccaggga
aggggctgga gtggattggg gaaatctatc atagtgggag caccaactac 180aacccgtccc
tcaagagtcg agtcaccata tcagtagaca agtccaagaa ccagttctcc 240ctgaagctga
gctctgtgac cgccgcggac acggccgtgt attgctgtgc gagaga 29652296DNAhomo
sapiens; 52caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggacac
cctgtccctc 60acctgcgctg tctctggtta ctccatcagc agtagtaact ggtggggctg
gatccggcag 120cccccaggga agggactgga gtggattggg tacatctatt atagtgggag
cacctactac 180aacccgtccc tcaagagtcg agtcaccatg tcagtagaca cgtccaagaa
ccagttctcc 240ctgaagctga gctctgtgac cgccgtggac acggccgtgt attactgtgc
gagaaa 29653299DNAhomo sapiens; 53cagctgcagc tgcaggagtc cggctcagga
ctggtgaagc cttcacagac cctgtccctc 60acctgcgctg tctctggtgg ctccatcagc
agtggtggtt actcctggag ctggatccgg 120cagccaccag ggaagggcct ggagtggatt
gggtacatct atcatagtgg gagcacctac 180tacaacccgt ccctcaagag tcgagtcacc
atatcagtag acaggtccaa gaaccagttc 240tccctgaagc tgagctctgt gaccgccgcg
gacacggccg tgtattactg tgccagaga 29954299DNAhomo sapiens; 54caggtgcagc
tgcaggagtc gggcccagga ctggtgaagc cttcacagac cctgtccctc 60acctgcactg
tctctggtgg ctccatcagc agtggtgatt actactggag ttggatccgc 120cagcccccag
ggaagggcct ggagtggatt gggtacatct attacagtgg gagcacctac 180tacaacccgt
ccctcaagag tcgagttacc atatcagtag acacgtccaa gaaccagttc 240tccctgaagc
tgagctctgt gactgccgca gacacggccg tgtattactg tgccagaga 29955299DNAhomo
sapiens; 55caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcacagac
cctgtccctc 60acctgcactg tctctggtgg ctccatcagc agtggtggtt actactggag
ctggatccgc 120cagcacccag ggaagggcct ggagtggatt gggtacatct attacagtgg
gagcacctac 180tacaacccgt ccctcaagag tctagttacc atatcagtag acacgtctaa
gaaccagttc 240tccctgaagc tgagctctgt gactgccgcg gacacggccg tgtattactg
tgcgagaga 29956293DNAhomo sapiens; 56caggtgcagc tacagcagtg gggcgcagga
ctgttgaagc cttcggagac cctgtccctc 60acctgcgctg tctatggtgg gtccttcagt
ggttactact ggagctggat ccgccagccc 120ccagggaagg ggctggagtg gattggggaa
atcaatcata gtggaagcac caactacaac 180ccgtccctca agagtcgagt caccatatca
gtagacacgt ccaagaacca gttctccctg 240aagctgagct ctgtgaccgc cgcggacacg
gctgtgtatt actgtgcgag agg 29357299DNAhomo sapiens; 57cagctgcagc
tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60acctgcactg
tctctggtgg ctccatcagc agtagtagtt actactgggg ctggatccgc 120cagcccccag
ggaaggggct ggagtggatt gggagtatct attatagtgg gagcacctac 180tacaacccgt
ccctcaagag tcgagtcacc atatccgtag acacgtccaa gaaccagttc 240tccctgaagc
tgagctctgt gaccgccgca gacacggctg tgtattactg tgcgagaca 29958296DNAhomo
sapiens; 58caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac
cctgtccctc 60atctgcgctg tctctggtga ctccatcagc agtggtaact ggtgaatctg
ggtccgccag 120cccccaggga aggggctgga gtggattggg gaaatccatc atagtgggag
cacctactac 180aacccgtccc tcaagagtcg aatcaccatg tccgtagaca cgtccaagaa
ccagttctac 240ctgaagctga gctctgtgac cgccgcggac acggccgtgt attactgtgc
gagata 29659293DNAhomo sapiens; 59caggtgcagc tgcaggagtc gggcccagga
ctggtgaagc cttcggagac cctgtccctc 60acctgcactg tctctggtgg ctccatcagt
agttactact ggagctggat ccggcagccc 120ccagggaagg gactggagtg gattgggtat
atctattaca gtgggagcac caactacaac 180ccctccctca agagtcgagt caccatatca
gtagacacgt ccaagaacca gttctccctg 240aagctgagct ctgtgaccgc tgcggacacg
gccgtgtatt actgtgcgag aga 29360299DNAhomo sapiens; 60caggtgcagc
tgcaggagtc gggcccagga ctggtgaagc cttcggagac cctgtccctc 60acctgcactg
tctctggtgg ctccgtcagc agtggtagtt actactggag ctggatccgg 120cagcccccag
ggaagggact ggagtggatt gggtatatct attacagtgg gagcaccaac 180tacaacccct
ccctcaagag tcgagtcacc atatcagtag acacgtccaa gaaccagttc 240tccctgaagc
tgagctctgt gaccgctgcg gacacggccg tgtattactg tgcgagaga 29961294DNAhomo
sapiens; 61caggtgcagc tgcaggagtc gggcccagga ctggtgaagc cttcggagac
cctgtccctc 60acctgcgctg tctctggtta ctccatcagc agtggttact actggggctg
gatccggcag 120cccccaggga aggggctgga gtggattggg agtatctatc atagtgggag
cacctactac 180aacccgtccc tcaagagtcg agtcaccata tcagtagaca cgtccaagaa
ccagttctcc 240ctgaagctga gctctgtgac cgccgcagac acggccgtgt attactgtgc
gaga 29462296DNAhomo sapiens; 62gaggtgcagc tggtgcagtc tggagcagag
gtgaaaaagc ccggggagtc tctgaagatc 60tcctgtaagg gttctggata cagctttacc
agctactgga tcggctgggt gcgccagatg 120cccgggaaag gcctggagtg gatggggatc
atctatcctg gtgactctga taccagatac 180agcccgtcct tccaaggcca ggtcaccatc
tcagccgaca agtccatcag caccgcctac 240ctgcagtgga gcagcctgaa ggcctcggac
accgccatgt attactgtgc gagaca 29663294DNAhomo sapiens; 63gaagtgcagc
tggtgcagtc tggagcagag gtgaaaaagc ccggggagtc tctgaggatc 60tcctgtaagg
gttctggata cagctttacc agctactgga tcagctgggt gcgccagatg 120cccgggaaag
gcctggagtg gatggggagg attgatccta gtgactctta taccaactac 180agcccgtcct
tccaaggcca cgtcaccatc tcagctgaca agtccatcag cactgcctac 240ctgcagtgga
gcagcctgaa ggcctcggac accgccatgt attactgtgc gaga 29464305DNAhomo
sapiens; 64caggtacagc tgcagcagtc aggtccagga ctggtgaagc cctcgcagac
cctctcactc 60acctgtgcca tctccgggga cagtgtctct agcaacagtg ctgcttggaa
ctggatcagg 120cagtccccat cgagaggcct tgagtggctg ggaaggacat actacaggtc
caagtggtat 180aatgattatg cagtatctgt gaaaagtcga ataaccatca acccagacac
atccaagaac 240cagttctccc tgcagctgaa ctctgtgact cccgaggaca cggctgtgta
ttactgtgca 300agaga
30565294DNAhomo sapiens; 65caggtgcagc tggtgcaatc tgggtctgag
ttgaagaagc ctggggcctc agtgaaggtt 60tcctgcaagg cttctggata caccttcact
agctatgcta tgaattgggt gcgacaggcc 120cctggacaag ggcttgagtg gatgggatgg
atcaacacca acactgggaa cccaacgtat 180gcccagggct tcacaggacg gtttgtcttc
tccttggaca cctctgtcag cacggcatat 240ctgcagatct gcagcctaaa ggctgaggac
actgccgtgt attactgtgc gaga 29466296DNAhomo sapiens; 66caggtgcagc
tggtgcagtc tggccatgag gtgaagcagc ctggggcctc agtgaaggtc 60tcctgcaagg
cttctggtta cagtttcacc acctatggta tgaattgggt gccacaggcc 120cctggacaag
ggcttgagtg gatgggatgg ttcaacacct acactgggaa cccaacatat 180gcccagggct
tcacaggacg gtttgtcttc tccatggaca cctctgccag cacagcatac 240ctgcagatca
gcagcctaaa ggctgaggac atggccatgt attactgtgc gagata 2966717DNAhomo
sapiens; 67ggtacaactg gaacgac
176817DNAhomo sapiens; 68ggtataactg gaactac
176917DNAhomo sapiens; 69ggtataaccg gaaccac
177017DNAhomo sapiens;
70ggtataactg gaacgac
177120DNAhomo sapiens; 71ggtatagtgg gagctactac
207231DNAhomo sapiens; 72aggatattgt agtagtacca
gctgctatgc c 317331DNAhomo sapiens;
73aggatattgt actaatggtg tatgctatac c
317431DNAhomo sapiens; 74aggatattgt agtggtggta gctgctactc c
317528DNAhomo sapiens; 75agcatattgt ggtggtgatt
gctattcc 287631DNAhomo sapiens;
76gtattacgat ttttggagtg gttattatac c
317731DNAhomo sapiens; 77gtattacgat attttgactg gttattataa c
317831DNAhomo sapiens; 78gtattactat ggttcgggga
gttattataa c 317937DNAhomo sapiens;
79gtattatgat tacgtttggg ggagttatgc ttatacc
378031DNAhomo sapiens; 80gtattactat gatagtagtg gttattacta c
318116DNAhomo sapiens; 81tgactacagt aactac
168216DNAhomo sapiens;
82tgactacagt aactac
168316DNAhomo sapiens; 83tgactacggt gactac
168419DNAhomo sapiens; 84tgactacggt ggtaactcc
198520DNAhomo sapiens;
85gtggatacag ctatggttac
208623DNAhomo sapiens; 86gtggatatag tggctacgat tac
238720DNAhomo sapiens; 87gtggatacag ctatggttac
208820DNAhomo sapiens;
88gtagagatgg ctacaattac
208918DNAhomo sapiens; 89gagtatagca gctcgtcc
189021DNAhomo sapiens; 90gggtatagca gcagctggta c
219121DNAhomo sapiens;
91gggtatagca gtggctggta c
219218DNAhomo sapiens; 92gggtatagca gcggctac
189311DNAhomo sapiens; 93ctaactgggg a
119452DNAhomo sapiens;
94gctgaatact tccagcactg gggccagggc accctggtca ccgtctcctc ag
529553DNAhomo sapiens; 95ctactggtac ttcgatctct ggggccgtgg caccctggtc
actgtctcct cag 539650DNAhomo sapiens; 96tgatgctttt gatgtctggg
gccaagggac aatggtcacc gtctcttcag 509748DNAhomo sapiens;
97actactttga ctactggggc caaggaaccc tggtcaccgt ctcctcag
489851DNAhomo sapiens; 98acaactggtt cgactcctgg ggccaaggaa ccctggtcac
cgtctcctca g 519963DNAhomo sapiens; 99attactacta ctactacggt
atggacgtct gggggcaagg gaccacggtc accgtctcct 60cag
63100287DNAhomo sapiens;
100gacatccaga tgacccagtc tccttccacc ctgtctgcat ctgtaggaga cagagtcacc
60atcacttgcc gggccagtca gagtattagt agctggttgg cctggtatca gcagaaacca
120gggaaagccc ctaagctcct gatctatgat gcctccagtt tggaaagtgg ggtcccatca
180aggttcagcg gcagtggatc tgggacagaa ttcactctca ccatcagcag cctgcagcct
240gatgattttg caacttatta ctgccaacag tataatagtt attctcc
287101287DNAhomo sapiens; 101gccatccaga tgacccagtc tccatcctcc ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgcc gggcaagtca gggcattaga aatgatttag
gctggtatca gcagaaacca 120gggaaagccc ctaagctcct gatctatgct gcatccagtt
tacaaagtgg ggtcccatca 180aggttcagcg gcagtggatc tggcacagat ttcactctca
ccatcagcag cctgcagcct 240gaagattttg caacttatta ctgtctacaa gattacaatt
accctcc 287102287DNAhomo sapiens; 102gccatccgga
tgacccagtc tccatcctca ttctctgcat ctacaggaga cagagtcacc 60atcacttgtc
gggcgagtca gggtattagc agttatttag cctggtatca gcaaaaacca 120gggaaagccc
ctaagctcct gatctatgct gcatccactt tgcaaagtgg ggtcccatca 180aggttcagcg
gcagtggatc tgggacagat ttcactctca ccatcagctg cctgcagtct 240gaagattttg
caacttatta ctgtcaacag tattatagtt accctcc
287103287DNAhomo sapiens; 103gtcatctgga tgacccagtc tccatcctta ctctctgcat
ctacaggaga cagagtcacc 60atcagttgtc ggatgagtca gggcattagc agttatttag
cctggtatca gcaaaaacca 120gggaaagccc ctgagctcct gatctatgct gcatccactt
tgcaaagtgg ggtcccatca 180aggttcagtg gcagtggatc tgggacagat ttcactctca
ccatcagttg cctgcagtct 240gaagattttg caacttatta ctgtcaacag tattatagtt
tccctcc 287104287DNAhomo sapiens; 104gacatccagt
tgacccagtc tccatccttc ctgtctgcat ctgtaggaga cagagtcacc 60atcacttgcc
gggccagtca gggcattagc agttatttag cctggtatca gcaaaaacca 120gggaaagccc
ctaagctcct gatctatgct gcatccactt tgcaaagtgg ggtcccatca 180aggttcagcg
gcagtggatc tgggacagaa ttcactctca caatcagcag cctgcagcct 240gaagattttg
caacttatta ctgtcaacag cttaatagtt accctcc
287105287DNAhomo sapiens; 105gacatccaga tgacccagtc tccatcttcc gtgtctgcat
ctgtaggaga cagagtcacc 60atcacttgtc gggcgagtca gggtattagc agctggttag
cctggtatca gcagaaacca 120gggaaagccc ctaagctcct gatctatgct gcatccagtt
tgcaaagtgg ggtcccatca 180aggttcagcg gcagtggatc tgggacagat ttcactctca
ccatcagcag cctgcagcct 240gaagattttg caacttacta ttgtcaacag gctaacagtt
tccctcc 287106287DNAhomo sapiens; 106gccatccagt
tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60atcacttgcc
gggcaagtca gggcattagc agtgctttag cctgatatca gcagaaacca 120gggaaagctc
ctaagctcct gatctatgat gcctccagtt tggaaagtgg ggtcccatca 180aggttcagcg
gcagtggatc tgggacagat ttcactctca ccatcagcag cctgcagcct 240gaagattttg
caacttatta ctgtcaacag tttaataatt accctca
287107287DNAhomo sapiens; 107gacatccaga tgacccagtc tccatcctca ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgtc gggcgagtca gggcattagc aattatttag
cctggtttca gcagaaacca 120gggaaagccc ctaagtccct gatctatgct gcatccagtt
tgcaaagtgg ggtcccatca 180aggttcagcg gcagtggatc tgggacagat ttcactctca
ccatcagcag cctgcagcct 240gaagattttg caacttatta ctgccaacag tataatagtt
accctcc 287108287DNAhomo sapiens; 108gacatccaga
tgacccagtc tccatcctca ctgtctgcat ctgtaggaga cagagtcacc 60atcacttgtc
gggcgagtca gggtattagc agctggttag cctggtatca gcagaaacca 120gagaaagccc
ctaagtccct gatctatgct gcatccagtt tgcaaagtgg ggtcccatca 180aggttcagcg
gcagtggatc tgggacagat ttcactctca ccatcagcag cctgcagcct 240gaagattttg
caacttatta ctgccaacag tataatagtt accctcc
287109287DNAhomo sapiens; 109gacatccaga tgacccagtc tccatcctcc ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgcc gggcaagtca gggcattaga aatgatttag
gctggtatca gcagaaacca 120gggaaagccc ctaagcgcct gatctatgct gcatccagtt
tgcaaagtgg ggtcccatca 180aggttcagcg gcagtggatc tgggacagaa ttcactctca
caatcagcag cctgcagcct 240gaagattttg caacttatta ctgtctacag cataatagtt
accctcc 287110287DNAhomo sapiens; 110aacatccaga
tgacccagtc tccatctgcc atgtctgcat ctgtaggaga cagagtcacc 60atcacttgtc
gggcgaggca gggcattagc aattatttag cctggtttca gcagaaacca 120gggaaagtcc
ctaagcacct gatctatgct gcatccagtt tgcaaagtgg ggtcccatca 180aggttcagcg
gcagtggatc tgggacagaa ttcactctca caatcagcag cctgcagcct 240gaagattttg
caacttatta ctgtctacag cataatagtt accctcc
287111287DNAhomo sapiens; 111gacatccaga tgacccagtc tccatcctcc ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgcc gggcgagtca gggcattagc aattatttag
cctggtatca gcagaaacca 120gggaaagttc ctaagctcct gatctatgct gcatccactt
tgcaatcagg ggtcccatct 180cggttcagtg gcagtggatc tgggacagat ttcactctca
ccatcagcag cctgcagcct 240gaagatgttg caacttatta ctgtcaaaag tataacagtg
cccctcc 287112287DNAhomo sapiens; 112gacatccaga
tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60atcacttgcc
aggcgagtca ggacattagc aactatttaa attggtatca gcagaaacca 120gggaaagccc
ctaagctcct gatctacgat gcatccaatt tggaaacagg ggtcccatca 180aggttcagtg
gaagtggatc tgggacagat tttactttca ccatcagcag cctgcagcct 240gaagatattg
caacatatta ctgtcaacag tatgataatc tccctcc
287113287DNAhomo sapiens; 113gacatccaga tgacccagtc tccatcctcc ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgcc aggcgagtca ggacattagc aactatttaa
attggtatca gcagaaacca 120gggaaagccc ctaagctcct gatctacgat gcatccaatt
tggaaacagg ggtcccatca 180aggttcagtg gaagtggatc tgggacagat tttactttca
ccatcagcag cctgcagcct 240gaagatattg caacatatta ctgtcaacag tatgataatc
tccctcc 287114287DNAhomo sapiens; 114gacatccagt
tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60atcacttgcc
gggtgagtca gggcattagc agttatttaa attggtatcg gcagaaacca 120gggaaagttc
ctaagctcct gatctatagt gcatccaatt tgcaatctgg agtcccatct 180cggttcagtg
gcagtggatc tgggacagat ttcactctca ctatcagcag cctgcagcct 240gaagatgttg
caacttatta cggtcaacgg acttacaatg cccctcc
287115287DNAhomo sapiens; 115gacatccagt tgacccagtc tccatcctcc ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgcc gggtgagtca gggcattagc agttatttaa
attggtatcg gcagaaacca 120gggaaagttc ctaagctcct gatctatagt gcatccaatt
tgcaatctgg agtcccatct 180cggttcagtg gcagtggatc tgggacagat ttcactctca
ctatcagcag cctgcagcct 240gaagatgttg caacttatta cggtcaacgg acttacaatg
cccctcc 287116287DNAhomo sapiens; 116gacatccaga
tgacccagtc tccatcctcc ctgtctgcat ctgtaggaga cagagtcacc 60atcacttgcc
gggcaagtca gagcattagc agctatttaa attggtatca gcagaaacca 120gggaaagccc
ctaagctcct gatctatgct gcatccagtt tgcaaagtgg ggtcccatca 180aggttcagtg
gcagtggatc tgggacagat ttcactctca ccatcagcag tctgcaacct 240gaagattttg
caacttacta ctgtcaacag agttacagta cccctcc
287117287DNAhomo sapiens; 117gacatccaga tgacccagtc tccatcctcc ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgcc gggcaagtca gagcattagc agctatttaa
attggtatca gcagaaacca 120gggaaagccc ctaagctcct gatctatgct gcatccagtt
tgcaaagtgg ggtcccatca 180aggttcagtg gcagtggatc tgggacagat ttcactctca
ccatcagcag tctgcaacct 240gaagattttg caacttacta ctgtcaacag agttacagta
cccctcc 287118287DNAhomo sapiens; 118gacatccaga
tgatccagtc tccatctttc ctgtctgcat ctgtaggaga cagagtcagt 60atcatttgct
gggcaagtga gggcattagc agtaatttag cctggtatct gcagaaacca 120gggaaatccc
ctaagctctt cctctatgat gcaaaagatt tgcaccctgg ggtctcatcg 180aggttcagtg
gcaggggatc tgggacggat ttcactctca ccatcatcag cctgaagcct 240gaagattttg
cagcttatta ctgtaaacag gacttcagtt accctcc
287119287DNAhomo sapiens; 119gccatccgga tgacccagtc tccattctcc ctgtctgcat
ctgtaggaga cagagtcacc 60atcacttgct gggccagtca gggcattagc agttatttag
cctggtatca gcaaaaacca 120gcaaaagccc ctaagctctt catctattat gcatccagtt
tgcaaagtgg ggtcccatca 180aggttcagcg gcagtggatc tgggacggat tacactctca
ccatcagcag cctgcagcct 240gaagattttg caacttatta ctgtcaacag tattatagta
cccctcc 287120302DNAhomo sapiens; 120gatattgtga
tgacccagac tccactctcc tcacctgtca cccttggaca gccggcctcc 60atctcctgca
ggtctagtca aagcctcgta cacagtgatg gaaacaccta cttgagttgg 120cttcagcaga
ggccaggcca gcctccaaga ctcctaattt ataagatttc taaccggttc 180tctggggtcc
cagacagatt cagtggcagt ggggcaggga cagatttcac actgaaaatc 240agcagggtgg
aagctgagga tgtcggggtt tattactgca tgcaagctac acaatttcct 300ca
302121302DNAhomo
sapiens; 121gatattgtga tgacccagac tccactctcc tcgcctgtca cccttggaca
gccggcctcc 60atctccttca ggtctagtca aagcctcgta cacagtgatg gaaacaccta
cttgagttgg 120cttcagcaga ggccaggcca gcctccaaga ctcctaattt ataaggtttc
taaccggttc 180tctggggtcc cagacagatt cagtggcagt ggggcaggga cagatttcac
actgaaaatc 240agcagggtgg aagctgagga tgtcggggtt tattactgca cgcaagctac
acaatttcct 300ca
302122302DNAhomo sapiens; 122gatattgtga tgactcagtc tccactctcc
ctgcccgtca cccctggaga gccggcctcc 60atctcctgca ggtctagtca gagcctcctg
catagtaatg gatacaacta tttggattgg 120tacctgcaga agccagggca gtctccacag
ctcctgatct atttgggttc taatcgggcc 180tccggggtcc ctgacaggtt cagtggcagt
ggatcaggca cagattttac actgaaaatc 240agcagagtgg aggctgagga tgttggggtt
tattactgca tgcaagctct acaaactcct 300cc
302123302DNAhomo sapiens; 123gatattgtga
tgactcagtc tccactctcc ctgcccgtca cccctggaga gccggcctcc 60atctcctgca
ggtctagtca gagcctcctg catagtaatg gatacaacta tttggattgg 120tacctgcaga
agccagggca gtctccacag ctcctgatct atttgggttc taatcgggcc 180tccggggtcc
ctgacaggtt cagtggcagt ggatcaggca cagattttac actgaaaatc 240agcagagtgg
aggctgagga tgttggggtt tattactgca tgcaagctct acaaactcct 300cc
302124302DNAhomo
sapiens; 124gatattgtga tgacccagac tccactctct ctgtccgtca cccctggaca
gccggcctcc 60atctcctgca agtctagtca gagcctcctg catagtgatg gaaagaccta
tttgtattgg 120tacctgcaga agccaggcca gtctccacag ctcctgatct atgaagtttc
cagccggttc 180tctggagtgc cagataggtt cagtggcagc gggtcaggga cagatttcac
actgaaaatc 240agccgggtgg aggctgagga tgttggggtt tattactgaa tgcaaggtat
acaccttcct 300cc
302125302DNAhomo sapiens; 125gatattgtga tgacccagac tccactctct
ctgtccgtca cccctggaca gccggcctcc 60atctcctgca agtctagtca gagcctcctg
catagtgatg gaaagaccta tttgtattgg 120tacctgcaga agccaggcca gcctccacag
ctcctgatct atgaagtttc caaccggttc 180tctggagtgc cagataggtt cagtggcagc
gggtcaggga cagatttcac actgaaaatc 240agccgggtgg aggctgagga tgttggggtt
tattactgca tgcaaagtat acagcttcct 300cc
302126302DNAhomo sapiens; 126gatgttgtga
tgactcagtc tccactctcc ctgcccgtca cccttggaca gccggcctcc 60atctcctgca
ggtctagtca aagcctcgta tacagtgatg gaaacaccta cttgaattgg 120tttcagcaga
ggccaggcca atctccaagg cgcctaattt ataaggtttc taaccgggac 180tctggggtcc
cagacagatt cagcggcagt gggtcaggca ctgatttcac actgaaaatc 240agcagggtgg
aggctgagga tgttggggtt tattactgca tgcaaggtac acactggcct 300cc
302127302DNAhomo
sapiens; 127gatgttgtga tgactcagtc tccactctcc ctgcccgtca cccttggaca
gccggcctcc 60atctcctgca ggtctagtca aagcctcgta tacagtgatg gaaacaccta
cttgaattgg 120tttcagcaga ggccaggcca atctccaagg cgcctaattt ataaggtttc
taactgggac 180tctggggtcc cagacagatt cagcggcagt gggtcaggca ctgatttcac
actgaaaatc 240agcagggtgg aggctgagga tgttggggtt tattactgca tgcaaggtac
acactggcct 300cc
302128305DNAhomo sapiens; 128gatattgtga tgacccagac tccactctcc
ctgcccgtca cccctggaga gccggcctcc 60atctcctgca ggtctagtca gagcctcttg
gatagtgatg atggaaacac ctatttggac 120tggtacctgc agaagccagg gcagtctcca
cagctcctga tctatacgct ttcctatcgg 180gcctctggag tcccagacag gttcagtggc
agtgggtcag gcactgattt cacactgaaa 240atcagcaggg tggaggctga ggatgttgga
gtttattact gcatgcaacg tatagagttt 300ccttc
305129305DNAhomo sapiens; 129gatattgtga
tgacccagac tccactctcc ctgcccgtca cccctggaga gccggcctcc 60atctcctgca
ggtctagtca gagcctcttg gatagtgatg atggaaacac ctatttggac 120tggtacctgc
agaagccagg gcagtctcca cagctcctga tctatacgct ttcctatcgg 180gcctctggag
tcccagacag gttcagtggc agtgggtcag gcactgattt cacactgaaa 240atcagcaggg
tggaggctga ggatgttgga gtttattact gcatgcaacg tatagagttt 300ccttc
305130290DNAhomo
sapiens; 130gaaattgtaa tgacacagtc tccacccacc ctgtctttgt ctccagggga
aagagtcacc 60ctctcctgca gggccagtca gagtgttagc agcagctact taacctggta
tcagcagaaa 120cctggccagg cgcccaggct cctcatctat ggtgcatcca ccagggccac
tagcatccca 180gccaggttca gtggcagtgg gtctgggaca gacttcactc tcaccatcag
cagcctgcag 240cctgaagatt ttgcagttta ttactgtcag caggatcata acttacctcc
290131290DNAhomo sapiens; 131gaaattgtaa tgacacagtc tccacccacc
ctgtctttgt ctccagggga aagagtcacc 60ctctcctgca gggccagtca gagtgttagc
agcagctact taacctggta tcagcagaaa 120cctggccagg cgcccaggct cctcatctat
ggtgcatcca ccagggccac tagcatccca 180gccaggttca gtggcagtgg gtctgggaga
gacttcactc tcaccatcag cagcctgcag 240cctgaagatt ttgcagttta ttactgtcag
caggatcata acttacctcc 290132290DNAhomo sapiens;
132gaaattgtaa tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc
60ctctcctgca gggccagtca gagtgttagc agcagctact tatcctggta ccagcagaaa
120cctgggcagg ctcccaggct cctcatctat ggtgcatcca ccagggccac tggcatccca
180gccaggttca gtggcagtgg gtctgggaca gacttcactc tcaccatcag cagcctgcag
240cctgaagatt ttgcagttta ttactgtcag caggattata acttacctcc
290133287DNAhomo sapiens; 133gaaattgtgt tgacacagtc tccagccacc ctgtctttgt
ctccagggga aagagccacc 60ctctcctgca gggccagtca gagtgttagc agctacttag
cctggtacca acagaaacct 120ggccaggctc ccaggctcct catctatgat gcatccaaca
gggccactgg catcccagcc 180aggttcagtg gcagtgggtc tgggacagac ttcactctca
ccatcagcag cctagagcct 240gaagattttg cagtttatta ctgtcagcag cgtagcaact
ggcctcc 287134287DNAhomo sapiens; 134gaaattgtgt
tgacacagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60ctctcctgca
gggccagtca gggtgttagc agctacttag cctggtacca gcagaaacct 120ggccaggctc
ccaggctcct catctatgat gcatccaaca gggccactgg catcccagcc 180aggttcagtg
gcagtgggcc tgggacagac ttcactctca ccatcagcag cctagagcct 240gaagattttg
cagtttatta ctgtcagcag cgtagcaact ggcatcc
287135287DNAhomo sapiens; 135gaaatagtga tgacgcagtc tccagccacc ctgtctgtgt
ctccagggga aagagccacc 60ctctcctgca gggccagtca gagtgttagc agcaacttag
cctggtacca gcagaaacct 120ggccaggctc ccaggctcct catctatggt gcatccacca
gggccactgg tatcccagcc 180aggttcagtg gcagtgggtc tgggacagag ttcactctca
ccatcagcag cctgcagtct 240gaagattttg cagtttatta ctgtcagcag tataataact
ggcctcc 287136287DNAhomo sapiens; 136gaaatagtga
tgacgcagtc tccagccacc ctgtctgtgt ctccagggga aagagccacc 60ctctcctgca
gggccagtca gagtgttagc agcaacttag cctggtacca gcagaaacct 120ggccaggctc
ccaggctcct catctatggt gcatccacca gggccactgg catcccagcc 180aggttcagtg
gcagtgggtc tgggacagag ttcactctca ccatcagcag cctgcagtct 240gaagattttg
cagtttatta ctgtcagcag tataataact ggcctcc
287137290DNAhomo sapiens; 137gaaattgtgt tgacgcagtc tccaggcacc ctgtctttgt
ctccagggga aagagccacc 60ctctcctgca gggccagtca gagtgttagc agcagctact
tagcctggta ccagcagaaa 120cctggccagg ctcccaggct cctcatctat ggtgcatcca
gcagggccac tggcatccca 180gacaggttca gtggcagtgg gtctgggaca gacttcactc
tcaccatcag cagactggag 240cctgaagatt ttgcagtgta ttactgtcag cagtatggta
gctcacctcc 290138290DNAhomo sapiens; 138gaaattgtgt
tgacgcagtc tccagccacc ctgtctttgt ctccagggga aagagccacc 60ctctcctgcg
gggccagtca gagtgttagc agcagctact tagcctggta ccagcagaaa 120cctggcctgg
cgcccaggct cctcatctat gatgcatcca gcagggccac tggcatccca 180gacaggttca
gtggcagtgg gtctgggaca gacttcactc tcaccatcag cagactggag 240cctgaagatt
ttgcagtgta ttactgtcag cagtatggta gctcacctcc
290139305DNAhomo sapiens; 139gacatcgtga tgacccagtc tccagactcc ctggctgtgt
ctctgggcga gagggccacc 60atcaactgca agtccagcca gagtgtttta tacagctcca
acaataagaa ctacttagct 120tggtaccagc agaaaccagg acagcctcct aagctgctca
tttactgggc atctacccgg 180gaatccgggg tccctgaccg attcagtggc agcgggtctg
ggacagattt cactctcacc 240atcagcagcc tgcaggctga agatgtggca gtttattact
gtcagcaata ttatagtact 300cctcc
305140287DNAhomo sapiens; 140gaaacgacac tcacgcagtc
tccagcattc atgtcagcga ctccaggaga caaagtcaac 60atctcctgca aagccagcca
agacattgat gatgatatga actggtacca acagaaacca 120ggagaagctg ctattttcat
tattcaagaa gctactactc tcgttcctgg aatcccacct 180cgattcagtg gcagcgggta
tggaacagat tttaccctca caattaataa catagaatct 240gaggatgctg catattactt
ctgtctacaa catgataatt tccctct 287141287DNAhomo sapiens;
141gaaattgtgc tgactcagtc tccagacttt cagtctgtga ctccaaagga gaaagtcacc
60atcacctgcc gggccagtca gagcattggt agtagcttac actggtacca gcagaaacca
120gatcagtctc caaagctcct catcaagtat gcttcccagt ccttctcagg ggtcccctcg
180aggttcagtg gcagtggatc tgggacagat ttcaccctca ccatcaatag cctggaagct
240gaagatgctg caacgtatta ctgtcatcag agtagtagtt tacctca
287142287DNAhomo sapiens; 142gaaattgtgc tgactcagtc tccagacttt cagtctgtga
ctccaaagga gaaagtcacc 60atcacctgcc gggccagtca gagcattggt agtagcttac
actggtacca gcagaaacca 120gatcagtctc caaagctcct catcaagtat gcttcccagt
ccttctcagg ggtcccctcg 180aggttcagtg gcagtggatc tgggacagat ttcaccctca
ccatcaatag cctggaagct 240gaagatgctg caacgtatta ctgtcatcag agtagtagtt
tacctca 287143287DNAhomo sapiens; 143gatgttgtga
tgacacagtc tccagctttc ctctctgtga ctccagggga gaaagtcacc 60atcacctgcc
aggccagtga aggcattggc aactacttat actggtacca gcagaaacca 120gatcaagccc
caaagctcct catcaagtat gcttcccagt ccatctcagg ggtcccctcg 180aggttcagtg
gcagtggatc tgggacagat ttcaccttta ccatcagtag cctggaagct 240gaagatgctg
caacatatta ctgtcagcag ggcaataagc accctca
287144296DNAhomo sapiens; 144cagtctgtgc tgactcagcc accctcggtg tctgaagccc
ccaggcagag ggtcaccatc 60tcctgttctg gaagcagctc caacatcgga aataatgctg
taaactggta ccagcagctc 120ccaggaaagg ctcccaaact cctcatctat tatgatgatc
tgctgccctc aggggtctct 180gaccgattct ctggctccaa gtctggcacc tcagcctccc
tggccatcag tgggctccag 240tctgaggatg aggctgatta ttactgtgca gcatgggatg
acagcctgaa tggtcc 296145299DNAhomo sapiens; 145cagtctgtgc
tgacgcagcc gccctcagtg tctggggccc cagggcagag ggtcaccatc 60tcctgcactg
ggagcagctc caacatcggg gcaggttatg atgtacactg gtaccagcag 120cttccaggaa
cagcccccaa actcctcatc tatggtaaca gcaatcggcc ctcaggggtc 180cctgaccgat
tctctggctc caagtctggc acctcagcct ccctggccat cactgggctc 240caggctgagg
atgaggctga ttattactgc cagtcctatg acagcagcct gagtggttc
299146296DNAhomo sapiens; 146cagtctgtgt tgacgcagcc gccttcagtg tctgcggccc
caggacagaa ggtcaccatc 60tcctgctctg gaagcagctc cgacatgggg aattatgcgg
tatcctggta ccagcagctc 120ccaggaacag cccccaaact cctcatctat gaaaataata
agcgaccctc agggattcct 180gaccgattct ctggctccaa gtctggcacc tcagccaccc
tgggcatcac tggcctctgg 240cctgaggacg aggccgatta ttactgctta gcatgggata
ccagcccgag agcttg 296147296DNAhomo sapiens; 147cagtctgtgc
tgactcagcc accctcagcg tctgggaccc ccgggcagag ggtcaccatc 60tcttgttctg
gaagcagctc caacatcgga agtaatactg taaactggta ccagcagctc 120ccaggaacgg
cccccaaact cctcatctat agtaataatc agcggccctc aggggtccct 180gaccgattct
ctggctccaa gtctggcacc tcagcctccc tggccatcag tgggctccag 240tctgaggatg
aggctgatta ttactgtgca gcatgggatg acagcctgaa tggtcc
296148296DNAhomo sapiens; 148cagtctgtgc tgactcagcc accctcagcg tctgggaccc
ccgggcagag ggtcaccatc 60tcttgttctg gaagcagctc caacatcgga agtaattatg
tatactggta ccagcagctc 120ccaggaacgg cccccaaact cctcatctat aggaataatc
agcggccctc aggggtccct 180gaccgattct ctggctccaa gtctggcacc tcagcctccc
tggccatcag tgggctccgg 240tccgaggatg aggctgatta ttactgtgca gcatgggatg
acagcctgag tggtcc 296149299DNAhomo sapiens; 149cagtctgtgc
tgacgcagcc gccctcagtg tctggggccc cagggcagag ggtcaccatc 60tcctgcactg
ggagcagctc caacattggg gcgggttatg ttgtacattg gtaccagcag 120cttccaggaa
cagcccccaa actcctcatc tatggtaaca gcaatcggcc ctcaggggtc 180cctgaccaat
tctctggctc caagtctggc acctcagcct ccctggccat cactggactc 240cagtctgagg
atgaggctga ttattactgc aaagcatggg ataacagcct gaatgctca
299150296DNAhomo sapiens; 150cagtctgtgt tgacgcagcc gccctcagtg tctgcggccc
caggacagaa ggtcaccatc 60tcctgctctg gaagcagctc caacattggg aataattatg
tatcctggta ccagcagctc 120ccaggaacag cccccaaact cctcatttat gacaataata
agcgaccctc agggattcct 180gaccgattct ctggctccaa gtctggcacg tcagccaccc
tgggcatcac cggactccag 240actggggacg aggccgatta ttactgcgga acatgggata
gcagcctgag tgctgg 296151297DNAhomo sapiens; 151cagtctgccc
tgactcagcc tccctccgcg tccgggtctc ctggacagtc agtcaccatc 60tcctgcactg
gaaccagcag tgacgttggt ggttataact atgtctcctg gtaccaacag 120cacccaggca
aagcccccaa actcatgatt tatgaggtca gtaagcggcc ctcaggggtc 180cctgatcgct
tctctggctc caagtctggc aacacggcct ccctgaccgt ctctgggctc 240caggctgagg
atgaggctga ttattactgc agctcatatg caggcagcaa caatttc
297152297DNAhomo sapiens; 152cagtctgccc tgactcagcc tcgctcagtg tccgggtctc
ctggacagtc agtcaccatc 60tcctgcactg gaaccagcag tgatgttggt ggttataact
atgtctcctg gtaccaacag 120cacccaggca aagcccccaa actcatgatt tatgatgtca
gtaagcggcc ctcaggggtc 180cctgatcgct tctctggctc caagtctggc aacacggcct
ccctgaccat ctctgggctc 240caggctgagg atgaggctga ttattactgc tgctcatatg
caggcagcta cactttc 297153297DNAhomo sapiens; 153cagtctgccc
tgactcagcc tgcctccgtg tctgggtctc ctggacagtc gatcaccatc 60tcctgcactg
gaaccagcag tgacgttggt ggttataact atgtctcctg gtaccaacag 120cacccaggca
aagcccccaa actcatgatt tatgaggtca gtaatcggcc ctcaggggtt 180tctaatcgct
tctctggctc caagtctggc aacacggcct ccctgaccat ctctgggctc 240caggctgagg
acgaggctga ttattactgc agctcatata caagcagcag cactctc
297154297DNAhomo sapiens; 154cagtctgccc tgactcagcc tccctccgtg tccgggtctc
ctggacagtc agtcaccatc 60tcctgcactg gaaccagcag tgacgttggt agttataacc
gtgtctcctg gtaccagcag 120cccccaggca cagcccccaa actcatgatt tatgaggtca
gtaatcggcc ctcaggggtc 180cctgatcgct tctctgggtc caagtctggc aacacggcct
ccctgaccat ctctgggctc 240caggctgagg acgaggctga ttattactgc agcttatata
caagcagcag cactttc 297155298DNAhomo sapiens; 155cagtctgccc
tgactcagcc tgcctccgtg tctgggtctc ctggacagtc gatcaccatc 60tcctgcactg
gaaccagcag tgatgttggg agttataacc ttgtctcctg gtaccaacag 120cacccaggca
aagcccccaa actcatgatt tatgagggca gtaagcggcc ctcaggggtt 180tctaatcgct
tctctggctc caagtctggc aacacggcct ccctgacaat ctctgggctc 240caggctgagg
acgaggctga ttattactgc tgctcatatg caggtagtag cactttac
298156297DNAhomo sapiens; 156caatctgccc tgactcagcc tccttttgtg tccggggctc
ctggacagtc ggtcaccatc 60tcctgcactg gaaccagcag tgacgttggg gattatgatc
atgtcttctg gtaccaaaag 120cgtctcagca ctacctccag actcctgatt tacaatgtca
atactcggcc ttcagggatc 180tctgacctct tctcaggctc caagtctggc aacatggctt
ccctgaccat ctctgggctc 240aagtccgagg ttgaggctaa ttatcactgc agcttatatt
caagtagtta cactttc 297157285DNAhomo sapiens; 157tcctatgagc
tgactcagcc accctcagtg tccgtgtccc caggacagac agccagcatc 60acctgctctg
gagataaatt gggggataaa tatgcttgct ggtatcagca gaagccaggc 120cagtcccctg
tgctggtcat ctatcaagat agcaagcggc cctcagggat ccctgagcga 180ttctctggct
ccaactctgg gaacacagcc actctgacca tcagcgggac ccaggctatg 240gatgaggctg
actattactg tcaggcgtgg gacagcagca ctgca
285158290DNAhomo sapiens; 158tcctatgagc tgacacagcc accctcggtg tcagtgtccc
caggacaaac ggccaggatc 60acctgctctg gagatgcatt gccaaaaaaa tatgcttatt
ggtaccagca gaagtcaggc 120caggcccctg tgctggtcat ctatgaggac agcaaacgac
cctccgggat ccctgagaga 180ttctctggct ccagctcagg gacaatggcc accttgacta
tcagtggggc ccaggtggag 240gatgaagctg actactactg ttactcaaca gacagcagtg
gtaatcatag 290159290DNAhomo sapiens; 159tcctatgagc
tgactcagcc acactcagtg tcagtggcca cagcacagat ggccaggatc 60acctgtgggg
gaaacaacat tggaagtaaa gctgtgcact ggtaccagca aaagccaggc 120caggaccctg
tgctggtcat ctatagcgat agcaaccggc cctcagggat ccctgagcga 180ttctctggct
ccaacccagg gaacaccacc accctaacca tcagcaggat cgaggctggg 240gatgaggctg
actattactg tcaggtgtgg gacagtagta gtgatcatcc
290160290DNAhomo sapiens; 160tcctatgagc tgacacagcc accctcggtg tcagtgtccc
taggacagat ggccaggatc 60acctgctctg gagaagcatt gccaaaaaaa tatgcttatt
ggtaccagca gaagccaggc 120cagttccctg tgctggtgat atataaagac agcgagaggc
cctcagggat ccctgagcga 180ttctctggct ccagctcagg gacaatagtc acattgacca
tcagtggagt ccaggcagaa 240gacgaggctg actattactg tctatcagca gacagcagtg
gtacttatcc 290161290DNAhomo sapiens; 161tcttctgagc
tgactcagga ccctgctgtg tctgtggcct tgggacagac agtcaggatc 60acatgccaag
gagacagcct cagaagctat tatgcaagct ggtaccagca gaagccagga 120caggcccctg
tacttgtcat ctatggtaaa aacaaccggc cctcagggat cccagaccga 180ttctctggct
ccagctcagg aaacacagct tccttgacca tcactggggc tcaggcggaa 240gatgaggctg
actattactg taactcccgg gacagcagtg gtaaccatct
290162290DNAhomo sapiens; 162tcctatgtgc tgactcagcc accctcagtg tcagtggccc
caggaaagac ggccaggatt 60acctgtgggg gaaacaacat tggaagtaaa agtgtgcact
ggtaccagca gaagccaggc 120caggcccctg tgctggtcat ctattatgat agcgaccggc
cctcagggat ccctgagcga 180ttctctggct ccaactctgg gaacacggcc accctgacca
tcagcagggt cgaagccggg 240gatgaggccg actattactg tcaggtgtgg gacagtagta
gtgatcatcc 290163284DNAhomo sapiens; 163tcctatgagc
tgacacagct accctcggtg tcagtgtccc caggacagac agccaggatc 60acctgctctg
gagatgtact gggggaaaat tatgctgact ggtaccagca gaagccaggc 120caggcccctg
agttggtgat atacgaagat agtgagcggt accctggaat ccctgaacga 180ttctctgggt
ccacctcagg gaacacgacc accctgacca tcagcagggt cctgaccgaa 240gacgaggctg
actattactg tttgtctggg gatgaggaca atcc
284164290DNAhomo sapiens; 164tcctatgagc tgatgcagcc accctcggtg tcagtgtccc
caggacagac ggccaggatc 60acctgctctg gagatgcatt gccaaagcaa tatgcttatt
ggtaccagca gaagccaggc 120caggcccctg tgctggtgat atataaagac agtgagaggc
cctcagggat ccctgagcga 180ttctctggct ccagctcagg gacaacagtc acgttgacca
tcagtggagt ccaggcagaa 240gatgaggctg actattactg tcaatcagca gacagcagtg
gtacttatcc 290165284DNAhomo sapiens; 165tcctatgagc
tgacacagcc atcctcagtg tcagtgtctc cgggacagac agccaggatc 60acctgctcag
gagatgtact ggcaaaaaaa tatgctcggt ggttccagca gaagccaggc 120caggcccctg
tgctggtgat ttataaagac agtgagcggc cctcagggat ccctgagcga 180ttctccggct
ccagctcagg gaccacagtc accttgacca tcagcggggc ccaggttgag 240gatgaggctg
actattactg ttactctgcg gctgacaaca atct
284166284DNAhomo sapiens; 166tcctctgggc caactcaggt gcctgcagtg tctgtggcct
tgggacaaat ggccaggatc 60acctgccagg gagacagcat ggaaggctct tatgaacact
ggtaccagca gaagccaggc 120caggcccccg tgctggtcat ctatgatagc agtgaccggc
cctcaaggat ccctgagcga 180ttctctggct ccaaatcagg caacacaacc accctgacca
tcactggggc ccaggctgag 240gatgaggctg attattacta tcagttgata gacaaccatg
ctac 284167314DNAhomo sapiens; 167ctgcctgtgc
tgactcagcc cccgtctgca tctgccttgc tgggagcctc gatcaagctc 60acctgcaccc
taagcagtga gcacagcacc tacaccatcg aatggtatca acagagacca 120gggaggtccc
cccagtatat aatgaaggtt aagagtgatg gcagccacag caagggggac 180gggatccccg
atcgcttcat gggctccagt tctggggctg accgctacct caccttctcc 240aacctccagt
ctgacgatga ggctgagtat cactgtggag agagccacac gattgatggc 300caagtcggtt
gagc
314168297DNAhomo sapiens; 168cagcctgtgc tgactcaatc atcctctgcc tctgcttccc
tgggatcctc ggtcaagctc 60acctgcactc tgagcagtgg gcacagtagc tacatcatcg
catggcatca gcagcagcca 120gggaaggccc ctcggtactt gatgaagctt gaaggtagtg
gaagctacaa caaggggagc 180ggagttcctg atcgcttctc aggctccagc tctggggctg
accgctacct caccatctcc 240aacctccagt tagaggatga ggctgattat tactgtgaga
cctgggacag taacact 297169299DNAhomo sapiens; 169cagcttgtgc
tgactcaatc gccctctgcc tctgcctccc tgggagcctc ggtcaagctc 60acctgcactc
tgagcagtgg gcacagcagc tacgccatcg catggcatca gcagcagcca 120gagaagggcc
ctcggtactt gatgaagctt aacagtgatg gcagccacag caagggggac 180gggatccctg
atcgcttctc aggctccagc tctggggctg agcgctacct caccatctcc 240agcctccagt
ctgaggatga ggctgactat tactgtcaga cctggggcac tggcattca
299170312DNAhomo sapiens; 170cagcctgtgc tgactcagcc accttcctcc tccgcatctc
ctggagaatc cgccagactc 60acctgcacct tgcccagtga catcaatgtt ggtagctaca
acatatactg gtaccagcag 120aagccaggga gccctcccag gtatctcctg tactactact
cagactcaga taagggccag 180ggctctggag tccccagccg cttctctgga tccaaagatg
cttcagccaa tacagggatt 240ttactcatct ccgggctcca gtctgaggat gaggctgact
attactgtat gatttggcca 300agcaatgctt ct
312171312DNAhomo sapiens; 171caggctgtgc tgactcagcc
ggcttccctc tctgcatctc ctggagcatc agccagtctc 60acctgcacct tgcgcagtgg
catcaatgtt ggtacctaca ggatatactg gtaccagcag 120aagccaggga gtcctcccca
gtatctcctg aggtacaaat cagactcaga taagcagcag 180ggctctggag tccccagccg
cttctctgga tccaaagatg cttcggccaa tgcagggatt 240ttactcatct ctgggctcca
gtctgaggat gaggctgact attactgtat gatttggcac 300agcagcgctt ct
312172312DNAhomo sapiens;
172cagcctgtgc tgactcagcc aacttccctc tcagcatctc ctggagcatc agccagactc
60acctgcacct tgcgcagtgg catcaatctt ggtagctaca ggatattctg gtaccagcag
120aagccagaga gccctccccg gtatctcctg agctactact cagactcaag taagcatcag
180ggctctggag tccccagccg cttctctgga tccaaagatg cttcgagcaa tgcagggatt
240ttagtcatct ctgggctcca gtctgaggat gaggctgact attactgtat gatttggcac
300agcagtgctt ct
312173317DNAhomo sapiens; 173cagcctgtgc tgactcagcc atcttcccat tctgcatctt
ctggagcatc agtcagactc 60acctgcatgc tgagcagtgg cttcagtgtt ggggacttct
ggataaggtg gtaccaacaa 120aagccaggga accctccccg gtatctcctg tactaccact
cagactccaa taagggccaa 180ggctctggag ttcccagccg cttctctgga tccaacgatg
catcagccaa tgcagggatt 240ctgcgtatct ctgggctcca gcctgaggat gaggctgact
attactgtgg tacatggcac 300agcaactcta agactca
317174296DNAhomo sapiens; 174aattttatgc tgactcagcc
ccactctgtg tcggagtctc cggggaagac ggtaaccatc 60tcctgcaccc gcagcagtgg
cagcattgcc agcaactatg tgcagtggta ccagcagcgc 120ccgggcagtt cccccaccac
tgtgatctat gaggataacc aaagaccctc tggggtccct 180gatcggttct ctggctccat
cgacagctcc tccaactctg cctccctcac catctctgga 240ctgaagactg aggacgaggc
tgactactac tgtcagtctt atgatagcag caatca 296175294DNAhomo sapiens;
175cagactgtgg tgactcagga gccctcactg actgtgtccc caggagggac agtcactctc
60acctgtgctt ccagcactgg agcagtcacc agtggttact atccaaactg gttccagcag
120aaacctggac aagcacccag ggcactgatt tatagtacaa gcaacaaaca ctcctggacc
180cctgcccggt tctcaggctc cctccttggg ggcaaagctg ccctgacact gtcaggtgtg
240cagcctgagg acgaggctga gtattactgc ctgctctact atggtggtgc tcag
294176294DNAhomo sapiens; 176caggctgtgg tgactcagga gccctcactg actgtgtccc
caggagggac agtcactctc 60acctgtggct ccagcactgg agctgtcacc agtggtcatt
atccctactg gttccagcag 120aagcctggcc aagcccccag gacactgatt tatgatacaa
gcaacaaaca ctcctggaca 180cctgcccggt tctcaggctc cctccttggg ggcaaagctg
ccctgaccct ttcgggtgcg 240cagcctgagg atgaggctga gtattactgc ttgctctcct
atagtggtgc tcgg 294177296DNAhomo sapiens; 177cagactgtgg
tgacccagga gccatcgttc tcagtgtccc ctggagggac agtcacactc 60acttgtggct
tgagctctgg ctcagtctct actagttact accccagctg gtaccagcag 120accccaggcc
aggctccacg cacgctcatc tacagcacaa acactcgctc ttctggggtc 180cctgatcgct
tctctggctc catccttggg aacaaagctg ccctcaccat cacgggggcc 240caggcagatg
atgaatctga ttattactgt gtgctgtata tgggtagtgg catttc
296178317DNAhomo sapiens; 178cagcctgtgc tgactcagcc accttctgca tcagcctccc
tgggagcctc ggtcacactc 60acctgcaccc tgagcagcgg ctacagtaat tataaagtgg
actggtacca gcagagacca 120gggaagggcc cccggtttgt gatgcgagtg ggcactggtg
ggattgtggg atccaagggg 180gatggcatcc ctgatcgctt ctcagtcttg ggctcaggcc
tgaatcggta cctgaccatc 240aagaacatcc aggaagagga tgagagtgac taccactgtg
gggcagacca tggcagtggg 300agcaacttcg tgtaacc
317179296DNAhomo sapiens; 179caggcagggc tgactcagcc
accctcggtg tccaagggct tgagacagac cgccacactc 60acctgcactg ggaacagcaa
caatgttggc aaccaaggag cagcttggct gcagcagcac 120cagggccacc ctcccaaact
cctatcctac aggaataaca accggccctc agggatctca 180gagagattat ctgcatccag
gtcaggaaac acagcctccc tgaccattac tggactccag 240cctgaggacg aggctgacta
ttactgctca gcatgggaca gcagcctcag tgctca 296180312DNAhomo sapiens;
180cggcccgtgc tgactcagcc gccctctctg tctgcatccc cgggagcaac agccagactc
60ccctgcaccc tgagcagtga cctcagtgtt ggtggtaaaa acatgttctg gtaccagcag
120aagccaggga gctctcccag gttattcctg tatcactact cagactcaga caagcagctg
180ggacctgggg tccccagtcg agtctctggc tccaaggaga cctcaagtaa cacagcgttt
240ttgctcatct ctgggctcca gcctgaggac gaggccgatt attactgcca ggtgtacgaa
300agtagtgcta at
31218138DNAhomo sapiens; 181gtggacgttc ggccaaggga ccaaggtgga aatcaaac
3818239DNAhomo sapiens; 182tgtacacttt tggccagggg
accaagctgg agatcaaac 3918338DNAhomo sapiens;
183attcactttc ggccctggga ccaaagtgga tatcaaac
3818438DNAhomo sapiens; 184gctcactttc ggcggaggga ccaaggtgga gatcaaac
3818538DNAhomo sapiens; 185gatcaccttc ggccaaggga
cacgactgga gattaaac 3818638DNAhomo sapiens;
186ttatgtcttc ggaactggga ccaaggtcac cgtcctag
3818738DNAhomo sapiens; 187tgtggtattc ggcggaggga ccaagctgac cgtcctag
3818838DNAhomo sapiens; 188tgtggtattc ggcggaggga
ccaagctgac cgtcctag 3818938DNAhomo sapiens;
189ttttgtattt ggtggaggaa cccagctgat cattttag
3819038DNAhomo sapiens; 190ctgggtgttt ggtgagggga ccgagctgac cgtcctag
3819138DNAhomo sapiens; 191taatgtgttc ggcagtggca
ccaaggtgac cgtcctcg 3819238DNAhomo sapiens;
192tgctgtgttc ggaggaggca cccagctgac cgtcctcg
38193302DNAhomo sapiens; 193gagattgtga tgacccagac tccactctcc ttgtctatca
cccctggaga gcaggcctcc 60atctcctgca ggtctagtca gagcctcctg catagtgatg
gatacaccta tttgtattgg 120tttctgcaga aagccaggcc agtctccaca ctcctgatct
atgaagtttc caaccggttc 180tctggagtgc cagataggtt cagtggcagc gggtcaggga
cagatttcac actgaaaatc 240agccgggtgg aggctgagga ttttggagtt tattactgca
tgcaagatgc acaagatcct 300cc
302194296DNAhomo sapiens; 194cagtctgtgc tgactcagcc
accctcggtg tctgaagccc ccaggcagag ggtcaccatc 60tcctgttctg gaagcagctc
caacatcgga aataatgctg taaactggta ccagcagctc 120ccaggaaagg ctcccaaact
cctcatctat tatgatgatc tgctgccctc aggggtctct 180gaccgattct ctggctccaa
gtctggcacc tcagcctccc tggccatcag tgggctccag 240tctgaggatg aggctgatta
ttactgtgca gcatgggatg acagcctgaa tggtcc 296195296DNAhomo sapiens;
195cagtctgtgc tgactcagcc accctcagcg tctgggaccc ccgggcagag ggtcaccatc
60tcttgttctg gaagcagctc caacatcgga agtaattatg tatactggta ccagcagctc
120ccaggaacgg cccccaaact cctcatctat aggaataatc agcggccctc aggggtccct
180gaccgattct ctggctccaa gtctggcacc tcagcctccc tggccatcag tgggctccgg
240tccgaggatg aggctgatta ttactgtgca gcatgggatg acagcctgag tggtcc
296196299DNAhomo sapiens; 196cagtctgtgc tgacgcagcc gccctcagtg tctggggccc
cagggcagag ggtcaccatc 60tcctgcactg ggagcagctc caacattggg gcgggttatg
ttgtacattg gtaccagcag 120cttccaggaa cagcccccaa actcctcatc tatggtaaca
gcaatcggcc ctcaggggtc 180cctgaccaat tctctggctc caagtctggc acctcagcct
ccctggccat cactggactc 240cagtctgagg atgaggctga ttattactgc aaagcatggg
ataacagcct gaatgctca 299197296DNAhomo sapiens; 197cagtctgtgt
tgacgcagcc gccctcagtg tctgcggccc caggacagaa ggtcaccatc 60tcctgctctg
gaagcagctc caacattggg aataattatg tatcctggta ccagcagctc 120ccaggaacag
cccccaaact cctcatttat gacaataata agcgaccctc agggattcct 180gaccgattct
ctggctccaa gtctggcacg tcagccaccc tgggcatcac cggactccag 240actggggacg
aggccgatta ttactgcgga acatgggata gcagcctgag tgctgg
296198296DNAhomo sapiens; 198caggcagggc tgactcagcc accctcggtg tccaagggct
tgagacagac cgccacactc 60acctgcactg ggaacagcaa caatgttggc aaccaaggag
cagcttggct gcagcagcac 120cagggccacc ctcccaaact cctatcctac aggaataaca
accggccctc agggatctca 180gagagattat ctgcatccag gtcaggaaac acagcctccc
tgaccattac tggactccag 240cctgaggacg aggctgacta ttactgctca gcatgggaca
gcagcctcag tgctca 296199312DNAhomo sapiens; 199cggcccgtgc
tgactcagcc gccctctctg tctgcatccc cgggagcaac agccagactc 60ccctgcaccc
tgagcagtga cctcagtgtt ggtggtaaaa acatgttctg gtaccagcag 120aagccaggga
gctctcccag gttattcctg tatcactact cagactcaga caagcagctg 180ggacctgggg
tccccagtcg agtctctggc tccaaggaga cctcaagtaa cacagcgttt 240ttgctcatct
ctgggctcca gcctgaggac gaggccgatt attactgcca ggtgtacgaa 300agtagtgcta
at
312200297DNAhomo sapiens; 200cagtctgccc tgactcagcc tgcctccgtg tctgggtctc
ctggacagtc gatcaccatc 60tcctgcactg gaaccagcag tgacgttggt ggttataact
atgtctcctg gtaccaacag 120cacccaggca aagcccccaa actcatgatt tatgaggtca
gtaatcggcc ctcaggggtt 180tctaatcgct tctctggctc caagtctggc aacacggcct
ccctgaccat ctctgggctc 240caggctgagg acgaggctga ttattactgc agctcatata
caagcagcag cactctc 297201297DNAhomo sapiens; 201cagtctgccc
tgactcagcc tccctccgtg tccgggtctc ctggacagtc agtcaccatc 60tcctgcactg
gaaccagcag tgacgttggt agttataacc gtgtctcctg gtaccagcag 120cccccaggca
cagcccccaa actcatgatt tatgaggtca gtaatcggcc ctcaggggtc 180cctgatcgct
tctctgggtc caagtctggc aacacggcct ccctgaccat ctctgggctc 240caggctgagg
acgaggctga ttattactgc agcttatata caagcagcag cactttc
297202298DNAhomo sapiens; 202cagtctgccc tgactcagcc tgcctccgtg tctgggtctc
ctggacagtc gatcaccatc 60tcctgcactg gaaccagcag tgatgttggg agttataacc
ttgtctcctg gtaccaacag 120cacccaggca aagcccccaa actcatgatt tatgagggca
gtaagcggcc ctcaggggtt 180tctaatcgct tctctggctc caagtctggc aacacggcct
ccctgacaat ctctgggctc 240caggctgagg acgaggctga ttattactgc tgctcatatg
caggtagtag cactttac 298203297DNAhomo sapiens; 203cagtctgccc
tgactcagcc tccctccgcg tccgggtctc ctggacagtc agtcaccatc 60tcctgcactg
gaaccagcag tgacgttggt ggttataact atgtctcctg gtaccaacag 120cacccaggca
aagcccccaa actcatgatt tatgaggtca gtaagcggcc ctcaggggtc 180cctgatcgct
tctctggctc caagtctggc aacacggcct ccctgaccgt ctctgggctc 240caggctgagg
atgaggctga ttattactgc agctcatatg caggcagcaa caatttc
297204285DNAhomo sapiens; 204tcctatgagc tgactcagcc accctcagtg tccgtgtccc
caggacagac agccagcatc 60acctgctctg gagataaatt gggggataaa tatgcttgct
ggtatcagca gaagccaggc 120cagtcccctg tgctggtcat ctatcaagat agcaagcggc
cctcagggat ccctgagcga 180ttctctggct ccaactctgg gaacacagcc actctgacca
tcagcgggac ccaggctatg 240gatgaggctg actattactg tcaggcgtgg gacagcagca
ctgca 285205290DNAhomo sapiens; 205tcctatgagc
tgactcagcc acactcagtg tcagtggcca cagcacagat ggccaggatc 60acctgtgggg
gaaacaacat tggaagtaaa gctgtgcact ggtaccagca aaagccaggc 120caggaccctg
tgctggtcat ctatagcgat agcaaccggc cctcagggat ccctgagcga 180ttctctggct
ccaacccagg gaacaccacc accctaacca tcagcaggat cgaggctggg 240gatgaggctg
actattactg tcaggtgtgg gacagtagta gtgatcatcc
290206290DNAhomo sapiens; 206tcttctgagc tgactcagga ccctgctgtg tctgtggcct
tgggacagac agtcaggatc 60acatgccaag gagacagcct cagaagctat tatgcaagct
ggtaccagca gaagccagga 120caggcccctg tacttgtcat ctatggtaaa aacaaccggc
cctcagggat cccagaccga 180ttctctggct ccagctcagg aaacacagct tccttgacca
tcactggggc tcaggcggaa 240gatgaggctg actattactg taactcccgg gacagcagtg
gtaaccatct 290207290DNAhomo sapiens; 207tcctatgtgc
tgactcagcc accctcagtg tcagtggccc caggaaagac ggccaggatt 60acctgtgggg
gaaacaacat tggaagtaaa agtgtgcact ggtaccagca gaagccaggc 120caggcccctg
tgctggtcat ctattatgat agcgaccggc cctcagggat ccctgagcga 180ttctctggct
ccaactctgg gaacacggcc accctgacca tcagcagggt cgaagccggg 240gatgaggccg
actattactg tcaggtgtgg gacagtagta gtgatcatcc
290208284DNAhomo sapiens; 208tcctatgagc tgacacagct accctcggtg tcagtgtccc
caggacagac agccaggatc 60acctgctctg gagatgtact gggggaaaat tatgctgact
ggtaccagca gaagccaggc 120caggcccctg agttggtgat atacgaagat agtgagcggt
accctggaat ccctgaacga 180ttctctgggt ccacctcagg gaacacgacc accctgacca
tcagcagggt cctgaccgaa 240gacgaggctg actattactg tttgtctggg gatgaggaca
atcc 284209290DNAhomo sapiens; 209tcctatgagc
tgatgcagcc accctcggtg tcagtgtccc caggacagac ggccaggatc 60acctgctctg
gagatgcatt gccaaagcaa tatgcttatt ggtaccagca gaagccaggc 120caggcccctg
tgctggtgat atataaagac agtgagaggc cctcagggat ccctgagcga 180ttctctggct
ccagctcagg gacaacagtc acgttgacca tcagtggagt ccaggcagaa 240gatgaggctg
actattactg tcaatcagca gacagcagtg gtacttatcc
290210297DNAhomo sapiens; 210cagcctgtgc tgactcaatc atcctctgcc tctgcttccc
tgggatcctc ggtcaagctc 60acctgcactc tgagcagtgg gcacagtagc tacatcatcg
catggcatca gcagcagcca 120gggaaggccc ctcggtactt gatgaagctt gaaggtagtg
gaagctacaa caaggggagc 180ggagttcctg atcgcttctc aggctccagc tctggggctg
accgctacct caccatctcc 240aacctccagt tagaggatga ggctgattat tactgtgaga
cctgggacag taacact 297211312DNAhomo sapiens; 211cagcctgtgc
tgactcagcc accttcctcc tccgcatctc ctggagaatc cgccagactc 60acctgcacct
tgcccagtga catcaatgtt ggtagctaca acatatactg gtaccagcag 120aagccaggga
gccctcccag gtatctcctg tactactact cagactcaga taagggccag 180ggctctggag
tccccagccg cttctctgga tccaaagatg cttcagccaa tacagggatt 240ttactcatct
ccgggctcca gtctgaggat gaggctgact attactgtat gatttggcca 300agcaatgctt
ct
312212312DNAhomo sapiens; 212caggctgtgc tgactcagcc ggcttccctc tctgcatctc
ctggagcatc agccagtctc 60acctgcacct tgcgcagtgg catcaatgtt ggtacctaca
ggatatactg gtaccagcag 120aagccaggga gtcctcccca gtatctcctg aggtacaaat
cagactcaga taagcagcag 180ggctctggag tccccagccg cttctctgga tccaaagatg
cttcggccaa tgcagggatt 240ttactcatct ctgggctcca gtctgaggat gaggctgact
attactgtat gatttggcac 300agcagcgctt ct
312213312DNAhomo sapiens; 213cagcctgtgc tgactcagcc
aacttccctc tcagcatctc ctggagcatc agccagactc 60acctgcacct tgcgcagtgg
catcaatctt ggtagctaca ggatattctg gtaccagcag 120aagccagaga gccctccccg
gtatctcctg agctactact cagactcaag taagcatcag 180ggctctggag tccccagccg
cttctctgga tccaaagatg cttcgagcaa tgcagggatt 240ttagtcatct ctgggctcca
gtctgaggat gaggctgact attactgtat gatttggcac 300agcagtgctt ct
312214296DNAhomo sapiens;
214aattttatgc tgactcagcc ccactctgtg tcggagtctc cggggaagac ggtaaccatc
60tcctgcaccc gcagcagtgg cagcattgcc agcaactatg tgcagtggta ccagcagcgc
120ccgggcagtt cccccaccac tgtgatctat gaggataacc aaagaccctc tggggtccct
180gatcggttct ctggctccat cgacagctcc tccaactctg cctccctcac catctctgga
240ctgaagactg aggacgaggc tgactactac tgtcagtctt atgatagcag caatca
296215294DNAhomo sapiens; 215caggctgtgg tgactcagga gccctcactg actgtgtccc
caggagggac agtcactctc 60acctgtggct ccagcactgg agctgtcacc agtggtcatt
atccctactg gttccagcag 120aagcctggcc aagcccccag gacactgatt tatgatacaa
gcaacaaaca ctcctggaca 180cctgcccggt tctcaggctc cctccttggg ggcaaagctg
ccctgaccct ttcgggtgcg 240cagcctgagg atgaggctga gtattactgc ttgctctcct
atagtggtgc tcgg 294216296DNAhomo sapiens; 216cagactgtgg
tgacccagga gccatcgttc tcagtgtccc ctggagggac agtcacactc 60acttgtggct
tgagctctgg ctcagtctct actagttact accccagctg gtaccagcag 120accccaggcc
aggctccacg cacgctcatc tacagcacaa acactcgctc ttctggggtc 180cctgatcgct
tctctggctc catccttggg aacaaagctg ccctcaccat cacgggggcc 240caggcagatg
atgaatctga ttattactgt gtgctgtata tgggtagtgg catttc
2962171134DNAhomo sapiens; misc_feature(1)..(1) n is a, c, g, or t
217ncttccacca agggcccatc ggtcttcccc ctggcgccct gctccaggag cacctctggg
60ggcacagcgg ccctgggctg cctggtcaag gactacttcc cagaaccggt gacggtgtcg
120tggaactcag gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca
180ggactctact ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacccagacc
240tacacctgca acgtgaatca caagcccagc aacaccaagg tggacaagag agttgagctc
300aaaaccccac ttggtgacac aactcacaca tgcccacggt gcccagagcc caaatcttgt
360gacacacctc ccccgtgccc acggtgccca gagcccaaat cttgtgacac acctccccca
420tgcccacggt gcccagagcc caaatcttgt gacacacctc ccccgtgccc aaggtgccca
480gcacctgaac tcctgggagg accgtcagtc ttcctcttcc ccccaaaacc caaggatacc
540cttatgattt cccggacccc tgaggtcacg tgcgtggtgg tggacgtgag ccacgaagac
600cccgaggtcc agttcaagtg gtacgtggac ggcgtggagg tgcataatgc caagacaaag
660ccgcgggagg agcagtacaa cagcacgttc cgtgtggtca gcgtcctcac cgtcctgcac
720caggactggc tgaacggcaa ggagtacaag tgcaaggtct ccaacaaagc cctcccagcc
780cccatcgaga aaaccatctc caaaaccaaa ggacagcccc gagaaccaca ggtgtacacc
840ctgcccccat cccgggagga gatgaccaag aaccaggtca gcctgacctg cctggtcaaa
900ggcttctacc ccagcgacat cgccgtggag tgggagagca gcgggcagcc ggagaacaac
960tacaacacca cgcctcccat gctggactcc gacggctcct tcttcctcta cagcaagctc
1020accgtggaca agagcaggtg gcagcagggg aacatcttct catgctccgt gatgcatgag
1080gctctgcaca accgcttcac gcagaagagc ctctccctgt ctccgggtaa atga
11342181023DNAhomo sapiens; 218gcatccccga ccagccccaa ggtcttcccg
ctgagcctcg acagcacccc ccaagatggg 60aacgtggtcg tcgcatgcct ggtccagggc
ttcttccccc aggagccact cagtgtgacc 120tggagcgaaa gcggacagaa cgtgaccgcc
agaaacttcc cacctagcca ggatgcctcc 180ggggacctgt acaccacgag cagccagctg
accctgccgg ccacacagtg cccagacggc 240aagtccgtga catgccacgt gaagcactac
acgaattcca gccaggatgt gactgtgccc 300tgccgagttc ccccacctcc cccatgctgc
cacccccgac tgtcgctgca ccgaccggcc 360ctcgaggacc tgctcttagg ttcagaagcg
aacctcacgt gcacactgac cggcctgaga 420gatgcctctg gtgccacctt cacctggacg
ccctcaagtg ggaagagcgc tgttcaagga 480ccacctgagc gtgacctctg tggctgctac
agcgtgtcca gtgtcctgcc tggctgtgcc 540cagccatgga accatgggga gaccttcacc
tgcactgctg cccaccccga gttgaagacc 600ccactaaccg ccaacatcac aaaatccgga
aacacattcc ggcccgaggt ccacctgctg 660ccgccgccgt cggaggagct ggccctgaac
gagctggtga cgctgacgtg cctggcacgt 720ggcttcagcc ccaaggatgt gctggttcgc
tggctgcagg ggtcacagga gctgccccgc 780gagaagtacc tgacttgggc atcccggcag
gagcccagcc agggcaccac cacctacgct 840gtaaccagca tactgcgcgt ggcagctgag
gactggaaga agggggagac cttctcctgc 900atggtgggcc acgaggccct gccgctggcc
ttcacacaga agaccatcga ccgcatggcg 960ggtaaaccca cccacatcaa tgtgtctgtt
gtcatggcgg aggcggatgg cacctgctac 1020tga
1023219981DNAhomo sapiens; 219gcctccacca
agggcccatc ggtcttcccc ctggcgccct gctccaggag cacctccgag 60agcacagcgg
ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg 120tggaactcag
gcgctctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact
ccctcagcag cgtggtgacc gtgccctcca gcaacttcgg cacccagacc 240tacacctgca
acgtagatca caagcccagc aacaccaagg tggacaagac agttgagcgc 300aaatgttgtg
tcgagtgccc accgtgccca gcaccacctg tggcaggacc gtcagtcttc 360ctcttccccc
caaaacccaa ggacaccctc atgatctccc ggacccctga ggtcacgtgc 420gtggtggtgg
acgtgagcca cgaagacccc gaggtccagt tcaactggta cgtggacggc 480gtggaggtgc
ataatgccaa gacaaagcca cgggaggagc agttcaacag cacgttccgt 540gtggtcagcg
tcctcaccgt cgtgcaccag gactggctga acggcaagga gtacaagtgc 600aaggtctcca
acaaaggcct cccagccccc atcgagaaaa ccatctccaa aaccaaaggg 660cagccccgag
aaccacaggt gtacaccctg cccccatccc gggaggagat gaccaagaac 720caggtcagcc
tgacctgcct ggtcaaaggc ttctacccca gcgacatctc cgtggagtgg 780gagagcaatg
ggcagccgga gaacaactac aagaccacac ctcccatgct ggactccgac 840ggctccttct
tcctctacag caagctcacc gtggacaaga gcaggtggca gcaggggaac 900gtcttctcat
gctccgtgat gcatgaggct ctgcacaacc actacacaca gaagagcctc 960tccctgtctc
cgggtaaatg a
9812201287DNAhomo sapiens; 220gcctccacac agagcccatc cgtcttcccc ttgacccgct
gctgcaaaaa cattccctcc 60aatgccacct ccgtgactct gggctgcctg gccacgggct
acttcccgga gccggtgatg 120gtgacctggg acacaggctc cctcaacggg acaactatga
ccttaccagc caccaccctc 180acgctctctg gtcactatgc caccatcagc ttgctgaccg
tctcgggtgc gtgggccaag 240cagatgttca cctgccgtgt ggcacacact ccatcgtcca
cagactgggt cgacaacaaa 300accttcagcg tctgctccag ggacttcacc ccgcccaccg
tgaagatctt acagtcgtcc 360tgcgacggcg gcgggcactt ccccccgacc atccagctcc
tgtgcctcgt ctctgggtac 420accccaggga ctatcaacat cacctggctg gaggacgggc
aggtcatgga cgtggacttg 480tccaccgcct ctaccacgca ggagggtgag ctggcctcca
cacaaagcga gctcaccctc 540agccagaagc actggctgtc agaccgcacc tacacctgcc
aggtcaccta tcaaggtcac 600acctttgagg acagcaccaa gaagtgtgca gattccaacc
cgagaggggt gagcgcctac 660ctaagccggc ccagcccgtt cgacctgttc atccgcaagt
cgcccacgat cacctgtctg 720gtggtggacc tggcacccag caaggggacc gtgaacctga
cctggtcccg ggccagtggg 780aagcctgtga accactccac cagaaaggag gagaagcagc
gcaatggcac gttaaccgtc 840acgtccaccc tgccggtggg cacccgagac tggatcgagg
gggagaccta ccagtgcagg 900gtgacccacc cccacctgcc cagggccctc atgcggtcca
cgaccaagac cagcggcccg 960cgtgctgccc cggaagtcta tgcgtttgcg acgccggagt
ggccggggag ccgggacaag 1020cgcaccctcg cctgcctgat ccagaacttc atgcctgagg
acatctcggt gcagtggctg 1080cacaacgagg tgcagctccc ggacgcccgg cacagcacga
cgcagccccg caagaccaag 1140ggctccggct tcttcgtctt cagccgcctg gaggtgacca
gggccgaatg ggagcagaaa 1200gatgagttca tctgccgtgc agtccatgag gcagcaagcc
cctcacagac cgtccagcga 1260gcggtgtctg taaatcccgg taaatga
128722130DNAhomo sapiens; 221tccggcttct tcgtcttcag
ccgcctggag 302221062DNAhomo sapiens;
222gcatccccga ccagccccaa ggtcttcccg ctgagcctct gcagcaccca gccagatggg
60aacgtggtca tcgcctgcct ggtccagggc ttcttccccc aggagccact cagtgtgacc
120tggagcgaaa gcggacaggg cgtgaccgcc agaaacttcc cacccagcca ggatgcctcc
180ggggacctgt acaccacgag cagccagctg accctgccgg ccacacagtg cctagccggc
240aagtccgtga catgccacgt gaagcactac acgaatccca gccaggatgt gactgtgccc
300tgcccagttc cctcaactcc acctacccca tctccctcaa ctccacctac cccatctccc
360tcatgctgcc acccccgact gtcactgcac cgaccggccc tcgaggacct gctcttaggt
420tcagaagcga acctcacgtg cacactgacc ggcctgagag atgcctcagg tgtcaccttc
480acctggacgc cctcaagtgg gaagagcgct gttcaaggac cacctgagcg tgacctctgt
540ggctgctaca gcgtgtccag tgtcctgccg ggctgtgccg agccatggaa ccatgggaag
600accttcactt gcactgctgc ctaccccgag tccaagaccc cgctaaccgc caccctctca
660aaatccggaa acacattccg gcccgaggtc cacctgctgc cgccgccgtc ggaggagctg
720gccctgaacg agctggtgac gctgacgtgc ctggcacgcg gcttcagccc caaggatgtg
780ctggttcgct ggctgcaggg gtcacaggag ctgccccgcg agaagtacct gacttgggca
840tcccggcagg agcccagcca gggcaccacc accttcgctg tgaccagcat actgcgcgtg
900gcagccgagg actggaagaa gggggacacc ttctcctgca tggtgggcca cgaggccctg
960ccgctggcct tcacacagaa gaccatcgac cgcttggcgg gtaaacccac ccatgtcaat
1020gtgtctgttg tcatggcgga ggtggacggc acctgctact ga
1062223888DNAhomo sapiens; 223gcctccacca agggcccatc ggtcttcccc ctggcaccct
cctccaagag cacctctggg 60ggcacagcag ccctgggctg cctggtcaag gactacttcc
ccgaaccggt gacggtgtcg 120tggaactcag gcgccctgac cagcggcgtg cacaccttcc
cggctgtcct acagtcctca 180ggactctact ccctcagcag cgtggtgacc gtgccctcca
gcagcttggg cacccagacc 240tacatctgca acgtgaatca caagcccagc aacaccaagg
tggacaagaa agttgagccc 300aaatcttgtg acaaaactca cacatgccca ccgtgcccag
cacctgaact cctgggggga 360ccgtcagtct tcctcttccc cccaaaaccc aaggacaccc
tcatgatctc ccggacccct 420gaggtcacat gcgtggtggt ggacgtgagc cacgaagacc
ctgaggtcaa gttcaactgg 480tacgtggacg gcgtggagta caagtgcaag gtctccaaca
aagccctccc agcccccatc 540gagaaaacca tctccaaagc caaagggcag ccccgagaac
cacaggtgta caccctgccc 600ccatcccggg atgagctgac caagaaccag gtcagcctga
cctgcctggt caaaggcttc 660tatcccagcg acatcgccgt ggagtgggag agcaatgggc
agccggagaa caactacaag 720accacgcctc ccgtgctgga ctccgacggc tccttcttcc
tctacagcaa gctcaccgtg 780gacaagagca ggtggcagca ggggaacgtc ttctcatgct
ccgtgatgca tgaggctctg 840cacaaccact acacacagaa gagcctctcc ctgtctccgg
gtaaatga 888224993DNAhomo sapiens; 224gcctccacca
agggcccatc ggtcttcccc ctggcaccct cctccaagag cacctctggg 60ggcacagcag
ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg 120tggaactcag
gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact
ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacccagacc 240tacatctgca
acgtgaatca caagcccagc aacaccaagg tggacaagaa agttgagccc 300aaatcttgtg
acaaaactca cacatgccca ccgtgcccag cacctgaact cctgggggga 360ccgtcagtct
tcctcttccc cccaaaaccc aaggacaccc tcatgatctc ccggacccct 420gaggtcacat
gcgtggtggt ggacgtgagc cacgaagacc ctgaggtcaa gttcaactgg 480tacgtggacg
gcgtggaggt gcataatgcc aagacaaagc cgcgggagga gcagtacaac 540agcacgtacc
gtgtggtcag cgtcctcacc gtcctgcacc aggactggct gaatggcaag 600gagtacaagt
gcaaggtctc caacaaagcc ctcccagccc ccatcgagaa aaccatctcc 660aaagccaaag
ggcagccccg agaaccacag gtgtacaccc tgcccccatc ccgggatgag 720ctgaccaaga
accaggtcag cctgacctgc ctggtcaaag gcttctatcc cagcgacatc 780gccgtggagt
gggagagcaa tgggcagccg gagaacaact acaagaccac gcctcccgtg 840ctggactccg
acggctcctt cttcctctac agcaagctca ccgtggacaa gagcaggtgg 900cagcagggga
acgtcttctc atgctccgtg atgcatgagg ctctgcacaa ccactacaca 960cagaagagcc
tctccctgtc tccgggtaaa tga
9932251362DNAhomo sapiens; 225gggagtgcat ccgccccaac ccttttcccc ctcgtctcct
gtgagaattc cccgtcggat 60acgagcagcg tggccgttgg ctgcctcgca caggacttcc
ttcccgactc catcactttc 120tcctggaaat acaagaacaa ctctgacatc agcagcaccc
ggggcttccc atcagtcctg 180agagggggca agtacgcagc cacctcacag gtgctgctgc
cttccaagga cgtcatgcag 240ggcacagacg aacacgtggt gtgcaaagtc cagcacccca
acggcaacaa agaaaagaac 300gtgcctcttc cagtgattgc cgagctgcct cccaaagtga
gcgtcttcgt cccaccccgc 360gacggcttct tcggcaaccc ccgcaagtcc aagctcatct
gccaggccac gggtttcagt 420ccccggcaga ttcaggtgtc ctggctgcgc gaggggaagc
aggtggggtc tggcgtcacc 480acggaccagg tgcaggctga ggccaaagag tctgggccca
cgacctacaa ggtgaccagc 540acactgacca tcaaagagag cgactggctc agccagagca
tgttcacctg ccgcgtggat 600cacaggggcc tgaccttcca gcagaatgcg tcctccatgt
gtggccccga tcaagacaca 660gccatccggg tcttcgccat ccccccatcc tttgccagca
tcttcctcac caagtccacc 720aagttgacct gcctggtcac agacctgacc acctatgaca
gcgtgaccat ctcctggacc 780cgccagaatg gcgaagctgt gaaaacccac accaacatct
ccgagagcca ccccaatgcc 840actttcagcg ccgtgggtga ggccagcatc tgcgaggatg
actggaattc cggggagagg 900ttcacgtgca ccgtgaccca cacagacctg ccctcgccac
tgaagcagac catctcccgg 960cccaaggggg tggccctgca caggcccgat gtctacttgc
tgccaccagc ccgggagcag 1020ctgaacctgc gggagtcggc caccatcacg tgcctggtga
cgggcttctc tcccgcggac 1080gtcttcgtgc agtggatgca gagggggcag cccttgtccc
cggagaagta tgtgaccagc 1140gccccaatgc ctgagcccca ggccccaggc cggtacttcg
cccacagcat cctgaccgtg 1200tccgaagagg aatggaacac gggggagacc tacacctgcg
tggtggccca tgaggccctg 1260cccaacaggg tcaccgagag gaccgtggac aagtccaccg
gtaaacccac cctgtacaac 1320gtgtccctgg tcatgtccga cacagctggc acctgctact
ga 13622261200DNAhomo sapiens; 226gcctccacca
agggcccatc ggtcttcccc ctggcaccct cctccaagag cacctctggg 60ggcacagcag
ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg 120tggaactcag
gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact
ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacccagacc 240tacatctgca
acgtgaatca caagcccagc aacaccaagg tggacaagaa agttgagccc 300aaatcttgtg
acaaaactca cacatgccca ccgtgcccag cacctgaact cctgggggga 360ccgtcagtct
tcctcttccc cccaaaaccc aaggacaccc tcatgatctc ccggacccct 420gaggtcacat
gcgtggtggt ggacgtgagc cacgaagacc ctgaggtcaa gttcaactgg 480tacgtggacg
gcgtggaggt gcataatgcc aagacaaagc cgcgggagga gcagtacaac 540agcacgtacc
gtgtggtcag cgtcctcacc gtcctgcacc aggactggct gaatggcaag 600gagtacaagt
gcaaggtctc caacaaagcc ctcccagccc ccatcgagaa aaccatctcc 660aaagccaaag
ggcagccccg agaaccacag gtgtacaccc tgcccccatc ccgggatgag 720ctgaccaaga
accaggtcag cctgacctgc ctggtcaaag gcttctatcc cagcgacatc 780gccgtggagt
gggagagcaa tgggcagccg gagaacaact acaagaccac gcctcccgtg 840ctggactccg
acggctcctt cttcctctac agcaagctca ccgtggacaa gagcaggtgg 900cagcagggga
acgtcttctc atgctccgtg atgcatgagg ctctgcacaa ccactacaca 960cagaagagcc
tctccctgtc tccggagctg caactggagg agagctgtgc ggaggcgcag 1020gacggggagc
tggacgggct gtggacgacc atcaccatct tcatcacact cttcctgtta 1080agcgtgtgct
acagtgccac cgtcaccttc ttcaaggtga agtggatctt ctcctcggtg 1140gtggacctga
agcagaccat catccccgac tacaggaaca tgatcggaca gggggcctag
12002271293DNAhomo sapiens; misc_feature(1)..(1) n is a, c, g, or t
227ncacccacca aggctccgga tgtgttcccc atcatatcag ggtgcagaca cccaaaggat
60aacagccctg tggtcctggc atgcttgata actgggtacc acccaacgtc cgtgactgtc
120acctggtaca tggggacaca gagccagccc cagagaacct tccctgagat acaaagacgg
180gacagctact acatgacaag cagccagctc tccacccccc tccagcagtg gcgccaaggc
240gagtacaaat gcgtggtcca gcacaccgcc agcaagagta agaaggagat cttccgctgg
300ccagagtctc caaaggcaca ggcctcctca gtgcccactg cacaacccca agcagagggc
360agcctcgcca aggcaaccac agccccagcc accacccgta acacaggaag aggaggagaa
420gagaagaaga aggagaagga gaaagaggaa caagaagaga gagagacaaa gacaccagag
480tgtccgagcc acacccagcc tcttggcgtc tacctgctaa cccctgcagt gcaggacctg
540tggctccggg acaaagccac cttcacctgc ttcgtggtgg gcagtgacct gaaggatgct
600cacctgacct gggaggtggc cgggaaggtc cccacagggg gcgtggagga agggctgctg
660gagcggcaca gcaacggctc ccagagccag cacagccgtc tgaccctgcc caggtccttg
720tggaacgcgg ggacctccgt cacctgcaca ctgaaccatc ccagcctccc accccagagg
780ttgatggcgc tgagagaacc cgctgcgcag gcacccgtca agctttccct gaacctgctg
840gcctcgtctg accctcccga ggcggcctcg tggctcctgt gtgaggtgtc tggcttctcg
900ccccccaaca tcctcctgat gtggctggag gaccagcgtg aggtgaacac ttctgggttt
960gcccccgcac gcccccctcc acagcccggg agcaccacgt tctgggcctg gagtgtgctg
1020cgtgtcccag ccccgcccag ccctcagcca gccacctaca cgtgtgtggt cagccacgag
1080gactcccgga ctctgctcaa cgccagccgg agcctagaag tcagctacct ggccatgacc
1140cccctgatcc ctcagagcaa ggatgagaac agcgatgact acacgacctt tgatgatgtg
1200ggcagcctgt ggaccgccct gtccacgttt gtggccctct tcatcctcac cctcctctac
1260agcggcattg tcactttcat caaggtgaag tag
1293
SEQ ID NO: 228; length: 984; type: DNA; organism: Homo sapiens
gcttccacca agggcccatc ggtcttcccc ctggcgccct
gctccaggag cacctccgag 60agcacagccg ccctgggctg cctggtcaag gactacttcc
ccgaaccggt gacggtgtcg 120tggaactcag gcgccctgac cagcggcgtg cacaccttcc
cggctgtcct acagtcctca 180ggactctact ccctcagcag cgtggtgacc gtgccctcca
gcagcttggg cacgaagacc 240tacacctgca acgtagatca caagcccagc aacaccaagg
tggacaagag agttgagtcc 300aaatatggtc ccccatgccc atcatgccca gcacctgagt
tcctgggggg accatcagtc 360ttcctgttcc ccccaaaacc caaggacact ctcatgatct
cccggacccc tgaggtcacg 420tgcgtggtgg tggacgtgag ccaggaagac cccgaggtcc
agttcaactg gtacgtggat 480ggcgtggagg tgcataatgc caagacaaag ccgcgggagg
agcagttcaa cagcacgtac 540cgtgtggtca gcgtcctcac cgtcctgcac caggactggc
tgaacggcaa ggagtacaag 600tgcaaggtct ccaacaaagg cctcccgtcc tccatcgaga
aaaccatctc caaagccaaa 660gggcagcccc gagagccaca ggtgtacacc ctgcccccat
cccaggagga gatgaccaag 720aaccaggtca gcctgacctg cctggtcaaa ggcttctacc
ccagcgacat cgccgtggag 780tgggagagca atgggcagcc ggagaacaac tacaagacca
cgcctcccgt gctggactcc 840gacggctcct tcttcctcta cagcaggctc accgtggaca
agagcaggtg gcaggagggg 900aatgtcttct catgctccgt gatgcatgag gctctgcaca
accactacac acagaagagc 960ctctccctgt ctctgggtaa atga
984
SEQ ID NO: 229; length: 228; type: DNA; organism: Homo sapiens
tccggcttct tcgtcttcag
ccgcctggag gtgaccaggg ccgaatggga gcagaaagat 60gagttcatct gccgtgcagt
ccatgaggca gcaagcccct cacagaccgt ccagcgagcg 120gtgtctgtaa atcccgagct
ggacgtgtgc gtggaggagg ccgagggcga ggcgccgtgg 180acgtggaccg gcctctgcat
cttcgccgca ctcttcctgc tcagcgtg 228
SEQ ID NO: 230; length: 324; type: DNA; organism: Homo sapiens; feature: misc_feature at (1)..(1), n is a, c, g, or t
ngaactgtgg ctgcaccatc
tgtcttcatc ttcccgccat ctgatgagca gttgaaatct 60ggaactgcct ctgttgtgtg
cctgctgaat aacttctatc ccagagaggc caaagtacag 120tggaaggtgg ataacgccct
ccaatcgggt aactcccagg agagtgtcac agagcaggac 180agcaaggaca gcacctacag
cctcagcagc accctgacgc tgagcaaagc agactacgag 240aaacacaaag tctacgcctg
cgaagtcacc catcagggcc tgagctcgcc cgtcacaaag 300agcttcaaca ggggagagtg
ttag 324
SEQ ID NO: 231; length: 321; type: DNA; organism: Homo sapiens; feature: misc_feature at (1)..(1), n is a, c, g, or t
ngtcagccca aggctgcccc
ctcggtcact ctgttcccgc cctcctctga ggagcttcaa 60gccaacaagg ccacactggt
gtgtctcata agtgacttct acccgggagc cgtgacagtg 120gcctggaagg cagatagcag
ccccgtcaag gcgggagtgg agaccaccac accctccaaa 180caaagcaaca acaagtacgc
ggccagcagc tatctgagcc tgacgcctga gcagtggaag 240tcccacagaa gctacagctg
ccaggtcacg catgaaggga gcaccgtgga gaagacagtg 300gcccctacag aatgttcata g
321
SEQ ID NO: 232; length: 321; type: DNA; organism: Homo sapiens; feature: misc_feature at (1)..(1), n is a, c, g, or t
ngtcagccca aggccaaccc
cactgtcact ctgttcccgc cctcctctga ggagctccaa 60gccaacaagg ccacactagt
gtgtctgatc agtgacttct acccgggagc tgtgacagtg 120gcctggaagg cagatggcag
ccccgtcaag gcgggagtgg agaccaccaa accctccaaa 180cagagcaaca acaagtacgc
ggccagcagc tacctgagcc tgacgcccga gcagtggaag 240tcccacagaa gctacagctg
ccaggtcacg catgaaggga gcaccgtgga gaagacagtg 300gcccctacag aatgttcata g
321
SEQ ID NO: 233; length: 321; type: DNA; organism: Homo sapiens; feature: misc_feature at (1)..(1), n is a, c, g, or t
ngtcagccca aggctgcccc
ctcggtcact ctgttcccgc cctcctctga ggagcttcaa 60gccaacaagg ccacactggt
gtgtctcata agtgacttct acccgggagc cgtgacagtg 120gcctggaagg cagatagcag
ccccgtcaag gcgggagtgg agaccaccac accctccaaa 180caaagcaaca acaagtacgc
ggccagcagc tacctgagcc tgacgcctga gcagtggaag 240tcccacagaa gctacagctg
ccaggtcacg catgaaggga gcaccgtgga gaagacagtg 300gcccctacag aatgttcata g
321
SEQ ID NO: 234; length: 321; type: DNA; organism: Homo sapiens; feature: misc_feature at (1)..(1), n is a, c, g, or t
ngtcagccca aggctgcccc
ctcggtcact ctgttcccac cctcctctga ggagcttcaa 60gccaacaagg ccacactggt
gtgtctcgta agtgacttct acccgggagc cgtgacagtg 120gcctggaagg cagatggcag
ccccgtcaag gtgggagtgg agaccaccaa accctccaaa 180caaagcaaca acaagtatgc
ggccagcagc tacctgagcc tgacgcccga gcagtggaag 240tcccacagaa gctacagctg
ccgggtcacg catgaaggga gcaccgtgga gaagacagtg 300gcccctgcag aatgctctta g
321
SEQ ID NO: 235; length: 61; type: DNA; organism: Homo sapiens
tactactact actacggtat ggacgtctgg ggccaaggga ccacggtcac cgtctcctca 60
g 61
SEQ ID NO: 236; length: 20; type: PRT; organism: Homo sapiens
Tyr Tyr Tyr Tyr Tyr Gly Met Asp Val Trp Gly Gln Gly Thr Thr Val
1               5                   10                  15
Thr Val Ser Ser
            20
SEQ ID NO: 237; length: 101; type: DNA; organism: Homo sapiens
ggtttttgtg gggtgaggat ggacattctg ccattgtgat tactactact actacggtat 60
ggacgtctgg ggccaaggga ccacggtcac cgtctcctca g 101
SEQ ID NO: 238; length: 38; type: DNA; organism: Homo sapiens
ggtttttgtg gggtgaggat ggacattctg ccattgtg 38
SEQ ID NO: 239; length: 52; type: DNA; organism: Homo sapiens
gctgaatact tccagcactg gggccagggc accctggtca ccgtctcctc ag 52
SEQ ID NO: 240; length: 52; type: DNA; organism: Homo sapiens
tactggtact tcgatctctg gggccgtggc accctggtca ctgtctcctc ag 52
SEQ ID NO: 241; length: 49; type: DNA; organism: Homo sapiens
gatgcttttg atatctgggg ccaagggaca atggtcaccg tctcttcag 49
SEQ ID NO: 242; length: 46; type: DNA; organism: Homo sapiens
tactttgact actggggcca gggaaccctg gtcaccgtct cctcag 46
SEQ ID NO: 243; length: 49; type: DNA; organism: Homo sapiens
aactggttcg acccctgggg ccagggaacc ctggtcaccg tctcctcag 49
SEQ ID NO: 244; length: 61; type: DNA; organism: Homo sapiens
tactactact actactacat ggacgtctgg ggcaaaggga ccacggtcac cgtctcctca 60
g 61
SEQ ID NO: 245; length: 494; type: DNA; organism: Homo sapiens
gcatcaccca
aaaaccacac ccctccttgg gagaatcccc tagatcacag ctcctcacca 60tggactggac
ctggagcatc cttttcttgg tggcagcagc aacaggtaac ggactcccca 120gtcccagggc
tgagagagaa accaggccag tcatgtgaga cttcacccac tcctgtgtcc 180tctccacagg
tgcccactcc caggttcagc tggtgcagtc tggagctgag gtgaagaagc 240ctggggcctc
agtgaaggtc tcctgcaagg cttctggtta cacctttacc agctatggta 300tcagctgggt
gcgacaggcc cctggacaag ggcttgagtg gatgggatgg atcagcgctt 360acaatggtaa
cacaaactat gcacagaagc tccagggcag agtcaccatg accacagaca 420catccacgag
cacagcctac atggagctga ggagcctgag atctgacgac acggccgtgt 480attactgtgc
gaga
494
SEQ ID NO: 246; length: 500; type: DNA; organism: Homo sapiens
gagagcatca cccagcaacc acatctgtcc tctagagaat
cccctgagag ctccgttcct 60caccatggac tggacctgga ggatcctctt cttggtggca
gcagccacag gtaagaggct 120ccctagtccc agtgatgaga aagagattga gtccagtcca
gggagatctc atccacttct 180gtgttctctc cacaggagcc cactcccagg tgcagctggt
gcagtctggg gctgaggtga 240agaagcctgg ggcctcagtg aaggtctcct gcaaggcttc
tggatacacc ttcaccggct 300actatatgca ctgggtgcga caggcccctg gacaagggct
tgagtggatg ggatggatca 360accctaacag tggtggcaca aactatgcac agaagtttca
gggcagggtc accatgacca 420gggacacgtc catcagcaca gcctacatgg agctgagcag
gctgagatct gacgacacgg 480ccgtgtatta ctgtgcgaga
500
SEQ ID NO: 247; length: 496; type: DNA; organism: Homo sapiens
accatcacac aacagccaca
tccctcccct acagaagccc ccagagcgca gcacctcacc 60atggactgca cctggaggat
cctcttcttg gtggcagcag ctacaggcaa gagaatcctg 120agttccaggg ctgatgaggg
gactgggtcc agttaagtgg tgtctcatcc actcctctgt 180cctctccaca ggcacccacg
cccaggtcca gctggtacag tctggggctg aggtgaagaa 240gcctggggcc tcagtgaagg
tctcctgcaa ggtttccgga tacaccctca ctgaattatc 300catgcactgg gtgcgacagg
ctcctggaaa agggcttgag tggatgggag gttttgatcc 360tgaagatggt gaaacaatct
acgcacagaa gttccagggc agagtcacca tgaccgagga 420cacatctaca gacacagcct
acatggagct gagcagcctg agatctgagg acacggccgt 480gtattactgt gcaaca
496
SEQ ID NO: 248; length: 478; type: DNA; organism: Homo sapiens
ccacatccct cctcagaagc ccccagagca caactcctca ccatggactg gacctggagg
60atcctctttt tggtggcagc agccacaggt aaggggctgc caaatcccag tgaggaggaa
120gggatcgaag ccagtcaagg gggcttccat ccactcctgt gtcttctcta caggtgtcca
180ctcccaggtt cagctggtgc agtctggggc tgaggtgaag aagcctgggg cctcagtgaa
240ggtttcctgc aaggcttctg gatacacctt cactagctat gctatgcatt gggtgcgcca
300ggcccccgga caaaggcttg agtggatggg atggagcaac gctggcaatg gtaacacaaa
360atattcacag gagttccagg gcagagtcac cattaccagg gacacatccg cgagcacagc
420ctacatggag ctgagcagcc tgagatctga ggacatggct gtgtattact gtgcgaga
478
SEQ ID NO: 249; length: 494; type: DNA; organism: Homo sapiens
atcacccaac aaccacatcc ctcctctaga gaatcccctg
aaagcacagc tcctcaccat 60ggactggacc tggagaatcc tcttcttggt ggcagcagcc
acaggtaagg ggctcccaag 120tcccagtgat gaggagggga ttgagtccag tcaaggtggc
ttttatccac tcctgtgtcc 180cctccacaga tgcctactcc cagatgcagc tggtgcagtc
tggggctgag gtgaagaaga 240ctgggtcctc agtgaaggtt tcctgcaagg cttccggata
caccttcacc taccgctacc 300tgcactgggt gcgacaggcc cccggacaag cgcttgagtg
gatgggatgg atcacacctt 360tcaatggtaa caccaactac gcacagaaat tccaggacag
agtcaccatt accagggaca 420ggtctatgag cacagcctac atggagctga gcagcctgag
atctgaggac acagccatgt 480attactgtgc aaga
494
SEQ ID NO: 250; length: 740; type: DNA; organism: Homo sapiens
atctgtgggg acttgttctt
cagtgaaagg atcctgtccg caaacagaaa tggagcagga 60catgcatttc ttcaagcagg
attagggctt ggaccatcag catcccactc ctgtgtggca 120gatgggacat ctatcttctt
tctcaacctc gatcaggctt tgaggtatga aataatctgt 180ctcatgaata tgcaaataac
cttagatcta ctgaggtaaa tatggataca tctgggccct 240gaaagcatca tccaacaacc
acatcccttc tctacagaag cctctgagag gaaagttctt 300caccatggac tggacctgga
gggtcttctg cttgctggct gtagctccag gtaaagggcc 360aactggttcc agggctgagg
aagggatttt ttccagttta gaggactgtc attctctact 420gtgtcctctc cgcaggtgct
cactcccagg tgcagctggt gcagtctggg gctgaggtga 480agaagcctgg ggcctcagtg
aaggtttcct gcaaggcatc tggatacacc ttcaccagct 540actatatgca ctgggtgcga
caggcccctg gacaagggct tgagtggatg ggaataatca 600accctagtgg tggtagcaca
agctacgcac agaagttcca gggcagagtc accatgacca 660gggacacgtc cacgagcaca
gtctacatgg agctgagcag cctgagatct gaggacacgg 720ccgtgtatta ctgtgcgaga
740
SEQ ID NO: 251; length: 497; type: DNA; organism: Homo sapiens
agcatcatcc agaaaccaca tccctccgct agagaagccc ctgacggcac agttcctcac
60tatggactgg atttggaggg tcctcttctt ggtgggagca gcgacaggca aggagatgcc
120aagtcccagt gatgaggagg ggattgagtc cagtcaaggt ggctttcatc cactcctgtg
180ttctctccac aggtgcccac tcccaaatgc agctggtgca gtctgggcct gaggtgaaga
240agcctgggac ctcagtgaag gtctcctgca aggcttctgg attcaccttt actagctctg
300ctatgcagtg ggtgcgacag gctcgtggac aacgccttga gtggatagga tggatcgtcg
360ttggcagtgg taacacaaac tacgcacaga agttccagga aagagtcacc attaccaggg
420acatgtccac aagcacagcc tacatggagc tgagcagcct gagatccgag gacacggccg
480tgtattactg tgcggca
497
SEQ ID NO: 252; length: 498; type: DNA; organism: Homo sapiens
agcatcacat aacaaccaca ttcctcctct gaagaagccc
ctgggagcac agctcatcac 60catggactgg acctggaggt tcctctttgt ggtggcagca
gctacaggta aggggcttcc 120tagtcctaag gctgaggaag ggatcctggt ttagttaaag
aggattttat tcacccctgt 180gtcctctcca caggtgtcca gtcccaggtg cagctggtgc
agtctggggc tgaggtgaag 240aagcctgggt cctcggtgaa ggtctcctgc aaggcttctg
gaggcacctt cagcagctat 300gctatcagct gggtgcgaca ggcccctgga caagggcttg
agtggatggg agggatcatc 360cctatctttg gtacagcaaa ctacgcacag aagttccagg
gcagagtcac gattaccgcg 420gacaaatcca cgagcacagc ctacatggag ctgagcagcc
tgagatctga ggacacggcc 480gtgtattact gtgcgaga
498
SEQ ID NO: 253; length: 499; type: DNA; organism: Homo sapiens
gagcatcact caacaaccac
atctgtcctc tagagaaaac cctgtgagca cagctcctca 60ccatggactg gacctggagg
atcctcttct tggtggcagc agctacaagt aaggggcttc 120ctagtctcaa agctgaggaa
cggatcctgg ttcagtcaaa gaggatttta ttctctcctg 180tgttctctcc acaggtgccc
actcccaggt gcagctggtg cagtctgggg ctgaggtgaa 240gaagcctggg gcctcagtga
aggtctcctg caaggcttct ggatacacct tcaccagtta 300tgatatcaac tgggtgcgac
aggccactgg acaagggctt gagtggatgg gatggatgaa 360ccctaacagt ggtaacacag
gctatgcaca gaagttccag ggcagagtca ccatgaccag 420gaacacctcc ataagcacag
cctacatgga gctgagcagc ctgagatctg aggacacggc 480cgtgtattac tgtgcgaga
499
SEQ ID NO: 254; length: 467; type: DNA; organism: Homo sapiens
gctcagtgac tcctgtgccc caccatggac acactttgct acacactcct gctgctgacc
60accccttcct gtgagtgctg tggtcaggga cttcctcaga agtgaaacat cagttgtctc
120ctttgtgggc ttcatcttct tatgtcttct ccacaggggt cttgtcccag gtcaccttga
180aggagtctgg tcctgtgctg gtgaaaccca cagagaccct cacgctgacc tgcaccgtct
240ctgggttctc actcagcaat gctagaatgg gtgtgagctg gatccgtcag cccccaggga
300aggccctgga gtggcttgca cacatttttt cgaatgacga aaaatcctac agcacatctc
360tgaagagcag gctcaccatc tccaaggaca cctccaaaag ccaggtggtc cttaccatga
420ccaacatgga ccctgtggac acagccacat attactgtgc acggata
467
SEQ ID NO: 255; length: 463; type: DNA; organism: Homo sapiens
agtgactcct gtgccccacc atggacacac tttgctccac
gctcctgctg ctgaccatcc 60cttcatgtga gtgctgtggt cagggactcc ttcacgggtg
aaacatcagt tttcttgttt 120gtgggcttca tcttcttatg ctttctccac aggggtcttg
tcccagatca ccttgaagga 180gtctggtcct acgctggtga aacccacaca gaccctcacg
ctgacctgca ccttctctgg 240gttctcactc agcactagtg gagtgggtgt gggctggatc
cgtcagcccc caggaaaggc 300cctggagtgg cttgcactca tttattggaa tgatgataag
cgctacagcc catctctgaa 360gagcaggctc accatcacca aggacacctc caaaaaccag
gtggtcctta caatgaccaa 420catggaccct gtggacacag ccacatatta ctgtgcacac
aga 463
SEQ ID NO: 256; length: 519; type: DNA; organism: Homo sapiens
atctccacca
gctccaccct cccctgggtt caaaagacga ggacagggcc tcgctcagtg 60aatcctgctc
tccaccatgg acatactttg ttccacgctc ctgctactga ctgtcccgtc 120ctgtgagtgc
tgtggtcagg tagtacttca gaagcaaaaa atctattctc tcctttgtgg 180gcttcatctt
cttatgtctt ctccacaggg gtcttatccc aggtcacctt gagggagtct 240ggtcctgcgc
tggtgaaacc cacacagacc ctcacactga cctgcacctt ctctgggttc 300tcactcagca
ctagtggaat gtgtgtgagc tggatccgtc agcccccagg gaaggccctg 360gagtggcttg
cactcattga ttgggatgat gataaatact acagcacatc tctgaagacc 420aggctcacca
tctccaagga cacctccaaa aaccaggtgg tccttacaat gaccaacatg 480gaccctgtgg
acacagccac gtattattgt gcacggata
519
SEQ ID NO: 257; length: 568; type: DNA; organism: Homo sapiens
ctccctctgc tgataaaaac cagccgagcc cagaccctgc
agctctggga gaagagcccc 60agccccagaa ttcccaggag tttccattcg gtgatcagca
ctgaacacag aggactcacc 120atggagtttg ggctgagctg ggttttcctt gttgctatta
taaaaggtga tttatggaga 180actagagaca ttgagtggac gtgagtgaga taagcagtga
atatatgtgg cagtttctga 240ctaggttgtc tctgtgtttg caggtgtcca gtgtcaggtg
cagctggtgg agtctggggg 300aggcttggtc aagcctggag ggtccctgag actctcctgt
gcagcctctg gattcacctt 360cagtgactac tacatgagct ggatccgcca ggctccaggg
aaggggctgg agtgggtttc 420atacattagt agtagtggta gtaccatata ctacgcagac
tctgtgaagg gccgattcac 480catctccagg gacaacgcca agaactcact gtatctgcaa
atgaacagcc tgagagccga 540ggacacggcc gtgtattact gtgcgaga
568
SEQ ID NO: 258; length: 531; type: DNA; organism: Homo sapiens
agctctggga gtggagcccc
agccttggga ttcccaagtg tttgtattca gtgatcagga 60ctgaacacac aggactcacc
atggagttgg ggctgagctg ggttttcctt gttgctatat 120tagaaggtga ttcatggaga
actagagata ttgagtgtga atgggcatga atgagagaaa 180cagtgggtat gtgtggcaat
ttctgacttt tgtgtctctg tgtttgcagg tgtccagtgt 240gaggtgcagc tggtggagtc
tgggggaggc ttggtacagc ctggggggtc cctgagactc 300tcctgtgcag cctctggatt
caccttcagt agctacgaca tgcactgggt ccgccaagct 360acaggaaaag gtctggagtg
ggtctcagct attggtactg ctggtgacac atactatcca 420ggctccgtga agggccgatt
caccatctcc agagaaaatg ccaagaactc cttgtatctt 480caaatgaaca gcctgagagc
cggggacacg gctgtgtatt actgtgcaag a 531
SEQ ID NO: 259; length: 540; type: DNA; organism: Homo sapiens
agctctggga gaggagcccc agccttggga ttcccaagtg ttttcattca gtgatcagga
60ctgaacacag aggactcacc atggagtttg ggctgagctg gattttcctt gctgctattt
120taaaaggtga tttatggaga actagagaga ttaagtgtga gtggacgtga gtgagagaaa
180cagtggatat gtgtggcagt ttctgatctt agtgtctctg tgtttgcagg tgtccagtgt
240gaggtgcagc tggtggagtc tgggggaggc ttggtaaagc ctggggggtc ccttagactc
300tcctgtgcag cctctggatt cactttcagt aacgcctgga tgagctgggt ccgccaggct
360ccagggaagg ggctggagtg ggttggccgt attaaaagca aaactgatgg tgggacaaca
420gactacgctg cacccgtgaa aggcagattc accatctcaa gagatgattc aaaaaacacg
480ctgtatctgc aaatgaacag cctgaaaacc gaggacacag ccgtgtatta ctgtaccaca
540
SEQ ID NO: 260; length: 526; type: DNA; organism: Homo sapiens
agccctggga gagaagcccc agccctggga ttctcaggtg
tttctattgg gtcaacagca 60ataaacaaat taccatggaa tttgggctga gctgggtttt
tcttgctggt attttaaaag 120gtgattcatg gagaactaag gatattgagt gagtggacat
gagtgagaga aacagtggat 180atgtgtggca gtttctgacc agggtgtctc tgtgtttgca
ggtgtccagt gtgaggtgca 240gctggtggag tctgggggag gcttggtaca gcctgggggg
tccctgagac tctcctgtgc 300agcctctgga ttcaccttca gtaacagtga catgaactgg
gcccgcaagg ctccaggaaa 360ggggctggag tgggtatcgg gtgttagttg gaatggcagt
aggacgcact atgtggactc 420cgtgaagcgc cgattcatca tctccagaga caattccagg
aactccctgt atctgcaaaa 480gaacagacgg agagccgagg acatggctgt gtattactgt
gtgaga 526
SEQ ID NO: 261; length: 515; type: DNA; organism: Homo sapiens
ccagccctga
gattcccacg tgtttccatt cagtgatcag cactgaacac agaggactcg 60ccatggagtt
tgggctgagc tgggttttcc ttgttgctat tttaaaaggt gattcatgga 120tcaatagaga
tgttgagtgt gagtgaacac gagtgagaga aacagtggat ttgtgtggca 180gtttctgacc
aggtgtctct gtgtttgcag gtgtccagtg tgaggtgcag ctggtggagt 240ctgggggagg
tgtggtacgg cctggggggt ccctgagact ctcctgtgca gcctctggat 300tcacctttga
tgattatggc atgagctggg tccgccaagc tccagggaag gggctggagt 360gggtctctgg
tattaattgg aatggtggta gcacaggtta tgcagactct gtgaagggcc 420gattcaccat
ctccagagac aacgccaaga actccctgta tctgcaaatg aacagtctga 480gagccgagga
cacggccttg tatcactgtg cgaga
515
SEQ ID NO: 262; length: 531; type: DNA; organism: Homo sapiens
agctctgaga gaggagcctt agccctggat tccaaggcct
atccacttgg tgatcagcac 60tgagcaccga ggattcacca tggaactggg gctccgctgg
gttttccttg ttgctatttt 120agaaggtgaa tcatggaaaa gtagagagat ttagtgtgtg
tggatatgag tgagagaaac 180ggtggatgtg tgtgacagtt tctgaccaat gtctctctgt
ttgcaggtgt ccagtgtgag 240gtgcagctgg tggagtctgg gggaggcctg gtcaagcctg
gggggtccct gagactctcc 300tgtgcagcct ctggattcac cttcagtagc tatagcatga
actgggtccg ccaggctcca 360gggaaggggc tggagtgggt ctcatccatt agtagtagta
gtagttacat atactacgca 420gactcagtga agggccgatt caccatctcc agagacaacg
ccaagaactc actgtatctg 480caaatgaaca gcctgagagc cgaggacacg gctgtgtatt
actgtgcgag a 531
SEQ ID NO: 263; length: 533; type: DNA; organism: Homo sapiens
agctctgaga
gaggagccca gccctgggat tttcaggtgt tttcatttgg tgatcaggac 60tgaacagaga
gaactcacca tggagtttgg gctgagctgg ctttttcttg tggctatttt 120aaaaggtaat
tcatggagaa atagaaaaat tgagtgtgaa tggataagag tgagagaaac 180agtggatacg
tgtggcagtt tctgaccagg gtttcttttt gtttgcaggt gtccagtgtg 240aggtgcagct
gttggagtct gggggaggct tggtacagcc tggggggtcc ctgagactct 300cctgtgcagc
ctctggattc acctttagca gctatgccat gagctgggtc cgccaggctc 360cagggaaggg
gctggagtgg gtctcagcta ttagtggtag tggtggtagc acatactacg 420cagactccgt
gaagggccgg ttcaccatct ccagagacaa ttccaagaac acgctgtatc 480tgcaaatgaa
cagcctgaga gccgaggaca cggccgtata ttactgtgcg aaa
533
SEQ ID NO: 264; length: 532; type: DNA; organism: Homo sapiens
cagctctggg agaggagccc agcactagaa gtcggcggtg
tttccattcg gtgatcagca 60ctgaacacag aggactcacc atggagtttg ggctgagctg
ggttttcctc gttgctcttt 120taagaggtga ttcatggaga aatagagaga ctgagtgtga
gtgaacatga gtgagaaaaa 180ctggatttgt gtggcatttt ctgataacgg tgtccttctg
tttgcaggtg tccagtgtca 240ggtgcagctg gtggagtctg ggggaggcgt ggtccagcct
gggaggtccc tgagactctc 300ctgtgcagcc tctggattca ccttcagtag ctatggcatg
cactgggtcc gccaggctcc 360aggcaagggg ctggagtggg tggcagttat atcatatgat
ggaagtaata aatactatgc 420agactccgtg aagggccgat tcaccatctc cagagacaat
tccaagaaca cgctgtatct 480gcaaatgaac agcctgagag ctgaggacac ggctgtgtat
tactgtgcga ga 532
SEQ ID NO: 265; length: 532; type: DNA; organism: Homo sapiens
cagctctggg
agaggagccc agcactagaa gtcggcggtg tttccattcg gtgatcagca 60ctgaacacag
aggactcacc atggagtttg ggctgagctg ggttttcctc gttgctcttt 120taagaggtga
ttcatggaga aatagagaga ctgagtgtga gtgaacatga gtgagaaaaa 180ctggatttgt
gtggcatttt ctgataacgg tgtccttctg tttgcaggtg tccagtgtca 240ggtgcagctg
gtggagtctg ggggaggcgt ggtccagcct gggaggtccc tgagactctc 300ctgtgcagcg
tctggattca ccttcagtag ctatggcatg cactgggtcc gccaggctcc 360aggcaagggg
ctggagtggg tggcagttat atggtatgat ggaagtaata aatactatgc 420agactccgtg
aagggccgat tcaccatctc cagagacaat tccaagaaca cgctgtatct 480gcaaatgaac
agcctgagag ccgaggacac ggctgtgtat tactgtgcga ga
532
SEQ ID NO: 266; length: 467; type: DNA; organism: Homo sapiens
aacaaacaaa ttaccatgga atttgggctg agctgggttt
ttcttgctgc tattttaaaa 60ggtgattcat gaagaactaa ggatattgag tgagtggaca
tgagtgagag aaacagtgga 120tttgtgtggc agtttctgac cagggtgtct ctgtgtttgc
aggtgtccag tgtgaggtgc 180agctggtgga gtctggggga ggcttggtac agcctggggg
atccctgaga ctctcctgtg 240cagcctctgg attcaccttc agtaacagtg acatgaactg
ggtccatcag gctccaggaa 300aggggctgga gtgggtatcg ggtgttagtt ggaatggcag
taggacgcac tatgcagact 360ctgtgaaggg ccgattcatc atctccagag acaattccag
gaacaccctg tatctgcaaa 420cgaatagcct gagggccgag gacacggctg tgtattactg
tgtgaga 467
SEQ ID NO: 267; length: 529; type: DNA; organism: Homo sapiens
ctctgggagt
ggagccccag ccttgggatt cccaggtgtt tcccttcagt gatcaggact 60gaacacacac
aactcatcat gcagtttgtg ctgagctggg ttttccttgt tggtatttta 120aaaggtgatt
catggagaac tacagatgtt gagtgtgagt ggacatgagt gagcaaaaca 180gtgggtttgt
gtggcagttt ctgaccttgg tgtctctgtg tttgcaggtg tccagtgtga 240ggtgcagctg
gtggagtctg ggggaggctt ggtacagcct agggggtccc tgagactctc 300ctgtgcagcc
tctggattca ccgtcagtag caatgagatg agctggatcc gccaggctcc 360agggaagggg
ctggagtggg tctcatccat tagtggtggt agcacatact acgcagactc 420caggaagggc
agattcacca tctccagaga caattccaag aacacgctgt atcttcaaat 480gaacaacctg
agagctgagg gcacggccgt gtattactgt gccagatat
529
SEQ ID NO: 268; length: 537; type: DNA; organism: Homo sapiens
agctctggga gaggagcccc agccctgaga ttcccaggtg
tttccattcg gtgatcagca 60ctgaacacag agaacgcacc atggagtttg gactgagctg
ggttttcctt gttgctattt 120taaaaggtga ttcatggata aatagagatg ttgagtgtga
gtgaacatga gtgagagaaa 180cagtggatat gtgtggcagt gtctgaccag ggtgtctctg
tgtttgcagg tgtccagtgt 240gaagtgcagc tggtggagtc tgggggagtc gtggtacagc
ctggggggtc cctgagactc 300tcctgtgcag cctctggatt cacctttgat gattatacca
tgcactgggt ccgtcaagct 360ccggggaagg gtctggagtg ggtctctctt attagttggg
atggtggtag cacatactat 420gcagactctg tgaagggccg attcaccatc tccagagaca
acagcaaaaa ctccctgtat 480ctgcaaatga acagtctgag aactgaggac accgccttgt
attactgtgc aaaagat 537
SEQ ID NO: 269; length: 533; type: DNA; organism: Homo sapiens
agctctcaga
gaggtgcctt agccctggat tccaaggcat ttccacttgg tgatcagcac 60tgaacacaga
ggactcacca tggagttggg gctgtgctgg gttttccttg ttgctatttt 120agaaggtgat
tcatggaaaa ctagagagat ttagtgtgtg tggatatgag tgagagaaac 180agtggatatg
tgtggcagtt tctgaccttg gtgtctcttt gtttgcaggt gtccagtgtg 240aggtgcagct
ggtggagtct gggggaggct tggtacagcc tggggggtcc ctgagactct 300cctgtgcagc
ctctggattc accttcagta gctatagcat gaactgggtc cgccaggctc 360cagggaaggg
gctggagtgg gtttcataca ttagtagtag tagtagtacc atatactacg 420cagactctgt
gaagggccga ttcaccatct ccagagacaa tgccaagaac tcactgtatc 480tgcaaatgaa
cagcctgaga gacgaggaca cggctgtgta ttactgtgcg aga
533
SEQ ID NO: 270; length: 540; type: DNA; organism: Homo sapiens
agctctggga gaggagcccc agccgtgaga ttcccaggag
tttccacttg gtgatcagca 60ctgaacacag accaccaacc atggagtttg ggcttagctg
ggttttcctt gttgctattt 120taaaaggtaa ttcatggtgt actagagata ctgagtgtga
ggggacatga gtggtagaaa 180cagtggatat gtgtggcagt ttctgacctt ggtgtttctg
tgtttgcagg tgtccaatgt 240gaggtgcagc tggtggagtc tgggggaggc ttggtacagc
cagggcggtc cctgagactc 300tcctgtacag cttctggatt cacctttggt gattatgcta
tgagctggtt ccgccaggct 360ccagggaagg ggctggagtg ggtaggtttc attagaagca
aagcttatgg tgggacaaca 420gaatacgccg cgtctgtgaa aggcagattc accatctcaa
gagatgattc caaaagcatc 480gcctatctgc aaatgaacag cctgaaaacc gaggacacag
ccgtgtatta ctgtactaga 540
SEQ ID NO: 271; length: 670; type: DNA; organism: Homo sapiens
aataccaatc
tcccccagga cacttcatct gcacggagcc cggcctctcc tcagatgtcc 60caccccagag
cttgctatat agtcggggac atccaaatag ggccctccct ctgctgatga 120aaaccagccc
agctgaccct gcagctctgg gagaggagcc cagcactggg attccgaggt 180gtttccattc
ggtgatcagc actgaacaca gaggactcac catggagttt tggctgagct 240gggttttcct
tgttgctatt ttaaaaggtg attcatggag aactagagat attgagtgtg 300agtgaacacg
agtgagagaa acagtggata tgtgtggcag tttctaacca atgtctctgt 360gtttgcaggt
gtccagtgtg aggtgcagct ggtggagtct ggaggaggct tgatccagcc 420tggggggtcc
ctgagactct cctgtgcagc ctctgggttc accgtcagta gcaactacat 480gagctgggtc
cgccaggctc cagggaaggg gctggagtgg gtctcagtta tttatagcgg 540tggtagcaca
tactacgcag actccgtgaa gggccgattc accatctcca gagacaattc 600caagaacacg
ctgtatcttc aaatgaacag cctgagagcc gaggacacgg ccgtgtatta 660ctgtgcgaga
670
SEQ ID NO: 272; length: 534; type: DNA; organism: Homo sapiens
agctctggga gaggagcccc cgccctggga ttcccaggtg ttttcatttg
gtgatcagca 60ctgaacacag aagagtcatg atggagtttg ggctgagctg ggttttcctt
gttgctattt 120ttaaaggtga ttcatgagga aatagagata ttgagtgtga gtggacatga
gtgagagaaa 180cagtggattt gtgtggcagt ttctgacctt ggtgtctctg tgtttgcagg
tgtccagtgt 240gaggtgcagc tggtggagtc tggggaaggc ttggtccagc ctggggggtc
cctgagactc 300tcctgtgcag cctctggatt caccttcagt agctatgcta tgcactgggt
ccgccaggct 360ccagggaagg gactggaata tgtttcagct attagtagta atgggggtag
cacatattat 420gcagactctg tgaagggcag attcaccatc tccagagaca attccaagaa
cacgctgtat 480cttcaaatgg gcagcctgag agctgaggac atggctgtgt attactgtgc
gaga 534
SEQ ID NO: 273; length: 528; type: DNA; organism: Homo sapiens
agctctggga gaggagccca
gcactgggat tccgaggtgt ttccattcag tgatctgcac 60tgaacacaga ggactcgcca
tggagtttgg gctgagctgg gttttccttg ttgctatttt 120aaaaggtgat tcatggagaa
ctagagatat tgagtgtgag tgaacacgag tgagagaaac 180agtggatatg tgtggcagtt
tctaaccaat gtctctgtgt ttgcaggtgt ccagtgtgag 240gtgcagctgg tggagtctgg
aggaggcttg atccagcctg gggggtccct gagactctcc 300tgtgcagcct ctgggttcac
cgtcagtagc aactacatga gctgggtccg ccaggctcca 360gggaaggggc tggagtgggt
ctcagttatt tatagctgtg gtagcacata ctacgcagac 420tccgtgaagg gccgattcac
catctccaga gacaattcca agaacacgct gtatcttcaa 480atgaacagcc tgagagctga
ggacacggct gtgtattact gtgcgaga 528
SEQ ID NO: 274; length: 533; type: DNA; organism: Homo sapiens
aggtctcaga gaggagcctt agccctggac tccaaggcct ttccacttgg tgatcagcac
60tgagcacaga ggactcacca tggaattggg gctgagctgg gttttccttg ttgctatttt
120agaaggtgat tcatggaaaa ctaggaagat tgagtgtgtg tggatatgag tgtgagaaac
180agtggatttg tgtggcagtt tctgaccttg gtgtctcttt gtttgcaggt gtccagtgtg
240aggtgcagct ggtggagtct gggggaggct tggtccagcc tggggggtcc ctgagactct
300cctgtgcagc ctctggattc acctttagta gctattggat gagctgggtc cgccaggctc
360cagggaaggg gctggagtgg gtggccaaca taaagcaaga tggaagtgag aaatactatg
420tggactctgt gaagggccga ttcaccatct ccagagacaa cgccaagaac tcactgtatc
480tgcaaatgaa cagcctgaga gccgaggaca cggctgtgta ttactgtgcg aga
533
SEQ ID NO: 275; length: 540; type: DNA; organism: Homo sapiens
agctctgaga gcggagcccc agccccagaa ttcccaggtg
ttttcatttg gtgatcagca 60ctgaacacag aggactcacc atggagtttg ggctgagctg
ggttttcctt gttgttattt 120tacaaggtga tttatggaga actagagatg ttaagtgtga
gtggacgtga gtgagagaaa 180cagtggattt gtgtgacagt ttctgaccag ggtgtctctg
tgtttgcagg tgtccagtgt 240gaggtgcagc tggtggagtc tgggggaggc ttggtccagc
ctggagggtc cctgagactc 300tcctgtgcag cctctggatt caccttcagt gaccactaca
tggactgggt ccgccaggct 360ccagggaagg ggctggagtg ggttggccgt actagaaaca
aagctaacag ttacaccaca 420gaatacgccg cgtctgtgaa aggcagattc accatctcaa
gagatgattc aaagaactca 480ctgtatctgc aaatgaacag cctgaaaacc gaggacacgg
ccgtgtatta ctgtgctaga 540
SEQ ID NO: 276; length: 540; type: DNA; organism: Homo sapiens
agctctggga
gaggagctcc agccttggga ttcccagctg tctccactcg gtgatcggca 60ctgaatacag
gagactcacc atggagtttg ggctgagctg ggttttcctt gttgctattt 120taaaaggtga
ttcatgggga actagagata ctgagtgtga gtggacatga gtgagagaaa 180cagtggacgt
gtgtggcact ttctgaccag ggtgtctctg tgtttgcagg tgtccagtgt 240gaggtgcagc
tggtggagtc cgggggaggc ttggtccagc ctggggggtc cctgaaactc 300tcctgtgcag
cctctgggtt caccttcagt ggctctgcta tgcactgggt ccgccaggct 360tccgggaaag
ggctggagtg ggttggccgt attagaagca aagctaacag ttacgcgaca 420gcatatgctg
cgtcggtgaa aggcaggttc accatctcca gagatgattc aaagaacacg 480gcgtatctgc
aaatgaacag cctgaaaacc gaggacacgg ccgtgtatta ctgtactaga
540
SEQ ID NO: 277; length: 690; type: DNA; organism: Homo sapiens
aatttctcaa atcccattgt tgtcacccat cttcctcagg
acactttcat ctgccctggg 60tcctgctctt tcttcaggtg tctcacccca gagcttgata
tatagtagga gacatgcaaa 120tagggccctc actctgctga agaaaaccag ccctgcagct
ctgggagagg agccccagcc 180ctgggattcc cagctgtttc tgcttgctga tcaggactgc
acacagagaa ctcaccatgg 240agtttgggct gagctgggtt ttccttgttg ctattttaaa
aggtgattca tggagaactg 300gagatatgga gtgtgaatgg acatgagtga gataagcagt
ggatgtgtgt ggcagtttct 360gaccagggtg tctctgtgtt tgcaggtgtc cagtgtgagg
tgcagctggt ggagtccggg 420ggaggcttag ttcagcctgg ggggtccctg agactctcct
gtgcagcctc tggattcacc 480ttcagtagct actggatgca ctgggtccgc caagctccag
ggaaggggct ggtgtgggtc 540tcacgtatta atagtgatgg gagtagcaca agctacgcgg
actccgtgaa gggccgattc 600accatctcca gagacaacgc caagaacacg ctgtatctgc
aaatgaacag tctgagagcc 660gaggacacgg ctgtgtatta ctgtgcaaga
690
SEQ ID NO: 278; length: 525; type: DNA; organism: Homo sapiens
agctctggga gaggagcccc
agccctgaga ttcccaggtg tttccattca gtgatcagca 60ctgaacacag aggactcacc
atggagttgg gactgagctg gattttcctt ttggctattt 120taaaaggtga ttcatggaga
aatagagaga ttgagtgtga gtggacatga gtggatttgt 180gtggcagttt ctgaccttgg
tgtctctgtg tttgcaggtg tccagtgtga agtgcagctg 240gtggagtctg ggggaggctt
ggtacagcct ggcaggtccc tgagactctc ctgtgcagcc 300tctggattca cctttgatga
ttatgccatg cactgggtcc ggcaagctcc agggaagggc 360ctggagtggg tctcaggtat
tagttggaat agtggtagca taggctatgc ggactctgtg 420aagggccgat tcaccatctc
cagagacaac gccaagaact ccctgtatct gcaaatgaac 480agtctgagag ctgaggacac
ggccttgtat tactgtgcaa aagat 525
SEQ ID NO: 279; length: 505; type: DNA; organism: Homo sapiens
atttccttaa attcagggtc ctgctcacat gggaaatact ttctgagagt cctggacctc
60ctgtgcaaga acatgaaaca cctgtggttc ttcctcctgc tggtggcagc tcccagatgt
120gagtgtctca aggctgcaga catggagata tgggaggtgc ctctgagccc agggctcact
180gtgggtctct ctgttcacag tggtcctgtc ccaggtgcag ctgcaggagt cgggcccagg
240actggtgaag ccttcggaca ccctgtccct cacctgcgct gtctctggtt actccatcag
300cagtagtaac tggtggggct ggatccggca gcccccaggg aagggactgg agtggattgg
360gtacatctat tatagtggga gcacctacta caacccgtcc ctcaagagtc gagtcaccat
420gtcagtagac acgtccaaga accagttctc cctgaagctg agctctgtga ccgccgtgga
480cacggccgtg tattactgtg cgaga
505
SEQ ID NO: 280; length: 508; type: DNA; organism: Homo sapiens
atttccttaa attcagggtc ctgctcacat gggaaatact
ttctgagagt cctggacctc 60ctgtgcaaga acatgaaaca cctgtggttc ttcctcctgc
tggtggcagc tcccagatgt 120gagtgtctca aggctgcaga catggagata tgggaggtgc
ctctgatccc agggctcact 180gtgtgtctct ctgttcacag gggtcctgcc ccaggtgcag
ctgcaggagt cgggcccagg 240actggtgaag ccttcacaga ccctgtccct cacctgtact
gtctctggtg gctccatcag 300cagtggtggt tactactgga gctggatccg ccagcaccca
gggaagggcc tggagtggat 360tgggtacatc tattacagtg ggagcaccta ctacaacccg
tccctcaaga gtcgagttac 420catatcagta gacacgtcta agaaccagtt ctccctgaag
ctgagctctg tgactgccgc 480ggacacggcc gtgtattact gtgcgaga
508
SEQ ID NO: 281; length: 483; type: DNA; organism: Homo sapiens
cagctcacat gggaagtgct
ttctgagagt catggacctc ctgcacaaga acatgaaaca 60cctgtggttc ttcctcctcc
tggtggcagc tcccagatgt gagtgtctca ggaatgcgga 120tatgaagata tgagatgctg
cctctgatcc cagggctcac tgtgggtttc tctgttcaca 180ggggtcctgt cccaggtgca
gctacagcag tggggcgcag gactgttgaa gccttcggag 240accctgtccc tcacctgcgc
tgtctatggt gggtccttca gtggttacta ctggagctgg 300atccgccagc ccccagggaa
ggggctggag tggattgggg aaatcaatca tagtggaagc 360accaactaca acccgtccct
caagagtcga gtcaccatat cagtagacac gtccaagaac 420cagttctccc tgaagctgag
ctctgtgacc gccgcggaca cggctgtgta ttactgtgcg 480aga
483
SEQ ID NO: 282; length: 508; type: DNA; organism: Homo sapiens
atttccttaa attcaggtcc aactcataag ggaaatgctt tctgagagtc atggatctca
60tgtgcaagaa aatgaagcac ctgtggttct tcctcctgct ggtggcggct cccagatgtg
120agtgtttcta ggatgcagac atggagatat gggaggctgc ctctgatccc agggctcact
180gtgggttttt ctgttcacag gggtcctgtc ccagctgcag ctgcaggagt cgggcccagg
240actggtgaag ccttcggaga ccctgtccct cacctgcact gtctctggtg gctccatcag
300cagtagtagt tactactggg gctggatccg ccagccccca gggaaggggc tggagtggat
360tgggagtatc tattatagtg ggagcaccta ctacaacccg tccctcaaga gtcgagtcac
420catatccgta gacacgtcca agaaccagtt ctccctgaag ctgagctctg tgaccgccgc
480agacacggct gtgtattact gtgcgaga
508
SEQ ID NO: 283; length: 494; type: DNA; organism: Homo sapiens
aaattcaggg tccagctcac atgggaaata ctttctgaga
ctcatggacc tcctgcacaa 60gaacatgaaa cacctgtggt tcttcctcct gctggtggca
gctcccagat gtgagtgtct 120caaggctgca gacatgggga tatgggaggt gcctctgatc
ccagggctca ctgtgggtct 180ctctgttcac aggggtcctg tcccaggtgc agctgcagga
gtcgggccca ggactggtga 240agccttcgga gaccctgtcc ctcacctgca ctgtctctgg
tggctccatc agtagttact 300actggagctg gatccggcag cccgccggga agggactgga
gtggattggg cgtatctata 360ccagtgggag caccaactac aacccctccc tcaagagtcg
agtcaccatg tcagtagaca 420cgtccaagaa ccagttctcc ctgaagctga gctctgtgac
cgccgcggac acggccgtgt 480attactgtgc gaga
494
SEQ ID NO: 284; length: 2025; type: DNA; organism: Homo sapiens
ttttcacctc
tccatacaaa ggcaccaccc acatgcaaat cctcacttaa gcacccacag 60gaaaccacca
cacatttcct taaattcagg ttccagctca catgggaaat actttctgag 120agtcctggac
ctcctgtgca agaacatgaa acatctgtgg ttcttccttc tcctggtggc 180agctcccaga
tgtgagtatc tcagggatcc agacatgggg atatgggagg tgcctctgat 240cccagggctc
actgtgggtc tctctgttca caggggtcct gtcccaggtg cagctgcagg 300agtcgggccc
aggactggtg aagccttcgg agaccctgtc cctcacctgc actgtctctg 360gtggctccat
cagtagttac tactggagct ggatccggca gcccccaggg aagggactgg 420agtggattgg
gtatatctat tacagtggga gcaccaacta caacccctcc ctcaagagtc 480gagtcaccat
atcagtagac acgtccaaga accagttctc cctgaagctg agctctgtga 540ccgctgcgga
cacggccgtg tattactgtg cgagagacac agtgagggga ggtgagtgtg 600agcccagaca
aaaacctccg tgcagggagg cggaggggac cggcgcaggt gctgctcagc 660gccagcaggg
ggcgcgcggg gcccacagag caggaggccc ggtcaggagc aggtgcaggg 720agggcggggc
ttcctcatct gctcagtggt ctccctcctc gccagcacct cagctgtccc 780caggggtcct
ctttctttat tatctgtggt tctgcttcct cacattcttg tgccaagaaa 840gaaatgagga
agacaaattt tcgtctgtag ttgaagtttc accaattact aggaactttc 900ctagaagttc
ctgcatggcc cattatagct tacagattaa atatatatca agcttctcat 960ctcttgattt
gtgtcatcaa ctgaattgtg ccctctttga aattcatatg cagaaacctt 1020aaattcaatt
gatgtatatt ggaattttaa tgaaataatt aaggttaaat gtggtcataa 1080gtgtaagact
ctaattcaac agacgtgtcg tctttataag aagaggaaga gacaccagag 1140acctctcact
tttcacgtgc aggcagagaa gaggccatgt ggagacgtaa tgcactagaa 1200ggtggcccag
tgcaagccag gaagaagcct caccaagaac caaccctgcc agaacattga 1260tcttcaacat
tcagactgca gaattttaag aaaatcaata tttgttgttt aagccaccca 1320ctcctgttgt
cttcttatga agatccagac agactaatac cacataactc tgttagcgct 1380gtcccctgga
tgcagaatca gcccgctggg gctgggcaca tctctcagat ttccacataa 1440agtaggcaaa
aaatagtagt tctgatataa aaatttgtca tgtccctgtt ggccaatttc 1500tgggcaaggt
cttttaaaga agccctgggg gctttgtcac aaaagttgcc ttttatcatt 1560tattaggaca
taactgatga acaatgagta ccagttggat ggagactgac cactgaccat 1620cttctgctgt
ctcctaagta tgccacagaa aaccacacca acattactct atgtcttcaa 1680ctttctaaat
ttgcactgat tggtatttaa ggcaggccca gcgttgaata actcctttag 1740tttttgcttc
tctgggaaag gtcttatcta tcctggcctt ggtcttcaag tttcagcaat 1800tctgggaagc
caaggacgcc tctatctcct cctccatgct ctgcaactca cctgagaaca 1860gctttctcat
tggaatgtct tctgtttaag gaataagagt ccctgtttca ggcttgggtg 1920cctgagtaca
cctactggat ccagcccagg attggagaaa ctttccagaa cacatcacct 1980gagaaatgac
cagtcacact gttacacttt cacaatttcc gcttc
2025
SEQ ID NO: 285; length: 537; type: DNA; organism: Homo sapiens
acttaagcac ccacaggaaa ccaccacaca tttccttaaa
ttcaggttcc agctcacatg 60ggaaatactt tctgagagtc ctggacctcc tgtgcaagaa
catgaaacac ctgtggttct 120tcctcctcct ggtggcagct cccagatgtg agtgtctcag
ggatccagac atgggggtat 180gggaggtgcc tctgatccca gggctcactg tgggtctctc
tgttcacagg ggtcctgtcc 240caggtgcagc tgcaggagtc gggcccagga ctggtgaagc
cttcggagac cctgtccctc 300acctgcactg tctctggtgg ctccgtcagc agtggtggtt
actactggag ctggatccgg 360cagcccccag ggaagggact ggagtggatt gggtatatct
attacagtgg gagcaccaac 420tacaacccct ccctcaagag tcgagtcacc atatcagtag
acacgtccaa gaaccagttc 480tccctgaagc tgagctctgt gaccgctgcg gacacggccg
tgtattactg tgcgaga 537286493DNAhomo sapiens; 286tgagtctccc
tcactgccca gctgggatct cagggcttca ttttctgtcc tccaccatca 60tggggtcaac
cgccatcctc gccctcctcc tggctgttct ccaaggtcag tcctgccgag 120ggcttgaggt
cacagaggag aacgggtgga aaggagcccc tgattcaaat tttgtgtctc 180ccccacagga
gtctgtgccg aggtgcagct ggtgcagtct ggagcagagg tgaaaaagcc 240cggggagtct
ctgaagatct cctgtaaggg ttctggatac agctttacca gctactggat 300cggctgggtg
cgccagatgc ccgggaaagg cctggagtgg atggggatca tctatcctgg 360tgactctgat
accagataca gcccgtcctt ccaaggccag gtcaccatct cagccgacaa 420gtccatcagc
accgcctacc tgcagtggag cagcctgaag gcctcggaca ccgccatgta 480ttactgtgcg
aga
493287498DNAhomo sapiens; 287gcagagcctg ctgaattctg gctgaccagg gcagtcacca
gagctccaga caatgtctgt 60ctccttcctc atcttcctgc ccgtgctggg cctcccatgg
ggtcagtgtc agggagatgc 120cgtattcaca gcagcattca cagactgagg ggtgtttcac
tttgctgttt ccttttgtct 180ccaggtgtcc tgtcacaggt acagctgcag cagtcaggtc
caggactggt gaagccctcg 240cagaccctct cactcacctg tgccatctcc ggggacagtg
tctctagcaa cagtgctgct 300tggaactgga tcaggcagtc cccatcgaga ggccttgagt
ggctgggaag gacatactac 360aggtccaagt ggtataatga ttatgcagta tctgtgaaaa
gtcgaataac catcaaccca 420gacacatcca agaaccagtt ctccctgcag ctgaactctg
tgactcccga ggacacggct 480gtgtattact gtgcaaga
498288489DNAhomo sapiens; 288acccaacaac aacatccctc
cttgggagaa tcccctagag cacagctcct caccatggac 60tggacctgga gcatcctctt
cttggtggca gcagcaacag gtaaggggct ccccagtctc 120ggggttgagg cagaaaccag
gccactcaag tgaggcttta cccacccctg tgtcctctcc 180acaggtacct actcccaggt
gcagctggtg cagtctggcc atgaggtgaa gcagcctggg 240gcctcagtga aggtctcctg
caaggcttct ggttacagtt tcaccaccta tggtatgaat 300tgggtgccac aggcccctgg
acaagggctt gagtggatgg gatggttcaa cacctacact 360gggaacccaa catatgccca
gggcttcaca ggacggtttg tcttctccat ggacacctct 420gccagcacag catacctgca
gatcagcagc ctaaaggctg aggacatggc catgtattac 480tgtgcgaga
489
289 38 DNA homo sapiens; 289
gtacactttt ggccagggga ccaagctgga gatcaaac 38
290 38 DNA homo sapiens; 290
attcactttc ggccctggga ccaaagtgga tatcaaac 38
291 37 DNA homo sapiens; 291
ctcactttcg gcggagggac caaggtggag atcaaac 37
292 38 DNA homo sapiens; 292
gatcaccttc ggccaaggga cacgactgga gattaaac
38293503DNAhomo sapiens; 293aggaatcaga cccagtcagg acacagcatg gacatgagag
tcctcgctca gctcctgggg 60ctcctgctgc tctgtttccc aggtaaggat ggagaacact
agcagtttac tcagcccagg 120gtgctcagta ctgctttact attcagggaa attctcttac
aacatgatta attgtgtgga 180catttgtttt tatgtttcca atctcaggtg ccagatgtga
catccagatg acccagtctc 240catcctcact gtctgcatct gtaggagaca gagtcaccat
cacttgtcgg gcgagtcagg 300gcattagcaa ttatttagcc tggtttcagc agaaaccagg
gaaagcccct aagtccctga 360tctatgctgc atccagtttg caaagtgggg tcccatcaaa
gttcagcggc agtggatctg 420ggacagattt cactctcacc atcagcagcc tgcagcctga
agattttgca acttattact 480gccaacagta taatagttac cct
503294503DNAhomo sapiens; 294aggaatcagt cccactcagg
acacagcatg gacatgaggg tccccgctca gctcctgggg 60ctcctgctgc tctggttccc
aggtaaggat ggagaacact agcagtttac tcagcccaga 120gtgctcagta ctgctttact
gttcagggaa attctcttac aacatgatta attgtgtgga 180catttgtttt tatgtttcca
atctcaggtg ccaggtgtga catccagatg acccagtctc 240catcctccct gtctgcatct
gtaggagaca gagtcaccat cacttgccgg gcaagtcagg 300gcattagaaa tgatttaggc
tggtatcagc agaaaccagg gaaagcccct aagcgcctga 360tctatgctgc atccagtttg
caaagtgggg tcccatcaag gttcagcggc agtggatctg 420ggacagaatt cactctcaca
atcagcagcc tgcagcctga agattttgca acttattact 480gtctacagca taatagttac
cct 503295657DNAhomo sapiens;
295gggacacctg gggacactga gctggtgctg agttactgag atgagccagc tctgcagctg
60tgcccagcct gccccatccc ctgctcattt gcatgttccc agagcacaac ctcctgccct
120gaagccttat taataggctg gtcacacttt gtgcaggagt cagacccagt caggacacag
180catggacatg agggtccccg ctcagctcct ggggctcctg ctgctctggc tcccaggtaa
240ggaaggagaa cactaggaat ttactcagcc cagtgtgctc agtactgcct ggttattcag
300ggaagtcttc ctataatatg atcaatagta tgaatatttg tgtttctatt tccaatctca
360ggtgccaaat gtgacatcca gatgacccag tctccttcca ccctgtctgc atctgtagga
420gacagagtca ccatcacttg ccgggccagt cagagtatta gtagctggtt ggcctggtat
480cagcagaaac cagggaaagc ccctaagctc ctgatctata aggcgtctag tttagaaagt
540ggggtcccat caaggttcag cggcagtgga tctgggacag aattcactct caccatcagc
600agcctgcagc ctgatgattt tgcaacttat tactgccaac agtataatag ttattct
657296506DNAhomo sapiens; 296gcaggagtca gacccactca ggacacagca tggacatgag
ggtccccgct cagctcctgg 60ggctcctgct gctctggctc ccaggtaagg atggagaaca
ctggcagttt actcagccca 120gggtgctcag cacagcctgg ctattcaggg aaattctctt
actacatgat taattgtgtg 180gaccatttgt ttttgtgttt ccaatctcag gtgccagatg
tgccatccag atgacccagt 240ctccatcctc cctgtctgca tctgtaggag acagagtcac
catcacttgc cgggcaagtc 300agggcattag aaatgattta ggctggtatc agcagaaacc
agggaaagcc cctaagctcc 360tgatctatgc tgcatccagt ttacaaagtg gggtcccatc
aaggttcagc ggcagtggat 420ctggcacaga tttcactctc accatcagca gcctgcagcc
tgaagatttt gcaacttatt 480actgtctaca agattacaat taccct
506297523DNAhomo sapiens; 297aggctggaca cacttcatgc
aggagtcaga ccctgtcagg acacagcata gacatgaggg 60tccccgctca gctcctgggg
ctcctgctgc tctggctccc aggtaaggaa ggagaacact 120aggaatttac tcagcccagt
gtgcttggta cagcctggcc cttcagggaa gttctcttac 180aacatgatta attgtatgga
catttgtttt tatgtttcca atctcaggtg ccagatgtgc 240catccggatg acccagtctc
catcctcatt ctctgcatct acaggagaca gagtcaccat 300cacttgtcgg gcgagtcagg
gtattagcag ttatttagcc tggtatcagc aaaaaccagg 360gaaagcccct aagctcctga
tctatgctgc atccactttg caaagtgggg tcccatcaag 420gttcagcggc agtggatctg
ggacagattt cactctcacc atcagctgcc tgcagtctga 480agattttgca acttattact
gtcaacagta ttatagttac cct 523298534DNAhomo sapiens;
298agacttctta ataggctggt cacacctgtg caggagtcag tcccagtcag gacacagcat
60ggacatgagg gtccccgctc agctcctggg gctcctgctg ctctggctcc caggtaagga
120aggagaacac taggaattta ctcagcccag tgtgttccgt acagcctggc tcttgaggga
180agttctctta caacatgatt aattctatgg acatttgtgt ttatatttcc aatctcaggt
240gccagatgtg acatccagtt gacccagtct ccatccttcc tgtctgcatc tgtaggagac
300agagtcacca tcacttgccg ggccagtcag ggcattagca gttatttagc ctggtatcag
360caaaaaccag ggaaagcccc taagctcctg atctatgctg catccacttt gcaaagtggg
420gtcccatcaa ggttcagcgg cagtggatct gggacagaat tcactctcac aatcagcagc
480ctgcagcctg aagattttgc aacttattac tgtcaacagc ttaatagtta ccct
534299656DNAhomo sapiens; 299gggacacctg gggacactga gctggtgctg agttactgag
atgagccagc cctgcagctg 60cgcccagcct gccccatccc ctgctcattt gcatgttccc
agagcacagt ctcctgacct 120gaagacttat taacaggctg atcacaccct gtgcaggagt
cagacccagt caggacacag 180catggacatg agggtccccg ctcagctcct ggggctcctg
ctgctctggt tcccaggtaa 240gaaaggagaa cactaggatt atactcggtc agtgtgctga
gtactgcttt actattcagg 300gaacttctct tacagcatga ttaattgtgt ggacatttgt
ttttatgttt ccaatctcag 360gttccagatg cgacatccag atgacccagt ctccatcttc
tgtgtctgca tctgtaggag 420acagagtcac catcacttgt cgggcgagtc agggtattag
cagctggtta gcctggtatc 480agcagaaacc agggaaagcc cctaagctcc tgatctatgc
tgcatccagt ttgcaaagtg 540gggtcccatc aaggttcagc ggcagtggat ctgggacaga
tttcactctc actatcagca 600gcctgcagcc tgaagatttt gcaacttact attgtcaaca
ggctaacagt ttccct 656300503DNAhomo sapiens; 300aggaatcaga
cccagtcagg acacagcatg gacatgaggg tcctcgctca gctcctgggg 60ctcctgctgc
tctgtttccc aggtaaggat ggagaacact agcagtttac tcagcccagg 120gtgctcagta
ctgctttact attcagggaa attctcttac aacatgatta attgtgtgga 180catttgtttt
tatgtttcca atctcaggtg ccagatgtga catccagatg acccagtctc 240catcctcact
gtctgcatct gtaggagaca gagtcaccat cacttgtcgg gcgagtcagg 300gtattagcag
ctggttagcc tggtatcagc agaaaccaga gaaagcccct aagtccctga 360tctatgctgc
atccagtttg caaagtgggg tcccatcaag gttcagcggc agtggatctg 420ggacagattt
cactctcacc atcagcagcc tgcagcctga agattttgca acttattact 480gccaacagta
taatagttac cct
503301657DNAhomo sapiens; 301gggacacctg gggacactga gctgctgctg agttactgag
atgagccagc cctgcagctg 60cgcccagcct gccccatccc ctgctcattt gcatgttccc
agagcatagc ctcctgccct 120gaagccttat taataggctg gacacacttc atggaggaat
cagtcccact caggacacag 180catggacatg agggtccctg ctcagctcct ggggctcctg
ctgctctggt tcccaggtaa 240ggatggagaa cactaacagt ttactcagcc cagagtgctc
agtactgctt tactgttcag 300ggaaattctc ttacaacatg attaattgtg tggacatttg
tttttatgtt tccaatctca 360ggtgccagat gtaacatcca gatgacccag tctccatctg
ccatgtctgc atctgtagga 420gacagagtca ccatcacttg tcgggcgagg cagggcatta
gcaattattt agcctggttt 480cagcagaaac cagggaaagt ccctaagcac ctgatctatg
ctgcatccag tttgcaaagt 540ggggtcccat caaggttcag cggcagtgga tctgggacag
aattcactct cacaatcagc 600agcctgcagc ctgaagattt tgcaacttat tactgtctac
agcataatag ttaccct 657302487DNAhomo sapiens; 302agtcccagtc
aggacacagc atggacatga gggtccccgc tcagctcctg gggctcctgc 60tgctctggct
cccaggtaag gaaggagaac actaggaatt ttcttagccc actgtgctct 120ggcacttctg
ggaagttctc ttataccatg attcatggtg tggatatttg tttttatgtt 180tccaatctca
ggtgtcagat ttgacatcca gatgatccag tctccatctt tcctgtctgc 240atctgtagga
gacagagtca gtatcatttg ctgggcaagt gagggcatta gcagtaattt 300agcctggtat
ctgcagaaac cagggaaatc ccctaagctc ttcctctatg atgcaaaaga 360tttgcaccct
ggggtctcat cgaggttcag tggcagggga tctgggacgg atttcactct 420caccatcatc
agcctgaagc ctgaagattt tgcagcttat tactgtaaac aggacttcag 480ttaccct
487
303 657 DNA homo sapiens; 303
gggacacctg gggacactga gctggtgctg agttactgag atgaaccagc
cctgcagctg 60tgcccagcct gccttgcccc ctgctaattt gcatgttccc agagcacatc
ctcctaccct 120gaagacttat taatgcgctg gtcacacttc atgcaggagt cagacccagt
caggacacag 180catggacatg agggtgcccg ctcagcgcct ggggctcctg ctgctctggt
tcccaggtaa 240ggaaggagaa ccctagcagt ttactcagcc cagtgtgttc cgtacagcct
ggctcttgag 300ggaagttctc ttacaacatg attaattgta tggacatttg tgtttatatt
tccaatctca 360ggtgccagat gtgccatccg gatgacccag tctccattct ccctgtctgc
atctgtagga 420gacagagtca ccatcacttg ctgggccagt cagggcatta gcagttattt
agcctggtat 480cagcaaaaac cagcaaaagc ccctaagctc ttcatctatt atgcatccag
tttgcaaagt 540ggggtcccat caaggttcag cggcagtgga tctgggacgg attacactct
caccatcagc 600agcctgcagc ctgaagattt tgcaacttat tactgtcaac agtattatag
tacccct 657304656DNAhomo sapiens; 304gggacacctg gggacactga
gctggtgctg agttactgag atgagccagc tctgcagctg 60tgcccagtca gccccatccc
ctgctcattt gcatgttccc agagcacaac ctcctgcact 120gaagccttat taataggctg
gccacacttc atgcaggagt cagacccagt caggacacag 180catggacatg agggtccccg
ctcagctcct ggggctcctg ctgctctggc tcccaggtaa 240ggaaggagaa cactatgaat
ttactcagcc aatgtgctca gtacagcctg gcccttcagg 300gaaattctct tactacatga
ttaattgtat ggatatttgt ttttatgttt ccaatctcag 360gtgccagatg tgtcatctgg
atgacccagt ctccatcctt actctctgca tctacaggag 420acagagtcac catcagttgt
cggatgagtc agggcattag cagttattta gcctggtatc 480agcaaaaacc agggaaagcc
cctgagctcc tgatctatgc tgcatccact ttgcaaagtg 540gggtcccatc aaggttcagt
ggcagtggat ctgggacaga tttcactctc accatcagtt 600gcctgcagtc tgaagatttt
gcaacttatt actgtcaaca gtattatagt ttccct 656305833DNAhomo sapiens;
305aattaggact cctcaggtca ccttctcaca atgaggctcc ttgctcagct tctggggctg
60ctaatgctct gggtccctgg tgaggacaga agagagatga gggaggagaa tggggtggga
120gggtgaactc tgggggcccc attgcctccc atgtgtgttc tgtcctcatg ttagatgtgt
180acgtcttgta ctccaggatg gggcttgtaa cttttatatc tgcgtgagta aggcatgtga
240ggtttagatc tgtaagaatg aggaagattc cagaaggaac aaagaccagt gctccggtga
300agactctaac agagaaagag ggaatggtag aggaaacttc tagcactcaa agcactctgc
360tgtgctttga aaatatgttt ttattttgaa attatatatt actagggtct gaatcaaatt
420ataaaaattg atttagcctg aaataaataa cagaagaaaa attattttaa aattgtgctt
480aaagtttcta cataaccttg cacttctctc tcattatttc aggatccagt ggggatattg
540tgatgaccca gactccactc tcctcacctg tcacccttgg acagccggcc tccatctcct
600gcaggtctag tcaaagcctc gtacacagtg atggaaacac ctacttgagt tggcttcagc
660agaggccagg ccagcctcca agactcctaa tttataagat ttctaaccgg ttctctgggg
720tcccagacag attcagtggc agtggggcag ggacagattt cacactgaaa atcagcaggg
780tggaagctga ggatgtcggg gtttattact gcatgcaagc tacacaattt cct
833306816DNAhomo sapiens; 306gatcaggact cctcagttca ccttctcaca atgaggctcc
ctgctcagct cctggggctg 60ctaatgctct gggtcccagg taagggtaga agggagatga
gggaggagaa tggcatggaa 120cggtgagttc tggggcccca ctgcctctaa caacagtgat
ctctgggggt ctcactacac 180tcctatgtgt gttcctttcc tgtattggac atgcacatgt
tgtcctccag agtggggcat 240gtgatgatca gatctgtgag agtgaggaag attcaagcag
aaacaaggat ctgtgctctg 300gggaagactg acacagaaag gggatggtgt ggggtcttct
ggagacccct ttgagccttg 360gatcccttga gttccatttt gaaactgtgt atttttgaaa
tatgaacaaa tacatatata 420gcctgaaata aacaacaaat caaaatttat gaaaattaca
cataaacttt atacataacc 480ttgctcttct ttctatttat ttcaggatcc agtggggatg
ttgtgatgac tcagtctcca 540ctctccctgc ccgtcaccct tggacagccg gcctccatct
cctgcaggtc tagtcaaagc 600ctcgtataca gtgatggaaa cacctacttg aattggtttc
agcagaggcc aggccaatct 660ccaaggcgcc taatttataa ggtttctaac cgggactctg
gggtcccaga cagattcagc 720ggcagtgggt caggcactga tttcacactg aaaatcagca
gggtggaggc tgaggatgtt 780ggggtttatt actgcatgca aggtacacac tggcct
816307833DNAhomo sapiens; 307aattaggact cctcaggtca
ccttctcaca atgaggctcc ttgctcagct tctggggctg 60ctaatgctct gggtccctgg
tgaggacaga agagagatga gggaggagaa tggggtggga 120gggtgaactc tgggggcccc
attgcctccc atgtgtgttc tgtcctcatg ttagatgtgt 180acgtcttgta ctccaggatg
gggcttgtaa cttttatatc tgcgtgagta aggcatgtga 240ggtttagatc tgtaagaatg
aggaagattc cagaaggaac aaagaccagt gctccggtga 300agactctaac agagaaagag
ggaatggtag aggaaacttc tagcactcaa agcactctgc 360tgtgctttga aaatatgttt
ttattttgaa attatatatt actagggtct gaatcaaatt 420ataaaaattg atttagcctg
aaataaataa cagaagaaaa attattttaa aattgtgctt 480aaagtttcta cataaccttg
cacttctctc tcattatttc aggatccagt ggggatattg 540tgatgaccca gactccactc
tcctcgcctg tcacccttgg acagccggcc tccatctcct 600tcaggtctag tcaaagcctc
gtacacagtg atggaaacac ctacttgagt tggcttcagc 660agaggccagg ccagcctcca
agactcctaa tttataaggt ttctaaccgg ttctctgggg 720tcccagacag attcagtggc
agtggggcag ggacagattt cacactgaaa atcagcaggg 780tggaagctga ggatgtcggg
gtttattact gcacgcaagc tacacaattt cct 833308781DNAhomo sapiens;
308gatcaggact cctcagttca ccttctcact atgaggctcc ctgctcagct cttggggctg
60ctaatgctct gggtccctgg taaggacaga aggagatgag ggaggagaat ggggtgggaa
120ggtaagcctg gggaccccac tgccttccat gtgtgttctg ccctgcccat gtgttagatg
180tacaggtctt gttctccagg atggggaatg tgaggtttaa atctgtgaga gtgaggacga
240ttcaaaaaga agcaaggacc tgtgtgctct ggtgaatatc gtcacacaga gaaagggagg
300tggtgtaggt gacttctaga atcccctttg cagcttgcaa atttggaata tgtttagtgt
360ataaatacaa acaacaaaaa attatatagc ctgaaataaa aaatgaaaat ttatgataaa
420tgacacatga tatttgtaca tatccttcca cttctttcta tctattttag gatccagtgc
480agagattgtg atgacccaga ctccactctc cttgtctatc acccctggag agcaggcctc
540catgtcctgc aggtctagtc agagcctcct gcatagtgat ggatacacct atttgtattg
600gtttctgcag aaagccaggc cagtctccac gctcctgatc tatgaagttt ccaaccggtt
660ctctggagtg ccagataggt tcagtggcag cgggtcaggg acagatttca cactgaaaat
720cagccgggtg gaggctgagg attttggagt ttattactgc atgcaagatg cacaagatcc
780t
781309758DNAhomo sapiens; 309gatcaggact tctcagttca tcttctcacc atgaggctcc
ctgctcagct cctggggctg 60ctaatgctct ggatacctgg taaggatgga aggagatgag
ggaggaggag ggggtgggaa 120gctgagctct ggcggcccca ctgattcccg tgtttattct
aaccatgtgt taaaggaata 180tggcctatgc tccagggaga ggaattcata ttttgccctg
atgatgattt gaaaactcct 240aaaagcagtg ctctgaataa tatcttgaga aatgaaagaa
ctcttgtgcc tatttaataa 300agggttcatt taaagagttt gtttttatga tatgaataca
aatttgtaaa aataaaagat 360tagccataaa tcaataccat aaggcaaatc tcaaaagttg
ttcattatgc tttcacataa 420ccttgcactt ctctctcata atttcaggat ccagtgcaga
tattgtgatg acccagactc 480cactctctct gtccgtcacc cctggacagc cggcctccat
ctcctgcaag tctagtcaga 540gcctcctgca tagtgatgga aagacctatt tgtattggta
cctgcagaag ccaggccagc 600ctccacagct cctgatctat gaagtttcca accggttctc
tggagtgcca gataggttca 660gtggcagcgg gtcagggaca gatttcacac tgaaaatcag
ccgggtggag gctgaggatg 720ttggggttta ttactgcatg caaagtatac agcttcct
758310821DNAhomo sapiens; 310tgactgatca ggactcctca
gttcaccttc tcacaatgag gctccctgct cagctcctgg 60ggctgctaat gctctgggtc
ccaggtaagg gtagaaggga gatgagggag gagaatggca 120tggaacggtg agttctgggg
ccccactgcc tctaacaaca gtgatctctg ggggtctcac 180tacactccta tgtgtgttcc
tttcctgtat tggacatgca catgttgtcc tccagaatgg 240ggcatgtgat gatcagatct
gtgagagtca ggaagattca agaagaaaca aggatctgtg 300ctctggggaa gactgacaca
gaaaggggat ggtgtggggt cttctggaga cccctttgag 360ccttggatcc cttgagttcc
attttgaaac tgtatatttt tgaaatatga acaaatacat 420atatagcctg agataaacaa
caaatcaaaa tttatgaaaa ttacacataa actttataca 480taaccttgct cttctttcta
tttatttcag gatccagtgg ggatgttgtg atgactcagt 540ctccactctc cctgcccgtc
acccttggac agccggcctc catctcctgc aggtctagtc 600aaagcctcgt atacagtgat
ggaaacacct acttgaattg gtttcagcag aggccaggcc 660aatctccaag gcgcctaatt
tataaggttt ctaactggga ctctggggtc ccagacagat 720tcagcggcag tgggtcaggc
actgatttca cactgaaaat cagcagggtg gaggctgagg 780atgttggggt ttattactgc
atgcaaggta cacactggcc t 821311561DNAhomo sapiens;
311gtcagagccc tggggaggaa ctgctcagtt aggacccaga gggaaccatg gaagccccag
60ctcagcttct cttcctcctg ctactctggc tcccaggtga ggggaacatg aggtggtttt
120gcacattagt gaaaactctt gccacctctg ctcagcaaga aatataatta aaattcaaag
180tatatcaaca attttggctc tactcaaaga cagttggttt gatcttgatt acatgagtgc
240atttctgttt tatttccaat ttcagatacc accggagaaa ttgtgttgac acagtctcca
300gccaccctgt ctttgtctcc aggggaaaga gccaccctct cctgcagggc cagtcagagt
360gttagcagct acttagcctg gtaccaacag aaacctggcc aggctcccag gctcctcatc
420tatgatgcat ccaacagggc cactggcatc ccagccaggt tcagtggcag tgggtctggg
480acagacttca ctctcaccat cagcagccta gagcctgaag attttgcagt ttattactgt
540cagcagcgta gcaactggcc t
561312587DNAhomo sapiens; 312cctgggtcag agctctggag aagagctgct cagttaggac
ccagagggaa ccatggaaac 60cccagcgcag cttctcttcc tcctgctact ctggctccca
ggtgagggga acatgggatg 120gttttgcatg tcagtgaaaa ccctctcaag tcctgttacc
tggcaactct gctcagtcaa 180tacaataatt aaagctcaat ataaagcaat aattctggct
cttctgggaa gacaatgggt 240ttgatttaga ttacatgggt gacttttctg ttttatttcc
aatctcagat accaccggag 300aaattgtgtt gacgcagtct ccaggcaccc tgtctttgtc
tccaggggaa agagccaccc 360tctcctgcag ggccagtcag agtgttagca gcagctactt
agcctggtac cagcagaaac 420ctggccaggc tcccaggctc ctcatctatg gtgcatccag
cagggccact ggcatcccag 480acaggttcag tggcagtggg tctgggacag acttcactct
caccatcagc agactggagc 540ctgaagattt tgcagtgtat tactgtcagc agtatggtag
ctcacct 587313614DNAhomo sapiens; 313gcatgtccct
cccagctgcc ctaccttcca gagcccatat caatgcctgg gtcagagccc 60tgggaaggaa
ctgctcagtt aggacccaga cggaaccatg gaagccccag ctcagcttct 120cttcctcctg
ctactctggc tcccaggtga ggggaacatg aggtggtttt gcacatcagt 180gaaaactcct
gccacctctg ctcagcaaga aatataatta aaattcaatg tagatcaaca 240attttggctc
tactcaaaga cagctggttt gatctagatt acatgagtgc atttctgttt 300tatttccaat
cttggatacc accagagaaa ttgtaatgac acagtctcca cccaccctgt 360ctttgtctcc
aggggaaaga gtcaccctct cctgcagggc cagtcagagt gttagcagca 420gctacttaac
ctggtatcag cagaaacctg gccaggcgcc caggctcctc atctatggtg 480catccaccag
ggccactagc atcccagcca ggttcagtgg cagtgggtct gggacagact 540tcactctcac
catcagcagc ctgcagcctg aagattttgc agtttattac tgtcagcagg 600attataactt
acct
614314611DNAhomo sapiens; 314gcatgtccct cccagctgcc ctaccttcca gagcccatat
caatgcctgg gtcagagctc 60tggggaggaa ctgctcagtt aggacccaga cggaaccatg
gaagccccag cgcagcttct 120cttcctcctg ctactctggc tcccaggtga ggggaatatg
aggtgtcttt gcacatcagt 180gaaaactcct gccacctctg ctcagcaaga aatataatta
aaattcaaaa tagatcaaca 240attttggctc tactcaaaga cagtgggttt gattttgatt
acatgagtgc atttctgttt 300tatttccaat ttcagatacc accggagaaa ttgtgttgac
acagtctcca gccaccctgt 360ctttgtctcc aggggaaaga gccaccctct cctgcagggc
cagtcagggt gttagcagct 420acttagcctg gtaccagcag aaacctggcc aggctcccag
gctcctcatc tatgatgcat 480ccaacagggc cactggcatc ccagccaggt tcagtggcag
tgggcctggg acagacttca 540ctctcaccat cagcagccta gagcctgaag attttgcagt
ttattactgt cagcagcgta 600gcaactggca t
611315563DNAhomo sapiens; 315gggtcagagc tctggggagg
aactgctcag ttaggaccca gacggaacca tggaagcccc 60agcgcagctt ctcttcctcc
tgctactctg gctcccaggt gaggggaata tgaggtggtt 120ttgcacatca gtgaaaactc
ctgccacctc tgctcagcaa gaaatataat taaaattcaa 180tgtagatcaa caattttggc
tctacttaaa gacagtgggt ttgattttga ttacatgagt 240gcatttctgt tttatttcca
atttcagata ccactggaga aatagtgatg acgcagtctc 300cagccaccct gtctgtgtct
ccaggggaaa gagccaccct ctcctgcagg gccagtcaga 360gtgttagcag caacttagcc
tggtaccagc agaaacctgg ccaggctccc aggctcctca 420tctatggtgc atccaccagg
gccactggca tcccagccag gttcagtggc agtgggtctg 480ggacagagtt cactctcacc
atcagcagcc tgcagtctga agattttgca gtttattact 540gtcagcagta taataactgg
cct 563316632DNAhomo sapiens;
316gcatgtccct cccagccgcc ctgcagtcca gagcccatat caatgcctgg gtcagagctc
60tggggaggaa ctgctcagtt aggacccaga gggaaccatg gaaaccccag cgcagcttct
120cttcctcctg ctactctggc tcccaggtga ggggaacatg ggatggtttt gcatgtcagt
180gaaaaccctc tcaagtcctg ttacctggca actctgctga atcaatacaa taattaaagc
240tcaatataaa gcaataattc tggctcttct gggaagacag tgggtttgat ttagattaca
300tgggtgactt ttctatttta tttccaatct cagataccac cggagaaatt gtgttgacgc
360agtctccagc caccctgtct ttgtctccag gggaaagagc caccctctcc tgcggggcca
420gtcagagtgt tagcagcagc tacttagcct ggtaccagca gaaacctggc ctggcgccca
480ggctcctcat ctatgatgca tccagcaggg ccactggcat cccagacagg ttcagtggca
540gtgggtctgg gacagacttc actctcacca tcagcagact ggagcctgaa gattttgcag
600tgtattactg tcagcagtat ggtagctcac ct
632317757DNAhomo sapiens; 317tttggctctt gatttacatt gggtactttc acaacccact
gctcatgaaa tttgcttttg 60tactcactgg ttgtttttgc ataggcccct ccaggccacg
accagctgtt tggattttat 120aaacgggccg tttgcattgt gaactgagct acaacaggca
ggcaggggca gcaagatggt 180gttgcagacc caggtcttca tttctctgtt gctctggatc
tctggtgagg aattaaaaag 240tgccacagtc ttttcagagt aatatctgtg tagaaataaa
aaaaattaag atatagttgg 300aaataatgac tatttccaat atggatccaa ttatctgctg
acttataata ctactagaaa 360gcaaatttaa atgacatatt tcaattatat ctgagacagc
gtgtataagt ttatgtataa 420tcattgtcca ttactgacta caggtgccta cggggacatc
gtgatgaccc agtctccaga 480ctccctggct gtgtctctgg gcgagagggc caccatcaac
tgcaagtcca gccagagtgt 540tttatacagc tccaacaata agaactactt agcttggtac
cagcagaaac caggacagcc 600tcctaagctg ctcatttact gggcatctac ccgggaatcc
ggggtccctg accgattcag 660tggcagcggg tctgggacag atttcactct caccatcagc
agcctgcagg ctgaagatgt 720ggcagtttat tactgtcagc aatattatag tactcct
757318553DNAhomo sapiens; 318ataaaatctg tgctgtcaaa
ctgattagga actgactacc acctgcaggt cagggccaag 60gttatggggt cccaggttca
cctcctcagc ttcctcctcc tttggatctc tggtaagaga 120aacacttcct ctcctctgtg
ccaccaagtc ccctgcatat ccacaaaaat aatatatttt 180cataaggaat tgattttcct
cattctctgc aaatatgatg catttgattt atgtttttta 240ctttgctcca taatcagata
ccagggcaga aacgacactc acgcagtctc cagcattcat 300gtcagcgact ccaggagaca
aagtcaacat ctcctgcaaa gccagccaag acattgatga 360tgatatgaac tggtaccaac
agaaaccagg agaagctgct attttcatta ttcaagaagc 420tactactctc gttcctggaa
tcccacctcg attcagtggc agcgggtatg gaacagattt 480taccctcaca attaataaca
tagaatctga ggatgctgca tattacttct gtctacaaca 540tgataatttc cct
553319616DNAhomo sapiens;
319atcttaaaag aggttctttc tctgggatgt ggcatgagca aaactgacaa gtcaaggcag
60gaagatgttg ccatcacaac tcattgggtt tctgctgctc tgggttccag gtgagaatat
120ttccacaaac ctaggcggag atattctttc aatctgtaat ttctttcatt ggggactctg
180caataggtga tttttggctt gattttaaaa tcctaatttt aaaaatgtaa tgcatattct
240ttcttcatgt ctagcaagat taaaggtgat tttcatacac agatatttat gttgtactga
300tgtttgctgt atattttcag cctccagggg tgaaattgtg ctgactcagt ctccagactt
360tcagtctgtg actccaaagg agaaagtcac catcacctgc cgggccagtc agagcattgg
420tagtagctta cactggtacc agcagaaacc agatcagtct ccaaagctcc tcatcaagta
480tgcttcccag tccttctcag gggtcccctc gaggttcagt ggcagtggat ctgggacaga
540tttcaccctc accatcaata gcctggaagc tgaagatgct gcaacgtatt actgtcatca
600gagtagtagt ttacct
616320619DNAhomo sapiens; 320ggtatcttaa aagaggttct ttctctggga tgtggcatga
gcaaaactga caagtcaagg 60caggaagatg tcgccatcac aactcattgg gtttctgctg
ctctgggttc caggtgagaa 120tatttccaca aacctaggcg gagatattct ttcaatctgt
aatttctttc attggggact 180ctgcaatagg tgatttttgg cttgatttta aaatcctaat
tttaaaaatg taatgcatat 240tctttcttca tgtctagcaa gattaaaggt gattttcata
cacagatatt tatgttgtac 300tgatgtttgc tgtatatttt cagcctccag gggtgaaatt
gtgctgactc agtctccaga 360ctttcagtct gtgactccaa aggagaaagt caccatcacc
tgccgggcca gtcagagcat 420tggtagtagc ttacactggt accagcagaa accagatcag
tctccaaagc tcctcatcaa 480gtatgcttcc cagtccatct caggggtccc ctcgaggttc
agtggcagtg gatctgggac 540agatttcacc ctcaccatca atagcctgga agctgaagat
gctgcagcgt attactgtca 600tcagagtagt agtttacct
619321577DNAhomo sapiens; 321agcaaaactg aagtcaaaac
actgagatgg tgtccccgtt gcaattcctg cggcttctgc 60tcctctgggt tccaggtgag
aatatttaga aaaagctaaa actaattctt tgaaccatta 120attttcttaa ttaggaacct
ggcaccatat ggaacttggc ttgtttttaa atgtgatttt 180tttttaagta atgcgtattc
tttcatcttg tgctactaga ttagtggtga tttcattaag 240cagatgctta tattgtgcta
atgtttgctg tatggtttca gcctccaggg gtgatgttgt 300gatgacacag tctccagctt
tcctctctgt gactccaggg gagaaagtca ccatcacctg 360ccaggccagt gaaggcattg
gcaactactt atactggtac cagcagaaac cagatcaagc 420cccaaagctc ctcatcaagt
atgcttccca gtccatctca ggggtcccct cgaggttcag 480tggcagtgga tctgggacag
atttcacctt taccatcagt agcctggaag ctgaagatgc 540tgcaacatat tactgtcagc
agggcaataa gcaccct 577
322 127 DNA homo sapiens; 322
cccagcaggc tcctgctcca gcccagcccc cagagagcag accccaggtg ctggccccgg 60
gggttttggt ctgagcctca gtcactgtgt tatgtcttcg gaactgggac caaggtcacc 120
gtcctag 127
323 175 DNA homo sapiens; 323
gtgtgggggc catgtggact ccctcatgag cagatgccac cagggccact ggccccagct 60
tcctccttca cagctgcagt gggggctggg gctggggcat cccagggagg gtttttgtat 120
gagcctgtgt cacagtgtgt ggtattcggc ggagggacca agctgaccgt cctag 175
324 176 DNA homo sapiens; 324
gtgtgggggc catgtggact ccctcatgag cagatgccac caggaccact ggccccagct 60
tcctccttca cagctgcagt gggggctggg gctaggggca tcccagggag ggtttttgta 120
tgagcctgtg tcacagtgtt gggtgttcgg cggagggacc aagctgaccg tcctag 176
325 72 DNA homo sapiens; 325
cagagagggt ttttgtatga gcctgtgtca cagcactggg tgtttggtga ggggacggag 60
ctgaccgtcc ta 72
326 70 DNA homo sapiens; 326
ggagggtttg tgtgcagggt tatatcacag tgtaatgtgt tcggcagtgg caccaaggtg 60
accgtcctcg 70
327 46 DNA homo sapiens; 327
tcactgtgtg ctgtgttcgg aggaggcacc cagctgaccg ccctcg
46328477DNAhomo sapiens; 328gggaatctgc accatgccct gggctctgct cctcctgacc
ctcctcactc actctgcagg 60tgagagtgga ccttacccag ggatctgcac ccacctctgc
tccagcttct ccactccctg 120gctcagtgga ctctgatcct gctctcacat tcctttctgt
cccctctaca gtgtcagtgg 180tccaggcagg gctgactcag ccaccctcgg tgtccaaggg
cttgagacag accgccacac 240tcacctgcac tgggaacagc aacattgttg gcaaccaagg
agcagcttgg ctgcagcagc 300accagggcca ccctcccaaa ctcctatcct acaggaataa
caaccggccc tcagggatct 360cagagagatt ctctgcatcc aggtcaggaa acacagcctc
cctgaccatt actggactcc 420agcctgagga cgaggctgac tattactgct cagcattgga
cagcagcctc agtgctc 477329544DNAhomo sapiens; 329gctgtgtcca
ctatggccct gactcctctc ctcctcctgc tcctctctca ctgcacaggt 60agggacaggg
ctcagagccc agggtggtcc ccagcctgat ctgtccctca tggctcagat 120ccctcagcag
ctgcgccctg accctgctcc tcactgtgct gtgtctgtgt ctgcaggttc 180cctctcccgg
cccgtgctga ctcagccgcc ctctctgtct gcatccccgg gagcaacagc 240cagactcccc
tgcaccctga gcagtgacct cagtgttggt ggtaaaaaca tgttctggta 300ccagcagaag
ccagggagct ctcccaggtt attcctgtat cactactcag actcagacaa 360gcagctggga
cctggggtcc ccagtcgagt ctctggctcc aaggagacct caagtaacac 420agcgtttttg
ctcatctctg ggctccagcc tgaggacgag gccgattatt actgccaggt 480gtacgaaagt
agtgctaatc acagtgagac agatgaggaa gtcggacaaa aaccaaggtt 540ttaa
544
330 507 DNA homo sapiens; 330
gctgcgggta gagaagacag gactcaggac aatctccagc atggcctggt
cccctctctt 60cctcaccctc atcactcact gtgcaggtga caggatgggg accaagagag
aggccctggg 120aagcccatgc gaccctgctt tctcctcttg tctccttttg tctcttgtca
atcaccatgt 180ctgtgtctct ctcacttcca gggtcctggg cccagtctgt gctgactcag
ccaccctcgg 240tgtctgaagc ccccaggcag agggtcacca tctcctgttc tggaagcagc
tccaacatcg 300gaaataatgc tgtaaactgg taccagcagc tcccaggaaa ggctcccaaa
ctcctcatct 360attatgatga tctgctgccc tcaggggtct ctgaccgatt ctctggctcc
aagtctggca 420cctcagcctc cctggccatc agtgggctcc agtctgagga tgaggctgat
tattactgtg 480cagcatggga tgacagcctg aatggtc
507331517DNAhomo sapiens; 331gctctgcttc agctgtgggc acaagaggca
gcactcagga caatctccag catggcctgg 60tctcctctcc tcctcactct cctcgctcac
tgcacaggtg actggataca ggtccagggg 120aggggccctg ggaagcctat ggattcttgc
tttctcctgt tgtctctaga agccgaataa 180tgatgcctgt gtctctccca cttccagggt
cctgggccca gtctgtgctg acgcagccgc 240cctcagtgtc tggggcccca gggcagaggg
tcaccatctc ctgcactggg agcagctcca 300acatcggggc aggttatgat gtacactggt
accagcagct tccaggaaca gcccccaaac 360tcctcatcta tggtaacagc aatcggccct
caggggtccc tgaccgattc tctggctcca 420agtctggcac ctcagcctcc ctggccatca
ctgggctcca ggctgaggat gaggctgatt 480attactgcca gtcctatgac agcagcctga
gtggttc 517332581DNAhomo sapiens;
332ctgatttgca tggatggact ctccccctct cagagtatga agagagggag agatctgggg
60gaagctcagc ttcagctgtg ggtagagaag acaggactca ggacaatctc cagcatggcc
120agcttccctc tcctcctcac cctcctcact cactgtgcag gtgacaggat ggggaccaag
180aaaggggccc tgggaagccc atggggccct gctttctcct cttgtctcct tttgtctctt
240gtcaatcacc atgtctgtgt ctctctcact tccagggtcc tgggcccagt ctgtgctgac
300tcagccaccc tcagcgtctg ggacccccgg gcagagggtc accatctctt gttctggaag
360cagctccaac atcggaagta atactgtaaa ctggtaccag cagctcccag gaacggcccc
420caaactcctc atctatagta ataatcagcg gccctcaggg gtccctgacc gattctctgg
480ctccaagtct ggcacctcag cctccctggc catcagtggg ctccagtctg aggatgaggc
540tgattattac tgtgcagcat gggatgacag cctgaatggt c
581333522DNAhomo sapiens; 333ggggaagctc agcttcagct gtggtagaga agacaggatt
caggacaatc tccagcatgg 60ccggcttccc tctcctcctc accctcctca ctcactgtgc
aggtgacagg atggggacca 120agagaggggc cctgggaagc ccatggggcc ctgctttctc
ctcttgtctc ctttcgtctc 180ttgtcaatca ccatgtctgt gtctctctca cttccagggt
cctgggccca gtctgtgctg 240actcagccac cctcagcgtc tgggaccccc gggcagaggg
tcaccatctc ttgttctgga 300agcagctcca acatcggaag taattatgta tactggtacc
agcagctccc aggaacggcc 360cccaaactcc tcatctatag taataatcag cggccctcag
gggtccctga ccgattctct 420ggctccaagt ctggcacctc agcctccctg gccatcagtg
ggctccggtc cgaggatgag 480gctgattatt actgtgcagc atgggatgac agcctgagtg
gt 522334515DNAhomo sapiens; 334gctctgcttc
agctgtgggc acaggaggca gcactcagga caatctccag catggcctgg 60tcttctctcc
tcctcactct cctcgctcac tgcacaggtg actggatgca gatcgagggg 120agggtccctg
ggaagcctat ggattcttgc tttctcctct tgtctctaga agcagaatca 180tgatgcctgt
gtctctccca cttccagggt cctgggccca gtctgtgctg acgcagccgc 240cctcagtgtc
tggggcccca gggcagaggg tcaccatctc ctgcactggg agcagctcca 300acattggggc
gggttatgtt gtacattggt accagcagct tccaggaaca gcccccaaac 360tcctcatcta
tggtaacagc aatcggccct caggggtccc tgaccaattc tctggctcca 420agtctggcac
ctcagcctcc ctggccatca ctggactcca gtctgaggat gaggctgatt 480attactgcaa
agcatgggat aacagcctga atgct
515335509DNAhomo sapiens; 335tgagcgcaga aggcaggact cgggacaatc ttcatcatga
cctgctcccc tctcctcctc 60acccttctca ttcactgcac aggtgcccag acacagggtc
aggggagggg tccaggaagc 120ccatgaggcc ctgctttctc cttctctctc tagaccaaga
atcaccgtgt ctgtgtctct 180cctgcttcca gggtcctggg cccagtctgt gttgacgcag
ccgccctcag tgtctgcggc 240cccaggacag aaggtcacca tctcctgctc tggaagcagc
tccaacattg ggaataatta 300tgtatcctgg taccagcagc tcccaggaac agcccccaaa
ctcctcattt atgacaataa 360taagcgaccc tcagggattc ctgaccgatt ctctggctcc
aagtctggca cgtcagccac 420cctgggcatc accggactcc agactgggga cgaggccgat
tattactgcg gaacatggga 480tagcagcctg agtgctggca cagtgctcc
509336517DNAhomo sapiens; 336tgctggggtc tcaggaggca
gcactctcgg gacgtctcca ccatggcctg ggctctgctc 60ctcctcagcc tcctcactca
gggcacaggt gacacctcca gggaaagggt cacaggggtc 120tctgggctga tccttggtct
cctgctcctc aggctcacct gggcccagca ctgactcact 180agagtgtgtt tctccctctt
tccaggatcc tgggctcagt ctgccctgac tcagcctcgc 240tcagtgtccg ggtctcctgg
acagtcagtc accatctcct gcactggaac cagcagtgat 300gttggtggtt ataactatgt
ctcctggtac caacagcacc caggcaaagc ccccaaactc 360atgatttatg atgtcagtaa
gcggccctca ggggtccctg atcgcttctc tggctccaag 420tctggcaaca cggcctccct
gaccatctct gggctccagg ctgaggatga ggctgattat 480tactgctgct catatgcagg
cagctacact ttccaca 517337519DNAhomo sapiens;
337gctggggtct caggaggcag cgctctcagg acatctccac catggcctgg gctctgctgc
60tcctcaccct cctcactcag ggcacaggtg acgcctccag ggaaggggct tcagggacct
120ctgggctgat ccttggtctc ctgctcctca ggctcaccgg ggcccagcac tgactcactg
180gcatgtgttt ctccctcttt ccagggtcct gggcccagtc tgccctgact cagcctgcct
240ccgtgtctgg gtctcctgga cagtcgatca ccatctcctg cactggaacc agcagtgacg
300ttggtggtta taactatgtc tcctggtacc aacagcaccc aggcaaagcc cccaaactca
360tgatttatga ggtcagtaat cggccctcag gggtttctaa tcgcttctct ggctccaagt
420ctggcaacac ggcctccctg accatctctg ggctccaggc tgaggacgag gctgattatt
480actgcagctc atatacaagc agcagcactc tccacagtg
519338490DNAhomo sapiens; 338gaatatctcc accatggcct gggctctgct cctcctcacc
ctcctcactc agggcacagg 60tgaggcctcc agggaagggg cttcggggac ctctgggctg
atccttaact cctgctcctc 120aggctcacct gggcccagca ctgacttact aaaatgtgtt
tcttcctttt tccaggatcc 180tgggctcagt ctgccctgac tcagcctccc tccgtgtccg
ggtctcctgg acagtcagtc 240accatctcct gcactggaac cagcagtgac gttggtagtt
ataaccgtgt ctcctggtac 300cagcagcccc caggcacagc ccccaaactc atgatttatg
aggtcagtaa tcggccctca 360ggggtccctg atcgcttctc tgggtccaag tctggcaaca
cggcctccct gaccatctct 420gggctccagg ctgaggacga ggctgattat tactgcagct
tatatacaag cagcagcact 480ttccacagag
490
339 619 DNA homo sapiens; 339
tctctgagcc caggcccacg
tgagggtggg gtgaggagag gagcccagga tgctgatttt 60catggaggcc ccgccctcct
ctgaggcaaa ggggataaga cagggctggg gcagggccag 120tgctggggtc acaagaggca
gcgctctcgg gacgtctcca ccatggcctg ggctctgctg 180ctcctcactc tcctcactca
ggacacaggt gacgcctcca gggaaggggt cttggggacc 240tctgggctga tccttggtct
cctgctcctc aggctcaccg gggcccagca ctgactcact 300ggcatgtgtt tctccctctt
tccagggtcc tgggcccagt ctgccctgac tcagcctgcc 360tccgtgtctg ggtctcctgg
acagtcgatc accatctcct gcactggaac cagcagtgat 420gttgggagtt ataaccttgt
ctcctggtac caacagcacc caggcaaagc ccccaaactc 480atgatttatg agggcagtaa
gcggccctca ggggtttcta atcgcttctc tggctccaag 540tctggcaaca cggcctccct
gacaatctct gggctccagg ctgaggacga ggctgattat 600tactgctgct catatgcag
619
340 520 DNA homo sapiens; 340
ggctagaggc aggcccggtg ctggggtctc aaggcagcgc tctcgggaca tctccaccat
60ggcctgggct ctgctcctcc tcaccctcct cactcagggc acaggtgaca cctccaggga
120aatggccttg gggacctctg agctaatgct tggtcttctg ctcctgctcc tcagggtcac
180tggacccagt actgacccag tagagtgtgt ttctccctct ttccagggtc ctgggcccaa
240tctgccctga ctcagcctcc ttttgtgtcc ggggctcctg gacagtcggt caccatctcc
300tgcactggaa ccagcagtga cgttggggat tatgatcatg tcttctggta ccaaaagcgt
360ctcagcacta cctccagact cctgatttac aatgtcaata ctcggccttc agggatctct
420gacctcttct caggctccaa gtctggcaac atggcttccc tgaccatctc tgggctcaag
480tccgaggttg aggctaatta tcactgcagc ttatattcaa
520
341 635 DNA homo sapiens; 341
tctctaagcc caggcccaag tgagggtggg gtgagaagag
gagctcagga tgcagatttg 60catggaggtc ccgcccttct ctgaggcaga gggataagac
agggctgggg gcaggcccag 120tgctggggtc tcaggaggca gcgctctcag gacgtcacca
ccatggcctg ggctctgctc 180ctcctcaccc tcctcactca gggcacaggt gatgcctcca
gggaaggggc cacagggacc 240tctgggctga tccttggtct cctgctcctc aggctcacct
gggcccagca ctgactcact 300agactgtgtt tctccctttc cagggtcctg ggcccagtct
gccctgactc agcctccctc 360cgcgtccggg tctcctggac agtcagtcac catctcctgc
actggaacca gcagtgacgt 420tggtggttat aactatgtct cctggtacca acagcaccca
ggcaaagccc ccaaactcat 480gatttatgag gtcagtaagc ggccctcagg ggtccctgat
cgcttctctg gctccaagtc 540tggcaacacg gcctccctga ccgtctctgg gctccaggct
gaggatgagg ctgattatta 600ctgcagctca tatgcaggca gcaacaattt ccaca
635
342 691 DNA homo sapiens; 342
aagaacctgc ccagcctggg
cctcaggaag cagcatcgga ggtgcctcag ccatggcatg 60gatccctctc ttcctcggcg
tccttgctta ctgcacaggt gctgccccta gggtcctagc 120cactggtcca gtcccagggc
tctgggtcca gcctggccct gactctgagc tcagcagggc 180ccccgcctgt ggtgggcagg
atgctcatga ccctgctgca ggtggatggg ctcggcgggg 240ctgaaatccc cccacacagt
gctcatgtgc tcacactgcc ttagggctct ttcatccctg 300gatctgtgtc caggccaggc
acgtgggaag atttacttgg agttcagctc ctcagtttca 360agccttttct ctcccgtttt
ctctcctgta ggatccgtgg cctcctatga gctgactcag 420ccaccctcag tgtccgtgtc
cccaggacag acagccagca tcacctgctc tggagataaa 480ttgggggata aatatgcttg
ctggtatcag cagaagccag gccagtcccc tgtgctggtc 540atctatcaag atagcaagcg
gccctcaggg atccctgagc gattctctgg ctccaactct 600gggaacacag ccactctgac
catcagcggg acccaggcta tggatgaggc tgactattac 660tgtcaggcgt gggacagcag
cactgcacac a 691
343 539 DNA homo sapiens; 343
gtgggctcag gaggcagagc tctgggaatc tcaccatggc ctggacccct ctcctgctcc
60ccctcctcac tttctgcaca ggtgcttctc ccaggccctg ccccaggctc agtgcccata
120gaccccaagt tggccctgcc ctgaaccctg tgcaaagccc agacacagtc ttagggtagg
180acccctggga atgggctctt gatcttcaag ccccctctcc tgttttcctt gcagtctctg
240aggcctccta tgagctgaca cagccaccct cggtgtcagt gtccccagga caaacggcca
300ggatcacctg ctctggagat gcattgccaa aaaaatatgc ttattggtac cagcagaagt
360caggccaggc ccctgtgctg gtcatctatg aggacagcaa acgaccctcc gggatccctg
420agagattctc tggctccagc tcagggacaa tggccacctt gactatcagt ggggcccagg
480tggaggatga agctgactac tactgttact caacagacag cagtggtaat catagcaca
539
344 763 DNA homo sapiens; 344
gcctcagcca tggcctggac ccctctcctc ctcagcctcc
tcgctcactg cacaggtgct 60ctgcccaggg tatcaccaac ctgcccatcc ccagggctct
gggtccagtg tggccatgac 120tatgagctca ggagggccct gcctgtggtg ggcaggatgc
tcatgaccct gctgcagggt 180gagggactgg cggagctgaa gtcccctcaa actctgctca
gaggcttgtg agagcctgag 240gggctgcacc tgccaggaga gagtactggg ttttcagttc
aaaggctcca tgcagaggga 300aagtccatgg gccactgggg ctagggctga ttgcagggga
taccctgagg gttcacagac 360tctctgaagc ttttccagga cagcagggca ggggatttca
tacggatctt ttacctaaaa 420gccatcctct cctttttttt tttttttaat ctttgcaggc
tctgcgacct cctatgagct 480gactcagcca cactcagtgt cagtggccac agcacagatg
gccaggatca cctgtggggg 540aaacaacatt ggaagtaaag ctgtgcactg gtaccagcaa
aagccaggcc aggaccctgt 600gctggtcatc tatagcgata gcaaccggcc ctcagggatc
cctgagcgat tctctggctc 660caacccaggg aacaccgcca ccctaaccat cagcaggatc
gaggctgggg atgaggctga 720ctattactgt caggtgtggg acagtagtag tgatcatccc
acg 763
345 529 DNA homo sapiens; 345
tctgtgggtc
caggaggcac agctctggga atctcaccat ggcctggatc cctctcctgc 60tccccctcct
cactctctgc acaggtgctg accccaggcc cttccccagg ctcagtcccc 120acagattcca
agttgagcct gacctgaatc ctgagcaaag cccagacaca gcctctgggt 180gggactcctg
gaaatgggtc ctttgtcttc aagccccctc tcttgttctt ccttgcaggc 240tctgaggcct
cctatgagct gacacagcca ccctcggtgt cagtgtccct aggacagatg 300gccaggatca
cctgctctgg agaagcattg ccaaaaaaat atgcttattg gtaccagcag 360aagccaggcc
agttccctgt gctggtgata tataaagaca gcgagaggcc ctcagggatc 420cctgagcgat
tctctggctc cagctcaggg acaatagtca cattgaccat cagtggagtc 480caggcagaag
acgaggctga ctattactgt ctatcagcag acagcagtg
529
346 523 DNA homo sapiens; 346
agctgtgggc tcagaagcag agttctgggg tgtctccacc
atggcctgga cccctctctg 60gctcactctc ctcactcttt gcataggtgc tgcctcccag
ggctcaaccc catattatca 120tgctagctgt gccaacctgg ccctgagctt cggctcaaca
cagggagtag tgtagggtgt 180gggactctag gcgtgaaacc cttatcctca cctcttctgt
cctcttttgc aggttctgtg 240gtttcttctg agctgactca ggaccctgct gtgtctgtgg
ccttgggaca gacagtcagg 300atcacatgcc aaggagacag cctcagaagc tattatgcaa
gctggtacca gcagaagcca 360ggacaggccc ctgtacttgt catctatggt aaaaacaacc
ggccctcagg gatcccagac 420cgattctctg gctccagctc aggaaacaca gcttccttga
ccatcactgg ggctcaggcg 480gaagatgagg ctgactatta ctgtaactcc cgggacagca
gtg 523
347 1515 DNA homo sapiens; 347
gcacagagga
gctgtgccct ggaatggggc ctgtacctgt ccaaggcttg tgccgtcccc 60tgtgggagat
gagaagcgtc cctgcattgg gctcttgggg acccgtcttg gacatgagtg 120agaatgaaga
gggtccctgc attgggctct ggcatgtgac tttaaatgga tttaggcctg 180taccagacat
ctcatgtctg acataaaata tttacaatca ggacattact agagaagcag 240aaaaaagcta
accacctccc tcctgagcca ggatggaatg aaggagggga ctgtggaccc 300cagataattc
ccctgtcacc actgtgactc taacaacctc ttaaatcacg gccaacatct 360atcccatagg
aaggtcttta tatcccctag aaaatacaga ggaagtcagc tctgagcttt 420tccacgacca
acccagccaa ggagcaaggc tgggcacaac ctgggtaaag atgtgagccc 480agaccatggg
accagtgggt gaaggaaaat cgcatgggct gagggggtgg gtaagcaggg 540gccagccctc
ctctctctgt ttcctttggg gctgagtcct tctctggaaa ccacagatct 600cctccagcag
cagcctctga ctctgctgat ttgcatcatg ggccgctctc tccagcaagg 660ggataagaga
ggcctgggag gaacctgctc agtctgggcc taaggaagca gcactggtgg 720tgcctcagcc
atggcctgga ccgttctcct cctcggcctc ctctctcact gcacaggtga 780tccccccagg
gtctcaccaa cctgcccagc ccaagggttc tgggtccagc gtgtccttga 840ttctgagctc
aggagggccc ttcctgtggt gggcaggatg ctcatgaccc tgctgcaggg 900tgggaggctg
gtggggctga actcccccca aactgtgctc aaaggcttgt gagagcctga 960gggactgcac
ctgccaggag agagtagtga gttttcagtt caaagtctcc atacaacagg 1020aaagtcatgg
gccactgggg ctggggctga ttgcagggga taccctgagg gttcacagac 1080tctctggagc
ttgtctggga cagcagggca agggatttca taagaagcat ctttcacctg 1140caagccaacc
tctctcttat ttatttattt atttatttat ttatttattt atttattttt 1200atctttgcag
gctctgtgac ctcctatgtg ctgactcagc caccctcggt gtcagtggcc 1260ccaggacaga
cggccaggat tacctgtggg ggaaacaaca ttggaagtaa aagtgtgcac 1320tggtaccagc
agaagccagg ccaggcccct gtgctggtcg tctatgatga tagcgaccgg 1380ccctcaggga
tccctgagcg attctctggc tccaactctg ggaacacggc caccctgacc 1440atcagcaggg
tcgaagccgg ggatgaggcc gactattact gtcaggtgtg ggatagtagt 1500agtgatcatc
ccacg
1515
348 558 DNA homo sapiens; 348
aagagaggcc tgggaagccc agctgtgctg tgggctcagg
aggcagagct gtgggtgtct 60caccatggca tgggccacac tcctgctccc actcctcaac
ctctacacag gtgctgcccc 120cagaccctgc cccaggctca gccctcctaa gcccctggtc
ttaccctgaa ccctgagctc 180agcccaggca tagcctcagg gcgatactac tggaatgggt
ttgttatctt caagccccct 240ctcttgtcct ctcttgcagg ctctgttgcc tcctatgagc
tgacacagct accctcggtg 300tcagtgtccc caggacagac agccaggatc acctgctctg
gagatgtact gggggaaaat 360tatgctgact ggtaccagca gaagccaggc caggcccctg
agttggtgat atacgaagat 420agtgagcggt accctggaat ccctgaacga ttctctgggt
ccacctcagg gaacacgacc 480accctgacca tcagcagggt cctgaccgaa gacgaggctg
actattactg tttgtctggg 540gatgaggaca atccctca
558
349 546 DNA homo sapiens; 349
gctgtgctgt gggtccagga
ggcagaactc tgggtgtctc accatggcct ggatccctct 60acttctcccc ctcttcactc
tctgcacagg tgctgtcccc aggccctgct ccaggccctg 120ctccagtctt attccccaca
gatcccaagt tgagcctgcc ctgaatcccg agcaaagccc 180agacgcagcc tctgggtgcg
actcctggga atgggtcctt tgtcttcaag ccccctctct 240tgttcttcct tgcaggctct
gaggcctcct atgagctgac acagccaccc tcggtgtcag 300tgtccccagg acagacggcc
aggatcacct gctctggaga tgcattgcca aagcaatatg 360cttattggta ccagcagaag
ccaggccagg cccctgtgct ggtgatatat aaagacagtg 420agaggccctc agggatccct
gagcgattct ctggctccag ctcagggaca acagtcacgt 480tgaccatcag tggagtccag
gcagaagatg aggctgacta ttactgtcaa tcagcagaca 540gcagtg
546
350 519 DNA homo sapiens; 350
gctgtaggct caggaggcag agctctgaat gtctcaccat ggcctggatc cctctcctgc
60tccccctcct cattctctgc acaggtgctg cccctaggct cagtctccac agaccccaag
120ttgagcctga cctgaatcct gagcaaagcc ctgccactgc ctctgggggg gattcctggc
180aatgcgtcct ttgtcctcaa gccccctctc ctgtcttttc ttgcagtctc tgtggcctcc
240tatgagctga cacagccatc ctcagtgtca gtgtctccgg gacagacagc caggatcacc
300tgctcaggag atgtactggc aaaaaaatat gctcggtggt tccagcagaa gccaggccag
360gcccctgtgc tggtgattta taaagacagt gagcggccct cagggatccc tgagcgattc
420tccggctcca gctcagggac cacagtcacc ttgaccatca gcggggccca ggttgaggat
480gaggctgact attactgtta ctctgcggct gacaacaat
519
351 504 DNA homo sapiens; 351
gctgtggact cagaggcaga gctctggggc atttccatta
tggcctggac ccctcccctg 60ctcgtcctca ctctctgcac aggtgctgcc tcccagggct
cagcccccag tgggatcaag 120atcagcctgg ccctgacctt caactcaaca tagggagtga
tgcagggtgt ggggttctgg 180gaatgaggcc ctcatcctca gactcacctc tcctgtcctc
tcttgtgggc tccgttattt 240cctctgggcc aactcaggtg cctgcagtgt ctgtggcctt
gggacaaatg gccaggatca 300cctgccaggg agacagcatg gaaggctctt atgaacactg
gtaccagcag aagccaggcc 360aggcccccgt gctggtcatc tatgatagca gtgaccggcc
ctcaaggatc cctgagcgat 420tctctggctc caaatcaggc aacacaacca ccctgaccat
cactggggcc caggctgagg 480atgaggctga ttattactat cagt
504
352 747 DNA homo sapiens; 352
cttgactctg ctgatttgca
tcacaggctg ctctcttcag caaggggata agagagggct 60ggaaggaacc tgcccagcct
gggcctcagg aagcagcatc gggggtgccg cagccatggc 120ctggaccgct ctccttctga
gcctccttgc tcactttaca ggtgctgccc ccagtgtccc 180agccacctac ccagctccaa
ggctctgggt ccagcctggc ctgacagtga tctcagcagg 240gccctgcctg tggtgtgcag
gatgctcatg atcctgctgc agggggaggg gctgctggag 300gtgaaatccc cccacactgt
tcttctgtgc tcatggtccc ctgaggacac ttctattcct 360gaaactcagg ccaggcaggt
gggaaggcat tgttgggttg agcctctcag tttcaagtct 420attctattct ctcccctttt
cttgcaggtt ctgtggcctc ctatgagctg actcagccac 480tctcagtgtc agtggccctg
ggacagacgg ccaggattac ctgtggggga aacaacattg 540gaagtaaaaa tgtgcactgg
taccagcaga agccaggcca ggcccctgtg ctggtcatct 600atagggatag caaccggccc
tctgggatcc ctgagcgatt ctctggctcc aactcgggga 660acacggccac cctgaccatc
agcagagccc aagccgggga tgaggctgac tattactgtc 720aggtgtggga cagcagcact
gcacaca 747
353 529 DNA homo sapiens; 353
ctcgaataga gctcttggaa gtccctccaa ccatggcctg ggtctccttc tacctactgc
60ccttcatttt ctccacaggt cagaacatcc cagggaattc agggaaatgt tttcactgct
120attttcccat gagcaccagt cctcaggggc attctttcca gttcttctgt gcattcagca
180tcattcatga cattctgttt acaggtctct gtgctctgcc tgtgctgact cagcccccgt
240ctgcatctgc cttgctggga gcctcgatca agctcacctg caccctaagc agtgagcaca
300gcacctacac catcgaatgg tatcaacaga gaccagggag gtccccccag tatataatga
360aggttaagag tgatggcagc cacagcaagg gggacgggat ccccgatcgc ttcatgggct
420ccagttctgg ggctgaccgc tacctcacct tctccaacct ccagtctgac gatgaggctg
480agtatcactg tggagagagc cacacgattg atggccaagt cggttgagc
529
354 483 DNA homo sapiens; 354
atggcctgga ccccactcct cctcctcttc cctctcctcc
tccactgcac aggtcaggag 60gaccctcagc atcctcatgc cccagctcac tgacaccatc
tcccaaactc ataccagaaa 120tgttgtttgc tcttgtcctt ccttcaggcc ataatgagcg
tctctgtttt cagggtctct 180ctcccagcct gtgctgactc aatcatcctc tgcctctgct
tccctgggat cctcggtcaa 240gctcacctgc actctgagca gtgggcacag tagctacatc
atcgcatggc atcagcagca 300gccagggaag gcccctcggt acttgatgaa gcttgaaggt
agtggaagct acaacaaggg 360gagcggagtt cctgatcgct tctcaggctc cagctctggg
gctgaccgct acctcaccat 420ctccaacctc cagtttgagg atgaggctga ttattactgt
gagacctggg acagtaacac 480tca
483
355 539 DNA homo sapiens; 355
agggtgggta agaaatacct
gcaactgtca gcctcagcag agctctgggg agtctgcacc 60atggcttgga ccccactcct
cttcctcacc ctcctcctcc actgcacagg tcaggatggc 120cctcagcacc ctgacctcca
gctcactgat accacctccc aaacttatgc caggaatgtc 180cttccctctt ttcttgactc
cagccggtaa tgggtgtctg tgttttcagg gtctctctcc 240cagcttgtgc tgactcaatc
gccctctgcc tctgcctccc tgggagcctc ggtcaagctc 300acctgcactc tgagcagtgg
gcacagcagc tacgccatcg catggcatca gcagcagcca 360gagaagggcc ctcggtactt
gatgaagctt aacagtgatg gcagccacag caagggggac 420gggatccctg atcgcttctc
aggctccagc tctggggctg agcgctacct caccatctcc 480agcctccagt ctgaggatga
ggctgactat tactgtcaga cctggggcac tggcattca 539
356 496 DNA homo sapiens; 356
ccaccatggc ctggactcct cttcttctct tgctcctctc tcactgcaca ggtagggaca
60ggcctcagag atcagggcca gccacccaac ctgattctgg ctcttctggt aaagatccct
120gaaaaacctc accctgaacc ctgcccatca accatgagtg tctgtgtttg caggttccct
180ctcccagcct gtgctgactc agccaccttc ctcctccgca tctcctggag aatccgccag
240actcacctgc accttgccca gtgacatcaa tgttggtagc tacaacatat actggtacca
300gcagaagcca gggagccctc ccaggtatct cctgtactac tactcagact cagataaggg
360ccagggctct ggagtcccca gccgcttctc tggatccaaa gatgcttcag ccaatacagg
420gattttactc atctccgggc tccagtctga ggatgaggct gactattact gtatgatttg
480gccaagcaat gcttct
496
357 520 DNA homo sapiens; 357
actgcggggg taagaggttg tgtccaccat ggcctggact
cctctcctcc tcctgttcct 60ctctcactgc acaggtagga atagacttca gagaccaggg
tcagccaccc agcctgattc 120tgactcttct ggcaaagatc cctgaaaaac tttaccctgg
tttctgcctt agcacccatt 180aatgtctgtg tttccaggtt ccctctcgca ggctgtgctg
actcagccgt cttccctctc 240tgcatctcct ggagcatcag ccagtctcac ctgcaccttg
cgcagtggca tcaatgttgg 300tacctacagg atatactggt accagcagaa gccagggagt
cctccccagt atctcctgag 360gtacaaatca gactcagata agcagcaggg ctctggagtc
cccagccgct tctctggatc 420caaagatgct tcggccaatg cagggatttt actcatctct
gggctccagt ctgaggatga 480ggctgactat tactgtatga tttggcacag cagcgcttct
520
358 493 DNA homo sapiens; 358
atggcctgga atcctctcct
cctcctgttc ctctctcact gcacaggtag gaaaaggcct 60cagagaccag ggtcagccac
acagcctgat tctgactctt gtgtcaaaga tcactaaaaa 120aaatattacc ttggtttctg
tcttaaagcc tatatatgcc tgtgttccag gttccctctc 180gcagcctgtg ctgactcagc
caacttccct ctcagcatct cctggagcat cagccagact 240cacctgcacc ttgcgcagtg
gcatcaatct tggtagctac aggatattct ggtaccagca 300gaagccagag agccctcccc
ggtatctcct gagctactac tcagactcaa gtaagcatca 360gggctctgga gtccccagcc
gcttctctgg atccaaagat gcttcgagca atgcagggat 420tttagtcatc tctgggctcc
agtctgagga tgaggctgac tattactgta tgatttggca 480cagcagtgct tct
493
359 500 DNA homo sapiens; 359
ccaccatggc ctggactctt ctccttctcg tgctcctctc tcactgcaca ggtagggaaa
60gtccttataa actgagtctc agtgtccaac ctacaccatc ccctgtggct cagacctaca
120agaagcttta ccctgggaac tgccttatca cccatgatgt ctgtgttttc aggttccctc
180tcccagcctg tgctgactca gccatcttcc cattctgcat cttctggagc atcagtcaga
240ctcacctgca tgctgagcag tggcttcagt gttggggact tctggataag gtggtaccaa
300caaaagccag ggaaccctcc ccggtatctc ctgtactacc actcagactc caataagggc
360caaggctctg gagttcccag ccgcttctct ggatccaacg atgcatcagc caatgcaggg
420attctgcgta tctctgggct ccagcctgag gatgaggctg actattactg tggtacatgg
480cacagcaact ctaagactca
500
360 574 DNA homo sapiens; 360
tctgaggata cgcgtgacag ataagaaggg ctggtgggat
cagtcctggt ggtagctcag 60gaagcagagc ctggagcatc tccactatgg cctgggctcc
actacttctc accctcctcg 120ctcactgcac aggtggctgc ctgcaaggaa ttcagggagc
gttcctggat gtcacctggg 180ctgatgatct gttcctcctg cctgggaacc agtcttcatc
tctcccgact gatctctgtg 240ttgctctctt cttgcaggtt cttgggccaa ttttatgctg
actcagcccc actctgtgtc 300ggagtctccg gggaagacgg taaccatctc ctgcacccgc
agcagtggca gcattgccag 360caactatgtg cagtggtacc agcagcgccc gggcagttcc
cccaccactg tgatctatga 420ggataaccaa agaccctctg gggtccctga tcggttctct
ggctccatcg acagctcctc 480caactctgcc tccctcacca tctctggact gaagactgag
gacgaggctg actactactg 540tcagtcttat gatagcagca atcacacagt gctc
574
361 472 DNA homo sapiens; 361
tctggcgcca ggggtccctt
ccaatatcag caccatggcc tggactcctc tctttctgtt 60cctcctcact tgctgcccag
gttaagagag atttcaaata ccagcctttg gagggatcct 120tctgtctgcc cttctaattt
ctaacatgtg tctgtttttt gtttcagggt ccaattctca 180gactgtggtg actcaggagc
cctcactgac tgtgtcccca ggagggacag tcactctcac 240ctgtgcttcc agcactggag
cagtcaccag tggttactat ccaaactggt tccagcagaa 300acctggacaa gcacccaggg
cactgattta tagtacaagc aacaaacact cctggacccc 360tgcccggttc tcaggctccc
tccttggggg caaagctgcc ctgacactgt caggtgtgca 420gcctgaggac gaggctgagt
attactgcct gctctactat ggtggtgctc ag 472
362 473 DNA homo sapiens; 362
tctggcacca ggggtccctt ccaatatcag caccatggcc tggactcctc tctttctgtt
60cctcctcact tgctgcccag gttaagagag atttcaaata ccagcctttg gagggatccc
120tttttctccc tttctaattc ctaatatatg tctgtttttt ttgtttcagg gtccaattcc
180caggctgtgg tgactcagga gccctcactg actgtgtccc caggagggac agtcactctc
240acctgtggct ccagcactgg agctgtcacc agtggtcatt atccctactg gttccagcag
300aagcctggcc aagcccccag gacactgatt tatgatacaa gcaacaaaca ctcctggaca
360cctgcccggt tctcaggctc cctccttggg ggcaaagctg ccctgaccct tttgggtgcg
420cagcctgagg atgaggctga gtattactgc ttgctctcct atagtggtgc tcg
473
363 513 DNA homo sapiens; 363
gaggaaaaca aaccccagct gggaagcctg agaacactta
gccttcatga gtgtccccac 60catggcctgg atgatgcttc tcctcggact ccttgcttat
ggatcaggtc aggggaaggg 120actctatccc tgggggacca cagaaaacag ggtccaggtt
actctcatcc tcatgatcat 180aactgtgtct ctcctgttcg ttttaggagt ggattctcag
actgtggtga cccaggagcc 240atcgttctca gtgtcccctg gagggacagt cacactcact
tgtggcttga gctctggctc 300agtctctact agttactacc ccagctggta ccagcagacc
ccaggccagg ctccacgcac 360gctcatctac agcacaaaca ctcgctcttc tggggtccct
gatcgcttct ctggctccat 420ccttgggaac aaagctgccc tcaccatcac gggggcccag
gcagatgatg aatctgatta 480ttactgtgtg ctgtatatgg gtagtggcat ttc
513
364 546 DNA homo sapiens; 364
gagagactga agaacccagc
attgcagcag ctccaccatg gcctgggctc ctctgctcct 60caccctcctc agtctcctca
caggtcaggg tgggcagtgg gctgggcccc caaagggacc 120cccacctccc agcctccatc
tccccatccc tgctcttcct cctccaacag ctcatcagcc 180acccaccaac aggagccctc
atgggtgtct gtgtttccag ggtccctctc ccagcctgtg 240ctgactcagc caccttctgc
atcagcctcc ctgggagcct cggtcacact cacctgcacc 300ctgagcagcg gctacagtaa
ttataaagtg gactggtacc agcagagacc agggaagggc 360ccccggtttg tgatgcgagt
gggcactggt gggattgtgg gatccaaggg ggatggcatc 420cctgatcgct tctcagtctt
gggctcaggc ctgaatcggt acctgaccat caagaacatc 480caggaagaag atgagagtga
ctaccactgt ggggcagacc atggcagtgg gagcaacttc 540gtgtaa
546
365 6729 DNA homo sapiens; 365
gcctccacca agggcccatc ggtcttcccc ctggcaccct cctccaagag cacctctggg
60ggcacagcag ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg
120tggaactcag gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca
180ggactctact ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacccagacc
240tacatctgca acgtgaatca caagcccagc aacaccaagg tggacaagaa agttggtgag
300aggccagcac agggagggag ggtgtctgct ggaagccagg ctcagcgctc ctgcctggac
360gcatcccggc tatgcagccc cagtccaggg cagcaaggca ggccccgtct gcctcttcac
420ccggaggcct ctgcccgccc cactcatgct cagggagagg gtcttctggc tttttcccca
480ggctctgggc aggcacaggc taggtgcccc taacccaggc cctgcacaca aaggggcagg
540tgctgggctc agacctgcca agagccatat ccgggaggac cctgcccctg acctaagccc
600accccaaagg ccaaactctc cactccctca gctcggacac cttctctcct cccagattcc
660agtaactccc aatcttctct ctgcagagcc caaatcttgt gacaaaactc acacatgccc
720accgtgccca ggtaagccag cccaggcctc gccctccagc tcaaggcggg acaggtgccc
780tagagtagcc tgcatccagg gacaggcccc agccgggtgc tgacacgtcc acctccatct
840cttcctcagc acctgaactc ctggggggac cgtcagtctt cctcttcccc ccaaaaccca
900aggacaccct catgatctcc cggacccctg aggtcacatg cgtggtggtg gacgtgagcc
960acgaagaccc tgaggtcaag ttcaactggt acgtggacgg cgtggaggtg cataatgcca
1020agacaaagcc gcgggaggag cagtacaaca gcacgtaccg tgtggtcagc gtcctcaccg
1080tcctgcacca ggactggctg aatggcaagg agtacaagtg caaggtctcc aacaaagccc
1140tcccagcccc catcgagaaa accatctcca aagccaaagg tgggacccgt ggggtgcgag
1200ggccacatgg acagaggccg gctcggccca ccctctgccc tgagagtgac cgctgtacca
1260acctctgtcc ctacagggca gccccgagaa ccacaggtgt acaccctgcc cccatcccgg
1320gatgagctga ccaagaacca ggtcagcctg acctgcctgg tcaaaggctt ctatcccagc
1380gacatcgccg tggagtggga gagcaatggg cagccggaga acaactacaa gaccacgcct
1440cccgtgctgg actccgacgg ctccttcttc ctctacagca agctcaccgt ggacaagagc
1500aggtggcagc aggggaacgt cttctcatgc tccgtgatgc atgaggctct gcacaaccac
1560tacacacaga agagcctctc cctgtctccg ggtaaatgag tgccacggcc ggcaagcccc
1620cgctccccag gctctcgggg tcgcgcgagg atgcttggca cgtaccccgt gtacatactt
1680cccaggcacc cagcatggaa ataaagcacc cagcgcttcc ctgggcccct gcgagactgt
1740gatggttctt tccacgggtc aggccgagtc tgaggcctga gtggcatgag ggaggcagag
1800tgggtcccac tgtccccaca ctggcccagg ctgtgcaggt gtgcctgggc cgcctagggt
1860ggggctcagc caggggctgc cctcggcagg gtgggggatt tgccagcgtg gccctccctc
1920cagcagcagc tgccctgggc tgggccacga gaagccctag gagcccctgg ggacagacac
1980acagcccctg cctctgtagg agactgtcct gttctgtgag cgccctgtcc tccgacccgc
2040atgcccactc gggggcatgc ctagtccatg tgcgtaggga caggccctcc ctcacccatc
2100tacccccacg gcactaaccc ctggcagccc tgcccagcct cgcacccgca tggggacaca
2160accgactccg gggacatgca ctctcgggcc ctgtggagag actggtccag atgcccacac
2220acacactcag cccagacccg ttcaacaaac cccgcactga ggttggccgg ccacacggcc
2280accacacaca cacgtgcacg cctcacacac ggagcctcac ccgggcgaac cgcacagcac
2340ccagaccaga gcaaggtcct cgcacacgtg aacactcctc ggacacaggc ccccacgagc
2400cccacgcggc acctcaaggc ccacgagccg ctcggcagct tctccacatg ctgacctgct
2460cagacaaacc cagccctcct ctcacaaggt gcccctgcag ccgccacaca cacacagggg
2520atcacacacc acgtcacgtc cctggccctg gcccacttcc cagtgccgcc cttccctgca
2580gctggggtca catgaggtgt gggcttcacc atcctcctgc cctctgggcc tcagggaggg
2640acacgggaga cggggagcgg gtcctgctga gggccaggtc gctatctagg gccgggtgtc
2700tggctgagcc ccggggccaa agctggtgcc cagggcgggc agctgtgggg agctgacctc
2760aggacattgt tggcccatcc cggccgggcc ctacatcctg ggtcctgcca cagagggaat
2820cacccccaga ggcccaagcc cagggggaca cagcactgac cacccccttc ctgtccagag
2880ctgcaactgg aggagagctg tgcggaggcg caggacgggg agctggacgg gctgtggacg
2940accatcacca tcttcatcac actcttcctg ttaagcgtgt gctacagtgc caccgtcacc
3000ttcttcaagg tcggccgcac gttgtcccca gctgtccttg acattgtccc ccatgctgtc
3060acaaactgtc tctgacactg tcccacaggc tgtccccacc tgtccctgac gctgtccccc
3120atgctctcac aaactgtccc tgacattgtc cccaatgctg cccccacctg tccaacagtg
3180tcccccaggc tctccccaca tgtccccgac actgtccccc atgctgtccc catctgtccc
3240caacactgtc ccccaccctg tccccctttg tccccaacac tgtcccccac agtttccacc
3300tgtccctgac actgtccccc atgctttccc cacctgtccc tgacaccatc ccccactctg
3360tcccctatag ttcctggccc tgtcccccac gctgtcccct acagtacctg gcactgtccc
3420ccatgctgtc ccctcctgta tgaaaccctg tcccacatgc tgtccccacc tgtccgtgac
3480aatatccccc acactgtccc cacctgtccc cgacactctc ctccacgttg ttcttaccta
3540aacccgacac tttcctccat gctgtcccca cccatctccg acactgtacc ccacgttgtc
3600cccacctgtc ctcaacactg tcccccatgc tgtccccacc tgtccccaac actctcctcc
3660atgctgtccc cacctgtccc tgatattgtc ccccatgcag tctccacctg tccccaatgc
3720tgtcccccag gctgtaccta ccagtacaac actgtccccc atgctgtccc cacctgtccc
3780tgacactgtc ccccacgctg tcccctcctg tccccgacac tgtcccccac actgtcccca
3840cctgtcccca acactatcct ccatgctgtc ccctcctgtc cccacctgtc ccctacactg
3900tcccccatgc tgtccccacc agtccccaaa actttcctcc acactgtccc cacctgtccc
3960caacactgtc ccccacgcta tcccccctgt ccccgacaat gtccccactg tttcctcctg
4020ttccctccta tccctgacac tgtccgccat gctgtcccca cctgtccctg acactgtctc
4080ccactctgtc ccctataatc cctgacactg tcccccacgc cgtcccctcc cgtatgcacc
4140actgtccccc aagctgtccc cacctgtcct caacacagtc ccccatgctg tccccacctg
4200tccccaacac tctcctccat gtccccacct gtccctgata ttgtccccca tgcagtcccc
4260acctgtcccc gatgctgtcc cccgggctgt acctaccagt ccaacactgt cccccacact
4320ctccccacct gtccctgata ctgtccccca tgctgtcccc acctgtcccg gacactgttc
4380tccacgctct cccctcctgt ccctgacact gtcccccaca ctgtccccac ctgtccccaa
4440cactatcctc catcctgtcc caacctgtct cctacactgt cccccatgct gtccccacca
4500gtccccaaca ctgtcctcca tgctgtcccc catgtcccca acactgtccc ccatgctatc
4560tcccctgtcc ctgacaatgt ccccactgtt tcctgtcccc tcctatccct gacactgtcc
4620cccatgctgt ccccacctgt cccccacatg gtctccaccg gtccctgaca ctgtctccca
4680ctctgtcccc tataatccct gacactgtcc cccacaccgt cccctcctgt atgcaccact
4740gtcccccatg ctgtccccac ctgtccctga tgctgtcctc cacacagtcc ccacctctcc
4800ctgacactgt ccccatctct ccccaacact ctcctccatg ctgtccttaa ctgtccccaa
4860cactcttcca cactctgtct ccacctgtcc ctgacactgt cccccacact gtcctcacct
4920gtgtctgaca ctgtccccca cgctgtcccc acctgtccct gacgctgtct tctgtgctgt
4980ccacatgctg ttggtgccct ggctctgctc tctatcacca agcctcagag caggcagtgg
5040tgaggccatg gcacctgggt ggcatgaggg gccggatggg cctcaggggc agggctgtgg
5100cctgcgtgga ctgacgggtg ggtgggcctt gggggcagag aggtggcctc agtgccctga
5160ggggtgggtg gggctcgggg gcagggctgt ggcctcgctc acccctgtgc tgtgccttgc
5220ctacaggtga agtggatctt ctcctcggtg gtggacctga agcagaccat catccccgac
5280tacaggaaca tgatcggaca gggggcctag ggccaccctc tgcggggtgt ccagggccgc
5340ccagacccca cacaccagcc atgggccatg ctcagccacc acccaggcca cacctgcccc
5400cgacctcacc gccctcaacc ccatgactct ctggcctcgc agttgccctc tgaccctgac
5460acacctgaca cgcccccctt ccagaccctg tgcatagcag gtctacccca gacctccgct
5520gcttggtgca tgcagggcac tgggggccag gtgtcccctc agcaggacgt ccttgccctc
5580cggaccacaa ggtgctcaca caaaaggagg cagtgaccgg tatcccaggc ccccacccag
5640gcaggacctc gccctggagc caaccccgtc cacgccagcc tcctgaacac aggcgtggtt
5700tccagatggt gagtgggagc gtcagccgcc aaggtaggga agccacagca ccatcaggcc
5760ctgttgggga ggcttccgag agctgcgaag gctcactcag acggccttcc tcccagcccg
5820cagccagcca gcctccattc cgggcactcc cgtgaactcc tgacatgagg aatgaggttg
5880ttctgatttc aagcaaagaa cgctgctctc tggctcctgg gaacagtctc agtgccagca
5940ccaccccttg gctgcctgcc cacactgctg gattctcggg tggaactgga cccgcaggga
6000cagccagccc cagagtccgc actggggaga gaaggggcca ggcccaggac actgccacct
6060cccacccact ccagtccacc gagatcactc agagaagagc ctgggccatg tggccgctgc
6120aggagcccca cagtgcaagg gtgaggatag cccaaggaag ggctgggcat ctgcccagac
6180aggcctccca gagaaggctg gtgaccaggt cccaggcggg caagactcag ccttggtggg
6240gcctgaggac agaggaggcc caggagcatc ggggagagag gtggagggac accgggagag
6300ccaggagcgt ggacacagcc agaactcatc acagaggctg gcgtccagcc ccgggtcacg
6360tgcagcagga acaagcagcc actctggggg caccaggtgg agaggcaaga cgacaaagag
6420ggtgcccgtg ttcttgcgaa agcagggctg ctggccacga gtgctggaca gaggccccca
6480cgctctgctg cccccatcac gccgttccgt gactgtcacg cagaatctgc agacaggaag
6540ggagactcga gcgggagtgc ggccagcgcc tgcctcggcc gtcagggagg actcctgggc
6600tcactcgaag gaggtgccac catttcagct ttggtagctt ttcttcttct tttaaatttt
6660ctaaagctca ttaattgtct ttgatgtttc ttttgtgatg acaataaaat atccttttta
6720agtcttgta
6729
366 888 DNA homo sapiens; 366
gcctccacca agggcccatc ggtcttcccc ctggcaccct
cctccaagag cacctctggg 60ggcacagcag ccctgggctg cctggtcaag gactacttcc
ccgaaccggt gacggtgtcg 120tggaactcag gcgccctgac cagcggcgtg cacaccttcc
cggctgtcct acagtcctca 180ggactctact ccctcagcag cgtggtgacc gtgccctcca
gcagcttggg cacccagacc 240tacatctgca acgtgaatca caagcccagc aacaccaagg
tggacaagaa agttgagccc 300aaatcttgtg acaaaactca cacatgccca ccgtgcccag
cacctgaact cctgggggga 360ccgtcagtct tcctcttccc cccaaaaccc aaggacaccc
tcatgatctc ccggacccct 420gaggtcacat gcgtggtggt ggacgtgagc cacgaagacc
ctgaggtcaa gttcaactgg 480tacgtggacg gcgtggagta caagtgcaag gtctccaaca
aagccctccc agcccccatc 540gagaaaacca tctccaaagc caaagggcag ccccgagaac
cacaggtgta caccctgccc 600ccatcccggg atgagctgac caagaaccag gtcagcctga
cctgcctggt caaaggcttc 660tatcccagcg acatcgccgt ggagtgggag agcaatgggc
agccggagaa caactacaag 720accacgcctc ccgtgctgga ctccgacggc tccttcttcc
tctacagcaa gctcaccgtg 780gacaagagca ggtggcagca ggggaacgtc ttctcatgct
ccgtgatgca tgaggctctg 840cacaaccact acacacagaa gagcctctcc ctgtctccgg
gtaaatga 888
367 993 DNA homo sapiens; 367
gcctccacca
agggcccatc ggtcttcccc ctggcaccct cctccaagag cacctctggg 60ggcacagcag
ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg 120tggaactcag
gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact
ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacccagacc 240tacatctgca
acgtgaatca caagcccagc aacaccaagg tggacaagaa agttgagccc 300aaatcttgtg
acaaaactca cacatgccca ccgtgcccag cacctgaact cctgggggga 360ccgtcagtct
tcctcttccc cccaaaaccc aaggacaccc tcatgatctc ccggacccct 420gaggtcacat
gcgtggtggt ggacgtgagc cacgaagacc ctgaggtcaa gttcaactgg 480tacgtggacg
gcgtggaggt gcataatgcc aagacaaagc cgcgggagga gcagtacaac 540agcacgtacc
gtgtggtcag cgtcctcacc gtcctgcacc aggactggct gaatggcaag 600gagtacaagt
gcaaggtctc caacaaagcc ctcccagccc ccatcgagaa aaccatctcc 660aaagccaaag
ggcagccccg agaaccacag gtgtacaccc tgcccccatc ccgggatgag 720ctgaccaaga
accaggtcag cctgacctgc ctggtcaaag gcttctatcc cagcgacatc 780gccgtggagt
gggagagcaa tgggcagccg gagaacaact acaagaccac gcctcccgtg 840ctggactccg
acggctcctt cttcctctac agcaagctca ccgtggacaa gagcaggtgg 900cagcagggga
acgtcttctc atgctccgtg atgcatgagg ctctgcacaa ccactacaca 960cagaagagcc
tctccctgtc tccgggtaaa tga
993
368 1200 DNA homo sapiens; 368
gcctccacca agggcccatc ggtcttcccc ctggcaccct
cctccaagag cacctctggg 60ggcacagcag ccctgggctg cctggtcaag gactacttcc
ccgaaccggt gacggtgtcg 120tggaactcag gcgccctgac cagcggcgtg cacaccttcc
cggctgtcct acagtcctca 180ggactctact ccctcagcag cgtggtgacc gtgccctcca
gcagcttggg cacccagacc 240tacatctgca acgtgaatca caagcccagc aacaccaagg
tggacaagaa agttgagccc 300aaatcttgtg acaaaactca cacatgccca ccgtgcccag
cacctgaact cctgggggga 360ccgtcagtct tcctcttccc cccaaaaccc aaggacaccc
tcatgatctc ccggacccct 420gaggtcacat gcgtggtggt ggacgtgagc cacgaagacc
ctgaggtcaa gttcaactgg 480tacgtggacg gcgtggaggt gcataatgcc aagacaaagc
cgcgggagga gcagtacaac 540agcacgtacc gtgtggtcag cgtcctcacc gtcctgcacc
aggactggct gaatggcaag 600gagtacaagt gcaaggtctc caacaaagcc ctcccagccc
ccatcgagaa aaccatctcc 660aaagccaaag ggcagccccg agaaccacag gtgtacaccc
tgcccccatc ccgggatgag 720ctgaccaaga accaggtcag cctgacctgc ctggtcaaag
gcttctatcc cagcgacatc 780gccgtggagt gggagagcaa tgggcagccg gagaacaact
acaagaccac gcctcccgtg 840ctggactccg acggctcctt cttcctctac agcaagctca
ccgtggacaa gagcaggtgg 900cagcagggga acgtcttctc atgctccgtg atgcatgagg
ctctgcacaa ccactacaca 960cagaagagcc tctccctgtc tccggagctg caactggagg
agagctgtgc ggaggcgcag 1020gacggggagc tggacgggct gtggacgacc atcaccatct
tcatcacact cttcctgtta 1080agcgtgtgct acagtgccac cgtcaccttc ttcaaggtga
agtggatctt ctcctcggtg 1140gtggacctga agcagaccat catccccgac tacaggaaca
tgatcggaca gggggcctag 1200
369 1739 DNA homo sapiens; 369
gcctccacca
agggcccatc ggtcttcccc ctggcgccct gctccaggag cacctccgag 60agcacagcgg
ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg 120tggaactcag
gcgctctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact
ccctcagcag cgtggtgacc gtgccctcca gcaacttcgg cacccagacc 240tacacctgca
acgtagatca caagcccagc aacaccaagg tggacaagac agttggtgag 300aggccagctc
agggagggag ggtgtctgct ggaagccagg ctcagccctc ctgcctggac 360gcaccccggc
tgtgcagccc cagcccaggg cagcaaggca ggccccatct gtctcctcac 420ccggaggcct
ctgcccgccc cactcatgct cagggagagg gtcttctggc tttttccacc 480aggctccagg
caggcacagg ctgggtgccc ctaccccagg cccttcacac acaggggcag 540gtgcttggct
cagacctgcc aaaagccata tccgggagga ccctgcccct gacctaagcc 600gaccccaaag
gccaaactgt ccactccctc agctcggaca ccttctctcc tcccagatcc 660gagtaactcc
caatcttctc tctgcagagc gcaaatgttg tgtcgagtgc ccaccgtgcc 720caggtaagcc
agcccaggcc tcgccctcca gctcaaggcg ggacaggtgc cctagagtag 780cctgcatcca
gggacagacc ccagctgggt gctgacacgt ccacctccat ctcttcctca 840gcaccacctg
tggcaggacc gtcagtcttc ctcttccccc caaaacccaa ggacaccctc 900atgatctccc
ggacccctga ggtcacgtgc gtggtggtgg acgtgagcca cgaagacccc 960gaggtccagt
tcaactggta cgtggacggc gtggaggtgc ataatgccaa gacaaagcca 1020cgggaggagc
agttcaacag cacgttccgt gtggtcagcg tcctcaccgt cgtgcaccag 1080gactggctga
acggcaagga gtacaagtgc aaggtctcca acaaaggcct cccagccccc 1140atcgagaaaa
ccatctccaa aaccaaaggt gggacccgcg gggtatgagg gccacatgga 1200cagaggccgg
ctcggcccac cctctgccct gggagtgacc gctgtgccaa cctctgtccc 1260tacagggcag
ccccgagaac cacaggtgta caccctgccc ccatcccggg aggagatgac 1320caagaaccag
gtcagcctga cctgcctggt caaaggcttc taccccagcg acatctccgt 1380ggagtgggag
agcaatgggc agccggagaa caactacaag accacacctc ccatgctgga 1440ctccgacggc
tccttcttcc tctacagcaa gctcaccgtg gacaagagca ggtggcagca 1500ggggaacgtc
ttctcatgct ccgtgatgca tgaggctctg cacaaccact acacacagaa 1560gagcctctcc
ctgtctccgg gtaaatgagt gccacggccg gcaagccccc gctccccagg 1620ctctcggggt
cgcgcgagga tgcttggcac gtaccccgtc tacatacttc ccgggcaccc 1680agcatggaaa
taaagcaccc agcgctgccc tgggcccctg cgagactgtg atggttctt
1739
370 981 DNA homo sapiens;
370 gcctccacca agggcccatc ggtcttcccc ctggcgccct
gctccaggag cacctccgag 60agcacagcgg ccctgggctg cctggtcaag gactacttcc
ccgaaccggt gacggtgtcg 120tggaactcag gcgctctgac cagcggcgtg cacaccttcc
cggctgtcct acagtcctca 180ggactctact ccctcagcag cgtggtgacc gtgccctcca
gcaacttcgg cacccagacc 240tacacctgca acgtagatca caagcccagc aacaccaagg
tggacaagac agttgagcgc 300aaatgttgtg tcgagtgccc accgtgccca gcaccacctg
tggcaggacc gtcagtcttc 360ctcttccccc caaaacccaa ggacaccctc atgatctccc
ggacccctga ggtcacgtgc 420gtggtggtgg acgtgagcca cgaagacccc gaggtccagt
tcaactggta cgtggacggc 480gtggaggtgc ataatgccaa gacaaagcca cgggaggagc
agttcaacag cacgttccgt 540gtggtcagcg tcctcaccgt cgtgcaccag gactggctga
acggcaagga gtacaagtgc 600aaggtctcca acaaaggcct cccagccccc atcgagaaaa
ccatctccaa aaccaaaggg 660cagccccgag aaccacaggt gtacaccctg cccccatccc
gggaggagat gaccaagaac 720caggtcagcc tgacctgcct ggtcaaaggc ttctacccca
gcgacatctc cgtggagtgg 780gagagcaatg ggcagccgga gaacaactac aagaccacac
ctcccatgct ggactccgac 840ggctccttct tcctctacag caagctcacc gtggacaaga
gcaggtggca gcaggggaac 900gtcttctcat gctccgtgat gcatgaggct ctgcacaacc
actacacaca gaagagcctc 960tccctgtctc cgggtaaatg a
981
371 981 DNA homo sapiens;
371 gcctccacca agggcccatc
ggtcttcccc ctggcgccct gctccaggag cacctccgag 60agcacagcgg ccctgggctg
cctggtcaag gactacttcc ccgaaccggt gacggtgtcg 120tggaactcag gcgctctgac
cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact ccctcagcag
cgtggtgacc gtgacctcca gcaacttcgg cacccagacc 240tacacctgca acgtagatca
caagcccagc aacaccaagg tggacaagac agttgagcgc 300aaatgttgtg tcgagtgccc
accgtgccca gcaccacctg tggcaggacc gtcagtcttc 360ctcttccccc caaaacccaa
ggacaccctc atgatctccc ggacccctga ggtcacgtgc 420gtggtggtgg acgtgagcca
cgaagacccc gaggtccagt tcaactggta cgtggacggc 480atggaggtgc ataatgccaa
gacaaagcca cgggaggagc agttcaacag cacgttccgt 540gtggtcagcg tcctcaccgt
cgtgcaccag gactggctga acggcaagga gtacaagtgc 600aaggtctcca acaaaggcct
cccagccccc atcgagaaaa ccatctccaa aaccaaaggg 660cagccccgag aaccacaggt
gtacaccctg cccccatccc gggaggagat gaccaagaac 720caggtcagcc tgacctgcct
ggtcaaaggc ttctacccca gcgacatctc cgtggagtgg 780gagagcaatg ggcagccgga
gaacaactac aagaccacac ctcccatgct ggactccgac 840ggctccttct tcctctacag
caagctcacc gtggacaaga gcaggtggca gcaggggaac 900gtcttctcat gctccgtgat
gcatgaggct ctgcacaacc actacacaca gaagagcctc 960tccctgtctc cgggtaaatg a
981
372 1739 DNA homo sapiens;
372 gcctccacca agggcccatc ggtcttcccc ctggcgccct gctccaggag cacctccgag
60agcacagcgg ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg
120tggaactcag gcgctctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca
180ggactctact ccctcagcag cgtggtgacc gtgacctcca gcaacttcgg cacccagacc
240tacacctgca acgtagatca caagcccagc aacaccaagg tggacaagac agttggtgag
300aggccagctc agggagggag ggtgtctgct ggaagccagg ctcagccctc ctgcctggac
360gcaccccggc tgtgcagccc cagcccaggg cagcaaggca ggccccatct gtctcctcac
420ccggaggcct ctgcccgccc cactcatgct cagggagagg gtcttctggc tttttccacc
480aggctccagg caggcacagg ctgggtgccc ctaccccagg cccttcacac acaggggcag
540gtgcttggct cagacctgcc aaaagccata tccgggagga ccctgcccct gacctaagcc
600gaccccaaag gccaaactgt ccactccctc agctcggaca ccttctctcc tcccagatcc
660gagtaactcc caatcttctc tctgcagagc gcaaatgttg tgtcgagtgc ccaccgtgcc
720caggtaagcc agcccaggcc tcgccctcca gctcaaggcg ggacaggtgc cctagagtag
780cctgcatcca gggacagacc ccagctgggt gctgacacgt ccacctccat ctcttcctca
840gcaccacctg tggcaggacc gtcagtcttc ctcttccccc caaaacccaa ggacaccctc
900atgatctccc ggacccctga ggtcacgtgc gtggtggtgg acgtgagcca cgaagacccc
960gaggtccagt tcaactggta cgtggacggc atggaggtgc ataatgccaa gacaaagcca
1020cgggaggagc agttcaacag cacgttccgt gtggtcagcg tcctcaccgt cgtgcaccag
1080gactggctga acggcaagga gtacaagtgc aaggtctcca acaaaggcct cccagccccc
1140atcgagaaaa ccatctccaa aaccaaaggt gggacccgcg gggtatgagg gccacatgga
1200cagaggccgg ctcggcccac cctctgccct gggagtgacc gctgtgccaa cctctgtccc
1260tacagggcag ccccgagaac cacaggtgta caccctgccc ccatcccggg aggagatgac
1320caagaaccag gtcagcctga cctgcctggt caaaggcttc taccccagcg acatctccgt
1380ggagtgggag agcaatgggc agccggagaa caactacaag accacacctc ccatgctgga
1440ctccgacggc tccttcttcc tctacagcaa gctcaccgtg gacaagagca ggtggcagca
1500ggggaacgtc ttctcatgct ccgtgatgca tgaggctctg cacaaccact acacacagaa
1560gagcctctcc ctgtctccgg gtaaatgagt gccacggccg gcaagccccc gctccccagg
1620ctctcggggt cgcgcgagga tgcttggcac gtaccccgtc tacatacttc ccgggcaccc
1680agcatggaaa taaagcaccc agcgctgccc tgggcccctg cgagactgtg atggttctt
1739
373 2304 DNA homo sapiens;
373 cttccaccaa gggcccatcg gtcttccccc
tggcgccctg ctccaggagc acctctgggg 60gcacagcggc cctgggctgc ctggtcaagg
actacttccc agaaccggtg acggtgtcgt 120ggaactcagg cgccctgacc agcggcgtgc
acaccttccc ggctgtccta cagtcctcag 180gactctactc cctcagcagc gtggtgaccg
tgccctccag cagcttgggc acccagacct 240acacctgcaa cgtgaatcac aagcccagca
acaccaaggt ggacaagaga gttggtgaga 300ggccagcgca gggagggagg gtgtctgctg
gaagccaggc tcagccctcc tgcctggacg 360catcccggct gtgcagtccc agcccagggc
accaaggcag gccccgtctg actcctcacc 420cggaggcctc tgcccgcccc actcatgctc
agggagaggg tcttctggct ttttccacca 480ggctccgggc aggcacaggc tggatgcccc
taccccaggc ccttcacaca caggggcagg 540tgctgcgctc agagctgcca agagccatat
ccaggaggac cctgcccctg acctaagccc 600accccaaagg ccaaactctc tactcactca
gctcagatac cttctctctt cccagatctg 660agtaactccc aatcttctct ctgcagagct
caaaacccca cttggtgaca caactcacac 720atgcccacgg tgcccaggta agccagccca
ggcctcgccc tccagctcaa ggcgggacaa 780gagccctaga gtggcctgag tccagggaca
ggccccagca gggtgctgac gcatccacct 840ccatcccaga tccccgtaac tcccaatctt
ctctctgcag agcccaaatc ttgtgacaca 900cctcccccgt gcccacggtg cccaggtaag
ccagcccagg cctcgccctc cagctcaagg 960caggacaaga gccctagagt ggcctgagtc
cagggacagg ccccagcagg gtgctgacgc 1020gtccacctcc atcccagatc cccgtaactc
ccaatcttct ctctgcagag cccaaatctt 1080gtgacacacc tcccccatgc ccacggtgcc
caggtaagcc agcccaggcc tcgccctcca 1140gctcaaggcg ggacaagagc cctagagtgg
cctgagtcca gggacaggcc ccagcagggt 1200gctgacgcat ccacctccat cccagatccc
cgtaactccc aatcttctct ctgcagagcc 1260caaatcttgt gacacacctc ccccgtgccc
aaggtgccca ggtaagccag cccaggcctc 1320gccctccagc tcaaggcagg acaggtgccc
tagagtggcc tgcatccagg gacaggtccc 1380agtcgggtgc tgacacatct gcctccatct
cttcctcagc acctgaactc ctgggaggac 1440cgtcagtctt cctcttcccc ccaaaaccca
aggataccct tatgatttcc cggacccctg 1500aggtcacgtg cgtggtggtg gacgtgagcc
acgaagaccc cgaggtccag ttcaagtggt 1560acgtggacgg cgtggaggtg cataatgcca
agacaaagcc gcgggaggag cagtacaaca 1620gcacgttccg tgtggtcagc gtcctcaccg
tcctgcacca ggactggctg aacggcaagg 1680agtacaagtg caaggtctcc aacaaagccc
tcccagcccc catcgagaaa accatctcca 1740aaaccaaagg tgggacccgc ggggtatgag
ggccacatgg acagaggcca gcttgaccca 1800ccctctgccc tgggagtgac cgctgtgcca
acctctgtcc ctacaggaca gccccgagaa 1860ccacaggtgt acaccctgcc cccatcccgg
gaggagatga ccaagaacca ggtcagcctg 1920acctgcctgg tcaaaggctt ctaccccagc
gacatcgccg tggagtggga gagcagcggg 1980cagccggaga acaactacaa caccacgcct
cccatgctgg actccgacgg ctccttcttc 2040ctctacagca agctcaccgt ggacaagagc
aggtggcagc aggggaacat cttctcatgc 2100tccgtgatgc atgaggctct gcacaaccgc
ttcacgcaga agagcctctc cctgtctccg 2160ggtaaatgag tgcgacggcc ggcaagcccc
cgctccccgg gctctcgggg tcgcgcgagg 2220atgcttggca cgtaccccgt gtacatactt
cccgggcacc cagcatggaa ataaagcacc 2280cagcgctgcc ctgggcccct gcga
2304
374 1134 DNA homo sapiens; misc_feature (1)..(1) n is a, c, g, or t
374 ncttccacca agggcccatc
ggtcttcccc ctggcgccct gctccaggag cacctctggg 60ggcacagcgg ccctgggctg
cctggtcaag gactacttcc cagaaccggt gacggtgtcg 120tggaactcag gcgccctgac
cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact ccctcagcag
cgtggtgacc gtgccctcca gcagcttggg cacccagacc 240tacacctgca acgtgaatca
caagcccagc aacaccaagg tggacaagag agttgagctc 300aaaaccccac ttggtgacac
aactcacaca tgcccacggt gcccagagcc caaatcttgt 360gacacacctc ccccgtgccc
acggtgccca gagcccaaat cttgtgacac acctccccca 420tgcccacggt gcccagagcc
caaatcttgt gacacacctc ccccgtgccc aaggtgccca 480gcacctgaac tcctgggagg
accgtcagtc ttcctcttcc ccccaaaacc caaggatacc 540cttatgattt cccggacccc
tgaggtcacg tgcgtggtgg tggacgtgag ccacgaagac 600cccgaggtcc agttcaagtg
gtacgtggac ggcgtggagg tgcataatgc caagacaaag 660ccgcgggagg agcagtacaa
cagcacgttc cgtgtggtca gcgtcctcac cgtcctgcac 720caggactggc tgaacggcaa
ggagtacaag tgcaaggtct ccaacaaagc cctcccagcc 780cccatcgaga aaaccatctc
caaaaccaaa ggacagcccc gagaaccaca ggtgtacacc 840ctgcccccat cccgggagga
gatgaccaag aaccaggtca gcctgacctg cctggtcaaa 900ggcttctacc ccagcgacat
cgccgtggag tgggagagca gcgggcagcc ggagaacaac 960tacaacacca cgcctcccat
gctggactcc gacggctcct tcttcctcta cagcaagctc 1020accgtggaca agagcaggtg
gcagcagggg aacatcttct catgctccgt gatgcatgag 1080gctctgcaca accgcttcac
gcagaagagc ctctccctgt ctccgggtaa atga 1134
375 1134 DNA homo sapiens; misc_feature (1)..(1) n is a, c, g, or t
375 ncttccacca agggcccatc
ggtcttcccc ctggcgccct gctccaggag cacctctggg 60ggcacagcgg ccctgggctg
cctggtcaag gactacttcc cagaaccggt gacggtgtcg 120tggaactcag gcgccctgac
cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact ccctcagcag
cgtggtgacc gtgccctcca gcagcttggg cacccagacc 240tacacctgca acgtgaatca
caagcccagc aacaccaagg tggacaagag agttgagctc 300aaaaccccac ttggtgacac
aactcacaca tgcccacggt gcccagagcc caaatcttgt 360gacacacctc ccccgtgccc
acggtgccca gagcccaaat cttgtgacac acctccccca 420tgcccacggt gcccagagcc
caaatcttgt gacacacctc ccccgtgccc aaggtgccca 480gcacctgaac tcctgggagg
accgtcagtc ttcctcttcc ccccaaaacc caaggatacc 540cttatgattt cccggacccc
tgaggtcacg tgcgtggtgg tggacgtgag ccacgaagac 600cccgaggtcc agttcaagtg
gtacgtggac ggcgtggagg tgcataatgc caagacaaag 660ctgcgggagg agcagtacaa
cagcacgttc cgtgtggtca gcgtcctcac cgtcctgcac 720caggactggc tgaacggcaa
ggagtacaag tgcaaggtct ccaacaaagc cctcccagcc 780cccatcgaga aaaccatctc
caaaaccaaa ggacagcccc gagaaccaca ggtgtacacc 840ctgcccccat cccgggagga
gatgaccaag aaccaggtca gcctgacctg cctggtcaaa 900ggcttctacc ccagcgacat
cgccgtggag tgggagagca gcgggcagcc ggagaacaac 960tacaacacca cgcctcccat
gctggactcc gacggctcct tcttcctcta cagcaagctc 1020accgtggaca agagcaggtg
gcagcagggg aacatcttct catgctccgt gatgcatgag 1080gctctgcaca accgctacac
gcagaagagc ctctccctgt ctccgggtaa atga 1134
376 2304 DNA homo sapiens;
376 cttccaccaa gggcccatcg gtcttccccc tggcgccctg ctccaggagc acctctgggg
60gcacagcggc cctgggctgc ctggtcaagg actacttccc agaaccggtg acggtgtcgt
120ggaactcagg cgccctgacc agcggcgtgc acaccttccc ggctgtccta cagtcctcag
180gactctactc cctcagcagc gtggtgaccg tgccctccag cagcttgggc acccagacct
240acacctgcaa cgtgaatcac aagcccagca acaccaaggt ggacaagaga gttggtgaga
300ggccagcgca gggagggagg gtgtctgctg gaagccaggc tcagccctcc tgcctggacg
360catcccggct gtgcagtccc agcccagggc accaaggcag gccccgtctg actcctcacc
420cggaggcctc tgcccgcccc actcatgctc agggagaggg tcttctggct ttttccacca
480ggctccgggc aggcacaggc tggatgcccc taccccaggc ccttcacaca caggggcagg
540tgctgcgctc agagctgcca agagccatat ccaggaggac cctgcccctg acctaagccc
600accccaaagg ccaaactctc tactcactca gctcagatac cttctctctt cccagatctg
660agtaactccc aatcttctct ctgcagagct caaaacccca cttggtgaca caactcacac
720atgcccacgg tgcccaggta agccagccca ggcctcgccc tccagctcaa ggcgggacaa
780gagccctaga gtggcctgag tccagggaca ggccccagca gggtgctgac gcatccacct
840ccatcccaga tccccgtaac tcccaatctt ctctctgcag agcccaaatc ttgtgacaca
900cctcccccgt gcccacggtg cccaggtaag ccagcccagg cctcgccctc cagctcaagg
960caggacaaga gccctagagt ggcctgagtc cagggacagg ccccagcagg gtgctgacgc
1020gtccacctcc atcccagatc cccgtaactc ccaatcttct ctctgcagag cccaaatctt
1080gtgacacacc tcccccatgc ccacggtgcc caggtaagcc agcccaggcc tcgccctcca
1140gctcaaggcg ggacaagagc cctagagtgg cctgagtcca gggacaggcc ccagcagggt
1200gctgacgcat ccacctccat cccagatccc cgtaactccc aatcttctct ctgcagagcc
1260caaatcttgt gacacacctc ccccgtgccc aaggtgccca ggtaagccag cccaggcctc
1320gccctccagc tcaaggcagg acaggtgccc tagagtggcc tgcatccagg gacaggtccc
1380agtcgggtgc tgacacatct gcctccatct cttcctcagc acctgaactc ctgggaggac
1440cgtcagtctt cctcttcccc ccaaaaccca aggataccct tatgatttcc cggacccctg
1500aggtcacgtg cgtggtggtg gacgtgagcc acgaagaccc cgaggtccag ttcaagtggt
1560acgtggacgg cgtggaggtg cataatgcca agacaaagct gcgggaggag cagtacaaca
1620gcacgttccg tgtggtcagc gtcctcaccg tcctgcacca ggactggctg aacggcaagg
1680agtacaagtg caaggtctcc aacaaagccc tcccagcccc catcgagaaa accatctcca
1740aaaccaaagg tgggacccgc ggggtatgag ggccacatgg acagaggcca gcttgaccca
1800ccctctgccc tgggagtgac cgctgtgcca acctctgtcc ctacaggaca gccccgagaa
1860ccacaggtgt acaccctgcc cccatcccgg gaggagatga ccaagaacca ggtcagcctg
1920acctgcctgg tcaaaggctt ctaccccagc gacatcgccg tggagtggga gagcagcggg
1980cagccggaga acaactacaa caccacgcct cccatgctgg actccgacgg ctccttcttc
2040ctctacagca agctcaccgt ggacaagagc aggtggcagc aggggaacat cttctcatgc
2100tccgtgatgc atgaggctct gcacaaccgc tacacgcaga agagcctctc cctgtctccg
2160ggtaaatgag tgcgacggcc ggcaagcccc cgctccccgg gctctcgggg tcgcgcgagg
2220atgcttggca cgtaccccgt gtacatactt cccgggcacc cagcatggaa ataaagcacc
2280cagcgctgcc ctgggcccct gcga
2304
377 1134 DNA homo sapiens; misc_feature (1)..(1) n is a, c, g, or t
377 ncttccacca agggcccatc ggtcttcccc ctggcgccct gctccaggag cacctctggg
60ggcacagcgg ccctgggctg cctggtcaag gactacttcc cagaaccggt gacggtgtcg
120tggaactcag gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca
180ggactctact ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacccagacc
240tacacctgca acgtgaatca caagcccagc aacaccaagg tggacaagag agttgagctc
300aaaaccccac ttggtgacac aactcacaca tgcccacggt gcccagagcc caaatcttgt
360gacacacctc ccccgtgccc acggtgccca gagcccaaat cttgtgacac acctccccca
420tgcccacggt gcccagagcc caaatcttgt gacacacctc ccccgtgccc aaggtgccca
480gcacctgaac tcctgggagg accgtcagtc ttcctcttcc ccccaaaacc caaggatacc
540cttatgattt cccggacccc tgaggtcacg tgcgtggtgg tggacgtgag ccacgaagac
600cccgaggtcc agttcaagtg gtacgtggac ggcgtggagg tgcataatgc caagacaaag
660ccgcgggagg agcagttcaa cagcacgttc cgtgtggtca gcgtcctcac cgtcctgcac
720caggactggc tgaacggcaa ggagtacaag tgcaaggtct ccaacaaagc cctcccagcc
780cccatcgaga aaaccatctc caaaaccaaa ggacagcccc gagaaccaca ggtgtacacc
840ctgcccccat cccgggagga gatgaccaag aaccaggtca gcctgacctg cctggtcaaa
900ggcttctacc ccagcgacat cgccgtggag tgggagagca gcgggcagcc ggagaacaac
960tacaacacca cgcctcccat gctggactcc gacggctcct tcttcctcta cagcaagctc
1020accgtggaca agagcaggtg gcagcagggg aacatcttct catgctccgt gatgcatgag
1080gctctgcaca accgcttcac gcagaagagc ctctccctgt ctccgggtaa atga
1134
378 2304 DNA homo sapiens;
378 cttccaccaa gggcccatcg gtcttccccc
tggcgccctg ctccaggagc acctctgggg 60gcacagcggc cctgggctgc ctggtcaagg
actacttccc agaaccggtg acggtgtcgt 120ggaactcagg cgccctgacc agcggcgtgc
acaccttccc ggctgtccta cagtcctcag 180gactctactc cctcagcagc gtggtgaccg
tgccctccag cagcttgggc acccagacct 240acacctgcaa cgtgaatcac aagcccagca
acaccaaggt ggacaagaga gttggtgaga 300ggccagcgca gggagggagg gtgtctgctg
gaagccaggc tcagccctcc tgcctggacg 360catcccggct gtgcagtccc agcccagggc
accaaggcag gccccgtctg actcctcacc 420cggaggcctc tgcccgcccc actcatgctc
agggagaggg tcttctggct ttttccacca 480ggctccgggc aggcacaggc tggatgcccc
taccccaggc ccttcacaca caggggcagg 540tgctgcgctc agagctgcca agagccatat
ccaggaggac cctgcccctg acctaagccc 600accccaaagg ccaaactctc tactcactca
gctcagatac cttctctctt cccagatctg 660agtaactccc aatcttctct ctgcagagct
caaaacccca cttggtgaca caactcacac 720atgcccacgg tgcccaggta agccagccca
ggcctcgccc tccagctcaa ggcgggacaa 780gagccctaga gtggcctgag tccagggaca
ggccccagca gggtgctgac gcatccacct 840ccatcccaga tccccgtaac tcccaatctt
ctctctgcag agcccaaatc ttgtgacaca 900cctcccccgt gcccacggtg cccaggtaag
ccagcccagg cctcgccctc cagctcaagg 960caggacaaga gccctagagt ggcctgagtc
cagggacagg ccccagcagg gtgctgacgc 1020gtccacctcc atcccagatc cccgtaactc
ccaatcttct ctctgcagag cccaaatctt 1080gtgacacacc tcccccatgc ccacggtgcc
caggtaagcc agcccaggcc tcgccctcca 1140gctcaaggcg ggacaagagc cctagagtgg
cctgagtcca gggacaggcc ccagcagggt 1200gctgacgcat ccacctccat cccagatccc
cgtaactccc aatcttctct ctgcagagcc 1260caaatcttgt gacacacctc ccccgtgccc
aaggtgccca ggtaagccag cccaggcctc 1320gccctccagc tcaaggcagg acaggtgccc
tagagtggcc tgcatccagg gacaggtccc 1380agtcgggtgc tgacacatct gcctccatct
cttcctcagc acctgaactc ctgggaggac 1440cgtcagtctt cctcttcccc ccaaaaccca
aggataccct tatgatttcc cggacccctg 1500aggtcacgtg cgtggtggtg gacgtgagcc
acgaagaccc cgaggtccag ttcaagtggt 1560acgtggacgg cgtggaggtg cataatgcca
agacaaagcc gcgggaggag cagttcaaca 1620gcacgttccg tgtggtcagc gtcctcaccg
tcctgcacca ggactggctg aacggcaagg 1680agtacaagtg caaggtctcc aacaaagccc
tcccagcccc catcgagaaa accatctcca 1740aaaccaaagg tgggacccgc ggggtatgag
ggccacatgg acagaggcca gcttgaccca 1800ccctctgccc tgggagtgac cgctgtgcca
acctctgtcc ctacaggaca gccccgagaa 1860ccacaggtgt acaccctgcc cccatcccgg
gaggagatga ccaagaacca ggtcagcctg 1920acctgcctgg tcaaaggctt ctaccccagc
gacatcgccg tggagtggga gagcagcggg 1980cagccggaga acaactacaa caccacgcct
cccatgctgg actccgacgg ctccttcttc 2040ctctacagca agctcaccgt ggacaagagc
aggtggcagc aggggaacat cttctcatgc 2100tccgtgatgc atgaggctct gcacaaccgc
ttcacgcaga agagcctctc cctgtctccg 2160ggtaaatgag tgcgacggcc ggcaagcccc
cgctccccgg gctctcgggg tcgcgcgagg 2220atgcttggca cgtaccccgt gtacatactt
cccgggcacc cagcatggaa ataaagcacc 2280cagcgctgcc ctgggcccct gcga
2304
379 1717 DNA homo sapiens;
379 gcttccacca agggcccatc ggtcttcccc ctggcgccct gctccaggag cacctccgag
60agcacagccg ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg
120tggaactcag gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca
180ggactctact ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacgaagacc
240tacacctgca acgtagatca caagcccagc aacaccaagg tggacaagag agttggtgag
300aggccagcac agggagggag ggtgtctgct ggaagccagg ctcagccctc ctgcctggac
360gcaccccggc tgtgcagccc cagcccaggg cagcaaggca ggccccatct gtctcctcac
420ccggaggcct ctgaccaccc cactcatgct cagggagagg gtcttctgga tttttccacc
480aggctccggg cagccacagg ctggatgccc ctaccccagg ccctgcgcat acaggggcag
540gtgctgcgct cagacctgcc aagagccata tccgggagga ccctgcccct gacctaagcc
600caccccaaag gccaaactct ccactccctc agctcagaca ccttctctcc tcccagatct
660gagtaactcc caatcttctc tctgcagagt ccaaatatgg tcccccatgc ccatcatgcc
720caggtaagcc aacccaggcc tcgccctcca gctcaaggcg ggacaggtgc cctagagtag
780cctgcatcca gggacaggcc ccagccgggt gctgacgcat ccacctccat ctcttcctca
840gcacctgagt tcctgggggg accatcagtc ttcctgttcc ccccaaaacc caaggacact
900ctcatgatct cccggacccc tgaggtcacg tgcgtggtgg tggacgtgag ccaggaagac
960cccgaggtcc agttcaactg gtacgtggat ggcgtggagg tgcataatgc caagacaaag
1020ccgcgggagg agcagttcaa cagcacgtac cgtgtggtca gcgtcctcac cgtcctgcac
1080caggactggc tgaacggcaa ggagtacaag tgcaaggtct ccaacaaagg cctcccgtcc
1140tccatcgaga aaaccatctc caaagccaaa ggtgggaccc acggggtgcg agggccacat
1200ggacagaggt cagctcggcc caccctctgc cctgggagtg accgctgtgc caacctctgt
1260ccctacaggg cagccccgag agccacaggt gtacaccctg cccccatccc aggaggagat
1320gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc ttctacccca gcgacatcgc
1380cgtggagtgg gagagcaatg ggcagccgga gaacaactac aagaccacgc ctcccgtgct
1440ggactccgac ggctccttct tcctctacag caggctcacc gtggacaaga gcaggtggca
1500ggaggggaat gtcttctcat gctccgtgat gcatgaggct ctgcacaacc actacacaca
1560gaagagcctc tccctgtctc tgggtaaatg agtgccaggg ccggcaagcc cccgctcccc
1620gggctctcgg ggtcgcgcga ggatgcttgg cacgtacccc gtgtacatac ttcccgggcg
1680cccagcatgg aaataaagca cccagcgctg ccctggg
1717
380 984 DNA homo sapiens;
380 gcttccacca agggcccatc ggtcttcccc ctggcgccct
gctccaggag cacctccgag 60agcacagccg ccctgggctg cctggtcaag gactacttcc
ccgaaccggt gacggtgtcg 120tggaactcag gcgccctgac cagcggcgtg cacaccttcc
cggctgtcct acagtcctca 180ggactctact ccctcagcag cgtggtgacc gtgccctcca
gcagcttggg cacgaagacc 240tacacctgca acgtagatca caagcccagc aacaccaagg
tggacaagag agttgagtcc 300aaatatggtc ccccatgccc atcatgccca gcacctgagt
tcctgggggg accatcagtc 360ttcctgttcc ccccaaaacc caaggacact ctcatgatct
cccggacccc tgaggtcacg 420tgcgtggtgg tggacgtgag ccaggaagac cccgaggtcc
agttcaactg gtacgtggat 480ggcgtggagg tgcataatgc caagacaaag ccgcgggagg
agcagttcaa cagcacgtac 540cgtgtggtca gcgtcctcac cgtcctgcac caggactggc
tgaacggcaa ggagtacaag 600tgcaaggtct ccaacaaagg cctcccgtcc tccatcgaga
aaaccatctc caaagccaaa 660gggcagcccc gagagccaca ggtgtacacc ctgcccccat
cccaggagga gatgaccaag 720aaccaggtca gcctgacctg cctggtcaaa ggcttctacc
ccagcgacat cgccgtggag 780tgggagagca atgggcagcc ggagaacaac tacaagacca
cgcctcccgt gctggactcc 840gacggctcct tcttcctcta cagcaggctc accgtggaca
agagcaggtg gcaggagggg 900aatgtcttct catgctccgt gatgcatgag gctctgcaca
accactacac acagaagagc 960ctctccctgt ctctgggtaa atga
984
381 984 DNA homo sapiens;
381 gcttccacca agggcccatc
ggtcttcccc ctggcgccct gctccaggag cacctccgag 60agcacagccg ccctgggctg
cctggtcaag gactacttcc ccgaaccggt gacggtgtcg 120tggaactcag gcgccctgac
cagcggcgtg cacaccttcc cggctgtcct acagtcctca 180ggactctact ccctcagcag
cgtggtgacc gtgccctcca gcagcttggg cacgaagacc 240tacacctgca acgtagatca
caagcccagc aacaccaagg tggacaagag agttgagtcc 300aaatatggtc ccccatgccc
atcatgccca gcacctgagt tcctgggggg accatcagtc 360ttcctgttcc ccccaaaacc
caaggacact ctcatgatct cccggacccc tgaggtcacg 420tgcgtggtgg tggacgtgag
ccaggaagac cccgaggtcc agttcaactg gtacgtggat 480ggcgtggagg tgcataatgc
caagacaaag ccgcgggagg agcagttcaa cagcacgtac 540cgtgtggtca gcgtcctcac
cgtcgtgcac caggactggc tgaacggcaa ggagtacaag 600tgcaaggtct ccaacaaagg
cctcccgtcc tccatcgaga aaaccatctc caaagccaaa 660gggcagcccc gagagccaca
ggtgtacacc ctgcccccat cccaggagga gatgaccaag 720aaccaggtca gcctgacctg
cctggtcaaa ggcttctacc ccagcgacat cgccgtggag 780tgggagagca atgggcagcc
ggagaacaac tacaagacca cgcctcccgt gctggactcc 840gacggctcct tcttcctcta
cagcaggctc accgtggaca agagcaggtg gcaggagggg 900aatgtcttct catgctccgt
gatgcatgag gctctgcaca accactacac acagaagagc 960ctctccctgt ctctgggtaa
atga 984
382 1717 DNA homo sapiens;
382 gcttccacca agggcccatc ggtcttcccc ctggcgccct gctccaggag cacctccgag
60agcacagccg ccctgggctg cctggtcaag gactacttcc ccgaaccggt gacggtgtcg
120tggaactcag gcgccctgac cagcggcgtg cacaccttcc cggctgtcct acagtcctca
180ggactctact ccctcagcag cgtggtgacc gtgccctcca gcagcttggg cacgaagacc
240tacacctgca acgtagatca caagcccagc aacaccaagg tggacaagag agttggtgag
300aggccagcac agggagggag ggtgtctgct ggaagccagg ctcagccctc ctgcctggac
360gcaccccggc tgtgcagccc cagcccaggg cagcaaggca ggccccatct gtctcctcac
420ccggaggcct ctgaccaccc cactcatgct cagggagagg gtcttctgga tttttccacc
480aggctccggg cagccacagg ctggatgccc ctaccccagg ccctgcgcat acaggggcag
540gtgctgcgct cagacctgcc aagagccata tccgggagga ccctgcccct gacctaagcc
600caccccaaag gccaaactct ccactccctc agctcagaca ccttctctcc tcccagatct
660gagtaactcc caatcttctc tctgcagagt ccaaatatgg tcccccatgc ccatcatgcc
720caggtaagcc aacccaggcc tcgccctcca gctcaaggcg ggacaggtgc cctagagtag
780cctgcatcca gggacaggcc ccagccgggt gctgacgcat ccacctccat ctcttcctca
840gcacctgagt tcctgggggg accatcagtc ttcctgttcc ccccaaaacc caaggacact
900ctcatgatct cccggacccc tgaggtcacg tgcgtggtgg tggacgtgag ccaggaagac
960cccgaggtcc agttcaactg gtacgtggat ggcgtggagg tgcataatgc caagacaaag
1020ccgcgggagg agcagttcaa cagcacgtac cgtgtggtca gcgtcctcac cgtcgtgcac
1080caggactggc tgaacggcaa ggagtacaag tgcaaggtct ccaacaaagg cctcccgtcc
1140tccatcgaga aaaccatctc caaagccaaa ggtgggaccc acggggtgcg agggccacat
1200ggacagaggt cagctcggcc caccctctgc cctgggagtg accgctgtgc caacctctgt
1260ccctacaggg cagccccgag agccacaggt gtacaccctg cccccatccc aggaggagat
1320gaccaagaac caggtcagcc tgacctgcct ggtcaaaggc ttctacccca gcgacatcgc
1380cgtggagtgg gagagcaatg ggcagccgga gaacaactac aagaccacgc ctcccgtgct
1440ggactccgac ggctccttct tcctctacag caggctcacc gtggacaaga gcaggtggca
1500ggaggggaat gtcttctcat gctccgtgat gcatgaggct ctgcacaacc actacacaca
1560gaagagcctc tccctgtctc tgggtaaatg agtgccaggg ccggcaagcc cccgctcccc
1620gggctctcgg ggtcgcgcga ggatgcttgg cacgtacccc gtgtacatac ttcccgggcg
1680cccagcatgg aaataaagca cccagcgctg ccctggg
1717
383 1546 DNA homo sapiens;
383 gcatccccga ccagccccaa ggtcttcccg
ctgagcctct gcagcaccca gccagatggg 60aacgtggtca tcgcctgcct ggtccagggc
ttcttccccc aggagccact cagtgtgacc 120tggagcgaaa gcggacaggg cgtgaccgcc
agaaacttcc cacccagcca ggatgcctcc 180ggggacctgt acaccacgag cagccagctg
accctgccgg ccacacagtg cctagccggc 240aagtccgtga catgccacgt gaagcactac
acgaatccca gccaggatgt gactgtgccc 300tgcccaggtc agagggcagg ctggggagtg
gggcggggcc accccgtcgt gccctgacac 360tgcgcctgca cccgtgttcc ccacagggag
ccgccccttc actcacacca gagtggaccg 420cgggccgagc cccaggaggt ggtggtggac
aggccaggag gggcgaggcg ggggcatggg 480gaagtatgtg ctgaccagct caggccatct
ctccactcca gttccctcaa ctccacctac 540cccatctccc tcaactccac ctaccccatc
tccctcatgc tgccaccccc gactgtcact 600gcaccgaccg gccctcgagg acctgctctt
aggttcagaa gcgaacctca cgtgcacact 660gaccggcctg agagatgcct caggtgtcac
cttcacctgg acgccctcaa gtgggaagag 720cgctgttcaa ggaccacctg agcgtgacct
ctgtggctgc tacagcgtgt ccagtgtcct 780gccgggctgt gccgagccat ggaaccatgg
gaagaccttc acttgcactg ctgcctaccc 840cgagtccaag accccgctaa ccgccaccct
ctcaaaatcc ggtgggtcca gaccctgctc 900ggggccctgc tcagtgctct ggtttgcaaa
gcatattcct ggcctgcctc ctccctccca 960atcctgggct ccagtgctca tgccaagtac
agagggaaac tgaggcaggc tgaggggcca 1020ggacacagcc cagggtgccc accagagcag
aggggctctc tcatcccctg cccagccccc 1080tgacctggct ctctaccctc caggaaacac
attccggccc gaggtccacc tgctgccgcc 1140gccgtcggag gagctggccc tgaacgagct
ggtgacgctg acgtgcctgg cacgcggctt 1200cagccccaag gatgtgctgg ttcgctggct
gcaggggtca caggagctgc cccgcgagaa 1260gtacctgact tgggcatccc ggcaggagcc
cagccagggc accaccacct tcgctgtgac 1320cagcatactg cgcgtggcag ccgaggactg
gaagaagggg gacaccttct cctgcatggt 1380gggccacgag gccctgccgc tggccttcac
acagaagacc atcgaccgct tggcgggtaa 1440acccacccat gtcaatgtgt ctgttgtcat
ggcggaggtg gacggcacct gctactgagc 1500cgcccgcctg tccccacccc tgaataaact
ccatgctccc ccaagc 1546
384 1062 DNA homo sapiens;
384 gcatccccga ccagccccaa ggtcttcccg ctgagcctct gcagcaccca gccagatggg
60aacgtggtca tcgcctgcct ggtccagggc ttcttccccc aggagccact cagtgtgacc
120tggagcgaaa gcggacaggg cgtgaccgcc agaaacttcc cacccagcca ggatgcctcc
180ggggacctgt acaccacgag cagccagctg accctgccgg ccacacagtg cctagccggc
240aagtccgtga catgccacgt gaagcactac acgaatccca gccaggatgt gactgtgccc
300tgcccagttc cctcaactcc acctacccca tctccctcaa ctccacctac cccatctccc
360tcatgctgcc acccccgact gtcactgcac cgaccggccc tcgaggacct gctcttaggt
420tcagaagcga acctcacgtg cacactgacc ggcctgagag atgcctcagg tgtcaccttc
480acctggacgc cctcaagtgg gaagagcgct gttcaaggac cacctgagcg tgacctctgt
540ggctgctaca gcgtgtccag tgtcctgccg ggctgtgccg agccatggaa ccatgggaag
600accttcactt gcactgctgc ctaccccgag tccaagaccc cgctaaccgc caccctctca
660aaatccggaa acacattccg gcccgaggtc cacctgctgc cgccgccgtc ggaggagctg
720gccctgaacg agctggtgac gctgacgtgc ctggcacgcg gcttcagccc caaggatgtg
780ctggttcgct ggctgcaggg gtcacaggag ctgccccgcg agaagtacct gacttgggca
840tcccggcagg agcccagcca gggcaccacc accttcgctg tgaccagcat actgcgcgtg
900gcagccgagg actggaagaa gggggacacc ttctcctgca tggtgggcca cgaggccctg
960ccgctggcct tcacacagaa gaccatcgac cgcttggcgg gtaaacccac ccatgtcaat
1020gtgtctgttg tcatggcgga ggtggacggc acctgctact ga
1062
385 1546 DNA homo sapiens;
385 gcatccccga ccagccccaa ggtcttcccg
ctgagcctct gcagcaccca gccagatggg 60aacgtggtca tcgcctgcct ggtccagggc
ttcttccccc aggagccact cagtgtgacc 120tggagcgaaa gcggacaggg cgtgaccgcc
agaaacttcc cacccagcca ggatgcctcc 180ggggacctgt acaccacgag cagccagctg
accctgccgg ccacacagtg cctagccggc 240aagtccgtga catgccacgt gaagcactac
acgaatccca gccaggatgt gactgtgccc 300tgcccaggtc agagggcagg ctggggagtg
gggcggggcc accccgtcgt gccctgacac 360tgcgcctgca cccgtgttcc ccacagggag
ccgccccttc actcacacca gagtggaccg 420cgggccgagc cccaggaggt ggtggtggac
aggccaggag gggcgaggcg ggggcatggg 480gaagtatgtg ctgaccagct caggccatct
ctccactcca gttccctcaa ctccacctac 540cccatctccc tcaactccac ctaccccatc
tccctcatgc tgccaccccc gactgtcact 600gcaccgaccg gccctcgagg acctgctctt
aggttcagaa gcgaacctca cgtgcacact 660gaccggcctg agagatgcct caggtgtcac
cttcacctgg acgccctcaa gtgggaagag 720cgctgttcaa ggaccacctg accgtgacct
ctgtggctgc tacagcgtgt ccagtgtcct 780gccgggctgt gccgagccat ggaaccatgg
gaagaccttc acttgcactg ctgcctaccc 840cgagtccaag accccgctaa ccgccaccct
ctcaaaatcc ggtgggtcca gaccctgctc 900ggggccctgc tcagtgctct ggtttgcaaa
gcatattcct ggcctgcctc ctccctccca 960atcctgggct ccagtgctca tgccaagtac
agagggaaac tgaggcaggc tgaggggcca 1020ggacacagcc cagggtgccc accagagcag
aggggctctc tcatcccctg cccagccccc 1080tgacctggct ctctaccctc caggaaacac
attccggccc gaggtccacc tgctgccgcc 1140gccgtcggag gagctggccc tgaacgagct
ggtgacgctg acgtgcctgg cacgcggctt 1200cagccccaag gatgtgctgg ttcgctggct
gcaggggtca caggagctgc cccgcgagaa 1260gtacctgact tgggcatccc ggcaggagcc
cagccagggc accaccacct tcgctgtgac 1320cagcatactg cgcgtggcag ccgaggactg
gaagaagggg gacaccttct cctgcatggt 1380gggccacgag gccctgccgc tggccttcac
acagaagacc atcgaccgct tggcgggtaa 1440acccacccat gtcaatgtgt ctgttgtcat
ggcggaggtg gacggcacct gctactgagc 1500cgcccgcctg tccccacccc tgaataaact
ccatgctccc ccaagc 1546
386 1062 DNA homo sapiens;
386 gcatccccga ccagccccaa ggtcttcccg ctgagcctct gcagcaccca gccagatggg
60aacgtggtca tcgcctgcct ggtccagggc ttcttccccc aggagccact cagtgtgacc
120tggagcgaaa gcggacaggg cgtgaccgcc agaaacttcc cacccagcca ggatgcctcc
180ggggacctgt acaccacgag cagccagctg accctgccgg ccacacagtg cctagccggc
240aagtccgtga catgccacgt gaagcactac acgaatccca gccaggatgt gactgtgccc
300tgcccagttc cctcaactcc acctacccca tctccctcaa ctccacctac cccatctccc
360tcatgctgcc acccccgact gtcactgcac cgaccggccc tcgaggacct gctcttaggt
420tcagaagcga acctcacgtg cacactgacc ggcctgagag atgcctcagg tgtcaccttc
480acctggacgc cctcaagtgg gaagagcgct gttcaaggac cacctgaccg tgacctctgt
540ggctgctaca gcgtgtccag tgtcctgccg ggctgtgccg agccatggaa ccatgggaag
600accttcactt gcactgctgc ctaccccgag tccaagaccc cgctaaccgc caccctctca
660aaatccggaa acacattccg gcccgaggtc cacctgctgc cgccgccgtc ggaggagctg
720gccctgaacg agctggtgac gctgacgtgc ctggcacgcg gcttcagccc caaggatgtg
780ctggttcgct ggctgcaggg gtcacaggag ctgccccgcg agaagtacct gacttgggca
840tcccggcagg agcccagcca gggcaccacc accttcgctg tgaccagcat actgcgcgtg
900gcagccgagg actggaagaa gggggacacc ttctcctgca tggtgggcca cgaggccctg
960ccgctggcct tcacacagaa gaccatcgac cgcttggcgg gtaaacccac ccatgtcaat
1020gtgtctgttg tcatggcgga ggtggacggc acctgctact ga
1062
387 1507 DNA homo sapiens;
387 gcatccccga ccagccccaa ggtcttcccg
ctgagcctcg acagcacccc ccaagatggg 60aacgtggtcg tcgcatgcct ggtccagggc
ttcttccccc aggagccact cagtgtgacc 120tggagcgaaa gcggacagaa cgtgaccgcc
agaaacttcc cacctagcca ggatgcctcc 180ggggacctgt acaccacgag cagccagctg
accctgccgg ccacacagtg cccagacggc 240aagtccgtga catgccacgt gaagcactac
acgaattcca gccaggatgt gactgtgccc 300tgccgaggtc agagggcagg ctggggagtg
gggcggggcc accccgtcct gccctgacac 360tgcgcctgca cccgtgttcc ccacagggag
ccgccccttc actcacacca gagtggaccg 420cgggccgagc cccaggaggt ggtggtggac
aggccaggag gggcgaggcg ggggcacggg 480gaagggcgtt ctgaccagct caggccatct
ctccactcca gttcccccac ctcccccatg 540ctgccacccc cgactgtcgc tgcaccgacc
ggccctcgag gacctgctct taggttcaga 600agcgaacctc acgtgcacac tgaccggcct
gagagatgcc tctggtgcca ccttcacctg 660gacgccctca agtgggaaga gcgctgttca
aggaccacct gagcgtgacc tctgtggctg 720ctacagcgtg tccagtgtcc tgcctggctg
tgcccagcca tggaaccatg gggagacctt 780cacctgcact gctgcccacc ccgagttgaa
gaccccacta accgccaaca tcacaaaatc 840cggtgggtcc agaccctgct cggggccctg
ctcagtgctc tggtttgcaa agcatattcc 900cggcctgcct cctccctccc aatcctgggc
tccagtgctc atgccaagta cagagggaaa 960ctgaggcagg ctgaggggcc aggacacagc
ccagggtgcc caccagagca gaggggctct 1020ctcatcccct gcccagcccc ctgacctggc
tctctaccct ccaggaaaca cattccggcc 1080cgaggtccac ctgctgccgc cgccgtcgga
ggagctggcc ctgaacgagc tggtgacgct 1140gacgtgcctg gcacgtggct tcagccccaa
ggatgtgctg gttcgctggc tgcaggggtc 1200acaggagctg ccccgcgaga agtacctgac
ttgggcatcc cggcaggagc ccagccaggg 1260caccaccacc tacgctgtaa ccagcatact
gcgcgtggca gctgaggact ggaagaaggg 1320ggagaccttc tcctgcatgg tgggccacga
ggccctgccg ctggccttca cacagaagac 1380catcgaccgc atggcgggta aacccaccca
catcaatgtg tctgttgtca tggcggaggc 1440ggatggcacc tgctactgag ccgcccgcct
gtccccaccc ctgaataaac tccatgctcc 1500cccaagc
1507
388 1023 DNA homo sapiens;
388 gcatccccga ccagccccaa ggtcttcccg ctgagcctcg acagcacccc ccaagatggg
60aacgtggtcg tcgcatgcct ggtccagggc ttcttccccc aggagccact cagtgtgacc
120tggagcgaaa gcggacagaa cgtgaccgcc agaaacttcc cacctagcca ggatgcctcc
180ggggacctgt acaccacgag cagccagctg accctgccgg ccacacagtg cccagacggc
240aagtccgtga catgccacgt gaagcactac acgaattcca gccaggatgt gactgtgccc
300tgccgagttc ccccacctcc cccatgctgc cacccccgac tgtcgctgca ccgaccggcc
360ctcgaggacc tgctcttagg ttcagaagcg aacctcacgt gcacactgac cggcctgaga
420gatgcctctg gtgccacctt cacctggacg ccctcaagtg ggaagagcgc tgttcaagga
480ccacctgagc gtgacctctg tggctgctac agcgtgtcca gtgtcctgcc tggctgtgcc
540cagccatgga accatgggga gaccttcacc tgcactgctg cccaccccga gttgaagacc
600ccactaaccg ccaacatcac aaaatccgga aacacattcc ggcccgaggt ccacctgctg
660ccgccgccgt cggaggagct ggccctgaac gagctggtga cgctgacgtg cctggcacgt
720ggcttcagcc ccaaggatgt gctggttcgc tggctgcagg ggtcacagga gctgccccgc
780gagaagtacc tgacttgggc atcccggcag gagcccagcc agggcaccac cacctacgct
840gtaaccagca tactgcgcgt ggcagctgag gactggaaga agggggagac cttctcctgc
900atggtgggcc acgaggccct gccgctggcc ttcacacaga agaccatcga ccgcatggcg
960ggtaaaccca cccacatcaa tgtgtctgtt gtcatggcgg aggcggatgg cacctgctac
1020tga
1023
389 1507 DNA Homo sapiens;
389 gcatccccga ccagccccaa ggtcttcccg
ctgagcctcg acagcacccc ccaagatggg 60aacgtggtcg tcgcatgcct ggtccagggc
ttcttccccc aggagccact cagtgtgacc 120tggagcgaaa gcggacagaa cgtgaccgcc
agaaacttcc cacctagcca ggatgcctcc 180ggggacctgt acaccacgag cagccagctg
accctgccgg ccacacagtg cccagacggc 240aagtccgtga catgccacgt gaagcactac
acgaatccca gccaggatgt gactgtgccc 300tgcccaggtc agagggcagg ctggggagtg
gggcggggcc accccgtcct gccctgacac 360tgcgcctgca cccgtgttcc ccacagggag
ccgccccttc actcacacca gagtggaccg 420cgggccgagc cccaggaggt ggtggtggac
aggccaggag gggcgaggcg ggggcacggg 480gaagggcgtt ctgaccagct caggccatct
ctccactcca gttcccccac ctcccccatg 540ctgccacccc cgactgtcgc tgcaccgacc
ggccctcgag gacctgctct taggttcaga 600agcgaacctc acgtgcacac tgaccggcct
gagagatgcc tctggtgcca ccttcacctg 660gacgccctca agtgggaaga gcgctgttca
aggaccacct gagcgtgacc tctgtggctg 720ctacagcgtg tccagtgtcc tgcctggctg
tgcccagcca tggaaccatg gggagacctt 780cacctgcact gctgcccacc ccgagttgaa
gaccccacta accgccaaca tcacaaaatc 840cggtgggtcc agaccctgct cggggccctg
ctcagtgctc tggtttgcaa agcatattcc 900cggcctgcct cctccctccc aatcctgggc
tccagtgctc atgccaagta cagagggaaa 960ctgaggcagg ctgaggggcc aggacacagc
ccagggtgcc caccagagca gaggggctct 1020ctcatcccct gcccagcccc ctgacctggc
tctctaccct ccaggaaaca cattccggcc 1080cgaggtccac ctgctgccgc cgccgtcgga
ggagctggcc ctgaacgagc tggtgacgct 1140gacgtgcctg gcacgtggct tcagccccaa
ggatgtgctg gttcgctggc tgcaggggtc 1200acaggagctg ccccgcgaga agtacctgac
ttgggcatcc cggcaggagc ccagccaggg 1260caccaccacc ttcgctgtaa ccagcatact
gcgcgtggca gctgaggact ggaagaaggg 1320ggacaccttc tcctgcatgg tgggccacga
ggccctgccg ctggccttca cacagaagac 1380catcgaccgc atggcgggta aacccaccca
catcaatgtg tctgttgtca tggcggaggt 1440ggatggcacc tgctactgag ccgcccgcct
gtccccaccc ctgaataaac tccatgctcc 1500cccaagc
1507
390 1022 DNA Homo sapiens;
390 gcatccccga ccagccccaa ggtcttcccg ctgagcctcg acagcacccc ccaagatggg
60aacgtggtcg tcgcatgcct ggtccagggc ttcttccccc aggagccact cagtgtgacc
120tggagcgaaa gcggacagaa cgtgaccgcc agaaacttcc cacctagcca ggatgcctcc
180ggggacctgt acaccacgag cagccagctg accctgccgg ccacacagtg cccagacggc
240aagtccgtga catgccacgt gaagcactac acgaatccca gccaggatgt gactgtgccc
300tgccagttcc cccacctccc ccatgctgcc acccccgact gtcgctgcac cgaccggccc
360tcgaggacct gctcttaggt tcagaagcga acctcacgtg cacactgacc ggcctgagag
420atgcctctgg tgccaccttc acctggacgc cctcaagtgg gaagagcgct gttcaaggac
480cacctgagcg tgacctctgt ggctgctaca gcgtgtccag tgtcctgcct ggctgtgccc
540agccatggaa ccatggggag accttcacct gcactgctgc ccaccccgag ttgaagaccc
600cactaaccgc caacatcaca aaatccggaa acacattccg gcccgaggtc cacctgctgc
660cgccgccgtc ggaggagctg gccctgaacg agctggtgac gctgacgtgc ctggcacgtg
720gcttcagccc caaggatgtg ctggttcgct ggctgcaggg gtcacaggag ctgccccgcg
780agaagtacct gacttgggca tcccggcagg agcccagcca gggcaccacc accttcgctg
840taaccagcat actgcgcgtg gcagctgagg actggaagaa gggggacacc ttctcctgca
900tggtgggcca cgaggccctg ccgctggcct tcacacagaa gaccatcgac cgcatggcgg
960gtaaacccac ccacatcaat gtgtctgttg tcatggcgga ggtggatggc acctgctact
1020ga
1022
391 1507 DNA Homo sapiens;
391 gcatccccga ccagccccaa ggtcttcccg
ctgagcctcg acagcacccc ccaagatggg 60aacgtggtcg tcgcatgcct ggtccagggc
ttcttccccc aggagccact cagtgtgacc 120tggagcgaaa gcggacagaa cgtgaccgcc
agaaacttcc cacctagcca ggatgcctcc 180ggggacctgt acaccacgag cagccagctg
accctgccgg ccacacagtg cccagacggc 240aagtccgtga catgccacgt gaagcactac
acgaatccca gccaggatgt gactgtgccc 300tgcccaggtc agagggcagg ctggggagtg
gggcggggcc accccgtcct gccctgacac 360tgcgcctgca cccgtgttcc ccacagggag
ccgccccttc actcacacca gagtggaccg 420cgggccgagc cccaggaggt ggtggtggac
aggccaggag gggcgaggcg ggggcacggg 480gaagggcgtt ctgaccagct caggccatct
ctccactcca gttcccccac ctcccccatg 540ctgccacccc cgactgtcgc tgcaccgacc
ggccctcgag gacctgctct taggttcaga 600agcgaacctc acgtgcacac tgaccggcct
gagagatgcc tctggtgcca ccttcacctg 660gacgccctca agtgggaaga gcgctgttca
aggaccacct gagcgtgacc tctgtggctg 720ctacagcgtg tccagtgtcc tgcctggctg
tgcccagcca tggaaccatg gggagacctt 780cacctgcact gctgcccacc ccgagttgaa
gaccccacta accgccaaca tcacaaaatc 840cggtgggtcc agaccctgct cggggccctg
ctcagtgctc tggtttgcaa agcatattcc 900cggcctgcct cctccctccc aatcctgggc
tccagtgctc atgccaagta cagagggaaa 960ctgaggcagg ctgaggggcc aggacacagc
ccagggtgcc caccagagca gaggggctct 1020ctcatcccct gcccagcccc ctgacctggc
tctctaccct ccaggaaaca cattccggcc 1080cgaggtccac ctgctgccgc cgccgtcgga
ggagctggcc ctgaacgagc tggtgacgct 1140gacgtgcctg gcacgtggct tcagccccaa
ggatgtgctg gttcgctggc tgcaggggtc 1200acaggagctg ccccgcgaga agtacctgac
ttgggcatcc cggcaggagc ccagccaggg 1260caccaccacc tacgctgtaa ccagcatact
gcgcgtggca gctgaggact ggaagaaggg 1320ggagaccttc tcctgcatgg tgggccacga
ggccctgccg ctggccttca cacagaagac 1380catcgaccgc atggcgggta aacccaccca
catcaatgtg tctgttgtca tggcggaggc 1440ggatggcacc tgctactgag ccgcccgcct
gtccccaccc ctgaataaac tccatgctcc 1500cccaagc
1507
392 1022 DNA Homo sapiens;
392 gcatccccga ccagccccaa ggtcttcccg ctgagcctcg acagcacccc ccaagatggg
60aacgtggtcg tcgcatgcct ggtccagggc ttcttccccc aggagccact cagtgtgacc
120tggagcgaaa gcggacagaa cgtgaccgcc agaaacttcc cacctagcca ggatgcctcc
180ggggacctgt acaccacgag cagccagctg accctgccgg ccacacagtg cccagacggc
240aagtccgtga catgccacgt gaagcactac acgaatccca gccaggatgt gactgtgccc
300tgccagttcc cccacctccc ccatgctgcc acccccgact gtcgctgcac cgaccggccc
360tcgaggacct gctcttaggt tcagaagcga acctcacgtg cacactgacc ggcctgagag
420atgcctctgg tgccaccttc acctggacgc cctcaagtgg gaagagcgct gttcaaggac
480cacctgagcg tgacctctgt ggctgctaca gcgtgtccag tgtcctgcct ggctgtgccc
540agccatggaa ccatggggag accttcacct gcactgctgc ccaccccgag ttgaagaccc
600cactaaccgc caacatcaca aaatccggaa acacattccg gcccgaggtc cacctgctgc
660cgccgccgtc ggaggagctg gccctgaacg agctggtgac gctgacgtgc ctggcacgtg
720gcttcagccc caaggatgtg ctggttcgct ggctgcaggg gtcacaggag ctgccccgcg
780agaagtacct gacttgggca tcccggcagg agcccagcca gggcaccacc acctacgctg
840taaccagcat actgcgcgtg gcagctgagg actggaagaa gggggagacc ttctcctgca
900tggtgggcca cgaggccctg ccgctggcct tcacacagaa gaccatcgac cgcatggcgg
960gtaaacccac ccacatcaat gtgtctgttg tcatggcgga ggcggatggc acctgctact
1020ga
1022
393 8912 DNA Homo sapiens;
393 cacccaccaa ggctccggat gtgttcccca
tcatatcagg gtgcagacac ccaaaggata 60acagccctgt ggtcctggca tgcttgataa
ctgggtacca cccaacgtcc gtgactgtca 120cctggtacat ggggacacag agccagcccc
agagaacctt ccctgagata caaagacggg 180acagctacta catgacaagc agccagctct
ccacccccct ccagcagtgg cgccaaggcg 240agtacaaatg cgtggtccag cacaccgcca
gcaagagtaa gaaggagatc ttccgctggc 300caggtaggtc gcaccggaga tcacccagaa
gggcccccca ggacccccag caccttccac 360tcagggcctg accacaaaga cagaagcaag
ggctgggctg tgaggcaacc cccacctccc 420cctcagagca cgttcctccc ccttcaccct
gtatccaccc ctccggaccc tccccatctc 480agtccctccg ctccctctct ctgaggccca
tctcccaata cccagatcac tttccttcca 540gacccttccc tcagtgtgca cggaggcagc
ttgcccagca aaggtgactg tctagtgggc 600ttcccacagc caagctccca ccccatgctg
cggcccttcc cttcttcctg cttggctgcc 660tgtgcccccc acctgcctgt ccacaaccca
gcctctggta catccatgcc ctctgccctc 720agcctcacct gcacttttcc ttggatttca
gagtctccaa aggcacaggc ctcctcagtg 780cccactgcac aaccccaagc agagggcagc
ctcgccaagg caaccacagc cccagccacc 840acccgtaaca caggtgagaa gccccttccc
tgcacactcc acccccaccc acctgctcat 900tcctcagccg cctcctccag gcagcccttc
ataactcctt gtctgagtct ccaagtcaca 960ctttggtaag gagagggaca ctgaacggac
ctctaacaaa cacctactgc cagccagccc 1020cagtctgggg gccagcagat gccaaacagc
cagcagactc ccagagcaga cctgggccgg 1080ctccctggcc catggaccca gctctgcctc
gctgagctga ggcatgggct ctcagcgcag 1140cctcacatag agccaccctg ccgaggcagt
ccggcttgca gactcacagg tcacttgggc 1200cgcagcagcc cctccccgtg accctcgcct
cccgcccgcc ccagcctggc tctctccaag 1260tgttggatct tggtggccag cctgcttctc
accctcaccc tgcctgccac ctcagaatgg 1320caggggaaag agggccctca ccaagaactt
tatctgagga gtctgaggct tgtgactctg 1380acctgcctga gatgtccatg tggccggggg
gacgggttca gtgttcggga gaactcgggt 1440acgtgcctga ctttctctga gtagggcagg
aagctgttag gagaagcagc agtgaggtgg 1500gctggaccaa caggcagaat gactgtccct
cagccaccct ctgggatgtg ggtcaagctc 1560tgacaaaggc atggcacagc catggtggcc
cctgcttgga tgagtggcca cggtgccctc 1620accctgggcc agaatctgcc tccactctgc
aggtgcagaa acacgacatt cccgtctcta 1680aacacaccta gctcctaggc ttggggtggg
cctatcaaat gcagggagat ggacacagca 1740caagggccag agcttcccat gagaaaggtg
agggcagctg ctccctgacc cgggcatctg 1800cacttgtccc tctccaccct cctcatgggc
agtggagact cagcaacaaa acaagttgag 1860tgcattagca gccagctctg gagccaagtc
actcacccca cggccttggc tgctggtgga 1920ggggccttcc cctgggcagc ctccaagaag
acggccaagt gctcttactc agaccacggc 1980gctgcttcct ggcacctcga tttcccacaa
caacatgggg tgcagacagg ctagggcccc 2040ctgccctggg gcctggacgg catccagtta
aagatgaccc ttcacgggcg gtgcctgagg 2100tgtgctgacc tcagcagcta agccctcagg
tctggtctgc actgccccac ctggaggacc 2160caactgaccc agacacagcc agggttatgg
catgaccccg tggacggtga cccacaggcc 2220agatgcagcc aggggctgtt ttgtgtggcc
tagaaatgtc tttacagttg tagtgggatg 2280gaggaggaag aggaagagag gaggggagag
aaaagcaggg aaggggaaaa agaggagttc 2340aatgcaaccc caaaagccag aacagttttg
agctgaaaga acaaggcagg aaacatccca 2400gtacctgact tcaaaacata ctataaagca
gttgtaatca aaacaggatc ataaaaacag 2460acacacagac ccatggaaca gaaaagcgag
cccagaaata aatctacatg cttgcagtcc 2520attgattttc aacaaaggca ccaggaaaac
acaatgggga gaggacagtt tcctcaataa 2580atagtgctgg ggaaactgga tatccatgtg
cagactaatg aaactacaca aaaatcaatt 2640gaaaacagtc taggccaggc gcggtggctc
atgccggtaa tcccagcact ttgggaggcc 2700gagacaggcg gatcacctga ggtcaggagt
tcgagaccag cttggccaac atggcgaaac 2760ccggtctcca ctaaaaatac aaaaattagc
acatggtggc ctacgtctgt tatcccagct 2820tttcaggagg ctgaggcagg agaatcgctt
gaatccggga ggtgaaggtt gcagggagcc 2880aagattgcgc cactgcattc cagcctgggc
aatggagcga gactgtctca aaaaaaaaaa 2940aaaaaagaaa agaaaacagt ctaaaggttt
aactgaacag ataaagctac tagaagaaaa 3000cataggggga aaactccatg acattagtct
gagcaacgat ttttggatat gatcccaaaa 3060gctcaggcag cactagtcac aaaagccaag
atacagaacc aacctaagca cccctcagca 3120gatgcacagg taaagaaaat gtggtacgta
tggggcacaa tggaatacga ttcagccttt 3180aaaaacagtg aaattctgtc attggcaaca
atgtagatga acctgaagga cacttatgct 3240aagtgaaata agccaggcac agaaggagca
atactgcatg attgcactta catctggcag 3300gttaaaaagg caaactctta gaggcagaca
gtagagaggt ggtgccaggg agcgggcact 3360ggtggctggg gagatgttgg tcaaagggca
caaaactgca gttgggagga attagttcag 3420gacatccctt gtacatgggg acagtggtta
gtaacaacgg attgtatcct tgaaaaccgc 3480taagaaaata gtttttaagt gttcttgaca
caaaaagtga cacgtatgtg agatactgca 3540tggtcattag ctggatttag ccattccaca
atgtacacat atttcaaaca ttgtgttgta 3600tatgataaac atgtataatt tttgtcaatt
aaaaattttt aggaagagga ggagaagaga 3660agaagaagga gaaggagaaa gaggaacaag
aagagagaga gacaaagaca ccaggttttt 3720tctgacccct gggctatcaa aacacctatt
gcccaataac tagttggccg ttggtgccct 3780aaactattga agcgattgct gttatgtgga
tgggccccgg acacttagaa actcgtgacc 3840cctgaggacc cccacgagga cagtcagggt
ccccccgaac tcagggagca ctgaggaagg 3900agctcttaga ggcgtggggc ccctcaggcc
cctcagaggg ctctgccaca tgggtcaggg 3960gcaggctgag ggggagtccc aggctccatg
cccagcctct gtgcctctga ccagggtgtc 4020ccccacaccg cctcctcccc agtgccctcc
actggccaca cctggccaga agctggggag 4080aggagagcac agtggttaag tcagtccctg
cagggagacg gcaccagaaa aacctggcct 4140gtggatgagt cccggcctgg cagccacaga
gcagagagct ctagaagcaa cgaaggcccg 4200agtctgctca gggaagagcg ggcagcagcc
ccagggccgg acagtgacca agagtggcac 4260cgcccatggc tcaacgggtc tttgcccaca
gatcccccag cccctggaga cagggtctgt 4320gtgcctggcc gtgcaggcag gcaccacact
cagggggagg ccactgtgga gctctgtgca 4380gagccccggg cgggagccta ctgctcccga
aggtccggcc acagctgctc tcgtttgctc 4440tcccctgcag agtgtccgag ccacacccag
cctcttggcg tctacctgct aacccctgca 4500gtgcaggacc tgtggctccg ggacaaagcc
accttcacct gcttcgtggt gggcagtgac 4560ctgaaggatg ctcacctgac ctgggaggtg
gccgggaagg tccccacagg gggcgtggag 4620gaagggctgc tggagcggca cagcaacggc
tcccagagcc agcacagccg tctgaccctg 4680cccaggtcct tgtggaacgc ggggacctcc
gtcacctgca cactgaacca tcccagcctc 4740ccaccccaga ggttgatggc gctgagagaa
cccggtgagc ctggctccca ggtggggaga 4800cgagggtgcc cacagcctgc tgacccctac
gcctgcccca gggccatgac cccagctggg 4860ccccagcagc accggtcatc ctccacagga
aaggagaagg gaggcaccag caccctggcc 4920ggccccactt ctctcccagt gcccccgtgg
ccagaggctg acagcctccc ccacctcccc 4980gcagctgcgc aggcacccgt caagctttcc
ctgaacctgc tggcctcgtc tgaccctccc 5040gaggcggcct cgtggctcct gtgtgaggtg
tctggcttct cgccccccaa catcctcctg 5100atgtggctgg aggaccagcg tgaggtgaac
acttctgggt ttgcccccgc acgcccccct 5160ccacagcccg ggagcaccac gttctgggcc
tggagtgtgc tgcgtgtccc agccccgccc 5220agccctcagc cagccaccta cacgtgtgtg
gtcagccacg aggactcccg gactctgctc 5280aacgccagcc ggagcctaga agtcagctgt
gagtcacccc caggcccagg gttgggacgg 5340ggactctgag gggggccata aggagctgga
atccatacta ggcaggggtg ggcactgggc 5400aggggcgggg ctaggctgtc ctgggcacac
aggccccttc tcggtgtccg gcaggagcac 5460agacttccca gtactcctgg gccatggatg
tcccagcgtc catccttgct gtccacacca 5520cgtgctggcc caggctggct ggcacagtgt
aagaggtgga tacaacccct cgccgtgccc 5580tgaggagtgg cggtttcctc ccaagacatt
ccccacggct gggtgctggg cacaggcctt 5640ccctggtgtg accgtgaatg tggtcaccct
gaacagctgc cctctctggg gacatctgac 5700tgtccaagac cacagtcagc acctctggga
gccagagggg tctccagaga cccccagatg 5760tcaggcttgg gctcagtgcc cagcgaaagg
tcagccccac acatgcccat aatgggcgcc 5820cacccagagt gacagccccc agcctcctgc
caggcccacc cttttccgcc cccttgaggc 5880atggcacaca gaccagtgcg cccactgccc
gagcatggcc ccagtgggat gtggtggcca 5940cgaggggctg tacacacagc aggaggctgt
ccgccctgct cagggcctgc tgcctatgcc 6000ccagctgtcc aaccaaggga ggcatggaag
ggcccctggt gtaagctgga gccaggcacc 6060caggcccccg gccaccctgc agagccaagg
aaaggaagac acccaagtca acaaggggca 6120gggctgaggg ctgtcccagg ctcttttggc
ccgaggggct gccagcagcc ctgacccggc 6180atgggccttc cccaaaagcg accctgtgag
gtggcctcac agagaacccc ctctgaggac 6240agtgtctgac cctgcctgcc tcacacagat
gggccccaca gcagtgggca acctgggggg 6300cagcagccca acctgaccct gcagggactg
ccccctgcag cagcagctgc ttctcagtcc 6360cccaacctcc ctgtccccgc cagagggtct
tccccgaagc tgcagcccca acccatggct 6420gcccacctgg aaccgggact ccctgtccac
tgccccctcc ccttcggggc cccatctgtg 6480ctggggccca ggttcggcct acagattccc
atcattgcca tggcctcctg accttgccta 6540tccaccccca accaccggct ccatgctgac
cctcccccag gctcccacgc ccagctggcc 6600ggccatcccc aggcacagac agtctgggat
ctcacaggtt agcctggacc atccacctgg 6660ccagacctgg gagaggctgg aagctgccct
gccaccatgc tccagggccc caggttgcag 6720tactatgggg tgagggtgtg tgtgcacacc
tgtgtgtacc taggatatcc gagtgtaccc 6780ttgtgccccc aagcacaagt ctccctccca
ggcagtgagg cccagatggt gcagtggtta 6840gagctgaggc ttatcccaca gagaaccctg
gcgccttggt caaggaagcc cctatgcctt 6900tcttgcctcg atttcccctc ttgtctgctg
agccagcagg ggccacgtcc tgggctgctg 6960tgaggaggaa gcaagttggt gctaggaggg
gctcctgtgt gtgcatgggc gggaggggtg 7020caggtatctg agcaccccgg tctccacttg
agagagcagg gcaggagctc cctgacccac 7080ccagactaca cacgctgtgt ccacgtgtct
cccattatct gtggcagagg atccggcttc 7140tttctcaatt tccagttctt cacaaagcaa
tgcctttgta aaatgcaata agaaatacta 7200gaaaaatgat atgaacagaa agacacgccg
attttttgtt attagatgta acagaccatg 7260gccccatgaa atgatcccgg accagatccg
tccacacccg ccactcagca gctctggccg 7320agctcacagt acaaccacaa taaactcttg
ttgaatgaac tctaggaagt ctgtgacgtg 7380gctggttctt gtcaatgctt cctgcctgcc
cacaggctct tcctcgtgga tggggctgtg 7440cttgccacgg aagcgcgttt ttcccggcct
aggcttgcct tgggccccac tgccgtctcc 7500agctggagat gaccttctat acacacattt
gctcatgaca gacccttgct tagccccctt 7560ccatggctcc ctcctgctgc tgggataaaa
tcaccttgcc tggatatccc ctcctgggcc 7620cctttccacc ctccttagtc agcaccccca
gttcagggca cctgctttcc ccgctgcgga 7680gaagccactc tctccttgct gcccggctgt
gtcttgcctt ccacaccttg tcacagtggc 7740cacttcctaa ggaaggcctc cctgtgtgca
ggtgtgcaga agtgccccag cctcccgtca 7800cctttgtcac gggagcccaa tccatgagag
tctatggttc tgtctgtctg ccccactcag 7860ggcagcgaca agtccaggcg gggaggacac
agtaggcaga gatttgtcga ggggacatat 7920gagcaagagg gtgaggctgg gagctccctg
gagataacca cgcctcctgg gaagactcgc 7980cgtcatttca gctccacgct gtgcgggggt
gggtggaggg gtagcctggc cctcatgacc 8040agggagcttc tcactcagcc cctgttcctc
cccagacctg gccatgaccc ccctgatccc 8100tcagagcaag gatgagaaca gcgatgacta
cacgaccttt gatgatgtgg gcagcctgtg 8160gaccgccctg tccacgtttg tggccctctt
catcctcacc ctcctctaca gcggcattgt 8220cactttcatc aaggtcaggg gagcggccag
gctctcagtg accctcgggg tgggtgtggg 8280gcaaggtgcc cttccagggg acatgccaga
gctggtccag ggatcctgga ccaggcagag 8340gcagggctga gggagcctgg aggacatgca
ggccctctgt ggcctgtgga cactgtcgaa 8400ggccctcttg accctgtgga taaaggacaa
caccccctcc cctgctcctc tgtctcccct 8460gcccctccac ccctcaggct tctagccccc
tgtctgaccc caggggctgt ctttcaggtg 8520aagtagcccc agaagagcag gacgccctgt
acctgcagag aagggaagca gcctctgtac 8580ctcatctgtg gctaccagag agcagaaagg
acccaccctg gactcttctg tgtgcaggaa 8640gatgcgccag cccctgcccc cggctcccct
ctgtccgcca cagaatccag tcttctagac 8700cagggggacg ggcacccatc actccgcagg
cgaatcagag cccccctgcc ccggccctaa 8760cccctgtgcc tccttcccgt gcttccccca
gagccagcta cacccctgcc ccggccctaa 8820cccccatgcc tccttcctgt gcttccccca
gagccagcta gtcccacctg cagcccgctg 8880gcctccccat aaacacgctt tggttcattt
ca 8912
394 1293 DNA Homo sapiens; misc_feature (1)..(1) n is a, c, g, or t
394 ncacccacca aggctccgga
tgtgttcccc atcatatcag ggtgcagaca cccaaaggat 60aacagccctg tggtcctggc
atgcttgata actgggtacc acccaacgtc cgtgactgtc 120acctggtaca tggggacaca
gagccagccc cagagaacct tccctgagat acaaagacgg 180gacagctact acatgacaag
cagccagctc tccacccccc tccagcagtg gcgccaaggc 240gagtacaaat gcgtggtcca
gcacaccgcc agcaagagta agaaggagat cttccgctgg 300ccagagtctc caaaggcaca
ggcctcctca gtgcccactg cacaacccca agcagagggc 360agcctcgcca aggcaaccac
agccccagcc accacccgta acacaggaag aggaggagaa 420gagaagaaga aggagaagga
gaaagaggaa caagaagaga gagagacaaa gacaccagag 480tgtccgagcc acacccagcc
tcttggcgtc tacctgctaa cccctgcagt gcaggacctg 540tggctccggg acaaagccac
cttcacctgc ttcgtggtgg gcagtgacct gaaggatgct 600cacctgacct gggaggtggc
cgggaaggtc cccacagggg gcgtggagga agggctgctg 660gagcggcaca gcaacggctc
ccagagccag cacagccgtc tgaccctgcc caggtccttg 720tggaacgcgg ggacctccgt
cacctgcaca ctgaaccatc ccagcctccc accccagagg 780ttgatggcgc tgagagaacc
cgctgcgcag gcacccgtca agctttccct gaacctgctg 840gcctcgtctg accctcccga
ggcggcctcg tggctcctgt gtgaggtgtc tggcttctcg 900ccccccaaca tcctcctgat
gtggctggag gaccagcgtg aggtgaacac ttctgggttt 960gcccccgcac gcccccctcc
acagcccggg agcaccacgt tctgggcctg gagtgtgctg 1020cgtgtcccag ccccgcccag
ccctcagcca gccacctaca cgtgtgtggt cagccacgag 1080gactcccgga ctctgctcaa
cgccagccgg agcctagaag tcagctacct ggccatgacc 1140cccctgatcc ctcagagcaa
ggatgagaac agcgatgact acacgacctt tgatgatgtg 1200ggcagcctgt ggaccgccct
gtccacgttt gtggccctct tcatcctcac cctcctctac 1260agcggcattg tcactttcat
caaggtgaag tag 1293
395 3842 DNA Homo sapiens;
395 gcctccacac agagcccatc cgtcttcccc ttgacccgct gctgcaaaaa cattccctcc
60aatgccacct ccgtgactct gggctgcctg gccacgggct acttcccgga gccggtgatg
120gtgacctggg acacaggctc cctcaacggg acaactatga ccttaccagc caccaccctc
180acgctctctg gtcactatgc caccatcagc ttgctgaccg tctcgggtgc gtgggccaag
240cagatgttca cctgccgtgt ggcacacact ccatcgtcca cagactgggt cgacaacaaa
300accttcagcg gtaagagagg gccaagctca gagaccacag ttcccaggag tgccaggctg
360agggctggca gagtgggcag gggttgaggg ggtgggtggg ctcaaacgtg ggaacaccca
420gcatgcctgg ggacccgggc caggacgcgg gggcaagagg agggcacaca gagctcagag
480aggccaacaa ccctcatgac caccagctct cccccagtct gctccaggga cttcaccccg
540cccaccgtga agatcttaca gtcgtcctgc gacggcggcg ggcacttccc cccgaccatc
600cagctcctgt gcctcgtctc tgggtacacc ccagggacta tcaacatcac ctggctggag
660gacgggcagg tcatggacgt ggacttgtcc accgcctcta ccacgcagga gggtgagctg
720gcctccacac aaagcgagct caccctcagc cagaagcact ggctgtcaga ccgcacctac
780acctgccagg tcacctatca aggtcacacc tttgaggaca gcaccaagaa gtgtgcaggt
840acgttcccac ctgccctggt ggccgccacg gaggccagag aagaggggcg ggtgggcctc
900acacagccct ccggtgtacc acagattcca acccgagagg ggtgagcgcc tacctaagcc
960ggcccagccc gttcgacctg ttcatccgca agtcgcccac gatcacctgt ctggtggtgg
1020acctggcacc cagcaagggg accgtgaacc tgacctggtc ccgggccagt gggaagcctg
1080tgaaccactc caccagaaag gaggagaagc agcgcaatgg cacgttaacc gtcacgtcca
1140ccctgccggt gggcacccga gactggatcg agggggagac ctaccagtgc agggtgaccc
1200acccccacct gcccagggcc ctcatgcggt ccacgaccaa gaccagcggt gagccatggg
1260caggccgggg tcgtggggga agggagggag cgagtgagcg gggcccgggc tgaccccacg
1320tctggccaca ggcccgcgtg ctgccccgga agtctatgcg tttgcgacgc cggagtggcc
1380ggggagccgg gacaagcgca ccctcgcctg cctgatccag aacttcatgc ctgaggacat
1440ctcggtgcag tggctgcaca acgaggtgca gctcccggac gcccggcaca gcacgacgca
1500gccccgcaag accaagggct ccggcttctt cgtcttcagc cgcctggagg tgaccagggc
1560cgaatgggag cagaaagatg agttcatctg ccgtgcagtc catgaggcag caagcccctc
1620acagaccgtc cagcgagcgg tgtctgtaaa tcccggtaaa tgacgtactc ctgcctccct
1680ccctcccagg gctccatcca gctgtgcagt ggggaggact ggccagacct tctgtccact
1740gttgcaatga ccccaggaag ctacccccaa taaactgtgc ctgctcagag ccccaggtac
1800acccattctt gggagcgggc agggctgtgg gcaggtgcat cttggcacag aggaatgggc
1860cccccaggag gggcagtggg aggaggtggg cagggctgag tccccccagg agaggcggtg
1920ggaggaggtg ggcagggctg aggtgccact catccatctg ccttcgtgtc agggttattt
1980gtcaaacagc atatctgcag ggactcatca cagctacccc gggccctctc tgcccccact
2040ctgggtctac cccctccaag gagtccaaag acccagggga ggtcctcagg gaaggggcaa
2100gggagccccc acagccctct ctcttggggg cttggcttct acccccctgg acaggagccc
2160ctgcaccccc aggtatagat gggcacacag gcccctccag gtggaaaaac agccctaagt
2220gaaaccccca cacagacaca cacgacccga cagccctcgc ccaagtctgt gccactggcg
2280ttcgcctctc tgccctgtcc cgccttgccg agtcctggcc ccagcaccgg ggccggtgga
2340gccgagccca ctcacacccc gcagcctccg ccaccctgcc ctgtgggcac accaggccca
2400ggtcagcagc caggccccct ctcctactgc cccccaccgc cccttggtcc atcctgaatc
2460ggcccccagg ggatcgccag cctcacacac ccagtctcgc ccactcacgc ctcactcaag
2520gcacagctgt gcacacacta ggccccatag caactccaca gcaccctgta ccaccaccag
2580ggcgccatag acaccccaca cgtggtcaca cgtggcccac actccgcctc tcacgctgcc
2640tccagcgagg ctactgccaa gcccttcctc tgagccatac ctgggccgct ggatcccaga
2700gagaaatgga gaggccctca cgtggtgtcc tccagtccaa ccctccctgt caccctgtca
2760gcagcagcac cccacagcca aacacaggat ggatgcgtgg gctccatccc ccactcaccc
2820acaccggaac cccagagcag gctacgtgcc cctcacagac ctcaaaccca catgtgcatc
2880tgacacccca gatccaaacg ctccccccgg tcatgcacac caagggcaca gcacccacca
2940aatccacacg gaaacacggg caccgggcac cccatgagca caaagcccct ccatgtctga
3000agacagtccc tgcacaccgt cacagccata cattcagctt cactctcacg tcccagccca
3060cctgcaccca gctctgggcc tggagcagca gaaagaggtg tgagggcccg aggcgggacc
3120tgcacctgct gatgacccgg gaccagcagg cagctcacgg tgttggggaa gggagtggag
3180ggcacccagg gcaggagcca gagggaccag gctggtgggc ggggccgggc cggggtaggg
3240ccaggaggca gctctggaca cccacaggcc tgggctcata gtccacacca ggacagcccc
3300tcagagcacc catgcagtga gtcccaggtc ttgggagcca ggccgcagag ctcacgcatc
3360cttccgaggg ccctgagtga ggcggccact gctgtgccga ggggttgggt ccttctctgg
3420ggagggcgtg gggtctagag aggcggagtg gaggtaacca gaggtcagga gagaagccgt
3480aaggaacaga gggaaaatgg ggccagagtc ggggcgcagg gacgagaggt caggagtggt
3540cggcctggct ctgggccgtt gactgactcg ggacctgggt gcccaccctc agggctggct
3600ggcggctccg cgcagtccca gagggccccg gatagggtgc tctgccactc cggacagcag
3660cagggactgc cgagagcagc aggaggctct gtcccccacc cccgctgcca ctgtggagcc
3720gggagggctg actggccagg tcccccagag ctggacgtgt gcgtggagga ggccgagggc
3780gaggcgccgt ggacgtggac cggcctctgc atcttcgccg cactcttcct gctcagcgtg
3840ag
3842
396 1287 DNA Homo sapiens;
396 gcctccacac agagcccatc cgtcttcccc
ttgacccgct gctgcaaaaa cattccctcc 60aatgccacct ccgtgactct gggctgcctg
gccacgggct acttcccgga gccggtgatg 120gtgacctggg acacaggctc cctcaacggg
acaactatga ccttaccagc caccaccctc 180acgctctctg gtcactatgc caccatcagc
ttgctgaccg tctcgggtgc gtgggccaag 240cagatgttca cctgccgtgt ggcacacact
ccatcgtcca cagactgggt cgacaacaaa 300accttcagcg tctgctccag ggacttcacc
ccgcccaccg tgaagatctt acagtcgtcc 360tgcgacggcg gcgggcactt ccccccgacc
atccagctcc tgtgcctcgt ctctgggtac 420accccaggga ctatcaacat cacctggctg
gaggacgggc aggtcatgga cgtggacttg 480tccaccgcct ctaccacgca ggagggtgag
ctggcctcca cacaaagcga gctcaccctc 540agccagaagc actggctgtc agaccgcacc
tacacctgcc aggtcaccta tcaaggtcac 600acctttgagg acagcaccaa gaagtgtgca
gattccaacc cgagaggggt gagcgcctac 660ctaagccggc ccagcccgtt cgacctgttc
atccgcaagt cgcccacgat cacctgtctg 720gtggtggacc tggcacccag caaggggacc
gtgaacctga cctggtcccg ggccagtggg 780aagcctgtga accactccac cagaaaggag
gagaagcagc gcaatggcac gttaaccgtc 840acgtccaccc tgccggtggg cacccgagac
tggatcgagg gggagaccta ccagtgcagg 900gtgacccacc cccacctgcc cagggccctc
atgcggtcca cgaccaagac cagcggcccg 960cgtgctgccc cggaagtcta tgcgtttgcg
acgccggagt ggccggggag ccgggacaag 1020cgcaccctcg cctgcctgat ccagaacttc
atgcctgagg acatctcggt gcagtggctg 1080cacaacgagg tgcagctccc ggacgcccgg
cacagcacga cgcagccccg caagaccaag 1140ggctccggct tcttcgtctt cagccgcctg
gaggtgacca gggccgaatg ggagcagaaa 1200gatgagttca tctgccgtgc agtccatgag
gcagcaagcc cctcacagac cgtccagcga 1260gcggtgtctg taaatcccgg taaatga
1287
397 30 DNA Homo sapiens;
397 tccggcttct
tcgtcttcag ccgcctggag
30
398 228 DNA Homo sapiens;
398 tccggcttct tcgtcttcag ccgcctggag gtgaccaggg
ccgaatggga gcagaaagat 60gagttcatct gccgtgcagt ccatgaggca gcaagcccct
cacagaccgt ccagcgagcg 120gtgtctgtaa atcccgagct ggacgtgtgc gtggaggagg
ccgagggcga ggcgccgtgg 180acgtggaccg gcctctgcat cttcgccgca ctcttcctgc
tcagcgtg 228
399 1975 DNA Homo sapiens;
399 gggagtgcat
ccgccccaac ccttttcccc ctcgtctcct gtgagaattc cccgtcggat 60acgagcagcg
tggccgttgg ctgcctcgca caggacttcc ttcccgactc catcactttc 120tcctggaaat
acaagaacaa ctctgacatc agcagcaccc ggggcttccc atcagtcctg 180agagggggca
agtacgcagc cacctcacag gtgctgctgc cttccaagga cgtcatgcag 240ggcacagacg
aacacgtggt gtgcaaagtc cagcacccca acggcaacaa agaaaagaac 300gtgcctcttc
caggtgaggg ccgggcccag ccaccgggac agagagggag ccgaaggggg 360cgggagtggc
gggcaccggg ctgacacgtg tccctcactg cagtgattgc cgagctgcct 420cccaaagtga
gcgtcttcgt cccaccccgc gacggcttct tcggcaaccc ccgcaagtcc 480aagctcatct
gccaggccac gggtttcagt ccccggcaga ttcaggtgtc ctggctgcgc 540gaggggaagc
aggtggggtc tggcgtcacc acggaccagg tgcaggctga ggccaaagag 600tctgggccca
cgacctacaa ggtgaccagc acactgacca tcaaagagag cgactggctc 660agccagagca
tgttcacctg ccgcgtggat cacaggggcc tgaccttcca gcagaatgcg 720tcctccatgt
gtggccccgg tgagtgacct gtccccaggg gcagcaccca ccgacacaca 780ggggtccact
cgggtctggc attcgccacc ccggatgcag ccatctactc cctgagcctt 840ggcttcccag
agcggccaag ggcaggggct cgggcggcag gacccctggg ctcggcagag 900gcagttgcta
ctctttgggt gggaaccatg cctccgccca catccacacc tgccccacct 960ctgactccct
tctcttgact ccagatcaag acacagccat ccgggtcttc gccatccccc 1020catcctttgc
cagcatcttc ctcaccaagt ccaccaagtt gacctgcctg gtcacagacc 1080tgaccaccta
tgacagcgtg accatctcct ggacccgcca gaatggcgaa gctgtgaaaa 1140cccacaccaa
catctccgag agccacccca atgccacttt cagcgccgtg ggtgaggcca 1200gcatctgcga
ggatgactgg aattccgggg agaggttcac gtgcaccgtg acccacacag 1260acctgccctc
gccactgaag cagaccatct cccggcccaa gggtaggccc cactcttgcc 1320cctcttcctg
cactccctgg gacctccctt ggcctctggg gcatggtgga aagcacccct 1380cactcccccg
ttgtctgggc aactggggaa aaggggactc aaccccagcc cacaggctgg 1440tccccccact
gccccgccct caccaccatc tctgttcaca ggggtggccc tgcacaggcc 1500cgatgtctac
ttgctgccac cagcccggga gcagctgaac ctgcgggagt cggccaccat 1560cacgtgcctg
gtgacgggct tctctcccgc ggacgtcttc gtgcagtgga tgcagagggg 1620gcagcccttg
tccccggaga agtatgtgac cagcgcccca atgcctgagc cccaggcccc 1680aggccggtac
ttcgcccaca gcatcctgac cgtgtccgaa gaggaatgga acacggggga 1740gacctacacc
tgcgtggtgg cccatgaggc cctgcccaac agggtcaccg agaggaccgt 1800ggacaagtcc
accggtaaac ccaccctgta caacgtgtcc ctggtcatgt ccgacacagc 1860tggcacctgc
tactgaccct gctggcctgc ccacaggctc ggggcggctg gccgctctgt 1920gtgtgcatgc
aaactaaccg tgtcaacggg gtgagatgtt gcatcttata aaatt
1975
400 1362 DNA Homo sapiens;
400 gggagtgcat ccgccccaac ccttttcccc
ctcgtctcct gtgagaattc cccgtcggat 60acgagcagcg tggccgttgg ctgcctcgca
caggacttcc ttcccgactc catcactttc 120tcctggaaat acaagaacaa ctctgacatc
agcagcaccc ggggcttccc atcagtcctg 180agagggggca agtacgcagc cacctcacag
gtgctgctgc cttccaagga cgtcatgcag 240ggcacagacg aacacgtggt gtgcaaagtc
cagcacccca acggcaacaa agaaaagaac 300gtgcctcttc cagtgattgc cgagctgcct
cccaaagtga gcgtcttcgt cccaccccgc 360gacggcttct tcggcaaccc ccgcaagtcc
aagctcatct gccaggccac gggtttcagt 420ccccggcaga ttcaggtgtc ctggctgcgc
gaggggaagc aggtggggtc tggcgtcacc 480acggaccagg tgcaggctga ggccaaagag
tctgggccca cgacctacaa ggtgaccagc 540acactgacca tcaaagagag cgactggctc
agccagagca tgttcacctg ccgcgtggat 600cacaggggcc tgaccttcca gcagaatgcg
tcctccatgt gtggccccga tcaagacaca 660gccatccggg tcttcgccat ccccccatcc
tttgccagca tcttcctcac caagtccacc 720aagttgacct gcctggtcac agacctgacc
acctatgaca gcgtgaccat ctcctggacc 780cgccagaatg gcgaagctgt gaaaacccac
accaacatct ccgagagcca ccccaatgcc 840actttcagcg ccgtgggtga ggccagcatc
tgcgaggatg actggaattc cggggagagg 900ttcacgtgca ccgtgaccca cacagacctg
ccctcgccac tgaagcagac catctcccgg 960cccaaggggg tggccctgca caggcccgat
gtctacttgc tgccaccagc ccgggagcag 1020ctgaacctgc gggagtcggc caccatcacg
tgcctggtga cgggcttctc tcccgcggac 1080gtcttcgtgc agtggatgca gagggggcag
cccttgtccc cggagaagta tgtgaccagc 1140gccccaatgc ctgagcccca ggccccaggc
cggtacttcg cccacagcat cctgaccgtg 1200tccgaagagg aatggaacac gggggagacc
tacacctgcg tggtggccca tgaggccctg 1260cccaacaggg tcaccgagag gaccgtggac
aagtccaccg gtaaacccac cctgtacaac 1320gtgtccctgg tcatgtccga cacagctggc
acctgctact ga 1362
401 1975 DNA Homo sapiens;
401 gggagtgcat ccgccccaac ccttttcccc ctcgtctcct gtgagaattc cccgtcggat
60acgagcagcg tggccgttgg ctgcctcgca caggacttcc ttcccgactc catcactttc
120tcctggaaat acaagaacaa ctctgacatc agcagcaccc ggggcttccc atcagtcctg
180agagggggca agtacgcagc cacctcacag gtgctgctgc cttccaagga cgtcatgcag
240ggcacagacg aacacgtggt gtgcaaagtc cagcacccca acggcaacaa agaaaagaac
300gtgcctcttc caggtgaggg ccgggcccag ccaccgggac agagagggag ccgaaggggg
360cgggagtggc gggcaccggg ctgacacgtg tccctcactg cagtgattgc cgagctgcct
420cccaaagtga gcgtcttcgt cccaccccgc gacggcttct tcggcaaccc ccgcaagtcc
480aagctcatct gccaggccac gggtttcagt ccccggcaga ttcaggtgtc ctggctgcgc
540gaggggaagc aggtggggtc tggcgtcacc acggaccagg tgcaggctga ggccaaagag
600tctgggccca cgacctacaa ggtgaccagc acactgacca tcaaagagag cgactggctc
660agccagagca tgttcacctg ccgcgtggat cacaggggcc tgaccttcca gcagaatgcg
720tcctccatgt gtgtccccgg tgagtgacct gtccccaggg gcagcaccca ccgacacaca
780ggggtccact cgggtctggc attcgccacc ccggatgcag ccatctactc cctgagcctt
840ggcttcccag agcggccaag ggcaggggct cgggcggcag gacccctggg ctcggcagag
900gcagttgcta ctctttgggt gggaaccatg cctccgccca catccacacc tgccccacct
960ctgactccct tctcttgact ccagatcaag acacagccat ccgggtcttc gccatccccc
1020catcctttgc cagcatcttc ctcaccaagt ccaccaagtt gacctgcctg gtcacagacc
1080tgaccaccta tgacagcgtg accatctcct ggacccgcca gaatggcgaa gctgtgaaaa
1140cccacaccaa catctccgag agccacccca atgccacttt cagcgccgtg ggtgaggcca
1200gcatctgcga ggatgactgg aattccgggg agaggttcac gtgcaccgtg acccacacag
1260acctgccctc gccactgaag cagaccatct cccggcccaa gggtaggccc cactcttgcc
1320cctcttcctg cactccctgg gacctccctt ggcctctggg gcatggtgga aagcacccct
1380cactcccccg ttgtctgggc aactggggaa aaggggactc aaccccagcc cacaggctgg
1440tccccccact gccccgccct caccaccatc tctgttcaca ggggtggccc tgcacaggcc
1500cgatgtctac ttgctgccac cagcccggga gcagctgaac ctgcgggagt cggccaccat
1560cacgtgcctg gtgacgggct tctctcccgc ggacgtcttc gtgcagtgga tgcagagggg
1620gcagcccttg tccccggaga agtatgtgac cagcgcccca atgcctgagc cccaggcccc
1680aggccggtac ttcgcccaca gcatcctgac cgtgtccgaa gaggaatgga acacggggga
1740gacctacacc tgcgtggtgg cccatgaggc cctgcccaac agggtcaccg agaggaccgt
1800ggacaagtcc accggtaaac ccaccctgta caacgtgtcc ctggtcatgt ccgacacagc
1860tggcacctgc tactgaccct gctggcctgc ccacaggctc ggggcggctg gccgctctgt
1920gtgtgcatgc aaactaaccg tgtcaacggg gtgagatgtt gcatcttata aaatt
1975
402 1362 DNA Homo sapiens;
402 gggagtgcat ccgccccaac ccttttcccc
ctcgtctcct gtgagaattc cccgtcggat 60acgagcagcg tggccgttgg ctgcctcgca
caggacttcc ttcccgactc catcactttc 120tcctggaaat acaagaacaa ctctgacatc
agcagcaccc ggggcttccc atcagtcctg 180agagggggca agtacgcagc cacctcacag
gtgctgctgc cttccaagga cgtcatgcag 240ggcacagacg aacacgtggt gtgcaaagtc
cagcacccca acggcaacaa agaaaagaac 300gtgcctcttc cagtgattgc cgagctgcct
cccaaagtga gcgtcttcgt cccaccccgc 360gacggcttct tcggcaaccc ccgcaagtcc
aagctcatct gccaggccac gggtttcagt 420ccccggcaga ttcaggtgtc ctggctgcgc
gaggggaagc aggtggggtc tggcgtcacc 480acggaccagg tgcaggctga ggccaaagag
tctgggccca cgacctacaa ggtgaccagc 540acactgacca tcaaagagag cgactggctc
agccagagca tgttcacctg ccgcgtggat 600cacaggggcc tgaccttcca gcagaatgcg
tcctccatgt gtgtccccga tcaagacaca 660gccatccggg tcttcgccat ccccccatcc
tttgccagca tcttcctcac caagtccacc 720aagttgacct gcctggtcac agacctgacc
acctatgaca gcgtgaccat ctcctggacc 780cgccagaatg gcgaagctgt gaaaacccac
accaacatct ccgagagcca ccccaatgcc 840actttcagcg ccgtgggtga ggccagcatc
tgcgaggatg actggaattc cggggagagg 900ttcacgtgca ccgtgaccca cacagacctg
ccctcgccac tgaagcagac catctcccgg 960cccaaggggg tggccctgca caggcccgat
gtctacttgc tgccaccagc ccgggagcag 1020ctgaacctgc gggagtcggc caccatcacg
tgcctggtga cgggcttctc tcccgcggac 1080gtcttcgtgc agtggatgca gagggggcag
cccttgtccc cggagaagta tgtgaccagc 1140gccccaatgc ctgagcccca ggccccaggc
cggtacttcg cccacagcat cctgaccgtg 1200tccgaagagg aatggaacac gggggagacc
tacacctgcg tggtggccca tgaggccctg 1260cccaacaggg tcaccgagag gaccgtggac
aagtccaccg gtaaacccac cctgtacaac 1320gtgtccctgg tcatgtccga cacagctggc
acctgctact ga 13624031975DNAhomo sapiens;
403gggagtgcat ccgccccaac ccttttcccc ctcgtctcct gtgagaattc cccgtcggat
60acgagcagcg tggccgttgg ctgcctcgca caggacttcc ttcccgactc catcactttc
120tcctggaaat acaagaacaa ctctgacatc agcagcaccc ggggcttccc atcagtcctg
180agagggggca agtacgcagc cacctcacag gtgctgctgc cttccaagga cgtcatgcag
240ggcacagacg aacacgtggt gtgcaaagtc cagcacccca acggcaacaa agaaaagaac
300gtgcctcttc caggtgaggg ccgggcccag ccaccgggac agagagggag ccgaaggggg
360cgggagtggc gggcaccggg ctgacacgtg tccctcactg cagtgattgc cgagctgcct
420cccaaagtga gcgtcttcgt cccaccccgc gacggcttct tcggcaaccc ccgcaagtcc
480aagctcatct gccaggccac gggtttcagt ccccggcaga ttcaggtgtc ctggctgcgc
540gaggggaagc aggtggggtc tggcgtcacc acggaccagg tgcaggctga ggccaaagag
600tctgggccca cgacctacaa ggtgaccagc acactgacca tcaaagagag cgactggctc
660ggccagagca tgttcacctg ccgcgtggat cacaggggcc tgaccttcca gcagaatgcg
720tcctccatgt gtgtccccgg tgagtgacct gtccccaggg gcagcaccca ccgacacaca
780ggggtccact cgggtctggc attcgccacc ccggatgcag ccatctactc cctgagcctt
840ggcttcccag agcggccaag ggcaggggct cgggcggcag gacccctggg ctcggcagag
900gcagttgcta ctctttgggt gggaaccatg cctccgccca catccacacc tgccccacct
960ctgactccct tctcttgact ccagatcaag acacagccat ccgggtcttc gccatccccc
1020catcctttgc cagcatcttc ctcaccaagt ccaccaagtt gacctgcctg gtcacagacc
1080tgaccaccta tgacagcgtg accatctcct ggacccgcca gaatggcgaa gctgtgaaaa
1140cccacaccaa catctccgag agccacccca atgccacttt cagcgccgtg ggtgaggcca
1200gcatctgcga ggatgactgg aattccgggg agaggttcac gtgcaccgtg acccacacag
1260acctgccctc gccactgaag cagaccatct cccggcccaa gggtaggccc cactcttgcc
1320cctcttcctg cactccctgg gacctccctt ggcctctggg gcatggtgga aagcacccct
1380cactcccccg ttgtctgggc aactggggaa aaggggactc aaccccagcc cacaggctgg
1440tccccccact gccccgccct caccaccatc tctgttcaca ggggtggccc tgcacaggcc
1500cgatgtctac ttgctgccac cagcccggga gcagctgaac ctgcgggagt cggccaccat
1560cacgtgcctg gtgacgggct tctctcccgc ggacgtcttc gtgcagtgga tgcagagggg
1620gcagcccttg tccccggaga agtatgtgac cagcgcccca atgcctgagc cccaggcccc
1680aggccggtac ttcgcccaca gcatcctgac cgtgtccgaa gaggaatgga acacggggga
1740gacctacacc tgcgtggtgg cccatgaggc cctgcccaac agggtcaccg agaggaccgt
1800ggacaagtcc accggtaaac ccaccctgta caacgtgtcc ctggtcatgt ccgacacagc
1860tggcacctgc tactgaccct gctggcctgc ccacaggctc ggggcggctg gccgctctgt
1920gtgtgcatgc aaactaaccg tgtcaacggg gtgagatgtt gcatcttata aaatt
19754041362DNAhomo sapiens; 404gggagtgcat ccgccccaac ccttttcccc
ctcgtctcct gtgagaattc cccgtcggat 60acgagcagcg tggccgttgg ctgcctcgca
caggacttcc ttcccgactc catcactttc 120tcctggaaat acaagaacaa ctctgacatc
agcagcaccc ggggcttccc atcagtcctg 180agagggggca agtacgcagc cacctcacag
gtgctgctgc cttccaagga cgtcatgcag 240ggcacagacg aacacgtggt gtgcaaagtc
cagcacccca acggcaacaa agaaaagaac 300gtgcctcttc cagtgattgc cgagctgcct
cccaaagtga gcgtcttcgt cccaccccgc 360gacggcttct tcggcaaccc ccgcaagtcc
aagctcatct gccaggccac gggtttcagt 420ccccggcaga ttcaggtgtc ctggctgcgc
gaggggaagc aggtggggtc tggcgtcacc 480acggaccagg tgcaggctga ggccaaagag
tctgggccca cgacctacaa ggtgaccagc 540acactgacca tcaaagagag cgactggctc
ggccagagca tgttcacctg ccgcgtggat 600cacaggggcc tgaccttcca gcagaatgcg
tcctccatgt gtgtccccga tcaagacaca 660gccatccggg tcttcgccat ccccccatcc
tttgccagca tcttcctcac caagtccacc 720aagttgacct gcctggtcac agacctgacc
acctatgaca gcgtgaccat ctcctggacc 780cgccagaatg gcgaagctgt gaaaacccac
accaacatct ccgagagcca ccccaatgcc 840actttcagcg ccgtgggtga ggccagcatc
tgcgaggatg actggaattc cggggagagg 900ttcacgtgca ccgtgaccca cacagacctg
ccctcgccac tgaagcagac catctcccgg 960cccaaggggg tggccctgca caggcccgat
gtctacttgc tgccaccagc ccgggagcag 1020ctgaacctgc gggagtcggc caccatcacg
tgcctggtga cgggcttctc tcccgcggac 1080gtcttcgtgc agtggatgca gagggggcag
cccttgtccc cggagaagta tgtgaccagc 1140gccccaatgc ctgagcccca ggccccaggc
cggtacttcg cccacagcat cctgaccgtg 1200tccgaagagg aatggaacac gggggagacc
tacacctgcg tggtggccca tgaggccctg 1260cccaacaggg tcaccgagag gaccgtggac
aagtccaccg gtaaacccac cctgtacaac 1320gtgtccctgg tcatgtccga cacagctggc
acctgctact ga 136240511DNAhomo sapiens;
405ctaactgggg a
1140631DNAhomo sapiens; 406aggatattgt agtggtggta gctgctactc c
3140737DNAhomo sapiens; 407gtattatgat tacgtttggg
ggagttatcg ttatacc 3740818DNAhomo sapiens;
408gagtatagca gctcgtcc
1840920DNAhomo sapiens; 409gtggatacag ctatggttac
2041031DNAhomo sapiens; 410aggatattgt agtagtacca
gctgctatgc c 3141116DNAhomo sapiens;
411tgactacagt aactac
1641223DNAhomo sapiens; 412gtggatatag tggctacgat tac
2341331DNAhomo sapiens; 413gtattacgat ttttggagtg
gttattatac c 3141431DNAhomo sapiens;
414aggatattgt actaatggtg tatgctatac c
3141516DNAhomo sapiens; 415tgactacagt aactac
1641619DNAhomo sapiens; 416tgactacggt ggtaactcc
1941717DNAhomo sapiens;
417ggtataaccg gaaccac
1741831DNAhomo sapiens; 418gtattactat ggttcgggga gttattataa c
3141920DNAhomo sapiens; 419ggtatagtgg gagctactac
2042031DNAhomo sapiens;
420gtattacgat attttgactg gttattataa c
3142117DNAhomo sapiens; 421ggtacaactg gaacgac
1742218DNAhomo sapiens; 422gggtatagca gcggctac
1842320DNAhomo sapiens;
423gtagagatgg ctacaattac
2042428DNAhomo sapiens; 424agcatattgt ggtggtgact gctattcc
2842517DNAhomo sapiens; 425ggtataactg gaacgac
1742621DNAhomo sapiens;
426gggtatagca gcagctggta c
2142716DNAhomo sapiens; 427tgactacggt gactac
1642831DNAhomo sapiens; 428gtattactat gatagtagtg
gttattacta c 3142920DNAhomo sapiens;
429gtggatacag ctatggttac
2043021DNAhomo sapiens; 430gggtatagca gtggctggta c
2143117DNAhomo sapiens; 431ggtataactg gaactac
174326PRTOryctolagus
cuniculus; 432Tyr Tyr Gly Met Asp Leu 1 5
43320DNAOryctolagus cuniculus; 433attactacgg catggacctc
204346PRTOvis aries; 434Tyr Tyr Gly Val Asp
Val 1 5 43520DNAOvis aries; 435attactacgg tgtagatgtc
204366PRTBos taurus;
436Tyr Tyr Gly Val Asp Val 1 5 43720DNABos taurus;
437attactacgg tgtagatgtc
204386PRTCanis familiaris; 438Tyr Tyr Gly Met Asp Tyr 1 5
43920DNACanis familiaris; 439attactatgg tatggactac
204409PRThomo sapiens; 440Tyr Tyr Tyr Tyr
Tyr Gly Val Asp Val 1 5 44129DNAhomo
sapiens; 441attactacta ctactacggt atggacgtc
2944257DNAHomo sapiens 442atgggctggt cctgcatcat cctgtttctg
gtggccaccg ccaccggcgt gcacagc 5744319PRTHomo sapiens 443Met Gly
Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly 1 5
10 15 Val His Ser
44421DNAArtificial SequencePrimer 444aggccagcag agggttccat g
2144522DNAArtificial SequencePrimer
445ggctcccaga tcctcaaggc ac
22