Patent application title: METHODS FOR DIAGNOSING AND TREATING CANCER BY MEANS OF THE EXPRESSION STATUS AND MUTATIONAL STATUS OF NRF2 AND DOWNSTREAM TARGET GENES OF SAID GENE
Inventors:
IPC8 Class: AC12Q16886FI
USPC Class:
Class name:
Publication date: 2022-03-24
Patent application number: 20220090205
Abstract:
The invention provides methods of identifying a subject having cancer,
such as lung cancer, by analyzing expression levels of one or more NRF2
splice variants or NRF2 target genes. The invention also provides methods
of treating cancer in a subject with a NRF2 pathway antagonist, wherein
the subject expresses one or more NRF2 splice variants or overexpresses
one or more NRF2 target genes.
Claims:
1. A method of diagnosing a cancer in a subject, the method comprising:
(a) determining the expression levels of the following 27 genes: AKR1B10,
AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11,
TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1,
GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from
the subject; and (b) comparing the expression level of each of the 27
genes to a reference expression level of each of the 27 genes, wherein an
increase in the expression level of each of the 27 genes in the sample
relative to the respective reference expression level of each of the 27
genes identifies the subject having a cancer.
2. The method of claim 1, wherein the method further comprises: determining if the subject's cancer is a NRF2-dependent cancer, wherein an increase in the expression level of each of the 27 genes in the sample relative to the respective reference expression level of each of the 27 genes identifies the subject as having a NRF2-dependent cancer.
3-8. (canceled)
9. The method of claim 1, wherein (a) the expression level of each of the 27 genes in the sample is an average expression level of each of the 27 genes of the sample; (b) the reference expression level of each of the 27 genes is an average expression level of each of the 27 genes of the reference; and (c) the average expression level of each of the 27 genes of the sample is compared to the average of each of the 27 genes of the reference.
10-11. (canceled)
12. The method of claim 1, wherein the reference expression level of each of the 27 genes is the mean level of expression of each of the 27 genes in a population of subjects having the cancer.
13. The method of claim 12, wherein the reference expression level is the mean level of expression of each of the 27 genes in a population of subjects having lung cancer, optionally a non-small cell lung cancer (NSCLC), optionally a squamous NSCLC.
14-15. (canceled)
16. The method of claim 1, wherein the expression level is an mRNA expression level, optionally wherein the mRNA expression level is determined by PCR, RT-PCR, RNA-seq, gene expression profiling, serial analysis of gene expression, or microarray analysis.
17. (canceled)
18. The method of claim 1, wherein the expression level is a protein expression level, optionally wherein the protein expression level is determined by western blot, immunohistochemistry, or mass spectrometry.
19. (canceled)
20. The method of claim 1, further comprising determining a DNA sequence of NRF2, optionally wherein the DNA sequence is determined by PCR, exome-seq, microarray analysis, or whole genome sequencing.
21-32. (canceled)
33. The method of claim 1, further comprising administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist and/or a therapeutically effective amount of an anti-cancer agent.
34-40. (canceled)
41. A method of treating a subject having a cancer, the method comprising administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist, wherein the expression level of each of the following 27 genes AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject has been determined to be increased relative to a respective reference expression level of each of the 27 genes.
42-47. (canceled)
48. The method of claim 41, wherein (a) the expression level of each of the 27 genes in the sample is an average expression level of each of the 27 genes of the sample; (b) the reference expression level of each of the 27 genes is an average expression level of each of the 27 genes of the reference; and (c) the average expression level of each of the 27 genes of the sample is compared to the average of each of the 27 genes of the reference.
49-50. (canceled)
51. The method of claim 41, wherein the reference expression level is the mean level of expression of each of the 27 genes in a population of subjects having the cancer, optionally lung cancer, optionally NSCLC, optionally squamous NSCLC.
52-53. (canceled)
54. The method of claim 41, wherein the expression level is an mRNA expression level, optionally wherein the mRNA expression level is determined by PCR, RT-PCR, RNA-seq, gene expression profiling, serial analysis of gene expression, or microarray analysis.
55-56. (canceled)
57. The method of claim 41, wherein the expression level is a protein expression level, optionally wherein the protein expression is determined by western blot, immunohistochemistry, or mass spectrometry.
58. (canceled)
59. The method of claim 41, further comprising determining a DNA sequence of the NRF2, optionally wherein the DNA sequence is determined by PCR, exome-seq, microarray analysis, or whole genome sequencing.
60-78. (canceled)
79. The method of claim 1, wherein the sample obtained from the subject is from a biopsy sample.
80. (canceled)
81. The method of claim 1, wherein the subject: (a) is a previously untreated subject; and/or (b) has a lung cancer or a head and neck cancer.
82. The method of claim 81, wherein the lung cancer is a non-small cell lung cancer (NSCLC).
83. The method of claim 82, wherein the NSCLC is a squamous NSCLC.
84. The method of claim 81, wherein the head and neck cancer is a squamous head and neck cancer.
Description:
SEQUENCE LISTING
[0001] The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Dec. 20, 2018, is named 50474-127002_Sequence_Listing_12.20.18_ST25 and is 216,262 bytes in size.
FIELD OF THE INVENTION
[0002] The present invention relates generally to methods for diagnosing, treating, and providing prognoses for cancer, e.g., lung cancer.
BACKGROUND OF THE INVENTION
[0003] Cancer remains one of the most deadly threats to human health. Lung cancer, in particular, is the primary cause of cancer-related death for men and women in the United States, despite recent advances in therapeutic treatments. The majority of lung cancers are non-small cell lung cancers (NSCLC), and most often of either the adenomatous or squamous subtype. Recent studies have identified patterns of point mutations that underlie these indications (Imielinski et al. Cell. 150(6):1107-1120, 2012), but despite an increasing number of identified mutations associated with various cellular pathways, a comprehensive understanding of the nature and influence of these mutations on these cellular pathways is lacking.
[0004] Thus, there is an unmet need in the field to develop effective diagnostic and therapeutic strategies for cancers, such as lung cancer.
SUMMARY OF THE INVENTION
[0005] The present invention provides compositions and methods for diagnosing, treating, and providing prognoses for cancer, for example, lung cancer (e.g., non-small cell lung cancer (NSCLC)) and head and neck carcinoma.
[0006] In one aspect, the invention features a method of diagnosing a cancer in a subject, the method comprising: (a) determining the expression level of at least one (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) gene selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject; and (b) comparing the expression level of the at least one gene to a reference expression level of the at least one gene, wherein an increase in the expression level of the at least one gene in the sample relative to the reference expression level of the at least one gene identifies a subject having a cancer.
[0007] In another aspect, the invention features a method of identifying a subject having a cancer that is a NRF2-dependent cancer, the method comprising: (a) determining the expression level of at least one (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) gene selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject; (b) comparing the expression level of the at least one gene to a reference expression level of the at least one gene; and (c) determining if the subject's cancer is a NRF2-dependent cancer, wherein an increase in the expression level of the at least one gene in the sample relative to the reference expression level of the at least one gene identifies a subject having a NRF2-dependent cancer. In some embodiments of either of the preceding aspects, the expression level of at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) genes selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined. In some embodiments, the expression level of at least three (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) genes selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined. 
In some embodiments, the expression level of at least four (e.g., 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) genes selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined. In some embodiments, the expression level of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined.
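For illustration only, the per-gene comparison of steps (a) and (b) can be sketched in Python. The function name, the dictionary-based data layout, and the expression values below are assumptions made for this sketch and are not part of the disclosure; an "increase" is taken here to mean a sample value strictly greater than its reference value.

```python
# Hypothetical sketch: per-gene comparison of sample vs. reference expression.
SIGNATURE_GENES = [
    "AKR1B10", "AKR1C2", "SRXN1", "OSGIN1", "FECH", "GCLM", "TRIM16", "ME1",
    "KYNU", "CABYR", "SLC7A11", "TRIM16L", "AKR1C4", "CYP4F11", "RSPO3",
    "ABCC2", "AKR1B15", "NR0B1", "UGDH", "TXNRD1", "GSR", "AKR1C3", "TALDO1",
    "PGD", "TXN", "NQO1", "FTL",
]


def increased_genes(sample, reference, genes=SIGNATURE_GENES):
    """Return the genes whose sample level exceeds the reference level.

    Genes missing from either mapping are skipped rather than guessed.
    """
    return [g for g in genes
            if g in sample and g in reference and sample[g] > reference[g]]


# Made-up expression values, for illustration only.
reference = {g: 1.0 for g in SIGNATURE_GENES}
sample = {g: 2.5 for g in SIGNATURE_GENES}
sample["FTL"] = 0.8  # one gene that is not increased

up = increased_genes(sample, reference)
all_increased = len(up) == len(SIGNATURE_GENES)
print(len(up), all_increased)
```

In this made-up example 26 of the 27 genes are increased, so an "at least one gene" embodiment would be satisfied while an "each of the 27 genes" requirement would not.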
[0008] In some embodiments, the expression level of one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 21) of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, or NQO1 is determined. In some embodiments, the expression level of one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12) of AKR1B10, AKR1C2, ME1, KYNU, CABYR, TRIM16L, AKR1C4, CYP4F11, RSPO3, AKR1B15, NR0B1, and AKR1C3 is determined.
[0009] In some embodiments, (a) the expression level of the at least two genes in the sample is an average (e.g., mean or median) of the at least two genes of the sample; (b) the reference expression level of the at least two genes is an average (e.g., mean or median) of the at least two genes of the reference; and (c) the average (e.g., mean or median) of the at least two genes of the sample is compared to the average of the at least two genes of the reference.
[0010] In some embodiments, the reference expression level is the mean level of expression of the at least one gene in a population of subjects. In some embodiments, the population of subjects is a population of subjects sharing a common ethnicity.
[0011] In some embodiments, the reference expression level is the mean level of expression of the at least one gene in a population of subjects having cancer (e.g., lung cancer, e.g., non-small cell lung cancer (NSCLC), e.g., squamous NSCLC).
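As a minimal sketch of the averaging embodiments above, the following Python code derives a per-gene reference as the mean across a population and then compares the sample average with the reference average. The helper names, the three-gene subset, and all numeric values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean, median


def population_reference(population, genes):
    """Mean expression of each gene across a reference population."""
    return {g: mean(subject[g] for subject in population) for g in genes}


def signature_average(levels, genes, how=mean):
    """Average (mean or median) expression over the selected genes."""
    return how(levels[g] for g in genes)


# Made-up values for three of the signature genes.
genes = ["NQO1", "GCLM", "TXNRD1"]
population = [
    {"NQO1": 1.0, "GCLM": 2.0, "TXNRD1": 3.0},
    {"NQO1": 3.0, "GCLM": 2.0, "TXNRD1": 1.0},
]
sample = {"NQO1": 6.0, "GCLM": 4.0, "TXNRD1": 5.0}

reference = population_reference(population, genes)
sample_avg = signature_average(sample, genes)
reference_avg = signature_average(reference, genes)
is_increased = sample_avg > reference_avg
```

Passing `how=median` to `signature_average` switches the sketch from the mean to the median, the other form of average mentioned above.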
[0012] In some embodiments, the expression level is an mRNA expression level. In some embodiments, the mRNA expression level is determined by PCR, RT-PCR, RNA-seq, gene expression profiling, serial analysis of gene expression, or microarray analysis.
[0013] In other embodiments, the expression level is a protein expression level. In some embodiments, the protein expression level is determined by western blot, immunohistochemistry, or mass spectrometry.
[0014] In some embodiments, any of the preceding methods further comprises determining a DNA sequence of NRF2. In some embodiments, the DNA sequence is determined by PCR, exome-seq, microarray analysis, or whole genome sequencing.
[0015] In another aspect, the invention features a method of diagnosing a cancer in a subject, the method comprising determining a DNA sequence of NRF2 in a sample obtained from the subject, wherein the presence of NRF2 DNA comprising a deletion of all or a portion of its exon 2 identifies the subject as having a cancer. In some embodiments, the DNA sequence is determined by PCR, exome-seq, microarray analysis, or whole genome sequencing.
[0016] In another aspect, the invention features a method of identifying a subject having cancer, the method comprising determining the mRNA expression level of NRF2 comprising a deletion of all or a portion of its exon 2 in a sample obtained from the subject, wherein the presence of NRF2 comprising a deletion of all or a portion of its exon 2 identifies the subject as having a cancer. In some embodiments, the mRNA expression level is determined by PCR, RT-PCR, RNA-seq, gene expression profiling, serial analysis of gene expression, or microarray analysis. In some embodiments, the method further comprises determining a DNA sequence of the NRF2. In some embodiments, the DNA sequence is determined by PCR, exome-seq, microarray analysis, or whole genome sequencing.
[0017] In some embodiments of any of the preceding aspects, the NRF2 further comprises a deletion of all or a portion of its exon 3.
[0018] In another aspect, the invention features a method of diagnosing a cancer in a subject, the method comprising determining the protein expression level of NRF2 comprising a deletion of all or a portion of its Neh2 domain in a sample obtained from the subject, wherein the presence of NRF2 comprising a deletion of all or a portion of its Neh2 domain identifies the subject as having a cancer.
[0019] In another aspect, the invention features a method of identifying a subject having cancer, the method comprising determining the protein expression level of NRF2 comprising a deletion of all or a portion of its Neh2 domain in a sample obtained from the subject, wherein the presence of NRF2 comprising a deletion of all or a portion of its Neh2 domain identifies the subject as having a cancer.
[0020] In some embodiments of any of the preceding aspects, the NRF2 further comprises a deletion in all or a portion of its Neh4 domain. In some embodiments, the protein expression is determined by western blot, immunohistochemistry, or mass spectrometry.
[0021] In some embodiments, the method further comprises administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist. In some embodiments, the method further comprises administering to the subject a therapeutically effective amount of an anti-cancer agent. In other embodiments, the method comprises administering an anti-cancer agent and a NRF2 pathway antagonist. In some embodiments, the anti-cancer agent and the NRF2 pathway antagonist are co-administered. In other embodiments, the anti-cancer agent and the NRF2 pathway antagonist are sequentially administered. In some embodiments, the anti-cancer agent is selected from the group consisting of an anti-angiogenic agent, a chemotherapeutic agent, a growth inhibitory agent, a cytotoxic agent, and an immunotherapy. In some embodiments, the anti-angiogenic agent is a VEGF antagonist. In some embodiments, the NRF2 pathway antagonist is selected from the group consisting of a CREB antagonist, a CREB Binding Protein (CBP) antagonist, a Maf antagonist, an activating transcription factor 4 (ATF4) antagonist, a protein kinase C (PKC) antagonist, a Jun antagonist, a glucocorticoid receptor antagonist, a UbcM2 antagonist, a HACE1 antagonist, a c-Myc agonist, a SUMO agonist, a KEAP1 agonist, a CUL3 agonist, or a retinoic acid receptor α (RARα) agonist.
[0022] In another aspect, the invention features a method of treating a subject having a cancer, the method comprising administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist, wherein the expression level of at least one (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) of the following genes AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject has been determined to be increased relative to a reference expression level of the at least one gene. In other embodiments, the expression level of at least two (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) genes selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined. In other embodiments, the expression level of at least three (e.g., 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) genes selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined. 
In other embodiments, the expression level of at least four (e.g., 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27) genes selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined. In other embodiments, the expression level of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject is determined.
[0023] In some embodiments, the expression level of one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 21) of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, or NQO1 is determined. In other embodiments, the expression level of one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12) of AKR1B10, AKR1C2, ME1, KYNU, CABYR, TRIM16L, AKR1C4, CYP4F11, RSPO3, AKR1B15, NR0B1, and AKR1C3 is determined.
[0024] In some embodiments, (a) the expression level of at least two genes in the sample is an average of the at least two genes of the sample; (b) the reference expression level of the at least two genes is an average of the at least two genes of the reference; and (c) the average of the at least two genes of the sample is compared to the average of the at least two genes of the reference. In some embodiments, the reference expression level is the mean level of expression of the at least one gene in a population of subjects. In some embodiments, the population of subjects is a population of subjects sharing a common ethnicity. In some embodiments, the reference expression level is the mean level of expression of the at least one gene in a population of subjects having cancer.
[0025] In some embodiments, the lung cancer is a non-small cell lung cancer (NSCLC), e.g., squamous NSCLC.
[0026] In some embodiments, the expression level is an mRNA expression level. In some embodiments, the mRNA expression level is determined by PCR, RT-PCR, RNA-seq, gene expression profiling, serial analysis of gene expression, or microarray analysis. In some embodiments, the mRNA expression level is determined by RNA-seq.
[0027] In some embodiments, the method further comprises determining a DNA sequence of the NRF2 (e.g., by PCR, exome-seq, microarray analysis, or whole genome sequencing).
[0028] In some embodiments, the expression level is a protein expression level. In some embodiments, the protein expression is determined by western blot, immunohistochemistry, or mass spectrometry.
[0029] In another aspect, the invention features a method of treating a subject having a cancer, the method comprising: (a) determining the mRNA expression level of NRF2 comprising a deletion of all or a portion of its exon 2 in a sample obtained from the subject, wherein the presence of NRF2 mRNA comprising a deletion of all or a portion of its exon 2 identifies the subject as having a cancer; and (b) administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist.
[0030] In some embodiments, the mRNA expression is determined by PCR, RT-PCR, RNA-seq, gene expression profiling, serial analysis of gene expression, or microarray analysis. In some embodiments, the mRNA expression is determined by RNA-seq. In some embodiments, the method further comprises determining a DNA sequence of the NRF2 (e.g., by PCR, exome-seq, microarray analysis, or whole genome sequencing).
[0031] In another aspect, the invention features a method of treating a subject having a cancer, the method comprising: (a) determining a DNA sequence of NRF2 comprising a deletion of all or a portion of its exon 2 in a sample obtained from the subject, wherein the presence of NRF2 DNA comprising a deletion of all or a portion of its exon 2 identifies the subject as having a cancer; and (b) administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist. In some embodiments, the DNA sequence is determined by PCR, exome-seq, microarray analysis, or whole genome sequencing. In some embodiments, the NRF2 (e.g., mRNA or DNA) further comprises a deletion in all or a portion of its exon 3.
[0032] In another aspect, the invention features a method of treating a subject having a cancer, the method comprising: (a) determining the protein expression level of NRF2 comprising a deletion of all or a portion of its Neh2 in a sample obtained from the subject, wherein the presence of NRF2 protein comprising a deletion of all or a portion of its Neh2 identifies the subject as having a cancer; and (b) administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist.
[0033] In some embodiments, the NRF2 protein further comprises a deletion of all or a portion of its Neh4 domain. In some embodiments, the protein expression is determined by western blot, immunohistochemistry, or mass spectrometry. In some embodiments, the method further comprises determining a DNA sequence of the NRF2 (e.g., by PCR, exome-seq, microarray analysis, or whole genome sequencing).
[0034] In some embodiments, the method comprises administering to the subject a therapeutically effective amount of an anti-cancer agent. In some embodiments, the anti-cancer agent and the NRF2 pathway antagonist are co-administered. In other embodiments, the anti-cancer agent and the NRF2 pathway antagonist are sequentially administered. In some embodiments, the anti-cancer agent is selected from the group consisting of an anti-angiogenic agent, a chemotherapeutic agent, a growth inhibitory agent, a cytotoxic agent, and an immunotherapy. In some embodiments, the anti-angiogenic agent is a VEGF antagonist. In some embodiments, the NRF2 pathway antagonist is selected from the group consisting of a CREB antagonist, a CREB Binding Protein (CBP) antagonist, a Maf antagonist, an activating transcription factor 4 (ATF4) antagonist, a protein kinase C (PKC) antagonist, a Jun antagonist, a glucocorticoid receptor antagonist, a UbcM2 antagonist, a HACE1 antagonist, a c-Myc agonist, a SUMO agonist, a KEAP1 agonist, a CUL3 agonist, or a retinoic acid receptor α (RARα) agonist.
[0035] In some embodiments, the sample obtained from the subject is a tumor sample, e.g., from a biopsy sample. In some embodiments, the sample is obtained from a previously untreated subject. In some embodiments, the subject has a lung cancer (e.g., non-small cell lung cancer (NSCLC), e.g., squamous NSCLC) or a head and neck cancer (e.g., squamous head and neck cancer).
BRIEF DESCRIPTION OF THE DRAWINGS
[0036] FIG. 1A is a plot showing 96 lung cancer cell lines subjected to RNA-seq, exome-seq, and SNP array analysis. Alterations in KRAS, TP53, KEAP1, EGFR, STK11, NFE2L2, and NF1 are shown.
[0037] FIG. 1B is a protein sequence representation showing point mutations in the NFE2L2 (NRF2) gene.
[0038] FIG. 1C is a protein sequence representation showing point mutations in the KEAP1 gene.
[0039] FIG. 1D is an image of the crystal structure of the KEAP1/NRF2 peptide complex.
[0040] FIG. 2A is a volcano plot illustrating the ratios of average expression levels for all genes in mutant (n=25) versus wild-type (WT) (n=74) KEAP1 NSCLC cell lines and the associated adjusted p-values resulting from the differential expression analysis. Significantly differentially expressed genes (>2-fold, p<0.01) are indicated, and gene sets previously identified as NRF2 targets are identified as black dots.
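The significance cut-offs quoted for FIG. 2A (>2-fold, p<0.01) could be applied as in the following Python sketch. The function name, the placeholder gene labels, and the numeric values are hypothetical; treating the fold change symmetrically (up or down) is also an assumption of this sketch rather than something stated in the figure description.

```python
import math


def is_significant(mean_mutant, mean_wt, adj_p, min_fold=2.0, max_p=0.01):
    """Apply the >2-fold, p<0.01 cut-offs to one gene.

    Fold change is evaluated on |log2(mutant / WT)| so that increases and
    decreases are treated alike (an assumption of this sketch).
    """
    log2_ratio = math.log2(mean_mutant / mean_wt)
    return abs(log2_ratio) > math.log2(min_fold) and adj_p < max_p


# Made-up (mutant mean, WT mean, adjusted p-value) triples.
genes = {
    "geneA": (8.0, 1.0, 0.001),  # 8-fold up, low p-value
    "geneB": (1.5, 1.0, 0.001),  # below the fold-change cut-off
    "geneC": (8.0, 1.0, 0.2),    # p-value too high
}
hits = [g for g, (m, w, p) in genes.items() if is_significant(m, w, p)]
```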
[0041] FIG. 2B is a heatmap showing the results of unsupervised Ward clustering, illustrating the upregulation of the 27 genes associated with KEAP1 mutations in NSCLC cell lines.
[0042] FIG. 3A is a heatmap showing the results of unsupervised Ward clustering, illustrating that the NSCLC cell line-derived KEAP1 gene signature classified 32 of the 40 (80%) KEAP1 mutant lung adenocarcinomas from The Cancer Genome Atlas (TCGA).
[0043] FIG. 3B is a heatmap showing the results of unsupervised Ward clustering, illustrating that the NSCLC cell line-derived KEAP1 gene signature classified 19 of 22 (86%) KEAP1 mutant and 27 of 27 (100%) NRF2 mutant lung squamous cell carcinomas from TCGA.
[0044] FIG. 4 is a graph showing the relative abundance of protein products of the KEAP1 gene signature in mutant (n=6) and WT (n=37) NSCLC cell lines.
[0045] FIG. 5 is a heatmap indicating the frequency of recurrent splice alterations seen in 19 tumor indications.
[0046] FIG. 6 shows the NRF2 exons and splice junctions predicted from RNA-seq data. Predicted features consistent with two annotated refGene transcripts are shown in gray. Identified exon-exon junctions corresponding to skipping of exon 2 (J2, J5) or exons 2+3 (J3, J6) are shown in black and gray, respectively. A heatmap illustrates read evidence for exon-exon junctions (columns) across 482 TCGA lung squamous carcinomas (rows) on an FPKM scale after log2(x+1) transformation.
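The log2(x+1) transformation referred to for the FIG. 6 heatmap can be written as a short Python sketch; the function name and the input values are illustrative assumptions only.

```python
import math


def log2_fpkm(fpkm):
    """Transform FPKM values as log2(x + 1), so that zero maps to zero."""
    return [math.log2(x + 1.0) for x in fpkm]


# Made-up FPKM values chosen so the results are whole numbers.
values = log2_fpkm([0.0, 1.0, 3.0, 7.0])
```

Adding 1 before taking the logarithm keeps junctions with no read evidence (FPKM of zero) defined on the transformed scale.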
[0047] FIG. 7 is a schematic depicting the effect of splice alterations in EGFR, NRF2, MET, and CTNNB1 on protein structure. Arrows indicate in-frame deletions as the result of the splice alteration.
[0048] FIG. 8A is a Venn diagram illustrating the mutually exclusive occurrence of NRF2 splice alteration and mutation in KEAP1 or NRF2 in squamous NSCLC.
[0049] FIG. 8B is a heatmap showing clustering of squamous NSCLC based on 27 candidate NRF2 target genes. Mutation status and NRF2 splice alteration are indicated for each sample.
[0050] FIG. 9A is a Venn diagram illustrating the mutually exclusive occurrence of NRF2 splice alteration and mutation in KEAP1 or NRF2 in head and neck cancers.
[0051] FIG. 9B is a heatmap showing clustering of head and neck cancers based on 27 candidate NRF2 target genes. Mutation status and NRF2 splice alteration are indicated for each sample.
[0052] FIG. 10 is a graph showing the presence of junction reads skipping exon 2 in KMS-27 and JHH-6 cells, as quantified by RNA-seq.
[0053] FIG. 11A is a schematic diagram showing the locations of exons within WT and exon 2-deleted NRF2 (Δe2 NRF2) mRNA, in relation to forward and reverse primers derived from exon 1 and exons 3/4, indicated by right-hand facing and left-hand facing arrows, respectively.
[0054] FIG. 11B is a series of agarose gel images showing RNA products amplified from total RNA of normal leucocytes, JHH-6 cells, and KMS-27 cells, by RT-PCR. Regions surrounding NRF2 exon 2 were amplified with the indicated primers. Fragments from wild-type NRF2, Δe2 NRF2, and primer dimers are indicated. Bands indicating the presence of Δe2 NRF2 RNA are visible in JHH-6 and KMS-27 cells.
[0055] FIG. 12A is a graph showing the sequencing results of the PCR products from JHH-6 and KMS-27 cells, indicating the deletion of exon 2 in NRF2.
[0056] FIG. 12B shows the nucleic acid and amino acid sequences for Δe2 NRF2.
[0057] FIG. 12C shows the nucleic acid and amino acid sequences for wild-type NRF2. The exon 2 sequence is shaded.
[0058] FIG. 13 shows the results of a Western blot experiment indicating the relative expression of phosphorylated NRF2, wild-type NRF2, and Δe2 NRF2 by HUH-1, JHH-6, and HuCCT1 cells. Protein lysates from the indicated cell lines were separated by SDS PAGE. * represents a likely non-specific band as it is not depleted by NRF2 siRNA transfection.
[0059] FIG. 14A shows the results of a Western blot experiment indicating the relative expression of phosphorylated NRF2, wild-type NRF2, and Δe2 NRF2 by HUH-1, JHH-6, and HuCCT1 cells in the presence and absence of lambda phosphatase (λ P'tase). Cells were grown in 6-well dishes and treated with 100 μg/ml cycloheximide (CHX) for the indicated times. The lysates were incubated with either buffer or 400 units of lambda phosphatase for 30 min, before separation by SDS PAGE and Western blotting with NRF2 antibodies.
[0060] FIG. 14B is a graph showing the stability of NRF2 protein expressed by HuCCT1 cells (circles), JHH-6 cells (squares), and HUH-1 cells (triangles) in the presence of CHX. Band intensities from the results shown in FIG. 14A were quantified and fitted to a one-phase decay curve to obtain protein half-life estimates, which are indicated next to each curve. Relative protein expression was taken as a percent of initial concentration of each cell line.
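A half-life estimate of the kind described for FIG. 14B can be sketched as follows, assuming a one-phase decay of the form y = y0 * exp(-k * t), for which the half-life is ln(2)/k. This sketch substitutes an ordinary least-squares fit of ln(y) against t for a nonlinear curve fit; the function name, the time points, and the synthetic band intensities are hypothetical and are not data from the figures.

```python
import math


def half_life(times, intensities):
    """Estimate half-life assuming one-phase decay y = y0 * exp(-k * t).

    Fits a straight line to ln(y) versus t by ordinary least squares,
    a simplification relative to a nonlinear one-phase decay fit.
    """
    logs = [math.log(y) for y in intensities]
    n = len(times)
    t_mean = sum(times) / n
    l_mean = sum(logs) / n
    slope = (sum((t - t_mean) * (l - l_mean) for t, l in zip(times, logs))
             / sum((t - t_mean) ** 2 for t in times))
    k = -slope  # decay rate constant
    return math.log(2) / k


# Synthetic intensities (percent of initial) generated with a half-life of 20
# time units; the units themselves are arbitrary in this sketch.
times = [0, 10, 20, 40, 60]
intensities = [100 * 0.5 ** (t / 20) for t in times]
estimate = half_life(times, intensities)
```

The log-linear fit weights low-intensity points more heavily than a direct nonlinear fit would, so the two approaches can give somewhat different estimates on noisy data.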
[0061] FIG. 14C shows the results of a Western blot experiment indicating the relative expression of NRF2 and .DELTA.e2 NRF2 by HUH-1, JHH-6, and HuCCT1 cells after transfection with either siNTC (50 nM) or siKEAP1 (50 nM). Cells were grown in 6-well dishes and treated with 100 .mu.g/ml cycloheximide (CHX) for the indicated times. The lysates were incubated with 400 units of lambda phosphatase for 30 minutes before separation by SDS PAGE and Western blotting with NRF2 antibodies.
[0062] FIG. 14D is a graph showing the stability of NRF2 protein expressed by HuCCT1 cells (circles), JHH-6 cells (squares), and HUH-1 cells (triangles) in the presence of CHX after transfection with siNTC (solid lines) or siKEAP1 (dashed lines). Band intensities from the results shown in FIG. 14C were quantified and fitted to a one-phase decay curve to obtain protein half-life estimates, which are indicated next to each curve. Relative protein expression was taken as a percent of the initial concentration for each cell line.
[0063] FIG. 15 shows the results of a Western blot experiment indicating the expression of .DELTA.e2 NRF2 by KMS-27 cells. Lysates (20 .mu.g) from HCC-1354, KMS-27, and HuCCT1 cells were prepared and, for all except HuCCT1, treated with .lamda. P'tase. Untreated and treated lysates were then subjected to SDS PAGE, and NRF2 and actin were detected.
[0064] FIG. 16 shows the results of a Western blot experiment indicating nuclear localization of NRF2. HuCCT1, HUH-1, and JHH-6 cells were grown in 10 cm dishes and partitioned into nuclear and cytosol fractions. Fractions were separated by SDS PAGE and NRF2 was visualized. Nuclear and cytosolic purity was estimated using Hsp90 as a cytosolic marker and HDAC2 as a nuclear marker.
[0065] FIG. 17A is a graph showing the expression of the 27 signature NRF2 target genes of the KEAP1 gene signature (each displayed on the x-axis) in 16 hepatocellular carcinoma cell lines (represented by black squares, filled gray circles, and open gray circles) using RNA-seq data described in Klijn et al. (Nat Biotechnol. 33(3):306-312, 2014). Filled gray circles represent mutant KEAP1 liver cancer cell lines, and open gray circles represent the JHH-6 cell line.
[0066] FIG. 17B is a graph showing the expression of the 27 signature NRF2 target genes of the KEAP1 gene signature (each displayed on the x-axis) in 18 multiple myeloma cell lines (represented by black squares and open gray circles) using RNA-seq data described in Klijn et al. (Nat. Biotechnol. 33(3):306-312, 2014). Open gray circles represent the KMS-27 cell line.
[0067] FIG. 18A is a bar graph showing the NRF2 target gene score (mean z-scores for the 27 NRF2 target genes determined over the full data set) in the 16 hepatocellular carcinoma cell lines. KEAP1 and NRF2 alterations are indicated as filled and outlined boxes, respectively.
[0068] FIG. 18B is a bar graph showing the NRF2 target gene score (mean z-scores for the 27 NRF2 target genes determined over the full data set) in the 18 multiple myeloma cell lines. The outlined box indicates a NRF2 alteration.
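By way of non-limiting illustration, the NRF2 target gene score described for FIGS. 18A and 18B (the mean of per-gene z-scores, each z-score determined over the full data set) may be sketched as follows. This is a minimal sketch under stated assumptions: the population standard deviation is used, genes with zero variance are skipped, and the function name, sample names, and expression values are hypothetical.

```python
from statistics import mean, pstdev

def target_gene_scores(expression):
    """Compute a per-sample score as the mean z-score over genes.

    expression: {gene: {sample: value}}, with the same samples for every gene.
    Each gene is z-scored across all samples (the "full data set"), then
    the z-scores are averaged per sample.
    """
    samples = sorted(next(iter(expression.values())))
    z_by_sample = {s: [] for s in samples}
    for gene, values in expression.items():
        row = [values[s] for s in samples]
        mu, sd = mean(row), pstdev(row)
        if sd == 0:
            continue  # assumption: a constant gene carries no information
        for s, v in zip(samples, row):
            z_by_sample[s].append((v - mu) / sd)
    return {s: mean(z) for s, z in z_by_sample.items()}

# Hypothetical expression values for two of the signature genes.
example = {
    "NQO1": {"line_A": 1.0, "line_B": 2.0, "line_C": 3.0},
    "TXN": {"line_A": 2.0, "line_B": 4.0, "line_C": 6.0},
}
scores = target_gene_scores(example)
print(sorted(scores, key=scores.get, reverse=True))
```

Replacing the final `mean` with `sum` gives a summed z-score instead of a mean z-score.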
[0069] FIG. 19 is a bar graph showing the viability of HUH-1, JHH-6, and HuCCT1 cells in the presence or absence of siRNAs targeting NRF2. Cells were seeded into 96-well plates containing either a non-targeted siRNA control (NTC), or siRNA targeting NRF2 (NRF2). Viability was measured 4 days later using CellTiter-Glo. Viability is presented as a percentage of NTC luminescence.
[0070] FIG. 20 is a series of bar graphs showing the effect of transfection reagents on relative NRF2 expression by HUH-1, JHH-6, and HuCCT1 cells. Cells were grown in 6-well dishes and transfected with siRNA targeting exon 5 of NRF2. Total RNA was isolated after 48 hours, and NRF2 expression was measured using Taqman probes targeting exon 5.
[0071] FIG. 21 is a series of bar graphs showing the effect of transfection reagents on four well-characterized NRF2 target genes, SLC7A11, GCLC, NR0B1, and SGRN, expressed by HUH-1 cells (dark gray shaded bars), JHH-6 cells (light gray shaded bars), and HuCCT1 cells (black shaded bars). Cells were grown in 6-well dishes and transfected with siRNA targeting exon 5 of NRF2, or non-targeted siRNA (NTC). Total RNA was isolated after 48 hours, and gene expression was measured using Taqman probes targeting the indicated NRF2 target genes.
[0072] FIG. 22 is a series of representative FACS histograms showing the effect of NRF2 targeting siRNA on DNA fragmentation in HUH-1, JHH-6, and HuCCT1 cells. Cells were treated with staurosporine as a positive control.
[0073] FIG. 23 is a set of immunoblots showing the effect of NRF2 exon 2 and exon 2+3 deletions on KEAP1 interaction. 293 cells were transfected with plasmids expressing FLAG-NRF2, .DELTA.e2 FLAG-NRF2, .DELTA.e2+3 FLAG-NRF2 or HA-KEAP1. 48 hours after transfection, cells were lysed, and either lysates (top gel) or anti-FLAG immunoprecipitations were analyzed by Western blotting using the indicated antibodies.
[0074] FIG. 24A is a set of immunoblots showing the effect of cycloheximide on NRF2 stability. 293 cells were transfected with the same plasmids as described in FIG. 23 and treated with 100 .mu.g/ml cycloheximide (CHX) for the indicated times. Cells were lysed, and lysates were separated by SDS PAGE and Western blotted using NRF2 and anti-actin antibodies.
[0075] FIG. 24B is a graph showing the stability of truncated NRF2 following KEAP1 expression over time.
[0076] FIG. 25 is a series of bar graphs showing the expression of various NRF2 target genes under various transfection conditions. Cells were treated as in FIGS. 24A-24B but harvested for total RNA, which was used to analyze the expression of the indicated genes using Taqman RT-PCR.
[0077] FIGS. 26A-1 to 26B-2 are a series of graphs showing the mRNA expression levels of indicated NRF2 target genes in TCGA squamous NSCLC tumors, plotted according to mutation status of KEAP1 and NRF2. Individual graphs show mRNA expression levels of NQO1 (FIG. 26A-1), SLC7A11 (FIG. 26B-1), KYNU (FIG. 26C-1), FECH (FIG. 26D-1), CABYR (FIG. 26E-1), GCLM (FIG. 26F-1), TXN (FIG. 26G-1), AKR1C4 (FIG. 26H-1), AKR1C3 (FIG. 26I-1), TXNRD1 (FIG. 26J-1), SRXN1 (FIG. 26K-1), GPX2 (FIG. 26L-1), AKR1C2 (FIG. 26M-1), OSGIN1 (FIG. 26N-1), TRIM16 (FIG. 26O-1), NR0B1 (FIG. 26P-1), GSR (FIG. 26Q-1), AKR1B10 (FIG. 26R-1), TRIM16L (FIG. 26S-1), PGD (FIG. 26T-1), ME1 (FIG. 26U-1), FTL (FIG. 26V-1), RSPO3 (FIG. 26W-1), CYP4F11 (FIG. 26X-1), UGDH (FIG. 26Y-1), TALDO1 (FIG. 26Z-1), ABCC2 (FIG. 26A-2), and AKR1B15 (FIG. 26B-2). Only samples for which both exome-seq and RNA-seq data were available were considered. One sample with mutations in both NRF2 and KEAP1 was excluded. In addition, samples with evidence for NRF2 copy number changes |log.sub.2(CAN)|>0.5 were excluded.
[0078] FIG. 27A is an exome-seq graph showing relative NRF2 exon abundance across 808 cancer cell lines, showing a decrease in reads mapping to exon 2.
[0079] FIG. 27B is an exome-seq graph showing normalized z-scores for exon read coverage across 1,218 squamous NSCLC tumors. Eleven tumors showing decreased read count for exon 2 or exon 2+3 are compared to nearby control regions.
[0080] FIG. 28A is a schematic diagram showing the genomic location of discordant read pairs in seven tumors supporting genomic alterations affecting NRF2 exon 2 or exon 2+3.
[0081] FIG. 28B-1 is a set of graphs showing copy number analyses of chromosome 2 for two tumor samples with NRF2 exon 2 focal deletions. Arrows point to NRF2 exon 2. The log-ratios of target regions are shown in black, and those of control regions are shown in gray.
[0082] FIG. 28B-2 is a set of graphs showing copy number analyses of chromosome 2 for two tumor samples with NRF2 exon 2+3 focal deletions. Arrows point to NRF2 exon 2 and exon 3. The log-ratios of target regions are shown in black, and those of control regions are shown in gray.
[0083] FIG. 28C is a series of whole-genome sequencing graphs showing the presence of microdeletions surrounding NRF2 exon 2 in JHH-6 cells, KMS-27 cells, as well as primary tumor and adjacent matched DNA. The sequences of reads spanning the deletions are shown.
[0084] FIG. 29 is a series of agarose gel images showing RNA products amplified from total RNA of select patients with squamous NSCLC. Shown are amplification products from patient #58 tumor tissue, patient #64 tumor tissue, patient #63 normal tissue, and patient #63 tumor tissue by RT-PCR. Regions surrounding NRF2 exon 2 were amplified with the primers indicated in FIG. 11A. Fragments from wild-type NRF2 and .DELTA.e2 NRF2 are indicated. RT-PCR analysis identified patient #63 as having loss of NRF2 exon 2, which was strongly enriched in the tumor compared to the adjacent normal tissue.
[0085] FIG. 30 is a graph showing the presence of junction reads skipping exon 2 in tumor and normal cells, as quantified by RNA-seq.
[0086] FIG. 31 is a histogram of the mutant KEAP1 gene signature score for TCGA samples from lung squamous carcinoma (LUSC). Dark gray histograms represent KEAP1/NRF2 mutant tumors, light gray histograms represent exon 2/3-deleted tumors, and medium gray histograms represent KEAP1/NRF2 wild-type tumors. The gene signature score for a given sample was determined by summation of gene expression z-scores over all genes in the gene signature.
[0087] FIG. 32 is a series of histograms of the mutant KEAP1 gene signature score for TCGA samples from lung squamous carcinoma (LUSC), lung adenocarcinoma (LUAD), and head and neck squamous carcinoma (HNSC). Dark gray histograms represent KEAP1/NRF2 mutant tumors, light gray histograms represent exon 2/3-deleted tumors, and medium gray histograms represent KEAP1/NRF2 wild-type tumors. The gene signature score for a given sample was determined by summation of gene expression z-scores over all genes in the gene signature.
[0088] FIG. 33 is a series of histograms of the mutant KEAP1 gene signature score for TCGA samples from lung squamous carcinoma (LUSC), lung adenocarcinoma (LUAD), and head and neck squamous carcinoma (HNSC). Dark gray histograms represent tumor samples, and light gray histograms represent normal samples. The gene signature score for a given sample was determined by summation of gene expression z-scores over all genes in the gene signature.
[0089] FIG. 34 is a series of junction read sequences showing the structure of the deletions in JHH-6 cells, KMS-27 cells, and primary tumor, identified by WGS. The DNA sequences of the 3' end, 5' end, and junction read of JHH-6 cells are provided by SEQ ID NOs: 61-63, respectively. The DNA sequences of the 3' end, 5' end, and junction read of KMS-27 cells are provided by SEQ ID NOs: 64-66, respectively. The DNA sequences of the 3' end, 5' end, and junction read of primary tumor cells are provided by SEQ ID NOs: 67-69, respectively.
[0090] FIG. 35 is a series of Western blots showing the relative expression of NRF2. The indicated cell lines were infected with lentiviruses expressing independent non-target control (NTC) or three independent NRF2 shRNA sequences (sh1, sh2, and sh3) and were incubated for 48 hours with (+) or without (-) 500 ng/mL doxycycline (dox) following puromycin selection.
[0091] FIG. 36 is a graph showing the viability of the cell lines shown in FIG. 35 after incubation with or without dox for 7 days. Viability was measured using CellTiter-Glo (CTG) ATP detection. Each circle is the average of six technical replicates, and values were normalized to the average percent viability of three independent NTCs+dox.
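By way of non-limiting illustration, the normalization described for FIG. 36 may be sketched as follows. This is a minimal sketch that assumes one plausible reading of the description: the six technical replicates of a sample are averaged, and the result is expressed as a percentage of the average signal of three independent NTC+dox controls. The function name and luminescence values are hypothetical.

```python
from statistics import mean

def percent_viability(sample_replicates, ntc_dox_groups):
    """Express the mean sample signal as a percent of the mean NTC+dox signal.

    sample_replicates: luminescence readings for one sample (e.g., six wells).
    ntc_dox_groups: one list of readings per independent NTC+dox control.
    """
    ntc_mean = mean([mean(group) for group in ntc_dox_groups])
    return 100.0 * mean(sample_replicates) / ntc_mean

# Hypothetical CellTiter-Glo luminescence readings (arbitrary units).
sample = [400, 420, 380, 410, 390, 400]
ntc_controls = [[1000, 1000], [900, 1100], [1000, 1000]]
print(percent_viability(sample, ntc_controls))
```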
[0092] FIG. 37 is a graph showing the viability of cell lines treated with dox vs no dox. Cells were grown for four days and viability measured using CTG ATP measurement. Significance was calculated using Student's t test.
[0093] FIG. 38 is a graph showing the viability of 28 NSCLC cell lines following treatment with NRF2 siRNA relative to NTC treatment. Cells are grouped by KEAP1 genotype. Significance was calculated using Student's t test.
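By way of non-limiting illustration, the test statistic underlying a Student's t test of the kind referred to for FIGS. 37 and 38 may be sketched as follows. The description does not state whether the test was paired or unpaired, or one-tailed or two-tailed; this minimal sketch assumes the unpaired, equal-variance (pooled) form and returns only the t statistic and degrees of freedom. Converting these to a p-value requires the t distribution, for example from a statistics library. The function name and example values are hypothetical.

```python
import math
from statistics import mean, variance

def student_t(group_a, group_b):
    """Unpaired two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Each group needs at least two values.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(group_a) - mean(group_b)) / standard_error, df

# Hypothetical relative viabilities for two groups of cell lines.
keap1_mutant = [1.0, 2.0, 3.0]
keap1_wild_type = [2.0, 3.0, 4.0]
t_stat, dof = student_t(keap1_mutant, keap1_wild_type)
print(round(t_stat, 4), dof)
```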
[0094] FIG. 39 is a Western blot experiment showing the expression of NRF2 in KEAP1 mutant tumors. Mice were implanted with A549 cells expressing NRF2 sh10. When tumors reached .about.200 mm.sup.3, 1 mg/ml doxycycline or 5% sucrose was added to the drinking water. After five days, tumor extracts were blotted for NRF2.
[0095] FIG. 40 is a Western blot experiment showing the expression of NRF2 in KEAP1 wild-type tumors. Mice were implanted with H441 cells expressing NRF2 sh10. When tumors reached .about.200 mm.sup.3, 1 mg/ml doxycycline or 5% sucrose was added to the drinking water. After five days, tumor extracts were blotted for NRF2.
[0096] FIG. 41A is a graph showing the kinetics of tumor volume in mice implanted with KEAP1 mutant tumors. Mice were implanted with A549 cell lines expressing NRF2 sh10. When tumors reached .about.200 mm.sup.3, mice were randomized into groups of 10, and either 1 mg/ml doxycycline or 5% sucrose was added to the drinking water. Tumors were measured over a 28-day period. Error bars represent SEM (n=10).
[0097] FIG. 41B is a graph showing the kinetics of tumor volume in mice implanted with KEAP1 wild-type tumors. Mice were implanted with H441 cell lines expressing NRF2 sh10. When tumors reached .about.200 mm.sup.3, mice were randomized into groups of 10, and either 1 mg/ml doxycycline or 5% sucrose was added to the drinking water. Tumors were measured over a 28-day period. Error bars represent SEM (n=10).
[0098] FIG. 42 is a series of bar graphs showing viability of A549 or H441 cells in various growth conditions. A549 and H460 cells expressing NTC or NRF2 sh10 shRNAs were plated into either 2D tissue culture treated plastic dishes or ultra-low attachment (ULA) coated tissue culture plates. They were then cultured for five days in either environmental oxygen concentrations or 0.5% oxygen (hypoxia). Cell viability was assessed by CTG ATP measurements.
[0099] FIG. 43 is a series of photographs showing colony formation of KEAP1 mutant cell lines (A549, H1437, and H460) and KEAP1 wild-type cell lines (H1048, H441, and Calu6) in soft agar treated with vehicle, 500 ng/ml dox, or 1 mM reduced glutathione (GSH). Representative areas of the plate were photographed.
[0100] FIG. 44 is a series of bar graphs showing the quantified colony formation for each cell type and treatment group shown in FIG. 43. Error bars represent standard deviation from biological triplicate wells.
[0101] FIG. 45 is a series of photographs showing A549 colony formation on SCIVAX.RTM. micropatterned nanoculture dishes. Cells were photographed after about five days in culture in the presence or absence of 500 ng/ml dox.
[0102] FIG. 46 is a bar graph showing viability of the cells from FIG. 45, quantified by CTG ATP measurements. The left column of each treatment group represents 1,000-cell cultures, and the right column represents 5,000-cell cultures.
[0103] FIG. 47 is a series of photographs showing 5,000, 50,000, or 500,000 NTC or NRF2sh10 shRNA expressing A549 cells plated in methylcellulose-containing tissue culture dishes. Cells were photographed after .about.10 days of culture in the presence or absence of 500 ng/ml doxycycline.
[0104] FIG. 48 is a series of photographs showing A549 cells expressing NRF2sh10 shRNA plated into regular tissue culture dishes (top) or soft agar (bottom). Cells were treated with either vehicle or 500 ng/ml doxycycline, in the presence or absence of 2 mM N-acetyl cysteine (NAC). Viability in 2D growth was measured after about five days by CTG ATP measurement, and photographs of cells in soft agar were taken after about ten days of growth.
[0105] FIG. 49 is a bar graph showing reactive oxygen species (ROS) levels under indicated conditions as measured using 2',7'-dichlorodihydrofluorescein diacetate (H2DCF). Error bars represent standard deviation from triplicate wells.
[0106] FIG. 50 is a Western blot experiment showing the effect of NRF2 knockdown on expression of SLC7A11. A549 cells expressing NRF2 sh10 were treated with vehicle or 500 ng/ml dox for the indicated time points and blotted using SLC7A11 and .beta.-actin antibodies.
[0107] FIG. 51 is a bar graph showing cystine uptake by A549 cells expressing NTC1 or NRF2 sh10 over various concentrations of erastin. A549 cells expressing NTC1 or NRF2 sh10 were incubated with vehicle or dox for 48 hours, then incubated with 0.5 .mu.Ci .sup.14C-Cystine for 20 minutes. Cells were lysed and intracellular cystine was measured by liquid scintillation counting.
[0108] FIG. 52 is a bar graph showing glutathione (GSH) levels in A549 and H1437 cells in response to NRF2 knockdown.
[0109] FIG. 53 is a histogram showing increasing ROS levels in response to shNRF2 and/or erastin, as measured by H2DCF.
[0110] FIG. 54 is a graph showing viability of A549 cells expressing shNTC or shNRF2 over a dose response of erastin after about four days, as measured using CTG ATP measurements.
[0111] FIG. 55A is a graph showing the IC.sub.50 of erastin on KEAP1 wild-type cell lines versus KEAP1 mutant cell lines, derived from a dose response graph as shown in FIG. 54.
[0112] FIG. 55B is a graph showing the viability of KEAP1 wild-type cell lines versus KEAP1 mutant cell lines in response to erastin, as area under the curve of a dose response graph as shown in FIG. 54.
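By way of non-limiting illustration, the two summary values described for FIGS. 55A and 55B (an IC.sub.50 and an area under the dose response curve) may be sketched as follows. The description does not state how these values were derived; this minimal sketch assumes linear interpolation on log10(dose) for the 50% crossing and the trapezoid rule on log10(dose) for the area, in place of a fitted sigmoidal model. The function names, doses, and viability values are hypothetical.

```python
import math

def ic50_by_interpolation(doses, viability, threshold=50.0):
    """Dose at which viability first falls to `threshold` percent.

    Interpolates linearly on log10(dose) between the two bracketing points.
    Returns None if the curve never crosses the threshold.
    """
    points = sorted(zip(doses, viability))
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        if v0 >= threshold >= v1 and v0 != v1:
            fraction = (v0 - threshold) / (v0 - v1)
            log_dose = math.log10(d0) + fraction * (math.log10(d1) - math.log10(d0))
            return 10 ** log_dose
    return None

def auc_log_dose(doses, viability):
    """Area under the viability curve over log10(dose), by the trapezoid rule."""
    points = sorted(zip(doses, viability))
    area = 0.0
    for (d0, v0), (d1, v1) in zip(points, points[1:]):
        area += 0.5 * (v0 + v1) * (math.log10(d1) - math.log10(d0))
    return area

# Hypothetical dose response: doses in micromolar, viability in percent.
doses_um = [0.1, 1.0, 10.0, 100.0]
viability_pct = [100.0, 80.0, 20.0, 0.0]
print(ic50_by_interpolation(doses_um, viability_pct))
print(auc_log_dose(doses_um, viability_pct))
```

The area is useful for cell lines whose viability never drops below 50%, where `ic50_by_interpolation` returns None.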
[0113] FIG. 56A is a graph showing the IC.sub.50 of the glutaminase inhibitor BPTES on KEAP1 wild-type cell lines versus KEAP1 mutant cell lines.
[0114] FIG. 56B is a graph showing the viability of KEAP1 wild-type cell lines versus KEAP1 mutant cell lines in response to the glutaminase inhibitor BPTES.
[0115] FIG. 57A is a graph showing the IC.sub.50 of the glutathione synthase inhibitor buthionine sulfoximine (BSO) on KEAP1 wild-type cell lines versus KEAP1 mutant cell lines.
[0116] FIG. 57B is a graph showing the viability of KEAP1 wild-type cell lines versus KEAP1 mutant cell lines in response to BSO.
[0117] FIG. 58 is a scatterplot showing average gRNA expression per indicated gene in KEAP1 mutant NSCLC cells grown for 15 days in a 3D methylcellulose culture versus a 2D plastic tissue culture dish. A549 cells were infected with lentivirus (0.3 MOI at 1000.times. coverage) expressing a gRNA library comprising 481 NRF2/KEAP1 target genes and 37 control genes. Puromycin-resistant cells were then plated into 2D plastic tissue culture dishes or grown in methyl cellulose. After various time points, cells were collected and gRNAs identified by Next Gen sequencing.
[0118] FIG. 59 is a scatterplot showing average gRNA expression per indicated gene in KEAP1 mutant NSCLC cells implanted in nude mice (xeno) versus grown for 15 days in a 2D plastic tissue culture dish. A549 cells were infected with lentivirus (0.3 MOI at 1000.times. coverage) expressing a gRNA library comprising 481 NRF2/KEAP1 target genes and 37 control genes. Puromycin-resistant cells were then plated into 2D plastic tissue culture dishes or implanted into nude mice. After various time points, cells were collected and gRNAs identified by Next Gen sequencing.
[0119] FIG. 60 is a scatterplot showing average gRNA expression per indicated gene in KEAP1 mutant NSCLC cells implanted in nude mice (xeno) versus grown for 15 days in a 3D methylcellulose culture. A549 cells were infected with lentivirus (0.3 MOI at 1000.times. coverage) expressing a gRNA library comprising 481 NRF2/KEAP1 target genes and 37 control genes. Puromycin-resistant cells were then grown in methyl cellulose or implanted into nude mice. After various time points, cells were collected and gRNAs identified by Next Gen sequencing.
[0120] FIG. 61 is a graph showing kinetics of A549 xenograft tumor volume in response to treatment with the Erb2 antibody YW57.88.5.
[0121] FIG. 62 is a series of photographs showing colony formation of KEAP1 mutant cell lines and KEAP1 wild-type cell lines grown in soft agar (anchorage independent conditions) in the presence of IGF1R inhibitors linsitinib and NVP-AEW541, and in the presence or absence of glutathione.
DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION
I. Introduction
[0122] The present invention provides diagnostic and accompanying therapeutic methods for cancer, such as lung cancer (e.g., NSCLC) or head and neck squamous cancer (e.g., HNSC). The invention is based, at least in part, on the discovery that splice variants in NRF2 that remove exon 2 or exons 2+3 result in an unexpected mechanism for conferring NRF2 activation in cancers. The NRF2 splice variants result in NRF2 activation by a mutually exclusive mechanism from mutations in KEAP1 or NRF2, yet result in a similar NRF2 target gene expression profile. In cell lines with microdeletions that result in these NRF2 splice variants, there is a loss of NRF2-KEAP1 interaction, increased NRF2 stabilization, induction of a NRF2 transcriptional response, and NRF2 pathway dependency. This occurs in 3-6% of squamous NSCLC and 1-2% of HNSC and results in a similar activation of NRF2 target genes and dependency on the pathway as KEAP1 mutations.
[0123] This discovery is useful for diagnosing a subject suffering from cancer (e.g., by detecting a NRF2 splice variant or by detecting a gene or protein expression profile consistent with the presence of a NRF2 splice variant) and for treating a subject according to such a diagnosis (e.g., by administering a therapeutically effective amount of a NRF2 pathway antagonist, e.g., a cAMP Responsive Element Binding Protein (CREB) Binding Protein (CBP) inhibitor).
II. Definitions
[0124] The terms "diagnose," "diagnosing," or "diagnosis" are used herein to refer to the identification or classification of a molecular or pathological state, disease or condition (e.g., cancer). For example, "diagnosis" may refer to identification of a particular type of cancer. "Diagnosis" may also refer to the classification of a particular subtype of cancer, e.g., by histopathological criteria, or by molecular features (e.g., a subtype characterized by expression of one or a combination of biomarkers (e.g., particular genes or proteins encoded by said genes)).
[0125] The terms "cancer" and "cancerous" refer to or describe the physiological condition in mammals that is typically characterized by unregulated cell growth. Included in this definition are benign and malignant cancers as well as dormant tumors or micrometastases. Examples of cancer include, but are not limited to, carcinoma, lymphoma, blastoma, glioblastoma, sarcoma, and leukemia. Cancers may include, for example, breast cancer, squamous cell cancer, lung cancer (including small-cell lung cancer, non-small cell lung cancer (NSCLC), adenocarcinoma of the lung, and squamous carcinoma of the lung (e.g., squamous NSCLC)), various types of head and neck cancer (e.g., HNSC), cancer of the peritoneum, hepatocellular cancer, gastric or stomach cancer (including gastrointestinal cancer), pancreatic cancer, ovarian cancer, cervical cancer, liver cancer, bladder cancer, hepatoma, colon cancer, colorectal cancer, endometrial or uterine carcinoma, salivary gland carcinoma, kidney or renal cancer, liver cancer, prostate cancer, vulval cancer, thyroid cancer, and hepatic carcinoma, as well as B-cell lymphoma (including low grade/follicular non-Hodgkin's lymphoma (NHL), small lymphocytic (SL) NHL, intermediate grade/follicular NHL, intermediate grade diffuse NHL, high grade immunoblastic NHL, high grade lymphoblastic NHL, high grade small non-cleaved cell NHL, bulky disease NHL, mantle cell lymphoma, AIDS-related lymphoma, and Waldenstrom's Macroglobulinemia), chronic lymphocytic leukemia (CLL), acute lymphoblastic leukemia (ALL), hairy cell leukemia, chronic myeloblastic leukemia, and post-transplant lymphoproliferative disorder (PTLD), as well as abnormal vascular proliferation associated with phakomatoses, edema (such as that associated with brain tumors), and Meigs' syndrome.
[0126] A "patient" or "subject" herein refers to any single animal (including, e.g., a mammal, such as a dog, a cat, a horse, a rabbit, a zoo animal, a cow, a pig, a sheep, a non-human primate, and a human), such as a human, eligible for treatment who is experiencing or has experienced one or more signs, symptoms, or other indicators of a disease or disorder, such as a cancer. Intended to be included as a patient are any patients involved in clinical research trials not showing any clinical sign of disease, patients involved in epidemiological studies, or patients once used as controls. The patient may have been previously treated with a NRF2 pathway antagonist or another drug, or not so treated. The patient may be naive to an additional drug(s) being used when the treatment herein is started, i.e., the patient may not have been previously treated with, for example, a therapy other than a NRF2 pathway antagonist (e.g., a VEGF antagonist or a PD-1 axis binding antagonist) at "baseline" (i.e., at a set point in time before the administration of a first dose of a NRF2 pathway antagonist in the treatment method herein, such as the day of screening the subject before treatment is commenced). Such "naive" patients or subjects are generally considered to be candidates for treatment with such additional drug(s).
[0127] The terms "level of expression" or "expression level" in general are used interchangeably and generally refer to the amount of a biomarker in a biological sample. "Expression" generally refers to the process by which information (e.g., gene-encoded and/or epigenetic information) is converted into the structures present and operating in the cell. Therefore, as used herein, "expression" may refer to transcription into a polynucleotide, translation into a polypeptide, or even polynucleotide and/or polypeptide modifications (e.g., posttranslational modification of a polypeptide). Fragments of the transcribed polynucleotide, the translated polypeptide, or polynucleotide and/or polypeptide modifications (e.g., post-translational modification of a polypeptide) shall also be regarded as expressed whether they originate from a transcript generated by alternative splicing or a degraded transcript, or from a post-translational processing of the polypeptide, e.g., by proteolysis. "Expressed genes" include those that are transcribed into a polynucleotide as mRNA and then translated into a polypeptide, and also those that are transcribed into RNA but not translated into a polypeptide (for example, transfer and ribosomal RNAs).
[0128] The terms "biomarker" and "marker" are used interchangeably herein to refer to a DNA, RNA, protein, carbohydrate, or glycolipid-based molecular marker, the expression or presence of which in a subject's or patient's sample can be detected by standard methods (or methods disclosed herein). Such biomarkers include, but are not limited to, the mRNA sequences set forth in Table 1 and encoded proteins thereof. Expression of such a biomarker may be determined to be higher or lower in a sample obtained from a patient sensitive or responsive to a NRF2 pathway antagonist than a reference level (including, e.g., the average (e.g., mean or median) expression level of the biomarker in a sample from a group/population of patients, e.g., patients having cancer, and being tested for responsiveness to a NRF2 pathway antagonist; the median expression level of the biomarker in a sample from a group/population of patients, e.g., patients having cancer, and identified as not responding to NRF2 pathway antagonists; the level in a sample previously obtained from the individual at a prior time; or the level in a sample from a patient who received prior treatment with a NRF2 pathway antagonist in a primary tumor setting, and who now may be experiencing metastasis). Individuals having an expression level that is greater than or less than the reference expression level of at least one gene, such as those set forth in Table 1 can be identified as subjects/patients likely to respond to treatment with a NRF2 pathway antagonist. For example, such subjects/patients who exhibit gene expression levels at the most extreme 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, or 5% relative to (i.e., higher or lower than) the reference level (such as the mean level), can be identified as subjects/patients (e.g., patients having cancer) likely to respond to treatment with a NRF2 pathway antagonist.
TABLE-US-00001 TABLE 1
SEQ ID NO    Biomarker
1            ABCC2
2            AKR1B10
3            AKR1B15
4            AKR1C2
5            AKR1C3
6            AKR1C4
7            CABYR
8            CYP4F11
9            FECH
10           FTL
11           GCLM
12           GSR
13           KYNU
14           ME1
15           NRF2/NFE2L2
16           NQO1
17           NR0B1
18           OSGIN1
19           PGD
20           RSPO3
21           SLC7A11
22           SRXN1
23           TALDO1
24           TRIM16
25           TRIM16L
26           TXN
27           TXNRD1
28           UGDH
[0129] The term "ABCC2" as used herein, refers to any native ABCC2 (ATP-Binding Cassette Sub-Family C, Member 2) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed ABCC2 as well as any form of ABCC2 that results from processing in the cell. The term also encompasses naturally occurring variants of ABCC2, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human ABCC2 is set forth in SEQ ID NO: 1. The amino acid sequence of an exemplary protein encoded by human ABCC2 is shown in SEQ ID NO: 33.
[0130] The term "AKR1B10" as used herein, refers to any native AKR1B10 (Aldo-Keto Reductase Family 1, Member B10) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed AKR1B10 as well as any form of AKR1B10 that results from processing in the cell. The term also encompasses naturally occurring variants of AKR1B10, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human AKR1B10 is set forth in SEQ ID NO: 2. The amino acid sequence of an exemplary protein encoded by human AKR1B10 is shown in SEQ ID NO: 34.
[0131] The term "AKR1B15" as used herein, refers to any native AKR1B15 (Aldo-Keto Reductase Family 1, Member B15) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed AKR1B15 as well as any form of AKR1B15 that results from processing in the cell. The term also encompasses naturally occurring variants of AKR1B15, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human AKR1B15 is set forth in SEQ ID NO: 3. The amino acid sequence of an exemplary protein encoded by human AKR1B15 is shown in SEQ ID NO: 35.
[0132] The term "AKR1C2" as used herein, refers to any native AKR1C2 (Aldo-Keto Reductase Family 1, Member C2) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed AKR1C2 as well as any form of AKR1C2 that results from processing in the cell. The term also encompasses naturally occurring variants of AKR1C2, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human AKR1C2 is set forth in SEQ ID NO: 4. The amino acid sequence of an exemplary protein encoded by human AKR1C2 is shown in SEQ ID NO: 36.
[0133] The term "AKR1C3" as used herein, refers to any native AKR1C3 (Aldo-Keto Reductase Family 1, Member C3) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed AKR1C3 as well as any form of AKR1C3 that results from processing in the cell. The term also encompasses naturally occurring variants of AKR1C3, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human AKR1C3 is set forth in SEQ ID NO: 5. The amino acid sequence of an exemplary protein encoded by human AKR1C3 is shown in SEQ ID NO: 37.
[0134] The term "AKR1C4" as used herein, refers to any native AKR1C4 (Aldo-Keto Reductase Family 1, Member C4) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed AKR1C4 as well as any form of AKR1C4 that results from processing in the cell. The term also encompasses naturally occurring variants of AKR1C4, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human AKR1C4 is set forth in SEQ ID NO: 6. The amino acid sequence of an exemplary protein encoded by human AKR1C4 is shown in SEQ ID NO: 38.
[0135] The term "CABYR" as used herein, refers to any native CABYR (Calcium Binding Tyrosine-(Y)-Phosphorylation Regulated) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed CABYR as well as any form of CABYR that results from processing in the cell. The term also encompasses naturally occurring variants of CABYR, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human CABYR is set forth in SEQ ID NO: 7. The amino acid sequence of an exemplary protein encoded by human CABYR is shown in SEQ ID NO: 39.
[0136] The term "CYP4F11" as used herein, refers to any native CYP4F11 (Cytochrome P450, Family 4, Subfamily F, Polypeptide 11) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed CYP4F11 as well as any form of CYP4F11 that results from processing in the cell. The term also encompasses naturally occurring variants of CYP4F11, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human CYP4F11 is set forth in SEQ ID NO: 8. The amino acid sequence of an exemplary protein encoded by human CYP4F11 is shown in SEQ ID NO: 40.
[0137] The term "FECH" as used herein, refers to any native FECH (Ferrochelatase) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed FECH as well as any form of FECH that results from processing in the cell. The term also encompasses naturally occurring variants of FECH, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human FECH is set forth in SEQ ID NO: 9. The amino acid sequence of an exemplary protein encoded by human FECH is shown in SEQ ID NO: 41.
[0138] The term "FTL" as used herein, refers to any native FTL (Ferritin, Light Polypeptide) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed FTL as well as any form of FTL that results from processing in the cell. The term also encompasses naturally occurring variants of FTL, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human FTL is set forth in SEQ ID NO: 10. The amino acid sequence of an exemplary protein encoded by human FTL is shown in SEQ ID NO: 42.
[0139] The term "GCLM" as used herein, refers to any native GCLM (Glutamate-Cysteine Ligase, Modifier Subunit) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed GCLM as well as any form of GCLM that results from processing in the cell. The term also encompasses naturally occurring variants of GCLM, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human GCLM is set forth in SEQ ID NO: 11. The amino acid sequence of an exemplary protein encoded by human GCLM is shown in SEQ ID NO: 43.
[0140] The term "GSR" as used herein, refers to any native GSR (Glutathione Reductase) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed GSR as well as any form of GSR that results from processing in the cell. The term also encompasses naturally occurring variants of GSR, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human GSR is set forth in SEQ ID NO: 12. The amino acid sequence of an exemplary protein encoded by human GSR is shown in SEQ ID NO: 44.
[0141] The term "KYNU" as used herein, refers to any native KYNU (Kynureninase) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed KYNU as well as any form of KYNU that results from processing in the cell. The term also encompasses naturally occurring variants of KYNU, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human KYNU is set forth in SEQ ID NO: 13. The amino acid sequence of an exemplary protein encoded by human KYNU is shown in SEQ ID NO: 45.
[0142] The term "ME1" as used herein, refers to any native ME1 (Malic Enzyme 1, NADP(+)-Dependent, Cytosolic) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed ME1 as well as any form of ME1 that results from processing in the cell. The term also encompasses naturally occurring variants of ME1, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human ME1 is set forth in SEQ ID NO: 14. The amino acid sequence of an exemplary protein encoded by human ME1 is shown in SEQ ID NO: 46.
[0143] The term "NFE2L2" or "NRF2" as used herein, refers to any native NFE2L2 or NRF2 (Nuclear Factor, Erythroid 2-Like 2) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed NFE2L2 as well as any form of NFE2L2 that results from processing in the cell. The term also encompasses naturally occurring variants of NFE2L2, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human NFE2L2 is set forth in SEQ ID NO: 15. The amino acid sequence of an exemplary protein encoded by human NFE2L2 is shown in SEQ ID NO: 47.
[0144] The term "NQO1" as used herein, refers to any native NQO1 (NAD(P)H Dehydrogenase, Quinone 1) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed NQO1 as well as any form of NQO1 that results from processing in the cell. The term also encompasses naturally occurring variants of NQO1, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human NQO1 is set forth in SEQ ID NO: 16. The amino acid sequence of an exemplary protein encoded by human NQO1 is shown in SEQ ID NO: 48.
[0145] The term "NR0B1" as used herein, refers to any native NR0B1 (Nuclear Receptor Subfamily 0, Group B, Member 1) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed NR0B1 as well as any form of NR0B1 that results from processing in the cell. The term also encompasses naturally occurring variants of NR0B1, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human NR0B1 is set forth in SEQ ID NO: 17. The amino acid sequence of an exemplary protein encoded by human NR0B1 is shown in SEQ ID NO: 49.
[0146] The term "OSGIN1" as used herein, refers to any native OSGIN1 (Oxidative Stress Induced Growth Inhibitor 1) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed OSGIN1 as well as any form of OSGIN1 that results from processing in the cell. The term also encompasses naturally occurring variants of OSGIN1, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human OSGIN1 is set forth in SEQ ID NO: 18. The amino acid sequence of an exemplary protein encoded by human OSGIN1 is shown in SEQ ID NO: 50.
[0147] The term "PGD" as used herein, refers to any native PGD (Phosphogluconate Dehydrogenase) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed PGD as well as any form of PGD that results from processing in the cell. The term also encompasses naturally occurring variants of PGD, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human PGD is set forth in SEQ ID NO: 19. The amino acid sequence of an exemplary protein encoded by human PGD is shown in SEQ ID NO: 51.
[0148] The term "RSPO3" as used herein, refers to any native RSPO3 (R-Spondin 3) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed RSPO3 as well as any form of RSPO3 that results from processing in the cell. The term also encompasses naturally occurring variants of RSPO3, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human RSPO3 is set forth in SEQ ID NO: 20. The amino acid sequence of an exemplary protein encoded by human RSPO3 is shown in SEQ ID NO: 52.
[0149] The term "SLC7A11" as used herein, refers to any native SLC7A11 (Solute Carrier Family 7 (Anionic Amino Acid Transporter Light Chain, Xc-System), Member 11) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed SLC7A11 as well as any form of SLC7A11 that results from processing in the cell. The term also encompasses naturally occurring variants of SLC7A11, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human SLC7A11 is set forth in SEQ ID NO: 21. The amino acid sequence of an exemplary protein encoded by human SLC7A11 is shown in SEQ ID NO: 53.
[0150] The term "SRXN1" as used herein, refers to any native SRXN1 (Sulfiredoxin 1) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed SRXN1 as well as any form of SRXN1 that results from processing in the cell. The term also encompasses naturally occurring variants of SRXN1, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human SRXN1 is set forth in SEQ ID NO: 22. The amino acid sequence of an exemplary protein encoded by human SRXN1 is shown in SEQ ID NO: 54.
[0151] The term "TALDO1" as used herein, refers to any native TALDO1 (Transaldolase 1) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed TALDO1 as well as any form of TALDO1 that results from processing in the cell. The term also encompasses naturally occurring variants of TALDO1, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human TALDO1 is set forth in SEQ ID NO: 23. The amino acid sequence of an exemplary protein encoded by human TALDO1 is shown in SEQ ID NO: 55.
[0152] The term "TRIM16" as used herein, refers to any native TRIM16 (Tripartite Motif Containing 16) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed TRIM16 as well as any form of TRIM16 that results from processing in the cell. The term also encompasses naturally occurring variants of TRIM16, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human TRIM16 is set forth in SEQ ID NO: 24. The amino acid sequence of an exemplary protein encoded by human TRIM16 is shown in SEQ ID NO: 56.
[0153] The term "TRIM16L" as used herein, refers to any native TRIM16L (Tripartite Motif Containing 16-Like) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed TRIM16L as well as any form of TRIM16L that results from processing in the cell. The term also encompasses naturally occurring variants of TRIM16L, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human TRIM16L is set forth in SEQ ID NO: 25. The amino acid sequence of an exemplary protein encoded by human TRIM16L is shown in SEQ ID NO: 57.
[0154] The term "TXN" as used herein, refers to any native TXN (Thioredoxin) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed TXN as well as any form of TXN that results from processing in the cell. The term also encompasses naturally occurring variants of TXN, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human TXN is set forth in SEQ ID NO: 26. The amino acid sequence of an exemplary protein encoded by human TXN is shown in SEQ ID NO: 58.
[0155] The term "TXNRD1" as used herein, refers to any native TXNRD1 (Thioredoxin Reductase 1) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed TXNRD1 as well as any form of TXNRD1 that results from processing in the cell. The term also encompasses naturally occurring variants of TXNRD1, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human TXNRD1 is set forth in SEQ ID NO: 27. The amino acid sequence of an exemplary protein encoded by human TXNRD1 is shown in SEQ ID NO: 59.
[0156] The term "UGDH" as used herein, refers to any native UGDH (Uridine Diphospho (UDP)-Glucose 6-Dehydrogenase) from any vertebrate source, including mammals such as primates (e.g., humans) and rodents (e.g., mice and rats), unless otherwise indicated. The term encompasses "full-length," unprocessed UGDH as well as any form of UGDH that results from processing in the cell. The term also encompasses naturally occurring variants of UGDH, e.g., splice variants or allelic variants. The nucleic acid sequence of an exemplary human UGDH is set forth in SEQ ID NO: 28. The amino acid sequence of an exemplary protein encoded by human UGDH is shown in SEQ ID NO: 60.
[0157] The terms "sample" and "biological sample" are used interchangeably to refer to any biological sample obtained from an individual including body fluids, body tissue (e.g., tumor tissue), cells, or other sources. Body fluids are, e.g., lymph, sera, whole fresh blood, peripheral blood mononuclear cells, frozen whole blood, plasma (including fresh or frozen), urine, saliva, semen, synovial fluid and spinal fluid. Samples also include breast tissue, renal tissue, colonic tissue, brain tissue, muscle tissue, synovial tissue, skin, hair follicle, bone marrow, and tumor tissue. Methods for obtaining tissue biopsies and body fluids from mammals are well known in the art.
[0158] By "tissue sample" or "cell sample" is meant a collection of similar cells obtained from a tissue of a subject or individual. The source of the tissue or cell sample may be solid tissue as from a fresh, frozen and/or preserved organ, tissue sample, biopsy, and/or aspirate; blood or any blood constituents such as plasma; bodily fluids such as cerebral spinal fluid, amniotic fluid, peritoneal fluid, or interstitial fluid; cells from any time in gestation or development of the subject. The tissue sample may also be primary or cultured cells or cell lines. Optionally, the tissue or cell sample is obtained from a disease tissue/organ. The tissue sample may contain compounds which are not naturally intermixed with the tissue in nature such as preservatives, anticoagulants, buffers, fixatives, nutrients, antibiotics, or the like.
[0159] A "reference sample," "reference cell," "reference tissue," "control sample," "control cell," or "control tissue," as used herein, refers to a sample, cell, tissue, standard, or level that is used for comparison purposes. In one embodiment, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is obtained from a healthy and/or non-diseased part of the body (e.g., tissue or cells) of the same subject or individual, for example, healthy and/or non-diseased cells or tissue adjacent to the diseased cells or tissue (e.g., cells or tissue adjacent to a tumor). In another embodiment, a reference sample is obtained from an untreated tissue and/or cell of the body of the same subject or individual. In yet another embodiment, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is obtained from a healthy and/or non-diseased part of the body (e.g., tissues or cells) of an individual who is not the subject or individual. In still another embodiment, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is obtained from an untreated tissue and/or cell of the body of an individual who is not the subject or individual. In another embodiment, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is obtained from one or more cell lines (e.g., one or more normal cell lines).
[0160] The phrase "identifying a patient" or "identifies a patient" as used herein, refers to using the information or data generated relating to the level of at least one of the genes set forth in Table 1, the presence of NRF2 mRNA having deletion of all or a portion of its exon 2 or exon 2+3, or the presence of NRF2 protein having a deletion of all or a portion of its Neh2 or Neh2+4 in a sample of a patient to identify or select the patient as more likely to benefit or less likely to benefit from a therapy comprising a NRF2 pathway antagonist. The information or data used or generated may be in any form, written, oral or electronic. In some embodiments, using the information or data generated includes communicating, presenting, reporting, storing, sending, transferring, supplying, transmitting, dispensing, or combinations thereof. In some embodiments, communicating, presenting, reporting, storing, sending, transferring, supplying, transmitting, dispensing, or combinations thereof are performed by a computing device, analyzer unit or combination thereof. In some further embodiments, communicating, presenting, reporting, storing, sending, transferring, supplying, transmitting, dispensing, or combinations thereof are performed by a laboratory or medical professional. In some embodiments, the information or data includes a comparison of the level of at least one of the genes set forth in Table 1 to a reference level. In some embodiments, the information or data includes an indication that at least one of the genes set forth in Table 1 is present or absent in the sample. In some embodiments, the information or data includes an indication that the NRF2 mRNA has a deletion of all or a portion of its exon 2 or exon 2+3. In some embodiments, the information or data includes an indication that the NRF2 protein has a deletion of all or a portion of its Neh2 or Neh2+4. 
In some embodiments, the information or data includes an indication that the patient is more likely or less likely to respond to a therapy comprising a NRF2 pathway antagonist.
[0161] The term "primer" refers to a single-stranded polynucleotide that is capable of hybridizing to a nucleic acid and allowing polymerization of a complementary nucleic acid, generally by providing a free 3'-OH group.
[0162] As used herein, the term "treatment" (and variations thereof, such as "treat" or "treating") refers to clinical intervention in an attempt to alter the natural course of the individual being treated, and can be performed either for prophylaxis or during the course of clinical pathology. Desirable effects of treatment include, but are not limited to, preventing occurrence or recurrence of disease, alleviation of symptoms, diminishment of any direct or indirect pathological consequences of the disease, preventing metastasis, decreasing the rate of disease progression, amelioration or palliation of the disease state, and remission or improved prognosis. In some embodiments, antibodies of the invention are used to delay development of a disease or to slow the progression of a disease.
[0163] As used herein, by "administering" is meant a method of giving a dosage of a compound (e.g., a NRF2 pathway antagonist) to a subject. The compositions utilized in the methods described herein can be administered, for example, intravitreally (e.g., by intravitreal injection), by eye drop, intramuscularly, intravenously, intradermally, percutaneously, intraarterially, intraperitoneally, intralesionally, intracranially, intraarticularly, intraprostatically, intrapleurally, intratracheally, intrathecally, intranasally, intravaginally, intrarectally, topically, intratumorally, peritoneally, subcutaneously, subconjunctivally, intravesicularly, mucosally, intrapericardially, intraumbilically, intraocularly, intraorbitally, orally, transdermally, by inhalation, by injection, by implantation, by infusion, by continuous infusion, by localized perfusion bathing target cells directly, by catheter, by lavage, in creams, or in lipid compositions. The compositions utilized in the methods described herein can also be administered systemically or locally. The method of administration can vary depending on various factors (e.g., the compound or composition being administered and the severity of the condition, disease, or disorder being treated).
[0164] An "effective amount" of an agent, e.g., a pharmaceutical formulation, refers to an amount effective, at dosages and for periods of time necessary, to achieve the desired therapeutic or prophylactic result.
[0165] The term "antibody" herein is used in the broadest sense and encompasses various antibody structures, including but not limited to monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), and antibody fragments so long as they exhibit the desired antigen-binding activity.
[0166] "Percent (%) amino acid sequence identity" with respect to a reference polypeptide sequence is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in the reference polypeptide sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for aligning sequences, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared. For purposes herein, however, % amino acid sequence identity values are generated using the sequence comparison computer program ALIGN-2. The ALIGN-2 sequence comparison computer program was authored by Genentech, Inc., and the source code has been filed with user documentation in the U.S. Copyright Office, Washington D.C., 20559, where it is registered under U.S. Copyright Registration No. TXU510087. The ALIGN-2 program is publicly available from Genentech, Inc., South San Francisco, Calif., or may be compiled from the source code. The ALIGN-2 program should be compiled for use on a UNIX operating system, including digital UNIX V4.0D. All sequence comparison parameters are set by the ALIGN-2 program and do not vary.
[0167] In situations where ALIGN-2 is employed for amino acid sequence comparisons, the % amino acid sequence identity of a given amino acid sequence A to, with, or against a given amino acid sequence B (which can alternatively be phrased as a given amino acid sequence A that has or comprises a certain % amino acid sequence identity to, with, or against a given amino acid sequence B) is calculated as follows:
100 times the fraction X/Y
where X is the number of amino acid residues scored as identical matches by the sequence alignment program ALIGN-2 in that program's alignment of A and B, and where Y is the total number of amino acid residues in B. It will be appreciated that where the length of amino acid sequence A is not equal to the length of amino acid sequence B, the % amino acid sequence identity of A to B will not equal the % amino acid sequence identity of B to A. Unless specifically stated otherwise, all % amino acid sequence identity values used herein are obtained as described in the immediately preceding paragraph using the ALIGN-2 computer program.
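The calculation of 100 times the fraction X/Y can be illustrated with a minimal sketch. The sketch does not run or reproduce ALIGN-2. It assumes that an alignment of A and B has already been produced and is supplied as two equal-length strings, with "-" marking gap positions. The function name percent_identity and the short sequences in the usage lines are hypothetical and serve only to show that the identity of A to B need not equal the identity of B to A when the two sequences differ in length.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return 100 * X / Y for sequence A against sequence B.

    X = aligned positions where A and B carry the same residue.
    Y = total number of residues in B (gap characters excluded).
    The alignment itself is an input; this sketch does not compute it.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # X: identical matches, ignoring positions where either side is a gap
    x = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap)
    # Y: residues of B only, so the result depends on which sequence is B
    y = sum(1 for b in aligned_b if b != gap)
    if y == 0:
        raise ValueError("sequence B contains no residues")
    return 100.0 * x / y


# Hypothetical alignment: A has 5 residues, B has 8 residues.
a_to_b = percent_identity("MKTAY---", "MKTAYIAK")  # 100 * 5 / 8
b_to_a = percent_identity("MKTAYIAK", "MKTAY---")  # 100 * 5 / 5
print(a_to_b, b_to_a)
```

In this illustration X is 5 in both directions, while Y is 8 when B is the longer sequence and 5 when B is the shorter one, so the two values differ.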
[0168] The term "anti-neoplastic" refers to a composition useful in treating cancer comprising at least one active therapeutic agent, e.g., "anti-cancer agent." Examples of therapeutic agents (anti-cancer agents) include, but are not limited to, e.g., chemotherapeutic agents, growth inhibitory agents, cytotoxic agents, agents used in radiation therapy, anti-angiogenesis agents, apoptotic agents, anti-tubulin agents, and other agents to treat cancer, such as anti-HER-2 antibodies, anti-CD20 antibodies, an epidermal growth factor receptor (EGFR) antagonist (e.g., a tyrosine kinase inhibitor), HER1/EGFR inhibitor (e.g., erlotinib (TARCEVA.TM.)), platelet derived growth factor inhibitors (e.g., GLEEVEC.TM. (Imatinib Mesylate)), a COX-2 inhibitor (e.g., celecoxib), interferons, cytokines, antagonists (e.g., neutralizing antibodies) that bind to one or more of the following targets ErbB2, ErbB3, ErbB4, PDGFR-beta, BlyS, APRIL, BCMA or VEGF receptor(s), TRAIL/Apo2, and other bioactive and organic chemical agents, and the like. Combinations thereof are also included in the invention.
[0169] The term "cytotoxic agent" as used herein refers to a substance that inhibits or prevents a cellular function and/or causes cell death or destruction. Cytotoxic agents include, but are not limited to, radioactive isotopes (e.g., At.sup.211, I.sup.131, I.sup.125, Y.sup.90, Re.sup.186, Re.sup.188, Sm.sup.153, Bi.sup.212, P.sup.32, Pb.sup.212 and radioactive isotopes of Lu); chemotherapeutic agents or drugs (e.g., methotrexate, adriamycin, vinca alkaloids (vincristine, vinblastine, etoposide), doxorubicin, melphalan, mitomycin C, chlorambucil, daunorubicin or other intercalating agents); growth inhibitory agents; enzymes and fragments thereof such as nucleolytic enzymes, antibiotics, toxins such as small molecule toxins or enzymatically active toxins of bacterial, fungal, plant or animal origin, including fragments and/or variants thereof, and the various antitumor or anticancer agents disclosed below.
[0170] A "chemotherapeutic agent" is a chemical compound useful in the treatment of cancer. Examples of chemotherapeutic agents include alkylating agents, such as, for example, temozolomide (TMZ), the imidazotetrazine derivative of the alkylating agent dacarbazine. Additional examples of chemotherapeutic agents include, e.g., paclitaxel or topotecan or pegylated liposomal doxorubicin (PLD). Other examples of chemotherapeutic agents include alkylating agents such as thiotepa and CYTOXAN.RTM. cyclophosphamide; alkyl sulfonates such as busulfan, improsulfan and piposulfan; aziridines such as benzodopa, carboquone, meturedopa, and uredopa; ethylenimines and methylmelamines including altretamine, triethylenemelamine, triethylenephosphoramide, triethylenethiophosphoramide and trimethylolmelamine; acetogenins (especially bullatacin and bullatacinone); a camptothecin; bryostatin; callystatin; CC-1065 (including its adozelesin, carzelesin and bizelesin synthetic analogues); cryptophycins (particularly cryptophycin 1 and cryptophycin 8); dolastatin; duocarmycin (including the synthetic analogues, KW-2189 and CB1-TM1); eleutherobin; pancratistatin; a sarcodictyin; spongistatin; nitrogen mustards such as chlorambucil, chlornaphazine, cholophosphamide, estramustine, ifosfamide, mechlorethamine, mechlorethamine oxide hydrochloride, melphalan, novembichin, phenesterine, prednimustine, trofosfamide, uracil mustard; nitrosoureas such as carmustine, chlorozotocin, fotemustine, lomustine, nimustine, and ranimustine; antibiotics such as the enediyne antibiotics (e.g., calicheamicin, especially calicheamicin gamma1I and calicheamicin omegaI1 (see, e.g., Angew. Chem. Intl. Ed. Engl., 33: 183-186 (1994)); dynemicin, including dynemicin A; bisphosphonates, such as clodronate; an esperamicin; as well as neocarzinostatin chromophore and related chromoprotein enediyne antibiotic chromophores), aclacinomycins, actinomycin, anthramycin, azaserine, bleomycins, cactinomycin, carabicin, carminomycin, carzinophilin, chromomycins, dactinomycin, daunorubicin, detorubicin, 6-diazo-5-oxo-L-norleucine, ADRIAMYCIN.RTM. doxorubicin (including morpholino-doxorubicin, cyanomorpholino-doxorubicin, 2-pyrrolino-doxorubicin and deoxydoxorubicin), epirubicin, esorubicin, idarubicin, marcellomycin, mitomycins such as mitomycin C, mycophenolic acid, nogalamycin, olivomycins, peplomycin, porfiromycin, puromycin, quelamycin, rodorubicin, streptonigrin, streptozocin, tubercidin, ubenimex, zinostatin, zorubicin; anti-metabolites such as methotrexate and 5-fluorouracil (5-FU); folic acid analogues such as denopterin, methotrexate, pteropterin, trimetrexate; purine analogs such as fludarabine, 6-mercaptopurine, thiamiprine, thioguanine; pyrimidine analogs such as ancitabine, azacitidine, 6-azauridine, carmofur, cytarabine, dideoxyuridine, doxifluridine, enocitabine, floxuridine; androgens such as calusterone, dromostanolone propionate, epitiostanol, mepitiostane, testolactone; anti-adrenals such as aminoglutethimide, mitotane, trilostane; folic acid replenisher such as folinic acid; aceglatone; aldophosphamide glycoside; aminolevulinic acid; eniluracil; amsacrine; bestrabucil; bisantrene; edatraxate; defofamine; demecolcine; diaziquone; elfornithine; elliptinium acetate; an epothilone; etoglucid; gallium nitrate; hydroxyurea; lentinan; lonidamine; maytansinoids such as maytansine and ansamitocins; mitoguazone; mitoxantrone; mopidamol; nitracrine; pentostatin; phenamet; pirarubicin; losoxantrone; podophyllinic acid; 2-ethylhydrazide; procarbazine; PSK.RTM. polysaccharide complex (JHS Natural Products, Eugene, Oreg.); razoxane; rhizoxin; sizofuran; spirogermanium; tenuazonic acid; triaziquone; 2,2',2''-trichlorotriethylamine; trichothecenes (especially T-2 toxin, verrucarin A, roridin A and anguidine); urethan; vindesine; dacarbazine; mannomustine; mitobronitol; mitolactol; pipobroman; gacytosine; arabinoside ("Ara-C"); cyclophosphamide; thiotepa; taxoids, e.g., TAXOL.RTM. paclitaxel (Bristol-Myers Squibb Oncology, Princeton, N.J.), ABRAXANE.RTM. Cremophor-free, albumin-engineered nanoparticle formulation of paclitaxel (American Pharmaceutical Partners, Schaumburg, Ill.), and TAXOTERE.RTM. docetaxel (Rhone-Poulenc Rorer, Antony, France); chlorambucil; GEMZAR.RTM. gemcitabine; 6-thioguanine; mercaptopurine; methotrexate; platinum analogs such as cisplatin, oxaliplatin and carboplatin; vinblastine; platinum; etoposide (VP-16); ifosfamide; mitoxantrone; vincristine; NAVELBINE.RTM. vinorelbine; novantrone; teniposide; edatrexate; daunomycin; aminopterin; xeloda; ibandronate; irinotecan (Camptosar, CPT-11) (including the treatment regimen of irinotecan with 5-FU and leucovorin); topoisomerase inhibitor RFS 2000; difluoromethylornithine (DFMO); retinoids such as retinoic acid; capecitabine; combretastatin; leucovorin (LV); oxaliplatin, including the oxaliplatin treatment regimen (FOLFOX); lapatinib (Tykerb.RTM.); inhibitors of PKC-alpha, Raf, H-Ras, EGFR (e.g., erlotinib (Tarceva.RTM.)) and VEGF that reduce cell proliferation; and pharmaceutically acceptable salts, acids, or derivatives of any of the above.
[0171] The terms "Programmed Death Ligand 1" and "PD-L1" refer herein to a native sequence PD-L1 polypeptide, polypeptide variants, and fragments of a native sequence polypeptide and polypeptide variants. The PD-L1 polypeptide described herein may be that which is isolated from a variety of sources, such as from human tissue types or from another source, or prepared by recombinant or synthetic methods.
[0172] The term "PD-L1 axis binding antagonist" refers to a molecule that inhibits the interaction of a PD-L1 axis binding partner with one or more of its binding partners, so as to remove T-cell dysfunction resulting from signaling on the PD-1 signaling axis, with a result being restored or enhanced T-cell function. As used herein, a PD-L1 axis binding antagonist includes a PD-L1 binding antagonist and a PD-1 binding antagonist as well as molecules that interfere with the interaction between PD-L1 and PD-1 (e.g., a PD-L2-Fc fusion).
[0173] As used herein, a "PD-L1 binding antagonist" is a molecule that decreases, blocks, inhibits, abrogates or interferes with signal transduction resulting from the interaction of PD-L1 with one or more of its binding partners, such as PD-1 and/or B7-1. In some embodiments, a PD-L1 binding antagonist is a molecule that inhibits the binding of PD-L1 to its binding partners. In a specific aspect, the PD-L1 binding antagonist inhibits binding of PD-L1 to PD-1 and/or B7-1. In some embodiments, PD-L1 binding antagonists include anti-PD-L1 antibodies and antigen-binding fragments thereof, immunoadhesins, fusion proteins, oligopeptides, small molecule antagonists, polynucleotide antagonists, and other molecules that decrease, block, inhibit, abrogate or interfere with signal transduction resulting from the interaction of PD-L1 with one or more of its binding partners, such as PD-1 and/or B7-1. In one embodiment, a PD-L1 binding antagonist reduces the negative signal mediated by or through cell surface proteins expressed on T lymphocytes, and other cells, mediated signaling through PD-L1 or PD-1 so as to render a dysfunctional T-cell less dysfunctional. In some embodiments, a PD-L1 binding antagonist is an anti-PD-L1 antibody. In a specific aspect, an anti-PD-L1 antibody is YW243.55.S70. In another specific aspect, an anti-PD-L1 antibody is MDX-1105. In still another specific aspect, an anti-PD-L1 antibody is atezolizumab (MPDL3280A). In still another specific aspect, an anti-PD-L1 antibody is MEDI4736 (durvalumab). In still another specific aspect, an anti-PD-L1 antibody is MSB0010718C (avelumab).
[0174] As used herein, a "PD-1 binding antagonist" is a molecule that decreases, blocks, inhibits, abrogates or interferes with signal transduction resulting from the interaction of PD-1 with one or more of its binding partners, such as PD-L1 and/or PD-L2. In some embodiments, the PD-1 binding antagonist is a molecule that inhibits the binding of PD-1 to its binding partners. In a specific aspect, the PD-1 binding antagonist inhibits the binding of PD-1 to PD-L1 and/or PD-L2. For example, PD-1 binding antagonists include anti-PD-1 antibodies and antigen-binding fragments thereof, immunoadhesins, fusion proteins, oligopeptides, small molecule antagonists, polynucleotide antagonists, and other molecules that decrease, block, inhibit, abrogate or interfere with signal transduction resulting from the interaction of PD-1 with PD-L1 and/or PD-L2. In one embodiment, a PD-1 binding antagonist reduces the negative signal mediated by or through cell surface proteins expressed on T lymphocytes, and other cells, mediated signaling through PD-1 or PD-L1 so as to render a dysfunctional T-cell less dysfunctional. In some embodiments, the PD-1 binding antagonist is an anti-PD-1 antibody. In a specific aspect, a PD-1 binding antagonist is MDX-1106 (nivolumab). In another specific aspect, a PD-1 binding antagonist is MK-3475 (pembrolizumab). In another specific aspect, a PD-1 binding antagonist is CT-011 (pidilizumab). In another specific aspect, a PD-1 binding antagonist is MEDI-0680 (AMP-514). In another specific aspect, a PD-1 binding antagonist is PDR001. In another specific aspect, a PD-1 binding antagonist is REGN2810 described herein. In another specific aspect, a PD-1 binding antagonist is BGB-108 described herein. In another specific aspect, a PD-1 binding antagonist is AMP-224.
[0175] The term "vascular endothelial growth factor" or "VEGF" refers to vascular endothelial growth factor. The term "VEGF" encompasses homologues and isoforms thereof. The term "VEGF" also encompasses the known isoforms, e.g., splice isoforms, of VEGF, e.g., VEGF.sub.111, VEGF.sub.121, VEGF.sub.145, VEGF.sub.165, VEGF.sub.189, and VEGF.sub.206, together with the naturally-occurring allelic and processed forms thereof, including the 110-amino acid human vascular endothelial cell growth factor generated by plasmin cleavage of VEGF.sub.165 as described in Ferrara Mol. Biol. Cell. 21:687 (2010), Leung et al., Science, 246:1306 (1989), and Houck et al., Mol. Endocrin., 5:1806 (1991). The term "VEGF" also refers to VEGFs from non-human species such as mouse, rat or primate. Sometimes the VEGF from a specific species are indicated by terms such as hVEGF for human VEGF, mVEGF for murine VEGF, and the like. The term "VEGF" is also used to refer to truncated forms of the polypeptide comprising amino acids 8 to 109 or 1 to 109 of the 165-amino acid human vascular endothelial cell growth factor. Reference to any such forms of VEGF may be identified in the present application, e.g., by "VEGF.sub.109," "VEGF (8-109)," "VEGF (1-109)" or "VEGF.sub.165." The amino acid positions for a "truncated" native VEGF are numbered as indicated in the native VEGF sequence. For example, amino acid position 17 (methionine) in truncated native VEGF is also position 17 (methionine) in native VEGF. The truncated native VEGF has binding affinity for the KDR and Flt-1 receptors comparable to native VEGF. The term "VEGF variant" as used herein refers to a VEGF polypeptide which includes one or more amino acid mutations in the native VEGF sequence. Optionally, the one or more amino acid mutations include amino acid substitution(s). 
For purposes of shorthand designation of VEGF variants described herein, it is noted that numbers refer to the amino acid residue position along the amino acid sequence of the putative native VEGF (provided in Leung et al., supra and Houck et al., supra).
[0176] The term "VEGF antagonist," as used herein, refers to a molecule capable of binding to VEGF, reducing VEGF expression levels, or neutralizing, blocking, inhibiting, abrogating, reducing, or interfering with VEGF biological activities, including, but not limited to, VEGF binding to one or more VEGF receptors, VEGF signaling, and VEGF-mediated angiogenesis and endothelial cell survival or proliferation. For example, a molecule capable of neutralizing, blocking, inhibiting, abrogating, reducing, or interfering with VEGF biological activities can exert its effects by binding to one or more VEGF receptors (VEGFR) (e.g., VEGFR1, VEGFR2, VEGFR3, membrane-bound VEGF receptor (mbVEGFR), or soluble VEGF receptor (sVEGFR)). Included as VEGF antagonists useful in the methods of the invention are polypeptides that specifically bind to VEGF, anti-VEGF antibodies and antigen-binding fragments thereof, receptor molecules and derivatives which bind specifically to VEGF thereby sequestering its binding to one or more receptors, fusion proteins (e.g., VEGF-Trap (Regeneron)), and VEGF.sub.121-gelonin (Peregrine). VEGF antagonists also include antagonist variants of VEGF polypeptides; antisense nucleobase oligomers complementary to at least a fragment of a nucleic acid molecule encoding a VEGF polypeptide; small RNAs complementary to at least a fragment of a nucleic acid molecule encoding a VEGF polypeptide; ribozymes that target VEGF; peptibodies to VEGF; and VEGF aptamers. VEGF antagonists also include polypeptides that bind to VEGFR, anti-VEGFR antibodies, and antigen-binding fragments thereof, and derivatives which bind to VEGFR thereby blocking, inhibiting, abrogating, reducing, or interfering with VEGF biological activities (e.g., VEGF signaling), or fusion proteins.
[0177] VEGF antagonists also include nonpeptide small molecules that bind to VEGF or VEGFR and are capable of blocking, inhibiting, abrogating, reducing, or interfering with VEGF biological activities. Thus, the term "VEGF activities" specifically includes VEGF-mediated biological activities of VEGF. In certain embodiments, the VEGF antagonist reduces or inhibits, by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or more, the expression level or biological activity of VEGF. In some embodiments, the VEGF inhibited by the VEGF-specific antagonist is VEGF (8-109), VEGF (1-109), or VEGF.sub.165.
[0178] As used herein, VEGF antagonists can include, but are not limited to, anti-VEGFR2 antibodies and related molecules (e.g., ramucirumab, tanibirumab, aflibercept), anti-VEGFR1 antibodies and related molecules (e.g., icrucumab, aflibercept (VEGF Trap-Eye; EYLEA.RTM.), and ziv-aflibercept (VEGF Trap; ZALTRAP.RTM.)), bispecific VEGF antibodies (e.g., MP-0250, vanucizumab (VEGF-ANG2), and bispecific antibodies disclosed in US 2001/0236388), bispecific antibodies including combinations of two of anti-VEGF, anti-VEGFR1, and anti-VEGFR2 arms, anti-VEGF antibodies (e.g., bevacizumab, sevacizumab, and ranibizumab), and nonpeptide small molecule VEGF antagonists (e.g., pazopanib, axitinib, vandetanib, stivarga, cabozantinib, lenvatinib, nintedanib, orantinib, telatinib, dovitinib, cediranib, motesanib, sulfatinib, apatinib, foretinib, famitinib, and tivozanib).
[0179] The terms "anti-VEGF antibody," an "antibody that binds to VEGF," and "antibody that specifically binds VEGF" refer to an antibody that is capable of binding VEGF with sufficient affinity such that the antibody is useful as a diagnostic and/or therapeutic agent in targeting VEGF. In one embodiment, the extent of binding of an anti-VEGF antibody to an unrelated, non-VEGF protein is less than about 10% of the binding of the antibody to VEGF as measured, for example, by a radioimmunoassay (RIA). In certain embodiments, an antibody that binds to VEGF has a dissociation constant (Kd) of .ltoreq.1 .mu.M, .ltoreq.100 nM, .ltoreq.10 nM, .ltoreq.1 nM, .ltoreq.0.1 nM, .ltoreq.0.01 nM, or .ltoreq.0.001 nM (e.g. 10.sup.-8 M or less, e.g., from 10.sup.-8 M to 10.sup.-13 M, e.g., from 10.sup.-9 M to 10.sup.-13 M). In certain embodiments, an anti-VEGF antibody binds to an epitope of VEGF that is conserved among VEGF from different species.
[0180] In certain embodiments, the anti-VEGF antibody can be used as a therapeutic agent in targeting and interfering with diseases or conditions wherein the VEGF activity is involved. Also, the antibody may be subjected to other biological activity assays, e.g., in order to evaluate its effectiveness as a therapeutic. Such assays are known in the art and depend on the target antigen and intended use for the antibody. Examples include the HUVEC inhibition assay; tumor cell growth inhibition assays (as described in WO 89/06692, for example); antibody-dependent cellular cytotoxicity (ADCC) and complement-mediated cytotoxicity (CDC) assays (U.S. Pat. No. 5,500,362); and agonistic activity or hematopoiesis assays (see WO 95/27062). An anti-VEGF antibody will usually not bind to other VEGF homologues such as VEGF-B or VEGF-C, nor other growth factors such as PlGF, PDGF, or bFGF. In one embodiment, the anti-VEGF antibody is a monoclonal antibody that binds to the same epitope as the monoclonal anti-VEGF antibody A4.6.1 produced by hybridoma ATCC HB 10709. In another embodiment, the anti-VEGF antibody is a recombinant humanized anti-VEGF monoclonal antibody generated according to Presta et al. (1997) Cancer Res. 57:4593-4599, including but not limited to the antibody known as bevacizumab (BV; AVASTIN.RTM.).
[0181] The anti-VEGF antibody "ranibizumab," also known as "LUCENTIS.RTM." or "rhuFab V2," is a humanized, affinity-matured anti-human VEGF Fab fragment. Ranibizumab is produced by standard recombinant technology methods in Escherichia coli expression vector and bacterial fermentation. Ranibizumab is not glycosylated and has a molecular mass of .about.48,000 daltons. See WO 98/45331 and US 2003/0190317. Additional preferred antibodies include the G6 or B20 series antibodies (e.g., G6-31, B20-4.1), as described in PCT Application Publication Nos. WO 2005/012359 and WO 2005/044853, which are each incorporated herein by reference in their entirety. For additional preferred antibodies see U.S. Pat. Nos. 7,060,269, 6,582,959, 6,703,020; 6,054,297; WO98/45332; WO 96/30046; WO94/10202; EP 0666868B1; U.S. Patent Application Publication Nos. 2006009360, 20050186208, 20030206899, 20030190317, 20030203409, and 20050112126; and Popkov et al., Journal of Immunological Methods 288:149-164 (2004). Other preferred antibodies include those that bind to a functional epitope on human VEGF comprising residues F17, M18, D19, Y21, Y25, Q89, I91, K101, E103, and C104 or, alternatively, comprising residues F17, Y21, Q22, Y25, D63, I83, and Q89. Additional anti-VEGF antibodies include anti-VEGF antibodies described in PCT Application Publication No. WO 2009/155724.
[0182] The term "co-administered" is used herein to refer to administration of two or more therapeutic agents, where at least part of the administration overlaps in time. Accordingly, co-administration includes a dosing regimen when the administration of one or more agent(s) continues after discontinuing the administration of one or more other agent(s).
[0183] "Tumor," as used herein, refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues. The terms "cancer," "cancerous," "cell proliferative disorder," "proliferative disorder," and "tumor" are not mutually exclusive as referred to herein.
III. Methods
[0184] A. Diagnostic Methods
[0185] Provided herein are methods for diagnosing cancer (e.g., a lung cancer (e.g., squamous NSCLC or non-squamous NSCLC) or a head and neck cancer (e.g., HNSC)) in a subject. Also provided herein are methods for identifying a subject having a cancer that is a NRF2-dependent cancer (e.g., lung cancer, e.g., squamous non-small cell lung cancer or non-squamous non-small cell lung cancer, or head and neck cancer). Any of the methods may be based on the expression level of a biomarker provided herein, for example, a splice variant of NRF2 (e.g., NRF2 mRNA or NRF2 protein), or an increased expression of one or more NRF2 target genes. Any of the methods may further include administering to the subject a NRF2 pathway antagonist. Any of the methods may further include administering an effective amount of a second therapeutic (e.g., one or more (e.g., 1, 2, 3, or 4 or more) additional NRF2 pathway antagonists or one or more (e.g., 1, 2, 3, or 4 or more) anti-cancer agents) to the subject.
[0186] The invention provides a method of diagnosing a cancer in a subject, the method comprising determining the expression level of at least one gene (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27 genes) selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject; and comparing the expression level of the at least one gene to a reference expression level of the at least one gene, wherein an increase in the expression level of the at least one gene in the sample relative to the reference expression level of the at least one gene identifies a subject having a cancer.
[0187] The invention further provides a method of identifying a subject having a cancer that is a NRF2-dependent cancer, the method comprising determining the expression level of at least one gene (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27 genes) selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject; comparing the expression level of the at least one gene to a reference expression level of the at least one gene; and determining if the subject's cancer is a NRF2-dependent cancer, wherein an increase in the expression level of the at least one gene in the sample relative to the reference expression level of the at least one gene identifies a subject having a NRF2-dependent cancer.
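By way of illustration only, the per-gene comparison recited in the two preceding methods can be sketched in Python as follows. The function names, the dictionary layout (gene symbol mapped to a numeric expression value), and the rule that any value above the reference counts as an increase are hypothetical simplifications and are not part of the disclosure; the sketch also assumes that sample and reference values are measured on the same scale.

```python
# The 27 NRF2 target genes recited in the methods above.
NRF2_TARGET_GENES = (
    "AKR1B10", "AKR1C2", "SRXN1", "OSGIN1", "FECH", "GCLM", "TRIM16", "ME1",
    "KYNU", "CABYR", "SLC7A11", "TRIM16L", "AKR1C4", "CYP4F11", "RSPO3",
    "ABCC2", "AKR1B15", "NR0B1", "UGDH", "TXNRD1", "GSR", "AKR1C3", "TALDO1",
    "PGD", "TXN", "NQO1", "FTL",
)


def increased_genes(sample, reference, genes=NRF2_TARGET_GENES):
    """Return genes measured in both inputs whose sample level exceeds the reference level."""
    return [
        g for g in genes
        if g in sample and g in reference and sample[g] > reference[g]
    ]


def all_increased(sample, reference, genes=NRF2_TARGET_GENES):
    """True only if every listed gene is measured and increased over its reference."""
    return len(increased_genes(sample, reference, genes)) == len(genes)
```

A caller interested in the "at least one gene" embodiments would test whether `increased_genes(...)` is non-empty, whereas `all_increased(...)` corresponds to embodiments requiring an increase in each listed gene.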
[0188] In any of the preceding methods, the expression level of one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 21) of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, or NQO1 is determined.
[0189] In any of the preceding methods, the expression level of one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12) newly identified NRF2 target genes is determined. Newly identified NRF2 target genes include AKR1B10, AKR1C2, ME1, KYNU, CABYR, TRIM16L, AKR1C4, CYP4F11, RSPO3, AKR1B15, NR0B1, and AKR1C3.
[0190] The invention further provides a method of diagnosing a cancer in a subject, the method comprising determining the mRNA expression level of NRF2 comprising a deletion in all or a portion of its exon 2 in a sample obtained from the subject (e.g., a tumor sample), wherein the presence of NRF2 comprising a deletion in all or a portion of its exon 2 identifies the subject as having a cancer. In some embodiments, the NRF2 further comprises a deletion in all or a portion of its exon 3. Presence and/or expression levels of a gene (e.g., NRF2, KEAP1, AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, or FTL) may be determined qualitatively or quantitatively based on any suitable criterion known in the art, including, but not limited to DNA, mRNA, cDNA, protein fragments, and/or gene copy number.
[0191] The invention further provides a method of diagnosing a cancer in a subject, the method comprising determining the protein expression level of NRF2 comprising a deletion in all or a portion of its Neh2 domain in a sample obtained from the subject, wherein the presence of NRF2 comprising a deletion in all or a portion of its Neh2 domain identifies the subject as having a cancer. In some embodiments, the NRF2 further comprises a deletion in all or a portion of its Neh4 domain.
[0192] The invention further provides a method of identifying a subject having cancer, the method comprising determining the mRNA expression level of NRF2 comprising a deletion in all or a portion of its exon 2 in a sample obtained from the subject (e.g., a tumor sample), wherein the presence of NRF2 comprising a deletion in all or a portion of its exon 2 identifies the subject as having a cancer. In some embodiments, the NRF2 further comprises a deletion in all or a portion of its exon 3. Presence and/or expression levels of a gene (e.g., NRF2, KEAP1, AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, or FTL) may be determined qualitatively or quantitatively based on any suitable criterion known in the art, including, but not limited to DNA, mRNA, cDNA, protein fragments, and/or gene copy number.
[0193] The invention further provides a method of identifying a subject having cancer, the method comprising determining the protein expression level of NRF2 comprising a deletion in all or a portion of its Neh2 domain in a sample obtained from the subject, wherein the presence of NRF2 comprising a deletion in all or a portion of its Neh2 domain identifies the subject as having a cancer. In some embodiments, the NRF2 further comprises a deletion in all or a portion of its Neh4 domain.
[0194] The presence and/or expression level/amount of various biomarkers described herein in a sample can be analyzed by a number of methodologies, many of which are known in the art and understood by the skilled artisan, including, but not limited to, immunohistochemistry ("IHC"), Western blot analysis, immunoprecipitation, molecular binding assays, ELISA, ELIFA, fluorescence activated cell sorting ("FACS"), MassARRAY, proteomics, quantitative blood based assays (e.g., Serum ELISA), biochemical enzymatic activity assays, in situ hybridization, fluorescence in situ hybridization (FISH), Southern analysis, Northern analysis, whole genome sequencing, massively parallel DNA sequencing (e.g., next-generation sequencing), NANOSTRING.RTM., polymerase chain reaction (PCR) including quantitative real time PCR (qRT-PCR) and other amplification type detection methods, such as, for example, branched DNA, SISBA, TMA and the like, RNA-seq, microarray analysis, gene expression profiling, and/or serial analysis of gene expression ("SAGE"), as well as any one of the wide variety of assays that can be performed by protein, gene, and/or tissue array analysis. Typical protocols for evaluating the status of genes and gene products are found, for example in Ausubel et al., eds., 1995, Current Protocols In Molecular Biology, Units 2 (Northern Blotting), 4 (Southern Blotting), 15 (Immunoblotting) and 18 (PCR Analysis). Multiplexed immunoassays such as those available from Rules Based Medicine or Meso Scale Discovery ("MSD") may also be used.
[0195] In some embodiments of any of the methods described herein, DNA from clinical tumor samples can be sequenced using a next-generation sequencing method, such as the targeted gene pulldown and sequencing method described in Frampton et al. (Nature Biotechnology. 31(11): 1023-1033, 2013), which is incorporated by reference herein in its entirety. Such a next-generation sequencing method can be used with any of the methods disclosed herein to detect various mutations (e.g., insertions, deletions, base substitutions, focal gene amplifications, and/or homozygous gene deletions), while enabling the use of small samples (e.g., from small-core needle biopsies, fine-needle aspirations, and/or cell blocks) or fixed samples (e.g., formalin-fixed and paraffin-embedded (FFPE) samples).
[0196] In any of the preceding methods, the presence and/or expression level/amount of a biomarker (e.g., NRF2, KEAP1, AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, or FTL) is measured by determining protein expression levels of the biomarker. In certain embodiments, the method comprises contacting the biological sample with antibodies that specifically bind to a biomarker (e.g., anti-NRF2 antibodies) under conditions permissive for binding of the biomarker, and detecting whether a complex is formed between the antibodies and biomarker. Such method may be an in vitro or in vivo method. Any method of measuring protein expression levels known in the art or provided herein may be used. For example, in some embodiments, a protein expression level of a biomarker is determined using a method selected from the group consisting of flow cytometry (e.g., fluorescence-activated cell sorting (FACS.TM.)), Western blot, enzyme-linked immunosorbent assay (ELISA), immunoprecipitation, immunohistochemistry (IHC), immunofluorescence, radioimmunoassay, dot blotting, immunodetection methods, HPLC, surface plasmon resonance, optical spectroscopy, and mass spectrometry. In some embodiments, the protein expression level of the biomarker is determined in tumor cells.
[0197] In some embodiments, the presence and/or expression level/amount of a biomarker (e.g., NRF2, KEAP1, AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, or FTL) is measured by determining mRNA expression levels of the biomarker. In certain embodiments, presence and/or expression level/amount of a gene is determined using a method comprising: (a) performing gene expression profiling, PCR (such as RT-PCR), RNA-seq, microarray analysis, SAGE, MassARRAY technique, or FISH on a sample (such as a subject cancer sample); and (b) determining presence and/or expression level/amount of a biomarker in the sample. In one embodiment, the PCR method is qRT-PCR. In one embodiment, the PCR method is multiplex-PCR. In some embodiments, gene expression is measured by microarray. In some embodiments, gene expression is measured by qRT-PCR. In some embodiments, expression is measured by multiplex-PCR.
[0198] Methods for the evaluation of mRNAs in cells are well known and include, for example, hybridization assays using complementary DNA probes (such as in situ hybridization using labeled riboprobes specific for the one or more genes, Northern blot and related techniques) and various nucleic acid amplification assays (such as RT-PCR using complementary primers specific for one or more of the genes, and other amplification type detection methods, such as, for example, branched DNA, SISBA, TMA and the like). Samples from mammals can be conveniently assayed for mRNAs using Northern, dot blot, or PCR analysis. In addition, such methods can include one or more steps that allow one to determine the levels of target mRNA in a biological sample (e.g., by simultaneously examining the levels of a comparative control mRNA sequence of a "housekeeping" gene such as an actin family member).
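As a non-limiting sketch of normalizing a target mRNA to a housekeeping control, the Python below applies the widely used comparative threshold-cycle (delta-delta Ct) calculation to qRT-PCR data. This calculation is not recited in the text above and is an assumption made for illustration, as are the function names, the argument layout, and the default of perfect doubling per cycle (an amplification efficiency of 2).

```python
def delta_ct(ct_target, ct_housekeeping):
    """Ct of the target gene minus Ct of the housekeeping gene in the same sample."""
    return ct_target - ct_housekeeping


def relative_expression(ct_target_sample, ct_hk_sample,
                        ct_target_ref, ct_hk_ref, efficiency=2.0):
    """Fold expression of the target in the sample relative to the reference.

    Each Ct is first normalized to its own housekeeping gene; the two
    normalized values are then compared (delta-delta Ct). A lower Ct means
    more starting template, hence the negative exponent.
    """
    ddct = (delta_ct(ct_target_sample, ct_hk_sample)
            - delta_ct(ct_target_ref, ct_hk_ref))
    return efficiency ** (-ddct)
```

For instance, a target that crosses threshold two cycles earlier in the sample than in the reference, with identical housekeeping Ct values, corresponds to a four-fold relative expression under the doubling assumption.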
[0199] In some embodiments of any of the methods, the biomarker is NRF2 (e.g., exon 2-deleted NRF2 or exon 2+3-deleted NRF2). In one embodiment, expression level of biomarker is determined using a method comprising performing WGS analysis on a sample (such as a tumor sample obtained from a patient) and determining expression level of a biomarker in the sample. In some embodiments, presence of exon 2-deleted NRF2 or exon 2+3-deleted NRF2 is determined relative to a reference. In some embodiments, the reference is a reference value. In some embodiments, the reference is a reference sample (e.g., a control cell line sample, a tissue sample from non-cancerous patient, or a wild-type NRF2 tissue sample).
[0200] Additionally or alternatively to mRNA expression analysis, other biomarkers, such as protein expression, may be quantified according to methods described above. For example, methods of the invention include testing a sample for a genomic biomarker (e.g., the presence of exon 2-deleted NRF2 or exon 2+3-deleted NRF2, or the upregulation of one or more NRF2 target genes, e.g., AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, or FTL) and additionally testing a sample for a protein biomarker (e.g., protein transcripts of one or more of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, or FTL).
[0201] In some embodiments of any of the methods, a DNA sequence may serve as a biomarker. DNA can be quantified according to any method known in the art, including, but not limited to, PCR, exome-seq (e.g., whole exome sequencing), DNA microarray analysis, NANOSTRING.RTM., or whole genome sequencing.
[0202] In some instances, the expression level of the genes in the sample is an average (e.g., mean expression or median expression) of the genes, the reference expression level of the genes is an average (e.g., mean expression or median expression) of the genes of the reference, and the average of the genes of the sample is compared to the average of the genes of the reference.
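The averaged comparison described in the preceding paragraph can be sketched, purely as an example, in Python. The function names and the dictionary layout (gene symbol mapped to expression value) are hypothetical, and the sketch assumes that every listed gene has been measured in both the sample and the reference on a common scale.

```python
from statistics import mean, median


def signature_average(expression, genes, method="mean"):
    """Average the expression values of the listed genes by mean or median."""
    values = [expression[g] for g in genes]
    if method == "mean":
        return mean(values)
    if method == "median":
        return median(values)
    raise ValueError("method must be 'mean' or 'median'")


def signature_increased(sample, reference, genes, method="mean"):
    """True if the sample's average over the genes exceeds the reference's average."""
    return (signature_average(sample, genes, method)
            > signature_average(reference, genes, method))
```

The choice between mean and median can change the outcome: a single strongly overexpressed gene raises the mean of a gene set but may leave its median unchanged.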
[0203] In certain embodiments, the presence and/or expression levels/amount of a biomarker in a first sample is increased or elevated as compared to presence/absence and/or expression levels/amount in a second sample. In certain embodiments, the presence/absence and/or expression levels/amount of a biomarker in a first sample is decreased or reduced as compared to presence and/or expression levels/amount in a second sample. In certain embodiments, the second sample is a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue. Additional disclosures for determining the presence/absence and/or expression levels/amount of a gene are described herein.
[0204] In certain embodiments, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is a single sample or combined multiple samples from the same subject or individual that are obtained at one or more different time points than when the test sample is obtained. For example, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is obtained at an earlier time point from the same subject or individual than when the test sample is obtained. Such reference sample, reference cell, reference tissue, control sample, control cell, or control tissue may be useful if the reference sample is obtained during initial diagnosis of cancer and the test sample is later obtained when the cancer becomes metastatic.
[0205] In certain embodiments, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is a combined multiple samples from one or more healthy individuals who are not the patient. In certain embodiments, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is a combined multiple samples from one or more individuals with a disease or disorder (e.g., cancer) who are not the subject or individual. In certain embodiments, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is pooled RNA samples from normal tissues or pooled plasma or serum samples from one or more individuals who are not the patient. In certain embodiments, a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue is pooled RNA samples from tumor tissues or pooled plasma or serum samples from one or more individuals with a disease or disorder (e.g., cancer) who are not the patient.
[0206] In some embodiments of any of the methods, elevated or increased expression refers to an overall increase of about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or greater, in the level of biomarker (e.g., protein or nucleic acid (e.g., gene (DNA or mRNA))), detected by standard art-known methods such as those described herein, as compared to a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue. In certain embodiments, the elevated expression refers to the increase in expression level/amount of a biomarker in the sample wherein the increase is at least about any of 1.5.times., 1.75.times., 2.times., 3.times., 4.times., 5.times., 6.times., 7.times., 8.times., 9.times., 10.times., 25.times., 50.times., 75.times., or 100.times. the expression level/amount of the respective biomarker in a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue. In some embodiments, elevated expression refers to an overall increase of greater than about 1.5 fold, about 1.75 fold, about 2 fold, about 2.25 fold, about 2.5 fold, about 2.75 fold, about 3.0 fold, or about 3.25 fold as compared to a reference sample, reference cell, reference tissue, control sample, control cell, control tissue, or internal control (e.g., housekeeping gene).
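The percentage and fold thresholds above are two expressions of the same ratio, which the following Python sketch makes explicit. The function names are hypothetical, the default cutoff of 1.5 is simply the lowest fold value listed above and is not a recommended threshold, and the sketch assumes linear-scale (not log-transformed) expression values with a positive reference level.

```python
def fold_change(sample_level, reference_level):
    """Ratio of the sample level to the reference level (linear scale)."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return sample_level / reference_level


def percent_increase(sample_level, reference_level):
    """Increase over the reference expressed as a percentage of the reference."""
    return (fold_change(sample_level, reference_level) - 1.0) * 100.0


def is_elevated(sample_level, reference_level, min_fold=1.5):
    """True if the sample level is at least min_fold times the reference level."""
    return fold_change(sample_level, reference_level) >= min_fold
```

On these definitions, a sample level three times the reference is a 3-fold level and a 200% increase, so any particular embodiment should state which of the two conventions a numeric threshold refers to.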
[0207] In some embodiments of any of the methods, reduced expression refers to an overall reduction of about any of 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or greater, in the level of biomarker (e.g., protein or nucleic acid (e.g., gene (DNA or mRNA))), detected by standard art-known methods such as those described herein, as compared to a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue. In certain embodiments, reduced expression refers to the decrease in expression level/amount of a biomarker in the sample wherein the decrease is at least about any of 0.9.times., 0.8.times., 0.7.times., 0.6.times., 0.5.times., 0.4.times., 0.3.times., 0.2.times., 0.1.times., 0.05.times., or 0.01.times. the expression level/amount of the respective biomarker in a reference sample, reference cell, reference tissue, control sample, control cell, or control tissue.
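The fold-change comparisons described for elevated and reduced expression can be illustrated with a minimal sketch. The function names and the default cutoffs (1.5.times. for elevated, 0.5.times. for reduced) are assumptions chosen for illustration from the ranges listed above; they are not prescribed values.

```python
def fold_change(sample_level: float, reference_level: float) -> float:
    """Ratio of the sample expression level to the reference expression level."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return sample_level / reference_level


def classify_expression(sample_level: float,
                        reference_level: float,
                        elevated_cutoff: float = 1.5,   # hypothetical choice
                        reduced_cutoff: float = 0.5) -> str:  # hypothetical choice
    """Label a biomarker as elevated, reduced, or unchanged versus the reference."""
    fc = fold_change(sample_level, reference_level)
    if fc >= elevated_cutoff:
        return "elevated"
    if fc <= reduced_cutoff:
        return "reduced"
    return "unchanged"
```

In this sketch a sample level of 30 against a reference level of 10 is a 3-fold increase and is labeled elevated; any other cutoff from the listed ranges could be passed in instead.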
[0208] B. Therapeutic Methods
[0209] The present invention provides methods for treating a patient suffering from a cancer (e.g., a lung cancer (e.g., squamous NSCLC or non-squamous NSCLC) or a head and neck cancer (e.g., HNSC)). In some instances, the methods of the invention include administering to the patient an effective amount of a NRF2 pathway antagonist. Any of the NRF2 pathway antagonists described herein or otherwise known in the art may be used in the methods. In some instances, the methods involve determining the presence and/or expression level of a NRF2 splice variant (e.g., exon 2-deleted NRF2 or exon 2+3-deleted NRF2) or a NRF2 target gene in a sample obtained from a patient and administering an NRF2 pathway antagonist to the patient based on the presence and/or expression level of a NRF2 splice variant (e.g., exon 2-deleted NRF2 or exon 2+3-deleted NRF2) or a NRF2 target gene, e.g., using any of the methods described herein, in the Examples below, or known in the art.
[0210] The invention provides a method of treating a subject suffering from a cancer (e.g., a lung cancer (e.g., squamous NSCLC or non-squamous NSCLC) or a head and neck cancer (e.g., HNSC)), the method comprising determining the expression level of at least one gene (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, or 27 genes) selected from the group consisting of AKR1B10, AKR1C2, SRXN1, OSGIN1, FECH, GCLM, TRIM16, ME1, KYNU, CABYR, SLC7A11, TRIM16L, AKR1C4, CYP4F11, RSPO3, ABCC2, AKR1B15, NR0B1, UGDH, TXNRD1, GSR, AKR1C3, TALDO1, PGD, TXN, NQO1, and FTL in a sample obtained from the subject; and comparing the expression level of the at least one gene to a reference expression level of the at least one gene, wherein an increase in the expression level of the at least one gene in the sample relative to the reference expression level of the at least one gene identifies a subject having a cancer, and administering to the subject a therapeutically effective amount of one or more NRF2 pathway antagonists.
[0211] The invention further provides a method of treating a subject suffering from a cancer (e.g., lung cancer (e.g., squamous NSCLC or non-squamous NSCLC) or head and neck cancer), wherein the expression level of one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12) newly identified NRF2 target genes is determined. Newly identified NRF2 target genes include AKR1B10, AKR1C2, ME1, KYNU, CABYR, TRIM16L, AKR1C4, CYP4F11, RSPO3, AKR1B15, NR0B1, and AKR1C3.
[0212] In some instances, the invention further provides a method of treating a subject suffering from a cancer (e.g., lung cancer (e.g., squamous NSCLC or non-squamous NSCLC) or head and neck cancer), wherein the mRNA expression level of NRF2 comprises a deletion in all, or a portion of, its exon 2 in a sample obtained from the subject, and wherein the presence of NRF2 comprising a deletion in all or a portion of its exon 2 identifies the subject as having a cancer; and administering to the subject a therapeutically effective amount of one or more NRF2 pathway antagonists. In some embodiments, the NRF2 further comprises a deletion in all, or a portion of, its exon 3.
[0213] In some instances, the invention further provides a method of treating a subject suffering from a cancer (e.g., a lung cancer (e.g., squamous NSCLC or non-squamous NSCLC) or a head and neck cancer (e.g., HNSC)), wherein the NRF2 protein comprises a deletion in all, or a portion of, its Neh2 domain in a sample obtained from the subject, and wherein the presence of NRF2 comprising a deletion in all, or a portion of, its Neh2 domain identifies the subject as having a cancer; and administering to the subject a therapeutically effective amount of one or more NRF2 pathway antagonists. In some embodiments, the NRF2 further comprises a deletion in all or a portion of its Neh4 domain.
[0214] In any of the preceding methods, the NRF2 pathway antagonist may be any NRF2 pathway antagonist known in the art or described herein.
[0215] In some instances, the method further includes administering to the subject an effective amount of a second therapeutic agent (e.g., one or more anti-cancer agents). In some instances, the second therapeutic agent is selected from the group consisting of an anti-angiogenic agent, a chemotherapeutic agent, a growth inhibitory agent, a cytotoxic agent, an immunotherapy, and combinations thereof. In some embodiments, the immunotherapy is a VEGF antagonist (e.g., anti-VEGFR2 antibodies and related molecules (e.g., ramucirumab, tanibirumab, aflibercept), anti-VEGFR1 antibodies and related molecules (e.g., icrucumab, aflibercept (VEGF Trap-Eye; EYLEA.RTM.), and ziv-aflibercept (VEGF Trap; ZALTRAP.RTM.)), bispecific VEGF antibodies (e.g., MP-0250, vanucizumab (VEGF-ANG2), and bispecific antibodies disclosed in US 2001/0236388), bispecific antibodies including combinations of two of anti-VEGF, anti-VEGFR1, and anti-VEGFR2 arms, anti-VEGF antibodies (e.g., bevacizumab, sevacizumab, and ranibizumab), and nonpeptide small molecule VEGF antagonists (e.g., pazopanib, axitinib, vandetanib, stivarga, cabozantinib, lenvatinib, nintedanib, orantinib, telatinib, dovitinib, cediranib, motesanib, sulfatinib, apatinib, foretinib, famitinib, and tivozanib)). In other embodiments, the immunotherapy is a PD-1 axis binding antagonist (e.g., YW243.55.S70, MDX-1105, MPDL3280A (atezolizumab), MEDI4736 (durvalumab), MSB0010718C (avelumab), MDX-1106 (nivolumab), MK-3475 (pembrolizumab), CT-011 (pidilizumab), MEDI-0680 (AMP-514), PDR001, REGN2810, BGB-108 or AMP-224).
[0216] The compositions used in the methods described herein (e.g., NRF2 pathway antagonists) can be administered by any suitable method, including, for example, intravenously, intramuscularly, subcutaneously, intradermally, percutaneously, intraarterially, intraperitoneally, intralesionally, intracranially, intraarticularly, intraprostatically, intrapleurally, intratracheally, intrathecally, intranasally, intravaginally, intrarectally, topically, intratumorally, peritoneally, subconjunctivally, intravesicularly, mucosally, intrapericardially, intraumbilically, intraocularly, intraorbitally, orally, topically, transdermally, intravitreally (e.g., by intravitreal injection), by eye drop, by inhalation, by injection, by implantation, by infusion, by continuous infusion, by localized perfusion bathing target cells directly, by catheter, by lavage, in cremes, or in lipid compositions. The compositions utilized in the methods described herein can also be administered systemically or locally. The method of administration can vary depending on various factors (e.g., the compound or composition being administered and the severity of the condition, disease, or disorder being treated). In some embodiments, the NRF2 pathway antagonist is administered intravenously, intramuscularly, subcutaneously, topically, orally, transdermally, intraperitoneally, intraorbitally, by implantation, by inhalation, intrathecally, intraventricularly, or intranasally. Dosing can be by any suitable route, e.g., by injections, such as intravenous or subcutaneous injections, depending in part on whether the administration is brief or chronic. Various dosing schedules including but not limited to single or multiple administrations over various time-points, bolus administration, and pulse infusion are contemplated herein.
[0217] NRF2 pathway antagonists described herein (and any additional anti-cancer agents) may be formulated, dosed, and administered in a fashion consistent with good medical practice. Factors for consideration in this context include the particular disorder being treated, the particular mammal being treated, the clinical condition of the individual patient, the cause of the disorder, the site of delivery of the agent, the method of administration, the scheduling of administration, and other factors known to medical practitioners. The NRF2 pathway antagonist need not be, but is optionally formulated with and/or administered concurrently with one or more agents currently used to prevent or treat the disorder in question. The effective amount of such other agents depends on the amount of the NRF2 pathway inhibitor present in the formulation, the type of disorder or treatment, and other factors discussed above. These are generally used in the same dosages and with administration routes as described herein, or about from 1 to 99% of the dosages described herein, or in any dosage and by any route that is empirically/clinically determined to be appropriate.
[0218] In some embodiments, the methods further involve administering to the patient an effective amount of a second therapeutic agent (e.g., one or more anti-cancer agents). In some embodiments, the second therapeutic agent is selected from the group consisting of an anti-angiogenic agent, a chemotherapeutic agent, a growth inhibitory agent, a cytotoxic agent, an immunotherapy, and combinations thereof.
[0219] Such combination therapies noted above encompass combined administration (where two or more therapeutic agents (e.g., a NRF2 pathway antagonist and an anti-cancer agent) are included in the same or separate formulations), and separate administration, in which case, administration of a NRF2 pathway antagonist can occur prior to, simultaneously, and/or following, administration of the additional anti-cancer agent or agents. In one embodiment, administration of NRF2 pathway antagonist and administration of an additional anti-cancer agent occur within about one month, or within about one, two or three weeks, or within about one, two, three, four, five, or six days, of each other.
[0220] C. NRF2 Pathway Antagonists for Use in the Methods of the Invention
[0221] Provided herein are methods for treating or delaying progression of a cancer (e.g., a lung cancer (e.g., squamous NSCLC) or head and neck cancer) in a subject comprising administering to the subject a therapeutically effective amount of a NRF2 pathway antagonist. Any of the preceding methods may be based on the expression level of a biomarker provided herein, for example, NRF2 expression or expression of any protein or mRNA involved in a NRF2 pathway in a tumor sample, e.g., a biopsy containing tumor cells.
[0222] In some embodiments, a NRF2 pathway antagonist is a small molecule, e.g., a small molecule capable of binding to NRF2 or protein or gene that regulates the expression, stability, or activity of NRF2.
[0223] In some embodiments, the NRF2 pathway antagonist is an antagonist of a NRF2 agonist. Examples of NRF2 agonists include, but are not limited to, cAMP response element-binding protein (CREB), CREB Binding Protein (CBP), Maf, activating transcription factor 4 (ATF4), protein kinase C (PKC), Jun, glucocorticoid receptor, UbcM2, and homologous to the E6-AP carboxyl terminus domain and Ankyrin repeat containing E3 ubiquitin protein ligase 1 (HACE1). Therefore, examples of NRF2 pathway antagonists include, but are not limited to, CREB antagonists, CBP antagonists, Maf antagonists, ATF4 antagonists, PKC antagonists, Jun antagonists, glucocorticoid receptor antagonists, UbcM2 antagonists, and HACE1 antagonists, such as those set forth in Table 2.
[0224] In some embodiments, the NRF2 pathway antagonist is an agonist of a NRF2 antagonist. Examples of NRF2 antagonists include, but are not limited to, c-Myc, SUMO, KEAP1, CUL3, and retinoic acid receptor .alpha. (RAR.alpha.). Therefore, examples of NRF2 pathway antagonists include, but are not limited to, c-Myc agonists, SUMO agonists, KEAP1 agonists, CUL3 agonists, and RAR.alpha. agonists, such as those set forth in Table 3.
TABLE-US-00002 TABLE 2
Compound | Target | Chemical name
KG-501 | CREB | 2-naphthol-AS-E-phosphate
C646 | CBP | 4-[4-[[5-(4,5-Dimethyl-2-nitrophenyl)-2-furanyl]methylene]-4,5-dihydro-3-methyl-5-oxo-1H-pyrazol-1-yl]benzoic acid
CBP30 | CBP | 8-(3-chloro-4-methoxy-phenethyl)-4-(3,5-dimethyl-isoxazol-4-yl)-9-(2-(morpholin-4-yl)-propyl)-7,9-diaza-bicyclo[4.3.0]nona-1(6),2,4,7-tetraene
nivalenol | c-maf | 3,4,7,15-Tetrahydroxy-12,13-epoxytrichothec-9-en-8-on
tomatidine | ATF4 | (3.beta.,5.alpha.,22.beta.,25S)-spirosolan-3-ol
ruboxistaurin | PKC | (9S)-9-[(dimethylamino)methyl]-6,7,10,11-tetrahydro-9H,18H-5,21:12,17-di(metheno)dibenzo[e,k]pyrrolo[3,4-h][1,4,13]oxadiazacyclohexadecine-18,20-dione
SP600125 | Jun | 1,9-Pyrazoloanthrone
mifepristone | Glucocorticoid receptor | (11.beta.,17.beta.)-11-[4-(Dimethylamino)phenyl]-17-hydroxy-17-(1-propynyl)-estra-4,9-dien-3-one
CORT 108297 | Glucocorticoid receptor | 1H-Pyrazolo[3,4-g]isoquinoline, 4a-(ethoxymethyl)-1-(4-fluorophenyl)-4,4a,5,6,7,8-hexahydro-6[[4-(trifluoromethyl)phenyl]sulfonyl]-, (4aR)-
TABLE-US-00003 TABLE 3
Compound | Target | Chemical name
AI-1 | KEAP1 | 4-Chloro-1,2-dihydro-1-methyl-2-oxo-3-quinolinecarboxylic acid ethyl ester, Ethyl 4-chloro-1-methyl-2-oxo-1,2-dihydroquinoline-3-carboxylate
retinoic acid | RAR.alpha. | 3,7-Dimethyl-9-(2,6,6-trimethyl-1-cyclohexen-1-yl)-2E,4E,6E,8E,-nonatetraenoic acid
CD437 | RAR.alpha. | 6-(4-Hydroxy-3-tricyclo[3.3.1.13,7]dec-1-ylphenyl)-2-naphthalenecarboxylic acid
TTNPB | RAR.alpha. | 4-[(E)-2-(5,6,7,8-Tetrahydro-5,5,8,8-tetramethyl-2-naphthalenyl)-1-propenyl]benzoic acid
[0225] In some embodiments of the invention, derivatives of the compounds listed in Table 2 or 3 may also be administered as NRF2 pathway antagonists. A derivative of a compound listed in Table 2 or 3 is a small molecule that differs in structure from the parent compound, but retains the ability to antagonize a NRF2 pathway. A derivative of a compound may change its interaction with certain other molecules or proteins relative to the parent compound. A derivative of a compound may also include a salt, an adduct, or other variant of the parent compound. In some embodiments of the invention, any derivative of a compound described herein (e.g., any one compound of the compounds listed in Table 2 or 3) may be used instead of the parent compound. In some embodiments, any derivative of a compound listed in Table 2 or 3 may be used in a method of treating a subject having cancer, such as lung cancer.
[0226] In some embodiments, a NRF2 pathway antagonist is an antibody (e.g., an anti-NRF2 antibody or an antibody directed against a protein or gene that regulates NRF2 expression, stability, or activity, e.g., a target listed in Table 2 or 3). In some embodiments, the anti-NRF2 antibody is capable of inhibiting binding between NRF2 and antioxidant response element. In some embodiments, the anti-NRF2 antibody is capable of inhibiting binding between NRF2 and a cofactor (e.g., Maf, PKC, Jun, ATF4, or CBP). In some embodiments, the antibody of the invention is an antibody fragment selected from the group consisting of Fab, Fab'-SH, Fv, scFv, and (Fab').sub.2 fragments. In some embodiments, the antibody is a humanized antibody. In some embodiments, the antibody is a human antibody. In some embodiments, the antibody is a derivative of a known antibody having any of the above-mentioned properties. Derivatives of antibodies include antibody variants having about 99%, 98%, 97%, 96%, 95%, 94%, 93%, 92%, 91%, 90%, 85%, 80% or lower sequence identity to its parent antibody. Percent (%) amino acid sequence identity is determined according to methods known in the art, including by ALIGN-2, as described above.
[0227] In some embodiments, a NRF2 pathway antagonist includes an inhibitor of any downstream biomarker (e.g., gene or protein, e.g., a gene or protein involved in iron sequestration (e.g., Ferritin, Light Polypeptide (FTL), Ferritin, Heavy Polypeptide 1 (FTH), or Heme Oxygenase 1 (HMOX1)), GSH utilization (e.g., Glutathione Peroxidase 2 (GPX2), Glutathione S-Transferase Alpha 1 (GSTA1), Glutathione S-Transferase Alpha 2 (GSTA2), Glutathione S-Transferase Alpha 3 (GSTA3), Glutathione S-Transferase Alpha 5 (GSTA5), Glutathione S-Transferase Mu 1 (GSTM1), Glutathione S-Transferase Mu 2 (GSTM2), Glutathione S-Transferase Mu 3 (GSTM3), or Glutathione S-Transferase Pi 1 (GSTP1)), quinone detoxification (e.g., NAD(P)H Dehydrogenase, Quinone 1 (NQO1)), GSH production and regeneration (e.g., Glutamate-Cysteine Ligase, Modifier Subunit (GCLM), Glutamate-Cysteine Ligase, Catalytic Subunit (GCLC), Glutathione Reductase (GSR), or Solute Carrier Family 7 (Anionic Amino Acid Transporter Light Chain, Xc-System), Member 11 (SLC7A11, or XCT)), thioredoxin (TXN) production, regeneration, and utilization (e.g., Thioredoxin 1, (TXN1), Thioredoxin Reductase 1 (TXNRD1), or Peroxiredoxin 1 (PRDX1)), NADPH production (e.g., Glucose-6-Phosphate Dehydrogenase (G6PD), Phosphogluconate Dehydrogenase (PGD), Malic Enzyme 1, NADP(+)-Dependent, Cytosolic (ME1), Isocitrate Dehydrogenase 1 (NADP+), Soluble (IDH1)) or any of the genes or proteins thereof of Table 1).
[0228] In some embodiments, a NRF2 pathway antagonist includes a compound that inhibits NRF2 from binding to antioxidant response element (ARE) (e.g., by competitively binding to the ARE binding site on NRF2, by competitively binding to ARE, or by otherwise interfering with a transcriptional cofactor (e.g., small Maf proteins)).
[0229] In some embodiments, a NRF2 pathway antagonist includes an agonist or antagonist of NRF2-related genes, such that the pharmacological effect of compound involves the downregulation of one or more pathways downstream of NRF2-mediated transcription. Such NRF2-related genes include, e.g., Kelch-Like ECH-Associated Protein 1 (KEAP1), Ectodermal-Neural Cortex 1 (With BTB Domain) (ENC1), Protein Kinase C, Delta (PRKCD), Protein Kinase C, Beta (PRKCB), Polyamine-Modulated Factor 1 (PMF1), Cullin 3 (CUL3), Nuclear Factor, Erythroid 2 (NFE2), Activating Transcription Factor 4 (ATF4), Heme Oxygenase 1 (HMOX1), Heme Oxygenase 2 (HMOX2), Ubiquitin C (UBC), V-Maf Avian Musculoaponeurotic Fibrosarcoma Oncogene Homolog K (MAFK), UDP Glucuronosyltransferase 1 Family, Polypeptide A6 (UGT1A6), V-Maf Avian Musculoaponeurotic Fibrosarcoma Oncogene Homolog F (MAFF), CREB Binding Protein (CREBBP), V-Maf Avian Musculoaponeurotic Fibrosarcoma Oncogene Homolog G (MAFG), CAMP Responsive Element Binding Protein 1 (CREB1), FXYD Domain Containing Ion Transport Regulator 2 (FXYD2), Jun Proto-Oncogene (JUN), Small Ubiquitin-Like Modifier 2 (SUMO2), Small Ubiquitin-Like Modifier 1 (SUMO1), V-Myc Avian Myelocytomatosis Viral Oncogene Homolog (MYC), Crystallin, Zeta (Quinone Reductase) (CRYZ), Aldo-Keto Reductase Family 7, Member A2 (Aflatoxin Aldehyde Reductase) (AKR7A2), and Glutathione S-Transferase Alpha 2 (GSTA2).
[0230] In some embodiments, a method of increasing ubiquitination of NRF2 in a cell is provided, the method comprising contacting the cell with an inhibitor of a NRF2 pathway under conditions allowing inhibition of a NRF2 pathway in a cell. Increased ubiquitination of NRF2 can be determined, e.g., by immunoaffinity enrichment of ubiquitinated NRF2 following trypsin digestion, followed by mass spectrometry, according to known methods. In some embodiments, an increase in ubiquitination may be determined by comparing the ubiquitination of a wild-type NRF2 in a cell or population of cells contacted with a NRF2 pathway antagonist with the ubiquitination of an exon 2 or exon 2+3 deleted NRF2 in a cell or a population of cells contacted with a NRF2 pathway antagonist and/or the ubiquitination of an exon 2 or exon 2+3 deleted NRF2 in a cell or a population of cells not contacted with a NRF2 pathway antagonist.
[0231] In some embodiments of the invention, the NRF2 pathway antagonist is ascorbic acid, brusatol, luteolin, or ochratoxin A.
Examples
Example 1: Materials and Experimental Methods
[0232] A. Mutation and Copy Number Analysis
[0233] For 99 NSCLC cell lines, non-synonymous mutations and copy number data for KRas, LKB1, KEAP1, and NRF2 were obtained from Klijn et al. (Nat Biotechnol. 33(3):306-312, 2015). Thirteen additional NSCLC cell lines were subjected to copy number analysis. In addition, exome sequencing was applied to 104 NSCLC cell lines. For The Cancer Genome Atlas (TCGA) tumors, mutation and copy number data were retrieved from cBioPortal using the R software package CGDS-R (Cerami et al. Cancer Discovery. 2:401-404, 2012; Gao et al. Sci. Signal. 6:11, 2013).
[0234] B. RNA-Seq Analysis and Derivation of a Mutant KEAP1 Gene Expression Signature
[0235] Raw RNA-seq data for 99 NSCLC cell lines were retrieved from the European Genome-phenome Archive (accession number EGAS00001000610) (PMID: 25485619). Mutations in KEAP1 and NRF2 in each of the NSCLC cell lines are provided in Table 4. Raw RNA-seq data were downloaded from TCGA and aligned to the human reference genome (GRCh37/hg19) using GSNAP version 2013-10-10 (Wu and Nacu. Bioinformatics 26:873-881, 2010), allowing a maximum of 2 mismatches (parameters: "-M 2 -n 10 -B 2 -i 1 -N 1 -w 200000 -E 1 --pairmax-rna=200000"). Gene expression levels were quantified with RPKM (reads per kilobase of target and million reads sequenced) values derived from the number of reads mapped to each RefSeq gene. Using the DESeq R package (PMID: 20979621), differential gene expression was measured between KEAP1 mutant and KEAP1 wild-type cell lines, reported as fold-change and associated adjusted p-values. For Ward clustering of samples and genes (using Euclidean distance), variance-stabilized count data were used. The 'NMF' R package was used to create associated heatmaps.
TABLE-US-00004 TABLE 4
Name | Sample Type | Gene | Mutation
BEN | Carcinoma | KEAP1 | A556T
NCI-H460 | Carcinoma Large Cell | KEAP1 | D236H
NCI-H838 | Carcinoma Non-Small Cell | KEAP1 | E444*
HCC44 | Carcinoma Non-Small Cell | KEAP1 | F211C
A549 | Carcinoma | KEAP1 | G333C
HCC-15 | Carcinoma Non-Small Cell | KEAP1 | G364C
NCI-H1648 | Adenocarcinoma | KEAP1 | G364C
NCI-H2110 | Carcinoma Non-Small Cell | KEAP1 | G429C
LXF-289 | Adenocarcinoma | KEAP1 | G430V
NCI-H647 | Carcinoma Non-Small Cell | KEAP1 | G523W
NCI-H920 | Carcinoma Non-Small Cell | KEAP1 | G603V
HCC4019 | Adenocarcinoma | KEAP1 | K131*
NCI-H23 | Carcinoma Non-Small Cell | KEAP1 | Q193H
NCI-H1355 | Adenocarcinoma | KEAP1 | Q75*
NCI-H1915 | Carcinoma Non-Small Cell | KEAP1 | R135L
NCI-H2126 | Carcinoma Non-Small Cell | KEAP1 | R272C
NCI-H1944 | Carcinoma Non-Small Cell | KEAP1 | R272L
NCI-H1623 | Carcinoma Non-Small Cell | KEAP1 | R320L
NCI-H2170 | Carcinoma Squamous Cell | KEAP1 | R336*
NCI-H1435 | Carcinoma Non-Small Cell | KEAP1 | R413L
NCI-H322T | Unknown | KEAP1 | R460S
H322T | Carcinoma Non-Small Cell | KEAP1 | R460S
NCI-H661 | Carcinoma Large Cell | KEAP1 | V168I
NCI-H2030 | Carcinoma Non-Small Cell | KEAP1 | V568F
NCI-H2023 | Carcinoma Non-Small Cell | KEAP1 | W252C
H1573 | Adenocarcinoma | KEAP1 | A143P
NCI-H2172 | Carcinoma Non-Small Cell | KEAP1 | G430C
H1792 | Adenocarcinoma | KEAP1 | G462W
NCI-H2122 | Carcinoma Non-Small Cell | KEAP1 | R202G
HCC2270 | Adenocarcinoma | NRF2 | G31E
NCI-H2228 | Carcinoma Non-Small Cell | NRF2 | G31A
NCI-H1568 | Carcinoma Non-Small Cell | NRF2 | DEE77-79
EBC-1 | Carcinoma Non-Small Cell | NRF2 | D77V
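The RPKM quantification named in the RNA-seq analysis above can be sketched as follows. This is an illustrative calculation only, using the standard definition (reads per kilobase of target per million reads sequenced); the function and parameter names are hypothetical and are not taken from the pipeline used in the study.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of target and per million mapped reads.

    Equivalent to read_count / (gene_length_bp / 1e3) / (total_mapped_reads / 1e6).
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)
```

For example, 500 reads mapped to a 2,000 bp gene in a library of 10 million mapped reads corresponds to an RPKM of 25.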
[0236] C. Splice Variant Analysis
[0237] Analysis of splice variants was performed using the SGSeq software package available from the Bioconductor project website (Gentleman et al. Genome Biol. 5:R80, 2004). Exons and splice junctions were predicted from BAM files for 7,384 TCGA samples at 54 genomic loci of known oncogenes using parameters alpha=2, psi=0, beta=0.2, gamma=0.2. Predicted features were merged across samples, and exons were processed into disjoint exon bins. Splice junctions and exon bins were assembled into a genome-wide splice graph. Splice events, which consist of two or more alternative splice variants, were identified from the graph. Splice variants were quantified in terms of FPKM and relative usage .psi.. Briefly, local estimates of relative usage at the start and end of the variant were obtained as the fraction of fragments that are compatible with the variant. Estimates at the event start and end were combined using a weighted mean, with weights proportional to the total number of fragments spanning the boundary. Relative usage estimates with denominator less than 20 were set to NA. To obtain a local estimate of absolute expression at the variant start and end, compatible counts n were converted to FPKMs as n/(N.times.L).times.10.sup.9, where N is the total number of aligned fragments and L is the effective length (the number of allowed positions for a compatible fragment). Splice variants detected in TCGA samples were also quantified in 2,958 genotype-tissue expression project (GTEx) samples from normal human tissues (GTEx Consortium. Science. 348:648-660, 2015).
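The two quantities described above, FPKM from compatible fragment counts and the weighted-mean relative usage .psi., can be sketched as below. This is not the SGSeq implementation; the function names are hypothetical, and the sketch assumes that the minimum-denominator rule of 20 applies to each boundary separately, so that a boundary with fewer than 20 spanning fragments is dropped and the result is None (standing in for NA) when no boundary qualifies.

```python
from typing import Optional


def fpkm(compatible_count: int, total_aligned_fragments: int,
         effective_length: int) -> float:
    """Convert compatible counts n to FPKM as n / (N x L) x 10^9."""
    return compatible_count / (total_aligned_fragments * effective_length) * 1e9


def relative_usage(compatible_start: int, total_start: int,
                   compatible_end: int, total_end: int,
                   min_denominator: int = 20) -> Optional[float]:
    """Weighted mean of the local usage estimates at the variant start and end.

    Each local estimate is compatible / total fragments spanning the boundary,
    weighted by the total number of fragments spanning that boundary.
    """
    boundaries = [
        (compatible, total)
        for compatible, total in ((compatible_start, total_start),
                                  (compatible_end, total_end))
        if total >= min_denominator  # assumption: rule applied per boundary
    ]
    if not boundaries:
        return None  # stands in for NA
    weighted_sum = sum((compatible / total) * total for compatible, total in boundaries)
    total_weight = sum(total for _, total in boundaries)
    return weighted_sum / total_weight
```

With 10 of 40 fragments compatible at the start and 30 of 60 at the end, the local estimates are 0.25 and 0.5 and the weighted mean is 0.4.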
[0238] D. Identification of Cancer-Specific Splice Variants
[0239] Only internal splice variants (not involving alternative transcript starts or ends) were considered, and the start and end of each splice variant were required to either overlap or extend exons that belong to annotated refGene transcripts downloaded from the UCSC Genome Browser website (Pruitt et al. Nucleic Acids Res. 33:D501-504, 2005; Rosenbloom et al. Nucleic Acids Res. 43:D670-681, 2015). Retained introns were excluded. Nineteen TCGA indications that included at least 100 cancer samples (6,359 cancer samples in total) were considered, and splice variants with (i) FPKM>2 and relative usage .psi.>0.2 in at least one cancer sample, (ii) FPKM<1 in >99.9% of GTEx samples, and (iii) FPKM.about.0 in >97.5% of GTEx samples were selected. FPKM-based criteria were required to be satisfied at both the start and end of the splice variant. Variants satisfying the FPKM-based criteria for which .psi. could not be estimated were included after manual inspection.
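The three selection criteria above can be expressed as a simple filter. The sketch below is an illustration under stated assumptions: the function name and data layout are hypothetical, "FPKM approximately 0" is modeled with a hypothetical `zero_tolerance` parameter, and variants whose .psi. could not be estimated are simply not selected automatically (the manual inspection step is outside the sketch).

```python
from typing import List, Optional, Tuple

CancerSample = Tuple[float, float, Optional[float]]  # (FPKM start, FPKM end, psi)
NormalSample = Tuple[float, float]                   # (FPKM start, FPKM end)


def is_cancer_specific(cancer_samples: List[CancerSample],
                       normal_samples: List[NormalSample],
                       zero_tolerance: float = 0.01) -> bool:
    """Apply criteria (i)-(iii); FPKM criteria must hold at both start and end."""
    if not normal_samples:
        raise ValueError("at least one normal sample is required")
    # (i) FPKM > 2 at both boundaries and psi > 0.2 in at least one cancer sample
    expressed_in_cancer = any(
        min(start, end) > 2 and psi is not None and psi > 0.2
        for start, end, psi in cancer_samples
    )
    n = len(normal_samples)
    # (ii) FPKM < 1 at both boundaries in > 99.9% of normal samples
    frac_low = sum(max(start, end) < 1 for start, end in normal_samples) / n
    # (iii) FPKM approximately 0 at both boundaries in > 97.5% of normal samples
    frac_zero = sum(max(start, end) <= zero_tolerance
                    for start, end in normal_samples) / n
    return expressed_in_cancer and frac_low > 0.999 and frac_zero > 0.975
```

Taking the minimum over the two boundaries for criterion (i) and the maximum for criteria (ii) and (iii) is one way to require that each FPKM condition holds at both the start and the end of the variant.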
[0240] E. Analysis of Targeted Paired-End Exome-Seq Data
[0241] All samples within FoundationCORE were processed and sequenced similarly as previously described (Frampton et al. Nat. Biotechnol. 31, 1023-1031, 2014). NRF2 exon 2 and exon 2+3 deletions were screened across a FoundationCORE dataset (n=58,707) using two distinct approaches.
[0242] First, rearrangement calls based on discordant read pairs and/or split reads were examined for direct evidence for loss of NRF2 exon 2 or exon 2+3. Although this approach provides direct evidence of the deletions of interest, deletions can only be discovered with this approach if the breakpoints are within a baited region because intronic regions of NRF2 are not captured. Thus, this approach identifies a limited subset of NRF2 exon 2 or exon 2+3 deletions in which the breakpoints occur near intron-exon boundaries or within exons.
[0243] The second approach utilizes copy number log ratio data from individual bait regions. Copy number log ratio values were determined with an in-house algorithm, educated to the specific tumor cellularity of each sample. A z-score was calculated comparing the log ratio for each exon in NRF2 to control polymorphism capture regions immediately adjacent to NRF2 (n=15; evenly spaced every .about.1 MB from .about.3 MB upstream and .about.12 MB downstream of NRF2). Exon 2 deletions with and without concurrent exon 3 deletion were specifically examined. These are herein referred to as exons of interest (EOI). EOI deletions were called if (1) a z-score was <-2 for EOI and not for non-EOIs in NRF2 and (2) a log-ratio drop of 0.2 from non-EOIs in NRF2 was calculated. Mutual exclusivity between NRF2 exon 2 or exon 2+3 deletions and short variants in NRF2 or KEAP1 was examined specifically within lung squamous cell carcinoma (n=1,218).
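The two-part calling rule for exons of interest (EOI) can be sketched as follows. The in-house algorithm is not disclosed, so this is only an illustration under assumptions: the z-score is taken as the exon log ratio minus the mean of the control-region log ratios, divided by their sample standard deviation, and the log-ratio drop is taken as the difference between the mean non-EOI and mean EOI log ratios. All names are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, Sequence, Set


def z_score(log_ratio: float, control_log_ratios: Sequence[float]) -> float:
    """Z-score of one exon log ratio against the control capture regions."""
    return (log_ratio - mean(control_log_ratios)) / stdev(control_log_ratios)


def call_eoi_deletion(exon_log_ratios: Dict[int, float],
                      eoi: Set[int],
                      control_log_ratios: Sequence[float],
                      z_cutoff: float = -2.0,
                      min_drop: float = 0.2) -> bool:
    """Call a deletion if (1) z < cutoff for EOIs but not for non-EOIs and
    (2) EOIs sit at least min_drop below the non-EOI log ratio."""
    eoi_values = [v for exon, v in exon_log_ratios.items() if exon in eoi]
    other_values = [v for exon, v in exon_log_ratios.items() if exon not in eoi]
    if not eoi_values or not other_values:
        raise ValueError("need both EOI and non-EOI exons")
    eoi_low = all(z_score(v, control_log_ratios) < z_cutoff for v in eoi_values)
    others_low = any(z_score(v, control_log_ratios) < z_cutoff for v in other_values)
    drop = mean(other_values) - mean(eoi_values)  # assumed definition of the drop
    return eoi_low and not others_low and drop >= min_drop
```

Passing `eoi={2}` models an exon 2 deletion and `eoi={2, 3}` models an exon 2+3 deletion; the second condition guards against calling a focal exon loss when the whole gene is at reduced copy number.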
[0244] F. Cell Culture
[0245] KMS-27 (RPMI-1640), JHH-6 (Williams Media E), HuCCT1 (RPMI-1640), and HUH-1 (DMEM) cells were from JCRB, and 293 (EMEM) cells were from ATCC. Cells were cultured in the indicated media in the presence of 2 mM glutamine and 10% FBS.
[0246] G. Western Blotting
[0247] Cell lysates were prepared with RIPA Buffer (Sigma) supplemented with complete EDTA-free protease inhibitor (Roche) and the phosphatase inhibitors PhosSTOP (Roche), Phosphatase Inhibitor Cocktail 2 (Sigma), and Phosphatase Inhibitor Cocktail 3 (Sigma). Lysates were run on Novex Tris-Glycine 4-12% gradient gels (ThermoFisher) and transferred onto iBlot nitrocellulose (Invitrogen). Blots were pre-incubated in 5% skim milk powder (Merck) in TBST (10 mM Tris pH8, 150 mM NaCl, 0.1% TWEEN-20), followed by 5% bovine serum albumin (Sigma) in TBST containing antibodies. Secondary antibodies used were ECL Anti-Rabbit HRP and ECL Anti-Mouse HRP (both from GE Healthcare). Blots were developed with a Chemiluminescence Substrate Kit (Protein Simple) and visualized with a FluorChem HD2 imager (Protein Simple). Antibodies used in this study are against KEAP1 (Cell Signaling G1010), NRF2 (Abcam ab62352), HSP90 (Cell Signaling 4877), HDAC2 (Cell Signaling 5113), .beta.-actin (Sigma A2228), HA (Roche 11815016001), and FLAG (Sigma F2426). Lambda phosphatase was from NEB (P0753L), and phosphatase inhibitors were omitted from the lysis buffer in these experiments.
[0248] H. Cell Viability and DNA Fragmentation Analysis
[0249] siRNAs were reverse transfected into cells with Dharmafect 2 reagent (ThermoFisher) and OptiMEM (Gibco). Four days post-transfection, cells were measured for viability using CellTiter-Glo reagent (Promega) and luminescence was detected on an EnVision Multi-label Reader (Perkin Elmer). For apoptosis analysis, siRNAs were reverse transfected in the same manner, and four days post-transfection cells were measured for apoptosis using propidium iodide (PI) (LifeTechnologies) staining and flow cytometry following a published protocol (Riccardi and Nicoletti Nat. Protoc. 1:1458-1461, 2006). Staurosporine, 1 .mu.M, (Enzo) was added to positive control cells 24 hours pre-staining. siRNAs targeting NRF2 exon 2 had the sequences: 5'-TGGAGTAAGTCGAGAAGTA-3' (SEQ ID NO: 29) and 5'-ACAACTAGATGAAGAGACA-3' (SEQ ID NO: 30). siRNAs targeting NRF2 exon 5 had the sequences: 5'-TGACAGAAGTTGACAATTA-3' (SEQ ID NO: 31) and 5'-GTAAGAAGCCAGATGTTAA-3' (SEQ ID NO: 32), and were used along with non-target siRNA as control siRNA. Stained cells were analyzed with a Becton Dickinson FACSCalibur instrument. siRNAs targeting KEAP1 were from Dharmacon (L012453-00).
[0250] DNA fragmentation was quantified by propidium iodide (PI) staining and measured by flow cytometry according to Riccardi et al. (Nature Protocols, 1:1458-1461 (2006)).
[0251] I. Taqman Analysis
[0252] Total cellular RNA was extracted with an RNeasy Kit (Qiagen). RNA was converted to cDNA using a High Capacity cDNA Reverse Transcription Kit (Applied Biosystems), and cDNAs were amplified with Taqman Gene Expression primer-probe sets (ThermoFisher) using Taqman Gene Expression Master Mix reagents (Applied Biosystems). Taqman amplification/detection was performed on a QuantStudio7 Flex Real-Time PCR System. Primer-probe sets used were Hs00232352_m1 and Hs00975961_g1 to detect NRF2 exons 2 and 5, respectively (ThermoFisher). NRF2 target gene Taqman primer-probe sets used were: SLC7A11 (Hs00921938_m1), SGRN (Hs00921938_m1), NR0B1 (Hs03043658_m1), GCLC (Hs00155249_m1), and GPX2 (Hs01591589_m1), all from ThermoFisher.
[0253] J. 293 Transfections
[0254] Plasmid DNAs were transfected into cells using Lipofectamine 2000 (ThermoFisher) and OptiMEM (Gibco) as recommended by manufacturer protocol. Lysates were prepared 2-3 days post transfection. Expression plasmids used were pRK5.NRF2, pRK5.NRF2.delta.e2, pRK5.NRF2.delta.e2,3, pRK5.NRF2.FLAG and pRK5.KEAP1.HA.
[0255] K. Tumor Xenograft Models
[0256] Eleven- to twelve-week-old female C.B-17 SCID.beige mice (Charles River Laboratories) were subcutaneously inoculated in the right lateral flank with 10.times.10.sup.6 A549 shRNA cells in 100 .mu.l HBSS/MATRIGEL.RTM. (BD Biosciences) or with 10.times.10.sup.6 H441 shRNA cells in 100 .mu.l HBSS per mouse. When tumor volume reached approximately 150-250 mm.sup.3, mice were randomized to receive drinking water containing 1 mg/ml doxycycline (in 5% sucrose) or no doxycycline (5% sucrose alone) ad libitum. The doxycycline was replaced 3 times a week and the sucrose replaced once a week. Tumor volumes were determined using digital calipers (Fred V. Fowler Company, Inc.) using the formula (L.times.W.times.W)/2 and plotted as mean tumor volume (mm.sup.3)+/-SEM. Tumor growth inhibition (% TGI) was calculated as the percentage of the area under the fitted curve (AUC) for the respective dose group per day in relation to the vehicle, such that % TGI=100.times.(1-(AUC treatment/day)/(AUC vehicle/day)). In a separate study, mice with 150-250 mm.sup.3 tumors were dosed with 1 mg/ml doxycycline for 5 days before the tumors were excised and analyzed by Western blotting for NRF2 levels.
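The two calculations described in this paragraph can be sketched as follows. This is a minimal illustration only: it reads the % TGI expression as 100 x (1 - ratio), which is an assumption about the intended grouping, and it takes the per-day AUC values as given inputs because the curve-fitting procedure is not specified here. The function names and the numbers in the example are hypothetical and are not data from the study.

```python
def tumor_volume(length_mm, width_mm):
    """Caliper-based tumor volume in mm^3 using (L x W x W) / 2."""
    return (length_mm * width_mm * width_mm) / 2.0


def percent_tgi(auc_treatment_per_day, auc_vehicle_per_day):
    """% TGI = 100 x (1 - (AUC treatment/day) / (AUC vehicle/day)).

    The AUC values are assumed to be computed elsewhere from the
    fitted growth curves; that fitting step is not shown here.
    """
    if auc_vehicle_per_day <= 0:
        raise ValueError("vehicle AUC/day must be positive")
    return 100.0 * (1.0 - auc_treatment_per_day / auc_vehicle_per_day)


# Hypothetical inputs for illustration only.
example_volume = tumor_volume(length_mm=10.0, width_mm=6.0)
example_tgi = percent_tgi(auc_treatment_per_day=300.0,
                          auc_vehicle_per_day=400.0)
print(example_volume, example_tgi)
```

With this reading, a treatment group whose AUC/day equals that of the vehicle group gives 0% TGI, and values above 100% would correspond to a negative treatment AUC/day.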
[0257] L. A549 Xenografts Treated with ErbB3 Antibodies
[0258] Female nude mice (n=10) bearing subcutaneous A549 tumors (75-144 mm.sup.3) on Day 1 were treated with vehicle or 50 mg/kg YW57.88.5 (100 mg/kg loading dose) administered intravenously once each week for four weeks (qwk.times.4). Tumors were measured twice each week, and each animal was euthanized for endpoint at the earlier of its tumor reaching a volume of 1000 mm.sup.3 or the final day of the treatment regimen.
Example 2: Identification of NSCLC Cell Lines with Mutations in KEAP1 and NRF2
[0259] To identify mutations, copy number, and loss of heterozygosity (LOH) of KEAP1 and NRF2 in NSCLC, a panel of 113 NSCLC cell lines profiled by RNA-seq, exome-seq, or SNP arrays was documented (FIG. 1A). KEAP1 mutations were found in 29/113 cell lines (26%), and NRF2 mutations were detected in 4/113 cell lines (4%). Except for the NCI-H661 cell line, all KEAP1 mutated cell lines showed homozygous expression of the mutated allele, which was generally associated with copy neutral LOH. In contrast, the NRF2 mutations were heterozygous and not associated with LOH. A further two cell lines (HCC1534 and NCI-H1437) showed no detectable KEAP1 mRNA through bi-allelic loss of KEAP1 DNA. The NRF2 mutations were in the previously identified hotspots in the KEAP1 interface regions (FIG. 1B) (Shibata et al. Proc. Natl. Acad. Sci. U.S.A. 105:13568-13573, 2008), and comprised point mutations and an in-frame 3-amino-acid deletion. The mutations in KEAP1 were spread throughout the primary sequence (FIG. 1C), with few obvious hotspots. However, when mapped onto the KEAP1/NRF2 peptide crystal structure (Fukutomi et al. Mol. Cell. Biol. 34:832-846, 2014), the mutations cluster in the loops extending from the KEAP1 core beta propeller close to the interaction site with NRF2 (FIG. 1D).
Example 3: Identification of a Mutant KEAP1 Gene Signature
[0260] To determine the transcriptional consequences of KEAP1 mutations in NSCLC cell lines, genes that were significantly differentially expressed (p<0.01, absolute mean fold-change >2) in the KEAP1 mutated cell lines compared to the wild-type KEAP1 cell lines were identified. Overall, 27 genes were significantly up-regulated in the KEAP1 mutant cell lines (FIGS. 2A-2B), 15 of which have previously been identified as NRF2 target genes from ChIP-seq or RNA-seq studies (Chorley et al. Nucleic Acids Res. 40:7416-7429, 2012; Hirotsu et al. Nucleic Acids Res. 40:10228-10239, 2012; Malhotra et al. Nucleic Acids Res. 38:5718-5734, 2010). Only one gene, HSPB1, was identified as significantly down-regulated using these cut-offs.
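The selection cut-offs in this paragraph can be sketched as a simple filter. The sketch assumes that expression values are on a linear, positive scale, that "absolute mean fold-change >2" means a more than two-fold difference in either direction between the group means, and that the p-value is supplied by whatever statistical test was used (the test is not named here). The function name and the example values are hypothetical.

```python
from statistics import mean


def differentially_expressed(mutant, wild_type, p_value,
                             p_cutoff=0.01, fold_cutoff=2.0):
    """Classify one gene as 'up', 'down', or None (not significant).

    mutant / wild_type: linear-scale expression values per cell line.
    p_value: taken as an input; the statistical test is not modeled.
    """
    if p_value >= p_cutoff:
        return None
    fold = mean(mutant) / mean(wild_type)
    if fold > fold_cutoff:
        return "up"
    if fold < 1.0 / fold_cutoff:
        return "down"
    return None


# Hypothetical values for illustration only.
print(differentially_expressed([8.0, 12.0], [2.0, 2.0], p_value=0.001))
print(differentially_expressed([1.0, 1.0], [4.0, 4.0], p_value=0.001))
print(differentially_expressed([8.0, 12.0], [2.0, 2.0], p_value=0.05))
```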
[0261] Unsupervised clustering of 230 TCGA lung adenocarcinomas based on the expression of these 27 genes divided the tumors into two major groups (FIG. 3A). One group was mainly characterized by high expression of the 27 signature genes, and contained 43 tumors, out of which 32 (74%) were KEAP1 mutant. The other group, characterized by low expression, contained 187 tumors, out of which 179 were KEAP1 wild-type. Strikingly, using the same gene set to cluster lung squamous cell carcinomas, NRF2 as well as KEAP1 mutant tumors were distinguished from the NRF2/KEAP1 wild-type tumors (FIG. 3B), suggesting that NRF2 mediates most of the transcriptional consequences of KEAP1 loss/mutation. Interestingly, there were several squamous NSCLC tumors that showed high expression of the KEAP1 mutant genes without any known mutations in either KEAP1 or NRF2. Of the 27 genes up-regulated in the KEAP1 mutant cell lines, proteomic data were available for 17 in a smaller sub-set of cell lines (37 wild-type KEAP1, 6 mutant KEAP1). Consistent with the increased levels of mRNA of these genes in the mutant KEAP1 cell lines, the protein products of all but one of these 17 genes (SLC7A11, which had low peptide coverage) also showed increased expression in the mutant KEAP1 cell lines relative to the wild-type cell lines (FIG. 4).
Example 4: Identification of Aberrant Splicing of NRF2 in Tumor Samples
[0262] For the majority of tumors with high expression of the 27 candidate NRF2 target genes, elevated gene expression could be explained by mutations in KEAP1 or NRF2. However, there were some tumors that showed high expression of candidate NRF2 target genes in the absence of characterized mutations in either KEAP1 or NRF2. Cancer-associated transcript alterations are increasingly recognized as possible driver events. Therefore, it was hypothesized that NRF2 pathway activation in these tumors might be driven by splice alterations not recognized by whole-exome sequencing. 54 known oncogenes were analyzed to identify splice variants that are recurrently observed in cancer samples from the TCGA but are rarely detected in normal samples from the GTEx (see Example 1). 19 cancer types were selected, each including at least 100 cancer samples (6,359 samples in total). In the 54 considered oncogenes, nine recurrent candidate cancer-specific splice variants were identified (2 samples and >1% of samples for a given cancer type). Using the same detection criteria as in the cancer samples, none of these variants could be detected in normal controls (2,958 samples in total). Grouping together related variants with shared splice sites yielded five independent alterations in four oncogenes (FIG. 5). These alterations included several well-documented oncogenic splice variants, including EGFRvIII in brain cancers, MET exon 14 skipping in lung adenocarcinoma and CTNNB1 exon 3 deletions in colorectal cancers (Cho et al. Cancer Res. 71(24):7587-7596, 2011; Kong-Beltran et al. Cancer Res. 66(1):283-289, 2006; Iwao et al. Cancer Res. 58(5):1021-1026, 1998). Interestingly, previously uncharacterized splice variants in NRF2 were observed and occurred frequently in patients with squamous NSCLC (3.3%; 16/481) and at lower prevalence in patients with HNSC (1.5%; 6/403) (FIG. 5A). 
A more detailed analysis of NRF2 splice variants in lung squamous carcinoma revealed two splice variants co-occurring in the same patients, corresponding to skipping of NRF2 exon 2 in mRNAs transcribed from either one of two alternative promoters (2.1%; 10/481) (FIG. 6). Two additional splice variants co-occurred in a distinct set of patients (1.2%; 6/481), corresponding to skipping of both NRF2 exons 2 and 3 (exon 2+3) in mRNAs with either one of the two alternative transcript starts (FIG. 6). All patients expressing NRF2 splice variants lacking exon 2 or exon 2+3 also showed expression of normal NRF2 transcripts as evidenced by split reads supporting inclusion of exon 2. Both exons 2 and 3 are part of the NRF2 coding sequence, and skipping of exon 2 or exon 2+3 is predicted to result in protein isoforms with either an N-terminal truncation or an in-frame deletion (FIG. 7). The high recurrence of NRF2 transcripts lacking exon 2 and preservation of coding potential suggest that these splice variants may represent gain-of-function events conferring a selective advantage. This is supported by the finding that exon 2 encodes the Neh2 domain, which allows for interaction with KEAP1 (Itoh et al. Genes Dev. 13(1):76-86, 1999), which is mutated in 15% of squamous lung cancers.
[0263] To assess whether the observed NRF2 splice variants can account for NRF2 pathway activation in patients without mutations in KEAP1 or NRF2, co-occurrence of NRF2 splice variants and NRF2 pathway mutations was examined. In the TCGA collection, 178 of the squamous lung tumors were profiled by exome-seq. In this subset, 10 tumors (6%) displaying exon 2 or exon 2+3 deletion were mutually exclusive with 48 tumors (27%) showing mutations in either NRF2 or KEAP1 (FIG. 8A). Moreover, all exon-2 deleted tumors showed high expression of the 27 candidate NRF2 target genes (FIG. 8B). Similar observations were made for head and neck cancer, where NRF2 exon deletions in 5 tumors (2%) were mutually exclusive with mutations in NRF2 or KEAP1 in 26 tumors (9%) (FIGS. 9A-9B). These results suggest that deletion of exon 2 represents an alternative mechanism for activation of NRF2 in a subset of squamous NSCLC and head and neck tumors. Importantly, these results show that consideration of splice alterations, in addition to exome sequencing, increased the percentage of patients identified as having putative NRF2 pathway activation from 27% (48/178) to 33% (58/178) in lung squamous carcinoma and from 9% (26/275) to 11% (31/275) in head and neck squamous carcinoma.
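The percentages in this paragraph follow directly from the reported counts; because the two groups are mutually exclusive, the combined count is the simple sum. A minimal sketch of the arithmetic is given below. The function names are hypothetical, and the `overlap` parameter is included only to make the mutual-exclusivity assumption explicit.

```python
def percent(count, total):
    """Percentage of `count` out of `total`."""
    return 100.0 * count / total


def pathway_activation(total, mutated, exon_deleted, overlap=0):
    """Return (% by mutation alone, % by mutation or exon deletion).

    overlap is the number of tumors carrying both alterations; it is 0
    when the two groups are mutually exclusive, as reported above.
    """
    combined = mutated + exon_deleted - overlap
    return percent(mutated, total), percent(combined, total)


# Counts taken from the paragraph above.
lung = pathway_activation(total=178, mutated=48, exon_deleted=10)
hnsc = pathway_activation(total=275, mutated=26, exon_deleted=5)
print([round(x) for x in lung])  # lung squamous carcinoma
print([round(x) for x in hnsc])  # head and neck squamous carcinoma
```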
Example 5: Validation of NRF2 Splicing Defects in Cell Lines
[0264] To identify cell line models for further study, RNA-seq data from a large panel of human cancer cell lines (described in Klijn et al. Nat. Biotechnol. 33(3):306-312, 2014) were analyzed for read evidence of the identified splice variants. Out of 611 cell lines, one multiple myeloma cell line, KMS-27, and one hepatocellular carcinoma cell line, JHH-6, were identified, both showing evidence for heterozygous skipping of NRF2 exon 2 by junction reads (FIG. 10). NRF2 exon 2 skipping in JHH-6 and KMS-27 mRNA was validated by RT-PCR. Using a series of forward and reverse primers derived from exon 1 and exons 3/4, respectively (FIG. 11A), the exon 2 deletion (.DELTA.e2 NRF2) in mRNA isolated from JHH-6 and KMS-27 cells was confirmed (FIG. 11B). Sequencing of the PCR products confirmed the expected deletion of exon 2 (FIGS. 12A-12C). Based on RNA-seq data, no point mutations were detected in the coding sequence of NRF2 or KEAP1 in JHH-6 or KMS-27 (Klijn et al. Nat. Biotechnol. 33(3):306-312, 2014).
[0265] As NRF2/KEAP1 alterations are fairly common in hepatocellular carcinoma (10%) but infrequent in multiple myeloma (0%), JHH-6 cells were further tested. Specifically, expression of the exon 2-deleted form of NRF2 protein was tested. Western blotting of whole-cell lysates from JHH-6 cells, as well as the KEAP1 mutant HUH-1 line, and HuCCT1 cells as a representative wild-type KEAP1 liver cancer cell line was performed. The levels of NRF2 in JHH-6 cells were comparable to those seen in HUH-1 cells, which were much higher than in the wild-type KEAP1 HuCCT1 cells (FIG. 13). Moreover, a smaller molecular weight species, consistent with a deletion of exon 2, was detectable in JHH-6 and was reduced upon NFE2L2 siRNA transfection, confirming that it indeed represents a form of NRF2. While the altered NRF2 isoform was visible, it was surprising that it was not more abundant, given the lack of KEAP1 interaction motifs. It was hypothesized that a phosphorylated form of exon 2-deleted NRF2 might co-migrate with the unphosphorylated form of wild-type NRF2 in the 4-12% gels used. Indeed, dephosphorylation of JHH-6 lysates showed that the exon 2-deleted form of NRF2 was significantly more abundant than the wild-type form (FIG. 14A, middle panel). Similarly, KMS-27 cells expressed the exon 2-deleted form of NRF2, which was the major species apparent upon dephosphorylation (FIG. 15).
[0266] The stability of NRF2 in the three liver cancer cell lines was tested using cycloheximide to abolish total protein synthesis. Dephosphorylated lysates were used to allow more accurate quantification of total NRF2. The experiments showed increased stability of .DELTA.e2 NRF2 in JHH-6 cells, comparable to NRF2 in HUH-1 cells, which were both more stable than NRF2 in HuCCT1 cells (FIGS. 14A-14B). The exon 2-deleted form of NRF2 in JHH-6 cells also showed prominent nuclear localization, even when compared to HUH-1 cells (FIG. 16).
[0267] To determine whether the deletion of exon 2 in JHH-6 cells made NRF2 refractory to regulation by KEAP1, the stability of NRF2 in response to KEAP1 knockdown was tested. Knockdown of KEAP1 in HuCCT1 cells resulted in increased steady state levels of NRF2 due to increased stability (FIG. 14C). However, knockdown of KEAP1 in JHH-6 cells did not affect the levels or stability of exon 2-deleted NRF2. As expected, knockdown of KEAP1 did not increase the stability of wild-type NRF2 in the KEAP1 mutant HUH-1 cell line (FIG. 14D).
Example 6: Assessment of Exon 2 and/or Exon 2+3 Deletion on NRF2
[0268] Utilizing the NRF2/KEAP1 gene signature described in Example 3, it was determined that, of 16 hepatocellular carcinoma cell lines, JHH-6 cells show among the highest expression of NRF2 target genes from RNA-seq data, similar to those seen in mutant KEAP1 expressing lines (FIG. 17A). Similarly, out of 18 multiple myeloma cell lines examined, KMS-27 cells show among the highest expression of these genes (FIG. 17B). Expression of these genes can be summarized by a "NRF2 target gene score" calculated as the mean of z-scores for individual target genes across the 611 cell lines examined. This results in a single score per cell line that reflects the extent of overexpression of signature genes in the given line. The NRF2 target score confirms that JHH-6 cells show a similar score as liver cancer cell lines expressing KEAP1 mutations (FIG. 18A) and KMS-27 cells show the highest score among multiple myeloma cell lines (FIG. 18B), despite multiple myeloma showing a low overall NRF2 target gene score (indicated by the negative values).
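The "NRF2 target gene score" described above can be sketched as follows: each target gene is converted to a z-score across all cell lines, and the z-scores are then averaged per cell line. This is a minimal illustration under stated assumptions: the use of the population standard deviation, the handling of genes with zero variance, and the nested-dictionary input layout are choices made here for the example and are not specified in the text. All names and values are hypothetical.

```python
from statistics import mean, pstdev


def target_gene_scores(expression):
    """Mean of per-gene z-scores for each cell line.

    expression: dict mapping gene -> dict mapping cell line -> value.
    Every gene is assumed to have a value for every cell line.
    """
    z_by_line = {}
    for gene, by_line in expression.items():
        values = list(by_line.values())
        mu = mean(values)
        sd = pstdev(values)  # population SD; an assumption of this sketch
        for line, value in by_line.items():
            z = 0.0 if sd == 0 else (value - mu) / sd
            z_by_line.setdefault(line, []).append(z)
    return {line: mean(zs) for line, zs in z_by_line.items()}


# Hypothetical expression values for two genes in three cell lines.
example = {
    "geneA": {"line1": 1.0, "line2": 2.0, "line3": 3.0},
    "geneB": {"line1": 2.0, "line2": 4.0, "line3": 6.0},
}
scores = target_gene_scores(example)
print(scores)
```

A positive score indicates that the signature genes are, on average, expressed above the panel mean in that cell line, and a negative score indicates expression below the panel mean, which matches the negative values noted for multiple myeloma lines.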
[0269] Next, the dependence of JHH-6 cells expressing exon 2-deleted NRF2 on the expression of NRF2 protein was compared with that of wild-type NRF2-expressing HuCCT1 cells. Knockdown of NRF2 in JHH-6 cells caused a marked decrease in cell viability, similar to that seen in the mutant KEAP1 hepatocellular carcinoma cell line HUH-1. In contrast, NRF2 knockdown had a more modest effect on the viability of HuCCT1 cells (FIG. 19). This was not due to defective NRF2 knockdown in HuCCT1 cells, as NRF2 knockdown was equally efficient in all three cell lines (FIG. 20). Knockdown of NRF2 also resulted in decreased expression of four well-characterized NRF2 target genes, although this was slightly reduced in the wild-type KEAP1 HuCCT1 cell line (FIG. 21). Decreased viability was likely due, at least in part, to apoptosis as measured by an increase in fragmented DNA (FIG. 22).
[0270] To address how loss of NRF2 exon 2 affects the ability of NRF2 to be regulated by KEAP1, transient expression in 293 cells was used. KEAP1 decreased the expression of full-length NRF2, but had lesser effects on the expression of NRF2 lacking exon 2 or exons 2+3 (FIG. 23, upper panels). The inhibitory effect of KEAP1 on full-length NRF2 expression was mostly abolished by proteasome inhibitor MG132, as expected. Full-length NRF2 and KEAP1 interacted with each other, whereas deletion of exon 2 or exon 2+3 completely abolished the ability of KEAP1 to bind NRF2 (FIG. 23, lower panels). As a result, truncated NRF2 remained stable following KEAP1 expression, in contrast to wild-type NRF2 (FIGS. 24A-24B), although the truncated forms of NRF2 appeared to have slightly decreased intrinsic stability. However, altered NRF2 isoforms were transcriptionally active, as judged by their ability to increase NRF2 target gene expression (FIG. 25). Most genes were similarly increased by exon 2- or exon 2+3-deleted NRF2 compared to full-length NRF2 and were resistant to the effects of KEAP1 overexpression. Interestingly, exon 2+3-deleted NRF2 was defective for increasing GPX2 expression, suggesting that there might be subtle differences in the transcriptional activation of this form of NRF2. Consistent with this observation, 22 of the 27 target genes described in Example 3, in addition to GPX2, showed lower median expression in exon 2+3-deleted squamous lung tumors compared to exon 2-deleted tumors (FIG. 26).
Example 7: Mechanistic Analysis of NRF2 Exon 2 Splice Alteration
[0271] Analysis of exome-seq data for KMS-27 and JHH-6 shows a decrease in reads mapping to exon 2, suggesting that the observed transcript variants could be the result of genomic alterations (FIG. 27A). Whole-genome sequencing (WGS) of JHH-6 and KMS-27 showed that these cell lines harbor microdeletions surrounding NRF2 exon 2, spanning 4,685 and 2,981 nucleotides, respectively (FIG. 27B). To investigate the causal mechanism in patients, targeted paired-end exome-seq data from a large cohort (n=1,218) of clinical squamous NSCLC tumors with high read coverage (>300.times.) were analyzed. In this data set, eleven tumors showed a decrease in copy number for exon 2 or exons 2+3 compared to nearby control regions (Materials and Methods; FIG. 27B). The focal nature of the deletions can be appreciated by investigating log-ratios from defined genomic regions targeted for sequencing (FIGS. 28B-1 and 28B-2). Seven tumors with discordant read pairs were consistent with structural variants encompassing several kilobases of DNA and affecting exon 2 or exon 2+3 (FIG. 28A). In total, sixteen patients showed evidence for genomic alterations affecting NRF2 exon 2 or exon 2+3, and the identified events were mutually exclusive with point mutations or indels in NRF2 and KEAP1, which are known to activate this pathway. An additional cohort of 45 squamous NSCLC tumors, for which both RNA and DNA were available, was analyzed. RT-PCR analysis identified a single patient with loss of exon 2, which was strongly enriched in the tumor compared to adjacent normal tissue (FIG. 29). RNA-seq analysis confirmed that the transcript variant was expressed in the identified tumor, but was absent in adjacent normal tissue (FIG. 30). Expression of NRF2 target genes was also elevated to a similar extent as in TCGA tumors with known mutations in this pathway, whereas the adjacent normal tissue showed low expression of these genes (FIG. 31).
Finally, whole-genome sequencing confirmed that the transcript variant was the result of a somatic genomic microdeletion of 5,233 nucleotides surrounding exon 2 (FIG. 28C). These data suggest that genomic microdeletions are a clinically relevant mechanism for NRF2 pathway activation.
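The idea of comparing exon 2 coverage with nearby control regions by a log-ratio can be illustrated with the sketch below. It is not the procedure of the Materials and Methods: the use of the median of the control regions, the log2 scale, the calling threshold, and all names and numbers are assumptions made for this example, and a practical analysis would also need to account for tumor content and a normal reference.

```python
from math import log2
from statistics import median


def exon_log_ratio(exon_coverage, control_coverages):
    """log2 of exon read coverage over the median control-region coverage."""
    if exon_coverage <= 0:
        raise ValueError("exon coverage must be positive")
    return log2(exon_coverage / median(control_coverages))


def is_candidate_deletion(exon_coverage, control_coverages, threshold=-0.5):
    """Flag an exon whose log-ratio falls below a hypothetical threshold."""
    return exon_log_ratio(exon_coverage, control_coverages) < threshold


# Hypothetical coverage values for illustration only.
ratio = exon_log_ratio(150.0, [300.0, 310.0, 290.0])
print(ratio, is_candidate_deletion(150.0, [300.0, 310.0, 290.0]))
```

In this toy example the exon has half the coverage of the control regions, giving a log2 ratio of -1, which is what a heterozygous loss would produce in a pure diploid tumor sample.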
[0272] These data suggest that the set of genes regulated by NRF2 is conserved across different tissues and conditions. This has practical value in the use of a single gene signature to identify tumors with NRF2 activation in both NSCLC and HNSC (FIG. 32). Interestingly, this NRF2/KEAP1 signature is only activated in tumors. Matched normal samples for lung and head and neck tumors showed only low NRF2 target gene activity (FIG. 33). This suggests that inhibition of the NRF2 pathway might have selective benefit in tumors showing pathway deregulation compared to normal tissues.
[0273] Intragenic genomic deletions that result in activation of proto-oncogenes have previously been reported for a number of genes, including EGFR and CTNNB1. Such variants are not routinely assayed, due in part to limitations of current genomic technologies. In particular, small aberrations affecting individual exons and involving small copy number changes are difficult to detect by exome-seq alone. Thus, intragenic deletions have remained relatively unexplored and new variants are still being discovered. Recent studies of small cell lung cancer and adult T cell leukemia/lymphoma identified recurrent microdeletions in TP73, IKZF2, and CARD11 using whole-genome sequencing (George et al. Nature 524:47-53, 2015; Kataoka et al. Nat. Genet. 47:1304-1315, 2015). In the present study, publicly available RNA-seq data generated as part of the TCGA project was used to identify recurrent transcript alterations in known oncogenes. Due to differences between patient cohorts, it is difficult to assess the general prevalence of NRF2 exon deletions. For example, when analyzing TCGA lung squamous cancers with available RNA-seq data (n=481), we identified 3% (16/481) of patients having a deletion of NRF2 exon 2 or exon 2+3. When analyzing the subset of patients with available exome-seq data (n=178), for which somatic mutation calling can be performed, the proportion of patients with NRF2 exon deletions was 6% (10/178). Accounting for NRF2 exon deletions increased the percentage of patients with putative NRF2 pathway activation from 27% (48/178) to 33% (58/178) in lung squamous carcinoma and from 9% (26/275) to 11% (31/275) in head and neck squamous carcinoma, compared to assessing mutations in NRF2 or KEAP1 by exome-seq alone (FIGS. 8A and 9A). Analysis of real-world clinical samples from patients that underwent genomic profiling suggested a prevalence of NRF2 exon deletions in 1-2% of lung squamous cell carcinoma.
However, the latter analysis lacks sensitivity since optimized criteria for determining single-exon deletions in samples with variable tumor content have yet to be established and only unambiguous deletions were considered. Nevertheless, the results presented herein are consistent with the concept that this pathway is frequently altered in specific tumor indications, such as squamous NSCLC and head and neck carcinomas. Additional screening of known cancer genes through sequencing of complete gene loci, including introns, or by combining data from exome and RNA sequencing experiments may also be performed.
[0274] Analysis of the structure of the three deletions identified by WGS showed that breakpoints were distinct, but in each case genomic regions flanking the deletions showed 2-6 nucleotides with sequence homology (FIG. 34). The DNA sequences of the 3' end, 5' end, and junction read of JHH-6 cells are provided by SEQ ID NOs: 61-63, respectively. The DNA sequences of the 3' end, 5' end, and junction read of KMS-27 cells are provided by SEQ ID NOs: 64-66, respectively. The DNA sequences of the 3' end, 5' end, and junction read of primary tumor cells are provided by SEQ ID NOs: 67-69, respectively.
[0275] NRF2 often shows genomic amplification in addition to point mutations. Interestingly, while the intensity of the NRF2 deletion product in KMS-27 cells by RT-PCR analysis appeared similar to wild-type NRF2, it seemed to be more abundant in JHH-6 cells (FIG. 13). This was also reflected in WGS read counts, which suggested a higher abundance of the deleted form compared to the wild-type allele (FIG. 27B). These results are consistent with the observation that JHH-6 cells carry five copies of the NRF2 gene locus by SNP array, whereas KMS-27 cells carry two copies. Amplification of NRF2 is reasonably frequent in the TCGA samples analyzed, including squamous (4.5%) and adenomatous (2.6%) NSCLC, HNSC (12.2%), and liver cancers (3.6%), and represents a mechanism to increase NRF2 transcriptional output. In the case of JHH-6 cells, these data suggest that the deleted allele has been preferentially amplified, providing an additional mechanism to boost NRF2 signaling in this cell line. However, preferential amplification of the truncated/spliced allele was not observed in the primary tumors, suggesting that exon 2 or 2+3 deletion alone can provide sufficient NRF2 activity for clonal selection.
[0276] Deletion of exon 2 provides an elegant mechanism to increase NRF2 activity by removing the interaction site with KEAP1, while keeping the remainder of the gene functionally intact for DNA binding and transcriptional activation functions. Indeed, our biochemical analyses confirmed the almost complete loss of KEAP1 binding and resulting stabilization of NRF2 when exon 2 is deleted (FIGS. 23 and 24). When considering NRF2 point mutations found in tumors, mutations surrounding the ETGE high-affinity binding site result in complete loss of KEAP1 interaction, whereas mutations in the lower affinity DLG motif vary in their ability to disrupt the NRF2/KEAP1 complex (Fukutomi et al. Mol Cell Biol. 34(5):832-846, 2014; Shibata et al. Proc. Natl. Acad. Sci. USA. 105(36):13568-13573, 2008). However, even point mutations that do not disrupt the complex change the nature of the interaction such as to prevent KEAP1-mediated ubiquitination of NRF2 (Shibata et al. Proc. Natl. Acad. Sci. USA. 105(36):13568-13573, 2008). While the interaction with KEAP1 is similarly abolished in the case of deletion of both exon 2 and 3, exon 3 contains the Neh4 domain that has been previously implicated in transcriptional activation by NRF2 through binding to CREB (cAMP Responsive Element Binding protein) Binding Protein (CBP) (Katoh et al. Genes Cells. 6(10):857-868, 2001). Neh4 (contained in exon 3) and Neh5 (contained in exon 4) were shown to act synergistically in recruiting CBP. Consistent with this, a decreased ability of .DELTA.e2+3 NRF2 to induce some NRF2 target genes compared to .DELTA.e2 NRF2 or tumor-associated point mutations in NRF2 was observed (FIGS. 25 and 26).
[0277] Deletions found in human tumors that remove the interaction domain with E3 ligases have also been observed in other genes. For example, 7 out of 222 colorectal tumors showed small genomic deletions (234-677 bp) surrounding exon 3 of .beta.-catenin (Iwao et al. Cancer Res. 58(5):1021-1026, 1998) that remove the interaction site for its E3 ligase .beta.-TRCP (Hart et al. Curr. Biol. 9(4):207-210, 1999). Similarly, the majority of TMPRSS2-ERG fusion proteins found in prostate cancer encode truncated versions of ERG that render them resistant to ubiquitination and degradation mediated by SPOP (An et al. Mol. Cell. 59(6):904-916, 2015).
[0278] In addition, mutations resulting in MET exon 14 skipping remove amino acid residue Y1003, which is required for Cbl recruitment and subsequent ubiquitination and down-regulation. Therefore, small intragenic deletions represent effective mechanisms for nascent oncogenes to escape normal degradation during tumor initiation and evolution.
Example 8: NRF2 Knockdown in Mutant KEAP1 Cells
[0279] This example provides a characterization of the effects of KEAP1 mutations on the requirement for NRF2 activity under different growth environments and shows that NRF2 activity is essential for growth in anchorage independent conditions.
[0280] The consequences of NRF2 inhibition across wild-type and mutant KEAP1 and NRF2 cell lines were examined. Stable cell lines expressing three independent NRF2 shRNAs under the control of doxycycline, as well as three independent non-targeting controls (NTCs) were established. These NRF2 shRNAs were effective at reducing NRF2 protein levels in five KEAP1 mutant cell lines, two NRF2 mutant cell lines, and five wild-type NSCLC cell lines, as well as in immortalized but non-transformed lung epithelial BEAS2B cells (FIG. 35). Upon doxycycline addition, viability of most cell lines was decreased to varying extents, with the KEAP1 mutant cell lines generally exhibiting significantly greater decreases (FIGS. 36 and 37). Knockdown of NRF2 by siRNA in a larger panel of NSCLC cell lines confirmed a genotype-dependent effect on cell viability (FIG. 38).
[0281] The consequence of NRF2 knockdown in tumor xenografts was characterized. The KEAP1 mutant A549 cell line and the KEAP1 wild-type H441 cell line, each expressing dox-inducible NRF2 shRNAs, were implanted into the flanks of female SCID mice. NRF2 was effectively knocked down in doxycycline-treated mice in both tumor models (FIGS. 39 and 40). NRF2 knockdown in the KEAP1 mutant A549 cell line had a dramatic effect on tumor growth, resulting in complete tumor regression in 5 out of 10 tumors (FIG. 41A). In contrast, the effect on KEAP1 wild-type H441 growth was more modest, resulting in a 37% reduction in tumor growth with all animals displaying maintained tumor burden (FIG. 41B).
[0282] To understand the differential effects of NRF2 knockdown on tumor propagation in xenografts versus 2D growth on plastic, several additional cell culture environments were tested. NRF2 knockdown in cells grown in low adherence plates and/or low oxygen (0.5%) showed similar consequences to cells grown on plastic (FIG. 42). In contrast, the growth of KEAP1 mutant cell lines was severely compromised when cultured in soft agar (FIGS. 43 and 44), on micropatterned plastic films (FIGS. 45 and 46), or in methylcellulose (FIG. 47). Growth in soft agar was used to characterize the consequences of NRF2 knockdown in more detail. While knockdown of NRF2 completely abolished colony formation in three KEAP1 mutant cell lines, it had almost no effect in H1048 and H441, two wild-type KEAP1 NSCLC cell lines (FIGS. 43 and 44). The role of the glutathione pathway in the response to NRF2 knockdown was assessed, as this pathway has been shown to mediate survival properties facilitated by high NRF2 activity. While addition of reduced glutathione generally increased the ability of all tested cell lines to form colonies in soft agar, it was unable to rescue the consequences of NRF2 knockdown (FIGS. 43 and 44). Similar negative results were seen with N-acetyl cysteine (NAC; FIG. 48). Exogenous glutathione was able to enter cells and reduce reactive oxygen species (ROS) levels, as measured by dichlorofluorescein staining (FIG. 49). Thus, the requirement for NRF2 activity is surprisingly independent of the glutathione synthesis pathway.
[0283] To further explore the effects of the glutathione pathway in NRF2 responses, the expression and activity of the xCT glutamate/cystine antiporter, which mediates one of the rate-limiting steps in glutathione synthesis, were monitored. SLC7A11 expression was reduced following NRF2 knockdown (FIG. 50), causing a decrease in cystine uptake (FIG. 51) associated with reduced glutathione (FIG. 52). NRF2 knockdown also caused a large increase in ROS levels (FIG. 53). To determine whether inhibition of SLC7A11 expression and cystine uptake contributed to decreased viability following NRF2 knockdown, xCT function was inhibited using erastin, which blocked cystine uptake (FIG. 51) and increased oxidative stress (FIG. 53). However, this was not sufficient to decrease the viability of the KEAP1 mutant cell line A549 (FIG. 54) or most other KEAP1 mutant cell lines (FIG. 55). The combination of erastin and NRF2 knockdown, however, did result in a dramatic decrease in viability (FIG. 54). Similarly, the glutathione synthesis inhibitor buthionine sulfoximine (BSO) and the glutaminase inhibitor BPTES did not display preferential toxicity for KEAP1 mutant cell lines (FIGS. 56 and 57). These results indicate that supplementation with glutathione was not sufficient to rescue lethality induced by NRF2 knockdown, nor was depletion of glutathione sufficient to kill KEAP1 mutant cell lines.
[0284] In order to understand which pathways were activated as a consequence of NRF2 activation or KEAP1 loss, a CRISPR screen was performed using a library of genes that were decreased upon NRF2 knockdown in A549 cells and/or elevated in a panel of KEAP1 mutant NSCLC cell lines. As distinct consequences were observed following NRF2 knockdown in 2D, 3D, and xenograft growth conditions, the screen was performed under all three environments to determine whether discrete dependencies could be identified. At the 15-day time point, all three screens performed similarly, with gRNAs representing only a small number of genes showing significant drop-out (FIGS. 58-60). NFE2L2 and its binding partner, MAFG, were among the most significant genes, showing that the screen performed as expected. The pentose phosphate pathway genes PGD, G6PD and TKT, known NRF2 target genes, also showed strong drop-out. Other strong hits in the screen were two growth factor receptor genes, IGF1R and ERBB3, and genes encoding three components of a redox signaling relay, PRDX1, TXN, and TXNRD1.
[0285] Expression of ErbB3 was decreased following NRF2 knockdown in A549 cells (FIGS. 58-60). Treatment with YW57.88.5 in a tumor xenograft model indicated that ErbB3 was required for A549 proliferation (FIG. 61).
[0286] Expression of IGF1R was greater in KEAP1 mutant NSCLC cell lines than in KEAP1 wild-type NSCLC cell lines. To test the effect of IGF1R inhibition on KEAP1 mutant and KEAP1 wild-type cells, cell lines were treated with linsitinib, a potent and selective small molecule inhibitor of IGF1R. Linsitinib showed little effect on proliferation when tested in three wild-type and three mutant KEAP1 NSCLC cell lines. However, this compound was very potent at inhibiting colony growth of A549 cells in soft agar, having an IC50 of about 20 nM. Moreover, when tested against a large panel of NSCLC cell lines, this compound appeared to selectively inhibit the growth of KEAP1 mutant cell lines in soft agar. A similar selective effect on KEAP1 mutant cell lines grown under anchorage-independent conditions was also seen with an independent IGF1R inhibitor, NVP-AEW541 (FIG. 62).
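As an illustration of how an IC50 such as the one reported above can be estimated from a dose-response series, the following minimal sketch interpolates linearly in log10(concentration) between the two doses that bracket 50% of untreated growth. The Hill-type response function, the dose series, and the function names are assumptions made for this example. The synthetic responses are generated from a curve with a 20 nM midpoint only to echo the value in the text; they are not measured data, and the source does not state how its IC50 was derived.

```python
import math


def hill_response(conc, ic50, hill=1.0):
    """Fraction of untreated growth remaining at a given concentration."""
    return 1.0 / (1.0 + (conc / ic50) ** hill)


def interpolate_ic50(concs, responses):
    """Estimate IC50 by log-linear interpolation between the points bracketing 50%."""
    points = sorted(zip(concs, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= 0.5 >= r_hi and r_lo != r_hi:
            # Position of the 50% crossing between the two bracketing doses.
            frac = (r_lo - 0.5) / (r_lo - r_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    raise ValueError("responses do not cross 50% within the tested range")


# Synthetic dose series in nM (hypothetical), with responses from an idealized curve.
doses_nm = [1, 3, 10, 30, 100, 300]
responses = [hill_response(c, ic50=20.0) for c in doses_nm]

estimate = interpolate_ic50(doses_nm, responses)
```

Interpolation is the simplest possible estimator and depends on the spacing of the tested doses. Fitting a full four-parameter logistic curve is the more common choice when enough data points are available.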
[0287] Thus, growth factors signaling through IGF1R and ErbB3 are significant mediators of the growth of KEAP1 mutant cells.
OTHER EMBODIMENTS
[0288] Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, the descriptions and examples should not be construed as limiting the scope of the invention. The disclosures of all patent and scientific literature cited herein are expressly incorporated in their entirety by reference.
Sequence CWU
1
1
6915446RNAHomo sapiens 1uacuuuggga acuggugagu cucccugucc cuagggcuuu
uuagucacau guccauccac 60uguuucaaug uaacaugcau cuaggcaagg uuaacgauua
aaugguuggg augaaagguc 120auccuuuacg gagaacauca gaaugguaga uaauuccugu
uccacuuucu uugaugaaac 180aaguaaagaa gaaacaacac aaucauauua auagaagagu
cuucguucca gacgcagucc 240aggaaucaug cuggagaagu ucugcaacuc uacuuuuugg
aauuccucau uccuggacag 300uccggaggca gaccugccac uuuguuuuga gcaaacuguu
cuggugugga uucccuuggg 360cuuccuaugg cuccuggccc ccuggcagcu ucuccacgug
uauaaaucca ggaccaagag 420auccucuacc accaaacucu aucuugcuaa gcagguauuc
guugguuuuc uucuuauucu 480agcagccaua gagcuggccc uuguacucac agaagacucu
ggacaagcca cagucccugc 540uguucgauau accaauccaa gccucuaccu aggcacaugg
cuccugguuu ugcugaucca 600auacagcaga caauggugug uacagaaaaa cuccugguuc
cugucccuau ucuggauucu 660cucgauacuc uguggcacuu uccaauuuca gacucugauc
cggacacucu uacaggguga 720caauucuaau cuagccuacu ccugccuguu cuucaucucc
uacggauucc agauccugau 780ccugaucuuu ucagcauuuu cagaaaauaa ugagucauca
aauaauccau cauccauagc 840uucauuccug aguagcauua ccuacagcug guaugacagc
aucauucuga aaggcuacaa 900gcguccucug acacucgagg augucuggga aguugaugaa
gagaugaaaa ccaagacauu 960agugagcaag uuugaaacgc acaugaagag agagcugcag
aaagccaggc gggcacucca 1020gagacggcag gagaagagcu cccagcagaa cucuggagcc
aggcugccug gcuugaacaa 1080gaaucagagu caaagccaag augcccuugu ccuggaagau
guugaaaaga aaaaaaagaa 1140gucugggacc aaaaaagaug uuccaaaauc cugguugaug
aaggcucugu ucaaaacuuu 1200cuacauggug cuccugaaau cauuccuacu gaagcuagug
aaugacaucu ucacguuugu 1260gaguccucag cugcugaaau ugcugaucuc cuuugcaagu
gaccgugaca cauauuugug 1320gauuggauau cucugugcaa uccucuuauu cacugcggcu
cucauucagu cuuucugccu 1380ucaguguuau uuccaacugu gcuucaagcu ggguguaaaa
guacggacag cuaucauggc 1440uucuguauau aagaaggcau ugacccuauc caacuuggcc
aggaaggagu acaccguugg 1500agaaacagug aaccugaugu cuguggaugc ccagaagcuc
auggauguga ccaacuucau 1560gcacaugcug uggucaagug uucuacagau ugucuuaucu
aucuucuucc uauggagaga 1620guugggaccc ucagucuuag cagguguugg ggugauggug
cuuguaaucc caauuaaugc 1680gauacugucc accaagagua agaccauuca ggucaaaaau
augaagaaua aagacaaacg 1740uuuaaagauc augaaugaga uucuuagugg aaucaagauc
cugaaauauu uugccuggga 1800accuucauuc agagaccaag uacaaaaccu ccggaagaaa
gagcucaaga accugcuggc 1860cuuuagucaa cuacagugug uaguaauauu cgucuuccag
uuaacuccag uccugguauc 1920uguggucaca uuuucuguuu auguccuggu ggauagcaac
aauauuuugg augcacaaaa 1980ggccuucacc uccauuaccc ucuucaauau ccugcgcuuu
ccccugagca ugcuucccau 2040gaugaucucc uccaugcucc aggccagugu uuccacagag
cggcuagaga aguacuuggg 2100aggggaugac uuggacacau cugccauucg acaugacugc
aauuuugaca aagccaugca 2160guuuucugag gccuccuuua ccugggaaca ugauucggaa
gccacagucc gagaugugaa 2220ccuggacauu auggcaggcc aacuuguggc ugugauaggc
ccugucggcu cugggaaauc 2280cuccuugaua ucagccaugc ugggagaaau ggaaaauguc
cacgggcaca ucaccaucaa 2340gggcaccacu gccuaugucc cacagcaguc cuggauucag
aauggcacca uaaaggacaa 2400cauccuuuuu ggaacagagu uuaaugaaaa gagguaccag
caaguacugg aggccugugc 2460ucuccuccca gacuuggaaa ugcugccugg aggagauuug
gcugagauug gagagaaggg 2520uauaaaucuu aguggggguc agaagcagcg gaucagccug
gccagagcua ccuaccaaaa 2580uuuagacauc uaucuucuag augacccccu gucugcagug
gaugcucaug uaggaaaaca 2640uauuuuuaau aaggucuugg gccccaaugg ccuguugaaa
ggcaagacuc gacucuuggu 2700uacacauagc augcacuuuc uuccucaagu ggaugagauu
guaguucugg ggaauggaac 2760aauuguagag aaaggauccu acagugcucu ccuggccaaa
aaaggagagu uugcuaagaa 2820ucugaagaca uuucuaagac auacaggccc ugaagaggaa
gccacagucc augauggcag 2880ugaagaagaa gacgaugacu augggcugau auccagugug
gaagagaucc ccgaagaugc 2940agccuccaua accaugagaa gagagaacag cuuucgucga
acacuuagcc gcaguucuag 3000guccaauggc aggcaucuga agucccugag aaacuccuug
aaaacucgga augugaauag 3060ccugaaggaa gacgaagaac uagugaaagg acaaaaacua
auuaagaagg aauucauaga 3120aacuggaaag gugaaguucu ccaucuaccu ggaguaccua
caagcaauag gauuguuuuc 3180gauauucuuc aucauccuug cguuugugau gaauucugug
gcuuuuauug gauccaaccu 3240cuggcucagu gcuuggacca gugacucuaa aaucuucaau
agcaccgacu auccagcauc 3300ucagagggac augagaguug gagucuacgg agcucuggga
uuagcccaag guauauuugu 3360guucauagca cauuucugga gugccuuugg uuucguccau
gcaucaaaua ucuugcacaa 3420gcaacugcug aacaauaucc uucgagcacc uaugagauuu
uuugacacaa cacccacagg 3480ccggauugug aacagguuug ccggcgauau uuccacagug
gaugacaccc ugccucaguc 3540cuugcgcagc uggauuacau gcuuccuggg gauaaucagc
acccuuguca ugaucugcau 3600ggccacuccu gucuucacca ucaucgucau uccucuuggc
auuauuuaug uaucuguuca 3660gauguuuuau gugucuaccu cccgccagcu gaggcgucug
gacucuguca ccaggucccc 3720aaucuacucu cacuucagcg agaccguauc agguuugcca
guuauccgug ccuuugagca 3780ccagcagcga uuucugaaac acaaugaggu gaggauugac
accaaccaga aaugugucuu 3840uuccuggauc accuccaaca gguggcuugc aauucgccug
gagcugguug ggaaccugac 3900ugucuucuuu ucagccuuga ugaugguuau uuauagagau
acccuaagug gggacacugu 3960uggcuuuguu cuguccaaug cacucaauau cacacaaacc
cugaacuggc uggugaggau 4020gacaucagaa auagagacca acauuguggc uguugagcga
auaacugagu acacaaaagu 4080ggaaaaugag gcacccuggg ugacugauaa gaggccuccg
ccagauuggc ccagcaaagg 4140caagauccag uuuaacaacu accaagugcg guaccgaccu
gagcuggauc ugguccucag 4200agggaucacu ugugacaucg guagcaugga gaagauuggu
guggugggca ggacaggagc 4260uggaaaguca ucccucacaa acugccucuu cagaaucuua
gaggcugccg guggucagau 4320uaucauugau ggaguagaua uugcuuccau ugggcuccac
gaccuccgag agaagcugac 4380caucaucccc caggacccca uccuguucuc uggaagccug
aggaugaauc ucgacccuuu 4440caacaacuac ucagaugagg agauuuggaa ggccuuggag
cuggcucacc ucaagucuuu 4500uguggccagc cugcaacuug gguuauccca cgaagugaca
gaggcuggug gcaaccugag 4560cauaggccag aggcagcugc ugugccuggg cagggcucug
cuucggaaau ccaagauccu 4620gguccuggau gaggccacug cugcggugga ucuagagaca
gacaaccuca uucagacgac 4680cauccaaaac gaguucgccc acugcacagu gaucaccauc
gcccacaggc ugcacaccau 4740cauggacagu gacaagguaa ugguccuaga caacgggaag
auuauagagu gcggcagccc 4800ugaagaacug cuacaaaucc cuggacccuu uuacuuuaug
gcuaaggaag cuggcauuga 4860gaaugugaac agcacaaaau ucuagcagaa ggccccaugg
guuagaaaag gacuauaaga 4920auaauuucuu auuuaauuuu auuuuuuaua aaauacagaa
uacauacaaa aguguguaua 4980aaauguacgu uuuaaaaaag gauaagugaa cacccaugaa
ccuacuaccc agguuaagaa 5040aauaaauguc accagguacu ugagaaaccc cucgauuguc
uaccucgauc guacuuccuu 5100gcuacccacc ccucccaggg acaaccacug uccugaauuu
cacgauaauu auuccuuugc 5160cuuucauuuc uguuuuauca ccuuuguaug uaucuuuaaa
caacauauac ccuuuuuuac 5220uuauguaaau ggacugacuc auacugcaua caucuucuau
gacuugauuc uuuuguucaa 5280uauuauaucu gagauucauc cauggugaug caaauaggug
cauuauuuuu uuucacugcu 5340cuguagucug gcauuguaug aauacagcac aauguaucag
uuuuaauauu ggggaucauu 5400agcauuauuc ucagguuuuu aaaaauuaua agcaguacua
cuaugg 544621610RNAHomo sapiens 2acaguagcuc acaccuguaa
ucccagcacu uuggaaggcc gaggugggcg gaucaccuga 60gcucaggagu uugagaccag
ccugucucua cuaacaauau aaaaauuagc ugggagucac 120ggugggcgcc uguaauccca
gcuacucggg aggcugaggc aggagaauug cuugaaccca 180ggagacagag guuguaguga
gcugagaucg caccacugca cucuagccuu ggcaacagug 240caagacuguc ucaaaaacag
caacagagag caggacguga gacuucuacc ugcucacuca 300gaaucauuuc ugcaccaacc
auggccacgu uuguggagcu caguaccaaa gccaagaugc 360ccauuguggg ccugggcacu
uggaagucuc cucuuggcaa agugaaagaa gcagugaagg 420uggccauuga ugcaggauau
cggcacauug acugugccua ugucuaucag aaugaacaug 480aaguggggga agccauccaa
gagaagaucc aagagaaggc ugugaagcgg gaggaccugu 540ucaucgucag caaguugugg
cccacuuucu uugagagacc ccuugugagg aaagccuuug 600agaagacccu caaggaccug
aagcugagcu aucuggacgu cuaucuuauu cacuggccac 660agggauucaa gucuggggau
gaccuuuucc ccaaagauga uaaagguaau gccaucggug 720gaaaagcaac guucuuggau
gccugggagg ccauggagga gcugguggau gaggggcugg 780ugaaagcccu uggggucucc
aauuucagcc acuuccagau cgagaagcuc uugaacaaac 840cuggacugaa auauaaacca
gugacuaacc agguugagug ucacccauac cucacacagg 900agaaacugau ccaguacugc
cacuccaagg gcaucaccgu uacggccuac agcccccugg 960gcucuccgga uagaccuugg
gccaagccag aagacccuuc ccugcuggag gaucccaaga 1020uuaaggagau ugcugcaaag
cacaaaaaaa ccgcagccca gguucugauc cguuuccaua 1080uccagaggaa ugugauuguc
auccccaagu cugugacacc agcacgcauu guugagaaca 1140uucaggucuu ugacuuuaaa
uugagugaug aggagauggc aaccauacuc agcuucaaca 1200gaaacuggag ggccuguaac
guguugcaau ccucucauuu ggaagacuau cccuucaaug 1260cagaauauug agguugaauc
uccuggugag auuauacagg agauucucuu ucuucgcuga 1320agugugacua ccuccacuca
ugucccauuu uagccaagcu uauuuaagau cacagugaac 1380uuaguccugu uauagacgag
aaucgaggug cuguuuuaga cauuuauuuc uguauguuca 1440acuaggauca gaauaucaca
gaaaagcaug gcuugaauaa ggaaaugaca auuuuuucca 1500cuuaucugau cagaacaaau
guuuauuaag caucagaaac ucugccaaca cugaggaugu 1560aaagaucaau aaaaaaaaua
auaaucauaa ccaacaaaaa aaaaaaaaaa 161031625RNAHomo sapiens
3uccgggcuuc cccagacaga cagcuggcuu acagggccac cugaagacgu uccagggcuc
60cacaggccac ugucuucugg aggggaacgg aucgacugcc ggugcgccca gccaaauuca
120acuccugagu ccucagucuc uagucccggg aagguuucac cgagcugccc uacuccuugu
180accccuucua gcuggccuua gcauagcuac gucagcagcu auuggcacga cugcccugau
240ucaaggagaa acuggacuaa uaucacuauc ucaacaggau cgaggccauc aagcuacaga
300uggucuuaca aauggaaccc caagugaacu caacuaacaa cuuccaccaa ggaccccugg
360accaacccgu uggcccuuug acuggccuaa agaguucccu ucugaaggac acuacaagug
420cagggccccu ucuucgcccc uauccagcau cucuucucgg caaagugaaa gaagcgguga
480agguggccau ugaugcagaa uaucgccaca uugacugugc cuauuucuau gagaaucaac
540augagguggg agaagccauc caagagaaga uccaagagaa ggcugugaug cgggaggacc
600uguucaucgu cagcaaggug uggcccacuu ucuuugagag accccuugug aggaaagccu
660uugagaagac ccucaaggac cugaagcuga gcuaucugga cgucuaucuu auucacuggc
720cacagggauu caagacuggg gaugacuuuu uccccaaaga ugauaaaggu aauaugauca
780guggaaaagg aacguucuug gaugccuggg aggccaugga ggagcuggug gacgaggggc
840uggugaaagc ccuugggguc ucaaauuuca accacuucca gaucgagagg cucuugaaca
900aaccuggacu gaaauauaaa ccagugacua accagguuga gugucaccca uaccucacgc
960aggagaaacu gauccaguac ugccacucca agggcaucac cguuacggcc uacagccccc
1020ugggcucucc ggauagaccu ugggccaaac cugaggaccc uucccugcug gaggauccca
1080agauuaagga gauugcugca aagcacaaaa aaaccacagc ccagguucug auccguuucc
1140auauccagag gaaugugaca gugaucccca agucuaugac accagcacac auuguugaga
1200acauucaggu cuuugacuuu aaauugagug augaggagau ggcaaccaua cucagcuuca
1260acagaaacug gagggccuuu gacuucaagg aauucucuca uuuggaggac uuucccuucg
1320augcagaaua uugagguuga aucuccuggu gagauuacac aggggauucu cuuucuucgc
1380ugaaguguga cugucuccac ucaagaacua uuuuagccaa gcuuaucuga gaucacagug
1440aacuuugucc uguuguagac cagaauggag gugcuguuuu agacauguau uucuguaugu
1500ucaacuagga uaagaauauc acagaaaagc auggccugaa uaagcaaaug acaauuuuuu
1560ccacuuaucu gaucugauca aaugucuguu aagcaccaga aacucugcca acacugagga
1620uguaa
162541098RNAHomo sapiens 4guaagaaacg guugaacugg augcaauuuu uaucacagcu
uguguaagac ugccucuguc 60ccuccucuca caugccauug guuaaccagc agacagugug
cucaggggcg uugccagcuc 120auugcucuua uagccuguga gggaggaaga aacauuugcu
aaccaggcca gugacagaaa 180uggauucgaa auaccagugu gugaagcuga augaugguca
cuucaugccu guccugggau 240uuggcaccua ugcgccugca gagguuccua aaaguaaagc
ucuagaggcc gucaaauugg 300caauagaagc cggguuccac cauauugauu cugcacaugu
uuacaauaau gaggagcagg 360uuggacuggc cauccgaagc aagauugcag auggcagugu
gaagagagaa gacauauucu 420acacuucaaa gcuuuggagc aauucccauc gaccagaguu
gguccgacca gccuuggaaa 480ggucacugaa aaaucuucaa uuggacuaug uugaccucua
ucuuauucau uuuccagugu 540cuguaaagga ggacauaggg auuuuaacau ggaagaagag
cccuaaacau aacuccuaau 600uccuuucuau ggaacagaaa gcaauuuuga auccauacuu
ccgugauugc augucuacaa 660gaaaagagag ugcagaaucc ucaaagccuc ugccucaaaa
acuugaggaa augacaauca 720ucuccuugaa ggcacaaggu cuuauuuaug auuccugauu
ucaccucuug ggauguucac 780agacacagag uuucaugaag cugugguguc cagaaaaccu
gcugcacaua gggugcacaa 840ugaguuucca ucuucuugcc ucuuuucaag gggcaagaac
ucaguccggg aaugucuuaa 900acuacaaacc uucaugggaa accuuguugc uucugcuucc
ucucuuuuca cacuggaggu 960uuuauuuuug cuuagccaug aauucuugug ucauucauaa
cuuuugucuu aagguacuga 1020aaacuaguca ggcuaguuaa ugcaaaaggg uauauuagau
augauaaugg gaaaucaaag 1080ccagggcuac auuaagaa
109851064RNAHomo sapiens 5gcccauuguu uuuguaaucu
cugaggagaa gcagcagcaa acauuugcua gucagacaag 60ugacagggaa uggauuccaa
acaccagugu guaaagcuaa augauggcca cuucaugccu 120guauugggau uuggcaccua
ugcaccucca gagguuccga gaaguaaagc uuuggagguc 180acaaaauuag caauagaagc
uggguuccgc cauauagauu cugcucauuu auacaauaau 240gaggagcagg uuggacuggc
cauccgaagc aagauugcag auggcagugu gaagagagaa 300gacauauucu acacuucaaa
gcuuuggucc acuuuucauc gaccagaguu gguccgacca 360gccuuggaaa acucacugaa
gaaagcucaa uuggacuaug uugaccucua ucuuauucau 420ucuccaaugu cucuaaaggu
augcaguuug uaugagcaua aaauugcgcu ucugcuguca 480uuauaaacau uguuuaucug
gauaguugaa cagagcuuuu uauuaggagg auguagggau 540uaucacacag aagaagaacc
guaaguggaa caccuaauuu ccuuucuuuc gaguaaauuu 600ugaauccuac uucucuaaug
cacaccuaca agagaagaga guacagcaac cucaaagccu 660cuuccucaaa aacuugaaau
uacaauaguc ucuuucaagg cacugucuua guuguggcuu 720uugaguccau cucuugggau
guucccagac acagaguuuc augcaguugu ggugcccaau 780aaaacugcug cacaugugau
gcacaaugag uuuccaccau cucuccccau uucaagcuga 840agcagauuug guggaagcca
cuaugcaugg uucuuaaauu agaaacccuu aauguggacu 900ugcaaagcuu uauuauucug
cugccucuuc uuucacaaua gaguuugaag cuguauuuag 960ccaggaauua cuguguagug
uauaacuuuu gauuuaaagu uacagaaaac uacucaggcu 1020aguuaaugca aaagaguuua
cugaguuaug uaaaauggga aguc 106461192RNAHomo sapiens
6acaggaucug cuuagugaaa gaaguggcaa gcaauggauc ccaaauauca gcguguagag
60cuaaaugaug gucacuucau gcccguauug ggauuuggca ccuaugcacc uccagagguu
120ccgaggaaca gagcuguaga ggucaccaaa uuagcaauag aagcuggcuu ccgccauauu
180gauucugcuu auuuauacaa uaaugaggag cagguuggac uggccauccg aagcaagauu
240gcagauggca gugugaagag agaagacaua uucuacacuu caaagcuuug gugcacuuuc
300uuucaaccac agauggucca accagccuug gaaagcucac ugaaaaaacu ucaacuggac
360uauguugacc ucuaucuucu ucauuuccca auggcucuca agccagguga gacgccacua
420ccaaaagaug aaaauggaaa aguaauauuc gacacagugg aucucucugc cacaugggag
480gucauggaga aguguaagga ugcaggauug gccaagucca ucgggguguc aaacuucaac
540ugcaggcagc uggagaugau ccucaacaag ccaggacuca aguacaagcc ugucugcaac
600cagguagaau gucauccuua ccucaaccag agcaaacugc uggauuucug caagucaaaa
660gacauuguuc ugguugccca cagugcucug ggaacccaac gacauaaacu auggguggac
720ccaaacuccc caguucuuuu ggaggaccca guucuuugug ccuuagcaaa gaaacacaaa
780cgaaccccag cccugauugc ccugcgcuac cagcugcagc gugggguugu gguccuggcc
840aagagcuaca augagcagcg gaucagagag aacauccagg uuuuugaauu ccaguugaca
900ucagaggaua ugaaaguucu agauggucua aacagaaauu aucgauaugu ugucauggau
960uuucuuaugg accauccuga uuauccauuu ucagaugaau auuagcauag aggguguugc
1020acgacaucua gcagaaggcc cugugugugg auggugaugc agaggauguc ucuaugcugg
1080ugacuggaca cacggccucu gguuaaaucc cuccccuccu gcuuggcaac uucagcuagc
1140uagauauauc caugguccag aaagcaaaca uaauaaauuu uuaucuugaa gu
119271425RNAHomo sapiens 7ggccccgccu ccuugagugg ugcggagcuu ugugaugcgg
agcuucguga ugcacgcccc 60gaugccugcg gggcuauaaa aacgcucgca agcgccaagu
cuccucagga gccgccggca 120agggggcaac gaggaagcuc uuaagagcgc ggccggaaag
caguugaguu acagacaucc 180ugccaaaaug auuucuucaa agcccagacu ugucguaccc
uauggccuca agacucugcu 240cgagggaauu agcagagcug uucucaaaac caacccauca
aacaucaacc aguuugcagc 300agcuuauuuu caagaacuua cuauguauag agggaauacu
acuauggaua uaaaagaucu 360gguuaaacaa uuucaucaga uuaaaguaga gaaaugguca
gaaggaacga caccacagaa 420gaaauuagaa uguuuaaaag aaccaggaaa aacaucugua
gaaucuaaag uaccuaccca 480gauggaaaaa ucuacagaca cagacgagga caauguaacc
agaacagaau auagugacaa 540aaccacccag uuuccaucag uuuaugcugu gccaggcacu
gagcaaacgg aagcaguugg 600uggucuuucu uccaaaccag ccaccccuaa gacuacuacc
ccacccucau caccaccucc 660aacagcuguc ucaccagagu uugccuacgu cccagcugac
ccagcucagc uugcugcuca 720gauguuagca auggcaacaa gugaacgagg acaaccacca
ccauguucua acauguggac 780ccuuuauugu cuaacugaua agaaucaaca aggucaccca
ucaccgccac cugcaccugg 840gccuuuuccc caagcaaccc ucuauuuacc uaauccuaag
gauccacagu uucagcagca 900uccaccaaaa gucacuuuuc caacuuaugu gaugggcgac
accaagaaga ccagugcccc 960accuuuuauc uuaguaggcu caaauguuca ggaagcacag
ggauggaaac cucuuccugg 1020acaugcuguc guuucacagu cagaugucuu gagauauguu
gcaaugcaag ugcccauugc 1080uguuccugca gaugagaaau accagaaaca uacccuaagu
ccccagaaug cuaauccucc 1140aaguggacaa gaugucccca ggccaaaaag cccuguuuuc
cuuucuguug cuuucccagu 1200agaagaugua gcuaaaaaaa guucaggauc uggugacaaa
ugugcucccu uuggaaguua 1260cgguauugcu ggggagguaa ccgugacuac ugcucacaaa
cgucgcaaag cagaaacuga 1320aaacugaucc agaaaugacg cugucugggu caacauuuca
gggaggaguc ugccaccagu 1380guaauguauc aauaaacuuc augcaagcau aaaaaaaaaa
aaaaa 142581765RNAHomo sapiens 8aucugccucc agcacugccc
auccuugccc cuuuccacug uccuuggagc uuccugggcc 60cuucccuggg ccucaggauc
ccacccucca ucccgucugc ccugcaggau gccgcagcug 120agccuguccu ggcugggccu
cgggcccgug gcagcauccc cguggcugcu ucugcugcug 180guugggggcu ccuggcuccu
ggcccgcguc cuggccugga ccuacaccuu cuaugacaac 240ugccgccguc uccaguguuu
uccucaaccc ccgaaacaga acugguuuug gggacaccag 300ggccugguca cucccacgga
agagggcaug aagacauuga cccagcuggu gaccacauau 360ccccagggcu uuaaguugug
gcuggguccu accuuccccc uccucauuuu augccacccu 420gacaucaucc ggccuaucac
cagugccuca gcugcugucg cacccaagga uaugauuuuc 480uauggcuucc ugaagcccug
gcugggggau gggcuccugc ugaguggugg ugacaagugg 540agccgccacc gucggauguu
gacgccugcc uuccauuuca acaucuugaa gccuuauaug 600aagauuuuca acaagagugu
gaacaucaug cacgacaagu ggcagcgccu ggccucagag 660ggcagcgcca gacuggacau
guuugaacac aucagccuca ugaccuugga cagucugcag 720aaaugugucu ucagcuuuga
aagcaauugu caggagaagc ccagugaaua uauugccgcc 780aucuuggagc ucagugccuu
uguagaaaag agaaaccagc agauucucuu gcacacggac 840uuccuguauu aucucacucc
ugaugggcag cgcuuccgca gggccugcca ccuggugcac 900gacuucacag augccgucau
ccaggagcgg cgccgcaccc uccccacuca ggguauugau 960gauuuccuca agaacaaggc
aaaguccaag acuuuagacu ucauugaugu gcuucugcug 1020agcaaggaug aagaugggaa
ggaauugucu gaugaggaca uaagagcaga agcugacacc 1080uucauguuug agggccauga
cacuacagcc aguggucucu ccuggguccu auaccaccuu 1140gcaaagcacc cagaauacca
ggaacagugc cggcaagaag ugcaagagcu ucugaaggac 1200cgugaaccua uagagauuga
augggacgac cuggcccagc ugcccuuccu gaccaugugc 1260auuaaggaga gccugcgguu
gcauccccca gucccgguca ucucccgaug uugcacgcag 1320gacuuugugc ucccagacgg
ccgcgucauc cccaaaggca uugucugccu uaucaauauu 1380aucgggaucc auuacaaccc
aacugugugg ccagacccug aggucuacga ccccuuccgu 1440uucaaccaag agaacaucaa
ggagagguca ccucuggcuu uuauucccuu cucggcaggg 1500cccagaaacu gcaucgggca
ggcguucgcc auggcugaga ugaagguggu ccuggcgcuc 1560acguugcugc acuuccgcau
ccugccgacc cacauugaac cccgcaggaa acccgagcug 1620auauugcgcg cagagggugg
acuuuggcug cggguggagc cccugggugc gaacucacag 1680ugacugcccu acccacccac
ccaccuuugu agagucccag aaacaaaacu augcugacaa 1740aaaaaaaaaa aaaaaaaaaa
aaaaa 176597277RNAHomo sapiens
9aggucagggg gcuggggacg cgcgugggga ucgcuacccg gcucggccac ugcugggcgg
60acaccugggc gcgccgccgc gggaggagcc cggacucggg ccgaggcugc ccaggcaaug
120cguucacucg gcgcaaacau ggcugcggcc cugcgcgccg cgggcguccu gcuccgcgau
180ccgcuggcau ccagcagcug gagggucugu cagccaugga gguggaaguc aggugcagcu
240gcagcggccg ucaccacaga aacagcccag caugcccagg gugcaaaacc ucaaguucaa
300ccgcagaaga ggaagccgaa aacuggaaua uuaaugcuaa acaugggagg cccugaaacu
360cuuggagaug uucacgacuu ccuucugaga cucuucuugg accgagaccu caugacacuu
420ccuauucaga auaagcuggc accauucauc gccaaacgcc gaacccccaa gauucaagag
480caguaccgca ggauuggagg cggauccccc aucaagauau ggacuuccaa gcagggagag
540ggcaugguga agcugcugga ugaauugucc cccaacacag ccccucacaa auacuauauu
600ggauuucggu acguccaucc uuuaacagaa gaagcaauug aagagaugga gagagauggc
660cuagaaaggg cuauugcuuu cacacaguau ccacaguaca gcugcuccac cacaggcagc
720agcuuaaaug ccauuuacag auacuauaau caagugggac ggaagcccac gaugaagugg
780agcacuauug acagguggcc cacacaucac cuccucaucc agugcuuugc agaucauauu
840cuaaaggaac uggaccauuu uccacuugag aagagaagcg agguggucau ucuguuuucu
900gcucacucac ugcccauguc uguggucaac agaggcgacc cauauccuca ggagguaagc
960gccacugucc aaaaagucau ggaaaggcug gaguacugca accccuaccg acuggugugg
1020caauccaagg uugguccaau gcccugguug gguccucaaa cagacgaauc uaucaaaggg
1080cuuugugaga gggggaggaa gaauauccuc uugguuccga uagcauuuac cagugaccau
1140auugaaacgc uguaugagcu ggacaucgag uacucucaag uuuuagccaa ggagugugga
1200guugaaaaca ucagaagagc ugagucucuu aauggaaauc cauuguucuc uaaggcccug
1260gccgacuugg ugcauucaca cauccaguca aacgagcugu guuccaagca gcugacccug
1320agcuguccgc ucugugucaa uccugucugc agggagacua aauccuucuu caccagccag
1380cagcugugac ccccgccggu ggaccccgug gcguuaggca aaugcccaac cuccagauac
1440cuccgaugug gagagggugu uauuuagaga ucaaggaagg aagucauccu uccuugauau
1500auauacagcc uuuggguaca aauugugugg uuucuugagg auuggacucu ugauggauuu
1560cuauuuuuau auaacuauac aguaagcauu uguauuuucu cucucuaggu auaaguuacu
1620aguuuggaau guccaucagg accuuuaaua aaugaguuaa aaauuugucu uaugagacac
1680accuauuuaa guacagauuu uggcuuuauu gcccaaaacc cuccugaaag gguacggaga
1740guccccucug ugggcuggca gugugaauga gaucuguuua gucucgugca uauaguugcu
1800guuuuuuaaa ugaacacagu ugaguauuug aagugaauuu gaaaaagaaa uguuacuuaa
1860ucuuucccua agcccauggg uuacagaaug cuagggaggc aauuugguua ccugcaaugg
1920cugcuuuugc cagcgaggcc accauucauu ggucaucuug guauuugugu gugaaucuca
1980cuuuccucaa uguaaaaagg aaucaaguau ggauuucaga ggugcucuua gauuccccau
2040acacccaagg guaauaaacg uguacaagua caguguucau gauacgugcc uuggugggag
2100uccguggugc cacagggaag gggcucccac ugcuucuggu cuccagggac agugcugcug
2160gaaaggcuag ugaugagcuu cacccuggag cuccucccgg gaccuugcaa gccucuccau
2220ccagcaucuu cucuaucuua guugaaugcc uucuuucuga acauuuguuu uaagaauuau
2280uuuauaaagu caacaauacu uugcuugaau ucuuucuuaa uuuacgauuu uuuauuauaa
2340aaauguauag ugauacaaug ggacauguga agaauacaga aaaguaacca cuuuaaugca
2400auaacuguua ucauaauauu guauuucgug guaguccuuu ccuguagaua uuuuuaaugc
2460cauuuaaugc cauugucacc uuggauuuau gagugaaaag uguuucuaaa aauauagaaa
2520uaaugucaga ucagagucug aucuucuaug uuuguauuua aauggauuaa aagauccccg
2580gugguuccau gaagaauuug uaaagaucac uuucucuuuc cuccaagccc ugaaacuuug
2640uucuucaaaa gagcguuucu uuuuuuuuuu uuuuuuagcc aguuuauaaa guggaaguau
2700uaggagauuc auaaaucuuc uauauugaga auuggcuaug uuaauaaaua uuacaacauc
2760auuaagguuu uagcuaaguu ugauucaugc ugucuguuaa aucaaaacug aucuaaauca
2820gaauuauuaa augugaggag cuuuuuuaau acaggaaaag aaacauguca uccacuugag
2880uuaauaguuu uccuacguug augacagccc ucaugaguag cauccacauu uuuaaaauuu
2940caaauugguu uuucuacuag uagauugugu uucuagagaa agauacaagg cauaggugau
3000uguuuaggau uuuccucuag ccuuugccau uaccuuuuug gggaugaggu ucacaguaga
3060cuuugaguga ccgucccacc gugaagugaa uucucugagc ugguggugug gugcuggaag
3120gaagguuauu uuuggagcca cucucucccc uuaaggauau uucccaaggg ccugcuucaa
3180uucuuugaug acuuuagagg ugaaaaaaua uuuuuaugga gaugaugcag aaaacuccaa
3240uucaggagcc cuugcgagua uaucugaagc acuuauuugc uaaggaaacc ugaauugaua
3300gcaguacugu gcugucugga auaauguccu ugauacugag uugggaccag acuggcuuuu
3360auagugacag gcaaagagga auuuauugag aucacugcuc auggcauuug uugcuguaag
3420aaguguugcc uuugauuguu acuaaccacg gauggguaac ggucauacau uaggcuagug
3480uuugguagga caaaaucuuu uuagagcuuu gagaauuguc auccuguugg ucaacuuuga
3540aauacaaaug uuugcccugg uaauuagcaa ugaacugcug gcaguuucuu cagcugugua
3600uauacggauc uggcuuuuaa uugaugaauc aacuucuaca gaaacuuuug cagggacagu
3660guugaugagg caguuuagcu ugccagggug augauaaagc ccaggucccu gcauguauag
3720ugcucuucua aagaauaugc auucuugaac uacuuaacuu uuuaaaaauc acaauaaauu
3780uuugcacuca aaauuugcuu cguaucagga gaaaugaacu cauuguuuug uuuuguuuuu
3840uuuuuuuuuu aagauggagu cuugcuaugu cacccaggcu ggagggcagu ggugcgaucu
3900cggcucacug cuacuuccac cuccugggcu caagugaucc ucccaccuca gccuccaagu
3960agcugggacu acaggagugc uucaccacgc ugggcuacuu uuuuauauuu uuuguagaga
4020ugggguuuug ccauguuguc caggcugguc uugaacuccu gggcucaagg gauucuccug
4080ccucagucuc ccaaagugcu gggauuacaa ggaugagccu cugcaccugg cccugaacuc
4140auuauuaaaa gcccuuuaaa ugugaggcug ggugccgugc cuuacaugug uaauuccaau
4200acuuuggaag gccaagguug gaggauugcu ugaucccaag aguucaagac cagccugggc
4260aacauaggga gacccugacu cuacaaaaaa uaaaguaaaa auuaacuggg uguaguguca
4320caugccugua guuccagcua cuuaggaggc ugagguggua ggauugcuug agcccagcag
4380uuugagguug cagugaggug ugauugcacc acugcacucc agccugggug acagaggaag
4440acccuguccc aaaaccaaaa aaaagaaaag aaauacagag acugggucau uuacaaagga
4500aagagguuua auugacucgg uucggcuuuc ugaggaagcc uuaggaaauu gacaaucaug
4560gcagaagggg aagcagaugu cuuacauggc agugagugag agcaagcaaa ggggaagagc
4620ccccuuauaa aaccaucaga ucucgugaga acuggcuguc acaagaacag caugggggaa
4680cugucuccau guuccaaucu ccuuccacca ggucccuccc ucaacacgug gggauuaugg
4740ggauuacaau uugaaaugag auuugggugg ggaacagagc caaaucauau cauuccaccc
4800uggccccucc caaaucacau guccuuuuua cauuucaaaa ccaaucaugc cuucacaaca
4860guccuccaga gucuuaacuc auuccagcau uaacccaaaa guccaaguuc aaagucucau
4920ccaagacaag gcaagucccu ucugccugug agccuguaac auuaaaagca aguuagugac
4980uuccaagaua caaugggagu acagacauug guaaauguuc ccauuccaaa ugggagaaau
5040uggccaaaac acaggggcua caggccccau gcaccacugc acuccacugu gcaagucuga
5100aacccggcag ggcacuccuu aaauuuuuuu uuuuuuuuuu ugagauggag ucucgcucug
5160uugcccaagc uggaguacag uggcacgauc ucggcucacu gcaaccuccg ccucuugggu
5220ucaaaggauc auccugccuc agccuccgga guagcugggc uacucaggcg ugugccacca
5280ugcccggcua auuuuuguau uuuuaguaga gauggggccu gaccauguug gucaggcugg
5340ucucuaauuc cugaccucgu gauccacccg ccucagccuc ugaaaguguu gggauuacag
5400gcgugagcca ccauccccgg ccuacucaau aaaucuuaaa guuccggaau aaucuccuuu
5460gacuccaugu cucaccucca ggucacgcug augcaagagg ugggcuaauc uuucuaguaa
5520auuccauauu uaauucaaga aaccauaacu uaaggcaugu aaaagagauc cuuugcucaa
5580ugugaugcca uugugcuuau ccaaaguaua uuauuauuac ccacaaaggg ugagagauua
5640ggcugcagcc auaccccaag uggagugagc agcaagaccu gcccccugcu cagaguguag
5700augacugggg gcaccugcau uccuaggggc ucugccguau gagcuccugu cgaugcggca
5760aaggaccacc uugcccaacg acagcgggaa ggcagaauuu aaagcuggca gcuguaagcg
5820aacgucuaug ugugcgcacg ggggcacgug aaggcacagg ugcaucagcc aagaaccucc
5880aauucaccuc uuaaccuucu caccucaccu gaaaccccuu cugccagaau ccugaaggug
5940gcccaggaac agggcuccua acguuaggug gaaaugggaa auucauugag augucacaag
6000cuggaauaag aaaauucuga gcucacccgg aaacuaaugc ccuaaauuaa gauuauucag
6060cuucucaauu uuuaauagca aaauggagac cugagugugg auaacuuuua guaucugugg
6120gggauccugg aaccaauucc cugccaauau agaaggacaa cugucuacag uacuugaagu
6180auuauuaacu acauucgcca ugcuguaugu uagaucccca gaacauauuu auccugcaua
6240ucuaaaauuu ugaucauuuu acaaacuuuc uauuuuuuuu gucaauuuuc uccagcuaga
6300cacuugugca auacggcuau uaucugaucu uugccuuaaa uguugugcuu cuuuuccaua
6360ugcacguauu uugcaaaaua uaaagugugu agagcuauau agcacucagc caaguggugg
6420guaccugcag gugcuucaga gaaguaaauu gaugcugcua auauuuguug aauggcacga
6480auaugaugag caauagcagg uggugcccuu cagccagacc aucgcuccgu gcgucugaug
6540caucuugcca aagaguaguu cugggaggug guugccucua gagaacacau uccuccuauu
6600cugggguccc gugagagaaa gaaaugcuuu ugcuuuugau gugggacucu uacuaagccu
6660uucuucagag aaaaggaagu gaaaaaugca ccccaugaua aucaguuucu uacaacauac
6720ugugauagua ccggcuucgu uguuuuuagc uggaaucauu agcuuccauu uuuagaauaa
6780cagcuauugg cuaaauuagg cuacaguagg ccauuaagau ggauguugga auuaaaaaca
6840uuuuuggaaa aaagccugcu uugagccuuu guuauaagcc cuuggguaga gaucuggguc
6900cuguuucuga uuucuuguga gccuucacuc ugacaguuuu guuuccagaa acacacucuu
6960agccugcucc ugaaauggga acagacaggc caacuucccc ucuccagucu ccccugcggg
7020ucaaagcuuu acuuuccugu cauguuaaga aagaauagau uuaaccuuga uaauccaugu
7080aguauucugu auuuuuaccu uuuccuuauc ugaaaaaaag uguauauaug gcauggaauu
7140gauugcacag gcacauggca uguuggcuug ugaaccaauu guuaaaauuu caaguuaauc
7200auuaaaauaa uaucuuucaa auuaaguuau auuaaaaaca aagguaacau ucuaaauuca
7260aaaaaaaaaa aaaaaaa
727710889RNAHomo sapiens 10gcaguucggc ggucccgcgg gucugucucu ugcuucaaca
guguuuggac ggaacagauc 60cggggacucu cuuccagccu ccgaccgccc uccgauuucc
ucuccgcuug caaccuccgg 120gaccaucuuc ucggccaucu ccugcuucug ggaccugcca
gcaccguuuu ugugguuagc 180uccuucuugc caaccaacca ugagcuccca gauucgucag
aauuauucca ccgacgugga 240ggcagccguc aacagccugg ucaauuugua ccugcaggcc
uccuacaccu accucucucu 300gggcuucuau uucgaccgcg augauguggc ucuggaaggc
gugagccacu ucuuccgcga 360auuggccgag gagaagcgcg agggcuacga gcgucuccug
aagaugcaaa accagcgugg 420cggccgcgcu cucuuccagg acaucaagaa gccagcugaa
gaugaguggg guaaaacccc 480agacgccaug aaagcugcca uggcccugga gaaaaagcug
aaccaggccc uuuuggaucu 540ucaugcccug gguucugccc gcacggaccc ccaucucugu
gacuuccugg agacucacuu 600ccuagaugag gaagugaagc uuaucaagaa gaugggugac
caccugacca accuccacag 660gcuggguggc ccggaggcug ggcugggcga guaucucuuc
gaaaggcuca cucucaagca 720cgacuaagag ccuucugagc ccagcgacuu cugaagggcc
ccuugcaaag uaauagggcu 780ucugccuaag ccucucccuc cagccaauag gcagcuuucu
uaacuauccu aacaagccuu 840ggaccaaaug gaaauaaagc uuuuugaugc aaaaaaaaaa
aaaaaaaaa 889115000RNAHomo sapiens 11acccgucgcc acgcccgccg
caggccaagg gccagucacu ugcgggccgg cgucccgcag 60cccauucgcg ccccgccccu
gccccgccgc gggaugagua acgguuacga agcacuuucu 120cggcuacgau uucugcuuag
ucauugucuu ccaggaaaca gcucccucag uuuggaauca 180gcucucccgc ugcggccgca
guagccggag ccggagccgc agccaccggu gccuuccuuu 240cccgccgccg cccagccgcc
guccggccuc ccucgggccc gagcgcagac caggcuccag 300ccgcgcggcg ccggcagccu
cgcgcucccu cucgggucuc ucucgggccu cgggcaccgc 360guccuguggg gcggccgccu
gccugcccgc ccgcccgcag ccccuucgcu gcgcggcccc 420ugggcggccg cugccauggg
caccgacagc cgcgcggcca aggcgcuccu ggcgcgggcc 480cgcacccugc accugcagac
ggggaaccug cugaacuggg gccgccugcg gaagaagugc 540ccguccacgc acagcgagga
ggaguuucca gaugucuugg aaugcacugu aucucaugca 600guagaaaaga uaaauccuga
ugaaagagaa gaaaugaaag uuucugcaaa acuguucauu 660guagaaucaa acucuucauc
aucaacuaga agugcaguug acauggccug uucaguccuu 720ggaguugcac agcuggauuc
ugugaucauu gcuucaccuc cuauugaaga uggaguuaau 780cuuuccuugg agcauuuaca
gccuuacugg gaggaauuag aaaacuuagu ucagagcaaa 840aagauuguug ccauagguac
cucugaucua gacaaaacac aguuggaaca gcuguaucag 900ugggcacagg uaaaaccaaa
uaguaaccaa guuaaucuug ccuccugcug ugugaugcca 960ccagauuuga cugcauuugc
uaaacaauuu gacauacagc uguugacuca caaugaucca 1020aaagaacugc uuucugaagc
aaguuuccaa gaagcucuuc aggaaagcau uccugacauu 1080caagcgcacg agugggugcc
gcuguggcua cugcgguauu cggucauugu gaaaaguaga 1140ggaauuauca aaucaaaagg
cuacauuuua caagcuaaaa gaagggguuc uuaacugacu 1200uaggagcaua acuuaccugu
aauuuccuuc aauaugagag aaaauugaga uguguaaaaa 1260ucuaguuacu gccuguaaau
ggugucauug aggcagauau ucuuucguca uauuugacag 1320uauguugucu gucaaguuuu
aaauacuuau cuugccucca uaucaaucca uucucaugaa 1380ccucuguauu gcuuuccuua
aacuauuguu uucuaauuga aauugucuau aaagaaaaua 1440cuugcaauau auuuuuccuu
uauuuuuaug acuaauauaa aucaagaaaa uuuguuguua 1500gauauauuuu ggccuaggua
ucaggguaau guauauacau auuuuuuauu uccaaaaaaa 1560auucauuaau ugcuucuuaa
cucuuauuau aaccaagcaa uuuaauuaca auuguuaaaa 1620cugaaauacu ggaagaagau
auuuuuccug ucauugauga gauauaucag aguaacugga 1680guagcuggga uuuacuagua
guguaaauaa aauucacucu ucaauacaug aauggaaacu 1740uaaauuuuuu uuuauguguc
cuugcuuaua guuuagcugu aauaauuuaa ccuuguauuc 1800uugugccaua uucugucuuu
uuauuacuua uaaagacaaa ccaaaguaaa ucugaaagga 1860gacuagaagc uuugaaauua
uuguuugggg guuuuauaaa agcaacuacu gucaccucca 1920uccagauucu uuuaaauuau
ugauccaucc auaguauaua uugcuacuca uucaagaauc 1980cucaauaagu auugaguauu
uaccauaugu ugggauacug ugggcucugg agagaggagg 2040gggcaauaga gcuaggaauu
aagaaucagu ugaguaaaau guguaauauu uauuccccau 2100uaauaacuga cuaggaagga
cuaaaagcca gaaaggggau gaaaaaaaaa uccuuaauuc 2160agggccgaca uuaucuacuu
aaacaacuuu gagauauggu cuuaauuauu uuaaagcaga 2220auaauauaau ugaaaguuua
uagcuaaaag agacuauaua ggucauuuag uauaauucuu 2280cauuaguuua cgaaccacaa
aauugcaaau aaauaagcua ugaacuuuga uguacacuau 2340aaaucuccuu aauucuauaa
auuugugucu guaaccugaa uaguuugaaa acuucuuuaa 2400aaaucucuug uauuucaucc
gggcgcagug gcucacaccu guaaucccag cacuuuggga 2460ggccgaggug ggcagaucac
gaggucagga guuugagacc agccugacca acaugguaaa 2520accccaucuc uacuaaaaua
caaaaauugg cugggcgugg uggcacucgc cuguaaucuc 2580agcuacuugg gaggcugagg
caggagaauc gcuugaaccc gggaggcgga gguuacagug 2640agccgagauc acaucacugc
acuccagccu gggcgacaga gcgagacucc aucucaaaaa 2700aaaaaaaaaa cucuuguauc
ucaauauuuu uaaaccacag gccuaaauaa aacuaauuuu 2760gcucaaguuu ucucaaccua
gggaaaaaga acuaugguuc cauauucaaa auaaauauua 2820uagacccuuu uccuaaguag
gauuuugugg uuuacugauu ggguaauuug aucauuaaaa 2880uuaugugaaa ucugcccggg
cacaccucau gccuguaauc ccagcacucu gggaggccaa 2940ggcagaugau caccugaggu
caggaguucu agaccagccu ggcuaacaug gugaaacccu 3000guaucugcua aaaauacaaa
aauuagccag gcgugguggc gggcuccugu aaucccagcu 3060acuuuggagg cgaggcacga
gaaucgcuug aaccugggag gcggaguuug cagugagccg 3120agaucacgcc auugcacucc
agccugggcg acagagcgag acugcgucuc aaaaaaaaaa 3180aaaaaagaaa aauuauguga
aaucauguga uuugccuggg aaaacuuguu uagauauuga 3240gcuacuuaug ccuucuagcc
uuuauauuaa uuguauguaa uguuauuaaa uauauauaua 3300guucaucuuu acauuuggaa
augcccaaca uuuuuuucau auaaguccuu aaacaagcgu 3360ucauuuuauu uuaaaucuau
acagugaacu ggccaagaua uuuuaagagg gaacuuuaau 3420aucccauuua uuguuuuuau
aacccuggac uuauaaaaau ggguuguuug aaggguuauu 3480uugaaagugg gggaaaaaaa
aacuuaguug cuaauguauc uaaacuucag cagagcuuuu 3540uggugaucuc cuaccugcac
ccucaacucu ugacaaagaa gcaagacuau agauucauuu 3600ucugaagggg aucauguaug
gaauuuuuug augaguuuuu acuuuuaccu cucuacucuu 3660gauuuucuau uauugaauac
ucuuuuaaaa cacugauuuu uaaggcuuua uauauguuuu 3720ccaggcugau guucacaucu
uuuuuucaug aacuaucaga auauagugaa cacuuuucaa 3780auauuuaagg acuuaauguu
uaaaaagcca uaaaauagag agugguaaua cuaccaaaua 3840auuacuuaaa acugaaagcu
aaguuaucaa uaguuuauau aagagauguu uucugaggag 3900augugcaucc agugagacca
agguagaaag uuuauauaau uguuuuuuuu ccaguaaaua 3960ugaaaaaaaa agcuguagcu
uguuuauuac auguccaaaa uacaguggag ccuuacuuua 4020acacaaugua cuguaacuug
gaauuuguuc uguuaugagu cuaucuugaa uucccaucca 4080ugaaacugua gucaccaaaa
gcaacaagua uuuucacaug auguaaaaga ccauacuaug 4140auggccauug cuagaaauug
aaucacaaau aauagcuaau aauuuuucau uuuucaaaaa 4200agaucauuug gauagcagcu
auguauaaaa uggaaaauaa aaaauuauuc uauuuugcau 4260gaauaguuca gacuuuccca
uaccacagcc aagcaguaac uaaaauuagg aucuuaauuu 4320ucaaugauaa aaggucuaag
guucauuuaa uuaugcuccu uuaacacugu cuuucuagau 4380uuuucaccca guauuuucaa
aauuugggaa uguaaacaau ugauauauuu auuguauguu 4440ggcuagcagu ucauccuucu
gcaaaauaug cauucagaga aaugugaagc uuguuuuaau 4500gaagacuuaa accauuugug
ucauuugugu uuucauauuc aaauacacca aauuaaaauu 4560cugaaccuau auuuuucauc
auuaacuucc uaauauacca gaacauauac cuuuuucaug 4620uaaaguuggc aaugggauau
ggcaguuuua uuuuugaaaa auauguaaca ugacuuuaau 4680auuuuuauag uuuucagaau
uagaaacaua ggaagggaaa auguuuuaau uagauaaguc 4740aacuuuuuau gugucuguag
ugguguacua uaauagcaaa uuauaaagca uuauuaaaug 4800uuuauaauaa uuuuuaauau
uaccuacauu augaauuuaa cuaaaauaaa gugugaguug 4860uauauuuuuu aauuggguug
uuucaauagc uggaagcauc cugaagcauu auauugauuu 4920uugaacuauu ugaacucaaa
cugaguauga uuugaaaaua aauuaauaau uuaaaaacau 4980ccaaaaaaaa aaaaaaaaaa
5000
12 2928 RNA Homo sapiens
12 uccuccuggg ucuugccuag cggcgggcgc augcuuaguc accgugaggc ugcgcuugcc
60cggggcccgc gccccccuac cccggggacc gcccccgggc cgcccgcccc acuuggcgcg
120ccacuuccgc gugcauggcc cugcugcccc gagcccugag cgccggcgcg ggaccgagcu
180ggcggcgggc ggcgcgcgcc uuccgaggcu uccugcugcu ucugcccgag cccgcggccc
240ucacgcgcgc ccucucccgu gccauggccu gcaggcagga gccgcagccg cagggcccgc
300cgcccgcugc uggcgccgug gccuccuaug acuaccuggu gaucgggggc ggcucgggcg
360ggcuggccag cgcgcgcagg gcggccgagc ugggugccag ggccgccgug guggagagcc
420acaagcuggg uggcacuugc gugaauguug gauguguacc caaaaaggua auguggaaca
480cagcugucca cucugaauuc augcaugauc augcugauua uggcuuucca aguugugagg
540guaaauucaa uuggcguguu auuaaggaaa agcgggaugc cuaugugagc cgccugaaug
600ccaucuauca aaacaaucuc accaaguccc auauagaaau cauccguggc caugcagccu
660ucacgaguga ucccaagccc acaauagagg ucagugggaa aaaguacacc gccccacaca
720uccugaucgc cacagguggu augcccucca ccccucauga gagccagauc cccggugcca
780gcuuaggaau aaccagcgau ggauuuuuuc agcuggaaga auugcccggc cgcagcguca
840uuguuggugc agguuacauu gcuguggaga uggcagggau ccugucagcc cuggguucua
900agacaucacu gaugauacgg caugauaagg ggauucaaac cgaugacaag ggucauauca
960ucguagacga auuccagaau accaacguca aaggcaucua ugcaguuggg gauguaugug
1020gaaaagcucu ucuuacucca guugcaauag cugcuggccg aaaacuugcc caucgacuuu
1080uugaauauaa ggaagauucc aaauuagauu auaacaacau cccaacugug gucuucagcc
1140accccccuau ugggacagug ggacucacgg aagaugaagc cauucauaaa uauggaauag
1200aaaaugugaa gaccuauuca acgagcuuua ccccgaugua ucacgcaguu accaaaagga
1260aaacaaaaug ugugaugaaa auggucugug cuaacaagga agaaaaggug guugggaucc
1320auaugcaggg acuugggugu gaugaaaugc ugcaggguuu ugcuguugca gugaagaugg
1380gagcaacgaa ggcagacuuu gacaacacag ucgccauuca cccuaccucu ucagaagagc
1440uggucacacu ucguugagaa ccaggagaca cguguggcgg gcagugggac ccauagaucu
1500ucugaaauga aacaaauaau cacauugacu uacuguuuga guuuuaugua uuucuuuauu
1560uuaaucagga ucuucugaua guggaaauuu uuaguacaua auagaacuua uuuauggagu
1620uagaaauuug uaguguuauc caggauugau uuucauuuga ucacaucuca caguaauuaa
1680uauuuucaag uuuuuuuuuu auuaacagcu cugugcuagu uuuuuuuuuc uguuuuagcc
1740ucaucccaaa uauaaagcuu ugugaaguac aauuaacuua auguacuuga augaauagaa
1800cuugcuacuu uuuuuuuuuu uuuuuuugag acagaguuuu gcucucauug cccaggcugg
1860agugcggugg ugcuauuuca gcucaccaca accucugccu ccuggguuca agugauucuc
1920cugccuuagc cucccgaaua gcuggaauua caggcacgca ccaccaugcc ugacuaauuu
1980uguauuuuua guagacaugg gguuucucca uguuggucag gcuggucuca aacucccacc
2040uucaggugau ccgcccaccu cggccuccug aggugcugag auuacaggcg ugagccacug
2100ugccagcuug cuaauuuuca cagaaguuga uggcaauucu ucacauguaa acagugccag
2160ugcacagaac cuuuauauau uuuuugaagc caguacugug cucugcauau aacaaagcug
2220cuucaaggau gagaccuuuu ucuaaaagca uguaauguga gaagccggcc ugccuuauuu
2280ucuuuuuucu uuuuuaauga uuaaaaauag uuuguggcaa ggcacggugg cucaggccug
2340uaauucuagc acuuugggag gccgaggcag gaggauuacu ugagccuaca aguuugaggc
2400cagcaugcac agcauagcaa gacugcaucu cuacagagag uaaaaaaaau uacccgagug
2460uggugaugug caucuguaau cucagcuacu ugggaggcug aggugagagg aucacuugag
2520cuugggugag gugaggcugc agugaguccu gaucaugcug cugcacucaa ucuuggacaa
2580cagagcaaga cccugucuca aaaaaaaaaa aaaaaaauau auauauauau auauauuauu
2640uuuaugaggu gaagugcauc aaacuuggga aagauuugag gaggcuggga accuccugga
2700aaaccacucc uugaagaaag auaugagaga cauuuagaag ugauuccugc uuucagaagg
2760agguggauuc aaauacauca aaagucccuu ccucugcuaa guguuuauag uucaaugaau
2820aauuucaaua uuuguaugug uucuugucau uuuauuuuuu ucugaaaaac uuccaaaaau
2880uugaaaauaa aauuacagcc uuuucuucuu auaaaaaaaa aaaaaaaa
2928
13 1774 RNA Homo sapiens
13 gcaguucuuu gaauuucuca cccuaagauc uggccuguac
auuuucaagg aauucuugag 60agguucuugg agagauucug ggagccaaac acuccauugg
gauccuagcu ggaauauaaa 120gaauggcuua ucaguggaga ccaucgacag uugagaaaag
aagaagccca aaaaguacaa 180gaaugaaaau cgagaguuuu uagagaacaa cuuguaaugg
agccuucauc ucuugagcug 240ccggcugaca cagugcagcg cauugcggcu gaacucaaau
gccacccaac ggaugagagg 300guggcucucc accuagauga ggaagauaag cugaggcacu
ucagggagug cuuuuauauu 360cccaaaauac aggaucugcc uccaguugau uuaucauuag
ugaauaaaga ugaaaaugcc 420aucuauuucu ugggaaauuc ucuuggccuu caaccaaaaa
ugguuaaaac auaucuugaa 480gaagaacuag auaagugggc caaaauagca gccuaugguc
augaaguggg gaagcguccu 540uggauuacag gagaugagag uauuguaggc cuuaugaagg
acauuguagg agccaaugag 600aaagaaauag cccuaaugaa ugcuuugacu guaaauuuac
aucuucuaau guuaucauuu 660uuuaagccua cgccaaaacg auauaaaauu cuucuagaag
ccaaagccuu cccuucugau 720cauuaugcua uugagucaca acuacaacuu cacggacuua
acauugaaga aaguaugcgg 780augauaaagc caagagaggg ggaagaaacc uuaagaauag
aggauauccu ugaaguaauu 840gagaaggaag gagacucaau ugcagugauc cuguucagug
gggugcauuu uuacacugga 900cagcacuuua auauuccugc caucacaaaa gcuggacaag
cgaaggguug uuauguuggc 960uuugaucuag cacaugcagu uggaaauguu gaacucuacu
uacaugacug gggaguugau 1020uuugccugcu gguguuccua caaguauuua aaugcaggag
caggaggaau ugcuggugcc 1080uucauucaug aaaagcaugc ccauacgauu aaaccugcau
uagugggaug guuuggccau 1140gaacucagca ccagauuuaa gauggauaac aaacugcagu
uaaucccugg ggucugugga 1200uuccgaauuu caaauccucc cauuuuguug gucuguuccu
ugcaugcuag uuuagagauc 1260uuuaagcaag cgacaaugaa ggcauugcgg aaaaaaucug
uuuugcuaac uggcuaucug 1320gaauaccuga ucaagcauaa cuauggcaaa gauaaagcag
caaccaagaa accaguugug 1380aacauaauua cuccgucuca uguagaggag cgggggugcc
agcuaacaau aacauuuucu 1440guuccaaaca aagauguuuu ccaagaacua gaaaaaagag
gagugguuug ugacaagcgg 1500aauccaaaug gcauucgagu ggcuccaguu ccucucuaua
auucuuucca ugauguuuau 1560aaauuuacca aucugcucac uucuauacuu gacucugcag
aaacaaaaaa uuagcagugu 1620uuucuagaac aacuuaagca aauuauacug aaagcugcug
ugguuauuuc aguauuauuc 1680gauuuuuaau uauugaaagu augucaccau ugaccacaug
uaacuaacaa uaaauaauau 1740accuuacaga aaaucugaaa aaaaaaaaaa aaaa
1774
14 3519 RNA Homo sapiens
14 cucacacgcc ggcucggaug
aucuccugcc augacucagc gcuucucgca ggcugcccug 60cuggggacac cggcuucgcu
cgggccccuc ccgacgcguc cacccccucu cgccacccac 120gcccgccccc agccgcuggg
ccuuucccag ugcggccgcc gccgccacag cugcagucag 180caccgucacc ccagcagcau
ccgccgccug caccgcgcgu gcggcccgcc ccggccugac 240cccgccgccg aacccggcgc
cagccaugga gcccgaagcc ccccgucgcc gccacaccca 300ucagcgcggc uaccugcuga
cacggaaccc ucaccucaac aaggacuugg ccuuuacccu 360ggaagagaga cagcaauuga
acauucaugg auuguugcca ccuuccuuca acagucagga 420gauccagguu cuuagaguag
uaaaaaauuu cgagcaucug aacucugacu uugacaggua 480ucuucucuua auggaucucc
aagauagaaa ugaaaaacuc uuuuauagag ugcugacauc 540ugacauugag aaauucaugc
cuauuguuua uacucccacu gugggucugg cuugccaaca 600auauaguuug guguuucgga
agccaagagg ucucuuuauu acuauccacg aucgagggca 660uauugcuuca guucucaaug
cauggccaga agaugucauc aaggccauug uggugacuga 720uggagagcgu auucuuggcu
ugggagaccu uggcuguaau ggaaugggca ucccuguggg 780uaaauuggcu cuauauacag
cuugcggagg gaugaauccu caagaauguc ugccugucau 840ucuggaugug ggaaccgaaa
augaggaguu acuuaaagau ccacucuaca uuggacuacg 900gcagagaaga guaagagguu
cugaauauga ugauuuuuug gacgaauuca uggaggcagu 960uucuuccaag uauggcauga
auugccuuau ucaguuugaa gauuuugcca augugaaugc 1020auuucgucuc cugaacaagu
aucgaaacca guauugcaca uucaaugaug auauucaagg 1080aacagcaucu guugcaguug
caggucuccu ugcagcucuu cgaauaacca agaacaaacu 1140gucugaucaa acaauacuau
uccaaggagc uggagaggcu gcccuaggga uugcacaccu 1200gauugugaug gccuuggaaa
aagaagguuu accaaaagag aaagccauca aaaagauaug 1260gcugguugau ucaaaaggau
uaauaguuaa gggacgugcu uccuuaacac aagagaaaga 1320gaaguuugcc caugaacaug
aagaaaugaa gaaccuagaa gccauuguuc aagaaauaaa 1380accaacugcc cucauaggag
uugcugcaau ugguggugca uucucagaac aaauucucaa 1440agauauggcu gccuucaaug
aacggccuau uauuuuugcu uugaguaauc caacuagcaa 1500agcagaaugu ucugcagagc
agugcuacaa aauaaccaag ggacgugcaa uuuuugccag 1560uggcaguccu uuugauccag
ucacucuucc aaauggacag acccuauauc cuggccaagg 1620caacaauucc uauguguucc
cuggaguugc ucuugguguu guggcgugug gauugaggca 1680gaucacagau aauauuuucc
ucacuacugc ugagguuaua gcucagcaag ugucagauaa 1740acacuuggaa gagggucggc
uuuauccucc uuugaauacc auuagagaug uuucucugaa 1800aauugcagaa aagauuguga
aagaugcaua ccaagaaaag acagccacag uuuauccuga 1860accgcaaaac aaagaagcau
uuguccgcuc ccagauguau aguacugauu augaccagau 1920ucuaccugau uguuauucuu
ggccugaaga ggugcagaaa auacagacca aaguugacca 1980guaggauaau agcaaacauu
ucuaacucua uuaaugaggu cuuuaaaccu uucauaauuu 2040uuaaagguug gaaucuuuua
uaaugauuca uaagacacuu agauuaagau uuuacuuuaa 2100cagucuaaaa auugauagaa
gaauaucgau auaaauuggg auaaacauca caugagacaa 2160uuuugcuuca cuuugccuuc
ugguuauuua ugguuucugu cugaauuauu cugccuacgu 2220ucucuuuaaa agcuguugua
cguacuacgg agaaacucau cauuuuuaua caggacacua 2280augggaagac caaaauuacu
aauaaauuga cauaaccaac auuaaaacuc auaauuauuu 2340uguugaccau uuuguuaaaa
ucuacuuuuc aaaaaaaaaa agcuagaaau gaaucuaggc 2400guaggugaac uuuugcuaag
cagaaauaac acuacuuugu ugccuagaga aagauaacuu 2460cucaaguauu uuuauuccag
uccuagauca uauauguucu uuugugcaac ggaauucuaa 2520caguucuaag agaaagauca
cugcuguuua cagcgccuug ugcagccuua gauuuuaaua 2580uucuuuuguc auuguuacau
cucauagagu aaagcucuua uuaccuugau ccugagucag 2640aaaucccacc ugaaaucacc
uuuuuucccc cuugaucaaa caucccaucc uucagcuacc 2700auacuguugc uacagggauu
uuguggacug uggccccugu cccgagguug gcaccuucag 2760uucagcacag ccugagcagu
gagaaggucu gaaaggagag uauauaguua agauccuuga 2820gaaagggcug ccugaggaac
ugaccucuua aagaucucag gaucuuuaag acaacaaguu 2880agguuccuac uggaguuacc
ugccagaaug gccucuuaau uaacucaggu aaugaagagc 2940uaacuguguu auaaucaucu
ugcuuuugcc ugaauuugga gaaaguauua uaauuaaguu 3000cccaguauca gaaauguccu
uacauaagau uaaaauaucu ugaugacuaa uaccauucua 3060ugagaaagag uaguuauaug
cccagacugu auuaauuuac uuuagaaacu aauguuugaa 3120guaauggaaa aaauuuuaaa
uuauaaagcu aaggugcaau aacauuugcu acuuauuuau 3180agaauuauuu gaagaauuuu
guuuuugaag uaaugcuuua aggaguauaa gauauucaag 3240auaaauuaua cuauaaaaug
auuuuauuga aaguugaagg uuacacaaau uguuuuaggu 3300augagcagaa gagguuaagg
uauuucuaaa gguaacauau agucaagagu uuccucaaaa 3360uaguuauuug gagaagaauc
agaaugucug uguauuucuu gucuguuucu auguugucuu 3420auagcucuga cuaaaugugu
uuaccuaugc aaaagauuua uuaaagcaua gaaaagguga 3480augaauaaaa auauaaaaua
auuguccuuu uucuuaaaa 3519
15 2917 RNA Homo sapiens
15 ggcccuuccg gggcugcgcg gcucccccgc cucggugccg gcaaaaaugu gccuagucac
60ggggccgcuc ucgggggaac ugaggucgcc uucgggcugg gacccggagc cccuucgccg
120cgccccaaga ccuccuugag ugcgggcugc gacgcgcuca ccccgcuggg ccgucugugg
180gcgcggcuuu gcgaagucau ccaucucucg gaucacucuc uggcagccuu gagcucucuu
240gaaagcccag ccccgggacg agggaggagc gccuuaagug cccagcgggc ucagaagccc
300cgacgugugg cggcugagcc gggccccgcg cacuuucucg gccggggagg gguucgggcu
360cgggcacccg gaguuggccc cucguaacgc cgcgggaaag ugcgggcgag ggcaguggac
420ucugaggccg gagucggcgg cacccggggc uucuaguucg gacgcggugc ccccuggugg
480cgcucaccgc gcgcguggcc uuggcuuccg ugacagcgcu cgguuggccg ucacagcagc
540ccucgguugg cccuuuccug cuuuauagcg ugcaaaccuc gccgcgccag ggccaaggga
600cagguuggag cuguugaucu guugcgcaau ugcuauuuuc cccagagcgg cuuugucuuu
660ggauuuagcg uuucagaauu gcaauuccaa aauguguaag acgggauauu cucuucugug
720cugucaaggg acauggauuu gauugacaua cuuuggaggc aagauauaga ucuuggagua
780agucgagaag uauuugacuu cagucagcga cggaaagagu augagcugga aaaacagaaa
840aaacuugaaa aggaaagaca agaacaacuc caaaaggagc aagagaaagc cuuuuucgcu
900caguuacaac uagaugaaga gacagguugc ccacauuccc aaaucagaug cuuuguacuu
960ugaugacugc augcagcuuu uggcgcagac auucccguuu guagaugaca augagguuuc
1020uucggcuacg uuucagucac uuguuccuga uauucccggu cacaucgaga gcccagucuu
1080cauugcuacu aaucaggcuc agucaccuga aacuucuguu gcucagguag ccccuguuga
1140uuuagacggu augcaacagg acauugagca aguuugggag gagcuauuau ccauuccuga
1200guuacagugu cuuaauauug aaaaugacaa gcugguugag acuaccaugg uuccaagucc
1260agaagccaaa cugacagaag uugacaauua ucauuuuuac ucaucuauac ccucaaugga
1320aaaagaagua gguaacugua guccacauuu ucuuaaugcu uuugaggauu ccuucagcag
1380cauccucucc acagaagacc ccaaccaguu gacagugaac ucauuaaauu cagaugccac
1440agucaacaca gauuuuggug augaauuuua uucugcuuuc auagcugagc ccaguaucag
1500caacagcaug cccucaccug cuacuuuaag ccauucacuc ucugaacuuc uaaaugggcc
1560cauugauguu ucugaucuau cacuuugcaa agcuuucaac caaaaccacc cugaaagcac
1620agcagaauuc aaugauucug acuccggcau uucacuaaac acaaguccca guguggcauc
1680accagaacac ucaguggaau cuuccagcua uggagacaca cuacuuggcc ucagugauuc
1740ugaaguggaa gagcuagaua gugccccugg aagugucaaa cagaaugguc cuaaaacacc
1800aguacauucu ucuggggaua ugguacaacc cuugucacca ucucaggggc agagcacuca
1860cgugcaugau gcccaaugug agaacacacc agagaaagaa uugccuguaa guccugguca
1920ucggaaaacc ccauucacaa aagacaaaca uucaagccgc uuggaggcuc aucucacaag
1980agaugaacuu agggcaaaag cucuccauau cccauucccu guagaaaaaa ucauuaaccu
2040cccuguuguu gacuucaacg aaaugauguc caaagagcag uucaaugaag cucaacuugc
2100auuaauucgg gauauacgua ggagggguaa gaauaaagug gcugcucaga auugcagaaa
2160aagaaaacug gaaaauauag uagaacuaga gcaagauuua gaucauuuga aagaugaaaa
2220agaaaaauug cucaaagaaa aaggagaaaa ugacaaaagc cuucaccuac ugaaaaaaca
2280acucagcacc uuauaucucg aaguuuucag caugcuacgu gaugaagaug gaaaaccuua
2340uucuccuagu gaauacuccc ugcagcaaac aagagauggc aauguuuucc uuguucccaa
2400aaguaagaag ccagauguua agaaaaacua gauuuaggag gauuugaccu uuucugagcu
2460aguuuuuuug uacuauuaua cuaaaagcuc cuacugugau gugaaaugcu cauacuuuau
2520aaguaauucu augcaaaauc auagccaaaa cuaguauaga aaauaauacg aaacuuuaaa
2580aagcauugga gugucaguau guugaaucag uaguuucacu uuaacuguaa acaauuucuu
2640aggacaccau uugggcuagu uucuguguaa guguaaauac uacaaaaacu uauuuauacu
2700guucuuaugu cauuuguuau auucauagau uuauaugaug auaugacauc uggcuaaaaa
2760gaaauuauug caaaacuaac cacuauguac uuuuuuauaa auacuguaug gacaaaaaau
2820ggcauuuuuu auauuaaauu guuuagcucu ggcaaaaaaa aaaaauuuua agagcuggua
2880cuaauaaagg auuauuauga cuguuaaauu auuaaaa
2917
16 2423 RNA Homo sapiens
16 auccuccgcc cagcacccca ggauucaggc guuggguccc
gcccuuguag gcuguccacc 60ucaaacgggc cggacaggau auauaagaga gaaugcaccg
ugcacuacac acgcgacucc 120cacaagguug cagccggagc cgcccagcuc accgagagcc
uaguuccggc cagggucgcc 180ccggcaacca cgagcccagc caaucagcgc cccggacugc
accagagcca uggucggcag 240aagagcacug aucguacugg cucacucaga gaggacgucc
uucaacuaug ccaugaagga 300ggcugcugca gcggcuuuga agaagaaagg augggaggug
guggagucgg accucuaugc 360caugaacuuc aaucccauca uuuccagaaa ggacaucaca
gguaaacuga aggacccugc 420gaacuuucag uauccugccg agucuguucu ggcuuauaaa
gaaggccauc ugagcccaga 480uauuguggcu gaacaaaaga agcuggaagc cgcagaccuu
gugauauucc agaguggcau 540ucugcauuuc uguggcuucc aagucuuaga accucaacug
acauauagca uugggcacac 600uccagcagac gcccgaauuc aaauccugga aggauggaag
aaacgccugg agaauauuug 660ggaugagaca ccacuguauu uugcuccaag cagccucuuu
gaccuaaacu uccaggcagg 720auucuuaaug aaaaaagagg uacaggauga ggagaaaaac
aagaaauuug gccuuucugu 780gggccaucac uugggcaagu ccaucccaac ugacaaccag
aucaaagcua gaaaaugaga 840uuccuuagcc uggauuuccu ucuaacaugu uaucaaaucu
ggguaucuuu ccaggcuucc 900cugacuugcu uuaguuuuua agauuugugu uuuucuuuuu
ccacaaggaa uaaaugagag 960ggaaucgacu guauucgugc auuuuuggau cauuuuuaac
ugauucuuau gauuacuauc 1020auggcauaua accaaaaucc gacugggcuc aagaggccac
uuagggaaag auguagaaag 1080augcuagaaa aauguucuuu aaaggcaucu acacaauuua
auuccucuuu uuagggcuaa 1140aguuuuaggg uacaguuugg cuagguauca uucaacucuc
caauguucua uuaaucaccu 1200cucuguaguu uauggcagaa gggaauugcu cagagaagga
aaagacugaa ucuaccugcc 1260cuaagggacu uaacuuguuu gguaguuagc caucuaaugc
uuguuuauga uauuucuugc 1320uuucaauuac aaagcaguua cuaauaugcc uagcacaagu
accacucuug gucagcuuuu 1380guuguuuaua uacaguacac agauaccuug aaaggaagag
cuaauaaauc ucuucuuugc 1440ugcagucauc uacuuuuuuu uuaauuaaaa aaaauuuuuu
uuugaagcag ucuugcucug 1500uuacccaggc uggagugcag uggugugauc ucggcucacu
gcaaccucug ccucccaggu 1560uccagcaauu cuccugccuc agccucccua guagcuggga
ugacaggcgc cugccaucau 1620gccugacuaa uuuuuguauu uuuaguagag acggcguuuc
accauguugg ccaggcuggu 1680cucaaacucc ugaccucagg ugauccgccu accucagccu
cccaaagugc ugggauuaca 1740ggcgugaucc accacaccug gcccuugcaa ucuucuacuu
uaagguuugc agagauaaac 1800caauaaaucc acaccguaca ucugcaauau gaauucaaga
aaggaaauag uaccuucaau 1860acuuaaaaau agucuuccac aaaaaauacu uuauuucuga
ucuauacaaa uuuucagaag 1920guuauuuucu uuaucauugc uaaacugaug acuuacuaug
ggaugggguc cagucccaug 1980accuuggggu acaauuguaa accuagaguu uuaucaacuu
uggugaacag uuuuggcaua 2040auagucaauu ucuacuucug gaagucaucu cauuccacug
uugguauuau auaauucaag 2100gagaauauga uaaaacacug cccucuugug gugcauugaa
agaagagaug agaaaugaug 2160aaaagguugc cugaaaaaug ggagacagcc ucuuacuugc
caagaaaaug aagggauugg 2220accgagcugg aaaaccuccu uuaccagaug cugacuggca
cuggugguuu uugcucucga 2280caguauccac aauagcugac ggcugggugu uucaguuuga
aaauauuuug uugccuucau 2340cuucacugca auuuugugua aauuucucaa agaucugaau
uaaauaaaua aaauucauuu 2400cuacagaccc acaaaaaaaa aaa
2423
17 1591 RNA Homo sapiens
17 cgggcgccgc gggccauggc
gggcgagaac caccaguggc agggcagcau ccucuacaac 60augcuuauga gcgcgaagca
aacgcgcgcg gcuccugagg cuccagagac gcggcuggug 120gaucagugcu ggggcuguuc
gugcggcgau gagcccgggg ugggcagaga ggggcugcug 180ggcgggcgga acguggcgcu
ccuguaccgc ugcugcuuuu gcgguaaaga ccacccacgg 240cagggcagca uccucuacag
caugcugacg agcgcaaagc aaacguacgc ggcaccgaag 300gcgcccgagg cgacgcuggg
uccgugcugg ggcuguucgu gcggcucuga ucccggggug 360ggcagagcgg ggcuuccggg
ugggcggccc guggcacucc uguaccgcug cugcuuuugu 420ggugaagacc acccgcggca
gggcagcauc cucuacagcu ugcucacuag cucaaagcaa 480acgcacgugg cuccggcagc
gcccgaggca cggccagggg gcgcguggug ggaccgcucc 540uacuucgcgc agaggccagg
ggguaaagag gcgcuaccag gcgggcgggc cacggcgcuu 600cuguaccgcu gcugcuuuug
cggugaagac cacccgcagc agggcagcac ccucuacugc 660gugcccacga gcacaaauca
agcgcaggcg gcuccggagg agcggccgag ggcccccugg 720ugggacaccu ccucuggugc
gcugcggccg guggcgcuca agaguccaca gguggucugc 780gaggcagccu cagcgggccu
guugaagacg cugcgcuucg ucaaguacuu gcccugcuuc 840caggugcugc cccuggacca
gcagcuggug cuggugcgca acugcugggc gucccugcuc 900augcuugagc uggcccagga
ccgcuugcag uucgagacug uggaagucuc ggagcccagc 960augcugcaga agauccucac
caccaggcgg cgggagaccg ggggcaacga gccacugccc 1020gugcccacgc ugcagcacca
uuuggcaccg ccggcggagg ccaggaaggu gcccuccgcc 1080ucccaggucc aagccaucaa
gugcuuucuu uccaaaugcu ggagucugaa caucaguacc 1140aaggaguacg ccuaccucaa
ggggaccgug cucuuuaacc cggacgugcc gggccugcag 1200ugcgugaagu acauucaggg
acuccagugg ggaacucagc aaauacucag ugaacacacc 1260aggaugacgc accaagggcc
ccaugacaga uucaucgaac uuaauaguac ccuuuuccug 1320cugagauuca ucaaugccaa
ugucauugcu gaacuguucu ucaggcccau caucggcaca 1380gucagcaugg augauaugau
gcuggaaaug cucuguacaa agauauaaag ucaugugggc 1440cacacaagug caguagugca
guucaccaug agggaagaau aaagagcugu gggcaaaaga 1500guguaaaaua uuuuaaaaua
aacuuucuua auauuuuuac augcagagua uuuuuguauu 1560caauuaaaga aauaauuuua
uuccaaaaaa a 1591
18 1958 RNA Homo sapiens
18 acuucccucu ggccucucag agccucuugg auccccacag gguaaugggu gucccgaucu
60cgcgggggac ucugugaucc guguuccccu gacccuccua gugcacaacu uggccgggcu
120cacugggcuc cugcaccacu gccugucagg uccgcugcca gccccaagcc ccccaccagc
180caugagcucc uccagaaagg accaccucgg cgccagcagc ucagagcccc ucccggucau
240cauugugggu aacggccccu cugguaucug ccuguccuac cugcucuccg gcuacacacc
300cuacacgaag ccagaugcca uccacccaca cccccugcug cagaggaagc ucaccgaggc
360cccggggguc uccauccugg accaggaccu ggacuaccug uccgaaggcc ucgaaggccg
420aucccaaagc cccguggccc ugcucuuuga ugcccuucua cgcccagaca cagacuuugg
480gggaaacaug aagucggucc ucaccuggaa gcaccggaag gagcacgcca ucccccacgu
540gguucugggc cggaaccucc ccgggggagc cuggcacucc aucgaaggcu ccauggugau
600ccugagccaa ggccagugga uggggcuccc ggaccuggag gucaaggacu ggaugcagaa
660gaagcgaaga ggucuucgca acagccgggc cacugccggg gacaucgccc acuacuacag
720ggacuacgug gucaagaagg gucuggggca uaacuuugug uccggugcug uagucacagc
780cguggagugg gggacccccg aucccagcag cuguggggcc caggacucca gcccccucuu
840ccaggugagc ggcuuccuga ccaggaacca ggcccagcag cccuucucgc ugugggcccg
900caacgugguc cucgccacag gcacguucga cagcccggcc cggcugggca uccccgggga
960ggcccugccc uucauccacc augagcuguc ugcccuggag gccgccacaa gggugggugc
1020ggugaccccg gccucagacc cuguccucau cauuggcgcg gggcugucag cggccgacgc
1080gguccucuac gcccgccacu acaacauccc ggugauccau gccuuccgcc gggccgugga
1140cgacccuggc cugguguuca accagcugcc caagaugcug uaccccgagu accacaaggu
1200gcaccagaug augcgggagc aguccauccu gucgcccagc cccuaugagg guuaccgcag
1260ccuccccagg caccagcugc ugugcuucaa ggaagacugc caggccgugu uccaggaccu
1320cgaggguguc gagaaggugu uuggggucuc ccuggugcug guccucaucg gcucccaccc
1380cgaccucucc uuccugccug gggcaggggc ugacuuugca guggauccug accagccgcu
1440gagcgccaag aggaacccca uugacgugga ccccuucacc uaccagagca cccgccagga
1500gggccuguac gccauggggc cgcuggccgg ggacaacuuc gugagguuug ugcagggggg
1560cgccuuggcu guggccagcu cccugcuaag gaaggagacc aggaagccac ccuaacacuc
1620ggccagaccc gcuggcuccc aggcccugag aggacagaga ugaccacauc ccugcuggau
1680gcaggacccg uccaaagaug ccccggggag gggugucagc ccacguugcu ggccuuuggg
1740gucaagagga guagggaucc caggcugccc uggacuuaga ccagugucug aggugguaac
1800agcggccgca ggccaggguu ggccuagacc ugggauuugu ggggaaagcu gcugguguga
1860ccagcugagc acccagccag gagaccugca gcccugcgcc uuccagaagc aggucccaaa
1920uaaagccagu gcccaccugc aaaaaaaaaa aaaaaaaa
1958
19 2432 RNA Homo sapiens
19 auggcccagu gagugacucg ccaggggcag cccggcucgg
ccucagcggg cggggaacuc 60uuuggggguc gagaucuccc ucguucucuc cgacgccucc
cacccugggg gucgccugag 120cucacuuggg gcucugugac ccuggcccua cggcgucucg
ggcccagagc uccuucccug 180cgggcccggc ccccugcccu cucggccgcg cagagcugac
aucgcgcuga ucggauuggc 240cgucaugggc cagaacuuaa uucugaacau gaaugaccac
ggcuuugugg ucugugcuuu 300uaauaggacu gucuccaaag uugaugauuu cuuggccaau
gaggcaaagg gaaccaaagu 360ggugggugcc cagucccuga aagagauggu cuccaagcug
aagaagcccc ggcggaucau 420ccuccuggug aaggcugggc aagcugugga ugauuucauc
gagaaauugg uaccauuguu 480ggauacuggu gacaucauca uugacggagg aaauucugaa
uauagggaca ccacaagacg 540gugccgagac cucaaggcca agggaauuuu auuugugggg
agcggaguca gugguggaga 600ggaaggggcc cgguauggcc caucgcucau gccaggaggg
aacaaagaag cguggcccca 660caucaagacc aucuuccaag gcauugcugc aaaaguggga
acuggagaac ccugcuguga 720cuggguggga gaugagggag caggccacuu cgugaagaug
gugcacaacg ggauagagua 780uggggacaug cagcugaucu gugaggcaua ccaccugaug
aaagacgugc ugggcauggc 840gcaggacgag auggcccagg ccuuugagga uuggaauaag
acagagcuag acucauuccu 900gauugaaauc acagccaaua uucucaaguu ccaagacacc
gauggcaaac accugcugcc 960aaagaucagg gacagcgcgg ggcagaaggg cacagggaag
uggaccgcca ucuccgcccu 1020ggaauacggc guacccguca cccucauugg agaagcuguc
uuugcucggu gcuuaucauc 1080ucugaaggau gagagaauuc aagcuagcaa aaagcugaag
gguccccaga aguuccaguu 1140ugauggugau aagaaaucau uccuggagga cauucggaag
gcacucuacg cuuccaagau 1200caucucuuac gcucaaggcu uuaugcugcu aaggcaggca
gccaccgagu uuggcuggac 1260ucucaauuau gguggcaucg cccugaugug gagagggggc
ugcaucauua gaaguguauu 1320ccuaggaaag auaaaggaug cauuugaucg aaacccggaa
cuucagaacc uccuacugga 1380cgacuucuuu aagucagcug uugaaaacug ccaggacucc
uggcggcggg cagucagcac 1440ugggguccag gcuggcauuc ccaugcccug uuuuaccacu
gcccucuccu ucuaugacgg 1500guacagacau gagaugcuuc cagccagccu cauccaggcu
cagcgggauu acuucggggc 1560ucacaccuau gaacucuugg ccaaaccagg gcaguuuauc
cacaccaacu ggacaggcca 1620ugguggcacc gugucauccu cgucauacaa ugccugauca
ugcugcuccu gucacccucc 1680acgauuccac agaccaggac auuccaugug ccucauggca
cugccaccug gcccuuugcc 1740cuauuuucug uucaguuuuu uaaaaguguu guaagagacu
ccugaggaag acacacaguu 1800uauuuguaaa guagcucugu gagagccacc augcccucug
cccuugccuc uugggacuga 1860ccaggagcug cucaugugcg ugagaguggg aaccaucucc
uugcggcagu ggcuuccgcg 1920ugccccgugu gcuggugcgg uucccaucac gcagacagga
aggguguuug cgcacucuga 1980ucaacuggaa ccucuguauc augcggcuga auucccuuuu
uccuuuacuc aauaaaagcu 2040acaucagacu gaugcucuuu cuccagauuc uuagucucac
cucggccaca uggagccauu 2100auccccauug gcagaaagau uuuucuuuaa aaaaaaagac
uagaauaaca caagaaacca 2160cauuuaggau uaugcuucac ucagaggagg caggcaggga
ggacacacca ggggcuuuaa 2220uacacugggc auguuuucuu ucuccaauug ggcaaugggu
acauggacgu ucacuguaac 2280gugcuuuuuc uuucgucuuu uuuuuuuuuu uuuuuuuuuu
ugcuccuggc aagcugugcg 2340ugacauucuu uauggcuuuu uguaugucaa auacuucaua
cuaaacuuuc uagagaauua 2400aacuuuaaug augggcucaa aaaaaaaaaa aa
2432
20 4583 RNA Homo sapiens
20 gcggccgccc cggcggcucc
uggaaccccg guucgcggcg augccagcca ccccagcgaa 60gccgccgcag uucagugcuu
ggauaauuug aaaguacaau aguugguuuc ccuguccacc 120cgccccacuu cgcuugccau
cacagcacgc cuaucggaug ugagaggaga agucccgcug 180cucgggcacu gucuauauac
gccuaacacc uacauauauu uuaaaaacau uaaauauaau 240uaacaaucaa aagaaagagg
agaaaggaag ggaagcauua cuggguuacu augcacuugc 300gacugauuuc uuggcuuuuu
aucauuuuga acuuuaugga auacaucggc agccaaaacg 360ccucccgggg aaggcgccag
cgaagaaugc auccuaacgu uagucaaggc ugccaaggag 420gcugugcaac augcucagau
uacaauggau guuugucaug uaagcccaga cuauuuuuug 480cucuggaaag aauuggcaug
aagcagauug gaguaugucu cucuucaugu ccaaguggau 540auuauggaac ucgauaucca
gauauaaaua aguguacaaa augcaaagcu gacugugaua 600ccuguuucaa caaaaauuuc
ugcacaaaau guaaaagugg auuuuacuua caccuuggaa 660agugccuuga caauugccca
gaaggguugg aagccaacaa ccauacuaug gaguguguca 720guauugugca cugugagguc
agugaaugga auccuuggag uccaugcacg aagaagggaa 780aaacaugugg cuucaaaaga
gggacugaaa cacggguccg agaaauaaua cagcauccuu 840cagcaaaggg uaaccugugu
cccccaacaa augagacaag aaaguguaca gugcaaagga 900agaaguguca gaagggagaa
cgaggaaaaa aaggaaggga gaggaaaaga aaaaaaccua 960auaaaggaga aaguaaagaa
gcaauaccug acagcaaaag ucuggaaucc agcaaagaaa 1020ucccagagca acgagaaaac
aaacagcagc agaagaagcg aaaaguccaa gauaaacaga 1080aaucgguauc agucagcacu
guacacuaga ggguuccaug agauuauugu agacucauga 1140ugcugcuauc ucaaccagau
gcccaggaca ggugcucuag ccauuaggac cacaaaugga 1200caugucaguu auugcucugu
cuaaacaaca uucccaguag uugcuauauu cuucauacaa 1260gcauaguuaa caacaaagag
ccaaaagauc aaagaaggga uacuuucaga ugguugucuu 1320gugugcuucu cugcauuuuu
aaaagacaag acauucuugu acauauuauc aauaggcuau 1380aagauguaac aacgaaauga
ugacaucugg agaagaaaca ucuuuuccuu auaaaaaugu 1440guuuucaagc uguuguuuua
agaagcaaaa gauaguucug caaauucaaa gauacaguau 1500cccuucaaaa caaauaggag
uucagggaag agaaacaucc uucaaaggac aguguuguuu 1560ugaccgggag aucuagagag
ugcucagaau uagggccugg cauuuggaau cacaggauuu 1620aucaucacag aaacaacugu
uuuaagauua guuccaucac ucucauccug uauuuuuaua 1680agaaacacaa gagugcauac
cagaauugaa uauaccauau gggauuggag aaagacaaau 1740guggaagaaa ucauagagcu
ggagacuacu uuugugcuuu acaaaacugu gaaggauugu 1800ggucaccugg aacaggucuc
caaucuaugu uagcacuaug uggcucagcc ucuguuaccc 1860cuuggauuau auaucaaccu
guaaacaugu gccuguaacu uacuuccaaa aacaaaauca 1920uacuuauuag aagaaaauuc
ugauuuuaua gaaaaaaaau agagcaagga gaauauaaca 1980uguuugcaaa gucauguguu
uucuuucuca augagggaaa aacaauuuua uuaccugcuu 2040aaugguccac cuggaacuaa
aagggauacu auuuucuaac aagguauauc uaguagggga 2100gaaagccacc acaauaaaua
uauuuguuaa uaguuuuuca aguuuuguuc acucuguuuu 2160auuguuuguu uuauugagaa
auucuuacuc uuagagacuc augaauuaag aaagagaauu 2220cugcuaacuc agagaaccug
guuccuaugu aauucagaau auauuacauu ucucaguaau 2280auuuguuuuu ugaauccacc
uuuaucugag ccaauggaga uuuacuuaua gcguauuagg 2340agauauuuau uccauuuucu
uauuuuaauc aacauucuaa uuauagacac augggccucc 2400cuagcugauu ucacugcucc
cccuucauug cuuagaaaug ggcaucauuu cuuguauguc 2460agaucccccu gcaucuucaa
cauuuagucu uuucuucucc auauuuucua ucuguggauc 2520ucuuuagggg auugaaguca
cccuagcuga aggccucacc aguguuucac agaggacaca 2580gcccaccccu ugcaggagga
gguaucucug agugugcagc acagaaucgc augacccacc 2640uuaaccuucc uguugucaug
gaaggaugca cggcugcucu guccacugug auuccuagcc 2700cucucaagau cacugcuuuc
ugaagaauuu gcaaugacuc uggcuucugg cugcuuaucu 2760cuggacaccc guucuccacc
aguuguacag uucauguaau cuacuuggcu uaauugauuu 2820uccacuucuc ucuuccucuu
cuaagauaua aacauuuuaa augauuuauu ccuguuucuu 2880auucuggugu uucuuuccuu
gucccuauga gauaaguguc ucaacucacu aaaucuauuc 2940ccaauguaua aaauaauucu
aauuccauuu ucagcuaaaa cauauauuac caagaagaaa 3000caaacuuuau ccuacagaau
gauguuaggu agaaauaugu ccccagguuu gagaccuuuc 3060ggaugauuuc auauaccauc
uuucuucuga guguuaccca gucaaguaua aguagccaaa 3120uuauuuuugc acaucuuucu
guuucucaug ucuucauuua uucaacaagc acuuacuggg 3180aaggucuaca ccugcauagg
caaugcugga aaaaggguua aguaaaccag gacaugacaa 3240ugguggcaaa ugacuaucag
gucuucccau guguuugacu caaacuuauu acccuauggu 3300ccuucugaca auggcagaag
gucugaaucc uugaugcuaa acuuauauaa aaguagaauu 3360auuacaaagg aaaaagaaau
aaaaacuaac auucauuuuc auauguugga ugaaauauaa 3420augaagaaaa agauaacauc
aauuuuaacu guaauucucc auccaccagu aacagauccu 3480uaagacaaua gaaucauaca
guauucaaac cagcagccuu cucaaauuug agcaaaaacu 3540cuaucaaccu cugguaaagu
uccuacacua gucacagaag guguuaacuu ucuacucuga 3600uucugucucc auaauggggu
aaacuguuga uaguuuaccc caucaacaga uggucgguaa 3660auuauugauu cgaagaaucg
agagagugca gcaacauaaa ucuguuaaug ucugaucaag 3720cuccugcccu guucuccgaa
uucagcuuca uaauuaaggg aaggccuguu uucuauccuc 3780agauuuaggu ucuaguagca
guuguguaac cacuagugag ucacuuaacu ccucuggguc 3840cccauuucuc augugcaaca
agaaagaggg gaacuggaga ugaucacucu aguuccagac 3900aagggaacau uucacacuuu
guuuacuuca gggugauguc ccugaguccu cauuagugac 3960ugcguccuuu ggaaguuauc
ccaacccugc uuuucucaaa agugaaaaug uauaggcucu 4020cagaggagac agauuuaacu
cugcuucucu aauguuauug aauuaaaagc uguucacauu 4080agugguuauu aaauauugaa
auaacacugg gaagaaaaag cauauauaaa uacagcuaaa 4140aacaagaaua gauauucauu
cucacaaagg gagacagcaa agaaaaugga aagugcacug 4200gugcuagcgu uagacagcuu
guguuaaugu cucaauucug cuacuaacug guugcagcuu 4260gugugaccuu gggcacauug
uaugaucucg cagaauauca ucccaaaucu gcaaaaugga 4320auuggcauca ucucuuuugc
aagauuguua ugagaauuaa aagguucuuc auucaauaua 4380auaauaaaua uuuuguauau
aaaugaauau caauuaaaag uuaugacuaa uuccacaagu 4440caaacauaua aauuuuauuu
cuugauucau gauaugugau aguauucaua aaaauguaca 4500ugcaugauaa uuucaaggaa
uaaguauaua ugugagaauc auggaaauga aauuaauaau 4560auuaacuagu aauuaaauug
uaa 4583
21 9648 RNA Homo sapiens
21 gguuuguaau gauagggcgg cagcagcagc agcagcagca gugguggaac gaggaggugg
60agaauugaga gcacgaugca uacacaggug uuucugagua guaauuagau cgcugugaag
120gaaaaagcac accuuugagu uuucaccugu gaacacuaua gcgcugagag agacagucug
180aaagcagagg aagacaucga ucaguaacac caagagacac caaaguugaa aguuuuguuu
240ucuuucccuc uguuuuauuu uucccccgug ugucccuacu auggucagaa agccuguugu
300guccaccauc uccaaaggag guuaccugca gggaaauguu aacgggaggc ugccuucccu
360gggcaacaag gagccaccug ggcaggagaa agugcagcug aagaggaaag ucacuuuacu
420gaggggaguc uccauuauca uuggcaccau cauuggagca ggaaucuuca ucucuccuaa
480gggcgugcuc cagaacacgg gcagcguggg caugucucug accaucugga cggugugugg
540gguccuguca cuauuuggag cuuugucuua ugcugaauug ggaacaacua uaaagaaauc
600uggaggucau uacacauaua uuuuggaagu cuuuggucca uuaccagcuu uuguacgagu
660cuggguggaa cuccucauaa uacgcccugc agcuacugcu gugauauccc uggcauuugg
720acgcuacauu cuggaaccau uuuuuauuca augugaaauc ccugaacuug cgaucaagcu
780cauuacagcu gugggcauaa cuguagugau gguccuaaau agcaugagug ucagcuggag
840cgcccggauc cagauuuucu uaaccuuuug caagcucaca gcaauucuga uaauuauagu
900cccuggaguu augcagcuaa uuaaagguca aacgcagaac uuuaaagacg ccuuuucagg
960aagagauuca aguauuacgc gguugccacu ggcuuuuuau uauggaaugu augcauaugc
1020uggcugguuu uaccucaacu uuguuacuga agaaguagaa aacccugaaa aaaccauucc
1080ccuugcaaua uguauaucca uggccauugu caccauuggc uaugugcuga caaauguggc
1140cuacuuuacg accauuaaug cugaggagcu gcugcuuuca aaugcagugg cagugaccuu
1200uucugagcgg cuacugggaa auuucucauu agcaguuccg aucuuuguug cccucuccug
1260cuuuggcucc augaacggug guguguuugc ugucuccagg uuauucuaug uugcgucucg
1320agagggucac cuuccagaaa uccucuccau gauucauguc cgcaagcaca cuccucuacc
1380agcuguuauu guuuugcacc cuuugacaau gauaaugcuc uucucuggag accucgacag
1440ucuuuugaau uuccucaguu uugccaggug gcuuuuuauu gggcuggcag uugcugggcu
1500gauuuaucuu cgauacaaau gcccagauau gcaucguccu uucaaggugc cacuguucau
1560cccagcuuug uuuuccuuca caugccucuu caugguugcc cuuucccucu auucggaccc
1620auuuaguaca gggauuggcu ucgucaucac ucugacugga gucccugcgu auuaucucuu
1680uauuauaugg gacaagaaac ccaggugguu uagaauaaug ucagagaaaa uaaccagaac
1740auuacaaaua auacuggaag uuguaccaga agaagauaag uuaugaacua auggacuuga
1800gaucuuggca aucugcccaa ggggagacac aaaauaggga uuuuuacuuc auuuucugaa
1860agucuagaga auuacaacuu uggugauaaa caaaaggagu caguuauuuu uauucauaua
1920uuuuagcaua uucgaacuaa uuucuaagaa auuuaguuau aacucuaugu aguuauagaa
1980agugaauaug caguuauucu augagucgca caauucuuga gucucugaua ccuaccuauu
2040gggguuagga gaaaagacua gacaauuacu auguggucau ucucuacaac auauguuagc
2100acggcaaaga accuucaaau ugaagacuga gauuuuucug uauauauggg uuuuguaaag
2160augguuuuac acacuauaga ugucuauacu gugaaaagug uuuucaauuc ugaaaaaaag
2220cauacaucau gauuauggca aagaggagag aaagaaauuu auuuuacauu gacauugcau
2280ugcuuccccu uagauaccaa uuuagauaac aaacacucau gcuuuaaugg auuauaccca
2340gagcacuuug aacaaagguc aguggggauu guugaauaca uuaaagaaga guuucuaggg
2400gcuacuguuu augagacaca uccaggaguu auguuuaagu aaaaauccuu gagaauuuau
2460uaugucagau guuuuuucau ucauuaucag gaaguuuuag uuaucuguca uuuuuuuuuu
2520ucacaucagu uugaucagga aaguguauaa cacaucuuag agcaagaguu aguuugguau
2580uaaauccuca uuagaacaac caccuguuuc acuaauaacu uaccccugau gagucuaucu
2640aaacauaugc auuuuaagcc uucaaauuac auuaucaaca ugagagaaau caccaacaaa
2700gaagauguuc aaaauaauag ucccauaucu guaaucauau cuacaugcaa uguuaguaau
2760ucugaaguuu uuuaaauuua uggcuauuuu uacacgauga ugaauuuuga caguuugugc
2820auuuucuuua uacauuuuau auucuucugu uaaaauaucu cuucagauga aacuguccag
2880auuaauuagg aaaaggcaua uauuaacaua aaaauugcaa aagaaauguc gcuguaaaua
2940agauuuacaa cugauguuuc uagaaaauuu ccacuucuau aucuaggcuu ugucaguaau
3000uuccacaccu uaauuaucau ucaacuugca aaagagacaa cugauaagaa gaaaauugaa
3060augagaaucu guggauaagu guuuguguuc agaagauguu guuuugccag uauuagaaaa
3120uacugugagc cgggcauggu ggcuuacauc uguaauccca gcacuuuggg aggcugaggg
3180gguggaucac cugaggucgg gaguucuaga ccagccugac caacauggag aaaccccauc
3240ucuacuaaaa auacaaaauu agcugggcau gguggcacau gcugguaauc ucagcuauug
3300aggaggcuga ggcaggagaa uugcuugaac ccgggaggcg gagguugcag ugagccaaga
3360uugcaccacu guacuccagc cugggugaca aagucagacu ccaucuccaa aaaaaaaaga
3420uuauauauau auauauaugu guguguaugu gugugugugu gugugugugu auauauauau
3480auauauauau acacacacac acacacacuu uuuauauaua uauauauaua uauauagugg
3540aacuuacaaa ugagaguaau auaaugauga aauuuugaac uguuauuuau aaacaucuaa
3600gguaaaaugg uuagucaugg ccagaguaug uuucauccuu uaauuuuugu ccauuugaaa
3660auaaggauuu uugaaagaau uauaccaauu aaaauuauua aaggcaaaca uagaauucau
3720aaaaaauugu ccaaaguaga aaugaugacc uauaauuugg agcauuucca auucaguaau
3780uucaauuuug cucuugaaaa cauuuaauau auauccaaga cugacauuuc uuuagcugaa
3840ccuaacguuu gggucucuga gugaauuuau aauaacuccu uccuuccuua gcauaggguu
3900uucaaaauuu gauuuauaau uccuauuucc aguaaauauu guucauuugu ccacaucucu
3960cccuaugaua uguugcugga gguaagaauu ucuuucauau uccuauuuuu uuuuucccca
4020uagacuaggc ucauagaauu uaaacaagca aauuuuccug agcuuuuucu ugccaaauga
4080aagaagacug guaaauucuc auagagaggu uuguguaguu cuuggcucuu ccugggguua
4140augugcuuau auucacagug gcaaauuggu cucagacuuu aauuuauuua uuuuugauuu
4200gaauuucucu uuaaaaguau caauuuaaaa gguaacuaga auuauucuuu cucauuuuca
4260aaagugauuu uugcauuauu aaauuucccu gccauuguaa ugccauuuca cgcagaaaaa
4320aagucagcca guaauuaaga aaaaaaguga uggagauuaa guaguauuuu ggcuuauuuu
4380uaggacucau caugagaaga cacaguuccu uuaaucagga aauuaauauc cauaauuuuc
4440acucaaaauu gcaguaugua aagcagauuc ucaaaaacuc uccugaacac uuauuuauau
4500auauguuuuu auauaaguaa aauuuuucuc auauuuuuau acgauaugca cacacacaca
4560uacaugcaca uacuacuuac uacauguucu guacuuguac uuuguaccau gcauauucaa
4620auguuuauau acauaaguuu auuauaacau aaacaguaaa aguaaugaau acuguuuaaa
4680auaacuaaua uaguauuuuu uaauuuuugu ggggauggau ucucaaauac uugugauuuu
4740aaaagauucu aaagcuaaaa cacaacuuga uuuuaaaaag aaugauucuc cuuacacaau
4800uauaaauauu ugcaguaaau auuuuccuua uaauacuguu uugaccccau uuaaaaagua
4860uuagauuaua uuccuuugau ccaaugaaaa cugaaccuua uaaaugguua gcugaaagua
4920gaccuuauuc uuguccuucu uuagaagagu aaagauuugu ccuagggaag auggcugacu
4980ucgguuccca acaugcguau gcauuuagac uguagcuccu cagcccugug gacacaaaau
5040uuggacagcu uauuagguua cguuagcaau gcaugacggu uucuccaaca cuaagauauu
5100cacguugaaa cagauuuccu guucgucuua ugugucuggu aaaauuguuu ccccaauuac
5160aauuugacau aucaauagag gguuaacaag aguauaauua cauaacagaa uuccucauga
5220acuguaauca gucuacagga aaaucauuau uuuaucuuga uuugcagaug aauauacugc
5280uaagaaaggg agcaacucug accuuuguua aaguugaucu uuuguaauug agguauaagg
5340uaugaaaaga uaaaaaaccg aaggccagag aaucaggaaa ugaaagauag uauggacuga
5400agguaacaau auuuuaaugu uaugcaauau agucagagaa auauuaaaaa uuaguuguuu
5460gcugugcaua gguggaucuc gcaggaagcu aaugaaaccu aagcuucagu gccucucacu
5520uagacauguu ccauucgagg uccugaaccu aacuuuguau uaggaauucu guacuaauuu
5580uguugaagaa gaccagcaaa guuguguaca cuucuacccc cacaaaaucu gcauugucca
5640ugugaguaaa guaaaauaau uccuguuauu uuuuucuguu agaaauaagu auggaggaua
5700uguuuuuaaa aauuuaugag uuaauugaaa uauccauaua uaacaaguga cuuucucaca
5760auauauauga ugugauauau agggagauag uuucacuuuc aucauauuuu auacguugau
5820ucugaacuau agaaaaauaa uaaaugggau uuuaauuaua gcucuuaguu gggaaagaaa
5880uauagagaga ugugggauuu gaaugcccau gaaagacauu uuauuuuacu ugaauauauu
5940cuugcuucac uuuacccucc auaauauguu guacauuagu gcugaucaag uuuacagagu
6000uacauuuugc uuuccuaacc auucagucag gaauuaaaau auggcauugu auaacaacug
6060ggaagaagcu cauaguggau auaaauuaga guagauaaug ggucaccuug auagccucug
6120uuuacauuac uuguauaugg gcaaaauaau uauuaccuau acguguauuu aagcuuaauu
6180uucauauaaa caguauuuuu aaucuauguu aaaauagaua auaucuaaaa gugugaucuc
6240uagguagucc uuaguuuauu aguacuguac uucaaaaaga uuuuuaaaua gguccggcac
6300gguggcucau gccuguaauc ccagcacuuu gggaggcuga ggcgggcgaa ucaccugagg
6360ucaggaguuc gagaucagcc uggccaacau ggugaaaccc ugucucaacu aaaaauauaa
6420aaauuagccg ggcguggugg caggcgccug uaaucccagc uacucgggag gcugaggcag
6480gagaaucacu ugaacccaag gggcagaagc ugcaguuagc caagaucgca ucauugcacu
6540ccagccuagg ggacaagagc gcgagacuuc aucucaaaaa aaaaaaaaaa aaaaaaaaaa
6600gauuuuuaaa uaauagcuaa agguaugcuc ucuaggucau ccuuaguuua uuaguacugu
6660acuuaaaaau uauuuuuuaa uagucaauuu ugggagauaa uuauuucuuu ccuuauauuu
6720uccaauuagu uggugucuaa aaauaaaugu uuugucuaau uuuagaucag guauacauuc
6780acaaaagcau aaaucauagu cucacaggaa auucaccaau uuuccauaug ucgugagaua
6840acuguccuuu cuacaaccuc auaacaauga auuuauauaa uuaccuagau uuucuuagug
6900ugaaucuacc cauuaguuuu auuuucuugg uaguuauuuu uuucccuccu cucuguuacu
6960auuggccuua aaauacacag aggacgguua caguguccua auagcuguua caugugugug
7020uuucagcgua cuugaaucaa guguacauuu auaguaccaa uaaccgccuu uacagcuuua
7080caguuaacaa uucucucaca aaacuguaga gcauuaggca ucugagagcc auagagggcc
7140aacuuuguuc cagagugaac augcuuuuuu uccucaacau auacacuacu gauuuuuuuu
7200aaaaguauga cuuucaagug aauuaaugua uugguuagga gaacugcuug cuaaguccuu
7260auuaccucuu guuaaagccu cagaaggccg ugcugaaagc cagaggggaa aaaaagagua
7320augcacaggu aucucuuuug caguggugac uguauuuuga guaccuugug ugacagggua
7380uuauuacagc aucuuguggg aaaaccuauu aggccuuugc auguuaaagc uguauaauuu
7440guuggguugu gaguggucug acuuaaaugu guauuauaaa auuuagacau caaauuuucc
7500uacuaacuaa cuuuauuaga ugcauacuug gaagcacagu cauaucacac ugggaggcaa
7560ugcaaugugg uuaccugguc cuagguuuga acugucuuau uucaaaagau uucugaauua
7620auuuuucccu agaauuucuc cuucauucca aaguacaaac auacuuugaa gaaugaaaca
7680gauuguuccc augaauguau gcucauacuc gacuagaaac gaucuauguu aaaugacugu
7740guauaugaau uauuucaagu acuaccccaa auaacuuucu uauugcucug aaagaagaaa
7800agcaauguaa aucacuauga uuauugcaca aacaaccaga auucuccaac aauuuuaagu
7860aaucugaucc ucuucuugga gaaaauuguu accuaauagu uuuuccuuau gaauguuauu
7920acuacuggua uaaaucaaau uucuauaaau uuccuacuua agucuuaaga acuggguucu
7980uccuuugaug uuauucaugu ucagaaagga aacaacacuu uacucuuuua ggacaauucc
8040uagaaucuau aguaguauca ggauauauuu ugcuuuaaaa uauauuuugg uuauuuugaa
8100uacagacauu ggcuccaaau uuucaucuuu gcacaauagu augacuuuuc acuagaacuu
8160cucaacauuu gggaacuuug caaauaugag caucauaugu guuaaggcug uaucauuuaa
8220ugcuaugaga uacauuguuu ucucccuaug ccaaacaggu gaacaaacgu aguuguuuuu
8280uacugauacu aaauguuggc uaccugugau uuuauaguau gcacauguca gaaaaaggca
8340agacaaaugg ccucuuguac ugaauacuuc ggcaaacuua uugggucuuc auuuucugac
8400agacaggauu ugacucaaua uuuguagagc uugcguagaa uggauuacau gguagugaug
8460cacugguaga aaugguuuuu aguuauugac ucagaauuca ucucaggaug aaucuuuuau
8520gucuuuuuau uguaagcaua ucugaauuua cuuuauaaag augguuuuag aaagcuuugu
8580cuaaaaauuu ggccuaggaa ugguaacuuc auuuucaguu gccaaggggu agaaaaauaa
8640uauguguguu guuauguuua uguuaacaua uuauuaggua cuaucuauga auguauuuaa
8700auauuuuuca uauucuguga caagcauuua uaauuugcaa caaguggagu ccauuuagcc
8760cagugggaaa gucuuggaac ucagguuacc cuugaaggau augcuggcag ccaucucuuu
8820gaucugugcu uaaacuguaa uuuauagacc agcuaaaucc cuaacuugga ucuggaaugc
8880auuaguuaug accuuguacc auucccagaa uuucaggggc aucguggguu uggucuagug
8940auugaaaaca caagaacaga gagauccagc ugaaaaagag ugauccucaa uauccuaacu
9000aacugguccu caacucaagc agaguuucuu cacucuggca cugugaucau gaaacuuagu
9060agaggggauu guguguauuu uauacaaauu uaauacaaug ucuuacauug auaaaauucu
9120uaaagagcaa aacugcauuu uauuucugca uccacauucc aaucauauua gaacuaagau
9180auuuaucuau gaagauauaa auggugcaga gagacuuuca ucuguggauu gcguuguuuc
9240uuaggguucc uagcacugau gccugcacaa gcaugugaua ugugaaauaa aauggauucu
9300ucuauagcua aaugaguucc cucuggggag aguucuggua cugcaaucac aaugccagau
9360gguguuuaug ggcuauuugu guaaguaagu gguaagaugc uaugaaguaa guguguuugu
9420uuucaucuua uggaaacucu ugaugcaugu gcuuuuguau ggaauaaauu uuggugcaau
9480augaugucau ucaacuuugc auugaauuga auuuugguug uauuuauaug uauuauaccu
9540gucacgcuuc uaguugcuuc aaccauuuua uaaccauuuu uguacauauu uuacuugaaa
9600auauuuuaaa uggaaauuua aauaaacauu ugauaguuua cauaauaa 9648
SEQ ID NO: 22; length: 2704; type: RNA; organism: Homo sapiens
cggcaccugg cgagcggagc cggagucggg cuggggaccg
cggggucgag gccggaccgc 60ggcggggucg ggggagaaac gcgcgcugcc cuggcacggg
cccccccccc cggccgcgcg 120gaaugguaug gcccggccgg aguuaaggcc ggggggaggc
ggcgaguccc gcggcggcgg 180cgacgauggg gcugcgugca ggaggaacgc ugggcagggc
cggcgcgggu cggggggcgc 240ccgaggggcc cgggccgagc ggcggcgcgc agggcggcag
cauccacucg ggccgcaucg 300ccgcggugca caacgugccg cugagcgugc ucauccggcc
gcugccgucc guguuggacc 360ccgccaaggu gcagagccuc guggacacga uccgggagga
cccagacagc gugcccccca 420ucgauguccu cuggaucaaa ggggcccagg gaggugacua
cuucuacucc uuugggggcu 480gccaccgcua cgcggccuac cagcaacugc agcgagagac
cauccccgcc aagcuugucc 540aguccacucu cucagaccua aggguguacc ugggagcauc
cacaccagac uugcaguagc 600agccuccuug gcaccugcug ccaccuucaa gagcccagaa
gacacaccug gccuccagca 660ggcugggcca ugcagaaggg auagcagggg ugcauucucu
uugcaccugg cgagaggguc 720ugacucuggg caccccucuc accagcuaca aggccuugga
cucacuguac agugugggag 780ccccaguucc caccucugug acaauaggau cauggccuua
cccuugaagc auuaccgaga 840aggagaacag agaugggcuu gaagagccac gugcugccgg
cuccaaauuc ccaaggacaa 900ggaucccucu gcauuuuugu cuauguaacc ucuuauaugg
acuacauuca gcugcaagga 960aaggaaaacc uugauugcag ugguuuaaac aaacagaaga
uuguuuuucc acauagcaug 1020gauucuggag auggguggcu aaugguauug guucaacaac
uccacgaagg uaggggucac 1080gucuuggauc cuuuugccuu aaucucagug cucguuacuu
caugguccca agauggcugc 1140uguaucccca agaaucaugu cugcguucaa ggaaggaggg
guggaggaag aggaagggcc 1200aaacuagcug gacccgucac cuucuaucag aaaguaaaac
cucgucagaa gucuguuucc 1260ugcucucucc cucugcauau cuucacuuag augcccuugg
cccgagccag cuaccauugc 1320accucuagcu gcaaacaaag cuaagacagc agggaacaga
auugucaugg cugaauagac 1380caaucguguu ccaucuacug agacuggcac acugccuccu
gcaauaaaac ugggauccca 1440uuaccaagag agaaaugcag aauuguguac caguuagcuu
uugcugugua acaaaccauc 1500cccaaacuug gcagcuagaa acaaacccug uauuuuccca
caauccuaug gguuggcaau 1560uugggcuggg cucaacaggg caguucugcu gcucacaccu
gggaucccuc auggagcuaa 1620ggucagcugu uaccucagcu gggccuggau ggucuaggau
agccuuacuc acuugccugg 1680caggugacag gcuguuggcu ggaauugcuu gguucuccuc
cauguggccu cuccagcagg 1740cuagcucagg cuuauucaca ugauggcuuc aggauuccaa
agagagugag aguagaagcu 1800gaaagacuuc uugaguucuu ggccuggaac ugggacuagg
acagugucac uucugcuaag 1860uucuuuuggu cagagcaaau cacaaggcuu uacccagauu
caagggauga gaaacagacu 1920acaugucuug augaggggaa ccacaaagag cuuguggcca
uuuuucaccu aucacaaaua 1980auuuuggaug gguauuuauu uggauaaagg uauuucccuc
uucccccuuu cucucugucu 2040cauggggccu cacucugcca aguuggaagg cacuaagaca
uuguccuggc ccucaggguc 2100uaggggaaga gguguugggg caggaaguga gucucuccau
gggcuggacc cacuguagua 2160ggagugccuc cuugucugca cugcugguau gggguuaggc
cagguaggac auuccagagg 2220ggcuucugaa aaccaagagu cccuggggaa agggaacaga
guaaggcagg ccuuguucuc 2280acugcccucu aagggaacuu ggucacucgg cacuuuuaag
ccucaguuuc uccaguucaa 2340uaauaaggac aagagcuuuu cccaugcauu cucuuucccc
gggaaaguug acugagguga 2400ccaguaauag aauugaaaag ggagaguguc uucagugcaa
uguggcaucc uggauugggu 2460cuuggaacaa aaacaggaca uuagugggaa aauuggaaau
cugaaaaaag ucugaauuuu 2520aguuaauaua ccaauuucag ucucuugguu uugacagaug
uaccauggug auguaagaug 2580uugaccuugg gguaggcugg gugaagggua uacaggaacu
cuuuguacua ucucugcaac 2640uucucuguaa aucuaguauc auuccaaaau aaaaguuuau
uuaauuuaaa aaaaaaaaaa 2700aaaa 2704
SEQ ID NO: 23; length: 1319; type: RNA; organism: Homo sapiens
cgcgcccguc ccgucgccgc
cgccgccgcc gcagaccccu cggucuugcu augucgagcu 60cacccgugaa gcgucagagg
auggaguccg cgcuggacca gcucaagcag uucaccaccg 120ugguggccga cacgggcgac
uuccacgcca ucgacgagua caagccccag gaugcuacca 180ccaacccguc ccugauccug
gccgcagcac agaugcccgc uuaccaggag cugguggagg 240aggcgauugc cuauggccgg
aagcugggcg ggucacaaga ggaccagauu aaaaaugcua 300uugauaaacu uuuuguguug
uuuggagcag aaauacuaaa gaagauuccg ggccgaguau 360ccacagaagu agacgcaagg
cucuccuuug auaaagaugc gaugguggcc agagccaggc 420ggcucaucga gcucuacaag
gaagcuggga ucagcaagga ccgaauucuu auaaagcugu 480caucaaccug ggaaggaauu
caggcuggaa aggagcucga ggagcagcac ggcauccacu 540gcaacaugac guuacucuuc
uccuucgccc aggcuguggc cugugccgag gcggguguga 600cccucaucuc cccauuuguu
gggcgcaucc uugauuggca uguggcaaac accgacaaga 660aauccuauga gccccuggaa
gacccugggg uaaagagugu cacuaaaauc uacaacuacu 720acaagaaguu uagcuacaaa
accauuguca ugggcgccuc cuuccgcaac acgggcgaga 780ucaaagcacu ggccggcugu
gacuuccuca ccaucucacc caagcuccug ggagagcugc 840ugcaggacaa cgccaagcug
gugccugugc ucucagccaa ggcggcccaa gccagugacc 900uggaaaaaau ccaccuggau
gagaagucuu uccguugguu gcacaacgag gaccagaugg 960cuguggagaa gcucucugac
gggauccgca aguuugccgc ugaugcagug aagcuggagc 1020ggaugcugac agaacgaaug
uucaaugcag agaauggaaa guagcgcauc ccugaggcug 1080gacuccagau cugcaccgcc
ggccagcugg gaucugacug cacguggcuu cugaugaauc 1140uugcguuuuu uacaaauugg
agcagggaca gaucauagau uucugauuuu auguaaaauu 1200uugccuaaua cauuaaagca
gucacuuuuc cugugcuguu ucaaaaaaaa aaaaaaaaaa 1260aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa 1319
SEQ ID NO: 24; length: 2920; type: RNA; organism: Homo sapiens
gauuuaugga aagcaccugg ugcguguucu gcccucugag gacccugccc uuucccaaaa
60gggguaucuu aagaaggaaa auauggcugc ucuuugccgg acagcagagu cccaguagga
120uuuccauuuu ccaaaccugg uaucaucucc uaguuggaag aagugguaag cccacgaaca
180caaaugcagg agggagaggu gccaagaagc agcgguacac agcaggcaca gagacacgug
240gucuucagca gagccuaugg gguucagaug auucacauaa gaauagaagu uucagggcug
300gaccugggga ggcagccuga gccugagccg gcuguccuga gccugaguac ucuagcugcc
360uugucgucau cgcaucuggc ugccauccag cgccagcaca caguaaugag uggccgagcu
420uccucuggga gggaggaaac aguuaaaauc uugcagcagc ugcaaucauc uaggcguggu
480ucucuugucu gacuugggcu gcacagaucc ugggccaagg gacagaagaa agacagccua
540ggagcagagc cucccagaug gcugaguugg aucuaauggc uccagggcca cugcccaggg
600ccacugcuca gcccccagcc ccucucagcc cagacucugg gucacccagc ccagauucug
660ggucagccag cccaguggaa gaagaggacg ugggcuccuc ggagaagcuu ggcagggaga
720cggaggaaca ggacagcgac ucugcagagc agggggaucc ugcuggugag gggaaagagg
780uccuguguga cuucugccuu gaugacacca gaagagugaa ggcagugaag uccugucuaa
840ccugcauggu gaauuacugu gaagagcacu ugcagccgca ucaggugaac aucaaacugc
900aaagccaccu gcugaccgag ccagugaagg accacaacug gcgauacugc ccugcccacc
960acagcccacu gucugccuuc ugcugcccug aucagcagug caucugccag gacuguugcc
1020aggagcacag uggccacacc auagucuccc uggaugcagc ccgcagggac aaggaggcug
1080aacuccagug cacccaguua gacuuggagc ggaaacucaa guugaaugaa aaugccaucu
1140ccaggcucca ggcuaaccaa aagucuguuc uggugucggu gucagagguc aaagcggugg
1200cugaaaugca guuuggggaa cuccuugcug cugugaggaa ggcccaggcc aaugugaugc
1260ucuucuuaga ggagaaggag caagcugcgc ugagccaggc caacgguauc aaggcccacc
1320uggaguacag gagugccgag auggagaaga gcaagcagga gcuggagagg auggcggcca
1380ucagcaacac uguccaguuc uuggaggagu acugcaaguu uaagaacacu gaagacauca
1440ccuucccuag uguuuacgua gggcugaagg auaaacucuc gggcauccgc aaaguuauca
1500cggaauccac uguacacuua auccaguugc uggagaacua uaagaaaaag cuccaggagu
1560uuuccaagga agaggaguau gacaucagaa cucaaguguc ugccguuguu cagcgcaaau
1620auuggacuuc caaaccugag cccagcacca gggaacaguu ccuccaauau gcguaugaca
1680ucacguuuga cccggacaca gcacacaagu aucuccggcu gcaggaggag aaccgcaagg
1740ucaccaacac cacgcccugg gagcaucccu acccggaccu ccccagcagg uuccugcacu
1800ggcggcaggu gcugucccag cagagucugu accugcacag guacuauuuu gagguggaga
1860ucuucggggc aggcaccuau guuggccuga ccugcaaagg caucgaccgg aaaggggagg
1920agcgcaacag uugcauuucc ggaaacaacu ucuccuggag ccuccaaugg aacgggaagg
1980aguucacggc cugguacagu gacauggaga ccccacucaa agcuggcccu uuccggaggc
2040ucggggucua uaucgacuuc ccgggaggga uccuuuccuu cuauggcgua gaguaugaua
2100ccaugacucu gguucacaag uuugccugca aauuuucaga accagucuau gcugccuucu
2160ggcuuuccaa gaaggaaaac gccauccgga uuguagaucu gggagaggaa cccgagaagc
2220cagcaccguc cuuggugggg acugcucccu agacuccagg agccauaucc cagaccuuug
2280ccagcuacag ugaugggauu ugcauuuuag ggugauuugg gggcagaaau aacugcugau
2340gguagcuggc uuuugaaauc cuaugggguc ucugaaugaa aacauucucc agcugcucuc
2400uuuugcucca uauggugcug uucucuaugu guuugcagua auucuuuuuu uuuuuuuuga
2460gacggagucu cgcacuguug cccaggcugg agagcagugg cgcgaucuug gcucacugca
2520agcuccgccu cccgaguuca agcaauucuc cugccucagc cucccgagua gcugggauua
2580caggugccug ccaccacacc cagcuaaugu uuuguauuuu uaguagagau gggguuucac
2640cauguuggcc aggcagaucu caaacuccug accucgugau gcacccaccu cggccuccca
2700aagugcuggg auuacaugcg ugagccacug cgcccugccu guuuguagua auuuuuaggc
2760accaaaucuc ccucaucuuc uagugccauu cuccucucug uucagguaaa ugucacacug
2820ugcccagaau ggaugaccag gaaccuuaaa gaguggcuga aaagauugca gaguuaucau
2880aauaaauugc uaacuugcgu aaaaaaaaaa aaaaaaaaaa 2920
SEQ ID NO: 25; length: 1982; type: RNA; organism: Homo sapiens
gugccucccg gcugguuucu gagaucagcc acagaaguua
aacuucuuuc cagggaagaa 60gggcggggau gucagggcug gagagugccc guguccuucu
gugugcauug ggcuccuucc 120uccuuaauuc ucugcuuucc acuuuuaggc ugaacuccag
ugcacccagu uagacuugga 180gcggaaacuc aaguugaaug aaaaugccau cuccaggcuc
caggcuaacc aaaagucugu 240ucuggugucg gugucagagg ucaaagcagu ggcugaaaug
caguuugggg aacuccuugc 300ugcugugagg aaggcccagg ccaaugugau gcucuucuua
gaggagaagg agcaagcugc 360gcugagccag gccaacggua ucaaggccca ccuggaguac
aggagugccg agauggagaa 420gaguaagcag gagcuggaga cgauggcggc caucagcaac
acuguccagu ucuuggagga 480guacugcaag uuuaagaaca cugaagacau caccuucccu
aguguuuaca uagggcugaa 540ggauaaacuc ucgggcaucc gcaaaguuau cacggaaucc
acuguacacu uaauccaguu 600gcuggagaac uauaagaaaa agcuccagga guuuuccaag
gaagaggagu augacaucag 660aacucaagug ucugccauug uucagcgcaa auauuggacu
uccaaaccug agcccagcac 720cagggaacag uuccuccaau augugcauga caucacguuc
gacccggaca cagcacacaa 780guaucuccgg cugcaggagg agaaccgcaa ggucaccaac
accacgcccu gggagcaucc 840cuacccggac cuccccagca gguuccugca cuggcggcag
gugcuguccc agcagagucu 900guaccugcac agguacuauu uugaggugga gaucuucggg
gcaggcaccu auguuggccu 960gaccugcaaa ggcaucgacc agaaagggga ggagcgcagc
aguugcauuu ccggaaacaa 1020cuucuccugg agccuccaau ggaacgggaa ggaguucacg
gccugguaca gugacaugga 1080gaccccacuc aaagcuggcc cuuucuggag gcucgggguc
uauauugacu ucccaggagg 1140gauccuuucc uucuauggcg uagaguauga uuccaugacu
cugguucaca aguuugccug 1200caaguuuuca gaaccagucu augcugccuu cuggcuuucc
aagaaggaaa acgccauccg 1260gauuguagau cugggagagg aacccgagaa gccagcaccg
uccuuggugg ggacugcucc 1320cuagacucca ggagccauau cccagaccuu ugccagcuac
agugauggga uuugcauuuu 1380agggugauuu gggggcaaaa auaacugcug augguagcug
gcuuuugaaa uccuaugggg 1440ucucugaaug aaaacauucu ccagcugcuc ucuuuugcuc
cauauggugc uguucucuau 1500guguuugcag uaauucuuuu uuuuuuuuuu uuugagacgg
agucucgcac uguugcccag 1560gcuggagugc aguggcguga ucuuggcuca cugcaagcuc
cgccucccga guucaagcaa 1620uucuccugcc ucagccuccc gaguagcugg gauuacaggu
gccugccacc acacccagcu 1680aacguuuugu auuuuuagua gagauggggu uucaccaugu
uggccaggca gaucucaaac 1740uucugaccuc gugaugcacu caccucggcc ucccaaagug
cugggauuac aggcgugagc 1800cacugcgccc ugccuguuug uaguaauuuu uaggcaccaa
aucucccuca ucuucuagug 1860ccauucuccu cucuguucag guaaauguca cacugugccc
agaauggaug accaggaacc 1920uucaagagug gcugaaaaga uugcagaguu aucauaauaa
auugcuaacu ugcguauuuc 1980cu 1982
SEQ ID NO: 26; length: 826; type: RNA; organism: Homo sapiens
cucgcaggcu ccaggggcgg
ggcguggccg gggcgcagcg acgggcgcgg agguccggcc 60gggcgcgcgc gcccccgcca
cacgcacgcc gggcgugcca guuuauaaag ggagagagca 120agcagcgagu cuugaagcuc
uguuuggugc uuuggaucca uuuccaucgg uccuuacagc 180cgcucgucag acuccagcag
ccaagauggu gaagcagauc gagagcaaga cugcuuuuca 240ggaagccuug gacgcugcag
gugauaaacu uguaguaguu gacuucucag ccacguggug 300ugggccuugc aaaaugauca
agccuuucuu ucaugauguu gcuucagagu gugaagucaa 360augcaugcca acauuccagu
uuuuuaagaa gggacaaaag gugggugaau uuucuggagc 420caauaaggaa aagcuugaag
ccaccauuaa ugaauuaguc uaaucauguu uucugaaaau 480auaaccagcc auuggcuauu
uaaaacuugu aauuuuuuua auuuacaaaa auauaaaaua 540ugaagacaua aacccaguug
ccaucugcgu gacaauaaaa cauuaaugcu aacacuuuuu 600aaaaccgucu caugucugaa
uagcuuucaa aauaaaugug aaauggucau uuaauguauu 660uuccuauauu cucaaucacu
uuuuaguaac cuuguaggcc acugauuauu uuaagauuuu 720aaaaauuauu auugcuaccu
uaauguauug cuacaaaaau cucuuguugg gggcaaugca 780gguaauaaag uaguauguug
uuauuuguaa aaaaaaaaaa aaaaaa 826
SEQ ID NO: 27; length: 3846; type: RNA; organism: Homo sapiens
agacccucac gugaugacaa cagcuagcaa aguucuguag cuacugccuu agggcauagu
60cuaauuucuu caguaaaaac acacuuauuc caaauuuggu uccagaauug ccuuaaauug
120uuuuugcucu guucuuaggu ugggggcggc uaugagcagg cagaggaugu ggugucaccc
180aauuaggagc ucucagcuua cgaggcaauu agcauagguu gccagggcug cacgaggagu
240ggauuucugc uuugucauuc ugacucuggc aguuagcccg cccgcucggc gcagggcgug
300gcuucucgua gccauuagga aacagcaacc cuuucaccuc aguuuucuuc acuccggcau
360uugcagcaga gcgaaaggug gucgaguccu gaaggagggc cugaugucuu caucauucuc
420aaauucuuag gacggucggg cccuggaagg aacgcucucg gaauuggccg cggaaaccga
480ucugcccguu guguuuguga aacagagaaa gauaggcggc caugguccaa ccuugaagga
540ggcagcccaa uauggcaaga aggugauggu ccuggacuuu gucacuccca ccccucuugg
600aacuagaugg ggucucggag gaacaugugu gaaugugggu ugcauaccua aaaaacugau
660gcaucaagca gcuuuguuag gacaagcccu gcaagacucu cgaaauuaug gauggaaagu
720cgaggagaca guuaagcaug auugggacag aaugauagaa gcuguacaga aucacauugg
780cucuuugaau uggggcuacc gaguagcucu gcgggagaaa aaagucgucu augagaaugc
840uuaugggcaa uuuauugguc cucacaggau uaaggcaaca aauaauaaag gcaaagaaaa
900aauuuauuca gcagagagau uucucauugc cacuggugaa agaccacguu acuugggcau
960cccuggugac aaagaauacu gcaucagcag ugaugaucuu uucuccuugc cuuacugccc
1020ggguaagacc cugguuguug gagcauccua ugucgcuuug gagugcgcug gauuucuugc
1080ugguauuggu uuagacguca cuguuauggu uagguccauu cuucuuagag gauuugacca
1140ggacauggcc aacaaaauug gugaacacau ggaagaacau ggcaucaagu uuauaagaca
1200guucguacca auuaaaguug aacaaauuga agcagggaca ccaggccgac ucagaguagu
1260agcucagucc accaauagug aggaaaucau ugaaggagaa uauaauacgg ugaugcuggc
1320aauaggaaga gaugcuugca caagaaaaau uggcuuagaa accguagggg ugaagauaaa
1380ugaaaagacu ggaaaaauac cugucacaga ugaagaacag accaaugugc cuuacaucua
1440ugccauuggc gauauauugg aggauaaggu ggagcucacc ccaguugcaa uccaggcagg
1500aagauugcug gcucagaggc ucuaugcagg uuccacuguc aagugugacu augaaaaugu
1560uccaaccacu guauuuacuc cuuuggaaua uggugcuugu ggccuuucug aggagaaagc
1620uguggagaag uuuggggaag aaaauauuga gguuuaccau aguuacuuuu ggccauugga
1680auggacgauu ccgucaagag auaacaacaa auguuaugca aaaauaaucu guaauacuaa
1740agacaaugaa cguguugugg gcuuucacgu acugggucca aaugcuggag aaguuacaca
1800aggcuuugca gcugcgcuca aauguggacu gaccaaaaag cagcuggaca gcacaauugg
1860aauccacccu gucugugcag agguauucac aacauugucu gugaccaagc gcucuggggc
1920aagcauccuc caggcuggcu gcugagguua agccccagug uggaugcugu ugccaagacu
1980gcaaaccacu ggcucguuuc cgugcccaaa uccaaggcga aguuuucuag aggguucuug
2040ggcucuuggc accugcgugu ccugugcuua ccaccgccca aggcccccuu ggaucucuug
2100gauaggaguu ggugaauaga aggcaggcag caucacacug gggucacuga cagacuugaa
2160gcugacauuu ggcagggcau cgaagggaug cauccaugaa gucaccaguc ucaagcccau
2220gugguaggcg gugauggaac aacugucaaa ucaguuuuag caugaccuuu ccuuguggau
2280uuucuuauuc ucguugucaa guuuucuagg guugaauuuu uuucuuuuuu cuccauggug
2340uuaaugauau uagagaugaa aaacguuagc aguugauuuu uguccaaaag caagucaugg
2400cuagaguauc caugcaaggu gucuuguugc auggaaggga uaguuuggcu cccuuggagg
2460cuauguaggc uugucccggg aaagagaacu guccugcagc ugaaauggac uguucuuuac
2520ugaccugcuc agcaguuucu ucucucauau auucccaaaa caaguacauc ugcgaucaac
2580ucuagccaaa uuugccccug ugugcuacau gauggaugau uauuauuuua aggucuguuu
2640aggaagggaa auggcuacuu ggccagccau ugccuggcau uugguaguau aguaugauuc
2700ucaccauuau uugucaugga ggcagacaua caccagaaau gggggagaaa caguacauau
2760cuuucugucu uuaguuuauu gugugcuggu cuaagcaagc ugagaucauu ugcaauggaa
2820aacacguaac uuguuuaaaa guuuuucugg uagcuuuagc uuuaugcuaa aaaaaauaau
2880gacauugggu aucuauuucu uucuaagacu acauuaguag gaaaauaagu cuuuucaugc
2940uuaugauuua gcuguuuugu gguaauugcu uuuuaaagga aguuauuaau aucauaaguu
3000auuauuaaua uuuugaacac agguggaugu gaaggauuuu cauuuaaaaa ccaagugguu
3060uugacuuuuu cuguugaaug aacaacugug ccuuguggaa uuuuugcaga aguguuuaug
3120cuuuguuagc auuucaacuu gcauuauuau aaagagguau uaaugccuca guuauguguu
3180ugucaaugua cuggcugagg auucuaucuc agcugucuuu ucuaacugug uagguugagu
3240uuugaacacg ugcuugugga caucaggccu ccugccagca guucuugaag cuucuuuuuc
3300auuccugcua cucuaccugu auuucucagu ugcagcacug aguggucaaa auacauuucu
3360gggccaccuc agggaaccca ugcaucugcc uggcauuuag gcagcagagc cccugaccgu
3420cccccacagg gcucugccuc acguccucau cucauuuggc uguguaaaga aaugggaaaa
3480gggaaaagga gagagcaauu gaggcaguug accauauuca guuuuauuua uuuauuuuua
3540auuuguuuuu uucuccaagu ccaccagucu cugaaauuag aacaguaggc gguaugagau
3600aaucaggccu aaucauguug ugauucucuu uucuuagugg aguggaaugu ucuaucccca
3660caagaaggau uauaucuuau agacuugucu uguucagauu cuguauuuac ccauuuuauu
3720gaaacauaua cuaaguucca uguauuuuug uuacaaaucu ucugaaaaaa aacaaaacaa
3780ugugaaacau uaaaauuaaa aggcauuaau aauauccacg ugugccuucu uacugaaaaa
3840aaaaaa 3846
SEQ ID NO: 28; length: 3013; type: RNA; organism: Homo sapiens
gugaaggaaa uagggaccug gcccugggcc uuguguagcg
ggagggggag cuaggaagca 60gcugagggca gaauccagga gggccuggcu gcgggggaau
gaagccuccg ccuucgcagg 120caaaagccuu uaaauacggg cucaggcccg ggacucagag
uguaacgcgu ggcagccuga 180gggaggggcg ugcgccgaga gggagcucag aucgagcggg
gcgcgggugg agaagcugcg 240gcggcgcggc ccguaggaag gugcuguccg aacgaucggg
auaggagcgg ucccugcgcu 300ugcugcuggg aagugguaca aucauguuug aaauuaagaa
gaucuguugc aucggugcag 360gcuauguugg aggacccaca uguaguguca uugcucauau
guguccugaa aucaggguaa 420cgguuguuga ugucaaugaa ucaagaauca augcguggaa
uucuccuaca cuuccuauuu 480augagccagg acuaaaagaa gugguagaau ccugucgagg
aaaaaaucuu uuuuuuucua 540ccaauauuga ugaugccauc aaagaagcug aucuuguauu
uauuucugug cuguccaacc 600cugaguuucu ggcagaggga acagccauca aggaccuaaa
gaacccagac agaguacuga 660uuggagggga ugaaacucca gagggccaga gagcugugca
ggcccugugu gcuguauaug 720agcacugggu ucccagagaa aagauccuca ccacuaauac
uuggucuuca gagcuuucca 780aacuggcagc aaaugcuuuu cuugcccaga gaauaagcag
cauuaacucc auaagugcuc 840ugugugaagc aacaggagcu gauguagaag agguagcaac
agcgauugga auggaccaga 900gaauuggaaa caaguuucua aaagccagug uuggguuugg
ugggagcugu uuccaaaagg 960auguucugaa uuugguuuau cucugugagg cucugaauuu
gccagaagua gcucguuauu 1020ggcagcaggu cauagacaug aaugacuacc agaggaggag
guuugcuucc cggaucauag 1080auagucuguu uaauacagua acugauaaga agauagcuau
uuugggauuu gcauucaaaa 1140aggacacugg ugauacaaga gaaucuucua guauauauau
uagcaaauau uugauggaug 1200aaggugcaca ucuacauaua uaugauccaa aaguaccuag
ggaacaaaua guuguggauc 1260uuucucaucc agguguuuca gaggaugacc aagugucccg
gcucgugacc auuuccaagg 1320auccauauga agcaugugau ggugcccaug cuguuguuau
uugcacugag ugggacaugu 1380uuaaggaauu ggauuaugaa cgcauucaua aaaaaaugcu
aaagccagcc uuuaucuucg 1440auggacggcg uguccuggau gggcuccaca augaacuaca
aaccauuggc uuccagauug 1500aaacaauugg caaaaaggug ucuucaaaga gaauuccaua
ugcuccuucu ggugaaauuc 1560cgaaguuuag ucuucaagau ccaccuaaca agaaaccuaa
aguguagaga uugccauuuu 1620uauuugugau uuuuuuuuuu uuuuuuuggu acuucaggau
agcaaauauc uaucugcuau 1680uaaaugguaa augaaccaag uguuuuuuuu uguuuuuuuu
uugagacaga gucucacugu 1740ugcccaggcu ggagugcagu ggugcaaucu cggcucacug
caagcucugc uucccagguu 1800cacgccauuc uccuggcuca gccucccaag uagcugggac
uacaggcacc cgccacagug 1860ccuggcuaau uuuuuguauu uuuaguagag acaggguuuc
accaugugag ccaggauggu 1920cucaaucucc ugaccuugug aaccacccgu cucggccucc
caaagugcug ggauuacagg 1980ugugagccac cacgccuggc ccaugaacca aguguuuuua
aggaaacaaa acuauuuuuu 2040uaaucaucag auuuauacua gcuauaugga uauuagcaua
ucugguaauu augaaucuag 2100aauuuuuuua cauauuuuua uaauacuguu agcucaguua
uuggaugagu gaaagauaau 2160cauguugguu uuaauagugu caauuuuugu aaaauaaaaa
uuaaacuuca aacucuuuac 2220uuuauaaauu guccauaggc cacacuuuaa uaucacauua
uaaagggaag gacagucuuc 2280auuccuccug guuauugguu uguuugucau uaaagauaua
uuuugaaucc augaaauugc 2340uaugcuaaac agccuuuaca uguauggucu gguuaaaguu
ccuuuguucc uuuuguuuua 2400auaaaaugug ucacugauuu uuuagcucaa aaucaucacu
guuaauuucc agucacccca 2460aauaugguua aaagauuuuu uuuuuaauca ugaagagaaa
auuaguagca uuucuuucuc 2520uccccauuau uuauugguuu uccucacuaa ucuuuuuuuu
uuuaguccaa aagccaaaaa 2580uauuuaucuu gguuuuacau uuuaauuucc auucuuaauu
guaauuuuuu ucuuuaaaua 2640aggaaaccaa uauaaucuca uguauaaaaa cuuaaauauu
uuacaaguua cauauagcau 2700cauucuaaaa uaagaauuuu uuuuguuuuc ugucugcuuu
uuucuuaugu cucuuguuga 2760guuuuauauu uucagugguu auuuuugcuu guguuagauc
auuauuaaaa uauauccaau 2820gucccuuuga uacuugugcu cugcugagaa uguacaguuu
gcauuaaaca ucccaggucu 2880cauccuucag gaauuuugca guucaaugag aagagggaga
caaauauaaa gaugaggaca 2940gaagcaucuc uacagaugaa aauuacauaa auaaaacauu
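cuccaucaac aacuaaaaaa 3000aaaaaaaaaa aaa 3013
SEQ ID NO: 29; length: 19; type: RNA; organism: Homo sapiens
uggaguaagu cgagaagua 19
SEQ ID NO: 30; length: 19; type: RNA; organism: Homo sapiens
acaacuagau gaagagaca 19
SEQ ID NO: 31; length: 19; type: RNA; organism: Homo sapiens
ugacagaagu ugacaauua 19
SEQ ID NO: 32; length: 19; type: RNA; organism: Homo sapiens
guaagaagcc agauguuaa 19
SEQ ID NO: 33; length: 1545; type: PRT; organism: Homo sapiens
Met Leu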
Glu Lys Phe Cys Asn Ser Thr Phe Trp Asn Ser Ser Phe Leu1 5
10 15Asp Ser Pro Glu Ala Asp Leu Pro
Leu Cys Phe Glu Gln Thr Val Leu 20 25
30Val Trp Ile Pro Leu Gly Tyr Leu Trp Leu Leu Ala Pro Trp Gln
Leu 35 40 45Leu His Val Tyr Lys
Ser Arg Thr Lys Arg Ser Ser Thr Thr Lys Leu 50 55
60Tyr Leu Ala Lys Gln Val Phe Val Gly Phe Leu Leu Ile Leu
Ala Ala65 70 75 80Ile
Glu Leu Ala Leu Val Leu Thr Glu Asp Ser Gly Gln Ala Thr Val
85 90 95Pro Ala Val Arg Tyr Thr Asn
Pro Ser Leu Tyr Leu Gly Thr Trp Leu 100 105
110Leu Val Leu Leu Ile Gln Tyr Ser Arg Gln Trp Cys Val Gln
Lys Asn 115 120 125Ser Trp Phe Leu
Ser Leu Phe Trp Ile Leu Ser Ile Leu Cys Gly Thr 130
135 140Phe Gln Phe Gln Thr Leu Ile Arg Thr Leu Leu Gln
Gly Asp Asn Ser145 150 155
160Asn Leu Ala Tyr Ser Cys Leu Phe Phe Ile Ser Tyr Gly Phe Gln Ile
165 170 175Leu Ile Leu Ile Phe
Ser Ala Phe Ser Glu Asn Asn Glu Ser Ser Asn 180
185 190Asn Pro Ser Ser Ile Ala Ser Phe Leu Ser Ser Ile
Thr Tyr Ser Trp 195 200 205Tyr Asp
Ser Ile Ile Leu Lys Gly Tyr Lys Arg Pro Leu Thr Leu Glu 210
215 220Asp Val Trp Glu Val Asp Glu Glu Met Lys Thr
Lys Thr Leu Val Ser225 230 235
240Lys Phe Glu Thr His Met Lys Arg Glu Leu Gln Lys Ala Arg Arg Ala
245 250 255Leu Gln Arg Arg
Gln Glu Lys Ser Ser Gln Gln Asn Ser Gly Ala Arg 260
265 270Leu Pro Gly Leu Asn Lys Asn Gln Ser Gln Ser
Gln Asp Ala Leu Val 275 280 285Leu
Glu Asp Val Glu Lys Lys Lys Lys Lys Ser Gly Thr Lys Lys Asp 290
295 300Val Pro Lys Ser Trp Leu Met Lys Ala Leu
Phe Lys Thr Phe Tyr Met305 310 315
320Val Leu Leu Lys Ser Phe Leu Leu Lys Leu Val Asn Asp Ile Phe
Thr 325 330 335Phe Val Ser
Pro Gln Leu Leu Lys Leu Leu Ile Ser Phe Ala Ser Asp 340
345 350Arg Asp Thr Tyr Leu Trp Ile Gly Tyr Leu
Cys Ala Ile Leu Leu Phe 355 360
365Thr Ala Ala Leu Ile Gln Ser Phe Cys Leu Gln Cys Tyr Phe Gln Leu 370
375 380Cys Phe Lys Leu Gly Val Lys Val
Arg Thr Ala Ile Met Ala Ser Val385 390
395 400Tyr Lys Lys Ala Leu Thr Leu Ser Asn Leu Ala Arg
Lys Glu Tyr Thr 405 410
415Val Gly Glu Thr Val Asn Leu Met Ser Val Asp Ala Gln Lys Leu Met
420 425 430Asp Val Thr Asn Phe Met
His Met Leu Trp Ser Ser Val Leu Gln Ile 435 440
445Val Leu Ser Ile Phe Phe Leu Trp Arg Glu Leu Gly Pro Ser
Val Leu 450 455 460Ala Gly Val Gly Val
Met Val Leu Val Ile Pro Ile Asn Ala Ile Leu465 470
475 480Ser Thr Lys Ser Lys Thr Ile Gln Val Lys
Asn Met Lys Asn Lys Asp 485 490
495Lys Arg Leu Lys Ile Met Asn Glu Ile Leu Ser Gly Ile Lys Ile Leu
500 505 510Lys Tyr Phe Ala Trp
Glu Pro Ser Phe Arg Asp Gln Val Gln Asn Leu 515
520 525Arg Lys Lys Glu Leu Lys Asn Leu Leu Ala Phe Ser
Gln Leu Gln Cys 530 535 540Val Val Ile
Phe Val Phe Gln Leu Thr Pro Val Leu Val Ser Val Val545
550 555 560Thr Phe Ser Val Tyr Val Leu
Val Asp Ser Asn Asn Ile Leu Asp Ala 565
570 575Gln Lys Ala Phe Thr Ser Ile Thr Leu Phe Asn Ile
Leu Arg Phe Pro 580 585 590Leu
Ser Met Leu Pro Met Met Ile Ser Ser Met Leu Gln Ala Ser Val 595
600 605Ser Thr Glu Arg Leu Glu Lys Tyr Leu
Gly Gly Asp Asp Leu Asp Thr 610 615
620Ser Ala Ile Arg His Asp Cys Asn Phe Asp Lys Ala Met Gln Phe Ser625
630 635 640Glu Ala Ser Phe
Thr Trp Glu His Asp Ser Glu Ala Thr Val Arg Asp 645
650 655Val Asn Leu Asp Ile Met Ala Gly Gln Leu
Val Ala Val Ile Gly Pro 660 665
670Val Gly Ser Gly Lys Ser Ser Leu Ile Ser Ala Met Leu Gly Glu Met
675 680 685Glu Asn Val His Gly His Ile
Thr Ile Lys Gly Thr Thr Ala Tyr Val 690 695
700Pro Gln Gln Ser Trp Ile Gln Asn Gly Thr Ile Lys Asp Asn Ile
Leu705 710 715 720Phe Gly
Thr Glu Phe Asn Glu Lys Arg Tyr Gln Gln Val Leu Glu Ala
725 730 735Cys Ala Leu Leu Pro Asp Leu
Glu Met Leu Pro Gly Gly Asp Leu Ala 740 745
750Glu Ile Gly Glu Lys Gly Ile Asn Leu Ser Gly Gly Gln Lys
Gln Arg 755 760 765Ile Ser Leu Ala
Arg Ala Thr Tyr Gln Asn Leu Asp Ile Tyr Leu Leu 770
775 780Asp Asp Pro Leu Ser Ala Val Asp Ala His Val Gly
Lys His Ile Phe785 790 795
800Asn Lys Val Leu Gly Pro Asn Gly Leu Leu Lys Gly Lys Thr Arg Leu
805 810 815Leu Val Thr His Ser
Met His Phe Leu Pro Gln Val Asp Glu Ile Val 820
825 830Val Leu Gly Asn Gly Thr Ile Val Glu Lys Gly Ser
Tyr Ser Ala Leu 835 840 845Leu Ala
Lys Lys Gly Glu Phe Ala Lys Asn Leu Lys Thr Phe Leu Arg 850
855 860His Thr Gly Pro Glu Glu Glu Ala Thr Val His
Asp Gly Ser Glu Glu865 870 875
880Glu Asp Asp Asp Tyr Gly Leu Ile Ser Ser Val Glu Glu Ile Pro Glu
885 890 895Asp Ala Ala Ser
Ile Thr Met Arg Arg Glu Asn Ser Phe Arg Arg Thr 900
905 910Leu Ser Arg Ser Ser Arg Ser Asn Gly Arg His
Leu Lys Ser Leu Arg 915 920 925Asn
Ser Leu Lys Thr Arg Asn Val Asn Ser Leu Lys Glu Asp Glu Glu 930
935 940Leu Val Lys Gly Gln Lys Leu Ile Lys Lys
Glu Phe Ile Glu Thr Gly945 950 955
960Lys Val Lys Phe Ser Ile Tyr Leu Glu Tyr Leu Gln Ala Ile Gly
Leu 965 970 975Phe Ser Ile
Phe Phe Ile Ile Leu Ala Phe Val Met Asn Ser Val Ala 980
985 990Phe Ile Gly Ser Asn Leu Trp Leu Ser Ala
Trp Thr Ser Asp Ser Lys 995 1000
1005Ile Phe Asn Ser Thr Asp Tyr Pro Ala Ser Gln Arg Asp Met Arg
1010 1015 1020Val Gly Val Tyr Gly Ala
Leu Gly Leu Ala Gln Gly Ile Phe Val 1025 1030
1035Phe Ile Ala His Phe Trp Ser Ala Phe Gly Phe Val His Ala
Ser 1040 1045 1050Asn Ile Leu His Lys
Gln Leu Leu Asn Asn Ile Leu Arg Ala Pro 1055 1060
1065Met Arg Phe Phe Asp Thr Thr Pro Thr Gly Arg Ile Val
Asn Arg 1070 1075 1080Phe Ala Gly Asp
Ile Ser Thr Val Asp Asp Thr Leu Pro Gln Ser 1085
1090 1095Leu Arg Ser Trp Ile Thr Cys Phe Leu Gly Ile
Ile Ser Thr Leu 1100 1105 1110Val Met
Ile Cys Met Ala Thr Pro Val Phe Thr Ile Ile Val Ile 1115
1120 1125Pro Leu Gly Ile Ile Tyr Val Ser Val Gln
Met Phe Tyr Val Ser 1130 1135 1140Thr
Ser Arg Gln Leu Arg Arg Leu Asp Ser Val Thr Arg Ser Pro 1145
1150 1155Ile Tyr Ser His Phe Ser Glu Thr Val
Ser Gly Leu Pro Val Ile 1160 1165
1170Arg Ala Phe Glu His Gln Gln Arg Phe Leu Lys His Asn Glu Val
1175 1180 1185Arg Ile Asp Thr Asn Gln
Lys Cys Val Phe Ser Trp Ile Thr Ser 1190 1195
1200Asn Arg Trp Leu Ala Ile Arg Leu Glu Leu Val Gly Asn Leu
Thr 1205 1210 1215Val Phe Phe Ser Ala
Leu Met Met Val Ile Tyr Arg Asp Thr Leu 1220 1225
1230Ser Gly Asp Thr Val Gly Phe Val Leu Ser Asn Ala Leu
Asn Ile 1235 1240 1245Thr Gln Thr Leu
Asn Trp Leu Val Arg Met Thr Ser Glu Ile Glu 1250
1255 1260Thr Asn Ile Val Ala Val Glu Arg Ile Thr Glu
Tyr Thr Lys Val 1265 1270 1275Glu Asn
Glu Ala Pro Trp Val Thr Asp Lys Arg Pro Pro Pro Asp 1280
1285 1290Trp Pro Ser Lys Gly Lys Ile Gln Phe Asn
Asn Tyr Gln Val Arg 1295 1300 1305Tyr
Arg Pro Glu Leu Asp Leu Val Leu Arg Gly Ile Thr Cys Asp 1310
1315 1320Ile Gly Ser Met Glu Lys Ile Gly Val
Val Gly Arg Thr Gly Ala 1325 1330
1335Gly Lys Ser Ser Leu Thr Asn Cys Leu Phe Arg Ile Leu Glu Ala
1340 1345 1350Ala Gly Gly Gln Ile Ile
Ile Asp Gly Val Asp Ile Ala Ser Ile 1355 1360
1365Gly Leu His Asp Leu Arg Glu Lys Leu Thr Ile Ile Pro Gln
Asp 1370 1375 1380Pro Ile Leu Phe Ser
Gly Ser Leu Arg Met Asn Leu Asp Pro Phe 1385 1390
1395Asn Asn Tyr Ser Asp Glu Glu Ile Trp Lys Ala Leu Glu
Leu Ala 1400 1405 1410His Leu Lys Ser
Phe Val Ala Ser Leu Gln Leu Gly Leu Ser His 1415
1420 1425Glu Val Thr Glu Ala Gly Gly Asn Leu Ser Ile
Gly Gln Arg Gln 1430 1435 1440Leu Leu
Cys Leu Gly Arg Ala Leu Leu Arg Lys Ser Lys Ile Leu 1445
1450 1455Val Leu Asp Glu Ala Thr Ala Ala Val Asp
Leu Glu Thr Asp Asn 1460 1465 1470Leu
Ile Gln Thr Thr Ile Gln Asn Glu Phe Ala His Cys Thr Val 1475
1480 1485Ile Thr Ile Ala His Arg Leu His Thr
Ile Met Asp Ser Asp Lys 1490 1495
1500Val Met Val Leu Asp Asn Gly Lys Ile Ile Glu Cys Gly Ser Pro
1505 1510 1515Glu Glu Leu Leu Gln Ile
Pro Gly Pro Phe Tyr Phe Met Ala Lys 1520 1525
1530Glu Ala Gly Ile Glu Asn Val Asn Ser Thr Lys Phe 1535
1540 1545
SEQ ID NO: 34 (length: 316; type: PRT; organism: Homo sapiens)
Met Ala Thr
Phe Val Glu Leu Ser Thr Lys Ala Lys Met Pro Ile Val1 5
10 15Gly Leu Gly Thr Trp Lys Ser Pro Leu
Gly Lys Val Lys Glu Ala Val 20 25
30Lys Val Ala Ile Asp Ala Gly Tyr Arg His Ile Asp Cys Ala Tyr Val
35 40 45Tyr Gln Asn Glu His Glu Val
Gly Glu Ala Ile Gln Glu Lys Ile Gln 50 55
60Glu Lys Ala Val Lys Arg Glu Asp Leu Phe Ile Val Ser Lys Leu Trp65
70 75 80Pro Thr Phe Phe
Glu Arg Pro Leu Val Arg Lys Ala Phe Glu Lys Thr 85
90 95Leu Lys Asp Leu Lys Leu Ser Tyr Leu Asp
Val Tyr Leu Ile His Trp 100 105
110Pro Gln Gly Phe Lys Ser Gly Asp Asp Leu Phe Pro Lys Asp Asp Lys
115 120 125Gly Asn Ala Ile Gly Gly Lys
Ala Thr Phe Leu Asp Ala Trp Glu Ala 130 135
140Met Glu Glu Leu Val Asp Glu Gly Leu Val Lys Ala Leu Gly Val
Ser145 150 155 160Asn Phe
Ser His Phe Gln Ile Glu Lys Leu Leu Asn Lys Pro Gly Leu
165 170 175Lys Tyr Lys Pro Val Thr Asn
Gln Val Glu Cys His Pro Tyr Leu Thr 180 185
190Gln Glu Lys Leu Ile Gln Tyr Cys His Ser Lys Gly Ile Thr
Val Thr 195 200 205Ala Tyr Ser Pro
Leu Gly Ser Pro Asp Arg Pro Trp Ala Lys Pro Glu 210
215 220Asp Pro Ser Leu Leu Glu Asp Pro Lys Ile Lys Glu
Ile Ala Ala Lys225 230 235
240His Lys Lys Thr Ala Ala Gln Val Leu Ile Arg Phe His Ile Gln Arg
245 250 255Asn Val Ile Val Ile
Pro Lys Ser Val Thr Pro Ala Arg Ile Val Glu 260
265 270Asn Ile Gln Val Phe Asp Phe Lys Leu Ser Asp Glu
Glu Met Ala Thr 275 280 285Ile Leu
Ser Phe Asn Arg Asn Trp Arg Ala Cys Asn Val Leu Gln Ser 290
295 300Ser His Leu Glu Asp Tyr Pro Phe Asn Ala Glu
Tyr305 310 315
SEQ ID NO: 35 (length: 316; type: PRT; organism: Homo sapiens)
Met
Ala Thr Phe Val Glu Leu Ser Thr Lys Ala Lys Met Pro Ile Val1
5 10 15Gly Leu Gly Thr Trp Arg Ser
Leu Leu Gly Lys Val Lys Glu Ala Val 20 25
30Lys Val Ala Ile Asp Ala Glu Tyr Arg His Ile Asp Cys Ala
Tyr Phe 35 40 45Tyr Glu Asn Gln
His Glu Val Gly Glu Ala Ile Gln Glu Lys Ile Gln 50 55
60Glu Lys Ala Val Met Arg Glu Asp Leu Phe Ile Val Ser
Lys Val Trp65 70 75
80Pro Thr Phe Phe Glu Arg Pro Leu Val Arg Lys Ala Phe Glu Lys Thr
85 90 95Leu Lys Asp Leu Lys Leu
Ser Tyr Leu Asp Val Tyr Leu Ile His Trp 100
105 110Pro Gln Gly Phe Lys Thr Gly Asp Asp Phe Phe Pro
Lys Asp Asp Lys 115 120 125Gly Asn
Met Ile Ser Gly Lys Gly Thr Phe Leu Asp Ala Trp Glu Ala 130
135 140Met Glu Glu Leu Val Asp Glu Gly Leu Val Lys
Ala Leu Gly Val Ser145 150 155
160Asn Phe Asn His Phe Gln Ile Glu Arg Leu Leu Asn Lys Pro Gly Leu
165 170 175Lys Tyr Lys Pro
Val Thr Asn Gln Val Glu Cys His Pro Tyr Leu Thr 180
185 190Gln Glu Lys Leu Ile Gln Tyr Cys His Ser Lys
Gly Ile Thr Val Thr 195 200 205Ala
Tyr Ser Pro Leu Gly Ser Pro Asp Arg Pro Trp Ala Lys Pro Glu 210
215 220Asp Pro Ser Leu Leu Glu Asp Pro Lys Ile
Lys Glu Ile Ala Ala Lys225 230 235
240His Lys Lys Thr Thr Ala Gln Val Leu Ile Arg Phe His Ile Gln
Arg 245 250 255Asn Val Thr
Val Ile Pro Lys Ser Met Thr Pro Ala His Ile Val Glu 260
265 270Asn Ile Gln Val Phe Asp Phe Lys Leu Ser
Asp Glu Glu Met Ala Thr 275 280
285Ile Leu Ser Phe Asn Arg Asn Trp Arg Ala Phe Asp Phe Lys Glu Phe 290
295 300Ser His Leu Glu Asp Phe Pro Phe
Asp Ala Glu Tyr305 310 315
SEQ ID NO: 36 (length: 323; type: PRT; organism: Homo sapiens)
Met Asp Ser Lys Tyr Gln Cys Val Lys Leu Asn Asp Gly His Phe
Met1 5 10 15Pro Val Leu
Gly Phe Gly Thr Tyr Ala Pro Ala Glu Val Pro Lys Ser 20
25 30Lys Ala Leu Glu Ala Val Lys Leu Ala Ile
Glu Ala Gly Phe His His 35 40
45Ile Asp Ser Ala His Val Tyr Asn Asn Glu Glu Gln Val Gly Leu Ala 50
55 60Ile Arg Ser Lys Ile Ala Asp Gly Ser
Val Lys Arg Glu Asp Ile Phe65 70 75
80Tyr Thr Ser Lys Leu Trp Ser Asn Ser His Arg Pro Glu Leu
Val Arg 85 90 95Pro Ala
Leu Glu Arg Ser Leu Lys Asn Leu Gln Leu Asp Tyr Val Asp 100
105 110Leu Tyr Leu Ile His Phe Pro Val Ser
Val Lys Pro Gly Glu Glu Val 115 120
125Ile Pro Lys Asp Glu Asn Gly Lys Ile Leu Phe Asp Thr Val Asp Leu
130 135 140Cys Ala Thr Trp Glu Ala Met
Glu Lys Cys Lys Asp Ala Gly Leu Ala145 150
155 160Lys Ser Ile Gly Val Ser Asn Phe Asn His Arg Leu
Leu Glu Met Ile 165 170
175Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn Gln Val Glu
180 185 190Cys His Pro Tyr Phe Asn
Gln Arg Lys Leu Leu Asp Phe Cys Lys Ser 195 200
205Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser His
Arg Glu 210 215 220Glu Pro Trp Val Asp
Pro Asn Ser Pro Val Leu Leu Glu Asp Pro Val225 230
235 240Leu Cys Ala Leu Ala Lys Lys His Lys Arg
Thr Pro Ala Leu Ile Ala 245 250
255Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala Lys Ser Tyr
260 265 270Asn Glu Gln Arg Ile
Arg Gln Asn Val Gln Val Phe Glu Phe Gln Leu 275
280 285Thr Ser Glu Glu Met Lys Ala Ile Asp Gly Leu Asn
Arg Asn Val Arg 290 295 300Tyr Leu Thr
Leu Asp Ile Phe Ala Gly Pro Pro Asn Tyr Pro Phe Ser305
310 315 320Asp Glu Tyr
SEQ ID NO: 37 (length: 323; type: PRT; organism: Homo sapiens)
Met Asp Ser Lys His Gln Cys Val Lys Leu Asn Asp Gly His Phe
Met1 5 10 15Pro Val Leu
Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val Pro Arg Ser 20
25 30Lys Ala Leu Glu Val Thr Lys Leu Ala Ile
Glu Ala Gly Phe Arg His 35 40
45Ile Asp Ser Ala His Leu Tyr Asn Asn Glu Glu Gln Val Gly Leu Ala 50
55 60Ile Arg Ser Lys Ile Ala Asp Gly Ser
Val Lys Arg Glu Asp Ile Phe65 70 75
80Tyr Thr Ser Lys Leu Trp Ser Thr Phe His Arg Pro Glu Leu
Val Arg 85 90 95Pro Ala
Leu Glu Asn Ser Leu Lys Lys Ala Gln Leu Asp Tyr Val Asp 100
105 110Leu Tyr Leu Ile His Ser Pro Met Ser
Leu Lys Pro Gly Glu Glu Leu 115 120
125Ser Pro Thr Asp Glu Asn Gly Lys Val Ile Phe Asp Ile Val Asp Leu
130 135 140Cys Thr Thr Trp Glu Ala Met
Glu Lys Cys Lys Asp Ala Gly Leu Ala145 150
155 160Lys Ser Ile Gly Val Ser Asn Phe Asn Arg Arg Gln
Leu Glu Met Ile 165 170
175Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn Gln Val Glu
180 185 190Cys His Pro Tyr Phe Asn
Arg Ser Lys Leu Leu Asp Phe Cys Lys Ser 195 200
205Lys Asp Ile Val Leu Val Ala Tyr Ser Ala Leu Gly Ser Gln
Arg Asp 210 215 220Lys Arg Trp Val Asp
Pro Asn Ser Pro Val Leu Leu Glu Asp Pro Val225 230
235 240Leu Cys Ala Leu Ala Lys Lys His Lys Arg
Thr Pro Ala Leu Ile Ala 245 250
255Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala Lys Ser Tyr
260 265 270Asn Glu Gln Arg Ile
Arg Gln Asn Val Gln Val Phe Glu Phe Gln Leu 275
280 285Thr Ala Glu Asp Met Lys Ala Ile Asp Gly Leu Asp
Arg Asn Leu His 290 295 300Tyr Phe Asn
Ser Asp Ser Phe Ala Ser His Pro Asn Tyr Pro Tyr Ser305
310 315 320Asp Glu Tyr
SEQ ID NO: 38 (length: 323; type: PRT; organism: Homo sapiens)
Met Asp Pro Lys Tyr Gln Arg Val Glu Leu Asn Asp Gly His Phe
Met1 5 10 15Pro Val Leu
Gly Phe Gly Thr Tyr Ala Pro Pro Glu Val Pro Arg Asn 20
25 30Arg Ala Val Glu Val Thr Lys Leu Ala Ile
Glu Ala Gly Phe Arg His 35 40
45Ile Asp Ser Ala Tyr Leu Tyr Asn Asn Glu Glu Gln Val Gly Leu Ala 50
55 60Ile Arg Ser Lys Ile Ala Asp Gly Ser
Val Lys Arg Glu Asp Ile Phe65 70 75
80Tyr Thr Ser Lys Leu Trp Cys Thr Phe Phe Gln Pro Gln Met
Val Gln 85 90 95Pro Ala
Leu Glu Ser Ser Leu Lys Lys Leu Gln Leu Asp Tyr Val Asp 100
105 110Leu Tyr Leu Leu His Phe Pro Met Ala
Leu Lys Pro Gly Glu Thr Pro 115 120
125Leu Pro Lys Asp Glu Asn Gly Lys Val Ile Phe Asp Thr Val Asp Leu
130 135 140Ser Ala Thr Trp Glu Val Met
Glu Lys Cys Lys Asp Ala Gly Leu Ala145 150
155 160Lys Ser Ile Gly Val Ser Asn Phe Asn Cys Arg Gln
Leu Glu Met Ile 165 170
175Leu Asn Lys Pro Gly Leu Lys Tyr Lys Pro Val Cys Asn Gln Val Glu
180 185 190Cys His Pro Tyr Leu Asn
Gln Ser Lys Leu Leu Asp Phe Cys Lys Ser 195 200
205Lys Asp Ile Val Leu Val Ala His Ser Ala Leu Gly Thr Gln
Arg His 210 215 220Lys Leu Trp Val Asp
Pro Asn Ser Pro Val Leu Leu Glu Asp Pro Val225 230
235 240Leu Cys Ala Leu Ala Lys Lys His Lys Gln
Thr Pro Ala Leu Ile Ala 245 250
255Leu Arg Tyr Gln Leu Gln Arg Gly Val Val Val Leu Ala Lys Ser Tyr
260 265 270Asn Glu Gln Arg Ile
Arg Glu Asn Ile Gln Val Phe Glu Phe Gln Leu 275
280 285Thr Ser Glu Asp Met Lys Val Leu Asp Gly Leu Asn
Arg Asn Tyr Arg 290 295 300Tyr Val Val
Met Asp Phe Leu Met Asp His Pro Asp Tyr Pro Phe Ser305
310 315 320Asp Glu Tyr
SEQ ID NO: 39 (length: 493; type: PRT; organism: Homo sapiens)
Met Ile Ser Ser Lys Pro Arg Leu Val Val Pro Tyr Gly Leu Lys
Thr1 5 10 15Leu Leu Glu
Gly Ile Ser Arg Ala Val Leu Lys Thr Asn Pro Ser Asn 20
25 30Ile Asn Gln Phe Ala Ala Ala Tyr Phe Gln
Glu Leu Thr Met Tyr Arg 35 40
45Gly Asn Thr Thr Met Asp Ile Lys Asp Leu Val Lys Gln Phe His Gln 50
55 60Ile Lys Val Glu Lys Trp Ser Glu Gly
Thr Thr Pro Gln Lys Lys Leu65 70 75
80Glu Cys Leu Lys Glu Pro Gly Lys Thr Ser Val Glu Ser Lys
Val Pro 85 90 95Thr Gln
Met Glu Lys Ser Thr Asp Thr Asp Glu Asp Asn Val Thr Arg 100
105 110Thr Glu Tyr Ser Asp Lys Thr Thr Gln
Phe Pro Ser Val Tyr Ala Val 115 120
125Pro Gly Thr Glu Gln Thr Glu Ala Val Gly Gly Leu Ser Ser Lys Pro
130 135 140Ala Thr Pro Lys Thr Thr Thr
Pro Pro Ser Ser Pro Pro Pro Thr Ala145 150
155 160Val Ser Pro Glu Phe Ala Tyr Val Pro Ala Asp Pro
Ala Gln Leu Ala 165 170
175Ala Gln Met Leu Gly Lys Val Ser Ser Ile His Ser Asp Gln Ser Asp
180 185 190Val Leu Met Val Asp Val
Ala Thr Ser Met Pro Val Val Ile Lys Glu 195 200
205Val Pro Ser Ser Glu Ala Ala Glu Asp Val Met Val Ala Ala
Pro Leu 210 215 220Val Cys Ser Gly Lys
Val Leu Glu Val Gln Val Val Asn Gln Thr Ser225 230
235 240Val His Val Asp Leu Gly Ser Gln Pro Lys
Glu Asn Glu Ala Glu Pro 245 250
255Ser Thr Ala Ser Ser Val Pro Leu Gln Asp Glu Gln Glu Pro Pro Ala
260 265 270Tyr Asp Gln Ala Pro
Glu Val Thr Leu Gln Ala Asp Ile Glu Val Met 275
280 285Ser Thr Val His Ile Ser Ser Val Tyr Asn Asp Val
Pro Val Thr Glu 290 295 300Gly Val Val
Tyr Ile Glu Gln Leu Pro Glu Gln Ile Val Ile Pro Phe305
310 315 320Thr Asp Gln Val Ala Cys Leu
Lys Glu Asn Glu Gln Ser Lys Glu Asn 325
330 335Glu Gln Ser Pro Arg Val Ser Pro Lys Ser Val Val
Glu Lys Thr Thr 340 345 350Ser
Gly Met Ser Lys Lys Ser Val Glu Ser Val Lys Leu Ala Gln Leu 355
360 365Glu Glu Asn Ala Lys Tyr Ser Ser Val
Tyr Met Glu Ala Glu Ala Thr 370 375
380Ala Leu Leu Ser Asp Thr Ser Leu Lys Gly Gln Pro Glu Val Pro Ala385
390 395 400Gln Leu Leu Asp
Ala Glu Gly Ala Ile Lys Ile Gly Ser Glu Lys Ser 405
410 415Leu His Leu Glu Val Glu Ile Thr Ser Ile
Val Ser Asp Asn Thr Gly 420 425
430Gln Glu Glu Ser Gly Glu Asn Ser Val Pro Gln Glu Met Glu Gly Lys
435 440 445Pro Val Leu Ser Gly Glu Ala
Ala Glu Ala Val His Ser Gly Thr Ser 450 455
460Val Lys Ser Ser Ser Gly Pro Phe Pro Pro Ala Pro Glu Gly Leu
Thr465 470 475 480Ala Pro
Glu Ile Glu Pro Glu Gly Glu Ser Thr Ala Glu 485
490
SEQ ID NO: 40 (length: 524; type: PRT; organism: Homo sapiens)
Met Pro Gln Leu Ser Leu Ser Trp Leu Gly
Leu Gly Pro Val Ala Ala1 5 10
15Ser Pro Trp Leu Leu Leu Leu Leu Val Gly Gly Ser Trp Leu Leu Ala
20 25 30Arg Val Leu Ala Trp Thr
Tyr Thr Phe Tyr Asp Asn Cys Arg Arg Leu 35 40
45Gln Cys Phe Pro Gln Pro Pro Lys Gln Asn Trp Phe Trp Gly
His Gln 50 55 60Gly Leu Val Thr Pro
Thr Glu Glu Gly Met Lys Thr Leu Thr Gln Leu65 70
75 80Val Thr Thr Tyr Pro Gln Gly Phe Lys Leu
Trp Leu Gly Pro Thr Phe 85 90
95Pro Leu Leu Ile Leu Cys His Pro Asp Ile Ile Arg Pro Ile Thr Ser
100 105 110Ala Ser Ala Ala Val
Ala Pro Lys Asp Met Ile Phe Tyr Gly Phe Leu 115
120 125Lys Pro Trp Leu Gly Asp Gly Leu Leu Leu Ser Gly
Gly Asp Lys Trp 130 135 140Ser Arg His
Arg Arg Met Leu Thr Pro Ala Phe His Phe Asn Ile Leu145
150 155 160Lys Pro Tyr Met Lys Ile Phe
Asn Lys Ser Val Asn Ile Met His Asp 165
170 175Lys Trp Gln Arg Leu Ala Ser Glu Gly Ser Ala Arg
Leu Asp Met Phe 180 185 190Glu
His Ile Ser Leu Met Thr Leu Asp Ser Leu Gln Lys Cys Val Phe 195
200 205Ser Phe Glu Ser Asn Cys Gln Glu Lys
Pro Ser Glu Tyr Ile Ala Ala 210 215
220Ile Leu Glu Leu Ser Ala Phe Val Glu Lys Arg Asn Gln Gln Ile Leu225
230 235 240Leu His Thr Asp
Phe Leu Tyr Tyr Leu Thr Pro Asp Gly Gln Arg Phe 245
250 255Arg Arg Ala Cys His Leu Val His Asp Phe
Thr Asp Ala Val Ile Gln 260 265
270Glu Arg Arg Cys Thr Leu Pro Thr Gln Gly Ile Asp Asp Phe Leu Lys
275 280 285Asn Lys Ala Lys Ser Lys Thr
Leu Asp Phe Ile Asp Val Leu Leu Leu 290 295
300Ser Lys Asp Glu Asp Gly Lys Glu Leu Ser Asp Glu Asp Ile Arg
Ala305 310 315 320Glu Ala
Asp Thr Phe Met Phe Glu Gly His Asp Thr Thr Ala Ser Gly
325 330 335Leu Ser Trp Val Leu Tyr His
Leu Ala Lys His Pro Glu Tyr Gln Glu 340 345
350Gln Cys Arg Gln Glu Val Gln Glu Leu Leu Lys Asp Arg Glu
Pro Ile 355 360 365Glu Ile Glu Trp
Asp Asp Leu Ala Gln Leu Pro Phe Leu Thr Met Cys 370
375 380Ile Lys Glu Ser Leu Arg Leu His Pro Pro Val Pro
Val Ile Ser Arg385 390 395
400Cys Cys Thr Gln Asp Phe Val Leu Pro Asp Gly Arg Val Ile Pro Lys
405 410 415Gly Ile Val Cys Leu
Ile Asn Ile Ile Gly Ile His Tyr Asn Pro Thr 420
425 430Val Trp Pro Asp Pro Glu Val Tyr Asp Pro Phe Arg
Phe Asp Gln Glu 435 440 445Asn Ile
Lys Glu Arg Ser Pro Leu Ala Phe Ile Pro Phe Ser Ala Gly 450
455 460Pro Arg Asn Cys Ile Gly Gln Ala Phe Ala Met
Ala Glu Met Lys Val465 470 475
480Val Leu Ala Leu Thr Leu Leu His Phe Arg Ile Leu Pro Thr His Thr
485 490 495Glu Pro Arg Arg
Lys Pro Glu Leu Ile Leu Arg Ala Glu Gly Gly Leu 500
505 510Trp Leu Arg Val Glu Pro Leu Gly Ala Asn Ser
Gln 515 520
SEQ ID NO: 41 (length: 423; type: PRT; organism: Homo sapiens)
Met Arg Ser Leu
Gly Ala Asn Met Ala Ala Ala Leu Arg Ala Ala Gly1 5
10 15Val Leu Leu Arg Asp Pro Leu Ala Ser Ser
Ser Trp Arg Val Cys Gln 20 25
30Pro Trp Arg Trp Lys Ser Gly Ala Ala Ala Ala Ala Val Thr Thr Glu
35 40 45Thr Ala Gln His Ala Gln Gly Ala
Lys Pro Gln Val Gln Pro Gln Lys 50 55
60Arg Lys Pro Lys Thr Gly Ile Leu Met Leu Asn Met Gly Gly Pro Glu65
70 75 80Thr Leu Gly Asp Val
His Asp Phe Leu Leu Arg Leu Phe Leu Asp Arg 85
90 95Asp Leu Met Thr Leu Pro Ile Gln Asn Lys Leu
Ala Pro Phe Ile Ala 100 105
110Lys Arg Arg Thr Pro Lys Ile Gln Glu Gln Tyr Arg Arg Ile Gly Gly
115 120 125Gly Ser Pro Ile Lys Ile Trp
Thr Ser Lys Gln Gly Glu Gly Met Val 130 135
140Lys Leu Leu Asp Glu Leu Ser Pro Asn Thr Ala Pro His Lys Tyr
Tyr145 150 155 160Ile Gly
Phe Arg Tyr Val His Pro Leu Thr Glu Glu Ala Ile Glu Glu
165 170 175Met Glu Arg Asp Gly Leu Glu
Arg Ala Ile Ala Phe Thr Gln Tyr Pro 180 185
190Gln Tyr Ser Cys Ser Thr Thr Gly Ser Ser Leu Asn Ala Ile
Tyr Arg 195 200 205Tyr Tyr Asn Gln
Val Gly Arg Lys Pro Thr Met Lys Trp Ser Thr Ile 210
215 220Asp Arg Trp Pro Thr His His Leu Leu Ile Gln Cys
Phe Ala Asp His225 230 235
240Ile Leu Lys Glu Leu Asp His Phe Pro Leu Glu Lys Arg Ser Glu Val
245 250 255Val Ile Leu Phe Ser
Ala His Ser Leu Pro Met Ser Val Val Asn Arg 260
265 270Gly Asp Pro Tyr Pro Gln Glu Val Ser Ala Thr Val
Gln Lys Val Met 275 280 285Glu Arg
Leu Glu Tyr Cys Asn Pro Tyr Arg Leu Val Trp Gln Ser Lys 290
295 300Val Gly Pro Met Pro Trp Leu Gly Pro Gln Thr
Asp Glu Ser Ile Lys305 310 315
320Gly Leu Cys Glu Arg Gly Arg Lys Asn Ile Leu Leu Val Pro Ile Ala
325 330 335Phe Thr Ser Asp
His Ile Glu Thr Leu Tyr Glu Leu Asp Ile Glu Tyr 340
345 350Ser Gln Val Leu Ala Lys Glu Cys Gly Val Glu
Asn Ile Arg Arg Ala 355 360 365Glu
Ser Leu Asn Gly Asn Pro Leu Phe Ser Lys Ala Leu Ala Asp Leu 370
375 380Val His Ser His Ile Gln Ser Asn Glu Leu
Cys Ser Lys Gln Leu Thr385 390 395
400Leu Ser Cys Pro Leu Cys Val Asn Pro Val Cys Arg Glu Thr Lys
Ser 405 410 415Phe Phe Thr
Ser Gln Gln Leu 420
SEQ ID NO: 42 (length: 175; type: PRT; organism: Homo sapiens)
Met Ser Ser Gln Ile
Arg Gln Asn Tyr Ser Thr Asp Val Glu Ala Ala1 5
10 15Val Asn Ser Leu Val Asn Leu Tyr Leu Gln Ala
Ser Tyr Thr Tyr Leu 20 25
30Ser Leu Gly Phe Tyr Phe Asp Arg Asp Asp Val Ala Leu Glu Gly Val
35 40 45Ser His Phe Phe Arg Glu Leu Ala
Glu Glu Lys Arg Glu Gly Tyr Glu 50 55
60Arg Leu Leu Lys Met Gln Asn Gln Arg Gly Gly Arg Ala Leu Phe Gln65
70 75 80Asp Ile Lys Lys Pro
Ala Glu Asp Glu Trp Gly Lys Thr Pro Asp Ala 85
90 95Met Lys Ala Ala Met Ala Leu Glu Lys Lys Leu
Asn Gln Ala Leu Leu 100 105
110Asp Leu His Ala Leu Gly Ser Ala Arg Thr Asp Pro His Leu Cys Asp
115 120 125Phe Leu Glu Thr His Phe Leu
Asp Glu Glu Val Lys Leu Ile Lys Lys 130 135
140Met Gly Asp His Leu Thr Asn Leu His Arg Leu Gly Gly Pro Glu
Ala145 150 155 160Gly Leu
Gly Glu Tyr Leu Phe Glu Arg Leu Thr Leu Lys His Asp 165
170 175
SEQ ID NO: 43 (length: 274; type: PRT; organism: Homo sapiens)
Met Gly Thr
Asp Ser Arg Ala Ala Lys Ala Leu Leu Ala Arg Ala Arg1 5
10 15Thr Leu His Leu Gln Thr Gly Asn Leu
Leu Asn Trp Gly Arg Leu Arg 20 25
30Lys Lys Cys Pro Ser Thr His Ser Glu Glu Leu His Asp Cys Ile Gln
35 40 45Lys Thr Leu Asn Glu Trp Ser
Ser Gln Ile Asn Pro Asp Leu Val Arg 50 55
60Glu Phe Pro Asp Val Leu Glu Cys Thr Val Ser His Ala Val Glu Lys65
70 75 80Ile Asn Pro Asp
Glu Arg Glu Glu Met Lys Val Ser Ala Lys Leu Phe 85
90 95Ile Val Glu Ser Asn Ser Ser Ser Ser Thr
Arg Ser Ala Val Asp Met 100 105
110Ala Cys Ser Val Leu Gly Val Ala Gln Leu Asp Ser Val Ile Ile Ala
115 120 125Ser Pro Pro Ile Glu Asp Gly
Val Asn Leu Ser Leu Glu His Leu Gln 130 135
140Pro Tyr Trp Glu Glu Leu Glu Asn Leu Val Gln Ser Lys Lys Ile
Val145 150 155 160Ala Ile
Gly Thr Ser Asp Leu Asp Lys Thr Gln Leu Glu Gln Leu Tyr
165 170 175Gln Trp Ala Gln Val Lys Pro
Asn Ser Asn Gln Val Asn Leu Ala Ser 180 185
190Cys Cys Val Met Pro Pro Asp Leu Thr Ala Phe Ala Lys Gln
Phe Asp 195 200 205Ile Gln Leu Leu
Thr His Asn Asp Pro Lys Glu Leu Leu Ser Glu Ala 210
215 220Ser Phe Gln Glu Ala Leu Gln Glu Ser Ile Pro Asp
Ile Gln Ala His225 230 235
240Glu Trp Val Pro Leu Trp Leu Leu Arg Tyr Ser Val Ile Val Lys Ser
245 250 255Arg Gly Ile Ile Lys
Ser Lys Gly Tyr Ile Leu Gln Ala Lys Arg Arg 260
265 270Gly Ser
SEQ ID NO: 44 (length: 522; type: PRT; organism: Homo sapiens)
Met Ala Leu Leu Pro
Arg Ala Leu Ser Ala Gly Ala Gly Pro Ser Trp1 5
10 15Arg Arg Ala Ala Arg Ala Phe Arg Gly Phe Leu
Leu Leu Leu Pro Glu 20 25
30Pro Ala Ala Leu Thr Arg Ala Leu Ser Arg Ala Met Ala Cys Arg Gln
35 40 45Glu Pro Gln Pro Gln Gly Pro Pro
Pro Ala Ala Gly Ala Val Ala Ser 50 55
60Tyr Asp Tyr Leu Val Ile Gly Gly Gly Ser Gly Gly Leu Ala Ser Ala65
70 75 80Arg Arg Ala Ala Glu
Leu Gly Ala Arg Ala Ala Val Val Glu Ser His 85
90 95Lys Leu Gly Gly Thr Cys Val Asn Val Gly Cys
Val Pro Lys Lys Val 100 105
110Met Trp Asn Thr Ala Val His Ser Glu Phe Met His Asp His Ala Asp
115 120 125Tyr Gly Phe Pro Ser Cys Glu
Gly Lys Phe Asn Trp Arg Val Ile Lys 130 135
140Glu Lys Arg Asp Ala Tyr Val Ser Arg Leu Asn Ala Ile Tyr Gln
Asn145 150 155 160Asn Leu
Thr Lys Ser His Ile Glu Ile Ile Arg Gly His Ala Ala Phe
165 170 175Thr Ser Asp Pro Lys Pro Thr
Ile Glu Val Ser Gly Lys Lys Tyr Thr 180 185
190Ala Pro His Ile Leu Ile Ala Thr Gly Gly Met Pro Ser Thr
Pro His 195 200 205Glu Ser Gln Ile
Pro Gly Ala Ser Leu Gly Ile Thr Ser Asp Gly Phe 210
215 220Phe Gln Leu Glu Glu Leu Pro Gly Arg Ser Val Ile
Val Gly Ala Gly225 230 235
240Tyr Ile Ala Val Glu Met Ala Gly Ile Leu Ser Ala Leu Gly Ser Lys
245 250 255Thr Ser Leu Met Ile
Arg His Asp Lys Val Leu Arg Ser Phe Asp Ser 260
265 270Met Ile Ser Thr Asn Cys Thr Glu Glu Leu Glu Asn
Ala Gly Val Glu 275 280 285Val Leu
Lys Phe Ser Gln Val Lys Glu Val Lys Lys Thr Leu Ser Gly 290
295 300Leu Glu Val Ser Met Val Thr Ala Val Pro Gly
Arg Leu Pro Val Met305 310 315
320Thr Met Ile Pro Asp Val Asp Cys Leu Leu Trp Ala Ile Gly Arg Val
325 330 335Pro Asn Thr Lys
Asp Leu Ser Leu Asn Lys Leu Gly Ile Gln Thr Asp 340
345 350Asp Lys Gly His Ile Ile Val Asp Glu Phe Gln
Asn Thr Asn Val Lys 355 360 365Gly
Ile Tyr Ala Val Gly Asp Val Cys Gly Lys Ala Leu Leu Thr Pro 370
375 380Val Ala Ile Ala Ala Gly Arg Lys Leu Ala
His Arg Leu Phe Glu Tyr385 390 395
400Lys Glu Asp Ser Lys Leu Asp Tyr Asn Asn Ile Pro Thr Val Val
Phe 405 410 415Ser His Pro
Pro Ile Gly Thr Val Gly Leu Thr Glu Asp Glu Ala Ile 420
425 430His Lys Tyr Gly Ile Glu Asn Val Lys Thr
Tyr Ser Thr Ser Phe Thr 435 440
445Pro Met Tyr His Ala Val Thr Lys Arg Lys Thr Lys Cys Val Met Lys 450
455 460Met Val Cys Ala Asn Lys Glu Glu
Lys Val Val Gly Ile His Met Gln465 470
475 480Gly Leu Gly Cys Asp Glu Met Leu Gln Gly Phe Ala
Val Ala Val Lys 485 490
495Met Gly Ala Thr Lys Ala Asp Phe Asp Asn Thr Val Ala Ile His Pro
500 505 510Thr Ser Ser Glu Glu Leu
Val Thr Leu Arg 515 520
SEQ ID NO: 45 (length: 465; type: PRT; organism: Homo sapiens)
Met
Glu Pro Ser Ser Leu Glu Leu Pro Ala Asp Thr Val Gln Arg Ile1
5 10 15Ala Ala Glu Leu Lys Cys His
Pro Thr Asp Glu Arg Val Ala Leu His 20 25
30Leu Asp Glu Glu Asp Lys Leu Arg His Phe Arg Glu Cys Phe
Tyr Ile 35 40 45Pro Lys Ile Gln
Asp Leu Pro Pro Val Asp Leu Ser Leu Val Asn Lys 50 55
60Asp Glu Asn Ala Ile Tyr Phe Leu Gly Asn Ser Leu Gly
Leu Gln Pro65 70 75
80Lys Met Val Lys Thr Tyr Leu Glu Glu Glu Leu Asp Lys Trp Ala Lys
85 90 95Ile Ala Ala Tyr Gly His
Glu Val Gly Lys Arg Pro Trp Ile Thr Gly 100
105 110Asp Glu Ser Ile Val Gly Leu Met Lys Asp Ile Val
Gly Ala Asn Glu 115 120 125Lys Glu
Ile Ala Leu Met Asn Ala Leu Thr Val Asn Leu His Leu Leu 130
135 140Met Leu Ser Phe Phe Lys Pro Thr Pro Lys Arg
Tyr Lys Ile Leu Leu145 150 155
160Glu Ala Lys Ala Phe Pro Ser Asp His Tyr Ala Ile Glu Ser Gln Leu
165 170 175Gln Leu His Gly
Leu Asn Ile Glu Glu Ser Met Arg Met Ile Lys Pro 180
185 190Arg Glu Gly Glu Glu Thr Leu Arg Ile Glu Asp
Ile Leu Glu Val Ile 195 200 205Glu
Lys Glu Gly Asp Ser Ile Ala Val Ile Leu Phe Ser Gly Val His 210
215 220Phe Tyr Thr Gly Gln His Phe Asn Ile Pro
Ala Ile Thr Lys Ala Gly225 230 235
240Gln Ala Lys Gly Cys Tyr Val Gly Phe Asp Leu Ala His Ala Val
Gly 245 250 255Asn Val Glu
Leu Tyr Leu His Asp Trp Gly Val Asp Phe Ala Cys Trp 260
265 270Cys Ser Tyr Lys Tyr Leu Asn Ala Gly Ala
Gly Gly Ile Ala Gly Ala 275 280
285Phe Ile His Glu Lys His Ala His Thr Ile Lys Pro Ala Leu Val Gly 290
295 300Trp Phe Gly His Glu Leu Ser Thr
Arg Phe Lys Met Asp Asn Lys Leu305 310
315 320Gln Leu Ile Pro Gly Val Cys Gly Phe Arg Ile Ser
Asn Pro Pro Ile 325 330
335Leu Leu Val Cys Ser Leu His Ala Ser Leu Glu Ile Phe Lys Gln Ala
340 345 350Thr Met Lys Ala Leu Arg
Lys Lys Ser Val Leu Leu Thr Gly Tyr Leu 355 360
365Glu Tyr Leu Ile Lys His Asn Tyr Gly Lys Asp Lys Ala Ala
Thr Lys 370 375 380Lys Pro Val Val Asn
Ile Ile Thr Pro Ser His Val Glu Glu Arg Gly385 390
395 400Cys Gln Leu Thr Ile Thr Phe Ser Val Pro
Asn Lys Asp Val Phe Gln 405 410
415Glu Leu Glu Lys Arg Gly Val Val Cys Asp Lys Arg Asn Pro Asn Gly
420 425 430Ile Arg Val Ala Pro
Val Pro Leu Tyr Asn Ser Phe His Asp Val Tyr 435
440 445Lys Phe Thr Asn Leu Leu Thr Ser Ile Leu Asp Ser
Ala Glu Thr Lys 450 455
460Asn465
SEQ ID NO: 46 (length: 572; type: PRT; organism: Homo sapiens)
Met Glu Pro Glu Ala Pro Arg Arg Arg His
Thr His Gln Arg Gly Tyr1 5 10
15Leu Leu Thr Arg Asn Pro His Leu Asn Lys Asp Leu Ala Phe Thr Leu
20 25 30Glu Glu Arg Gln Gln Leu
Asn Ile His Gly Leu Leu Pro Pro Ser Phe 35 40
45Asn Ser Gln Glu Ile Gln Val Leu Arg Val Val Lys Asn Phe
Glu His 50 55 60Leu Asn Ser Asp Phe
Asp Arg Tyr Leu Leu Leu Met Asp Leu Gln Asp65 70
75 80Arg Asn Glu Lys Leu Phe Tyr Arg Val Leu
Thr Ser Asp Ile Glu Lys 85 90
95Phe Met Pro Ile Val Tyr Thr Pro Thr Val Gly Leu Ala Cys Gln Gln
100 105 110Tyr Ser Leu Val Phe
Arg Lys Pro Arg Gly Leu Phe Ile Thr Ile His 115
120 125Asp Arg Gly His Ile Ala Ser Val Leu Asn Ala Trp
Pro Glu Asp Val 130 135 140Ile Lys Ala
Ile Val Val Thr Asp Gly Glu Arg Ile Leu Gly Leu Gly145
150 155 160Asp Leu Gly Cys Asn Gly Met
Gly Ile Pro Val Gly Lys Leu Ala Leu 165
170 175Tyr Thr Ala Cys Gly Gly Met Asn Pro Gln Glu Cys
Leu Pro Val Ile 180 185 190Leu
Asp Val Gly Thr Glu Asn Glu Glu Leu Leu Lys Asp Pro Leu Tyr 195
200 205Ile Gly Leu Arg Gln Arg Arg Val Arg
Gly Ser Glu Tyr Asp Asp Phe 210 215
220Leu Asp Glu Phe Met Glu Ala Val Ser Ser Lys Tyr Gly Met Asn Cys225
230 235 240Leu Ile Gln Phe
Glu Asp Phe Ala Asn Val Asn Ala Phe Arg Leu Leu 245
250 255Asn Lys Tyr Arg Asn Gln Tyr Cys Thr Phe
Asn Asp Asp Ile Gln Gly 260 265
270Thr Ala Ser Val Ala Val Ala Gly Leu Leu Ala Ala Leu Arg Ile Thr
275 280 285Lys Asn Lys Leu Ser Asp Gln
Thr Ile Leu Phe Gln Gly Ala Gly Glu 290 295
300Ala Ala Leu Gly Ile Ala His Leu Ile Val Met Ala Leu Glu Lys
Glu305 310 315 320Gly Leu
Pro Lys Glu Lys Ala Ile Lys Lys Ile Trp Leu Val Asp Ser
325 330 335Lys Gly Leu Ile Val Lys Gly
Arg Ala Ser Leu Thr Gln Glu Lys Glu 340 345
350Lys Phe Ala His Glu His Glu Glu Met Lys Asn Leu Glu Ala
Ile Val 355 360 365Gln Glu Ile Lys
Pro Thr Ala Leu Ile Gly Val Ala Ala Ile Gly Gly 370
375 380Ala Phe Ser Glu Gln Ile Leu Lys Asp Met Ala Ala
Phe Asn Glu Arg385 390 395
400Pro Ile Ile Phe Ala Leu Ser Asn Pro Thr Ser Lys Ala Glu Cys Ser
405 410 415Ala Glu Gln Cys Tyr
Lys Ile Thr Lys Gly Arg Ala Ile Phe Ala Ser 420
425 430Gly Ser Pro Phe Asp Pro Val Thr Leu Pro Asn Gly
Gln Thr Leu Tyr 435 440 445Pro Gly
Gln Gly Asn Asn Ser Tyr Val Phe Pro Gly Val Ala Leu Gly 450
455 460Val Val Ala Cys Gly Leu Arg Gln Ile Thr Asp
Asn Ile Phe Leu Thr465 470 475
480Thr Ala Glu Val Ile Ala Gln Gln Val Ser Asp Lys His Leu Glu Glu
485 490 495Gly Arg Leu Tyr
Pro Pro Leu Asn Thr Ile Arg Asp Val Ser Leu Lys 500
505 510Ile Ala Glu Lys Ile Val Lys Asp Ala Tyr Gln
Glu Lys Thr Ala Thr 515 520 525Val
Tyr Pro Glu Pro Gln Asn Lys Glu Ala Phe Val Arg Ser Gln Met 530
535 540Tyr Ser Thr Asp Tyr Asp Gln Ile Leu Pro
Asp Cys Tyr Ser Trp Pro545 550 555
560Glu Glu Val Gln Lys Ile Gln Thr Lys Val Asp Gln
565 570
SEQ ID NO: 47 (length: 605; type: PRT; organism: Homo sapiens)
Met Met Asp Leu Glu Leu
Pro Pro Pro Gly Leu Pro Ser Gln Gln Asp1 5
10 15Met Asp Leu Ile Asp Ile Leu Trp Arg Gln Asp Ile
Asp Leu Gly Val 20 25 30Ser
Arg Glu Val Phe Asp Phe Ser Gln Arg Arg Lys Glu Tyr Glu Leu 35
40 45Glu Lys Gln Lys Lys Leu Glu Lys Glu
Arg Gln Glu Gln Leu Gln Lys 50 55
60Glu Gln Glu Lys Ala Phe Phe Ala Gln Leu Gln Leu Asp Glu Glu Thr65
70 75 80Gly Glu Phe Leu Pro
Ile Gln Pro Ala Gln His Ile Gln Ser Glu Thr 85
90 95Ser Gly Ser Ala Asn Tyr Ser Gln Val Ala His
Ile Pro Lys Ser Asp 100 105
110Ala Leu Tyr Phe Asp Asp Cys Met Gln Leu Leu Ala Gln Thr Phe Pro
115 120 125Phe Val Asp Asp Asn Glu Val
Ser Ser Ala Thr Phe Gln Ser Leu Val 130 135
140Pro Asp Ile Pro Gly His Ile Glu Ser Pro Val Phe Ile Ala Thr
Asn145 150 155 160Gln Ala
Gln Ser Pro Glu Thr Ser Val Ala Gln Val Ala Pro Val Asp
165 170 175Leu Asp Gly Met Gln Gln Asp
Ile Glu Gln Val Trp Glu Glu Leu Leu 180 185
190Ser Ile Pro Glu Leu Gln Cys Leu Asn Ile Glu Asn Asp Lys
Leu Val 195 200 205Glu Thr Thr Met
Val Pro Ser Pro Glu Ala Lys Leu Thr Glu Val Asp 210
215 220Asn Tyr His Phe Tyr Ser Ser Ile Pro Ser Met Glu
Lys Glu Val Gly225 230 235
240Asn Cys Ser Pro His Phe Leu Asn Ala Phe Glu Asp Ser Phe Ser Ser
245 250 255Ile Leu Ser Thr Glu
Asp Pro Asn Gln Leu Thr Val Asn Ser Leu Asn 260
265 270Ser Asp Ala Thr Val Asn Thr Asp Phe Gly Asp Glu
Phe Tyr Ser Ala 275 280 285Phe Ile
Ala Glu Pro Ser Ile Ser Asn Ser Met Pro Ser Pro Ala Thr 290
295 300Leu Ser His Ser Leu Ser Glu Leu Leu Asn Gly
Pro Ile Asp Val Ser305 310 315
320Asp Leu Ser Leu Cys Lys Ala Phe Asn Gln Asn His Pro Glu Ser Thr
325 330 335Ala Glu Phe Asn
Asp Ser Asp Ser Gly Ile Ser Leu Asn Thr Ser Pro 340
345 350Ser Val Ala Ser Pro Glu His Ser Val Glu Ser
Ser Ser Tyr Gly Asp 355 360 365Thr
Leu Leu Gly Leu Ser Asp Ser Glu Val Glu Glu Leu Asp Ser Ala 370
375 380Pro Gly Ser Val Lys Gln Asn Gly Pro Lys
Thr Pro Val His Ser Ser385 390 395
400Gly Asp Met Val Gln Pro Leu Ser Pro Ser Gln Gly Gln Ser Thr
His 405 410 415Val His Asp
Ala Gln Cys Glu Asn Thr Pro Glu Lys Glu Leu Pro Val 420
425 430Ser Pro Gly His Arg Lys Thr Pro Phe Thr
Lys Asp Lys His Ser Ser 435 440
445Arg Leu Glu Ala His Leu Thr Arg Asp Glu Leu Arg Ala Lys Ala Leu 450
455 460His Ile Pro Phe Pro Val Glu Lys
Ile Ile Asn Leu Pro Val Val Asp465 470
475 480Phe Asn Glu Met Met Ser Lys Glu Gln Phe Asn Glu
Ala Gln Leu Ala 485 490
495Leu Ile Arg Asp Ile Arg Arg Arg Gly Lys Asn Lys Val Ala Ala Gln
500 505 510Asn Cys Arg Lys Arg Lys
Leu Glu Asn Ile Val Glu Leu Glu Gln Asp 515 520
525Leu Asp His Leu Lys Asp Glu Lys Glu Lys Leu Leu Lys Glu
Lys Gly 530 535 540Glu Asn Asp Lys Ser
Leu His Leu Leu Lys Lys Gln Leu Ser Thr Leu545 550
555 560Tyr Leu Glu Val Phe Ser Met Leu Arg Asp
Glu Asp Gly Lys Pro Tyr 565 570
575Ser Pro Ser Glu Tyr Ser Leu Gln Gln Thr Arg Asp Gly Asn Val Phe
580 585 590Leu Val Pro Lys Ser
Lys Lys Pro Asp Val Lys Lys Asn 595 600
60548274PRTHomo sapiens 48Met Val Gly Arg Arg Ala Leu Ile Val Leu
Ala His Ser Glu Arg Thr1 5 10
15Ser Phe Asn Tyr Ala Met Lys Glu Ala Ala Ala Ala Ala Leu Lys Lys
20 25 30Lys Gly Trp Glu Val Val
Glu Ser Asp Leu Tyr Ala Met Asn Phe Asn 35 40
45Pro Ile Ile Ser Arg Lys Asp Ile Thr Gly Lys Leu Lys Asp
Pro Ala 50 55 60Asn Phe Gln Tyr Pro
Ala Glu Ser Val Leu Ala Tyr Lys Glu Gly His65 70
75 80Leu Ser Pro Asp Ile Val Ala Glu Gln Lys
Lys Leu Glu Ala Ala Asp 85 90
95Leu Val Ile Phe Gln Phe Pro Leu Gln Trp Phe Gly Val Pro Ala Ile
100 105 110Leu Lys Gly Trp Phe
Glu Arg Val Phe Ile Gly Glu Phe Ala Tyr Thr 115
120 125Tyr Ala Ala Met Tyr Asp Lys Gly Pro Phe Arg Ser
Lys Lys Ala Val 130 135 140Leu Ser Ile
Thr Thr Gly Gly Ser Gly Ser Met Tyr Ser Leu Gln Gly145
150 155 160Ile His Gly Asp Met Asn Val
Ile Leu Trp Pro Ile Gln Ser Gly Ile 165
170 175Leu His Phe Cys Gly Phe Gln Val Leu Glu Pro Gln
Leu Thr Tyr Ser 180 185 190Ile
Gly His Thr Pro Ala Asp Ala Arg Ile Gln Ile Leu Glu Gly Trp 195
200 205Lys Lys Arg Leu Glu Asn Ile Trp Asp
Glu Thr Pro Leu Tyr Phe Ala 210 215
220Pro Ser Ser Leu Phe Asp Leu Asn Phe Gln Ala Gly Phe Leu Met Lys225
230 235 240Lys Glu Val Gln
Asp Glu Glu Lys Asn Lys Lys Phe Gly Leu Ser Val 245
250 255Gly His His Leu Gly Lys Ser Ile Pro Thr
Asp Asn Gln Ile Lys Ala 260 265
270Arg Lys49470PRTHomo sapiens 49Met Ala Gly Glu Asn His Gln Trp Gln Gly
Ser Ile Leu Tyr Asn Met1 5 10
15Leu Met Ser Ala Lys Gln Thr Arg Ala Ala Pro Glu Ala Pro Glu Thr
20 25 30Arg Leu Val Asp Gln Cys
Trp Gly Cys Ser Cys Gly Asp Glu Pro Gly 35 40
45Val Gly Arg Glu Gly Leu Leu Gly Gly Arg Asn Val Ala Leu
Leu Tyr 50 55 60Arg Cys Cys Phe Cys
Gly Lys Asp His Pro Arg Gln Gly Ser Ile Leu65 70
75 80Tyr Ser Met Leu Thr Ser Ala Lys Gln Thr
Tyr Ala Ala Pro Lys Ala 85 90
95Pro Glu Ala Thr Leu Gly Pro Cys Trp Gly Cys Ser Cys Gly Ser Asp
100 105 110Pro Gly Val Gly Arg
Ala Gly Leu Pro Gly Gly Arg Pro Val Ala Leu 115
120 125Leu Tyr Arg Cys Cys Phe Cys Gly Glu Asp His Pro
Arg Gln Gly Ser 130 135 140Ile Leu Tyr
Ser Leu Leu Thr Ser Ser Lys Gln Thr His Val Ala Pro145
150 155 160Ala Ala Pro Glu Ala Arg Pro
Gly Gly Ala Trp Trp Asp Arg Ser Tyr 165
170 175Phe Ala Gln Arg Pro Gly Gly Lys Glu Ala Leu Pro
Gly Gly Arg Ala 180 185 190Thr
Ala Leu Leu Tyr Arg Cys Cys Phe Cys Gly Glu Asp His Pro Gln 195
200 205Gln Gly Ser Thr Leu Tyr Cys Val Pro
Thr Ser Thr Asn Gln Ala Gln 210 215
220Ala Ala Pro Glu Glu Arg Pro Arg Ala Pro Trp Trp Asp Thr Ser Ser225
230 235 240Gly Ala Leu Arg
Pro Val Ala Leu Lys Ser Pro Gln Val Val Cys Glu 245
250 255Ala Ala Ser Ala Gly Leu Leu Lys Thr Leu
Arg Phe Val Lys Tyr Leu 260 265
270Pro Cys Phe Gln Val Leu Pro Leu Asp Gln Gln Leu Val Leu Val Arg
275 280 285Asn Cys Trp Ala Ser Leu Leu
Met Leu Glu Leu Ala Gln Asp Arg Leu 290 295
300Gln Phe Glu Thr Val Glu Val Ser Glu Pro Ser Met Leu Gln Lys
Ile305 310 315 320Leu Thr
Thr Arg Arg Arg Glu Thr Gly Gly Asn Glu Pro Leu Pro Val
325 330 335Pro Thr Leu Gln His His Leu
Ala Pro Pro Ala Glu Ala Arg Lys Val 340 345
350Pro Ser Ala Ser Gln Val Gln Ala Ile Lys Cys Phe Leu Ser
Lys Cys 355 360 365Trp Ser Leu Asn
Ile Ser Thr Lys Glu Tyr Ala Tyr Leu Lys Gly Thr 370
375 380Val Leu Phe Asn Pro Asp Val Pro Gly Leu Gln Cys
Val Lys Tyr Ile385 390 395
400Gln Gly Leu Gln Trp Gly Thr Gln Gln Ile Leu Ser Glu His Thr Arg
405 410 415Met Thr His Gln Gly
Pro His Asp Arg Phe Ile Glu Leu Asn Ser Thr 420
425 430Leu Phe Leu Leu Arg Phe Ile Asn Ala Asn Val Ile
Ala Glu Leu Phe 435 440 445Phe Arg
Pro Ile Ile Gly Thr Val Ser Met Asp Asp Met Met Leu Glu 450
455 460Met Leu Cys Thr Lys Ile465
47050560PRTHomo sapiens 50Met Gly Lys Trp Arg Pro Arg Gly Cys Cys Arg Gly
Asn Met Gln Cys1 5 10
15Arg Gln Glu Val Pro Ala Thr Leu Thr Ser Ser Glu Leu Phe Ser Thr
20 25 30Arg Asn Gln Pro Gln Pro Gln
Pro Gln Pro Leu Leu Ala Asp Ala Pro 35 40
45Val Pro Trp Ala Val Ala Ser Arg Met Cys Leu Thr Pro Gly Gln
Gly 50 55 60Cys Gly His Gln Gly Gln
Asp Glu Gly Pro Leu Pro Ala Pro Ser Pro65 70
75 80Pro Pro Ala Met Ser Ser Ser Arg Lys Asp His
Leu Gly Ala Ser Ser 85 90
95Ser Glu Pro Leu Pro Val Ile Ile Val Gly Asn Gly Pro Ser Gly Ile
100 105 110Cys Leu Ser Tyr Leu Leu
Ser Gly Tyr Thr Pro Tyr Thr Lys Pro Asp 115 120
125Ala Ile His Pro His Pro Leu Leu Gln Arg Lys Leu Thr Glu
Ala Pro 130 135 140Gly Val Ser Ile Leu
Asp Gln Asp Leu Asp Tyr Leu Ser Glu Gly Leu145 150
155 160Glu Gly Arg Ser Gln Ser Pro Val Ala Leu
Leu Phe Asp Ala Leu Leu 165 170
175Arg Pro Asp Thr Asp Phe Gly Gly Asn Met Lys Ser Val Leu Thr Trp
180 185 190Lys His Arg Lys Glu
His Ala Ile Pro His Val Val Leu Gly Arg Asn 195
200 205Leu Pro Gly Gly Ala Trp His Ser Ile Glu Gly Ser
Met Val Ile Leu 210 215 220Ser Gln Gly
Gln Trp Met Gly Leu Pro Asp Leu Glu Val Lys Asp Trp225
230 235 240Met Gln Lys Lys Arg Arg Gly
Leu Arg Asn Ser Arg Ala Thr Ala Gly 245
250 255Asp Ile Ala His Tyr Tyr Arg Asp Tyr Val Val Lys
Lys Gly Leu Gly 260 265 270His
Asn Phe Val Ser Gly Ala Val Val Thr Ala Val Glu Trp Gly Thr 275
280 285Pro Asp Pro Ser Ser Cys Gly Ala Gln
Asp Ser Ser Pro Leu Phe Gln 290 295
300Val Ser Gly Phe Leu Thr Arg Asn Gln Ala Gln Gln Pro Phe Ser Leu305
310 315 320Trp Ala Arg Asn
Val Val Leu Ala Thr Gly Thr Phe Asp Ser Pro Ala 325
330 335Arg Leu Gly Ile Pro Gly Glu Ala Leu Pro
Phe Ile His His Glu Leu 340 345
350Ser Ala Leu Glu Ala Ala Thr Arg Val Gly Ala Val Thr Pro Ala Ser
355 360 365Asp Pro Val Leu Ile Ile Gly
Ala Gly Leu Ser Ala Ala Asp Ala Val 370 375
380Leu Tyr Ala Arg His Tyr Asn Ile Pro Val Ile His Ala Phe Arg
Arg385 390 395 400Ala Val
Asp Asp Pro Gly Leu Val Phe Asn Gln Leu Pro Lys Met Leu
405 410 415Tyr Pro Glu Tyr His Lys Val
His Gln Met Met Arg Glu Gln Ser Ile 420 425
430Leu Ser Pro Ser Pro Tyr Glu Gly Tyr Arg Ser Leu Pro Arg
His Gln 435 440 445Leu Leu Cys Phe
Lys Glu Asp Cys Gln Ala Val Phe Gln Asp Leu Glu 450
455 460Gly Val Glu Lys Val Phe Gly Val Ser Leu Val Leu
Val Leu Ile Gly465 470 475
480Ser His Pro Asp Leu Ser Phe Leu Pro Gly Ala Gly Ala Asp Phe Ala
485 490 495Val Asp Pro Asp Gln
Pro Leu Ser Ala Lys Arg Asn Pro Ile Asp Val 500
505 510Asp Pro Phe Thr Tyr Gln Ser Thr Arg Gln Glu Gly
Leu Tyr Ala Met 515 520 525Gly Pro
Leu Ala Gly Asp Asn Phe Val Arg Phe Val Gln Gly Gly Ala 530
535 540Leu Ala Val Ala Ser Ser Leu Leu Arg Lys Glu
Thr Arg Lys Pro Pro545 550 555
56051483PRTHomo sapiens 51Met Ala Gln Ala Asp Ile Ala Leu Ile Gly
Leu Ala Val Met Gly Gln1 5 10
15Asn Leu Ile Leu Asn Met Asn Asp His Gly Phe Val Val Cys Ala Phe
20 25 30Asn Arg Thr Val Ser Lys
Val Asp Asp Phe Leu Ala Asn Glu Ala Lys 35 40
45Gly Thr Lys Val Val Gly Ala Gln Ser Leu Lys Glu Met Val
Ser Lys 50 55 60Leu Lys Lys Pro Arg
Arg Ile Ile Leu Leu Val Lys Ala Gly Gln Ala65 70
75 80Val Asp Asp Phe Ile Glu Lys Leu Val Pro
Leu Leu Asp Thr Gly Asp 85 90
95Ile Ile Ile Asp Gly Gly Asn Ser Glu Tyr Arg Asp Thr Thr Arg Arg
100 105 110Cys Arg Asp Leu Lys
Ala Lys Gly Ile Leu Phe Val Gly Ser Gly Val 115
120 125Ser Gly Gly Glu Glu Gly Ala Arg Tyr Gly Pro Ser
Leu Met Pro Gly 130 135 140Gly Asn Lys
Glu Ala Trp Pro His Ile Lys Thr Ile Phe Gln Gly Ile145
150 155 160Ala Ala Lys Val Gly Thr Gly
Glu Pro Cys Cys Asp Trp Val Gly Asp 165
170 175Glu Gly Ala Gly His Phe Val Lys Met Val His Asn
Gly Ile Glu Tyr 180 185 190Gly
Asp Met Gln Leu Ile Cys Glu Ala Tyr His Leu Met Lys Asp Val 195
200 205Leu Gly Met Ala Gln Asp Glu Met Ala
Gln Ala Phe Glu Asp Trp Asn 210 215
220Lys Thr Glu Leu Asp Ser Phe Leu Ile Glu Ile Thr Ala Asn Ile Leu225
230 235 240Lys Phe Gln Asp
Thr Asp Gly Lys His Leu Leu Pro Lys Ile Arg Asp 245
250 255Ser Ala Gly Gln Lys Gly Thr Gly Lys Trp
Thr Ala Ile Ser Ala Leu 260 265
270Glu Tyr Gly Val Pro Val Thr Leu Ile Gly Glu Ala Val Phe Ala Arg
275 280 285Cys Leu Ser Ser Leu Lys Asp
Glu Arg Ile Gln Ala Ser Lys Lys Leu 290 295
300Lys Gly Pro Gln Lys Phe Gln Phe Asp Gly Asp Lys Lys Ser Phe
Leu305 310 315 320Glu Asp
Ile Arg Lys Ala Leu Tyr Ala Ser Lys Ile Ile Ser Tyr Ala
325 330 335Gln Gly Phe Met Leu Leu Arg
Gln Ala Ala Thr Glu Phe Gly Trp Thr 340 345
350Leu Asn Tyr Gly Gly Ile Ala Leu Met Trp Arg Gly Gly Cys
Ile Ile 355 360 365Arg Ser Val Phe
Leu Gly Lys Ile Lys Asp Ala Phe Asp Arg Asn Pro 370
375 380Glu Leu Gln Asn Leu Leu Leu Asp Asp Phe Phe Lys
Ser Ala Val Glu385 390 395
400Asn Cys Gln Asp Ser Trp Arg Arg Ala Val Ser Thr Gly Val Gln Ala
405 410 415Gly Ile Pro Met Pro
Cys Phe Thr Thr Ala Leu Ser Phe Tyr Asp Gly 420
425 430Tyr Arg His Glu Met Leu Pro Ala Ser Leu Ile Gln
Ala Gln Arg Asp 435 440 445Tyr Phe
Gly Ala His Thr Tyr Glu Leu Leu Ala Lys Pro Gly Gln Phe 450
455 460Ile His Thr Asn Trp Thr Gly His Gly Gly Thr
Val Ser Ser Ser Ser465 470 475
480Tyr Asn Ala52272PRTHomo sapiens 52Met His Leu Arg Leu Ile Ser Trp
Leu Phe Ile Ile Leu Asn Phe Met1 5 10
15Glu Tyr Ile Gly Ser Gln Asn Ala Ser Arg Gly Arg Arg Gln
Arg Arg 20 25 30Met His Pro
Asn Val Ser Gln Gly Cys Gln Gly Gly Cys Ala Thr Cys 35
40 45Ser Asp Tyr Asn Gly Cys Leu Ser Cys Lys Pro
Arg Leu Phe Phe Ala 50 55 60Leu Glu
Arg Ile Gly Met Lys Gln Ile Gly Val Cys Leu Ser Ser Cys65
70 75 80Pro Ser Gly Tyr Tyr Gly Thr
Arg Tyr Pro Asp Ile Asn Lys Cys Thr 85 90
95Lys Cys Lys Ala Asp Cys Asp Thr Cys Phe Asn Lys Asn
Phe Cys Thr 100 105 110Lys Cys
Lys Ser Gly Phe Tyr Leu His Leu Gly Lys Cys Leu Asp Asn 115
120 125Cys Pro Glu Gly Leu Glu Ala Asn Asn His
Thr Met Glu Cys Val Ser 130 135 140Ile
Val His Cys Glu Val Ser Glu Trp Asn Pro Trp Ser Pro Cys Thr145
150 155 160Lys Lys Gly Lys Thr Cys
Gly Phe Lys Arg Gly Thr Glu Thr Arg Val 165
170 175Arg Glu Ile Ile Gln His Pro Ser Ala Lys Gly Asn
Leu Cys Pro Pro 180 185 190Thr
Asn Glu Thr Arg Lys Cys Thr Val Gln Arg Lys Lys Cys Gln Lys 195
200 205Gly Glu Arg Gly Lys Lys Gly Arg Glu
Arg Lys Arg Lys Lys Pro Asn 210 215
220Lys Gly Glu Ser Lys Glu Ala Ile Pro Asp Ser Lys Ser Leu Glu Ser225
230 235 240Ser Lys Glu Ile
Pro Glu Gln Arg Glu Asn Lys Gln Gln Gln Lys Lys 245
250 255Arg Lys Val Gln Asp Lys Gln Lys Ser Val
Ser Val Ser Thr Val His 260 265
27053501PRTHomo sapiens 53Met Val Arg Lys Pro Val Val Ser Thr Ile Ser
Lys Gly Gly Tyr Leu1 5 10
15Gln Gly Asn Val Asn Gly Arg Leu Pro Ser Leu Gly Asn Lys Glu Pro
20 25 30Pro Gly Gln Glu Lys Val Gln
Leu Lys Arg Lys Val Thr Leu Leu Arg 35 40
45Gly Val Ser Ile Ile Ile Gly Thr Ile Ile Gly Ala Gly Ile Phe
Ile 50 55 60Ser Pro Lys Gly Val Leu
Gln Asn Thr Gly Ser Val Gly Met Ser Leu65 70
75 80Thr Ile Trp Thr Val Cys Gly Val Leu Ser Leu
Phe Gly Ala Leu Ser 85 90
95Tyr Ala Glu Leu Gly Thr Thr Ile Lys Lys Ser Gly Gly His Tyr Thr
100 105 110Tyr Ile Leu Glu Val Phe
Gly Pro Leu Pro Ala Phe Val Arg Val Trp 115 120
125Val Glu Leu Leu Ile Ile Arg Pro Ala Ala Thr Ala Val Ile
Ser Leu 130 135 140Ala Phe Gly Arg Tyr
Ile Leu Glu Pro Phe Phe Ile Gln Cys Glu Ile145 150
155 160Pro Glu Leu Ala Ile Lys Leu Ile Thr Ala
Val Gly Ile Thr Val Val 165 170
175Met Val Leu Asn Ser Met Ser Val Ser Trp Ser Ala Arg Ile Gln Ile
180 185 190Phe Leu Thr Phe Cys
Lys Leu Thr Ala Ile Leu Ile Ile Ile Val Pro 195
200 205Gly Val Met Gln Leu Ile Lys Gly Gln Thr Gln Asn
Phe Lys Asp Ala 210 215 220Phe Ser Gly
Arg Asp Ser Ser Ile Thr Arg Leu Pro Leu Ala Phe Tyr225
230 235 240Tyr Gly Met Tyr Ala Tyr Ala
Gly Trp Phe Tyr Leu Asn Phe Val Thr 245
250 255Glu Glu Val Glu Asn Pro Glu Lys Thr Ile Pro Leu
Ala Ile Cys Ile 260 265 270Ser
Met Ala Ile Val Thr Ile Gly Tyr Val Leu Thr Asn Val Ala Tyr 275
280 285Phe Thr Thr Ile Asn Ala Glu Glu Leu
Leu Leu Ser Asn Ala Val Ala 290 295
300Val Thr Phe Ser Glu Arg Leu Leu Gly Asn Phe Ser Leu Ala Val Pro305
310 315 320Ile Phe Val Ala
Leu Ser Cys Phe Gly Ser Met Asn Gly Gly Val Phe 325
330 335Ala Val Ser Arg Leu Phe Tyr Val Ala Ser
Arg Glu Gly His Leu Pro 340 345
350Glu Ile Leu Ser Met Ile His Val Arg Lys His Thr Pro Leu Pro Ala
355 360 365Val Ile Val Leu His Pro Leu
Thr Met Ile Met Leu Phe Ser Gly Asp 370 375
380Leu Asp Ser Leu Leu Asn Phe Leu Ser Phe Ala Arg Trp Leu Phe
Ile385 390 395 400Gly Leu
Ala Val Ala Gly Leu Ile Tyr Leu Arg Tyr Lys Cys Pro Asp
405 410 415Met His Arg Pro Phe Lys Val
Pro Leu Phe Ile Pro Ala Leu Phe Ser 420 425
430Phe Thr Cys Leu Phe Met Val Ala Leu Ser Leu Tyr Ser Asp
Pro Phe 435 440 445Ser Thr Gly Ile
Gly Phe Val Ile Thr Leu Thr Gly Val Pro Ala Tyr 450
455 460Tyr Leu Phe Ile Ile Trp Asp Lys Lys Pro Arg Trp
Phe Arg Ile Met465 470 475
480Ser Glu Lys Ile Thr Arg Thr Leu Gln Ile Ile Leu Glu Val Val Pro
485 490 495Glu Glu Asp Lys Leu
50054137PRTHomo sapiens 54Met Gly Leu Arg Ala Gly Gly Thr Leu
Gly Arg Ala Gly Ala Gly Arg1 5 10
15Gly Ala Pro Glu Gly Pro Gly Pro Ser Gly Gly Ala Gln Gly Gly
Ser 20 25 30Ile His Ser Gly
Arg Ile Ala Ala Val His Asn Val Pro Leu Ser Val 35
40 45Leu Ile Arg Pro Leu Pro Ser Val Leu Asp Pro Ala
Lys Val Gln Ser 50 55 60Leu Val Asp
Thr Ile Arg Glu Asp Pro Asp Ser Val Pro Pro Ile Asp65 70
75 80Val Leu Trp Ile Lys Gly Ala Gln
Gly Gly Asp Tyr Phe Tyr Ser Phe 85 90
95Gly Gly Cys His Arg Tyr Ala Ala Tyr Gln Gln Leu Gln Arg
Glu Thr 100 105 110Ile Pro Ala
Lys Leu Val Gln Ser Thr Leu Ser Asp Leu Arg Val Tyr 115
120 125Leu Gly Ala Ser Thr Pro Asp Leu Gln 130
13555337PRTHomo sapiens 55Met Ser Ser Ser Pro Val Lys Arg
Gln Arg Met Glu Ser Ala Leu Asp1 5 10
15Gln Leu Lys Gln Phe Thr Thr Val Val Ala Asp Thr Gly Asp
Phe His 20 25 30Ala Ile Asp
Glu Tyr Lys Pro Gln Asp Ala Thr Thr Asn Pro Ser Leu 35
40 45Ile Leu Ala Ala Ala Gln Met Pro Ala Tyr Gln
Glu Leu Val Glu Glu 50 55 60Ala Ile
Ala Tyr Gly Arg Lys Leu Gly Gly Ser Gln Glu Asp Gln Ile65
70 75 80Lys Asn Ala Ile Asp Lys Leu
Phe Val Leu Phe Gly Ala Glu Ile Leu 85 90
95Lys Lys Ile Pro Gly Arg Val Ser Thr Glu Val Asp Ala
Arg Leu Ser 100 105 110Phe Asp
Lys Asp Ala Met Val Ala Arg Ala Arg Arg Leu Ile Glu Leu 115
120 125Tyr Lys Glu Ala Gly Ile Ser Lys Asp Arg
Ile Leu Ile Lys Leu Ser 130 135 140Ser
Thr Trp Glu Gly Ile Gln Ala Gly Lys Glu Leu Glu Glu Gln His145
150 155 160Gly Ile His Cys Asn Met
Thr Leu Leu Phe Ser Phe Ala Gln Ala Val 165
170 175Ala Cys Ala Glu Ala Gly Val Thr Leu Ile Ser Pro
Phe Val Gly Arg 180 185 190Ile
Leu Asp Trp His Val Ala Asn Thr Asp Lys Lys Ser Tyr Glu Pro 195
200 205Leu Glu Asp Pro Gly Val Lys Ser Val
Thr Lys Ile Tyr Asn Tyr Tyr 210 215
220Lys Lys Phe Ser Tyr Lys Thr Ile Val Met Gly Ala Ser Phe Arg Asn225
230 235 240Thr Gly Glu Ile
Lys Ala Leu Ala Gly Cys Asp Phe Leu Thr Ile Ser 245
250 255Pro Lys Leu Leu Gly Glu Leu Leu Gln Asp
Asn Ala Lys Leu Val Pro 260 265
270Val Leu Ser Ala Lys Ala Ala Gln Ala Ser Asp Leu Glu Lys Ile His
275 280 285Leu Asp Glu Lys Ser Phe Arg
Trp Leu His Asn Glu Asp Gln Met Ala 290 295
300Val Glu Lys Leu Ser Asp Gly Ile Arg Lys Phe Ala Ala Asp Ala
Val305 310 315 320Lys Leu
Glu Arg Met Leu Thr Glu Arg Met Phe Asn Ala Glu Asn Gly
325 330 335Lys56564PRTHomo sapiens 56Met
Ala Glu Leu Asp Leu Met Ala Pro Gly Pro Leu Pro Arg Ala Thr1
5 10 15Ala Gln Pro Pro Ala Pro Leu
Ser Pro Asp Ser Gly Ser Pro Ser Pro 20 25
30Asp Ser Gly Ser Ala Ser Pro Val Glu Glu Glu Asp Val Gly
Ser Ser 35 40 45Glu Lys Leu Gly
Arg Glu Thr Glu Glu Gln Asp Ser Asp Ser Ala Glu 50 55
60Gln Gly Asp Pro Ala Gly Glu Gly Lys Glu Val Leu Cys
Asp Phe Cys65 70 75
80Leu Asp Asp Thr Arg Arg Val Lys Ala Val Lys Ser Cys Leu Thr Cys
85 90 95Met Val Asn Tyr Cys Glu
Glu His Leu Gln Pro His Gln Val Asn Ile 100
105 110Lys Leu Gln Ser His Leu Leu Thr Glu Pro Val Lys
Asp His Asn Trp 115 120 125Arg Tyr
Cys Pro Ala His His Ser Pro Leu Ser Ala Phe Cys Cys Pro 130
135 140Asp Gln Gln Cys Ile Cys Gln Asp Cys Cys Gln
Glu His Ser Gly His145 150 155
160Thr Ile Val Ser Leu Asp Ala Ala Arg Arg Asp Lys Glu Ala Glu Leu
165 170 175Gln Cys Thr Gln
Leu Asp Leu Glu Arg Lys Leu Lys Leu Asn Glu Asn 180
185 190Ala Ile Ser Arg Leu Gln Ala Asn Gln Lys Ser
Val Leu Val Ser Val 195 200 205Ser
Glu Val Lys Ala Val Ala Glu Met Gln Phe Gly Glu Leu Leu Ala 210
215 220Ala Val Arg Lys Ala Gln Ala Asn Val Met
Leu Phe Leu Glu Glu Lys225 230 235
240Glu Gln Ala Ala Leu Ser Gln Ala Asn Gly Ile Lys Ala His Leu
Glu 245 250 255Tyr Arg Ser
Ala Glu Met Glu Lys Ser Lys Gln Glu Leu Glu Arg Met 260
265 270Ala Ala Ile Ser Asn Thr Val Gln Phe Leu
Glu Glu Tyr Cys Lys Phe 275 280
285Lys Asn Thr Glu Asp Ile Thr Phe Pro Ser Val Tyr Val Gly Leu Lys 290
295 300Asp Lys Leu Ser Gly Ile Arg Lys
Val Ile Thr Glu Ser Thr Val His305 310
315 320Leu Ile Gln Leu Leu Glu Asn Tyr Lys Lys Lys Leu
Gln Glu Phe Ser 325 330
335Lys Glu Glu Glu Tyr Asp Ile Arg Thr Gln Val Ser Ala Val Val Gln
340 345 350Arg Lys Tyr Trp Thr Ser
Lys Pro Glu Pro Ser Thr Arg Glu Gln Phe 355 360
365Leu Gln Tyr Ala Tyr Asp Ile Thr Phe Asp Pro Asp Thr Ala
His Lys 370 375 380Tyr Leu Arg Leu Gln
Glu Glu Asn Arg Lys Val Thr Asn Thr Thr Pro385 390
395 400Trp Glu His Pro Tyr Pro Asp Leu Pro Ser
Arg Phe Leu His Trp Arg 405 410
415Gln Val Leu Ser Gln Gln Ser Leu Tyr Leu His Arg Tyr Tyr Phe Glu
420 425 430Val Glu Ile Phe Gly
Ala Gly Thr Tyr Val Gly Leu Thr Cys Lys Gly 435
440 445Ile Asp Arg Lys Gly Glu Glu Arg Asn Ser Cys Ile
Ser Gly Asn Asn 450 455 460Phe Ser Trp
Ser Leu Gln Trp Asn Gly Lys Glu Phe Thr Ala Trp Tyr465
470 475 480Ser Asp Met Glu Thr Pro Leu
Lys Ala Gly Pro Phe Arg Arg Leu Gly 485
490 495Val Tyr Ile Asp Phe Pro Gly Gly Ile Leu Ser Phe
Tyr Gly Val Glu 500 505 510Tyr
Asp Thr Met Thr Leu Val His Lys Phe Ala Cys Lys Phe Ser Glu 515
520 525Pro Val Tyr Ala Ala Phe Trp Leu Ser
Lys Lys Glu Asn Ala Ile Arg 530 535
540Ile Val Asp Leu Gly Glu Glu Pro Glu Lys Pro Ala Pro Ser Leu Val545
550 555 560Gly Thr Ala
Pro57348PRTHomo sapiens 57Met Gln Phe Gly Glu Leu Leu Ala Ala Val Arg Lys
Ala Gln Ala Asn1 5 10
15Val Met Leu Phe Leu Glu Glu Lys Glu Gln Ala Ala Leu Ser Gln Ala
20 25 30Asn Gly Ile Lys Ala His Leu
Glu Tyr Arg Ser Ala Glu Met Glu Lys 35 40
45Ser Lys Gln Glu Leu Glu Thr Met Ala Ala Ile Ser Asn Thr Val
Gln 50 55 60Phe Leu Glu Glu Tyr Cys
Lys Phe Lys Asn Thr Glu Asp Ile Thr Phe65 70
75 80Pro Ser Val Tyr Ile Gly Leu Lys Asp Lys Leu
Ser Gly Ile Arg Lys 85 90
95Val Ile Thr Glu Ser Thr Val His Leu Ile Gln Leu Leu Glu Asn Tyr
100 105 110Lys Lys Lys Leu Gln Glu
Phe Ser Lys Glu Glu Glu Tyr Asp Ile Arg 115 120
125Thr Gln Val Ser Ala Ile Val Gln Arg Lys Tyr Trp Thr Ser
Lys Pro 130 135 140Glu Pro Ser Thr Arg
Glu Gln Phe Leu Gln Tyr Val His Asp Ile Thr145 150
155 160Phe Asp Pro Asp Thr Ala His Lys Tyr Leu
Arg Leu Gln Glu Glu Asn 165 170
175Arg Lys Val Thr Asn Thr Thr Pro Trp Glu His Pro Tyr Pro Asp Leu
180 185 190Pro Ser Arg Phe Leu
His Trp Arg Gln Val Leu Ser Gln Gln Ser Leu 195
200 205Tyr Leu His Arg Tyr Tyr Phe Glu Val Glu Ile Phe
Gly Ala Gly Thr 210 215 220Tyr Val Gly
Leu Thr Cys Lys Gly Ile Asp Gln Lys Gly Glu Glu Arg225
230 235 240Ser Ser Cys Ile Ser Gly Asn
Asn Phe Ser Trp Ser Leu Gln Trp Asn 245
250 255Gly Lys Glu Phe Thr Ala Trp Tyr Ser Asp Met Glu
Thr Pro Leu Lys 260 265 270Ala
Gly Pro Phe Trp Arg Leu Gly Val Tyr Ile Asp Phe Pro Gly Gly 275
280 285Ile Leu Ser Phe Tyr Gly Val Glu Tyr
Asp Ser Met Thr Leu Val His 290 295
300Lys Phe Ala Cys Lys Phe Ser Glu Pro Val Tyr Ala Ala Phe Trp Leu305
310 315 320Ser Lys Lys Glu
Asn Ala Ile Arg Ile Val Asp Leu Gly Glu Glu Pro 325
330 335Glu Lys Pro Ala Pro Ser Leu Val Gly Thr
Ala Pro 340 34558105PRTHomo sapiens 58Met Val
Lys Gln Ile Glu Ser Lys Thr Ala Phe Gln Glu Ala Leu Asp1 5
10 15Ala Ala Gly Asp Lys Leu Val Val
Val Asp Phe Ser Ala Thr Trp Cys 20 25
30Gly Pro Cys Lys Met Ile Lys Pro Phe Phe His Ser Leu Ser Glu
Lys 35 40 45Tyr Ser Asn Val Ile
Phe Leu Glu Val Asp Val Asp Asp Cys Gln Asp 50 55
60Val Ala Ser Glu Cys Glu Val Lys Cys Met Pro Thr Phe Gln
Phe Phe65 70 75 80Lys
Lys Gly Gln Lys Val Gly Glu Phe Ser Gly Ala Asn Lys Glu Lys
85 90 95Leu Glu Ala Thr Ile Asn Glu
Leu Val 100 10559649PRTHomo
sapiensMISC_FEATURE(648)..(648)Xaa is pyrrolidone carboxylic acid 59Met
Gly Cys Ala Glu Gly Lys Ala Val Ala Ala Ala Ala Pro Thr Glu1
5 10 15Leu Gln Thr Lys Gly Lys Asn
Gly Asp Gly Arg Arg Arg Ser Ala Lys 20 25
30Asp His His Pro Gly Lys Thr Leu Pro Glu Asn Pro Ala Gly
Phe Thr 35 40 45Ser Thr Ala Thr
Ala Asp Ser Arg Ala Leu Leu Gln Ala Tyr Ile Asp 50 55
60Gly His Ser Val Val Ile Phe Ser Arg Ser Thr Cys Thr
Arg Cys Thr65 70 75
80Glu Val Lys Lys Leu Phe Lys Ser Leu Cys Val Pro Tyr Phe Val Leu
85 90 95Glu Leu Asp Gln Thr Glu
Asp Gly Arg Ala Leu Glu Gly Thr Leu Ser 100
105 110Glu Leu Ala Ala Glu Thr Asp Leu Pro Val Val Phe
Val Lys Gln Arg 115 120 125Lys Ile
Gly Gly His Gly Pro Thr Leu Lys Ala Tyr Gln Glu Gly Arg 130
135 140Leu Gln Lys Leu Leu Lys Met Asn Gly Pro Glu
Asp Leu Pro Lys Ser145 150 155
160Tyr Asp Tyr Asp Leu Ile Ile Ile Gly Gly Gly Ser Gly Gly Leu Ala
165 170 175Ala Ala Lys Glu
Ala Ala Gln Tyr Gly Lys Lys Val Met Val Leu Asp 180
185 190Phe Val Thr Pro Thr Pro Leu Gly Thr Arg Trp
Gly Leu Gly Gly Thr 195 200 205Cys
Val Asn Val Gly Cys Ile Pro Lys Lys Leu Met His Gln Ala Ala 210
215 220Leu Leu Gly Gln Ala Leu Gln Asp Ser Arg
Asn Tyr Gly Trp Lys Val225 230 235
240Glu Glu Thr Val Lys His Asp Trp Asp Arg Met Ile Glu Ala Val
Gln 245 250 255Asn His Ile
Gly Ser Leu Asn Trp Gly Tyr Arg Val Ala Leu Arg Glu 260
265 270Lys Lys Val Val Tyr Glu Asn Ala Tyr Gly
Gln Phe Ile Gly Pro His 275 280
285Arg Ile Lys Ala Thr Asn Asn Lys Gly Lys Glu Lys Ile Tyr Ser Ala 290
295 300Glu Arg Phe Leu Ile Ala Thr Gly
Glu Arg Pro Arg Tyr Leu Gly Ile305 310
315 320Pro Gly Asp Lys Glu Tyr Cys Ile Ser Ser Asp Asp
Leu Phe Ser Leu 325 330
335Pro Tyr Cys Pro Gly Lys Thr Leu Val Val Gly Ala Ser Tyr Val Ala
340 345 350Leu Glu Cys Ala Gly Phe
Leu Ala Gly Ile Gly Leu Asp Val Thr Val 355 360
365Met Val Arg Ser Ile Leu Leu Arg Gly Phe Asp Gln Asp Met
Ala Asn 370 375 380Lys Ile Gly Glu His
Met Glu Glu His Gly Ile Lys Phe Ile Arg Gln385 390
395 400Phe Val Pro Ile Lys Val Glu Gln Ile Glu
Ala Gly Thr Pro Gly Arg 405 410
415Leu Arg Val Val Ala Gln Ser Thr Asn Ser Glu Glu Ile Ile Glu Gly
420 425 430Glu Tyr Asn Thr Val
Met Leu Ala Ile Gly Arg Asp Ala Cys Thr Arg 435
440 445Lys Ile Gly Leu Glu Thr Val Gly Val Lys Ile Asn
Glu Lys Thr Gly 450 455 460Lys Ile Pro
Val Thr Asp Glu Glu Gln Thr Asn Val Pro Tyr Ile Tyr465
470 475 480Ala Ile Gly Asp Ile Leu Glu
Asp Lys Val Glu Leu Thr Pro Val Ala 485
490 495Ile Gln Ala Gly Arg Leu Leu Ala Gln Arg Leu Tyr
Ala Gly Ser Thr 500 505 510Val
Lys Cys Asp Tyr Glu Asn Val Pro Thr Thr Val Phe Thr Pro Leu 515
520 525Glu Tyr Gly Ala Cys Gly Leu Ser Glu
Glu Lys Ala Val Glu Lys Phe 530 535
540Gly Glu Glu Asn Ile Glu Val Tyr His Ser Tyr Phe Trp Pro Leu Glu545
550 555 560Trp Thr Ile Pro
Ser Arg Asp Asn Asn Lys Cys Tyr Ala Lys Ile Ile 565
570 575Cys Asn Thr Lys Asp Asn Glu Arg Val Val
Gly Phe His Val Leu Gly 580 585
590Pro Asn Ala Gly Glu Val Thr Gln Gly Phe Ala Ala Ala Leu Lys Cys
595 600 605Gly Leu Thr Lys Lys Gln Leu
Asp Ser Thr Ile Gly Ile His Pro Val 610 615
620Cys Ala Glu Val Phe Thr Thr Leu Ser Val Thr Lys Arg Ser Gly
Ala625 630 635 640Ser Ile
Leu Gln Ala Gly Cys Xaa Gly 64560494PRTHomo sapiens 60Met
Phe Glu Ile Lys Lys Ile Cys Cys Ile Gly Ala Gly Tyr Val Gly1
5 10 15Gly Pro Thr Cys Ser Val Ile
Ala His Met Cys Pro Glu Ile Arg Val 20 25
30Thr Val Val Asp Val Asn Glu Ser Arg Ile Asn Ala Trp Asn
Ser Pro 35 40 45Thr Leu Pro Ile
Tyr Glu Pro Gly Leu Lys Glu Val Val Glu Ser Cys 50 55
60Arg Gly Lys Asn Leu Phe Phe Ser Thr Asn Ile Asp Asp
Ala Ile Lys65 70 75
80Glu Ala Asp Leu Val Phe Ile Ser Val Asn Thr Pro Thr Lys Thr Tyr
85 90 95Gly Met Gly Lys Gly Arg
Ala Ala Asp Leu Lys Tyr Ile Glu Ala Cys 100
105 110Ala Arg Arg Ile Val Gln Asn Ser Asn Gly Tyr Lys
Ile Val Thr Glu 115 120 125Lys Ser
Thr Val Pro Val Arg Ala Ala Glu Ser Ile Arg Arg Ile Phe 130
135 140Asp Ala Asn Thr Lys Pro Asn Leu Asn Leu Gln
Val Leu Ser Asn Pro145 150 155
160Glu Phe Leu Ala Glu Gly Thr Ala Ile Lys Asp Leu Lys Asn Pro Asp
165 170 175Arg Val Leu Ile
Gly Gly Asp Glu Thr Pro Glu Gly Gln Arg Ala Val 180
185 190Gln Ala Leu Cys Ala Val Tyr Glu His Trp Val
Pro Arg Glu Lys Ile 195 200 205Leu
Thr Thr Asn Thr Trp Ser Ser Glu Leu Ser Lys Leu Ala Ala Asn 210
215 220Ala Phe Leu Ala Gln Arg Ile Ser Ser Ile
Asn Ser Ile Ser Ala Leu225 230 235
240Cys Glu Ala Thr Gly Ala Asp Val Glu Glu Val Ala Thr Ala Ile
Gly 245 250 255Met Asp Gln
Arg Ile Gly Asn Lys Phe Leu Lys Ala Ser Val Gly Phe 260
265 270Gly Gly Ser Cys Phe Gln Lys Asp Val Leu
Asn Leu Val Tyr Leu Cys 275 280
285Glu Ala Leu Asn Leu Pro Glu Val Ala Arg Tyr Trp Gln Gln Val Ile 290
295 300Asp Met Asn Asp Tyr Gln Arg Arg
Arg Phe Ala Ser Arg Ile Ile Asp305 310
315 320Ser Leu Phe Asn Thr Val Thr Asp Lys Lys Ile Ala
Ile Leu Gly Phe 325 330
335Ala Phe Lys Lys Asp Thr Gly Asp Thr Arg Glu Ser Ser Ser Ile Tyr
340 345 350Ile Ser Lys Tyr Leu Met
Asp Glu Gly Ala His Leu His Ile Tyr Asp 355 360
365Pro Lys Val Pro Arg Glu Gln Ile Val Val Asp Leu Ser His
Pro Gly 370 375 380Val Ser Glu Asp Asp
Gln Val Ser Arg Leu Val Thr Ile Ser Lys Asp385 390
395 400Pro Tyr Glu Ala Cys Asp Gly Ala His Ala
Val Val Ile Cys Thr Glu 405 410
415Trp Asp Met Phe Lys Glu Leu Asp Tyr Glu Arg Ile His Lys Lys Met
420 425 430Leu Lys Pro Ala Phe
Ile Phe Asp Gly Arg Arg Val Leu Asp Gly Leu 435
440 445His Asn Glu Leu Gln Thr Ile Gly Phe Gln Ile Glu
Thr Ile Gly Lys 450 455 460Lys Val Ser
Ser Lys Arg Ile Pro Tyr Ala Pro Ser Gly Glu Ile Pro465
470 475 480Lys Phe Ser Leu Gln Asp Pro
Pro Asn Lys Lys Pro Lys Val 485
490

<210> 61
<211> 78
<212> DNA
<213> Homo sapiens
<400> 61
aaacctgcca taactttccc aagaactgag tactctgtac ctgggagtag ttggcagatc 60
cactggtttc tgactgga 78

<210> 62
<211> 78
<212> DNA
<213> Homo sapiens
<400> 62
tgtggaccta actaggggga gcctaaaata atgttgggac tacctagatg gtcagaaaga 60
atgagccaat taacttct 78

<210> 63
<211> 78
<212> DNA
<213> Homo sapiens
<400> 63
aaacctgcca taactttccc aagaactgag tactctgtac tacctagatg gtcagaaaga 60
atgagccaat taacttct 78

<210> 64
<211> 79
<212> DNA
<213> Homo sapiens
<400> 64
aggttaggta ctgaactcat caggaggctg aggttggaaa gtagatttga caaggttaag 60
taaaagaaag gcaaagctg 79

<210> 65
<211> 79
<212> DNA
<213> Homo sapiens
<400> 65
attttttcgg gttttttttc cacttttttc cttttgaaat tttattattt atttactcat 60
tttgagatag ggtctcact 79

<210> 66
<211> 79
<212> DNA
<213> Homo sapiens
<400> 66
aggttgggta ctgaactcat caggaggctg agtttgaaat tttattattt atttactcat 60
tttgagatag ggtctcact 79

<210> 67
<211> 79
<212> DNA
<213> Homo sapiens
<400> 67
cttggttctc ctgctactac ttctgttgct gctacttgat ccttacagga tgtttctata 60
ctttacaaaa ctctttggt 79

<210> 68
<211> 79
<212> DNA
<213> Homo sapiens
<400> 68
gtgatggcag tgggcacgcc catatacatt tgcatacact ctaatataaa tgtttacaaa 60
catacacaca cacacattc 79

<210> 69
<211> 79
<212> DNA
<213> Homo sapiens
<400> 69
cttggttctc ctgctactac ttctgttgct gctacttgat ctaatataaa tgtttacaaa 60
catacacaca cacacattc 79