Patent application title: Construct and Sequence for Enhanced Gene Expression
Inventors:
Maurice Wilhelmus Van Der Heijden (Gouda, NL)
Bart Marinus Engels (Woerden, NL)
IPC8 Class: AC12N1567FI
Publication date: 2021-07-01
Patent application number: 20210198677
Abstract:
The invention relates to a method for transcription and expression using
a nucleic acid construct which is characterized by the presence of a
promoter followed by an intronic promoter. The invention further relates
to said nucleic acid construct, an expression vector and a cell
comprising said construct, and its use. The invention also relates to
methods for transcription and optionally expression using a nucleotide
sequence. The invention further relates to said nucleotide sequence and a
construct, expression vector and cell comprising said nucleotide
sequence, and its use.
Claims:
1.-15. (canceled)
16. A nucleic acid construct comprising a first promoter, a second promoter, and a single nucleotide sequence of interest, wherein said first promoter and second promoter are constitutive promoters and are both operably linked to said single nucleotide sequence of interest, and wherein said second promoter is an intronic promoter flanked by a first intronic sequence located upstream of said second promoter and a second intronic sequence located downstream of said second promoter, and wherein said single nucleotide sequence of interest is under the control of said first promoter and said second promoter.
17. The nucleic acid construct according to claim 1, wherein said nucleic acid construct further comprises an additional expression regulating sequence, and wherein said additional expression regulating sequence, said first promoter and said second promoter are all operably linked to said nucleic acid sequence of interest.
18. The nucleic acid construct according to claim 1, wherein said nucleotide sequence of interest encodes a protein or polypeptide of interest.
19. An expression vector comprising the nucleic acid construct according to claim 1.
20. An in vitro cell comprising the nucleic acid construct according to claim 1.
21. A non-human cell comprising the nucleic acid construct according to claim 1.
22. The nucleic acid construct according to claim 2, wherein said additional expression regulating sequence comprises or consists of an intron.
23. The nucleic acid construct according to claim 3, wherein said protein or polypeptide of interest is a heterologous protein or polypeptide.
24. The nucleic acid construct according to claim 1, wherein said first promoter comprises a sequence having at least 50% identity to SEQ ID NO: 1 over its whole length or having at least 50% identity to SEQ ID NO: 2 over its whole length.
25. The nucleic acid construct according to claim 1, wherein said second promoter comprises a sequence having at least 95% identity to SEQ ID NO: 57, 58, or 79 over its whole length.
26. The nucleic acid construct according to claim 1, wherein said second promoter comprises a sequence having at least 95% identity to SEQ ID NO: 57 or 58 over its whole length.
27. The nucleic acid construct according to claim 1, wherein said first promoter comprises a sequence having at least 50% identity to SEQ ID NO: 1 or 2 over its whole length and said second promoter comprises a sequence having at least 95% identity to SEQ ID NO: 57, 58 or 79 over its whole length.
28. The nucleic acid construct according to claim 1, wherein said first promoter comprises a sequence having at least 50% identity to SEQ ID NO: 1 over its whole length and said second promoter comprises a sequence having at least 95% identity to SEQ ID NO: 57 or 58 over its whole length.
29. The nucleic acid construct according to claim 1, wherein said first promoter comprises a sequence having at least 50% identity to SEQ ID NO: 2 over its whole length and said second promoter comprises a sequence having at least 95% identity to SEQ ID NO: 57 or 58 over its whole length.
30. The nucleic acid construct according to claim 1, wherein said second promoter is a human or murine cytomegalovirus (CMV) promoter.
31. The nucleic acid construct according to claim 1, wherein said first promoter and said first intronic sequence comprise the sequence of SEQ ID NO: 1.
32. The nucleic acid construct according to claim 1, wherein said second intronic sequence comprises the sequence of SEQ ID NO:19.
33. A method for transcription and optionally purifying the produced transcript comprising the steps of: a) providing a nucleic acid construct comprising a first promoter, a second promoter, and a single nucleotide sequence of interest, wherein said first promoter and second promoter are constitutive promoters and are both operably linked to said single nucleotide sequence of interest, and wherein said second promoter is an intronic promoter flanked by a first intronic sequence located upstream of said second promoter and a second intronic sequence located downstream of said second promoter, and wherein said single nucleotide sequence of interest is under the control of said first promoter and said second promoter, b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and, c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally, d) purifying said produced transcript.
34. A method for transcription and optionally purifying the produced transcript comprising the steps of: a) providing a nucleic acid construct comprising in the 5' to 3' direction an expression enhancing element, a heterologous promoter and a nucleotide sequence of interest, wherein said expression enhancing element has at least 50% identity to SEQ ID NO: 1 or 2 over its whole length, and wherein said expression enhancing element and said heterologous promoter are operably linked to a same, single nucleotide sequence of interest; and, b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and, c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally, d) purifying said produced transcript.
Description:
FIELD OF THE INVENTION
[0001] The invention relates to a method for transcription and expression using a nucleic acid construct which is characterized by the presence of a promoter followed by an intronic promoter. The invention further relates to said nucleic acid construct, an expression vector and a cell comprising said construct, and its use.
[0002] The invention also relates to methods for transcription and optionally expression using a nucleotide sequence. The invention further relates to said nucleotide sequence and a construct, expression vector and cell comprising said nucleotide sequence, and its use.
BACKGROUND OF THE INVENTION
[0003] There is still a need in the art for alternative and preferably improved methods for regulating the transcription of a transcript and optionally regulating the expression of a protein or polypeptide of interest in host cells.
SUMMARY OF THE INVENTION
[0004] The present invention relates to a method for transcription and optionally purifying the produced transcript comprising the steps of:
[0005] a) providing a nucleic acid construct comprising a first promoter, a second promoter, and a nucleotide sequence of interest, wherein said first and said second promoters are operably linked to said nucleotide sequence of interest, and wherein said second promoter is flanked by a first intronic sequence located upstream of said promoter and a second intronic sequence located downstream of said promoter; and,
[0006] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0007] c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally,
[0008] d) purifying said produced transcript.
[0009] The present invention further relates to a method for expressing and optionally purifying a protein or polypeptide of interest comprising the steps of:
[0010] a) providing a nucleic acid construct comprising a first promoter, a second promoter and a nucleotide sequence encoding a protein or polypeptide of interest, wherein said first and said second promoters are operably linked to said nucleotide sequence encoding a protein or polypeptide of interest, and wherein said second promoter is flanked by a first intronic sequence located upstream of said promoter and a second intronic sequence located downstream of said promoter; and,
[0011] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0012] c) allowing said transformed cell to express the protein or polypeptide of interest; and optionally, d) purifying said protein or polypeptide of interest.
[0013] Preferably, said first intronic sequence comprises at least a donor splice site and said second intronic sequence comprises at least an acceptor splice site. Moreover, the nucleic acid construct of step a) of the method of the invention comprises the following nucleotide sequences indicated here in their relative positions in the 5' to 3' direction: (i) a first promoter, (ii) a first intronic sequence comprising at least a donor splice site, (iii) a second promoter, (iv) a second intronic sequence comprising at least an acceptor splice site; and (v) a nucleotide sequence encoding a protein or polypeptide of interest, wherein preferably said first promoter, said first intronic sequence comprising at least a donor splice site, said second promoter, and said second intronic sequence comprising at least an acceptor splice site are all operably linked to said nucleotide sequence encoding a protein or polypeptide of interest.
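The relative 5' to 3' arrangement described in this paragraph can be illustrated with a minimal Python sketch. The element labels, the `Element` class, the coordinates and the function name `has_expected_layout` are hypothetical and serve only as an illustration; they are not taken from the application. The sketch assumes that the five named elements occur once each and do not overlap, and it ignores any additional expression regulating sequences placed between them.

```python
from dataclasses import dataclass

# Order of elements (i) to (v) as listed in the paragraph above.
# The string labels are illustrative, not terms defined by the application.
EXPECTED_ORDER = [
    "first_promoter",
    "first_intronic_sequence",   # comprises at least a donor splice site
    "second_promoter",
    "second_intronic_sequence",  # comprises at least an acceptor splice site
    "sequence_of_interest",
]


@dataclass(frozen=True)
class Element:
    kind: str
    start: int  # 0-based start position on the construct (hypothetical)
    end: int    # exclusive end position


def has_expected_layout(elements):
    """Return True if the five named elements occur once each, in 5'->3'
    order and without overlapping one another. Other elements are ignored."""
    ordered = sorted(elements, key=lambda e: e.start)
    core = [e for e in ordered if e.kind in EXPECTED_ORDER]
    if [e.kind for e in core] != EXPECTED_ORDER:
        return False
    return all(a.end <= b.start for a, b in zip(core, core[1:]))


# Hypothetical coordinates, for illustration only.
example = [
    Element("first_promoter", 0, 100),
    Element("first_intronic_sequence", 100, 150),
    Element("second_promoter", 150, 300),
    Element("second_intronic_sequence", 300, 350),
    Element("sequence_of_interest", 360, 900),
]
print(has_expected_layout(example))
```

A construct in which the second promoter precedes the first intronic sequence, or in which one of the five elements is missing, would fail this check.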
[0014] Preferably, said first promoter has at least 50% identity to nucleotides 1-969 of SEQ ID NO: 1 or nucleotides 1-614 of SEQ ID NO: 2 over its whole length. An overview of all SEQ ID NOs is given in Table 1. Preferably, a nucleotide sequence comprising both said first promoter and said first intronic sequence comprising at least a donor splice site has at least 50% identity to SEQ ID NO: 1 or 2 over its whole length. Preferably, said second promoter has at least 50% sequence identity with SEQ ID NO: 57 or SEQ ID NO: 58 over its whole length.
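How a threshold such as "at least 50% identity over its whole length" might be evaluated can be sketched as follows. This is an illustrative assumption and not the identity definition of the invention: the sketch performs a basic Needleman-Wunsch global alignment with arbitrarily chosen scores (match +1, mismatch -1, gap -1) and reports identical positions as a percentage of all alignment columns. The function name `global_identity` and the short test sequences are hypothetical. Any definition of identity given elsewhere in this application takes precedence, and in practice an established alignment tool would normally be used.

```python
def global_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -1) -> float:
    """Percent identity over a Needleman-Wunsch global alignment of a and b.

    Identity is computed as identical columns / total alignment columns,
    which is one of several possible conventions (an assumption here).
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back one optimal alignment, counting identical and total columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            matches += 1 if a[i - 1] == b[j - 1] else 0
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * matches / columns


# Hypothetical toy sequences; real promoter sequences are far longer.
print(global_identity("ACGT", "ACGA") >= 50.0)
```

Because the result depends on the scoring parameters and on the denominator chosen, a sequence close to a threshold could fall on either side of it under a different convention.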
[0015] The present invention further relates to a nucleic acid construct comprising a first promoter and a second promoter, wherein said first and said second promoters are configured to be both operably linked to an optional nucleotide sequence of interest, and wherein said second promoter is flanked by a first intronic sequence located upstream of said promoter and a second intronic sequence located downstream of said promoter. Preferably, said first intronic sequence comprises at least a donor splice site and preferably said second intronic sequence comprises at least an acceptor splice site. Moreover, preferably a nucleic acid construct of the invention comprises the following nucleotide sequences indicated here in their relative positions in the 5' to 3' direction: (i) a first promoter, (ii) a first intronic sequence comprising at least a donor splice site, (iii) a second promoter, (iv) a second intronic sequence comprising at least an acceptor splice site; and optionally (v) a nucleotide sequence of interest, wherein preferably said first promoter, said first intronic sequence comprising at least a donor splice site, said second promoter, and said second intronic sequence comprising at least an acceptor splice site are all configured to be operably linked to said optional nucleotide sequence of interest.
[0016] Preferably, said first promoter has at least 50% identity to nucleotides 1-969 of SEQ ID NO: 1 or nucleotides 1-614 of SEQ ID NO: 2 over its whole length. Preferably, a nucleotide sequence comprising both said first promoter and said first intronic sequence comprising at least a donor splice site has at least 50% identity to SEQ ID NO: 1 or 2 over its whole length. Preferably, said second promoter has at least 50% sequence identity with SEQ ID NO: 57 or SEQ ID NO: 58 over its whole length.
[0017] Preferably, said nucleic acid construct is an isolated construct. Preferably, said nucleic acid construct is a recombinant nucleic acid construct. Preferably, said optional nucleotide sequence of interest is a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a heterologous protein or polypeptide.
[0018] The present invention further relates to an expression vector comprising a nucleic acid construct or recombinant nucleic acid construct as defined herein.
[0019] The present invention further relates to a cell comprising a nucleic acid construct or recombinant nucleic acid construct as defined herein, and/or an expression vector as defined herein.
[0020] The present invention also relates to a use of a nucleic acid construct or recombinant nucleic acid construct as defined herein, and/or an expression vector as defined herein and/or a cell as defined herein for the transcription of a nucleotide sequence of interest.
[0021] The present invention further relates to a use of a nucleic acid construct or recombinant nucleic acid construct as defined herein, and/or an expression vector as defined herein and/or a cell as defined herein for the expression of a protein or polypeptide of interest.
[0022] The present invention further relates to a method for transcription and optionally purifying the produced transcript comprising the steps of:
[0023] a) providing a nucleic acid construct comprising an expression enhancing element, a heterologous promoter and a nucleotide sequence of interest of the invention, wherein said expression enhancing element and said heterologous promoter are operably linked to said nucleotide sequence of interest; and,
[0024] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0025] c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally,
[0026] d) purifying said produced transcript.
[0027] The present invention further relates to a method for expressing and optionally purifying a protein or polypeptide of interest comprising the steps of:
[0028] a) providing a nucleic acid construct comprising an expression enhancing element, a heterologous promoter and a nucleotide sequence encoding a protein or polypeptide of interest, wherein said expression enhancing element and said heterologous promoter are operably linked to said nucleotide sequence encoding a protein or polypeptide of interest; and,
[0029] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0030] c) allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0031] d) purifying said protein or polypeptide of interest.
[0032] Preferably, said nucleic acid construct of said method for transcription and/or expression and optionally purifying a transcript and/or protein or polypeptide of interest further comprises an additional expression regulating element operably linked to said nucleotide sequence of interest and/or said nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said additional expression regulating element comprises an intronic sequence. A preferred additional expression regulating element comprises or is an additional expression enhancing element. More preferably, said additional expression regulating element further comprises a translation enhancing element.
[0033] The present invention further relates to a nucleic acid molecule that is represented by a nucleotide sequence comprising an expression enhancing element of the invention, i.e. a nucleotide sequence that has at least 50% identity to SEQ ID NO: 1 or 2 over its whole length. An overview of all SEQ ID NOs is given in Table 1. Preferably, said nucleic acid molecule is an isolated nucleic acid molecule. Preferably, said nucleic acid molecule or isolated nucleic acid molecule is represented by a nucleotide sequence that has at least 50% sequence identity to SEQ ID NO: 1 or 2 over its whole length. Preferably, said nucleic acid molecule or isolated nucleic acid molecule is represented by a nucleotide sequence comprising a sequence derived from the Cricetulus griseus gene for polyubiquitin of at most 8000 nucleotides. The present invention further relates to a nucleic acid construct comprising a nucleic acid molecule of the invention. Preferably, said nucleic acid construct is represented by a nucleotide sequence that further comprises a heterologous promoter, wherein preferably said expression enhancing element and said heterologous promoter are configured to be both operably linked to an optional nucleotide sequence of interest. Preferably, said nucleic acid construct further comprises an additional expression regulating element, wherein preferably said expression enhancing element, said heterologous promoter and said additional expression regulating element are configured to be all operably linked to said optional nucleotide sequence of interest. Preferably, said additional expression regulating element further comprises a translation enhancing element. Preferably, said additional expression regulating element comprises an intronic sequence. Preferably, said optional nucleotide sequence of interest is a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a heterologous protein or polypeptide.
[0034] Preferably, said nucleic acid construct is a recombinant and/or isolated nucleic acid construct.
[0035] The present invention further relates to an expression vector comprising a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or a recombinant and/or isolated nucleic acid construct as defined herein.
[0036] The present invention further relates to a cell comprising a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or recombinant nucleic acid construct as defined herein, and/or an expression vector as defined herein.
[0037] The present invention also relates to a use of a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or a recombinant and/or isolated nucleic acid construct as defined herein, and/or an expression vector as defined herein and/or a cell as defined herein for the transcription of a nucleotide sequence of interest.
[0038] The present invention further relates to a use of a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or a recombinant and/or isolated nucleic acid construct as defined herein, and/or an expression vector as defined herein and/or a cell as defined herein for the expression of a protein or polypeptide of interest.
[0039] The present invention further relates to a method for transcription and optionally purifying a produced transcript comprising the steps of:
[0040] a) providing a nucleic acid construct comprising a nucleotide sequence that has at least 50% identity to SEQ ID NO: 88 over its whole length and which is operably linked to a nucleotide sequence of interest; and,
[0041] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0042] c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally,
[0043] d) purifying said produced transcript.
[0044] The present invention further relates to a method for expressing and optionally purifying a protein or polypeptide of interest comprising the steps of:
[0045] a) providing a nucleic acid construct comprising a nucleotide sequence that has at least 50% identity to SEQ ID NO:88 over its whole length and which is operably linked to a nucleotide sequence of interest and contacting a cell with said nucleic acid construct to obtain a transformed cell, and,
[0046] b) allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0047] c) purifying said protein or polypeptide of interest.
[0048] Preferably, said nucleic acid construct of said method for transcription and/or expression and optionally purifying a transcript and/or protein or polypeptide of interest further comprises an additional expression regulating element operably linked to said nucleotide sequence of interest and/or said nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said additional expression regulating element comprises an intronic sequence. Preferably, said additional expression regulating element further comprises a translation enhancing element.
[0049] The present invention further relates to a nucleic acid molecule that is represented by a nucleotide sequence that has at least 50% identity to SEQ ID NO: 88 over its whole length. Preferably, said nucleic acid molecule is an isolated nucleic acid molecule. The present invention further relates to a nucleic acid construct comprising a nucleic acid molecule of the invention. Preferably, said nucleic acid construct is represented by a nucleotide sequence that further comprises an optional nucleotide sequence of interest. Preferably, said nucleic acid construct further comprises an additional expression regulating element, wherein preferably said expression enhancing element is configured to be operably linked to said optional nucleotide sequence of interest. Preferably, said additional expression regulating element further comprises a translation enhancing element. Preferably, said additional expression regulating element comprises an intronic sequence. Preferably, said optional nucleotide sequence of interest is a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a heterologous protein or polypeptide.
[0050] Preferably, said nucleic acid construct is a recombinant and/or isolated nucleic acid construct.
[0051] The present invention further relates to an expression vector comprising a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or a recombinant and/or isolated nucleic acid construct as defined herein.
[0052] The present invention further relates to a cell comprising a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or recombinant nucleic acid construct as defined herein, and/or an expression vector as defined herein.
[0053] The present invention also relates to a use of a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or a recombinant and/or isolated nucleic acid construct as defined herein, and/or an expression vector as defined herein and/or a cell as defined herein for the transcription of a nucleotide sequence of interest.
[0054] The present invention further relates to a use of a nucleic acid molecule or an isolated nucleic acid molecule as defined herein, and/or a nucleic acid construct or a recombinant and/or isolated nucleic acid construct as defined herein, and/or an expression vector as defined herein and/or a cell as defined herein for the expression of a protein or polypeptide of interest.
DESCRIPTION OF THE INVENTION
[0055] The inventors identified an expression construct for increasing the expression of a protein or polypeptide of interest. The expression construct of the invention is characterized by two promoters operably linked to a coding sequence of a protein or polypeptide of interest. An expression construct of the invention typically comprises a first promoter followed by a second promoter, a coding sequence of the protein or polypeptide of interest and a polyadenylation sequence, wherein said second promoter is flanked by intronic sequences. Said promoter being flanked by intronic sequences is denominated herein as an intronic promoter. Additional expression regulating sequences may be inserted upstream and downstream of said first and/or second promoter and/or downstream of the polyadenylation sequence. The inventors surprisingly found that an expression construct of the invention comprising a promoter followed by an intronic promoter operably linked to a coding sequence of a protein or polypeptide of interest, results in a significant increase in expression of said protein or polypeptide of interest as compared to an expression construct comprising only one promoter operably linked to said coding sequence. The inventors have found that the expression of initially poorly expressed proteins is increased to appreciable levels when using the combination of a promoter and intronic promoter of the invention instead of a single promoter in an expression construct encoding these proteins, as exemplified in the Examples, more specifically in Example 1. The combination of a promoter and an intronic promoter of the invention in an expression construct for an initially poorly expressed protein facilitates the generation of clonal lines and allows for the generation of clonal lines with increased and relevant expression levels, as exemplified in the Examples, more specifically in Example 3. 
Furthermore, expression of initially highly expressed proteins is increased even further when using the combination of a promoter and an intronic promoter of the invention instead of a single promoter in an expression construct encoding these proteins, as exemplified in the Examples, more specifically in Example 5. In addition, an increase in the total amount of mRNA and an increase in expression as measured at the protein level were found, as detailed herein below and exemplified in the enclosed Examples. Moreover, the percentage of high-producer cell lines in a stably transfected pool is significantly higher than in pools with a single promoter operably linked to the coding sequence. As the nucleotide sequence of the invention comprising both a promoter and an intronic promoter operably linked to a nucleotide sequence of interest results in an increase in transcription, the present invention is not limited to the use of this sequence in protein and/or polypeptide expression and/or protein and/or polypeptide production but extends to the use of this combination of a promoter and intronic promoter in methods where higher levels of transcript are desired, for instance in methods for producing noncoding RNA transcripts as further specified herein. A further benefit of the invention is that, apart from an increase in transcription level and/or increase in expression level of the protein or polypeptide of interest, the invention allows for different transcripts to be formed as further detailed herein.
[0056] The inventors identified an expression enhancing element for increasing the expression of a protein or polypeptide of interest. The present invention relates to said expression enhancing element. Application of the expression enhancing element of the invention in an expression construct further comprising a heterologous promoter operably linked to a sequence encoding a protein or polypeptide of interest results in a marked increase in expression of said protein or polypeptide of interest as compared to such expression using a similar expression construct which differs from the former expression construct only in that the expression enhancing element of the invention is absent. The inventors have found that expression of initially poorly expressed proteins is increased to appreciable levels after insertion of the element in an expression construct encoding these proteins, as exemplified in the Examples, more specifically in Example 1. Insertion of the expression enhancing element in an expression construct for an initially poorly expressed protein facilitates the generation of clonal lines and allows for the generation of clonal lines with relevant expression levels, as exemplified in the Examples, more specifically in Example 3. Furthermore, expression of initially highly expressed proteins is increased even further after insertion of the element in an expression construct encoding these proteins, as exemplified in the Examples, more specifically in Example 5. Furthermore, an increase in the total mRNA level and/or an increase in expression as measured at the protein level was found, as detailed herein below and exemplified in the enclosed Examples. 
As the expression enhancing element of the invention may result in an increase in transcription, the present invention is not limited to the use of this element in protein and/or polypeptide expression and/or protein and/or polypeptide production but extends to the use of this element in methods where higher levels of transcript are desired, for instance in methods for producing noncoding RNA transcripts as further specified herein.
The inventors further identified a nucleic acid molecule represented by a nucleotide sequence that has at least 50% identity with SEQ ID NO: 88 for increasing the expression of a protein or polypeptide of interest. The use of said nucleotide sequence is attractive, as demonstrated in Example 11.
First Aspect
[0057] In a first aspect, the present invention provides a nucleic acid construct for increasing transcription and/or expression, comprising a first promoter and a second promoter, which are configured to be both operably linked to an optional nucleotide sequence of interest within an expression construct. "Optional" is understood herein to mean not necessarily present in an expression construct. For instance, such a nucleotide sequence of interest need not be present in a commercialized expression vector, but may be readily introduced by a person skilled in the art before use in a method of the invention.
[0058] Preferably, within this first aspect, said nucleic acid construct of the invention comprising a first promoter and a second promoter is capable of increasing the transcription of a nucleotide sequence of interest that is under the control of said first promoter and second promoter. Alternatively or in combination with the increased transcription, said nucleic acid construct is also preferably capable of increasing expression of a protein or polypeptide of interest encoded by said nucleotide sequence of interest. Preferably, transcription levels are assessed in an expression system using an expression construct comprising said first promoter and second promoter operably linked to a nucleotide sequence of interest using a suitable assay such as RT-qPCR. Preferably, within this first aspect, the nucleic acid construct of the invention comprising the first promoter and second promoter of the invention allows for an increase in transcription of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% of said nucleotide sequence of interest as compared to transcription using a construct which only differs in that the nucleotide sequence of interest is under the control of a single promoter, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the transcription of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a first promoter and second promoter sequence to be tested operably linked to said nucleotide sequence of interest. 
Transcription is preferably measured using RT-qPCR and transcription levels are compared to transcription levels of said nucleotide sequence of interest which are measured under the same conditions except that said first promoter and second promoter are replaced by a single promoter, preferably a CMV promoter represented by SEQ ID NO: 57, in the expression vector used.
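The comparison described in this paragraph, a test construct against a single-promoter control measured under the same conditions, can be sketched numerically as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the Ct values are invented for illustration and are not data from the Examples, and the conversion from Ct values to a relative transcript level uses the common 2^-ΔCt approximation, which assumes 100% amplification efficiency and is not specified by the application.

```python
# Percentage thresholds listed in the paragraph above.
THRESHOLDS = (5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90, 100,
              200, 300, 500, 1000, 1500, 2000)


def relative_level_from_ct(ct_target: float, ct_reference: float) -> float:
    """Transcript level of the target relative to a reference gene.

    Assumption: 2^-(Ct_target - Ct_reference), i.e. perfect doubling per
    cycle. The application only names RT-qPCR as a suitable assay.
    """
    return 2.0 ** (ct_reference - ct_target)


def percent_increase(test_level: float, control_level: float) -> float:
    """Increase of the test construct over the single-promoter control, in %."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return (test_level - control_level) / control_level * 100.0


def highest_threshold_met(increase_pct: float):
    """Largest listed threshold reached by the increase, or None if none is."""
    met = [t for t in THRESHOLDS if increase_pct >= t]
    return max(met) if met else None


# Invented Ct values, for illustration only.
test = relative_level_from_ct(ct_target=22.0, ct_reference=20.0)
control = relative_level_from_ct(ct_target=24.0, ct_reference=20.0)
increase = percent_increase(test, control)
print(increase, highest_threshold_met(increase))
```

The same `percent_increase` arithmetic would apply to any other read-out measured for both constructs under identical conditions, provided that the two levels are expressed in the same units.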
[0059] Preferably, within this first aspect, expression levels are established in an expression system using an expression construct comprising said first promoter and second promoter operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an enzyme-linked immunosorbent assay (ELISA), Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, the first promoter and second promoter of the invention allow for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to expression of said protein or polypeptide using a construct which only differs in that the encoding sequence of the protein or polypeptide of interest is under the control of a single promoter, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, the expression of a nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a first promoter and second promoter sequence to be tested operably linked to said nucleotide sequence of interest. 
Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that the expression vector only differs in that said first promoter and second promoter are replaced by a single promoter, preferably a CMV promoter represented by SEQ ID NO: 57, in the expression vector used.
[0060] Preferably, within said first aspect, said increase of protein or polypeptide expression is copy number independent as established by an assay suitable to determine copy number dependency by a skilled person such as, but not limited to, a triplex Taqman assay as further detailed in Example 4 of the present invention.
[0061] Preferably, within said first aspect, said first promoter is located upstream or at the 5' site of said second promoter. Preferably, said second promoter as defined herein should be devoid of sequence elements that will act as transcription terminators. Transcription terminators, as is well known to the person skilled in the art, are sequences that can result in premature termination of transcription, such as, but not limited to, stable hairpin structures, repeat sequences such as long terminal repeats (LTRs) or Alu repeats, polyadenylation motifs and transposable elements.
[0062] Within the context of the first aspect of the invention a promoter is a promoter capable of initiating transcription in the host cell of choice. Promoters as used herein include tissue-specific, tissue-preferred, cell-type specific, inducible and constitutive promoters as defined herein in the Definitions section. Promoters that may be comprised within said first or second promoter as defined herein are promoters that may be employed in transcription of nucleotide sequences of interest and/or expression of proteins or polypeptides of interest, preferably in mammalian cells, and include, but are not limited to, the human or murine cytomegalovirus (CMV) promoter, a simian virus (SV40) promoter, a human or mouse ubiquitin C (UBC) promoter, a human or mouse or rat elongation factor alpha (EF1-a) promoter, a mouse or hamster beta-actin promoter, or a hamster rpS21 promoter. Tet-Off and Tet-On responsive elements upstream of a minimal promoter such as a CMV promoter are an example of an inducible mammalian promoter. Examples of suitable yeast and fungal promoters are the Leu2 promoter, the galactose (Gal1 or Gal7) promoter, alcohol dehydrogenase I (ADH1) promoter, glucoamylase (Gla) promoter, triose phosphate isomerase (TPI) promoter, translational elongation factor EF-I alpha (TEF2) promoter, glyceraldehyde-3-phosphate dehydrogenase (gpdA) promoter, alcohol oxidase (AOX1) promoter, or glutamate dehydrogenase (gdhA) promoter. An example of a strong ubiquitous promoter for expression in plants is the cauliflower mosaic virus (CaMV) 35S promoter.
[0063] In an embodiment within said first aspect, said first and said second promoters are similar promoters. Preferably, said first promoter has at least 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to said second promoter.
[0064] In another embodiment within said first aspect, said first promoter and second promoter are distinct or different promoters. Preferably, said first promoter has less than 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10% or 5% sequence identity to said second promoter.
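The identity thresholds in the two embodiments above presuppose a way of scoring percent identity between two promoter sequences. The following is a minimal sketch, assuming the two sequences have already been aligned (equal length, with "-" marking gaps) and counting identical columns over the alignment length. This is only one common convention; the Definitions section of the application governs how identity is actually determined. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same nucleotide ('-' is a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(1 for x, y in pairs if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


def meets_minimum(identity: float, minimum: float) -> bool:
    """True if identity is 'at least' the stated percentage."""
    return identity >= minimum


def is_below(identity: float, ceiling: float) -> bool:
    """True if identity is 'less than' the stated percentage."""
    return identity < ceiling


# Hypothetical pre-aligned fragments differing at one of ten positions.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
print(identity, meets_minimum(identity, 80), is_below(identity, 80))
```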
[0065] In a preferred embodiment within said first aspect, said first promoter sequence comprises or consists of a UBC promoter or a CCT8 promoter and said second promoter comprises or consists of a CMV promoter, or the other way around. Preferably, said first promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-969 of SEQ ID NO: 1 or to nucleotides 1-614 of SEQ ID NO: 2. Preferably, said second promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 58, or preferably to SEQ ID NO: 57. Preferred within said first aspect is a nucleotide sequence of the invention wherein said first promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-969 of SEQ ID NO: 1 and said second promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 57.
[0066] Also preferred within said first aspect is a nucleotide sequence of the invention wherein said first promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity with nucleotides 1-614 of SEQ ID NO: 2 and said second promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 57.
[0067] Also preferred within said first aspect is a nucleotide sequence of the invention wherein said first promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 58, or preferably to SEQ ID NO: 57, and wherein said second promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-969 of SEQ ID NO: 1 or to nucleotides 1-614 of SEQ ID NO: 2.
[0068] Preferred within said first aspect is a nucleotide sequence of the invention wherein said first promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 57 and said second promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-969 of SEQ ID NO: 1. Also preferred is a nucleotide sequence of the invention wherein said first promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 57 and said second promoter comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-614 of SEQ ID NO: 2.
[0069] It is to be understood that within said first aspect, said first and/or second promoter does not consist only of a promoter enhancer sequence, such as a sequence selected from the group consisting of SEQ ID NO: 52-54. Preferably, said first promoter and second promoter do not consist only of a promoter enhancer sequence, such as a sequence selected from the group consisting of SEQ ID NO: 52-54. Preferably, a nucleotide sequence of the invention does not comprise or consist of SEQ ID NO: 55 or 56.
[0070] In a preferred embodiment within said first aspect, said second promoter is flanked by a first intronic sequence at the 5' site or upstream of said second promoter and a second intronic sequence at the 3' site or downstream of said second promoter. Being "flanked" is understood herein as being positioned in between said indicated sequences optionally separated by 1-50, 1-60, 1-70, 1-80, 1-90, 1-100, 1-200, 1-300, 1-400, 1-500, 1-600, 1-700, 1-800, 1-900, 1-1000, 1-5,000 or 1-100,000 nucleotides, these nucleotides being understood to encompass the 5'-UTR. An intronic sequence is understood to be at least part of the nucleotide sequence of an intron. Preferably, said first intronic sequence at the 5' site or upstream of said second promoter comprises at least a donor splice site or splice site GT. A donor splice site is understood herein as a splice site that, when combined with an acceptor splice site as defined herein, results in the formation of an intron as defined in the Definition section. Preferably, a nucleotide sequence is an intron if at least 2%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% of the primary RNA loses this sequence by RNA splicing using an assay suitable to detect intron splicing, such as but not limited to reverse-transcriptase polymerase chain reaction (RT-PCR) followed by size or sequence analysis of the RT-PCR product. Preferred donor splice sites of the invention are M-W-G-[cut]-G-T-R-A-G-K or M-A-R-[cut]-G-T-R-A-G-K in case the host cell is a mammalian cell, A-G-[cut]-G-T-A-W-K in case the host cell is a plant cell, [cut]-G-T-A-W-G-T-T in case the host cell is a yeast cell and R-G-[cut]-G-T-R-A-G in case the host cell is an insect cell. "[cut]" is to be understood herein as the specific cut site where splicing will take place. Intron splicing can be assessed functionally using an assay as detailed in the Definition section under "intron". 
Most preferably, the donor splice site comprised within the first intronic sequence of the invention is C-T-G-[cut]-G-T-G-A-G-G or A-A-A-[cut]-G-T-G-A-G-G. Preferably, said first intronic sequence consists of said donor splice site or splice site GT. Preferably, said first intronic sequence comprises a single donor splice site as defined herein. Preferably, said first intronic sequence is free of an acceptor splice site as defined herein below.
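The donor splice site consensus sequences above are written in IUPAC nucleotide codes (M = A or C, W = A or T, R = A or G, K = G or T). A minimal sketch of how such a consensus could be scanned for in a candidate first intronic sequence is given below. The function and dictionary names are hypothetical, and "|" stands in for "[cut]". A consensus match only indicates a candidate site; whether splicing actually occurs is assessed functionally as described in the Definition section.

```python
import re

# Standard IUPAC nucleotide codes used by the consensus sequences.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
         "W": "[AT]", "D": "[AGT]", "N": "[ACGT]"}

# Donor consensus per host cell type, copied from the paragraph above.
DONOR_CONSENSUS = {
    "mammalian": ["MWG|GTRAGK", "MAR|GTRAGK"],
    "plant": ["AG|GTAWK"],
    "yeast": ["|GTAWGTT"],
    "insect": ["RG|GTRAG"],
}


def find_donor_sites(seq: str, host: str) -> list:
    """Return sorted 0-based cut positions (index of the G of GT) for every match."""
    seq = seq.upper()
    cuts = set()
    for consensus in DONOR_CONSENSUS[host]:
        exon_side, intron_side = consensus.split("|")
        body = "".join(IUPAC[c] for c in exon_side + intron_side)
        # A lookahead lets overlapping candidate sites be reported as well.
        for match in re.finditer("(?=" + body + ")", seq):
            cuts.add(match.start() + len(exon_side))
    return sorted(cuts)


# The two most preferred donor sites both fit the mammalian consensus.
print(find_donor_sites("CTGGTGAGG", "mammalian"))
print(find_donor_sites("AAAGTGAGG", "mammalian"))
```

Because the paragraph prefers a first intronic sequence with a single donor splice site, a result list with more than one entry would flag a sequence for closer inspection.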
[0071] Preferably, within said first aspect said second intronic sequence at the 3' site or downstream of said second promoter comprises at least an acceptor splice site which is understood herein as the splice site AG preferably preceded by a pyrimidine rich sequence or polypyrimidine tract nucleotide sequence, optionally separated from splice site AG by 1-50 nucleotides, and optionally further comprising a branch site comprising the sequence Y-T-N-A-Y, at the 5' site of the polypyrimidine tract nucleotide sequence, wherein the branch site may have the nucleotide sequence C-Y-G-A-C. An acceptor splice site is understood herein as a splice site that, when combined with a donor splice site encompassed within a construct, results in the formation of an intron as defined in the Definition section. Preferably, the acceptor splice site or splice site AG has the sequence [Y-rich]-N-Y-A-G-[cut]. Preferably, the acceptor splice site or splice site AG has the sequence [Y-rich]-N-Y-A-G-[cut]-R in case the host cell is a mammalian cell, [Y-rich]-D-Y-A-G-[cut]-R or [Y-rich]-D-Y-A-G-[cut]-R-W in case the host cell is a plant cell, [Y-rich]-A-Y-A-G-[cut] in case the host cell is a yeast cell and [Y-rich]-N-Y-A-G-[cut] in case the host cell is an insect cell. "[Y-rich]" is to be understood herein as the polypyrimidine tract which is preferably defined as a consecutive sequence of at least 10 nucleotides comprising at least 6, 7, 8, 9 or preferably 10 pyrimidine nucleotides. Preferably, said acceptor splice site or splice site AG has the sequence Y-A-G-[cut]-R. Preferably, said second intronic sequence comprises a single acceptor splice site. In an embodiment, said second intronic sequence is free of a donor splice site as defined herein. In an alternative embodiment, said second intronic sequence comprises both a donor splice site and an acceptor splice site as defined herein. Most preferably, said second intronic sequence is an intron as defined in the Definition section. 
Preferably, the second promoter and the intronic sequences flanking the second promoter are configured to form an intronic promoter (reference is made to FIG. 1). An intronic promoter is known to a person skilled in the art as a promoter located within an intronic sequence. Preferably, said intronic promoter is an intron as defined in the Definition section. Preferably, the boundaries of the intronic promoter of the present invention are formed by the donor splice site of the intronic sequence at the 5' site or upstream of the second promoter of the invention and the acceptor splice site of the intronic sequence at the 3' site or downstream of the second promoter of the invention. The intronic promoter of the invention can have a length that is comparable or similar to naturally occurring introns, preferably comparable or similar to naturally occurring introns in the host cell or organism as defined herein. Preferably, said intronic promoter as defined herein is at most 12,000 nucleotides in length. Preferably, said first intronic sequence at the 5' site or upstream of said second promoter is located at the 3' site or downstream of said first promoter. Preferably, the first promoter and second promoter, the intronic sequences flanking the second promoter, and a nucleotide sequence encoding a protein or polypeptide of interest are configured in such a way that the first promoter is upstream of the second promoter, wherein the second promoter is flanked by said intronic sequences to form an intronic promoter, and wherein said first promoter and second promoter are configured to be both upstream and operably linked to the nucleotide sequence encoding a protein or polypeptide of interest (FIG. 1). The intronic promoter may comprise further expression enhancing elements, but preferably the intronic promoter is free of further splice sites apart from the donor and acceptor splice sites as defined herein within the first and second intronic sequences as defined herein. 
Preferably, one or more expression enhancing sequences are comprised within said first and/or said second promoter. Without wishing to be bound by any theory, transcription starting from either of the two promoters may result in different transcripts (pre-mRNAs) which, upon splicing, result in different mRNAs as illustrated in FIG. 1. In support of this theory, the inventors found that different transcripts are formed using a construct of the invention (reference is made in this respect to FIG. 1, Example 8 and FIG. 10). Furthermore, the increased activity is found to be severely diminished by a 4-nucleotide mutation in the intronic promoter which prevents correct intron splicing (reference is made in this respect to FIG. 9 and Example 7), also supporting the theory that both promoters are active in the construct. Therefore, a further benefit of the invention is that, apart from an increase in transcription of the nucleotide sequence of interest and/or an increase in expression level of the protein or polypeptide of interest, the invention allows for different transcripts to be formed. "Different transcripts" are understood herein as transcripts that are structurally different or distinct, i.e. having a different or distinct nucleotide sequence. Therefore a further benefit of the invention is to direct or redirect the splicing of a nucleotide sequence of interest. Depending on the location of the intronic splice sites, the transcripts may have a different or distinct UTR sequence and/or a different or distinct coding sequence. It is also possible that only one type of transcript is formed, e.g. in case the 5'-UTR sequences of said first and second intronic sequences are the same. Assessment whether different transcripts are formed can be done using any suitable method known to the person skilled in the art, such as but not limited to rapid amplification of cDNA ends polymerase chain reaction (RACE-PCR).
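The acceptor splice site described in paragraph [0071], Y-A-G-[cut]-R preceded by a polypyrimidine tract of at least 10 consecutive nucleotides containing at least 6 pyrimidines, can be screened for with a sliding window. The following is a sketch under stated assumptions: the function names are hypothetical, the tract is required to lie within a search region of 50 plus 10 nucleotides upstream of the AG (one reading of the optional 1-50 nucleotide separation), and the example sequences are invented for illustration.

```python
import re


def has_pyrimidine_tract(region: str, window: int = 10,
                         min_pyrimidines: int = 6) -> bool:
    """True if any stretch of `window` nucleotides holds enough C/T residues."""
    region = region.upper()
    for i in range(len(region) - window + 1):
        stretch = region[i:i + window]
        if stretch.count("C") + stretch.count("T") >= min_pyrimidines:
            return True
    return False


def find_acceptor_sites(seq: str, max_gap: int = 50, window: int = 10,
                        min_pyrimidines: int = 6) -> list:
    """Return 0-based cut positions (index just after AG) of Y-A-G-[cut]-R sites
    that have a polypyrimidine tract in the upstream search region."""
    seq = seq.upper()
    cuts = []
    for match in re.finditer(r"(?=[CT]AG[AG])", seq):
        ag_start = match.start() + 1
        upstream = seq[max(0, ag_start - max_gap - window):ag_start]
        if has_pyrimidine_tract(upstream, window, min_pyrimidines):
            cuts.append(ag_start + 2)
    return cuts


# Invented examples: the first has a pyrimidine-rich stretch before CAG|G,
# the second has the same CAG|G but a purine-rich upstream region.
print(find_acceptor_sites("GGATCTTCTCTTCGCAGGTA"))
print(find_acceptor_sites("GGAGAAGGAGAAGGCAGG"))
```

Raising `min_pyrimidines` towards 10 applies the stricter, more preferred tract definitions from the same paragraph.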
[0072] Preferably, within said first aspect, said first intronic sequence at the 5' site or upstream of said second promoter, has at least 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with nucleotides 970-1449 of SEQ ID NO: 1 or at least 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with nucleotides 667-1228 of SEQ ID NO: 2, preferably comprising at least a donor splice site or splice site GT.
[0073] Preferably, within said first aspect, said intronic sequence downstream or at the 3' site of said second promoter comprises a nucleotide sequence having at least 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with nucleotides 171-277 of SEQ ID NO: 14, nucleotides 171-274 of SEQ ID NO: 19, nucleotides 133-210 of SEQ ID NO: 20, nucleotides 134-211 of SEQ ID NO: 21, nucleotides 134-226 of SEQ ID NO: 22, nucleotides 134-226 of SEQ ID NO: 23, nucleotides 133-225 of SEQ ID NO: 24, nucleotides 134-226 of SEQ ID NO: 25, nucleotides 146-257 of SEQ ID NO: 26, or nucleotides 147-223 of SEQ ID NO: 27, preferably comprising at least an acceptor splice site AG preceded by a TC-rich nucleotide sequence, optionally separated from splice site AG by 1-50 nucleotides and a branch site comprising the sequence Y-T-N-A-Y or C-Y-G-A-C, at the 5' site of the TC-rich nucleotide sequence.
[0074] In a preferred embodiment within said first aspect, said first promoter is flanked at its 3' site by said first intronic sequence. In an embodiment, said first promoter and said first intronic sequence are not aligned in nature but aligned in a construct of the invention by recombination. In another embodiment, said sequence comprising both said first promoter flanked at its 3' site by said first intronic sequence is derived from a naturally occurring sequence. In a preferred embodiment, said nucleotide sequence comprising both a first promoter and a first intronic sequence according to the present invention is a sequence derived from the UBC ubiquitin gene. Preferably, said sequence is derived from a mammalian UBC ubiquitin gene. More preferably, said nucleotide sequence comprising both a first promoter and a first intronic sequence according to the present invention is derived from the Cricetulus griseus homologous gene of the human UBC ubiquitin gene, said gene being indicated as the Cricetulus sp. gene for polyubiquitin, or CRUPUQ (GenBank D63782). In a preferred embodiment, said nucleotide sequence derived from CRUPUQ comprises both a first promoter and a first intronic sequence of the invention and is a contiguous sequence of at least 500, 600, 700, 800, 900, 1000 or 1117 nucleotides in length, preferably at least 1449 nucleotides in length, of SEQ ID NO: 1. Preferably, said nucleotide sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 1 is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length. Most preferably, said sequence is 1449 nucleotides in length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 65% identity with SEQ ID NO: 1 over its whole length. 
Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 70% identity with SEQ ID NO: 1 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 75% identity with SEQ ID NO: 1 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 80% identity with SEQ ID NO: 1 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 85% identity with SEQ ID NO: 1 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 90% identity with SEQ ID NO: 1 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 95% identity with SEQ ID NO: 1 over its whole length. Also preferred is a sequence of at most 8000 nucleotides having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 1.
[0075] In a further preferred embodiment within said first aspect, said nucleotide sequence comprising both a first promoter and a first intronic sequence according to the present invention is a sequence derived from a CCT8 gene. Preferably, said sequence is derived from a mammalian CCT8 gene. More preferably, said nucleotide sequence comprising both a first promoter and a first intronic sequence according to the present invention is derived from the human or Homo sapiens CCT8 gene. In a preferred embodiment, said nucleotide sequence derived from said CCT8 gene comprising both a first promoter and a first intronic sequence of the invention is a contiguous sequence of at least 500, 600, 700, 791 or 1223 nucleotides in length, preferably at least 1228 nucleotides in length, of SEQ ID NO: 2. Preferably, said nucleotide sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 2 is at most 8000 nucleotides in length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length. Most preferably, said sequence is 1228 nucleotides in length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 65% identity with SEQ ID NO: 2 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 70% identity with SEQ ID NO: 2 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 75% identity with SEQ ID NO: 2 over its whole length. 
Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 80% identity with SEQ ID NO: 2 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 85% identity with SEQ ID NO: 2 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 90% identity with SEQ ID NO: 2 over its whole length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 95% identity with SEQ ID NO: 2 over its whole length. Also preferred is a sequence of at most 8000 nucleotides having at least 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity with SEQ ID NO: 2 over its whole length.
[0076] Preferably within said first aspect, said nucleotide sequence comprising a first promoter and a first intronic sequence as defined herein has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to any of SEQ ID NO: 1, 2 and 59-61 over its whole length. Preferably, said nucleotide sequence comprising a first promoter and a first intronic sequence as defined herein comprises or consists of any of the sequences selected from the group consisting of SEQ ID NO: 1, 2 and 59-61. Most preferably, said nucleotide sequence comprising a first promoter and a first intronic sequence as defined herein comprises or consists of any of the sequences selected from the group consisting of SEQ ID NO: 1, 2 and 59.
[0077] Preferably, the nucleotide construct of the first aspect further comprises one or more additional expression regulating sequences, wherein preferably said first promoter, said intronic sequences as defined herein, and optionally said additional expression regulating sequence are all configured to be operably linked to an optional nucleotide sequence of interest. An "additional expression regulating sequence" is to be understood herein as a sequence or element in addition to the first and/or second promoter and/or the first and/or second intronic sequence as defined herein above, and may be an additional expression enhancing sequence and/or a distinct expression enhancing sequence. An additional expression regulating sequence as encompassed by the present invention can be, but is not limited to, an element involved in the transcriptional and/or translational regulation of a gene, including but not limited to, 5'-UTR, 3'-UTR, enhancer, promoter, intron, polyadenylation signal and chromatin control elements such as S/MAR (scaffold/matrix attachment region), ubiquitous chromatin opening element, cytosine phosphodiester guanine island and STAR (stabilizing and anti-repressor element), and any derivatives thereof. Other optional regulating sequences that may be present in the nucleic acid construct of the invention include, but are not limited to, coding nucleotide sequences of homologous and/or heterologous nucleotide sequences, including the Iron Responsive Element (IRE), Translational cis-Regulatory Element (TLRE) or uORFs in 5' UTRs and poly(U) stretches in 3' UTRs. Such one or more additional expression regulating, preferably enhancing, elements may be located at any position in the construct, preferably directly aligning or comprised within said first and/or second promoter.
[0078] A further preferred regulating sequence within said first aspect comprises or consists of a translation enhancing element. Preferably, a translation enhancing element allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to expression of said protein or polypeptide using a construct which only differs in that it is free of said translation enhancing element, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the expression of nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a translation enhancing element to be tested and a CMV promoter represented by SEQ ID NO: 57, operably linked to said nucleotide sequence of interest. Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that the expression vector is free of said translation enhancing element to be tested.
[0079] Preferably, within said first aspect said translation enhancing element comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity with any of SEQ ID NO: 3-51 over its whole length, or a translation enhancing element that comprises or consists of a nucleotide sequence that comprises:
[0080] i) a GAA repeat nucleotide sequence, a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides;
[0081] ii) a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, and a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides, said first nucleotide sequence not comprising a GAA repeat nucleotide sequence; or,
[0082] iii) a GAA repeat nucleotide sequence, a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides, wherein said GAA repeat nucleotide sequence is located 3' of any one or more of said TC-rich nucleotide sequence, A-rich nucleotide sequences, and/or GT-rich nucleotide sequence.
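Options i) to iii) above combine four compositional features, and checking a candidate translation enhancing element against them can be sketched with regular expressions and a sliding window. The sketch below makes several assumptions: a GAA repeat is taken as at least three consecutive GAA units (as the following paragraph defines it), the GT-rich test uses a fixed 10-nucleotide window, option iii) is read as a GAA repeat starting at or after the end of at least one other feature, and all names and example sequences are hypothetical. It does not implement the alternative identity-based definitions relative to SEQ ID NO: 3.

```python
import re


def gt_rich_spans(seq: str, window: int = 10, min_fraction: float = 0.8) -> list:
    """Spans of `window` nucleotides in which at least 80% are G or T."""
    spans = []
    for i in range(len(seq) - window + 1):
        stretch = seq[i:i + window]
        if (stretch.count("G") + stretch.count("T")) / window >= min_fraction:
            spans.append((i, i + window))
    return spans


def element_features(seq: str) -> dict:
    """Locate each compositional feature as a list of (start, end) spans."""
    seq = seq.upper()
    return {
        "gaa_repeat": [m.span() for m in re.finditer(r"(?:GAA){3,}", seq)],
        "tc_rich": [m.span() for m in re.finditer(r"[CT]{8,}", seq)],
        "a_rich": [m.span() for m in re.finditer(r"A{5,}", seq)],
        "gt_rich": gt_rich_spans(seq),
    }


def satisfied_options(seq: str) -> set:
    """Return which of options 'i', 'ii', 'iii' the sequence satisfies."""
    f = element_features(seq)
    core = bool(f["tc_rich"]) and len(f["a_rich"]) >= 3 and bool(f["gt_rich"])
    if not core:
        return set()
    if not f["gaa_repeat"]:
        return {"ii"}
    options = {"i"}
    other_ends = [end for key in ("tc_rich", "a_rich", "gt_rich")
                  for _, end in f[key]]
    if any(start >= end for start, _ in f["gaa_repeat"] for end in other_ends):
        options.add("iii")
    return options


# Invented test sequence: TC-rich, three A-rich stretches, GT-rich stretch.
core_seq = "CTCTTCTCTG" + "AAAAAC" * 3 + "GTGTTGGTGT"
print(satisfied_options("GAAGAAGAA" + core_seq))
print(satisfied_options(core_seq))
print(satisfied_options(core_seq + "GAAGAAGAA"))
```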
[0083] The GAA repeat nucleotide sequence is defined herein as comprising at least 3 GAA repeats. The GAA repeat nucleotide sequence may comprise an imperfect GAA repeat. The GAA repeat nucleotide sequence may have at least 50% sequence identity, or at least 60% sequence identity, or at least 70% sequence identity, or at least 80% sequence identity, or at least 90% sequence identity or 100% sequence identity with nucleotides 14-50 of SEQ ID NO: 3. The imperfect GAA repeat may comprise the nucleotide sequence (GAA)3ATAA(GAA)8.
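The shorthand (GAA)3ATAA(GAA)8 can be written out explicitly, which also makes its length easy to compare with the cited range of SEQ ID NO: 3. The helper below is a hypothetical sketch; it only expands the notation and compares lengths, and makes no claim about the actual content of SEQ ID NO: 3.

```python
import re


def expand_repeat_notation(notation: str) -> str:
    """Expand shorthand such as '(GAA)3ATAA(GAA)8' into a plain nucleotide string."""
    parts = []
    # Each token is either '(UNIT)count' or a literal run of nucleotides.
    for unit, count, literal in re.findall(r"\(([ACGT]+)\)(\d+)|([ACGT]+)", notation):
        parts.append(unit * int(count) if unit else literal)
    return "".join(parts)


imperfect_repeat = expand_repeat_notation("(GAA)3ATAA(GAA)8")
print(imperfect_repeat, len(imperfect_repeat))

# Length of the inclusive range 'nucleotides 14-50'.
range_length = 50 - 14 + 1
print(range_length == len(imperfect_repeat))
```

The expanded repeat is 37 nucleotides long, the same length as the range of nucleotides 14-50 cited above.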
[0084] The TC-rich nucleotide sequence is defined herein as having at least 70%, 80%, 90% or 100% sequence identity with nucleotides 54-68 of SEQ ID NO: 3.
[0085] The A-rich nucleotide sequence is defined herein as having at least 70%, 80%, 90% or 100% sequence identity with any one of nucleotides 77-87, nucleotides 93-105, nucleotides 111-121, nucleotides 126-132, or nucleotides 152-169 of SEQ ID NO: 3, respectively.
[0086] The GT-rich nucleotide sequence is defined herein as having at least 70%, 80%, 90% or 100% sequence identity with nucleotides 133-148 of SEQ ID NO: 3.
[0087] Preferably within said first aspect, said translation enhancing sequence comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 19. Preferably, said translation enhancing sequence is located downstream or at the 3' site of the second promoter sequence of the invention and upstream or at the 5' site of an optional nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said translation enhancing sequence is located downstream or at the 3' site of the second promoter sequence of the invention and upstream or at the 5' site of the second intronic sequence as defined herein.
[0088] Most preferably within said first aspect, said nucleic acid construct of the first aspect of the invention comprising a first promoter, a first intronic sequence, a second promoter and a second intronic sequence, has at least 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 73, 74, 75, or 76.
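The percentage thresholds above can be made concrete with a small Python sketch. It does not reproduce any definition of sequence identity from the application. It assumes two sequences that have already been aligned to equal length, with "-" marking gaps, and counts identity as matching positions divided by alignment length. The function names are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned, equal-length sequences ('-' = gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)

def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """True if the identity is at least threshold_percent."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent
```

Other conventions (for example dividing by the length of the shorter sequence) give different values, so the choice of denominator should be stated whenever a threshold is applied.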
[0089] Preferably within said first aspect, the nucleic acid construct of the invention further comprises a nucleotide sequence of interest operably linked to and/or under the control of said first and second promoters and optionally said additional expression regulating sequence as defined herein. It is to be understood that said first promoter and second promoter, and optionally said additional expression regulating sequence, are all configured to be operably linked to the same, single nucleotide sequence of interest. In a preferred embodiment, said nucleotide sequence of interest is a nucleotide sequence encoding a protein or polypeptide of interest. The protein or polypeptide of interest can be a homologous protein or polypeptide, but in a preferred embodiment of the invention the protein or polypeptide of interest is a heterologous protein or polypeptide. A nucleotide sequence encoding a heterologous protein or polypeptide may be derived in whole or in part from any source known to the art, including a bacterial or viral genome or episome, eukaryotic nuclear or plasmid DNA, cDNA or chemically synthesized DNA. The nucleotide sequence encoding a protein or polypeptide of interest may constitute an uninterrupted coding region or it may include one or more introns bounded by appropriate splice junctions; it can further be composed of segments derived from different sources, naturally occurring or synthetic. The nucleotide sequence encoding the protein or polypeptide of interest according to the method of the invention is preferably a full-length nucleotide sequence, but can also be a functionally active part or other part of said full-length nucleotide sequence. The nucleotide sequence encoding the protein or polypeptide of interest may also comprise signal sequences directing the protein or polypeptide of interest when expressed to a specific location in the cell or tissue. Furthermore, the nucleotide sequence encoding the protein or polypeptide of interest can also comprise sequences which facilitate protein purification and protein detection by for instance Western blotting and ELISA (e.g. c-myc or polyhistidine sequences).
[0090] Within the context of the invention, the protein or polypeptide of interest may have industrial or medicinal (pharmaceutical) applications. Examples of proteins or polypeptides with industrial applications include enzymes such as lipases (e.g. used in the detergent industry), proteases (used inter alia in the detergent industry, in brewing and the like), cell wall degrading enzymes (such as cellulases, pectinases, beta-1,3/4- and beta-1,6-glucanases, rhamnogalacturonases, mannanases, xylanases, pullulanases, galactanases, esterases and the like, used in fruit processing, wine making and the like or in feed), phytases, phospholipases, glycosidases (such as amylases, beta-glucosidases, arabinofuranosidases, rhamnosidases, apiosidases and the like), and dairy enzymes (e.g. chymosin). Mammalian, and preferably human, proteins or polypeptides and/or enzymes with therapeutic, cosmetic or diagnostic applications include, but are not limited to, insulin, serum albumin (HSA), lactoferrin, hemoglobin alpha and beta, tissue plasminogen activator (tPA), erythropoietin (EPO), tumor necrosis factors (TNF), BMP (Bone Morphogenic Protein), growth factors (G-CSF, GM-CSF, M-CSF, PDGF, EGF, and the like), peptide hormones (e.g. calcitonin, somatomedin, somatotropin, growth hormones, follicle stimulating hormone (FSH)), interleukins (IL-x), interferons (IFN-y), phosphatases, antibodies, and antibody-like proteins such as, but not limited to, multispecific antibodies like DART (Dual-Affinity Re-Targeting) and Tribody protein, and antibody fragments like Fc, Fab, Fab2, Fv and scFv. Also included are bacterial and viral antigens, e.g. for use as vaccines, including e.g. heat-labile toxin B-subunit, cholera toxin B-subunit, envelope surface protein Hepatitis B virus, capsid protein Norwalk virus, glycoprotein B Human cytomegalovirus, glycoprotein S, interferon, and transmissible gastroenteritis corona virus receptors and the like. Further included are genes coding for mutants or analogues of the said proteins.
[0091] Within the context of the invention, in an alternative embodiment, said nucleotide sequence of interest is not a coding sequence for a protein or a polypeptide but may be a functional nucleotide sequence such as, but not limited to, a sequence encoding a non-coding RNA, wherein a non-coding RNA is understood to be an RNA not coding for a protein or polypeptide. Preferably, said non-coding RNA is a reference sequence or regulatory molecule that may regulate the expression of genes or the activity or localization of proteins or polypeptides. For instance, a non-coding RNA may be an antisense RNA or miRNA molecule. As the first promoter and second promoter of the invention are believed to work at the level of transcription, i.e. the increase in expression by the sequence of the invention comprising said first promoter and second promoter as shown herein is believed to result from an increase in transcription, the construct of the invention can also be used for producing increased levels of transcripts, as well as producing transcripts with different sequences. Transcription levels can be quantified by using regular transcription quantification methods known by the person skilled in the art such as, but not limited to, Northern blotting and RT-qPCR.
Second Aspect
[0092] In a second aspect, the present invention provides an expression vector comprising a nucleic acid construct according to the first aspect of the invention. The nucleic acid construct according to the invention is preferably a vector, in particular a plasmid, cosmid or phage or nucleotide sequence, linear or circular, of a single or double stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing any one of the nucleotide sequences of the invention in sense or antisense orientation into a cell. The choice of vector is dependent on the recombinant procedures followed and the host cell used. The vector may be an autonomously replicating vector or may replicate together with the chromosome into which it has been integrated. Preferably, the vector contains a selection marker. Useful markers are dependent on the host cell of choice and are well known to persons skilled in the art and are selected from, but not limited to, the selection markers as defined in the third aspect of the invention. A preferred expression vector is the pcDNA3.1 expression vector. Preferred selection markers are the neomycin resistance gene, zeocin resistance gene and blasticidin resistance gene.
Third Aspect
[0093] In a third aspect, the present invention provides a cell comprising a nucleic acid construct according to the first aspect of the invention, and/or an expression vector according to the second aspect of the invention as defined herein.
[0094] Within the context of the invention, a cell may be a mammalian, including human, cell, or a plant, animal, insect, fungal, yeast or bacterial cell. A recombinant host cell, such as a mammalian, including human, plant, animal, insect, fungal or bacterial cell, containing one or more copies of a nucleic acid construct according to the invention is an additional subject of the invention. By host cell is meant a cell which contains a nucleic acid construct such as a vector and supports the replication and/or expression of the nucleic acid construct. Examples of suitable bacteria are Gram positive bacteria such as several species of the genera Bacillus, Streptomyces and Staphylococcus or Gram negative bacteria such as several species of the genera Escherichia and Pseudomonas. Fungal cells include yeast cells. Expression in yeast can be achieved by using yeast strains such as Pichia pastoris, Saccharomyces cerevisiae and Hansenula polymorpha. Other fungal cells of interest include filamentous fungal cells such as Aspergillus niger, Trichoderma reesei, and the like. Furthermore, insect cells such as cells or cell lines from Drosophila melanogaster, Spodoptera frugiperda, and Trichoplusia ni, such as, but not limited to, S2, Sf9, Sf21, and High Five cells, can be used as host cells. Alternatively, a suitable expression system can be a baculovirus system or expression systems using mammalian cells such as CHO, COS, CPK (porcine kidney), MDCK, BHK, Sp2/0, NS0, and Vero cells. A suitable human cell or human cell line is an astrocyte, adipocyte, chondrocyte, endothelial, epithelial, fibroblast, hair, keratinocyte, melanocyte, osteoblast, skeletal muscle, smooth muscle, stem, synoviocyte cell or cell line. Examples of suitable human cell lines also include HEK 293 (human embryonic kidney), HeLa, PER.C6, CAP (cell lines derived from primary human amniocytes), and Bowes melanoma cells. In an embodiment a human cell is not an embryonic stem cell.
[0095] Therefore, another aspect of the invention relates to a host cell that is genetically modified, preferably by a method of the invention, in that the host cell comprises a nucleic acid construct as herein defined above. A host cell is a cell that has been genetically modified. The wording host cell may be replaced by modified cell or transformed cell or recombinant cell or modified host cell or transformed host cell or recombinant host cell. For transformation procedures in plants, suitable bacteria include Agrobacterium tumefaciens and Agrobacterium rhizogenes.
[0096] A nucleic acid construct preferably is stably maintained, either as an autonomously replicating element, or, more preferably, the nucleic acid construct is integrated into the host cell's genome, in which case the construct is usually integrated at random positions in the host cell's genome, for instance by non-homologous recombination. Stably transformed host cells are produced by known methods. The term stable transformation refers to exposing cells to methods to transfer and incorporate foreign DNA into their genome. These methods include, but are not limited to, transfer of purified DNA via cationic lipid reagents and polyethylenimine (PEI), calcium-phosphate co-precipitation, microparticle bombardment, electroporation of protoplasts and microinjection or use of silicon fibers to facilitate penetration and transfer of DNA into the host cell.
[0097] Alternatively, a protein or polypeptide of interest may be expressed in a host cell, e.g., a mammalian cell, relying on transient expression from vectors.
[0098] A nucleic acid construct according to the invention preferably also comprises a marker gene which can provide selection or screening capability in a treated host cell. Selectable markers are generally preferred for host transformation events, but are not available for all host cells. A nucleic acid construct disclosed herein can also include a nucleotide sequence encoding a marker product. A marker product can be used to determine if the construct or portion thereof has been delivered to the cell and once delivered is being expressed. Examples of marker genes include, but are not limited to, the E. coli lacZ gene, which encodes beta-galactosidase, and a gene encoding the green fluorescent protein.
[0099] Within the context of the invention, examples of suitable selectable markers for mammalian cells include, but are not limited to, dihydrofolate reductase (DHFR), glutamine synthetase (GS), thymidine kinase, neomycin, neomycin analog G418, hygromycin, blasticidin, zeocin and puromycin.
[0100] Other suitable selectable markers include, but are not limited to, antibiotic, metabolic, auxotrophic or herbicide resistance genes which, when inserted in a host cell in culture, would confer on those cells the ability to withstand exposure to an antibiotic. Metabolic or auxotrophic marker genes enable transformed cells to synthesize an essential component, usually an amino acid, which allows the cells to grow on media that lack this component. Another type of marker gene is one that can be screened by histochemical or biochemical assay, even though the gene cannot be selected for. A suitable marker gene found useful in such host cell transformation experiments is a luciferase gene. Luciferase catalyzes the oxidation of luciferin, resulting in the production of oxyluciferin and light. Thus, the use of a luciferase gene provides a convenient assay for the detection of the expression of introduced DNA in host cells by histochemical analysis of the cells. In an example of a transformation process, a nucleotide sequence sought to be expressed in a host cell could be coupled in tandem with the luciferase gene. The tandem construct could be transformed into host cells, and the resulting host cells could be analyzed for expression of the luciferase enzyme. An advantage of this marker is the non-destructive procedure of application of the substrate and the subsequent detection.
[0101] When such selectable markers are successfully transferred into a host cell, the transformed host cell can survive if placed under selective pressure. There are two widely used distinct categories of selective regimes. The first category is based on a cell's metabolism and the use of a mutant cell line which lacks the ability to grow independently of a supplemented medium. Two non-limiting examples are CHO DHFR- cells and mouse LTK- cells. These cells lack the ability to grow without the addition of such nutrients as thymidine or hypoxanthine. Because these cells lack certain genes necessary for a complete nucleotide synthesis pathway, they cannot survive unless the missing nucleotides are provided in a supplemented medium. An alternative to supplementing the medium is to introduce an intact DHFR or TK gene into cells lacking the respective genes, thus altering their growth requirements. Individual cells which were not transformed with the DHFR or TK gene will not be capable of survival in non-supplemented medium.
[0102] The second category is dominant selection, which refers to a selection scheme that can be used in any cell type and does not require the use of a mutant cell line. These schemes typically use a drug to arrest growth of a host cell. Those cells which have a novel gene would express a protein conveying drug resistance and would survive the selection. Examples of such dominant selection use the drugs neomycin (Southern, P. and Berg, P., J. Molec. Appl. Genet. 1: 327 (1982)), mycophenolic acid (Mulligan, R.C. and Berg, P., Science 209: 1422 (1980)) or hygromycin (Sugden, B. et al., Mol. Cell. Biol. 5: 410-413 (1985)). The three examples employ bacterial genes under eukaryotic control to convey resistance to the appropriate drug G418 or neomycin (geneticin), xgpt (mycophenolic acid) or hygromycin, respectively. Others include the neomycin analog G418 and puromycin. Other useful markers are dependent on the host cell of choice and are well known to persons skilled in the art.
[0103] When a transformed host cell is obtained with a method according to the invention (see below), a host tissue may be regenerated from said transformed cell in a suitable medium, which optionally may contain antibiotics or biocides known in the art for the selection of transformed cells.
[0104] Resulting transformed host tissues are preferably identified by means of selection using a selection marker gene as present on a nucleic acid construct as defined herein.
Fourth Aspect
[0105] In a fourth aspect, the present invention provides a method for expressing and optionally purifying a protein or polypeptide of interest comprising the steps of:
[0106] a) providing a nucleic acid construct according to the first aspect of the invention comprising a nucleotide sequence encoding a protein or polypeptide of interest; and,
[0107] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0108] c) allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0109] d) purifying said protein or polypeptide of interest.
[0110] The method of the invention may be an in vitro or ex vivo method. The method of the invention may be applied on a cell culture, organism culture, or tissue culture. Alternatively, next to the expression in host cells, the protein or polypeptide of interest can be produced in cell-free translation systems using RNAs derived from the nucleic acid constructs of the present invention. The method of the invention may be performed on cultured cells.
[0111] The skilled person is capable of transforming cells in accordance with step b). Transformation methods as used in step b) include, but are not limited to, transfer of purified DNA via cationic lipid reagents and polyethylenimine (PEI), calcium-phosphate co-precipitation, microparticle bombardment, electroporation of protoplasts and microinjection or use of silicon fibers to facilitate penetration and transfer of DNA into the host cell.
[0112] In step c) the transformed cell is allowed to express the protein or polypeptide of interest, and optionally said protein or polypeptide is subsequently recovered. For example, the transformed cell may be subjected to conditions leading to expression of the protein or polypeptide of interest. The person skilled in the art is well aware of techniques to be used for expressing or overexpressing the protein or polypeptide of interest. Methods in which the transformed cell does not need to be subjected to specific conditions leading to expression of the protein or polypeptide of interest, but in which the protein or polypeptide of interest is automatically (e.g., constitutively) expressed, are also included in the method of the present invention.
[0113] Within the context of the invention, purification steps depend on the expressed protein or polypeptide and the host cell used but can comprise isolation of the protein or polypeptide. When applied to a protein/polypeptide, the term "isolation" indicates that the protein or polypeptide is found in a condition other than its native environment. In a preferred form, the isolated protein or polypeptide is substantially free of other proteins, particularly other homologous proteins. It is preferred to provide the protein or polypeptide in a greater than 40% pure form, more preferably greater than 60% pure form. Even more preferably it is preferred to provide the protein or polypeptide in a highly purified form, i.e., greater than 80% pure, more preferably greater than 95% pure, and even more preferably greater than 99% pure, as determined by SDS-PAGE. If desired, the nucleotide sequence encoding a protein or polypeptide of interest may be ligated to a heterologous nucleotide sequence to encode a fusion protein or polypeptide to facilitate protein purification and protein detection on for instance Western blot and in an ELISA. Suitable heterologous sequences include, but are not limited to, the nucleotide sequences coding for proteins such as for instance glutathione-S-transferase, maltose binding protein, metal-binding polyhistidine, green fluorescent protein, luciferase and beta-galactosidase. The protein or polypeptide may also be coupled to non-peptide carriers, tags or labels that facilitate tracing of the protein or polypeptide, both in vivo and in vitro, and allow for the identification and quantification of binding of the protein or polypeptide to substrates. Such labels, tags or carriers are well-known in the art and include, but are not limited to, biotin, radioactive labels and fluorescent labels.
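The purity levels named in this paragraph (greater than 40%, 60%, 80%, 95% and 99%, as determined by SDS-PAGE) can be illustrated with a short Python sketch. It assumes that purity is estimated from densitometry as the intensity of the target band divided by the summed intensity of all bands in the lane; this estimation method, the function names and the example values are assumptions for illustration and are not taken from the application.

```python
PURITY_THRESHOLDS = (40, 60, 80, 95, 99)  # percent values named in the text

def percent_purity(target_intensity: float, other_intensities: list) -> float:
    """Target band intensity as a percentage of total lane intensity."""
    total = target_intensity + sum(other_intensities)
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * target_intensity / total

def highest_threshold_exceeded(purity: float):
    """Highest named threshold that purity is strictly greater than, or None."""
    exceeded = [t for t in PURITY_THRESHOLDS if purity > t]
    return max(exceeded) if exceeded else None
```

With a hypothetical target band of 90 units and two contaminant bands of 5 units each, `percent_purity(90, [5, 5])` gives 90.0, which exceeds the 80% level but not the 95% level.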
[0114] Preferably, the method of this fourth aspect of the invention allows for an increase in expression of a protein or polypeptide of interest. Preferably, expression levels are established in an expression system using an expression construct according to the first aspect of the invention comprising a first and a second promoter according to the first aspect of the invention, operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, the method of the invention allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to a method which only differs in that a construct is used in step a) that is free of said first promoter and said second promoter and in which the nucleotide sequence of interest is operably linked to a single promoter, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, the expression of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a first promoter and second promoter sequence to be tested operably linked to said nucleotide sequence of interest. Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that said first promoter and second promoter are replaced by a single promoter, preferably a CMV promoter represented by SEQ ID NO: 57, in the expression vector used.
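The comparison described above, expression from the dual-promoter construct relative to a single-promoter control measured under the same conditions, amounts to a percent-increase calculation. The following Python sketch shows that arithmetic; the function names and the example signal values are hypothetical, and the signal is assumed to be any readout proportional to expression (for example an alkaline phosphatase substrate conversion rate).

```python
# Percent levels listed in the text for the preferred increase
INCREASE_LEVELS = (5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90,
                   100, 200, 300, 500, 1000, 1500, 2000)

def percent_increase(dual_promoter_signal: float, single_promoter_signal: float) -> float:
    """Increase of the test signal over the control, in percent of the control."""
    if single_promoter_signal <= 0:
        raise ValueError("control signal must be positive")
    return 100.0 * (dual_promoter_signal - single_promoter_signal) / single_promoter_signal

def highest_level_met(increase: float):
    """Highest listed level that the increase reaches ('at least'), or None."""
    met = [level for level in INCREASE_LEVELS if increase >= level]
    return max(met) if met else None
```

A test signal three times the control corresponds to an increase of 200%, not 300%, which is worth keeping in mind when fold changes and percent increases are reported side by side.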
Fifth Aspect
[0115] In a fifth aspect, the present invention provides a method for expressing a protein or polypeptide of interest in an organism, comprising the steps of:
[0116] a) providing a nucleic acid construct according to the first aspect of the invention comprising a nucleotide sequence encoding a protein or polypeptide of interest; and,
[0117] b) contacting a target cell and/or target tissue of an organism, with said nucleic acid construct to obtain a transformed target cell and/or transformed target tissue, allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0118] c) allowing said transformed target cell and/or target tissue to develop into a transformed organism; and, optionally,
[0119] d) allowing said transformed organism to express the protein or polypeptide of interest, for example, subjecting said transformed organism to conditions leading to expression of the protein or polypeptide of interest, and optionally recovering said protein or polypeptide.
[0120] Within the context of the invention, the target cell may be an embryonal target cell, e.g., an embryonic stem cell, for example, derived from a non-human mammal, such as a bovine or porcine species. Preferably, said target cell is not a human embryonic stem cell. In the case of a multicellular fungus, such target cell may be a fungal cell that can be proliferated into said multicellular fungus. When a transformed plant tissue or plant cell (e.g., pieces of leaf, stem segments, roots, but also protoplasts or plant cells cultivated by suspension) is obtained with the method according to the invention, whole plants can be regenerated from said transformed tissue or cell in a suitable medium, which optionally may contain antibiotics or biocides known in the art for the selection of transformed cells. The method of the invention may be applied in nucleic acid based vaccination and/or gene therapy preferably in a mammal, most preferably in a human. Encompassed within the present invention is a method of treatment comprising the method of the present aspect, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide. The invention also relates to a construct of the first aspect of the invention for treatment, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide. Furthermore, the invention relates to the use of a construct of the first aspect of the invention for the manufacture of a medicament, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide.
[0121] Furthermore, a part of the invention is a non-human transformed organism. Said organism is transformed with a nucleotide sequence, recombinant nucleic acid construct, or vector according to the present invention, and is capable of producing the polypeptide of interest. This includes a non-human transgenic organism, such as a transgenic non-human mammal, transgenic plant (including propagation, harvest and tissue material of said transgenic plant, including, but not limited to, leaves, roots, shoots and flowers), multicellular fungus, and the like.
[0122] Preferably, the method of this fifth aspect of the invention allows for an increase in expression of a protein or polypeptide of interest in said organism or at least in one tissue or organelle or organ of said organism. Preferably, expression levels are established in an expression system using an expression construct according to the first aspect of the invention comprising a first and a second promoter according to the first aspect of the invention, operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, the method of the invention allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% in said organism or at least in one tissue or organelle or organ of said organism as compared to a method which only differs in that a construct is used in step a) that is free of said first promoter and said second promoter and in which the nucleotide sequence of interest is operably linked to a single promoter, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, the expression of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a first promoter and second promoter sequence to be tested operably linked to said nucleotide sequence of interest. Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that said first promoter and second promoter are replaced by a single promoter, preferably a CMV promoter represented by SEQ ID NO: 57, in the expression vector used. Preferably, said increase of protein or polypeptide expression is copy number independent as established by an assay suitable to determine copy number dependency by a skilled person such as, but not limited to, a triplex TaqMan assay as further detailed in Example 4 of the present invention.
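One simple way to look at copy number dependence is to normalize expression to the number of integrated construct copies before comparing the two constructs. The Python sketch below shows that normalization. It is an assumption made for illustration and does not reproduce the assay of Example 4; the function names and the example numbers are hypothetical.

```python
def expression_per_copy(expression: float, copy_number: float) -> float:
    """Expression signal divided by the measured construct copy number."""
    if copy_number <= 0:
        raise ValueError("copy number must be positive")
    return expression / copy_number

def per_copy_fold_change(dual_expression: float, dual_copies: float,
                         single_expression: float, single_copies: float) -> float:
    """Ratio of per-copy expression: dual-promoter construct over single-promoter control."""
    control = expression_per_copy(single_expression, single_copies)
    if control <= 0:
        raise ValueError("control expression per copy must be positive")
    return expression_per_copy(dual_expression, dual_copies) / control
```

Under this assumption, a per-copy fold change above 1 indicates that the higher expression is not explained by a higher copy number alone.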
Sixth Aspect
[0123] In a sixth aspect, the present invention provides a method for transcription and optionally purifying the produced transcript comprising the steps of:
[0124] a) providing a nucleic acid construct according to the first aspect of the invention comprising a nucleotide sequence of interest; and,
[0125] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0126] c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally,
[0127] d) purifying said produced transcript.
[0128] In a preferred embodiment of the method according to the invention a nucleic acid construct as defined above is used. The method of the invention may be an in vitro or ex vivo method. The method of the invention may be applied on a cell culture, organism culture, or tissue culture. The method of the invention may be applied in nucleic acid based vaccination and/or gene therapy preferably in a mammal, preferably in a human. Encompassed within the present invention is a method for treatment comprising or consisting of the method of the present aspect, wherein the nucleotide sequence of interest encodes for a therapeutic transcript. The invention also relates to a construct of the first aspect of the invention for use in treatment, wherein the nucleotide sequence of interest encodes for a therapeutic transcript. Furthermore, the invention relates to the use of a construct of the first aspect of the invention for the manufacture of a medicament, wherein the nucleotide sequence of interest encodes for a therapeutic transcript.
[0129] The skilled person is capable of transforming cells in accordance with step b). Transformation methods as used in step b) include, but are not limited to, transfer of purified DNA via cationic lipid reagents and polyethylenimine (PEI), calcium-phosphate co-precipitation, microparticle bombardment, electroporation of protoplasts and microinjection or use of silicon fibers to facilitate penetration and transfer of DNA into the host cell.
[0130] In step c) the transformed cell is allowed to produce a transcript of the nucleotide sequence of interest, and optionally the produced transcript is subsequently recovered. For example, the transformed cell may be subjected to conditions leading to transcription of the nucleotide sequence of interest. The person skilled in the art is well aware of techniques to be used for transcribing the nucleotide sequence of interest. Methods in which the transformed cell does not need to be subjected to specific conditions leading to transcription of the nucleotide sequence of interest, but in which the nucleotide sequence of interest is automatically (e.g., constitutively) transcribed, are also included in the method of the present invention.
[0131] Purification steps depend on the transcript produced. The term "isolation" indicates that the transcript is found in a condition other than its native environment. In a preferred form, the isolated transcript is substantially free of other cellular components, particularly other homologous cellular components such as homologous proteins. It is preferred to provide the transcript in a greater than 40% pure form, more preferably greater than 60% pure form. Even more preferably it is preferred to provide the transcript in a highly purified form, i.e., greater than 80% pure, more preferably greater than 95% pure, and even more preferably greater than 99% pure, as determined by Northern blotting.
[0132] Preferably, the method of this aspect of the invention allows for an increase in transcription of a nucleotide sequence of interest. Preferably, transcription levels are established in an expression system using an expression construct according to the first aspect of the invention comprising a first and a second promoter according to the first aspect of the invention operably linked to a nucleotide sequence of interest. Preferably, transcription of said nucleotide sequence of interest is detected by a suitable assay such as RT-qPCR. Preferably, the method of the invention allows for an increase in transcription of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to a method which only differs in that a construct is used in step a) that is free of said first promoter and second promoter and in which the nucleotide sequence of interest is operably linked to a single promoter, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the transcription of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a first promoter and second promoter sequence to be tested operably linked to said nucleotide sequence of interest. Transcription is preferably measured using RT-qPCR and transcription levels are compared to transcription levels of said nucleotide sequence of interest which are measured under the same conditions except that said first promoter and second promoter are replaced by a single promoter, preferably a CMV promoter represented by SEQ ID NO: 57, in the expression vector used.
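Where transcription is compared by RT-qPCR, a relative change is often derived from threshold cycle (Ct) values. The Python sketch below uses the widely used 2^(-ddCt) calculation as an example. The application does not state which quantification model is used, so this choice is an assumption, as are the use of a reference gene, the assumed amplification efficiency of about 100%, the function names and the example Ct values.

```python
def fold_change_ddct(ct_target_test: float, ct_reference_test: float,
                     ct_target_control: float, ct_reference_control: float) -> float:
    """Relative transcript level (test over control) by the 2^(-ddCt) calculation."""
    delta_test = ct_target_test - ct_reference_test            # normalize test sample
    delta_control = ct_target_control - ct_reference_control   # normalize control sample
    return 2.0 ** -(delta_test - delta_control)

def percent_increase_from_fold(fold_change: float) -> float:
    """Convert a fold change into a percent increase over the control."""
    return (fold_change - 1.0) * 100.0
```

With hypothetical Ct values of 20 (target) and 18 (reference) for the dual-promoter construct and 22 and 18 for the single-promoter control, the fold change is 4, which corresponds to a 300% increase.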
Seventh Aspect
[0133] In a seventh aspect, the present invention provides a method for splicing or redirecting the splicing of a nucleotide sequence of interest, and optionally purifying the produced transcripts, comprising the steps of:
[0134] a) providing a nucleic acid construct according to the first aspect of the invention comprising a nucleotide sequence of interest; and,
[0135] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0136] c) allowing said transformed cell to produce transcripts of the nucleotide sequence of interest resulting in the production of a transcript; and optionally,
[0137] d) purifying said produced transcripts.
[0138] Preferably within this aspect, said nucleic acid construct used in step a) comprises a nucleotide sequence upstream or at the 5' site of the second intronic sequence of the invention that is different or distinct from the nucleotide sequence upstream or at the 5' site of the first intronic sequence of the invention. Preferably, the nucleotide sequence between first promoter and 5' of said first intronic sequence and the nucleotide sequence between second promoter and 5' of said second intronic sequence differ at least 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% in nucleotide sequence. Preferably, a method of this aspect of the invention wherein such a nucleic acid construct is used results in the production of two different or distinct transcripts, which differ in nucleotide sequence at the 5' site of the transcripts. In case the nucleotide sequence of interest is located downstream or at the 3' site of the second intronic sequence, the resulting transcripts will differ in sequence upstream of said nucleotide sequence of interest, as can be detected by any suitable assay known by the person skilled in the art, such as, but not limited to, 5'RACE-PCR.
[0139] In a preferred embodiment within this aspect, the nucleotide sequence of interest is a sequence encoding a protein or polypeptide of interest. The method of this aspect can be used to produce two proteins or polypeptides with different or distinct N-termini using the construct of the invention. Preferably, a first protein or polypeptide comprising a first N-terminus and a second protein or polypeptide comprising a second N-terminus are produced using the method of this aspect, wherein preferably, a first nucleotide sequence encoding said first N-terminus is located directly upstream or at the 5' site of said first intronic sequence and a second nucleotide sequence encoding said second N-terminus is located directly upstream or at the 5' site of said second intronic sequence. Preferably said first nucleotide sequence encoding said first N-terminus is located downstream or at the 3' site of said first promoter and upstream or at the 5' site of said first intronic sequence. Preferably said second nucleotide sequence encoding said second N-terminus is located downstream or at the 3' site of said second promoter and upstream or at the 5' site of said second intronic sequence. Preferably, said nucleic acid construct further comprises a nucleotide sequence encoding a C-terminus located downstream or at the 3' site of said second intronic sequence. In this embodiment, it is required that said second intronic sequence is an intron as defined in the Definition section. The difference between the N termini may be limited to a signal sequence and result in identical expressed proteins or polypeptides, wherein the localization of the proteins or polypeptides may differ. 
If performed in an expression system as earlier defined herein, the method of this embodiment preferably results in the production of two proteins or polypeptides of interest, wherein said first protein or polypeptide will comprise said first N-terminus linked to said C-terminus and said second protein or polypeptide will comprise said second N-terminus linked to said C-terminus, as can be detected by any suitable assay known by the person skilled in the art, such as, but not limited to, ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, said assay used to detect the two different or distinct proteins or polypeptides produced is specifically adapted to distinguish between the different proteins or polypeptides produced, for instance using a detecting antibody specifically binding to either the first or the second N-terminus of proteins or polypeptides produced.
Eighth Aspect
[0140] In an eighth aspect, the present invention provides a use of a nucleic acid construct according to the first aspect of the invention, and/or a use of an expression vector according to the second aspect of the invention, and/or a use of a cell according to the third aspect of the invention, for the expression of a protein or polypeptide of interest.
Ninth Aspect
[0141] In a ninth aspect, the present invention provides for a nucleic acid construct according to the first aspect of the invention, and/or an expression vector according to the second aspect of the invention, and/or a cell according to the third aspect of the invention for use as a medicament. The invention also relates to a method of treatment comprising the administration of a nucleic acid construct according to the first aspect of the invention, and/or an expression vector according to the second aspect of the invention, and/or a cell according to the third aspect of the invention, wherein preferably said administration is to a mammal, more preferably to a human. Preferably, said treatment is nucleic acid based vaccination and/or gene therapy, preferably in a mammal, most preferably in a human. Furthermore, the invention relates to the use of a nucleic acid construct according to the first aspect of the invention, and/or the use of an expression vector according to the second aspect of the invention, and/or the use of a cell according to the third aspect of the invention, for the preparation of a medicament. Preferably said medicament is for nucleic acid based vaccination and/or gene therapy, preferably in a mammal, most preferably in a human.
Tenth Aspect
[0142] In a tenth aspect, the present invention provides a nucleic acid molecule that is represented by a nucleotide sequence that comprises or consists of an expression enhancing element for increasing transcription of a nucleotide sequence of interest and/or expression of a protein or polypeptide of interest. Preferably, the expression enhancing element of the invention is capable of increasing the transcription of a nucleotide sequence of interest and/or expression of a protein or polypeptide of interest. Preferably in this aspect, the expression enhancing element of the invention capable of increasing the transcription of a nucleotide sequence of interest and/or expression of a protein or polypeptide of interest is located upstream or at the 5' site of a promoter that is operably linked to said nucleotide sequence of interest.
[0143] Preferably within this aspect, transcription levels are established in an expression system using an expression construct comprising said expression enhancing element operably linked to a nucleotide sequence of interest using a suitable assay such as RT-qPCR. Preferably, the expression enhancing element of the invention allows for an increase in transcription of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to transcription levels using a construct which only differs in that it is free of said expression enhancing element, preferably as exemplified in the Examples which are enclosed herein. More specifically, preferably the transcription of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising an expression enhancing element to be tested and a CMV promoter represented by SEQ ID NO: 57, operably linked to said nucleotide sequence of interest. Transcription is preferably measured using RT-qPCR and transcription levels are compared to transcription levels of said nucleotide sequence of interest measured under the same conditions except that the expression vector used is free of said expression enhancing element to be tested.
[0144] Preferably within this aspect, expression levels are established in an expression system using an expression construct comprising said expression enhancing element operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an enzyme-linked immunosorbent assay (ELISA) assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, the expression enhancing element of the invention allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to expression of said protein or polypeptide using a construct which only differs in that it is free of said expression enhancing element, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising an expression enhancing element to be tested and a CMV promoter represented by SEQ ID NO: 57, operably linked to said nucleotide sequence of interest. Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that the expression vector is free of said expression enhancing element to be tested.
[0145] Preferably within this aspect, said nucleic acid molecule is an isolated nucleic acid molecule as defined herein. In a preferred embodiment, said expression enhancing element is a sequence that is derived from the UBC ubiquitin gene. Preferably, said expression enhancing element is derived from a mammalian UBC ubiquitin gene. More preferably, said expression enhancing element is derived from the Cricetulus griseus homologous gene of the human UBC ubiquitin gene, said gene being indicated as the Cricetulus sp. gene for polyubiquitin, or CRUPUQ (GenBank D63782).
[0146] In a preferred embodiment, said expression enhancing element derived from CRUPUQ comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-969 of SEQ ID NO: 1. Preferably, said sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-969 of SEQ ID NO: 1 is a promoter as defined in the Definition section. Preferably, said promoter is capable of initiating transcription of a nucleotide sequence of interest and/or expression of a protein or polypeptide of interest encoded by a nucleotide sequence in a host cell as defined herein below. In a further preferred embodiment, said expression enhancing element derived from CRUPUQ comprises or consists of an intronic sequence. An intronic sequence is understood to be at least part of the nucleotide sequence of an intron. Preferably, said intronic sequence comprises at least a donor splice site or splice site GT. A donor splice site is understood herein as a splice site that, when combined with an acceptor splice site as defined herein, results in the formation of an intron as defined in the Definition section. Preferably, a nucleotide sequence is an intron if at least 2%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% of the primary RNA loses this sequence by RNA splicing, as determined using an assay suitable to detect intron splicing, such as but not limited to reverse-transcriptase polymerase chain reaction (RT-PCR) followed by size or sequence analysis of the RT-PCR product. Preferred donor splice sites of the invention are M-W-G-[cut]-G-T-R-A-G-K in case the host cell is a mammalian cell, A-G-[cut]-G-T-A-W-K in case the host cell is a plant cell, [cut]-G-T-A-W-G-T-T in case the host cell is a yeast cell and R-G-[cut]-G-T-R-A-G in case the host cell is an insect cell. 
"[cut]" is to be understood herein as the specific cut site where splicing will take place. Intron splicing can be assessed functionally using an assay as detailed in the Definition section under "intron". Most preferably, the donor splice site comprised within the expression enhancing element of the invention is C-T-G-[cut]-G-T-G-A-G-G. Preferably, said intronic sequence encompassed within said expression enhancing element consists of said donor splice site or splice site GT. Preferably, said intronic sequence encompassed within said expression enhancing element is free of an acceptor splice site as defined herein below. Preferably, said expression enhancing element comprising an intronic sequence has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 970-1449 of SEQ ID NO: 1. In a preferred embodiment, said expression enhancing element comprises or consists of a promoter and an intronic sequence as defined herein. Preferably, the expression enhancing element of the invention has at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 1 over its whole length. Preferably, said expression enhancing element of the invention that has at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 1 over its whole length comprises both a promoter and an intronic sequence as defined herein. Preferably, said expression enhancing element for increasing expression is a contiguous sequence of at least 500, 600, 700, 800, 900, 1000, 1100 or 1117 in length, preferably at least 1449 nucleotides in length of SEQ ID NO: 1. 
Preferably, said expression enhancing element having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 1 is at most 8000 nucleotides in length. Preferably, said expression enhancing element is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length. Most preferably, said sequence is 1449 nucleotides in length. Preferably, said expression enhancing element is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1449 nucleotides in length and has at least 65%, 70%, 75%, 80%, 85%, 90% or 95% identity to SEQ ID NO: 1 over its whole length, each combination of a recited length limit and a recited identity threshold being a preferred embodiment. Preferably, said expression enhancing element comprises or consists of a sequence that is represented by SEQ ID NO: 1.
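For illustration, the host-specific donor splice site consensus sequences recited above can be translated into search patterns using the standard IUPAC nucleotide ambiguity codes (M = A or C, W = A or T, R = A or G, K = G or T). The following is a minimal sketch only; the dictionary keys, function names and the test sequence are hypothetical, and a sequence match does not establish that splicing actually occurs, which is assessed functionally as described above.

```python
import re

# Standard IUPAC nucleotide ambiguity codes used in the consensus sequences.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "W": "[AT]", "S": "[CG]", "D": "[AGT]", "N": "[ACGT]",
}

# (sequence 5' of [cut], sequence 3' of [cut]) per host cell type, as recited.
DONOR_CONSENSUS = {
    "mammalian": ("MWG", "GTRAGK"),
    "plant": ("AG", "GTAWK"),
    "yeast": ("", "GTAWGTT"),
    "insect": ("RG", "GTRAG"),
}


def to_regex(motif: str) -> str:
    """Translate an IUPAC motif into a regular expression fragment."""
    return "".join(IUPAC[code] for code in motif)


def find_donor_sites(seq: str, host: str) -> list:
    """Return 0-based [cut] positions (index of the G of GT) of consensus matches."""
    upstream, downstream = DONOR_CONSENSUS[host]
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile("(?=" + to_regex(upstream) + to_regex(downstream) + ")")
    return [m.start() + len(upstream) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical test sequence containing C-T-G-[cut]-G-T-G-A-G-G.
    print(find_donor_sites("AAACTGGTGAGGTT", "mammalian"))
```

The most preferred donor splice site C-T-G-[cut]-G-T-G-A-G-G is an instance of the mammalian consensus M-W-G-[cut]-G-T-R-A-G-K, which the sketch reports as a single cut position.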
[0147] In a further preferred embodiment within this aspect, said expression enhancing element of the invention is a sequence derived from the CCT8 gene. Preferably, said element is derived from a mammalian CCT8 gene. More preferably, said expression enhancing element is derived from the human or Homo sapiens CCT8 gene.
[0148] In a preferred embodiment within said aspect, said expression enhancing element derived from the Homo sapiens CCT8 gene comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-614 of SEQ ID NO: 2. Preferably, said sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 1-614 of SEQ ID NO: 2 is a promoter as defined in the Definition section. In a further preferred embodiment, said expression enhancing element derived from the Homo sapiens CCT8 gene comprises or consists of an intronic sequence. Preferably, said intronic sequence is an intronic sequence as earlier defined herein comprising at least a donor splice site or splice site GT as earlier defined herein. Preferably said donor splice site has the sequence M-A-R-[cut]-G-T-R-A-G-K, most preferably A-A-A-[cut]-G-T-G-A-G-G. Preferably, said intronic sequence encompassed within said expression enhancing element consists of said donor splice site or splice site GT. Preferably, said expression enhancing element comprising an intronic sequence has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to nucleotides 667-1228 of SEQ ID NO: 2.
[0149] In a preferred embodiment within said aspect, said expression enhancing element comprises or consists of a promoter and an intronic sequence as defined herein. Preferably, the expression enhancing element of the invention has at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 2 over its whole length. Preferably, said expression enhancing element of the invention that has at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 2 over its whole length comprises both a promoter and an intronic sequence as defined herein. Preferably, said expression enhancing element for increasing expression is a contiguous sequence of SEQ ID NO: 2 of at least 500, 600, 700, 791 or 1223 nucleotides in length, preferably at least 1228 nucleotides in length. Preferably, said expression enhancing element having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 2 is at most 8000 nucleotides in length. Preferably, said expression enhancing element is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length. Most preferably, said sequence is 1228 nucleotides in length. Preferably, said expression enhancing element is at most 8000, 7000, 6000, 5000, 4000, 3000, 2000, 1900, 1800, 1700, 1600, 1500 or 1228 nucleotides in length and has at least 65%, 70%, 75%, 80%, 85%, 90% or 95% identity to SEQ ID NO: 2 over its whole length, each combination of a recited length limit and a recited identity threshold being a preferred embodiment. Preferably, said expression enhancing element comprises or consists of a sequence that is represented by SEQ ID NO: 2.
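The combination of a maximum length and a minimum identity to a reference sequence, as recited above, can be checked as sketched below. This is an illustrative sketch only: it assumes that the candidate and the reference have already been globally aligned by an external tool and are supplied as equal-length strings with "-" for gaps, and it uses the alignment length as the denominator. How identity is actually determined is governed by the Definition section, which may prescribe a different algorithm or denominator; all names and sequences here are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences over the alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    # Count columns where both sequences carry the same nucleotide (gaps never match).
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_embodiment(aligned_candidate: str, aligned_reference: str,
                     min_identity: float, max_length: int) -> bool:
    """True if the candidate is short enough and sufficiently identical."""
    candidate_length = len(aligned_candidate.replace("-", ""))
    identity = percent_identity(aligned_candidate, aligned_reference)
    return candidate_length <= max_length and identity >= min_identity


if __name__ == "__main__":
    # Hypothetical toy alignment, far shorter than SEQ ID NO: 1 or 2.
    print(percent_identity("ACGT", "ACGA"))
    print(meets_embodiment("ACGT", "ACGA", min_identity=70, max_length=8000))
```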
[0150] Further preferred is a nucleotide sequence comprising an expression enhancing element as defined herein that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to any of SEQ ID NO: 59-61 over its whole length. Preferably, said expression enhancing element comprises or consists of any of the sequences selected from the group consisting of SEQ ID NO: 59-61. Most preferably, said expression enhancing element comprises or consists of a sequence that is represented by SEQ ID NO: 59.
Eleventh Aspect
[0151] In an eleventh aspect, the present invention provides a nucleic acid construct comprising a nucleic acid molecule according to the tenth aspect of the invention. A nucleic acid construct of the invention comprises an expression enhancing element according to the tenth aspect of the present invention. Preferably, said nucleic acid construct is a recombinant and/or isolated construct as defined herein. Preferably, said nucleic acid construct further comprises a heterologous promoter, wherein preferably said expression enhancing element and said heterologous promoter are configured to be both operably linked to an optional nucleotide sequence of interest as defined herein below. "Heterologous promoter" is to be understood herein as a promoter that is not naturally operably linked to the expression enhancing element of the invention, i.e. a contiguous sequence comprising said expression enhancing element and said heterologous promoter does not occur in nature as neighboring sequences but can be synthesized as a recombinant sequence.
[0152] Preferably within this aspect, said heterologous promoter is located within a nucleic acid construct of the invention downstream or at the 3' site of the expression enhancing element of the invention. Preferably, said heterologous promoter is located within a construct of the invention upstream or at the 5' site of the nucleotide sequence of the invention encoding a protein or polypeptide of interest. In an embodiment of the invention the heterologous promoter is a promoter capable of initiating transcription in the host cell of choice. Heterologous promoters as used herein include tissue-specific, tissue-preferred, cell-type specific, inducible and constitutive promoters as defined herein. Heterologous promoters and/or regulating sequences that may be employed in expression of polypeptides according to the present invention, preferably in mammalian cells, include, but are not limited to, the human or murine cytomegalovirus (CMV) promoter, a simian virus (SV40) promoter, a human or mouse ubiquitin C promoter, a human or mouse or rat elongation factor alpha (EF1-α) promoter, mouse or hamster beta-actin promoter, or a hamster rpS21 promoter. The Tet-Off and Tet-On elements upstream of a minimal promoter such as a CMV promoter form an example of an inducible mammalian promoter. Examples of suitable yeast and fungal promoters are Leu2 promoter, the galactose (Gal1 or Gal7) promoter, alcohol dehydrogenase I (ADH1) promoter, glucoamylase (Gla) promoter, triose phosphate isomerase (TPI) promoter, translational elongation factor EF-I alpha (TEF2) promoter, glyceraldehyde-3-phosphate dehydrogenase (gpdA) promoter, alcohol oxidase (AOX1) promoter, or glutamate dehydrogenase (gdhA) promoter. An example of a strong ubiquitous promoter for expression in plants is cauliflower mosaic virus (CaMV) 35S promoter. 
Preferably, the nucleic acid construct of the invention comprises a heterologous promoter that is represented by a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 58. More preferably, the nucleic acid construct of this aspect of the invention comprises a heterologous promoter that is represented by a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 57.
[0153] In a further preferred embodiment within this aspect, the nucleic acid construct of the invention further comprises one or more additional expression regulating elements, wherein preferably said expression enhancing element, said heterologous promoter and said one or more additional expression regulating elements are configured to be all operably linked to an optional nucleotide sequence of interest as defined herein below. An "additional expression regulating element" is to be understood herein as an element in addition to the expression enhancing element and/or promoter as defined herein above which may be an additional expression enhancing element or a distinct expression enhancing element or an expression regulating element in its broadest sense. An additional expression regulating element as encompassed by the present invention can be involved in the transcriptional and/or translational regulation of a gene, including but not limited to, 5'-UTR, 3'-UTR, enhancer, promoter, intronic sequence, polyadenylation signal and chromatin control elements such as scaffold/matrix attachment regions, ubiquitous chromatin opening element, cytosine phosphodiester guanine pairs and stabilizing and anti-repressor elements, and any derivatives thereof. Other optional regulating elements that may be present in the nucleic acid construct of the invention include, but are not limited to, coding nucleotide sequences of homologous and/or heterologous nucleotide sequences, including the Iron Responsive Element (IRE), Translational cis-Regulatory Element (TLRE) or uORFs in 5' UTRs and poly(U) stretches in 3' UTRs.
[0154] Preferably within this aspect, said additional expression regulating element comprises or consists of an intronic sequence as defined herein. Preferably, the intronic sequence encompassed within the additional expression regulating element comprises at least an acceptor splice site, which is understood herein to comprise the splice site AG preferably preceded by a polypyrimidine tract nucleotide sequence, optionally separated from splice site AG by 1-50 nucleotides, and optionally further comprising a branch site comprising the sequence Y-T-N-A-Y, at the 5' site of the polypyrimidine tract nucleotide sequence, wherein the branch site may have the nucleotide sequence C-Y-G-A-C. An acceptor splice site is understood herein as a splice site that, when combined with a donor splice site encompassed within a construct, results in the formation of an intron as defined in the Definition section. Preferably, the acceptor splice site or splice site AG has the sequence [Y-rich]-N-Y-A-G-[cut]. Preferably, the acceptor splice site or splice site AG has the sequence [Y-rich]-N-Y-A-G-[cut]-R in case the host cell is a mammalian cell, [Y-rich]-D-Y-A-G-[cut]-R or [Y-rich]-D-Y-A-G-[cut]-R-W in case the host cell is a plant cell, [Y-rich]-A-Y-A-G-[cut] in case the host cell is a yeast cell and [Y-rich]-N-Y-A-G-[cut] in case the host cell is an insect cell. "[Y-rich]" is to be understood herein as a polypyrimidine tract which is preferably defined as a consecutive sequence of at least 10 nucleotides comprising at least 6, 7, 8, 9 or preferably 10 pyrimidine nucleotides. Preferably, said acceptor splice site or splice site AG has the sequence Y-A-G-[cut]-R. Preferably, said intronic sequence encompassed within said additional expression regulating element consists of said acceptor splice site or splice site AG. 
The intronic sequence preferably comprises or consists of a nucleotide sequence having at least 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to nucleotides 171-277 of SEQ ID NO: 14, nucleotides 171-274 of SEQ ID NO: 19, nucleotides 133-210 of SEQ ID NO: 20, nucleotides 134-211 of SEQ ID NO: 21, nucleotides 134-226 of SEQ ID NO: 22, nucleotides 134-226 of SEQ ID NO: 23, nucleotides 133-225 of SEQ ID NO: 24, nucleotides 134-226 of SEQ ID NO: 25, nucleotides 146-257 of SEQ ID NO: 26, or nucleotides 147-223 of SEQ ID NO: 27, or nucleotides 970-1449 of SEQ ID NO: 1 or nucleotides 667-1228 of SEQ ID NO: 2. Preferably, said intronic sequence comprised within said additional expression regulating element further comprises a donor splice site as defined herein. Even more preferred, said intronic sequence encompassed within the additional expression regulating element is an intron as defined in the Definition section. Most preferably, the intronic sequence encompassed within the additional expression regulating element comprises or consists of a nucleotide sequence having at least 80%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to nucleotides 171-277 of SEQ ID NO: 14, nucleotides 171-274 of SEQ ID NO: 19, nucleotides 133-210 of SEQ ID NO: 20, nucleotides 134-211 of SEQ ID NO: 21, nucleotides 134-226 of SEQ ID NO: 22, nucleotides 134-226 of SEQ ID NO: 23, nucleotides 133-225 of SEQ ID NO: 24, nucleotides 134-226 of SEQ ID NO: 25, nucleotides 146-257 of SEQ ID NO: 26, or nucleotides 147-223 of SEQ ID NO: 27.
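As an illustration of the acceptor splice site definition given above, the sketch below searches a sequence for Y-A-G followed by [cut] and requires a polypyrimidine tract (a window of 10 consecutive nucleotides containing at least 6 pyrimidines) upstream of the N position. It is a simplified sketch under stated assumptions: the search region upstream of the site is limited to the tract length plus a spacing of at most 50 nucleotides, the branch site is not examined, and the function names and test sequences are hypothetical.

```python
def has_polypyrimidine_tract(region: str, min_pyrimidines: int = 6) -> bool:
    """True if any window of 10 consecutive nucleotides has enough C/T."""
    return any(
        sum(base in "CT" for base in region[i:i + 10]) >= min_pyrimidines
        for i in range(len(region) - 9)
    )


def find_acceptor_sites(seq: str, max_gap: int = 50,
                        min_pyrimidines: int = 6) -> list:
    """Return 0-based [cut] positions for [Y-rich]-N-Y-A-G-[cut] candidates."""
    seq = seq.upper()
    cuts = []
    for j in range(len(seq) - 2):
        # Y-A-G: a pyrimidine followed by the splice site AG.
        if seq[j] in "CT" and seq[j + 1:j + 3] == "AG":
            end = max(j - 1, 0)                   # exclude the N position 5' of Y
            start = max(end - (max_gap + 10), 0)  # tract plus allowed spacing
            if has_polypyrimidine_tract(seq[start:end], min_pyrimidines):
                cuts.append(j + 3)                # [cut] lies directly 3' of AG
    return cuts


if __name__ == "__main__":
    # Hypothetical sequences: pyrimidine-rich and purine-rich upstream regions.
    print(find_acceptor_sites("TTTCTTCCTTGCAGGA"))
    print(find_acceptor_sites("GAGAGAGAGAGCAGGA"))
```

Raising `min_pyrimidines` to 7, 8, 9 or 10 corresponds to the progressively stricter tract definitions recited above.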
[0155] Also preferred within this aspect is an expression regulating element that is a translation enhancing element. Preferably, a translation enhancing element allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to expression of said protein or polypeptide using a construct which only differs in that it is free of said translation enhancing element, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a translation enhancing element to be tested and a CMV promoter represented by SEQ ID NO: 57, operably linked to said nucleotide sequence of interest. Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that the expression vector is free of said translation enhancing element to be tested.
[0156] Preferably within this aspect, said translation enhancing element comprises or consists of a nucleotide sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 3-51 over its whole length. Preferably, said translation enhancing element comprises or consists of a nucleotide sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 19 over its whole length. More preferably, said translation enhancing element comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 3-51 over its whole length. Also preferred within this aspect is a translation enhancing element that comprises or consists of a nucleotide sequence that comprises:
[0157] i) a GAA repeat nucleotide sequence, a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides;
[0158] ii) a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, and a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides, said expression enhancing element not comprising a GAA repeat nucleotide sequence; or,
[0159] iii) a GAA repeat nucleotide sequence, a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides, wherein said GAA repeat nucleotide sequence is located 3' of any one or more of said TC-rich nucleotide sequence, A-rich nucleotide sequences, and/or GT-rich nucleotide sequence.
[0160] The GAA repeat nucleotide sequence, the TC-rich nucleotide sequence, the A-rich nucleotide sequence and the GT-rich nucleotide sequence have already been defined herein in the first aspect of the invention. These definitions also apply here.
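The sequence features listed in items i) to iii) can be checked mechanically. The Python sketch below is a minimal illustration under stated assumptions: the GAA repeat is defined elsewhere in the specification, so the minimum of three consecutive GAA triplets used here is a placeholder; the helper names are hypothetical; and the relative position of the GAA repeat required by item iii) is not checked.

```python
import re


def has_gaa_repeat(seq: str, min_repeats: int = 3) -> bool:
    # ASSUMPTION: min_repeats is a placeholder; the specification defines
    # the GAA repeat in the first aspect, which is not reproduced here.
    return re.search(r"(?:GAA){%d,}" % min_repeats, seq.upper()) is not None


def has_tc_rich(seq: str) -> bool:
    # At least 8 consecutive C or T nucleotides.
    return re.search(r"[CT]{8,}", seq.upper()) is not None


def count_a_rich(seq: str) -> int:
    # Number of separate runs of at least 5 consecutive A nucleotides.
    return len(re.findall(r"A{5,}", seq.upper()))


def has_gt_rich(seq: str, min_len: int = 10, min_percent: int = 80) -> bool:
    # True if any stretch of at least min_len nucleotides has at least
    # min_percent G or T. Integer arithmetic avoids rounding issues.
    seq = seq.upper()
    prefix = [0]  # prefix[i] = number of G/T in seq[:i]
    for base in seq:
        prefix.append(prefix[-1] + (base in "GT"))
    n = len(seq)
    for start in range(n - min_len + 1):
        for end in range(start + min_len, n + 1):
            gt = prefix[end] - prefix[start]
            if 100 * gt >= min_percent * (end - start):
                return True
    return False


def motif_report(seq: str) -> dict:
    """Summarise which of the four features are present in seq."""
    return {
        "gaa_repeat": has_gaa_repeat(seq),
        "tc_rich": has_tc_rich(seq),
        "a_rich_count": count_a_rich(seq),
        "gt_rich": has_gt_rich(seq),
    }


def matches_item_ii(seq: str) -> bool:
    """Item ii): TC-rich, at least 3 A-rich, GT-rich, and no GAA repeat."""
    r = motif_report(seq)
    return (r["tc_rich"] and r["a_rich_count"] >= 3
            and r["gt_rich"] and not r["gaa_repeat"])
```

The GT-rich check scans every stretch of at least 10 nucleotides rather than a fixed 10-nucleotide window, because a longer stretch can reach 80% G or T even when no 10-nucleotide window within it does.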
[0161] Preferably within said aspect, said additional expression regulating element comprises a translation enhancing element as defined herein and an intronic sequence.
[0162] Preferably within said aspect, said additional expression regulating element is located within a nucleic acid construct of the invention downstream or at the 3' site of a heterologous promoter. Preferably, said additional expression regulating element is located within a nucleic acid construct of the invention upstream or at the 5' site of a nucleic acid sequence encoding a protein or polypeptide of interest. Moreover, preferably a nucleic acid construct of the invention comprises the following nucleotide sequences indicated here in their relative positions in the 5' to 3' direction: (i) an expression enhancing element, (ii) a heterologous promoter, optionally (iii) an additional expression regulating element, and optionally (iv) a nucleotide sequence of interest, wherein preferably said expression enhancing element, said heterologous promoter and said additional expression regulating element are configured to be all operably linked to said optional nucleotide sequence of interest as defined herein below. It is to be understood that said expression enhancing element, said heterologous promoter, and optionally said additional expression regulating element of the nucleic acid construct of the invention are all configured to be operably linked to the same, single nucleotide sequence of interest.
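The relative 5' to 3' arrangement set out in paragraph [0162] can be expressed as a simple ordering check. The Python sketch below is an illustration only: the element labels and the function name are hypothetical, and any label not in the list (for example a polyadenylation signal) is ignored rather than validated.

```python
from typing import List

# Relative 5' to 3' order from paragraph [0162]; labels are hypothetical.
CANONICAL_ORDER = [
    "expression_enhancing_element",              # (i)
    "heterologous_promoter",                     # (ii)
    "additional_expression_regulating_element",  # (iii), optional
    "nucleotide_sequence_of_interest",           # (iv), optional
]
REQUIRED = {"expression_enhancing_element", "heterologous_promoter"}


def is_valid_layout(elements: List[str]) -> bool:
    """Check a 5' to 3' list of element labels against the stated order."""
    known = [e for e in elements if e in CANONICAL_ORDER]
    if not REQUIRED.issubset(known):
        return False  # elements (i) and (ii) must both be present
    if len(set(known)) != len(known):
        return False  # this sketch allows each listed element only once
    ranks = [CANONICAL_ORDER.index(e) for e in known]
    return ranks == sorted(ranks)
```

A layout without a nucleotide sequence of interest is accepted, which mirrors the statement that this element is optional.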
[0163] The inventors found an unexpected synergistic effect when the expression enhancing element of the invention is combined with an additional expression regulating element as defined herein in an expression construct for expressing a protein or polypeptide of interest. In a stably transfected pool with both an expression enhancing element and an additional expression regulating element, the protein yield was significantly higher than the yield expected based on addition of the separate effects of either element. Preferably, said increase of protein or polypeptide expression is copy number independent as established by an assay suitable to determine copy number dependency by a skilled person, such as, but not limited to, a triplex Taqman assay as further detailed in Example 4 of the present invention. Preferably, said nucleic acid construct is a recombinant and/or isolated construct as defined herein. Preferably, said nucleic acid construct further comprises a nucleotide sequence of interest operably linked to and/or under the control of said expression enhancing element, said heterologous promoter and optionally said additional expression regulating element as defined herein. The presence of a nucleotide sequence of interest is optional. "Optional" is to be understood herein as not necessarily being present in an expression construct. For instance, such nucleotide sequence of interest need not be present in a commercialized expression vector, but may be readily introduced by a person skilled in the art before use in a method of the invention. It is to be understood that said expression enhancing element, said heterologous promoter, and optionally said additional expression regulating element are all configured to be operably linked to the same, single nucleotide sequence of interest.
Preferably, said nucleic acid construct of the tenth aspect of the invention has at least 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% sequence identity to SEQ ID NO: 73, 74, 75 or 76.
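The synergy described in paragraph [0163] compares the yield obtained with both elements against the yield expected if the separate effects of each element were simply added. The Python sketch below illustrates one way to write that comparison; the function names are hypothetical, no measured values from the Examples are used, and the statistical test implied by "significantly higher" is not covered.

```python
def expected_additive_yield(baseline: float,
                            with_enhancer: float,
                            with_regulator: float) -> float:
    """Yield expected if the two separate effects were purely additive.

    baseline       -- yield of the construct lacking both elements
    with_enhancer  -- yield with only the expression enhancing element
    with_regulator -- yield with only the additional regulating element
    """
    return baseline + (with_enhancer - baseline) + (with_regulator - baseline)


def is_synergistic(baseline: float, with_enhancer: float,
                   with_regulator: float, with_both: float) -> bool:
    """True if the combined yield exceeds the additive expectation."""
    expected = expected_additive_yield(baseline, with_enhancer, with_regulator)
    return with_both > expected
```

With purely illustrative yields of 1, 2 and 3 units for the baseline and the two single-element constructs, the additive expectation is 4 units, so a combined yield above 4 units would count as synergistic under this definition.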
[0164] In a preferred embodiment within this aspect, said nucleotide sequence of interest is a nucleotide sequence encoding a protein or polypeptide of interest. The protein or polypeptide of interest can be a homologous protein or polypeptide, but in a preferred embodiment of the invention the protein or polypeptide of interest is a heterologous protein or polypeptide. A nucleotide sequence encoding a heterologous protein or polypeptide may be derived in whole or in part from any source known to the art, including a bacterial or viral genome or episome, eukaryotic nuclear or plasmid DNA, cDNA or chemically synthesised DNA. The nucleotide sequence encoding a protein or polypeptide of interest may constitute an uninterrupted coding region or it may include one or more introns bounded by appropriate splice junctions. It can further be composed of segments derived from different sources, naturally occurring or synthetic. The nucleotide sequence encoding the protein or polypeptide of interest according to the method of the invention is preferably a full-length nucleotide sequence, but can also be a functionally active part or other part of said full-length nucleotide sequence. The nucleotide sequence encoding the protein or polypeptide of interest may also comprise signal sequences directing the protein or polypeptide of interest when expressed to a specific location in the cell or tissue. Furthermore, the nucleotide sequence encoding the protein or polypeptide of interest can also comprise sequences which facilitate protein purification and protein detection by for instance Western blotting and ELISA (e.g. c-myc or polyhistidine sequences).
[0165] The protein or polypeptide of interest in this aspect has already been defined earlier herein in the first aspect of the invention.
[0166] In an alternative embodiment, said nucleotide sequence of interest is not a coding sequence for a protein or a polypeptide but may be a functional nucleotide sequence. This alternative embodiment of this aspect has already been defined earlier herein in the first aspect of the invention.
Twelfth Aspect
[0167] In a twelfth aspect, the present invention provides an expression vector comprising a nucleic acid construct according to the eleventh aspect of the invention. The expression vector of the invention preferably is a plasmid, cosmid or phage or nucleotide sequence, linear or circular, of a single or double stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing any one of the nucleotide sequences of the invention in sense or antisense orientation into a cell. The choice of vector is dependent on the recombinant procedures followed and the host cell used. The vector may be an autonomously replicating vector or may replicate together with the chromosome into which it has been integrated. Preferably, the vector contains a selection marker. Useful markers are dependent on the host cell of choice and are well known to persons skilled in the art and are selected from, but not limited to, the selection markers as defined in the third aspect of the invention. A preferred expression vector is the pcDNA3.1 expression vector. Preferred selection markers are the neomycin resistance gene, zeocin resistance gene and blasticidin resistance gene.
Thirteenth Aspect
[0168] In a thirteenth aspect, the present invention provides a cell comprising a nucleic acid molecule according to the tenth aspect of the invention, and/or a nucleic acid construct according to the eleventh aspect of the invention, and/or an expression vector according to the twelfth aspect of the invention as defined herein. The type of cell within the context of this aspect is the same as the one defined in the context of the third aspect. Therefore, another aspect of the invention relates to a host cell that is genetically modified, preferably by a method of the invention, in that a host cell comprises a nucleic acid construct as defined above in the thirteenth aspect. For transformation procedures in plants, suitable bacteria include Agrobacterium tumefaciens and Agrobacterium rhizogenes.
[0169] A nucleic acid construct within the context of this thirteenth aspect is as the one of the third aspect: it is preferably stably maintained, either as an autonomously replicating element, or, more preferably, the nucleic acid construct is integrated into the host cell's genome, in which case the construct is usually integrated at random positions in the host cell's genome, for instance by non-homologous recombination. Stably transformed host cells are produced by known methods. The definition of the term stable transformation and methods encompassed for stable transformation have already been provided under the third aspect.
[0170] Alternatively, a protein or polypeptide of interest may be expressed in a host cell, e.g., a mammalian cell, relying on transient expression from vectors.
[0171] A nucleic acid construct according to this aspect preferably also comprises a marker gene which can provide selection or screening capability in a treated host cell.
[0172] All definitions relating to selectable markers and types of selectable markers including the example of the use of the luciferase gene as selectable marker, the example of a first category of marker based on a cell's metabolism and the use of a mutant cell line which lacks the ability to grow independent of a supplemented media, the example of dominant selection have already been provided in the third aspect. They also apply here in the thirteenth aspect of the invention.
[0173] When a transformed host cell is obtained with a method according to the invention (see below), a host tissue may be regenerated from said transformed cell in a suitable medium, which optionally may contain antibiotics or biocides known in the art for the selection of transformed cells.
[0174] Resulting transformed host tissues are preferably identified by means of selection using a selection marker gene as present on a nucleic acid construct as defined herein.
Fourteenth Aspect
[0175] In a fourteenth aspect, the present invention provides a method for expressing and optionally purifying a protein or polypeptide of interest comprising the steps of:
[0176] a. providing a nucleic acid construct according to the eleventh aspect of the invention comprising a nucleotide sequence encoding a protein or polypeptide of interest; and,
[0177] b. contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0178] c. allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0179] d. purifying said protein or polypeptide of interest.
[0180] In a preferred embodiment of the method according to the invention, a nucleic acid construct as defined above in the eleventh aspect of the invention is used. The method of the invention may be an in vitro or ex vivo method. The method of the invention may be applied on a cell culture, organism culture, or tissue culture. Alternatively, next to the expression in host cells the protein or polypeptide of interest can be produced in cell-free translation systems using RNAs derived from the nucleic acid constructs of the present invention. The method of the invention may be performed on cultured cells.
[0181] The skilled person is capable of transforming cells in accordance with step b). Transformation methods as used in step b) include, but are not limited to, transfer of purified DNA via cationic lipid reagents and polyethyleneimine (PEI), calcium-phosphate co-precipitation, microparticle bombardment, electroporation of protoplasts and microinjection or use of silicon fibers to facilitate penetration and transfer of DNA into the host cell.
[0182] In step c) the transformed cell is allowed to express the protein or polypeptide of interest, and optionally said protein or polypeptide is subsequently recovered. For example, the transformed cell may be subjected to conditions leading to expression of the protein or polypeptide of interest. The person skilled in the art is well aware of techniques to be used for expressing or overexpressing the protein or polypeptide of interest. Methods in which the transformed cell does not need to be subjected to specific conditions leading to expression of the protein or polypeptide of interest, but in which the protein or polypeptide of interest is automatically (e.g., constitutively) expressed, are also included in the method of the present invention.
[0183] Purification steps and definitions related to these steps as the definition of an isolated protein or polypeptide are the same as in the method of the fourth aspect and have been earlier defined herein. If desired as defined in the method of the fourth aspect, the nucleotide sequence encoding a protein or polypeptide of interest may be ligated to a heterologous nucleotide sequence to encode a fusion protein or polypeptide to facilitate protein purification and protein detection on for instance Western blot and in an ELISA. Suitable heterologous sequences include, but are not limited to, the nucleotide sequences coding for proteins such as for instance glutathione-S-transferase, maltose binding protein, metal-binding polyhistidine, green fluorescent protein, luciferase and beta-galactosidase. The protein or polypeptide may also be coupled to non-peptide carriers, tags or labels that facilitate tracing of the protein or polypeptide, both in vivo and in vitro, and allow for the identification and quantification of binding of the protein or polypeptide to substrates. Such labels, tags or carriers are well-known in the art and include, but are not limited to, biotin, radioactive labels and fluorescent labels.
[0184] Preferably, the method of this fourteenth aspect of the invention allows for an increase in expression of a protein or polypeptide of interest. Preferably, expression levels are established in an expression system using an expression construct according to the eleventh aspect of the invention comprising an expression enhancing element and a heterologous promoter operably linked to a nucleotide sequence encoding a protein or polypeptide of interest according to the eleventh aspect of the invention. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, the method of the invention allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to a method which only differs in that a construct is used in step a) that is free of said expression enhancing element, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising an expression enhancing element to be tested and a CMV promoter represented by SEQ ID NO: 57, operably linked to said nucleotide sequence of interest.
Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that the expression vector is free of said expression enhancing element to be tested. Preferably, said increase of protein or polypeptide expression is copy number independent as established by an assay suitable to determine copy number dependency by a skilled person such as, but not limited to, a triplex Taqman assay as further detailed in Example 4 of the present invention.
Fifteenth Aspect
[0185] In a fifteenth aspect, the present invention provides a method for expressing a protein or polypeptide of interest in an organism, comprising the steps of:
[0186] a) providing a nucleic acid construct according to the eleventh aspect comprising a nucleotide sequence encoding a protein or polypeptide of interest of the invention; and,
[0187] b) contacting a target cell and/or target tissue of an organism, with said nucleic acid construct to obtain a transformed target cell and/or transformed target tissue, allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0188] c) allowing said transformed target cell to develop into a transformed organism; and, optionally,
[0189] d) allowing said transformed organism to express the protein or polypeptide of interest, for example, subjecting said transformed organism to conditions leading to expression of the protein or polypeptide of interest, and optionally recovering said protein or polypeptide.
[0190] The target cell may be an embryonal target cell, e.g., an embryonic stem cell, for example, derived from a non-human mammal, such as a bovine or porcine species. Preferably, said target cell is not a human embryonic stem cell. In the case of a multicellular fungus, such target cell may be a fungal cell that can be proliferated into said multicellular fungus. When a transformed plant tissue or plant cell (e.g., pieces of leaf, stem segments, roots, but also protoplasts or plant cells cultivated by suspension) is obtained with this method according to the invention, whole plants can be regenerated from said transformed tissue or cell in a suitable medium, which optionally may contain antibiotics or biocides known in the art for the selection of transformed cells. This method of the invention may be applied in nucleic acid based vaccination and/or gene therapy preferably in a mammal, most preferably in a human. Encompassed within the present invention is a method of treatment comprising the method of the present aspect, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide. The invention also relates to a construct of the eleventh aspect of the invention for treatment, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide. Furthermore, the invention relates to the use of a construct of the eleventh aspect of the invention for the manufacture of a medicament, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide.
[0191] Furthermore, an embodiment of the invention is a non-human transformed organism. Said organism is transformed with a nucleotide sequence, recombinant nucleic acid construct, or vector according to the present invention, and is capable of producing the polypeptide of interest. This includes a non-human transgenic organism, such as a transgenic non-human mammal, transgenic plant (including propagation, harvest and tissue material of said transgenic plant, including, but not limited to, leaves, roots, shoots and flowers), multicellular fungus, and the like.
[0192] Preferably, the method of this aspect of the invention allows for an increase in expression of a protein or polypeptide of interest in said organism or at least in one tissue or organelle or organ of said organism. Preferably, expression levels are established in an expression system using an expression construct according to the eleventh aspect of the invention comprising an expression enhancing element and a heterologous promoter operably linked to a nucleotide sequence encoding a protein or polypeptide of interest according to the eleventh aspect of the invention. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, this method of the invention allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% in said organism or at least in one tissue or organelle or organ of said organism, as compared to a method which only differs in that a construct is used in step a) that is free of said expression enhancing element, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising an expression enhancing element to be tested and a CMV promoter represented by SEQ ID NO: 57, operably linked to said nucleotide sequence of interest.
Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that the expression vector is free of said expression enhancing element to be tested.
[0193] Preferably, said increase of protein or polypeptide expression is copy number independent as established by an assay suitable to determine copy number dependency by a skilled person such as, but not limited to, a triplex Taqman assay as further detailed in Example 4 of the present invention.
Sixteenth Aspect
[0194] In a sixteenth aspect, the present invention provides a method for transcription and optionally purifying the produced transcript comprising the steps of:
[0195] a) providing a nucleic acid construct according to the eleventh aspect comprising a nucleotide sequence of interest of the invention; and,
[0196] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0197] c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally,
[0198] d) purifying said produced transcript.
[0199] In a preferred embodiment of this method according to the invention a nucleic acid construct as defined above in the eleventh aspect is used. The method of the invention may be an in vitro or ex vivo method. The method of the invention may be applied on a cell culture, organism culture, or tissue culture. The method of the invention may be applied in nucleic acid based vaccination and/or gene therapy preferably in a mammal, preferably in a human. Encompassed within the present invention is a method for treatment comprising or consisting of the method of the present aspect, wherein the nucleotide sequence of interest encodes for a therapeutic transcript. The invention also relates to a construct of the eleventh aspect of the invention for use in treatment, wherein the nucleotide sequence of interest encodes for a therapeutic transcript. Furthermore, the invention relates to the use of a construct of the eleventh aspect of the invention for the manufacture of a medicament, wherein the nucleotide sequence of interest encodes for a therapeutic transcript.
[0200] The skilled person is capable of transforming cells in accordance with step b). Transformation methods as used in step b) include, but are not limited to, transfer of purified DNA via cationic lipid reagents and polyethyleneimine (PEI), calcium-phosphate co-precipitation, microparticle bombardment, electroporation of protoplasts and microinjection or use of silicon fibers to facilitate penetration and transfer of DNA into the host cell.
[0201] In step c) the transformed cell is allowed to produce a transcript of the nucleotide sequence of interest, and optionally the produced transcript is subsequently recovered. For example, the transformed cell may be subjected to conditions leading to transcription of the nucleotide sequence of interest. The person skilled in the art is well aware of techniques to be used for transcription of the nucleotide sequence of interest. Methods in which the transformed cell does not need to be subjected to specific conditions leading to transcription of the nucleotide sequence of interest, but in which the nucleotide sequence of interest is automatically (e.g., constitutively) transcribed, are also included in the method of the present invention.
[0202] Purification steps depend on the transcript produced. The term "isolation" indicates that the transcript is found in a condition other than its native environment. In a preferred form, the isolated transcript is substantially free of other cellular components, particularly other homologous cellular components such as homologous proteins. It is preferred to provide the transcript in a greater than 40% pure form, more preferably greater than 60% pure form. Even more preferably it is preferred to provide the transcript in a highly purified form, i.e., greater than 80% pure, more preferably greater than 95% pure, and even more preferably greater than 99% pure, as determined by Northern blot.
[0203] Preferably, the method of this aspect of the invention allows for an increase in transcription of a nucleotide sequence of interest. Preferably, transcription levels are established in an expression system using an expression construct according to the second aspect of the invention comprising an expression enhancing element and a heterologous promoter operably linked to a nucleotide sequence of interest according to the second aspect of the invention. Preferably, transcription of said nucleotide sequence of interest is detected by a suitable assay such as RT-qPCR. Preferably, the method of the invention allows for an increase in transcription of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to a method which only differs in that a construct is used in step a) that is free of said expression enhancing element, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the transcription of a nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising an expression enhancing element to be tested and a CMV promoter represented by SEQ ID NO: 57, operably linked to said nucleotide sequence of interest. Transcription is preferably measured using RT-qPCR and transcription levels are compared to transcription levels of said nucleotide sequence of interest measured under the same conditions except that the expression vector used is free of said expression enhancing element to be tested.
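Paragraph [0203] names RT-qPCR as the readout but does not state how the cycle threshold (Ct) values are converted into a relative transcription level. One commonly used conversion is the comparative Ct (2^-ddCt) calculation, sketched below in Python as an assumption rather than as the method of the Examples. The use of a reference gene for normalisation, the assumption of 100% amplification efficiency, and the function names are all part of that assumption.

```python
def fold_change_ddct(ct_target_test: float, ct_ref_test: float,
                     ct_target_ctrl: float, ct_ref_ctrl: float) -> float:
    """Relative transcript level (test / control) by the 2^-ddCt calculation.

    ct_target_* -- Ct of the transcript of interest (e.g. SeAP)
    ct_ref_*    -- Ct of a reference gene in the same sample (ASSUMPTION:
                   the text does not name a reference gene)
    *_test      -- construct containing the element to be tested
    *_ctrl      -- otherwise identical construct free of that element
    Assumes perfect doubling per cycle (100% amplification efficiency).
    """
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(delta_test - delta_ctrl)


def percent_increase_from_fold(fold: float) -> float:
    """Convert a fold change into the percent increase used in the text."""
    return (fold - 1.0) * 100.0
```

Under these assumptions, a target transcript that crosses the threshold two cycles earlier in the test sample than in the control, with equal reference Ct values, corresponds to a four-fold level, that is an increase of 300%.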
Seventeenth Aspect
[0204] In a seventeenth aspect, the present invention provides a use of a nucleic acid molecule according to the tenth aspect of the invention, and/or a use of a nucleic acid construct according to the eleventh aspect of the invention, and/or a use of an expression vector according to the twelfth aspect of the invention, and/or a use of a cell according to the thirteenth aspect of the invention, for the transcription of a nucleotide sequence of interest and/or the expression of a protein or polypeptide of interest.
Eighteenth Aspect
[0205] In an eighteenth aspect, the present invention provides for a nucleic acid molecule according to the tenth aspect of the invention, and/or a nucleic acid construct according to the eleventh aspect of the invention, and/or an expression vector according to the twelfth aspect of the invention, and/or a cell according to the thirteenth aspect of the invention for use as a medicament. The invention also relates to a method of treatment comprising the administration of a nucleic acid molecule according to the tenth aspect of the invention, and/or a nucleic acid construct according to the eleventh aspect of the invention, and/or an expression vector according to the twelfth aspect of the invention, and/or a cell according to the thirteenth aspect of the invention, wherein preferably said administration is to a mammal, more preferably to a human. Preferably, said treatment is nucleic acid based vaccination and/or gene therapy preferably in a mammal, most preferably in a human. Furthermore, the invention relates to the use of a nucleic acid molecule according to the tenth aspect of the invention, and/or the use of a nucleic acid construct according to the eleventh aspect of the invention, and/or the use of an expression vector according to the twelfth aspect of the invention, and/or the use of a cell according to the thirteenth aspect of the invention, for the preparation of a medicament. Preferably said medicament is for nucleic acid based vaccination and/or gene therapy preferably in a mammal, most preferably in a human.
Nineteenth Aspect
[0206] In a nineteenth aspect, the present invention provides a nucleic acid molecule that is represented by a nucleotide sequence that has at least 50% identity with SEQ ID NO: 88 for increasing transcription of a nucleotide sequence of interest and/or expression of a protein or polypeptide of interest. Within the context of the nineteenth to twenty seventh aspect, said identity percentage is preferably assessed over the whole length of SEQ ID NO:88. However, it is not excluded that said identity percentage is assessed over part of SEQ ID NO:88 as defined in the section entitled definitions. Preferably, said nucleotide sequence of the invention is capable of increasing the transcription of a nucleotide sequence of interest and/or expression of a protein or polypeptide of interest. Said nucleic acid molecule represented by a nucleotide sequence that has at least 50% identity with SEQ ID NO:88 may be called a transcription regulating sequence.
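The identity threshold in paragraph [0206] presupposes a way of computing percent identity, which the specification defines in its Definitions section (not reproduced here). The Python sketch below is a simplified illustration only. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and it divides the number of identical columns by the total number of alignment columns; both choices, and the function names, are assumptions. The alignment step itself is not shown, and no actual SEQ ID NO sequence is used.

```python
def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Percent of alignment columns in which both sequences are identical.

    Both inputs must be pre-aligned strings of equal length, with '-'
    marking gaps. A column containing a gap never counts as a match.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_ref:
        raise ValueError("aligned sequences must not be empty")
    q, r = aligned_query.upper(), aligned_ref.upper()
    matches = sum(1 for a, b in zip(q, r) if a == b and a != "-")
    return 100.0 * matches / len(r)


def meets_identity_threshold(aligned_query: str, aligned_ref: str,
                             threshold_pct: float = 50.0) -> bool:
    """True if identity reaches the threshold (50% in paragraph [0206])."""
    return percent_identity(aligned_query, aligned_ref) >= threshold_pct
```

Other conventions, such as dividing by the length of the reference sequence without gaps, give different values for the same alignment, so the definition in the specification governs.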
[0207] Preferably within this aspect, transcription levels are established in an expression system using an expression construct comprising said nucleotide sequence operably linked to a nucleotide sequence of interest using a suitable assay such as RT-qPCR. Preferably, the nucleotide sequence of the invention allows for an increase in transcription of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to transcription levels using a construct wherein the nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably as exemplified in example 11 which is enclosed herein. More specifically, preferably the transcription of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising the nucleotide sequence of the invention operably linked to said nucleotide sequence of interest. Transcription is preferably measured using RT-qPCR and transcription levels are compared to transcription levels of said nucleotide sequence of interest measured under the same conditions except that in the expression vector used the nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably as exemplified in example 11 which is enclosed herein.
[0208] Preferably within this aspect, expression levels are established in an expression system using an expression construct comprising said nucleotide sequence having at least 50% identity with SEQ ID NO:88 and which is operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an enzyme-linked immunosorbent assay (ELISA) assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, the nucleotide sequence having at least 50% identity with SEQ ID NO:88 allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to expression of said protein or polypeptide using a construct which only differs in that said nucleotide sequence has been replaced by an alternative sequence, preferably when tested in a system as exemplified in example 11 which is enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a nucleotide sequence having at least 50% identity with SEQ ID NO:88 operably linked to said nucleotide sequence of interest. Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said isolated nucleic acid molecule as defined herein. 
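By way of illustration only, the comparison described above between a test construct and a control construct, in which the nucleotide sequence has been replaced by an alternative sequence, can be expressed as a relative increase in percent. The following is a minimal sketch under stated assumptions: the function names and the example values are hypothetical and are not taken from the Examples, and the input levels may be any measurements obtained under the same conditions (for instance RT-qPCR transcript levels or alkaline phosphatase substrate conversion).

```python
# Illustrative sketch only; names and inputs are hypothetical.

# Threshold values (in percent) as listed in the text.
THRESHOLDS = (5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90,
              100, 200, 300, 500, 1000, 1500, 2000)


def percent_increase(test_level: float, control_level: float) -> float:
    """Relative increase of the test construct over the control, in percent."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return (test_level - control_level) / control_level * 100.0


def highest_threshold_met(increase_pct: float):
    """Return the largest listed threshold reached, or None if none is reached."""
    met = [t for t in THRESHOLDS if increase_pct >= t]
    return max(met) if met else None


if __name__ == "__main__":
    # Hypothetical readings: the test construct gives 3.0 units, the control 1.0 unit.
    increase = percent_increase(3.0, 1.0)
    print(increase, highest_threshold_met(increase))
```

Under this convention, an increase of 100% corresponds to a doubling of the measured level relative to the control.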
In a preferred embodiment, said nucleotide sequence having at least 50% identity with SEQ ID NO:88 is a sequence that is derived from the UBC ubiquitin gene. Preferably, said nucleotide sequence is derived from a mammalian UBC ubiquitin gene. More preferably, said nucleotide sequence is derived from the Cricetulus griseus homologous gene of the human UBC ubiquitin gene, said gene being indicated as the Cricetulus sp. gene for polyubiquitin, or CRUPUQ (GenBank D63782).
[0209] In a preferred embodiment, said nucleotide sequence comprises or consists of a sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity with SEQ ID NO: 88 over its whole length. Preferably, said sequence having at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity with SEQ ID NO: 88 over its whole length comprises a promoter as defined in the Definition section. Preferably, said promoter is capable of initiating transcription of a nucleotide sequence of interest and/or expression of a protein or polypeptide of interest encoded by a nucleotide sequence in a host cell as defined herein below. Preferably, said nucleotide sequence for increasing expression is a contiguous sequence of at least 1450, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600 or 2617 nucleotides in length, preferably at least 2617 nucleotides in length, of SEQ ID NO: 88. Preferably, said nucleotide sequence having at least 50%, 51%, 52%, 53%, 54%, 55%, 56%, 57%, 58%, 59%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 88 is at most 8000 nucleotides in length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000 or 2617 nucleotides in length. Most preferably, said sequence is 2617 nucleotides in length. Preferably, said nucleotide sequence is at most 8000, 7000, 6000, 5000, 4000, 3000 or 2617 nucleotides in length and has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 99% identity to SEQ ID NO: 88 over its whole length.
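For illustration, the identity and length criteria of this paragraph can be sketched as follows. This is a simplified example under stated assumptions and does not replace the definition of sequence identity given in the section entitled definitions: it assumes that the candidate sequence has already been aligned to the reference (SEQ ID NO: 88) by a method of the reader's choice, that gaps are written as "-", and that identity is counted as matching positions divided by the whole (ungapped) length of the reference. The function names and the short example strings are hypothetical.

```python
# Illustrative sketch only; assumes a pre-computed pairwise alignment.

def percent_identity(aligned_query: str, aligned_ref: str) -> float:
    """Identity in percent over the whole ungapped length of the reference."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    query, ref = aligned_query.upper(), aligned_ref.upper()
    ref_len = sum(1 for base in ref if base != "-")
    if ref_len == 0:
        raise ValueError("reference contains no nucleotides")
    # Count only positions where the reference has a nucleotide that is matched.
    matches = sum(1 for q, r in zip(query, ref) if r != "-" and q == r)
    return 100.0 * matches / ref_len


def meets_criteria(aligned_query: str, aligned_ref: str,
                   min_identity: float = 50.0, max_length: int = 8000) -> bool:
    """Check a minimum identity and a maximum ungapped candidate length."""
    query_len = sum(1 for base in aligned_query if base != "-")
    return (percent_identity(aligned_query, aligned_ref) >= min_identity
            and query_len <= max_length)


if __name__ == "__main__":
    # Toy alignment, not a real sequence: 5 of 6 reference positions match.
    print(round(percent_identity("ACGT-A", "ACGTTA"), 2))
```

The default values of 50% and 8000 nucleotides correspond to the broadest limits recited above; the narrower preferred values can be passed as arguments.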
Twentieth Aspect
[0210] In a twentieth aspect, the present invention provides a nucleic acid construct comprising a nucleic acid molecule according to the nineteenth aspect of the invention. A nucleic acid construct of the invention comprises a nucleotide sequence according to the nineteenth aspect of the present invention. Preferably, said nucleic acid construct is a recombinant and/or isolated construct as defined herein. Preferably, said nucleic acid construct further comprises an optional nucleotide sequence of interest as defined herein below wherein the nucleotide sequence of the invention is operably linked to said optional nucleic acid sequence of interest.
[0211] In a further preferred embodiment within this aspect, the nucleic acid construct of the invention further comprises one or more additional expression regulating elements, wherein preferably said nucleotide sequence and said one or more additional expression regulating elements are configured to be all operably linked to an optional nucleotide sequence of interest as defined herein below. An "additional expression regulating element" is to be understood herein as an element in addition to the nucleotide sequence as defined herein above which may be an additional expression regulating element or a distinct expression regulating element or an additional expression enhancing element or a distinct expression enhancing element. An additional expression regulating element as encompassed by the present invention can be involved in the transcriptional and/or translational regulation of a gene, including but not limited to, 5'-UTR, 3'-UTR, enhancer, promoter, intron, polyadenylation signal and chromatin control elements such as scaffold/matrix attachment regions, ubiquitous chromatin opening element, cytosine phosphodiester guanine pairs and stabilizing and anti-repressor elements, and any derivatives thereof. Other optional regulating elements that may be present in the nucleic acid construct of the invention include, but are not limited to, coding nucleotide sequences of homologous and/or heterologous nucleotide sequences, including the Iron Responsive Element (IRE), Translational cis-Regulatory Element (TLRE) or uORFs in 5' UTRs and poly(U) stretches in 3' UTRs.
[0212] Also preferred within this aspect is an expression regulating element that is a translation enhancing element. Preferably, a translation enhancing element allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to expression of said protein or polypeptide using a construct which only differs in that it is free of said translation enhancing element, preferably when tested in a system as exemplified in the Examples which are enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a translation enhancing element to be tested and a nucleotide sequence having at least 50% identity with SEQ ID NO: 88, operably linked to said nucleotide sequence of interest. Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that the expression vector is free of said translation enhancing element to be tested.
[0213] Preferably within this aspect, said translation enhancing element comprises or consists of a nucleotide sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 3-51 over its whole length. Preferably, said translation enhancing element comprises or consists of a nucleotide sequence that has at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identity to SEQ ID NO: 19 over its whole length. More preferably, said translation enhancing element comprises or consists of a nucleotide sequence that has at least 90% identity to SEQ ID NO: 3-51 over its whole length. Also preferred within this aspect is a translation enhancing element that comprises or consists of a nucleotide sequence that comprises:
[0214] i) a GAA repeat nucleotide sequence, a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides;
[0215] ii) a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, and a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides, said expression enhancing element not comprising a GAA repeat nucleotide sequence; or,
[0216] iii) a GAA repeat nucleotide sequence, a TC-rich nucleotide sequence comprising at least 8 consecutive C or T nucleotides, at least 3 A-rich nucleotide sequences comprising at least 5 consecutive A nucleotides, a GT-rich nucleotide sequence comprising at least 10 nucleotides, at least 80% of which are G or T nucleotides, wherein said GAA repeat nucleotide sequence is located 3' of any one or more of said TC-rich nucleotide sequence, A-rich nucleotide sequences, and/or GT-rich nucleotide sequence.
[0217] The GAA repeat nucleotide sequence, the TC-rich nucleotide sequence, the A-rich nucleotide sequence and the GT-rich nucleotide sequence have already been defined herein in the first aspect of the invention. These definitions also apply here.
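By way of a non-limiting illustration, the sequence features recited in items i) to iii) above can be screened for in a candidate element as sketched below. The sketch follows the wording of this section only (at least 8 consecutive C or T nucleotides; at least 3 stretches of at least 5 consecutive A nucleotides; a stretch of at least 10 nucleotides of which at least 80% are G or T). The definitions of the first aspect govern; in particular, the minimum number of GAA units (`min_units`) is an assumption made for this example and is not taken from the text, and all function names are hypothetical. The positional requirement of item iii) is not checked.

```python
import re

# Illustrative sketch only; names and the min_units default are hypothetical.


def has_tc_rich(seq: str) -> bool:
    """At least 8 consecutive C or T nucleotides."""
    return re.search(r"[CT]{8,}", seq) is not None


def count_a_rich(seq: str) -> int:
    """Number of non-overlapping stretches of at least 5 consecutive A."""
    return len(re.findall(r"A{5,}", seq))


def has_gt_rich(seq: str, min_len: int = 10, min_percent: int = 80) -> bool:
    """Any stretch of at least min_len nucleotides with >= min_percent G or T."""
    n = len(seq)
    for start in range(n - min_len + 1):
        # G/T count of the first min_len - 1 nucleotides of the stretch.
        gt = sum(base in "GT" for base in seq[start:start + min_len - 1])
        for end in range(start + min_len, n + 1):
            gt += seq[end - 1] in "GT"
            # Integer comparison avoids floating-point rounding issues.
            if gt * 100 >= min_percent * (end - start):
                return True
    return False


def has_gaa_repeat(seq: str, min_units: int = 3) -> bool:
    """Consecutive GAA units; min_units is an assumed value for illustration."""
    return re.search(r"(?:GAA){%d,}" % min_units, seq) is not None


def element_features(seq: str) -> dict:
    """Summarize which of the recited features are present in seq."""
    seq = seq.upper()
    return {
        "gaa_repeat": has_gaa_repeat(seq),
        "tc_rich": has_tc_rich(seq),
        "a_rich_count": count_a_rich(seq),
        "gt_rich": has_gt_rich(seq),
    }


def matches_item_ii(seq: str) -> bool:
    """TC-rich, at least 3 A-rich and GT-rich sequences, and no GAA repeat."""
    f = element_features(seq)
    return (f["tc_rich"] and f["a_rich_count"] >= 3
            and f["gt_rich"] and not f["gaa_repeat"])


if __name__ == "__main__":
    # Toy sequence built for this example; not one of SEQ ID NO: 3-51.
    toy = "CTCTCTCT" + "AAAAAC" * 3 + "GTGTGTGTGT"
    print(element_features(toy), matches_item_ii(toy))
```

A candidate that passes such a screen would still need to be tested functionally, for instance in the SeAP assay described above, since the presence of the motifs alone does not establish translation enhancing activity.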
[0218] Preferably within said aspect, said additional expression regulating element is located within a nucleic acid construct of the invention having at least 50% identity with SEQ ID NO: 88. Preferably, said additional expression regulating element is located within a nucleic acid construct of the invention upstream or at the 5' site of a nucleic acid sequence encoding a protein or polypeptide of interest. Moreover, preferably a nucleic acid construct of the invention comprises the following nucleotide sequences indicated here in their relative positions in the 5' to 3' direction:
[0219] optionally (i) an expression regulating preferably enhancing element,
[0220] (ii) a nucleotide sequence having at least 50% identity with SEQ ID NO:88, optionally (iii) an additional expression regulating element, and optionally (iv) a nucleotide sequence of interest, wherein preferably said expression enhancing element, said nucleotide sequence having at least 50% identity with SEQ ID NO:88 and said additional expression regulating element are configured to be all operably linked to said optional nucleotide sequence of interest as defined herein below. It is to be understood that said expression enhancing element, said nucleotide sequence having at least 50% identity with SEQ ID NO:88, and optionally said additional expression regulating element of the nucleic acid construct of the invention are all configured to be operably linked to the same, single nucleotide sequence of interest.
[0221] The presence of a nucleotide sequence of interest is optional. "Optional" is to be understood herein as not necessarily being present in an expression construct. For instance, such nucleotide sequence of interest need not be present in a commercialized expression vector, but may be readily introduced by a person skilled in the art before use in a method of the invention. It is to be understood that said expression enhancing element, said nucleotide sequence having at least 50% identity with SEQ ID NO:88, and optionally said additional expression regulating element are all configured to be operably linked to the same, single nucleotide sequence of interest.
[0222] In a preferred embodiment within this aspect, said nucleotide sequence of interest is a nucleotide sequence encoding a protein or polypeptide of interest. The protein or polypeptide of interest can be a homologous protein or polypeptide, but in a preferred embodiment of the invention the protein or polypeptide of interest is a heterologous protein or polypeptide. A nucleotide sequence encoding a heterologous protein or polypeptide may be derived in whole or in part from any source known to the art, including a bacterial or viral genome or episome, eukaryotic nuclear or plasmid DNA, cDNA or chemically synthesised DNA. The nucleotide sequence encoding a protein or polypeptide of interest may constitute an uninterrupted coding region or it may include one or more introns bounded by appropriate splice junctions. It can further be composed of segments derived from different sources, naturally occurring or synthetic. The nucleotide sequence encoding the protein or polypeptide of interest according to the method of the invention is preferably a full-length nucleotide sequence, but can also be a functionally active part or other part of said full-length nucleotide sequence. The nucleotide sequence encoding the protein or polypeptide of interest may also comprise signal sequences directing the protein or polypeptide of interest when expressed to a specific location in the cell or tissue. Furthermore, the nucleotide sequence encoding the protein or polypeptide of interest can also comprise sequences which facilitate protein purification and protein detection by for instance Western blotting and ELISA (e.g. c-myc or polyhistidine sequences).
[0223] The protein or polypeptide of interest in this aspect has already been defined earlier herein in the first aspect of the invention.
[0224] In an alternative embodiment, said nucleotide sequence of interest is not a coding sequence for a protein or a polypeptide but may be a functional nucleotide sequence. This alternative embodiment of this aspect has already been defined earlier herein in the first aspect of the invention.
Twenty First Aspect
[0225] In a twenty first aspect, the present invention provides an expression vector comprising a nucleic acid construct according to the twentieth aspect of the invention. The expression vector of the invention preferably is a plasmid, cosmid or phage or nucleotide sequence, linear or circular, of a single or double stranded DNA or RNA, derived from any source, in which a number of nucleotide sequences have been joined or recombined into a unique construction which is capable of introducing any one of the nucleotide sequences of the invention in sense or antisense orientation into a cell. The choice of vector is dependent on the recombinant procedures followed and the host cell used. The vector may be an autonomously replicating vector or may replicate together with the chromosome into which it has been integrated. Preferably, the vector contains a selection marker. Useful markers are dependent on the host cell of choice and are well known to persons skilled in the art and are selected from, but not limited to, the selection markers as defined in the third aspect of the invention. A preferred expression vector is the pcDNA3.1 expression vector. Preferred selection markers are the neomycin resistance gene, zeocin resistance gene and blasticidin resistance gene.
Twenty Second Aspect
[0226] In a twenty second aspect, the present invention provides a cell comprising a nucleic acid molecule according to the nineteenth aspect of the invention, and/or a nucleic acid construct according to the twentieth aspect of the invention, and/or an expression vector according to the twenty first aspect of the invention as defined herein. The type of cell within the context of this aspect is the same as the one defined in the context of the third aspect.
[0227] Therefore, another aspect of the invention relates to a host cell that is genetically modified, preferably by a method of the invention, in that a host cell comprises a nucleic acid construct as defined above in the twentieth aspect. For transformation procedures in plants, suitable bacteria include Agrobacterium tumefaciens and Agrobacterium rhizogenes.
[0228] A nucleic acid construct within the context of this twentieth aspect is as the one of the third aspect: it is preferably stably maintained, either as an autonomously replicating element, or, more preferably, the nucleic acid construct is integrated into the host cell's genome, in which case the construct is usually integrated at random positions in the host cell's genome, for instance by non-homologous recombination. Stably transformed host cells are produced by known methods. The definition of the term stable transformation and methods encompassed for stable transformation have already been provided under the third aspect.
[0229] Alternatively, a protein or polypeptide of interest may be expressed in a host cell, e.g., a mammalian cell, relying on transient expression from vectors.
[0230] A nucleic acid construct according to this aspect preferably also comprises a marker gene which can provide selection or screening capability in a treated host cell.
[0231] All definitions relating to selectable markers and types of selectable markers, including the example of the use of the luciferase gene as selectable marker, the example of a first category of marker based on a cell's metabolism and the use of a mutant cell line which lacks the ability to grow independent of a supplemented media, and the example of dominant selection, have already been provided in the third aspect. They also apply here in the twenty second aspect of the invention.
[0232] When a transformed host cell is obtained with a method according to the invention (see below), a host tissue may be regenerated from said transformed cell in a suitable medium, which optionally may contain antibiotics or biocides known in the art for the selection of transformed cells.
[0233] Resulting transformed host tissues are preferably identified by means of selection using a selection marker gene as present on a nucleic acid construct as defined herein.
Twenty Third Aspect
[0234] In a twenty third aspect, the present invention provides a method for expressing and optionally purifying a protein or polypeptide of interest comprising the steps of:
[0235] a. providing a nucleic acid construct according to the twentieth aspect of the invention comprising a nucleotide sequence encoding a protein or polypeptide of interest; and,
[0236] b. contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0237] c. allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0238] d. purifying said protein or polypeptide of interest.
[0239] In a preferred embodiment of the method according to the invention, a nucleic acid construct as defined above in the twentieth aspect of the invention is used. The method of the invention may be an in vitro or ex vivo method. The method of the invention may be applied on a cell culture, organism culture, or tissue culture. Alternatively, next to the expression in host cells the protein or polypeptide of interest can be produced in cell-free translation systems using RNAs derived from the nucleic acid constructs of the present invention. The method of the invention may be performed on cultured cells.
[0240] The skilled person is capable of transforming cells in accordance with step b). Transformation methods as used in step b) include, but are not limited to, transfer of purified DNA via cationic lipid reagents and polyethylenimine (PEI), calcium-phosphate co-precipitation, microparticle bombardment, electroporation of protoplasts and microinjection or use of silicon fibers to facilitate penetration and transfer of DNA into the host cell.
[0241] In step c) the transformed cell is allowed to express the protein or polypeptide of interest, and optionally said protein or polypeptide is subsequently recovered. For example, the transformed cell may be subjected to conditions leading to expression of the protein or polypeptide of interest. The person skilled in the art is well aware of techniques to be used for expressing or overexpressing the protein or polypeptide of interest. Methods in which the transformed cell does not need to be subjected to specific conditions leading to expression of the protein or polypeptide of interest, but in which the protein or polypeptide of interest is automatically (e.g., constitutively) expressed, are also included in the method of the present invention.
[0242] Purification steps and definitions related to these steps as the definition of an isolated protein or polypeptide are the same as in the method of the fourth aspect and have been earlier defined herein. If desired as defined in the method of the fourth aspect, the nucleotide sequence encoding a protein or polypeptide of interest may be ligated to a heterologous nucleotide sequence to encode a fusion protein or polypeptide to facilitate protein purification and protein detection on for instance Western blot and in an ELISA. Suitable heterologous sequences include, but are not limited to, the nucleotide sequences coding for proteins such as for instance glutathione-S-transferase, maltose binding protein, metal-binding polyhistidine, green fluorescent protein, luciferase and beta-galactosidase. The protein or polypeptide may also be coupled to non-peptide carriers, tags or labels that facilitate tracing of the protein or polypeptide, both in vivo and in vitro, and allow for the identification and quantification of binding of the protein or polypeptide to substrates. Such labels, tags or carriers are well-known in the art and include, but are not limited to, biotin, radioactive labels and fluorescent labels.
[0243] Preferably, the method of this twenty third aspect of the invention allows for an increase in expression of a protein or polypeptide of interest. Preferably, expression levels are established in an expression system using an expression construct according to the twenty first aspect of the invention comprising a nucleotide sequence having at least 50% identity with SEQ ID NO: 88 operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, the method of the invention allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to a method which only differs in that a construct is used in step a) wherein the nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably one of those as described in example 11, more preferably when tested in a system as exemplified in example 11 which is enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a nucleotide sequence having at least 50% identity with SEQ ID NO:88 operably linked to said nucleotide sequence of interest.
Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that in the expression vector the nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably one of those as described in example 11.
Twenty Fourth Aspect
[0244] In a twenty fourth aspect, the present invention provides a method for expressing a protein or polypeptide of interest in an organism, comprising the steps of:
[0245] a) providing a nucleic acid construct according to the twentieth aspect comprising a nucleotide sequence encoding a protein or polypeptide of interest of the invention; and,
[0246] b) contacting a target cell and/or target tissue of an organism, with said nucleic acid construct to obtain a transformed target cell and/or transformed target tissue, allowing said transformed cell to express the protein or polypeptide of interest; and optionally,
[0247] c) allowing said transformed target cell to develop into a transformed organism; and, optionally,
[0248] d) allowing said transformed organism to express the protein or polypeptide of interest, for example, subjecting said transformed organism to conditions leading to expression of the protein or polypeptide of interest, and optionally recovering said protein or polypeptide.
[0249] The target cell may be an embryonal target cell, e.g., embryonic stem cell, for example, derived from a non-human mammalian, such as bovine, porcine, et cetera species. Preferably, said target cell is not a human embryonic stem cell. In the case of a multicellular fungus, such target cell may be a fungal cell that can be proliferated into said multicellular fungus. When a transformed plant tissue or plant cell (e.g., pieces of leaf, stem segments, roots, but also protoplasts or plant cells cultivated by suspension) is obtained with this method according to the invention, whole plants can be regenerated from said transformed tissue or cell in a suitable medium, which optionally may contain antibiotics or biocides known in the art for the selection of transformed cells. This method of the invention may be applied in nucleic acid based vaccination and/or gene therapy preferably in a mammal, most preferably in a human. Encompassed within the present invention is a method of treatment comprising the method of the present aspect, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide. The invention also relates to a construct of the twentieth aspect of the invention for treatment, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide. Furthermore, the invention relates to the use of a construct of the twentieth aspect of the invention for the manufacture of a medicament, wherein the protein or polypeptide of interest is a therapeutic and/or immunogenic protein or polypeptide.
[0250] Furthermore, an embodiment of the invention is a non-human transformed organism. Said organism is transformed with a nucleotide sequence, recombinant nucleic acid construct, or vector according to the present invention, and is capable of producing the polypeptide of interest. This includes a non-human transgenic organism, such as a transgenic non-human mammal, transgenic plant (including propagation, harvest and tissue material of said transgenic plant, including, but not limited to, leaves, roots, shoots and flowers), multicellular fungus, and the like.
[0251] Preferably, the method of this aspect of the invention allows for an increase in expression of a protein or polypeptide of interest in said organism or at least in one tissue or organelle or organ of said organism. Preferably, expression levels are established in an expression system using an expression construct according to the twenty first aspect of the invention comprising a nucleotide sequence having at least 50% identity with SEQ ID NO:88 operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. Preferably, this method of the invention allows for an increase in protein or polypeptide expression of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% in said organism or at least in one tissue or organelle or organ of said organism, as compared to a method which only differs in that a construct is used in step a) wherein the nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably one of those as described in example 11, preferably when tested in a system as exemplified in example 11 which is enclosed herein. More specifically, preferably the expression of a nucleotide sequence of interest encoding secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a nucleotide sequence having at least 50% identity with SEQ ID NO:88 operably linked to said nucleotide sequence of interest.
Expression is preferably measured by measuring the conversion of any suitable alkaline phosphatase substrate and expression levels are compared to expression levels of said nucleotide sequence of interest which are measured under the same conditions except that in the expression vector said nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably one of those as described in example 11.
Twenty Fifth Aspect
[0252] In a twenty fifth aspect, the present invention provides a method for transcription and optionally purifying the produced transcript comprising the steps of:
[0253] a) providing a nucleic acid construct according to the twentieth aspect comprising a nucleotide sequence of interest of the invention; and,
[0254] b) contacting a cell with said nucleic acid construct to obtain a transformed cell; and,
[0255] c) allowing said transformed cell to produce a transcript of the nucleotide sequence of interest; and optionally,
[0256] d) purifying said produced transcript.
[0257] In a preferred embodiment of this method according to the invention a nucleic acid construct as defined above in the twentieth aspect is used. The method of the invention may be an in vitro or ex vivo method. The method of the invention may be applied on a cell culture, organism culture, or tissue culture. The method of the invention may be applied in nucleic acid based vaccination and/or gene therapy preferably in a mammal, preferably in a human. Encompassed within the present invention is a method for treatment comprising or consisting of the method of the present aspect, wherein the nucleotide sequence of interest encodes for a therapeutic transcript. The invention also relates to a construct of the twentieth aspect of the invention for use in treatment, wherein the nucleotide sequence of interest encodes for a therapeutic transcript. Furthermore, the invention relates to the use of a construct of the twentieth aspect of the invention for the manufacture of a medicament, wherein the nucleotide sequence of interest encodes for a therapeutic transcript.
[0258] The skilled person is capable of transforming cells in accordance with step b). Transformation methods as used in step b) include, but are not limited to, transfer of purified DNA via cationic lipid reagents and polyethyleneimine (PEI), calcium-phosphate co-precipitation, microparticle bombardment, electroporation of protoplasts and microinjection or use of silicon fibers to facilitate penetration and transfer of DNA into the host cell.
[0259] In step c) the transformed cell is allowed to produce a transcript of the nucleotide sequence of interest, and optionally the produced transcript is subsequently recovered. For example, the transformed cell may be subjected to conditions leading to transcription of the nucleotide sequence of interest. The person skilled in the art is well aware of techniques to be used for transcription of the nucleotide sequence of interest. Methods in which the transformed cell does not need to be subjected to specific conditions leading to transcription of the nucleotide sequence of interest, but in which the nucleotide sequence of interest is automatically (e.g., constitutively) transcribed, are also included in the method of the present invention.
[0260] Purification steps depend on the transcript produced. The term "isolation" indicates that the transcript is found in a condition other than its native environment. In a preferred form, the isolated transcript is substantially free of other cellular components, particularly other homologous cellular components such as homologous proteins. It is preferred to provide the transcript in a greater than 40% pure form, more preferably greater than 60% pure form. Even more preferably it is preferred to provide the transcript in a highly purified form, i.e., greater than 80% pure, more preferably greater than 95% pure, and even more preferably greater than 99% pure, as determined by Northern blot.
[0261] Preferably, the method of this aspect of the invention allows for an increase in transcription of a nucleotide sequence of interest. Preferably, transcription levels are established in an expression system using an expression construct according to the second aspect of the invention comprising a nucleotide sequence having at least 50% identity with SEQ ID NO:88 operably linked to a nucleotide sequence of interest. Preferably, transcription of said nucleotide sequence of interest is detected by a suitable assay such as RT-qPCR. Preferably, the method of the invention allows for an increase in transcription of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to a method which only differs in that a construct is used in step a) wherein said nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably one of those as described in example 11, preferably when tested in a system as exemplified in example 11 which is enclosed herein. More specifically, preferably the transcription of a nucleotide sequence of interest encoding for secreted alkaline phosphatase (SeAP) is measured in a mammalian cell system, most preferably in CHO cells, using a pcDNA3.1 expression vector comprising a nucleotide sequence having at least 50% identity with SEQ ID NO:88 operably linked to said nucleotide sequence of interest. Transcription is preferably measured using RT-qPCR and transcription levels are compared to transcription levels of said nucleotide sequence of interest measured under the same conditions except that in the expression vector used the nucleotide sequence having at least 50% identity with SEQ ID NO:88 has been replaced by an alternative sequence, preferably one of those as described in example 11.
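The comparison described above can be illustrated with a minimal sketch. It assumes that relative transcript levels for the test construct and for the control construct (the one in which the sequence has been replaced by an alternative sequence) have already been derived from the RT-qPCR data; how those levels are derived is not specified here. The function names and the input values are hypothetical and serve only to illustrate the arithmetic.

```python
# Percentage thresholds listed in the text for the increase in transcription.
THRESHOLDS = (5, 10, 15, 20, 30, 40, 50, 60, 70, 80, 90,
              100, 200, 300, 500, 1000, 1500, 2000)


def percent_increase(test_level, control_level):
    """Increase of the test transcript level relative to the control, in percent."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    return (test_level - control_level) / control_level * 100.0


def thresholds_met(increase_pct):
    """Return every listed threshold that the measured increase reaches."""
    return [t for t in THRESHOLDS if increase_pct >= t]


if __name__ == "__main__":
    # Hypothetical relative transcript levels (arbitrary units).
    increase = percent_increase(test_level=3.0, control_level=1.0)
    print(increase, thresholds_met(increase))
```

A threefold transcript level relative to the control corresponds to an increase of 200%, which is why the percentages listed above extend beyond 100%.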
Twenty Sixth Aspect
[0262] In a twenty sixth aspect, the present invention provides a use of a nucleic acid molecule according to the nineteenth aspect of the invention, and/or a use of a nucleic acid construct according to the twentieth aspect of the invention, and/or a use of an expression vector according to the twenty first aspect of the invention, and/or a use of a cell according to the twenty second aspect of the invention, for the transcription of a nucleotide sequence of interest and/or the expression of a protein or polypeptide of interest.
Twenty Seventh Aspect
[0263] In a twenty seventh aspect, the present invention provides for a nucleic acid molecule according to the nineteenth aspect of the invention, and/or a nucleic acid construct according to the twentieth aspect of the invention, and/or an expression vector according to the twenty first aspect of the invention, and/or a cell according to the twenty second aspect of the invention for use as a medicament. The invention also relates to a method of treatment comprising the administration of a nucleic acid molecule according to the nineteenth aspect of the invention, and/or a nucleic acid construct according to the twentieth aspect of the invention, and/or an expression vector according to the twenty first aspect of the invention, and/or a cell according to the twenty second aspect of the invention, wherein preferably said administration is to a mammal, more preferably to a human. Preferably, said treatment is nucleic acid based vaccination and/or gene therapy preferably in a mammal, most preferably in a human. Furthermore, the invention relates to the use of a nucleic acid molecule according to the nineteenth aspect of the invention, and/or the use of a nucleic acid construct according to the twentieth aspect of the invention, and/or the use of an expression vector according to the twenty first aspect of the invention, and/or the use of a cell according to the twenty second aspect of the invention, for the preparation of a medicament. Preferably said medicament is for nucleic acid based vaccination and/or gene therapy preferably in a mammal, most preferably in a human.
[0264] Definitions
[0265] The phrase "nucleic acid" as used herein refers to a naturally occurring or synthetic oligonucleotide or polynucleotide, whether DNA or RNA or DNA-RNA hybrid, single-stranded or double-stranded, sense or antisense, which is capable of hybridization to a complementary nucleic acid by Watson-Crick base-pairing. A nucleic acid of the invention is preferably modified as compared to its naturally occurring counterpart by comprising at least 1, 2, 3, 4, 5, 10, 20, 30 or 50 nucleotide mutations as compared to its naturally occurring counterpart. Preferably, a nucleic acid of the invention does not occur in nature. Nucleic acids of the invention can also include nucleotide analogs (e.g., BrdU), and nonphosphodiester internucleoside linkages (e.g., peptide nucleic acid (PNA) or thiodiester linkages). In particular, nucleic acids can include, without limitation, DNA, RNA, cDNA, gDNA, ssDNA, dsDNA, ssRNA, dsRNA, non coding RNAs, hnRNA, premRNA, matured mRNA or any combination thereof. The terms "nucleic acid sequence" and "nucleotide sequence" as used herein are interchangeable, and have their usual meaning in the art. The term refers to a DNA or RNA molecule in single or double stranded form. An "isolated nucleic acid sequence" refers to a nucleic acid sequence which is no longer in the natural environment from which it was isolated. A nucleic acid molecule is represented by a nucleotide sequence. Furthermore, an element such as, but not limited to an expression enhancing element and a transcription regulating element, is represented by a nucleotide sequence.
[0266] A "recombinant construct" (or chimeric construct) refers to any nucleic acid sequence or molecule, which is not normally found in nature in a species, in particular a nucleic acid sequence, molecule or gene in which one or more parts of the nucleic acid sequence are present that are not associated with each other in nature. For example, a recombinant construct comprises a promoter that is not associated in nature with part or all of the transcribed region or with another regulating region comprised within said recombinant construct. The term "recombinant construct" is understood to include expression constructs in which a promoter or expression regulating sequence is operably linked to one or more sense sequences (e.g. coding sequences) or to an antisense (reverse complement of the sense strand) or inverted repeat sequence (sense and antisense, whereby the RNA transcript forms double stranded RNA upon transcription), or to any other sequence coding for a functional RNA molecule.
[0267] A "nucleic acid construct" is defined as a polynucleotide which is isolated from a naturally occurring gene or which has been modified to contain segments of polynucleotides which are combined or juxtaposed in a manner which would not otherwise exist in nature. Optionally, a polynucleotide present in a nucleic acid construct is operably linked to one or more control sequences, which direct the production or transcription of a nucleotide sequence of interest and/or the expression of a peptide or polypeptide of interest in a cell or in a subject.
[0268] A "vector" or "plasmid" is herein understood to mean a man-made (usually circular) nucleic acid molecule resulting from the use of recombinant DNA technology and which is used to deliver exogenous DNA into a host cell. Vectors usually comprise further genetic elements to facilitate their use in molecular cloning, such as e.g. selectable markers, multiple cloning sites and the like (see below). A nucleic acid construct may also be part of a recombinant viral vector for expression of a protein in a plant or plant cell (e.g. a vector derived from cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or in a mammalian organism or mammalian cell system (e.g. a vector derived from Moloney murine leukemia virus (MMLV; a Retrovirus) a Lentivirus, an Adeno-associated virus (AAV) or an adenovirus (AdV)).
[0269] The term "transformed cell" refers to a new individual cell (or organism), arising as a result of the introduction into said cell of at least one nucleic acid molecule, especially comprising a chimeric or recombinant construct encoding a desired protein or a nucleic acid sequence which upon transcription yields an antisense RNA for silencing of a target gene/gene family. The host cell may be a plant cell, a bacterial cell (e.g. an Agrobacterium strain), a fungal cell (including a yeast cell), an animal (including insect, mammalian) cell, etc. The transformed cell may contain the nucleic acid construct as an extra-chromosomally (episomal) replicating molecule or as a non-replicating molecule, or may comprise the recombinant construct integrated in the nuclear or organellar DNA of the host cell. The term "organism" as used herein, encompasses all organisms consisting of more than one cell, i.e. multicellular organisms, and includes multicellular fungi. "Transformation" and "transformed" refer to the transfer of a nucleic acid sequence, generally a nucleic acid sequence comprising a recombinant construct or gene of interest (GOI), into the nuclear genome of a cell to create a "transgenic" cell or organism comprising a transgene. The introduced nucleic acid sequence is generally, but not always, integrated in the host genome. When the introduced nucleic acid sequence is not integrated in the host genome, one may speak of "transfection", "transiently transfected", and "transfected". For the purposes of the present patent specification, the terms "transformation", "transiently transfected", and "transfection" are used interchangeably, and refer to stable or transient presence of a nucleic acid sequence in a cell or organism. When the cell is a bacterial cell, the term usually refers to an extrachromosomal, self-replicating vector which harbors a selectable antibiotic resistance.
[0270] "Sequence identity" or "identity" in the context of amino acid- or nucleic acid-sequence is herein defined as a relationship between two or more amino acid (peptide, polypeptide, or protein) sequences or two or more nucleic acid (nucleotide, polynucleotide) sequences, as determined by comparing the sequences. In the art, "identity" also means the degree of sequence relatedness between amino acid or nucleotide sequences, as the case may be, as determined by the match between strings of such sequences. Within the present invention, sequence identity with a particular sequence indicated with a particular SEQ ID NO preferably means sequence identity over the entire length of said particular polypeptide or polynucleotide sequence indicated with said particular SEQ ID NO. However, sequence identity with a particular sequence indicated with a particular SEQ ID NO may also mean that sequence identity is assessed over a part of said SEQ ID NO. A part may mean at least 50%, 60%, 70%, 80%, 90% or 95% of the length of said SEQ ID NO. The sequence information as provided herein should not be so narrowly construed as to require inclusion of erroneously identified bases. The skilled person is capable of identifying such erroneously identified bases and knows how to correct for such errors.
[0271] Any nucleotide sequences capable of hybridising to the nucleotide sequences of the invention are defined as being part of the cis-acting elements of the invention. Stringent hybridisation conditions are herein defined as conditions that allow a nucleic acid sequence of at least 25, preferably 50, 75 or 100, and most preferably 150 or more nucleotides, to hybridise at a temperature of about 65° C. or of 65° C. in a solution comprising about 1 M salt or 1 M salt, preferably 6×SSC or any other solution having a comparable ionic strength, and washing at 65° C. in a solution comprising about 0.1 M salt, or 0.1 M salt or less, preferably 0.2×SSC or any other solution having a comparable ionic strength. Preferably, the hybridisation is performed overnight, i.e. at least for 10 hours and preferably washing is performed for at least one hour with at least two changes of the washing solution. These conditions will usually allow the specific hybridisation of sequences having about 90% or more sequence identity or at least 90% sequence identity.
[0272] Moderate hybridization conditions are herein defined as conditions that allow a nucleic acid sequence of at least 50, preferably 150 or more nucleotides, to hybridise at a temperature of about 45° C. or of 45° C. in a solution comprising about 1 M salt or 1 M salt, preferably 6×SSC or any other solution having a comparable ionic strength, and washing at room temperature in a solution comprising about 1 M salt, or 1 M salt preferably 6×SSC or any other solution having a comparable ionic strength. Preferably, the hybridisation is performed overnight, i.e. at least for 10 hours, and preferably washing is performed for at least one hour with at least two changes of the washing solution. These conditions will usually allow the specific hybridisation of sequences having up to 50% sequence identity. The person skilled in the art will be able to modify these hybridisation conditions in order to specifically identify sequences varying in identity between 50% and 90%.
[0273] "Identity" can be readily calculated by known methods, including but not limited to those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part I, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991; and Carillo, H., and Lipman, D., SIAM J. Applied Math., 48:1073 (1988).
[0274] Preferred methods to determine identity are designed to give the largest match between the sequences tested. Methods to determine identity and similarity are codified in publicly available computer programs. Preferred computer program methods to determine identity and similarity between two sequences include e.g. the GCG program package (Devereux, J., et al., Nucleic Acids Research 12 (1): 387 (1984)), BestFit, BLASTP, BLASTN, and FASTA (Altschul, S. F. et al., J. Mol. Biol. 215:403-410 (1990)). The BLAST X program is publicly available from NCBI and other sources (BLAST Manual, Altschul, S., et al., NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et al., J. Mol. Biol. 215:403-410 (1990)). The well-known Smith Waterman algorithm may also be used to determine identity.
[0275] Preferred parameters for polypeptide sequence comparison include the following: Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970); Comparison matrix: BLOSUM62 from Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA. 89:10915-10919 (1992); Gap Penalty: 12; and Gap Length Penalty: 4. A program useful with these parameters is publicly available as the "Gap" program from Genetics Computer Group, located in Madison, Wis. The aforementioned parameters are the default parameters for amino acid comparisons (along with no penalty for end gaps).
[0276] Preferred parameters for nucleic acid comparison include the following: Algorithm: Needleman and Wunsch, J. Mol. Biol. 48:443-453 (1970); Comparison matrix: matches=+10, mismatch=0; Gap Penalty: 50; Gap Length Penalty: 3. Available as the Gap program from Genetics Computer Group, located in Madison, Wis. Given above are the default parameters for nucleic acid comparisons.
[0277] Preferred program and parameter for assessing identity for nucleic acid comparison is calculated using EMBOSS Needle Nucleotide Alignment algorithm with the following parameters: DNAfull matrix with the following gap penalties: open=10; extend=0.5 as carried out in example 9.
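By way of illustration only, the following minimal sketch shows how a percentage identity can be obtained from a global (Needleman and Wunsch type) alignment with separate gap opening and gap extension penalties. It is not the EMBOSS Needle program itself. The following are assumptions of the sketch: the DNAfull matrix is reduced to a score of +5 for identical and -4 for non-identical unambiguous bases, ambiguity codes are not handled, end gaps are penalised like internal gaps, a gap of length n costs open + (n-1) x extend, and identity is reported as identical positions divided by the number of alignment columns. The function name global_identity is hypothetical.

```python
NEG_INF = float("-inf")


def global_identity(a, b, match=5.0, mismatch=-4.0, gap_open=10.0, gap_extend=0.5):
    """Percent identity over a global affine-gap alignment of sequences a and b."""
    a, b = a.upper(), b.upper()
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # Three score tables, one per alignment state:
    # M: a[i-1] paired with b[j-1]; X: a[i-1] opposite a gap; Y: b[j-1] opposite a gap.
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -(gap_open + (i - 1) * gap_extend)
    for j in range(1, m + 1):
        Y[0][j] = -(gap_open + (j - 1) * gap_extend)

    # Candidate predecessor scores for each state, reused by the traceback.
    def diag_options(i, j):
        return {"M": M[i - 1][j - 1], "X": X[i - 1][j - 1], "Y": Y[i - 1][j - 1]}

    def x_options(i, j):
        return {"M": M[i - 1][j] - gap_open, "X": X[i - 1][j] - gap_extend,
                "Y": Y[i - 1][j] - gap_open}

    def y_options(i, j):
        return {"M": M[i][j - 1] - gap_open, "Y": Y[i][j - 1] - gap_extend,
                "X": X[i][j - 1] - gap_open}

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair_score = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(diag_options(i, j).values()) + pair_score
            X[i][j] = max(x_options(i, j).values())
            Y[i][j] = max(y_options(i, j).values())

    # Traceback from the best final state, counting columns and identical pairs.
    final = {"M": M[n][m], "X": X[n][m], "Y": Y[n][m]}
    state = max(final, key=final.get)
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if state == "M":
            matches += a[i - 1] == b[j - 1]
            options = diag_options(i, j)
            i, j = i - 1, j - 1
        elif state == "X":
            options = x_options(i, j)
            i -= 1
        else:
            options = y_options(i, j)
            j -= 1
        state = max(options, key=options.get)
    return 100.0 * matches / columns


if __name__ == "__main__":
    print(global_identity("ACGTACGT", "ACGTCGT"))
```

Because the denominator is the number of alignment columns, a single-nucleotide deletion in an eight-nucleotide sequence lowers the identity in the same way as a single mismatch would.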
[0278] The term "derived from" in the context of being derived from a particular naturally occurring gene or sequence is defined herein as being chemically synthesized according to a naturally occurring gene or sequence and/or isolated and/or purified from a naturally occurring gene or sequence. Techniques for chemical synthesis, isolation and/or purification of nucleic acid molecules are well known in the art. In general, a derived sequence is a partial sequence of the naturally occurring gene or sequence or a fraction of the naturally occurring gene or sequence. Optionally, the derived sequence comprises nucleic acid substitutions or mutations, preferably resulting in a sequence being at least 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identical over its whole length to the naturally occurring gene, partial gene, sequence or partial sequence.
[0279] "Polypeptide" as used herein refers to any peptide, oligopeptide, polypeptide, gene product, expression product, or protein. A polypeptide is comprised of consecutive amino acids. The term "polypeptide" encompasses naturally occurring or synthetic molecules. A polypeptide is represented by an amino acid sequence. A polynucleotide is represented by a nucleotide sequence.
[0280] The term "homologous" when used to indicate the relation between a given nucleic acid or polypeptide molecule and a given host organism or host cell, is understood to mean that in nature the nucleic acid or polypeptide molecule is produced by a host cell or organisms of the same species, preferably of the same variety or strain. If homologous to a host cell, a nucleic acid sequence of interest, preferably encoding a polypeptide will typically be operably linked to another promoter sequence or, if applicable, another secretory signal sequence and/or terminator sequence than in its natural environment.
[0281] When used to indicate the relatedness of two nucleic acid sequences the term "homologous" means that one single-stranded nucleic acid sequence may hybridise to a complementary single-stranded nucleic acid sequence. The degree of hybridisation may depend on a number of factors including the extent of identity between the sequences and the hybridisation conditions such as temperature and salt concentration as discussed later. Preferably, the region of identity is greater than 5 bp, more preferably the region of identity is greater than 10 bp.
[0282] The term "heterologous" when used to indicate the relation between a given (recombinant) nucleic acid or polypeptide molecule and a given host organism or host cell, is understood to mean a nucleic acid or polypeptide molecule from a foreign cell which does not occur naturally as part of the organism, cell, genome or DNA or RNA sequence in which it is present, or which is found in a cell or location or locations in the genome or DNA or RNA sequence that differ from that in which it is found in nature. Heterologous nucleic acids or proteins are not endogenous to the cell into which they are introduced, but have been obtained from another cell or synthetically or recombinantly produced.
[0283] When used to indicate the relatedness of two nucleic acid sequences, the term "heterologous sequence" or "heterologous nucleic acid" is one that is not naturally found operably linked as neighboring sequence of the other sequence. As used herein, the term "heterologous" may mean "recombinant". "Recombinant" refers to a genetic entity distinct from that generally found in nature. As applied to a nucleotide sequence or nucleic acid molecule, this means that said nucleotide sequence or nucleic acid molecule is the product of various combinations of cloning, restriction and/or ligation steps, and other procedures that result in the production of a construct that is distinct from a sequence or molecule found in nature.
[0284] "Operably linked" is defined herein as a configuration in which a control sequence or regulating sequence is appropriately placed at a position relative to the nucleotide sequence of interest, preferably coding for the polypeptide of interest such that the control or regulating sequence directs or affects the transcription and/or production or expression of the nucleotide sequence of interest, preferably encoding a peptide or polypeptide of the invention in a cell and/or in a subject. For instance, a promoter is operably linked to a coding sequence if the promoter is able to initiate or regulate the transcription or expression of a coding sequence, in which case the coding sequence should be understood as being "under the control of" the promoter. When one or more nucleotide sequences and/or elements comprised within a construct are defined herein to be "configured to be operably linked to an optional nucleotide sequence of interest", said nucleotide sequences and/or elements are understood to be configured within said construct in such a way that these nucleotide sequences and/or elements are all operably linked to said nucleotide sequence of interest once said nucleotide sequence of interest is present in said construct.
[0285] "Promoter" refers to a nucleic acid sequence located upstream or 5' to a translational start codon of an open reading frame (or protein-coding region) of a gene and that is involved in recognition and binding of RNA polymerase II and other proteins (trans-acting transcription factors) to initiate transcription. The term promoter refers to a nucleic acid fragment that functions to control the transcription of one or more genes, located upstream with respect to the direction of transcription of the transcription initiation site of the gene, and is structurally identified by the presence of a binding site for DNA-dependent RNA polymerase, transcription initiation sites and any other DNA sequences, including, but not limited to transcription factor binding sites, repressor and activator protein binding sites, and any other sequences of nucleotides known to one skilled in the art to act directly or indirectly to regulate the amount of transcription from the promoter. The promoter does not include the transcription start site (TSS) but rather ends at nucleotide -1 of the transcription start site, and does not include nucleotide sequences that become untranslated regions in the transcribed mRNA such as the 5'-UTR. Promoters of the invention may be tissue-specific, tissue-preferred, cell-type specific, inducible or constitutive promoters. Tissue-specific promoters are promoters which initiate transcription only in certain tissues and refer to a sequence of DNA that provides recognition signals for RNA polymerase and/or other factors required for transcription to begin, and/or for controlling expression of the coding sequence precisely within certain tissues or within certain cells of that tissue. Expression in a tissue-specific manner may be only in individual tissues or in combinations of tissues. Tissue-preferred promoters are promoters that preferentially initiate transcription in certain tissues. 
Cell-type-specific promoters are promoters that primarily drive expression in certain cell types. Inducible promoters are promoters that are capable of activating transcription of one or more DNA sequences or genes in response to an inducer. The DNA sequences or genes will not be transcribed when the inducer is absent. Activation of an inducible promoter is established by application of the inducer. Constitutive promoters are promoters that are active under many environmental conditions and in many different tissue types. Preferably, capability to initiate transcription is established in an expression system using an expression construct comprising said promoter operably linked to a nucleotide sequence of interest using a suitable assay such as RT-qPCR or Northern blotting. A promoter is said to be capable of initiating transcription if a transcript can be detected or if an increase in a transcript level is found of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to transcription using a construct which only differs in that it is free of said promoter. In a further preferred embodiment, capability to initiate expression is established in an expression system using an expression construct comprising said promoter operably linked to a nucleotide sequence encoding a protein or polypeptide of interest. Preferably, said protein or polypeptide of interest is a secreted protein or polypeptide and expression of said protein or polypeptide of interest is detected by a suitable assay such as an ELISA assay, Western blotting or, dependent on the identity of the protein or polypeptide of interest, any suitable protein identification and/or quantification assay known to the person skilled in the art. 
A promoter is said to be capable of initiating expression if the protein or polypeptide of interest can be detected or if an increase in an expression level is found of at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 500%, 1000%, 1500% or 2000% as compared to expression using a construct which only differs in that it is free of said promoter. As a first and second promoter of the invention, an induced or constitutive promoter or a combination thereof may be used in the present invention.
[0286] An "intron" is a nucleotide sequence within a primary RNA transcript that is removed by RNA splicing or intron splicing while the final mature RNA product is being generated. Assessment whether intron splicing occurs can be done using any suitable method known to the person skilled in the art, such as but not limited to reverse-transcriptase polymerase chain reaction (RT-PCR) followed by size or sequence analysis of the RT-PCR product. Preferably, a nucleotide sequence is an intron if at least 2%, 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% of the primary RNA loses this sequence by RNA splicing using an assay suitable to detect intron splicing as indicated above. Preferably, an intron comprises a splice site GT at the 5' end of the nucleotide sequence, and a splice site AG at the 3' end of the nucleotide sequence, which splice site AG is preceded by a pyrimidine rich nucleotide sequence or polypyrimidine tract, optionally separated from splice site AG by 1-50 nucleotides. An intron may further comprise a branch site comprising the sequence Y-T-N-A-Y, at the 5' side of the polypyrimidine tract. The branch site may have the nucleotide sequence C-Y-G-A-C. An "intronic sequence" is understood to be at least part of the nucleotide sequence of an intron.
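The preferred sequence features of an intron listed above can be screened for with a short script, shown here as a non-limiting sketch. It only checks for the presence of the motifs (splice site GT, splice site AG, a polypyrimidine tract within 0-50 nucleotides upstream of the AG, and a branch site Y-T-N-A-Y at the 5' side of that tract); it does not establish that splicing actually occurs, which is assessed experimentally as described above. The function name intron_motifs and the minimum tract length of 8 pyrimidines are assumptions of the sketch and are not taken from this specification.

```python
import re

# IUPAC shorthand used in the text: Y = pyrimidine (C or T), N = any nucleotide.
BRANCH_SITE = re.compile(r"[CT]T[ACGT]A[CT]")  # Y-T-N-A-Y


def intron_motifs(seq, min_tract=8, max_spacer=50):
    """Report which of the preferred intron motifs are present in seq.

    min_tract (minimum polypyrimidine tract length) is a hypothetical
    parameter; max_spacer is the largest allowed distance between the
    tract and the 3' splice site AG.
    """
    s = seq.upper().replace("U", "T")
    body_end = max(len(s) - 2, 0)  # index at which the terminal dinucleotide starts
    result = {
        "donor_GT": s.startswith("GT"),
        "acceptor_AG": s.endswith("AG"),
        "polypyrimidine_tract": False,
        "branch_site": False,
    }

    # Keep the qualifying pyrimidine run that lies closest to the 3' splice site.
    tract = None
    for hit in re.finditer(r"[CT]{%d,}" % min_tract, s[:body_end]):
        if body_end - hit.end() <= max_spacer:
            tract = hit

    if tract is not None:
        result["polypyrimidine_tract"] = True
        # Search for the branch site between the 5' GT and the start of the tract.
        result["branch_site"] = BRANCH_SITE.search(s, 2, tract.start()) is not None
    return result


if __name__ == "__main__":
    # Hypothetical toy sequence: GT, filler, branch site CTGAC, filler, tract, G, AG.
    example = "GT" + "AAGGAA" + "CTGAC" + "AAGG" + "TTTTCCCTTT" + "G" + "AG"
    print(intron_motifs(example))
```

A sequence that carries all four motifs is a candidate intron only; the percentage of primary RNA that loses the sequence by splicing remains the defining criterion.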
[0287] "Expression" will be understood to include any step involved in the production of the peptide or polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification and secretion.
[0288] Optionally, a promoter represented by a nucleotide sequence present in a nucleic acid construct is operably linked to another nucleotide sequence encoding a peptide or polypeptide as identified herein.
[0289] An expression vector may be any vector which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of a nucleotide sequence encoding a polypeptide of the invention in a cell and/or in a subject.
[0290] As used herein, the "5'-UTR" is the sequence starting with nucleotide 1 of the mRNA and ending with nucleotide -1 of the start codon. It is possible that a regulating part of the promoter is comprised within the nucleotide sequence becoming a 5'-UTR; however, in such case, the 5'-UTR is still not part of the promoter as herein defined.
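The positional definitions of the promoter and the 5'-UTR can be illustrated with a minimal sketch. It assumes a DNA sequence given in the sense orientation together with two 0-based positions: the index of the transcription start site (nucleotide 1 of the mRNA) and the index of the first nucleotide of the start codon. The function name and the toy sequence are hypothetical. Note that the region upstream of the transcription start site is returned as a whole; which part of it constitutes the promoter is not determined by this sketch.

```python
def split_upstream_regions(seq, tss_index, start_codon_index):
    """Split a sense-strand sequence into the region upstream of the TSS
    (ending at nucleotide -1) and the 5'-UTR (nucleotide 1 of the mRNA up to
    nucleotide -1 of the start codon). Indices are 0-based."""
    if not 0 <= tss_index <= start_codon_index <= len(seq):
        raise ValueError("expected 0 <= tss_index <= start_codon_index <= len(seq)")
    upstream_of_tss = seq[:tss_index]                  # excludes the TSS itself
    five_prime_utr = seq[tss_index:start_codon_index]  # excludes the start codon
    return upstream_of_tss, five_prime_utr


if __name__ == "__main__":
    # Hypothetical toy sequence: 4 nt upstream, 6 nt 5'-UTR, then ATG.
    upstream, utr = split_upstream_regions("AAAATTTGGGATGCCC", 4, 10)
    print(upstream, utr)
```

Under these definitions a regulating element located downstream of the transcription start site falls within the 5'-UTR slice and is therefore not counted as part of the promoter.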
[0291] The term "control sequences" is defined herein to include all components, which are necessary or advantageous for the expression of a polynucleotide or a polypeptide. Each control sequence may be native or foreign to the nucleic acid sequence harboring or encoding the polynucleotide or the polypeptide. Such control sequences include, but are not limited to, a leader, optimal translation initiation sequences (as described in Kozak, 1991, J. Biol. Chem. 266:19867-19870), a polyadenylation sequence, a pro-peptide sequence, a pre-pro-peptide sequence, a promoter, a signal sequence, and a transcription terminator. At a minimum, the control sequences include a promoter, and transcriptional and translational stop signals.
[0292] The control sequences may be provided with linkers for the purpose of introducing specific restriction sites facilitating ligation of the control sequences with the coding region of the nucleic acid sequence encoding a polypeptide.
[0293] The control sequence may be an appropriate promoter sequence, a nucleic acid sequence, which is recognized by a host cell for expression of the nucleic acid sequence. The promoter sequence contains transcriptional control sequences, which mediate the expression of the polypeptide. The promoter may be any nucleic acid sequence, which shows transcriptional activity in the cell including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the cell.
[0294] The control sequence may also be a suitable transcription terminator sequence, a sequence recognized by a host cell to terminate transcription. The terminator sequence is operably linked to the 3' terminus of the nucleic acid sequence of interest, preferably encoding a polypeptide of interest. Any terminator, which is functional in the cell, may be used in the present invention.
[0295] The control sequence may also be a suitable leader sequence, a non-translated region of a mRNA which is important for translation by the host cell. The leader sequence is operably linked to the 5' terminus of the nucleic acid sequence of interest, preferably encoding a polypeptide of interest. Any leader sequence, which is functional in the cell, may be used in the present invention.
[0296] The control sequence may also be a polyadenylation sequence, a sequence which is operably linked to the 3' terminus of the nucleic acid sequence and which, when transcribed, is recognized by the host cell as a signal to add adenine residues to transcribed mRNA. Any polyadenylation sequence, which is functional in the cell, may be used in the present invention.
[0297] In this document and in its claims, the verb "to comprise" and its conjugations are used in their non-limiting sense to mean that items following the word are included, but items not specifically mentioned are not excluded. In addition, the verb "to consist" may be replaced by "to consist essentially of", meaning that a product or a composition or a nucleic acid molecule or a peptide or polypeptide of a nucleic acid construct or vector or cell as defined herein may comprise component(s) in addition to the ones specifically identified, said additional component(s) not altering the unique characteristic of the invention. In addition, reference to an element by the indefinite article "a" or "an" does not exclude the possibility that more than one of the elements is present, unless the context clearly requires that there be one and only one of the elements. The indefinite article "a" or "an" thus usually means "at least one".
[0298] All patent and literature references cited in the present specification are hereby incorporated by reference in their entirety.
TABLE-US-00001
TABLE 1 Sequence identification

SEQ ID NO:  Description
1           Expression enhancing element 1
2           Expression enhancing element 2
3           UN1
4-13        sequences derived from UN1
14          UN2
15          UN1dGAA
16          UN2dGAA
17          R3
18          fUN1
19          UN2-2
20          UN2-3
21          UN2-4
22          UN2-5
23          UN2-6
24          UN2-7
25          UN2-8
26          UN2-9
27          UN2-10
28          UN1dGAA-2
29          UN1dGAA-3
30          UN1dGAA-4
31          UN1dGAA-5
32          UN1dGAA-6
33          UN2dGAA-2
34          UN2dGAA-3
35          UN2dGAA-4
36          UN1shuffle
37          UN1shuffle-2
38          UN1shuffle-3
39          UN1shuffle-4
40          UN1shuffle-5
41          UN1shuffle-6
42          UN2shuffle-1
43          UN2shuffle-2
44          CAA1
45          CAA2
46          CAA3
47          CAA4
48          CAA5
49          CAA6
50          TATA1
51          TATA2
52          CMV promoter enhancer sequence
53          UBC enhancer region
54          CMV promoter enhancer sequence
55          construct
56          construct
57          CMV promoter sequence
58          Minimal CMV promoter sequence
59          EEE1-Xt
60          EEE1-80
61          EEE1-60
62          EEE1-50
63          EEE1-SL
64          HC RACE primer
65          Light chain vector sequence
66          Heavy chain vector sequence
67          HuMab1 protein light chain
68          HuMab1 protein heavy chain
69          HuMab2 protein light chain
70          HuMab2 protein heavy chain
71          pcDNA3.1 (+)
72          SeAP protein
73          EEE1 + CMV + TEE
74          EEE1-Xt + CMV + TEE
75          EEE1-80 + CMV + TEE
76          EEE1-60 + CMV + TEE
77          pPNic384
78          pPNic602 insert
79          EF1a promoter
80          EEE1-A1
81          EEE1-A2
82          EEE1-A3
83          EEE1-B1
84          EEE1-B2
85          EEE1-B3
86          EEE1-B4
87          EEE1-B5
88          Transcription regulating sequence
FIGURES
[0299] FIG. 1. Schematic map of intronic promoter construct and different transcripts. The construct comprises 2 promoters. Transcription by Promoter 1 results in a primary transcript including the intron that contains the complete Promoter 2 sequence and is bordered by 5'- and 3'-splice sites. After intron splicing, said primary transcript results in an mRNA without said intron (Transcript 1) encoding a "Gene". Transcription from Promoter 2 also results in an mRNA (Transcript 2) encoding the same "Gene".
[0300] FIGS. 2a-2b. Schematic map of EEE1 (FIG. 2a) and EEE2 (FIG. 2b) elements showing some features of the UBC and CCT8 genes relevant to their promoter activity in a genomic context. Features include the predicted transcription start site (TSS), 5'-UTRs, exon and intron information.
[0301] FIG. 3. Schematic map of an expression vector for an Ig light chain (IgLC) with the EEE1 sequence integrated upstream of the CMV promoter.
[0302] FIGS. 4a-4b. Comparison of HuMab1 production by CHO-S pools stably transfected with Reference or EEE1 constructs. Expression vector without (FIG. 4a) and with (FIG. 4b) additional expression regulating element. The bars represent the average exhaust titers of 4 pools derived from 2 independent transfections. Pools were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium.
[0303] FIG. 5. Comparison of HuMab1 production by CHO-S pools stably transfected with Reference or EEE2 constructs. The bars represent the average exhaust titers of 4 pools derived from 2 independent transfections. Pools were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium.
[0304] FIG. 6. Analysis of HuMab1 production by top-10 CHO-S clonal cell lines stably transfected with EEE1-TEE constructs harboring EEE1. Cells were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium.
[0305] FIGS. 7a-7b. Comparison of HuMab2 production by CHO-S pools stably transfected with EEE1 in reference vector (FIG. 7a, left panel) and in vector with additional regulating element (FIG. 7a, right panel). The bars represent the average exhaust titers of 4 pools derived from 2 independent transfections. Pools were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium. (FIG. 7b) Analysis of HuMab2 production by top-12 Reference and EEE1-TEE CHO-S clonal cell lines. Cells were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium.
[0306] FIG. 8. Comparison of SeAP activity in the exhaust media of CHO-S pools stably transfected with Reference or EEE1 constructs. Expression vector without (left panel) and with (right panel) additional expression regulating element. The bars represent the average activities of 4 pools derived from 2 independent transfections, measured using the SEAP Reporter Gene Assay Kit, Abcam. Pools were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium.
[0307] FIG. 9. Comparison of SeAP activity in the exhaust media of CHO-S pools stably transfected with constructs containing different versions of the EEE1 element. The bars represent the average activities of 4 pools derived from 2 independent transfections, measured using the SEAP Reporter Gene Assay Kit, Abcam. Pools were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium.
[0308] FIGS. 10a-10b. 5'-RACE amplification of 5'-ends of heavy chain transcripts from CHO-S clones stably transfected with EEE1-TEE constructs expressing HuMab1 (FIG. 10a). Two bands are detected on agarose gel, corresponding to the transcripts generated by the CMV promoter (Transcript 1) and the UBC promoter (Transcript 2). The size difference between Transcript 1 and Transcript 2 is explained in a schematic map of intronic promoter construct and different transcripts (FIG. 10b). The construct comprises 2 promoters, UBC and CMV. The CMV promoter is linked to a TEE sequence which also comprises a short intron. Transcription by the CMV promoter results, after intron splicing, in mRNAs with TEE as 5'-UTR (Transcript 1). The UBC promoter is linked to a partial UBC 5'-UTR region which comprises a 5' splice donor site and precedes the CMV promoter. Transcription by the UBC promoter results in mRNAs with the UBC 5'-UTR sequence (Transcript 2). The large intron which is spliced from the primary transcript runs from the 5'-splice donor sequence in the UBC sequence to the 3'-splice acceptor site in the TEE and contains the complete CMV sequence.
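The splicing logic described for FIG. 10b can be sketched as a toy model. This is only an illustration under stated assumptions: the segment labels are hypothetical names chosen for readability, not annotated features of any SEQ ID NO, and the model assumes that both the short TEE intron and the large intron end at the same 3'-splice acceptor directly upstream of the coding sequence.

```python
# Toy model of the two-promoter construct described for FIG. 10b.
# Segment labels are illustrative (hypothetical), ordered 5' to 3'.
CONSTRUCT = [
    "UBC promoter",
    "UBC 5'-UTR (partial)",   # ends at the 5'-splice donor
    "CMV promoter",            # lies inside the large intron
    "TEE exon (5'-UTR)",       # ends at the donor of the short TEE intron
    "TEE short intron",        # ends at the shared 3'-splice acceptor (assumption)
    "coding sequence",
]

# Segments removed by splicing, depending on which promoter fired.
SHORT_INTRON = {"TEE short intron"}
LARGE_INTRON = {"CMV promoter", "TEE exon (5'-UTR)", "TEE short intron"}


def mature_transcript(promoter, intron):
    """Return the segments left in the mRNA after transcription and splicing."""
    start = CONSTRUCT.index(promoter) + 1  # transcription starts downstream of the promoter
    return [segment for segment in CONSTRUCT[start:] if segment not in intron]


transcript_1 = mature_transcript("CMV promoter", SHORT_INTRON)
transcript_2 = mature_transcript("UBC promoter", LARGE_INTRON)

print("Transcript 1 (CMV):", " + ".join(transcript_1))
print("Transcript 2 (UBC):", " + ".join(transcript_2))
```

In this sketch the two mature transcripts share the coding sequence and differ only in their 5'-UTR, which is the property the 5'-RACE size difference relies on.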
[0309] FIG. 11. Effect of EEE in Pichia pastoris expressing recombinant human interleukin 8. Each bar represents the average expression of 10 independent clones.
[0310] The invention will be explained in more detail in the following Examples section, with reference to the appended figures. The examples serve for illustration purposes only and are not intended to limit the present invention in any way.
EXAMPLES
[0311] The expression enhancing element represented by SEQ ID NO: 1 is based on the Chinese hamster (Cricetulus griseus) ubiquitin-C (UBC) gene. It comprises the predicted promoter sequence and part of the 5'-untranslated region (FIG. 2A). The expression enhancing element represented by SEQ ID NO: 2 is based on the human CCT8 gene (chaperonin containing TCP1, subunit 8). It comprises the predicted promoter sequence and part of the 5'-untranslated region as well as a short sequence encoding 27 amino acids and part of the first intron (FIG. 2B).
Example 1
[0312] Expression plasmids were constructed based on the pcDNA3.1 expression vector (SEQ ID NO: 71). The vector was modified by removing the f1-ori. Coding sequences for the heavy chain (represented by a sequence that is at least 96% identical to SEQ ID NO: 68) and the light chain (represented by a sequence that is at least 99% identical to SEQ ID NO: 67) of an IgG1 (HuMab1) were inserted in this vector (the light chain coding sequence was inserted in the vector represented by SEQ ID NO: 65 and the heavy chain coding sequence was inserted in the vector represented by SEQ ID NO: 66), resulting in the Reference constructs. To generate the EEE1 (Expression Enhancing Element 1) vectors, the EEE1 sequence (SEQ ID NO: 1) was inserted upstream of the CMV promoter (SEQ ID NO: 57) (FIG. 3). EEE2 (SEQ ID NO: 2) was introduced in a similar way, resulting in the EEE2 vectors. A vector with an additional expression regulating element was generated by replacing the pcDNA3.1 5'-UTR with SEQ ID NO: 19 (transcription enhancing element; TEE).
[0313] CHO-S cells (Life Technologies) were maintained per manufacturer's instructions. Duplicate transfections were performed using 3E7 cells, 50 µg of linearized DNA and FreeStyle MAX Reagent (Life Technologies). Post-transfection pools were split in two and selected in CD FortiCHO medium supplemented with 8 mM glutamine and 800 µg/ml G418. Selected pools were seeded in 30 ml of the same medium at a density of 3E5 cells/ml in 125 ml shake-flasks. The HuMab1 exhaust titers were determined by ELISA (FIG. 4). Exhaust titers of the Reference pools were too low for accurate determination. This indicates that the antibody is poorly expressed. The EEE1 pools produced approximately 1 µg/ml (FIG. 4A). A similar effect was observed in the vector with an additional expression regulating element. In this vector the introduction of EEE1 increased the production from approximately 0.2 µg/ml to 9-12 µg/ml in stable pools from three independent transfection experiments (FIG. 4B). These data show that a poorly expressed antibody can be expressed at significantly higher levels by introduction of the EEE1 element.
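As a rough illustration of the size of the effect reported for FIG. 4B, the titers stated above can be converted into a fold increase. The sketch below uses only the approximate values given in the text (about 0.2 µg/ml without EEE1, 9-12 µg/ml with EEE1); the function name is hypothetical, and the result is no more precise than those inputs.

```python
def fold_change(reference, treated):
    """Return the ratio treated / reference for two titers in the same unit."""
    if reference <= 0:
        raise ValueError("reference titer must be positive")
    return treated / reference


reference_titer = 0.2       # µg/ml, approximate titer without EEE1 (vector with TEE)
eee1_titers = (9.0, 12.0)   # µg/ml, range reported with EEE1

low, high = (fold_change(reference_titer, titer) for titer in eee1_titers)
print(f"EEE1 raises the titer roughly {low:.0f}- to {high:.0f}-fold")
```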
Example 2
[0314] The effect of introducing the EEE2 element was studied in a vector harboring an additional expression regulating element (See Example 1, FIG. 4B). CHO-S cells were transfected either with the reference or with the EEE2 constructs as described previously. Antibody exhaust titers of the stably transfected EEE2 pools were over 20 times higher than the Reference pools (FIG. 5).
Example 3
[0315] The EEE1-TEE and Reference pools generated previously were seeded in six 96-well plates at a density of 0.5 cell/well in CD FortiCHO selection medium. The Reference cells showed impaired growth as compared to the EEE1-TEE clones and thus no HuMab1 was produced. Clones of EEE1-TEE showed normal growth and HuMab1 production (see below). One hundred clonal EEE1-TEE lines were assessed for HuMab1 production in microtiter plates. The 10 clones with the highest specific productivity were expanded to 125 ml shake-flasks. The clones were seeded in 30 ml of CD FortiCHO selection medium at a density of 3E5 cells/ml and HuMab1 exhaust titers were determined by ELISA (FIG. 6). Clones produced up to 0.25 µg/ml HuMab1. These data indicate that EEE1 can facilitate the generation of clonal lines and allows the generation of clonal lines with relevant expression levels.
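The choice of 0.5 cell/well can be illustrated with a simple Poisson calculation. This is a sketch under assumptions that the text does not state: cells are taken to distribute over wells according to a Poisson distribution, and outgrowth efficiency is ignored, so the numbers describe seeding only, not the number of clones actually recovered.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k cells in a well at the given mean cells per well."""
    return math.exp(-mean) * mean**k / math.factorial(k)


MEAN_CELLS_PER_WELL = 0.5   # seeding density from the text
WELLS = 6 * 96              # six 96-well plates

p_empty = poisson_pmf(0, MEAN_CELLS_PER_WELL)
p_single = poisson_pmf(1, MEAN_CELLS_PER_WELL)
p_multiple = 1.0 - p_empty - p_single

# Among wells that received at least one cell, the share that received exactly one.
clonal_fraction = p_single / (1.0 - p_empty)
expected_single_wells = WELLS * p_single

print(f"empty wells:        {p_empty:.1%}")
print(f"single-cell wells:  {p_single:.1%}")
print(f"multi-cell wells:   {p_multiple:.1%}")
print(f"occupied wells that are single-cell: {clonal_fraction:.1%}")
print(f"expected single-cell wells out of {WELLS}: {expected_single_wells:.0f}")
```

Under these assumptions, a lower seeding density trades a larger share of empty wells for a higher chance that a growing well derives from a single cell.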
Example 4
[0316] The copy number of antibody-expressing, EEE-comprising clones was determined. The PrimerExpress program (Life Technologies) was used to design Taqman primers and probes specific for the heavy and light chains of HuMab1 and for β-2 microglobulin. The primers were combined in a triplex Taqman assay to measure gene copies in gDNA samples of EEE1-TEE HuMab1 clones and pools. The gene copy numbers were compared with HuMab1 titers (Table 2). Clonal cell lines producing similar HuMab1 titers had different numbers of light and heavy chain gene copies (Clone 1 and 2). Also, clones producing very different HuMab1 titers had similar gene copy numbers (Clone 3 and 4). In pools, relatively high numbers of light and heavy chain genes were paired with relatively low expression levels. These data (Table 2) indicate that there is no correlation between EEE-comprising gene copy number and HuMab1 expression levels.
TABLE-US-00002
TABLE 2 Titers of IgG1 and gene copy numbers

          IgG titer   LC     HC
Clone1    123.7       36.8   25.1
Clone2    118.2       1.7    1.7
Clone3    143.6       3.1    1.2
Clone4    7.2         4.6    0.7
Pool      9.0         17.5   21.1
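One way to make the lack of correlation in Table 2 concrete is to divide each titer by the corresponding gene copy number. The sketch below does this for the heavy chain copies; it is an illustration only, the helper name is hypothetical, and because Table 2 does not state the unit of the IgG titer the ratios are expressed in arbitrary titer units per copy.

```python
# (IgG titer, LC copies, HC copies) as listed in Table 2; titer unit not stated there.
TABLE_2 = {
    "Clone1": (123.7, 36.8, 25.1),
    "Clone2": (118.2, 1.7, 1.7),
    "Clone3": (143.6, 3.1, 1.2),
    "Clone4": (7.2, 4.6, 0.7),
    "Pool": (9.0, 17.5, 21.1),
}


def titer_per_copy(titer, copies):
    """Titer divided by gene copy number (arbitrary titer units per copy)."""
    if copies <= 0:
        raise ValueError("copy number must be positive")
    return titer / copies


per_hc_copy = {
    name: titer_per_copy(titer, hc) for name, (titer, _lc, hc) in TABLE_2.items()
}

for name, value in per_hc_copy.items():
    print(f"{name}: {value:.1f} titer units per HC copy")

# Ratio between the most and least productive sample on a per-copy basis.
spread = max(per_hc_copy.values()) / min(per_hc_copy.values())
print(f"per-copy productivity spans a {spread:.0f}-fold range")
```

If titer simply scaled with copy number, the per-copy values would be similar across samples; the wide range is consistent with the conclusion drawn from Table 2.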
Example 5
[0317] The HuMab1 heavy and light chain genes of the previous examples were replaced by heavy and light chain genes (SEQ ID NOs: 69 and 70) encoding a biosimilar antibody (HuMab2, derived from DrugBank Accession Number DB00072). The constructs were used to generate CHO-S pools as described previously. The exhaust titers were determined by ELISA. The data (6.3 µg/ml without enhancing element) indicate that this antibody is produced at a higher level than the antibody from the previous examples. Without any additional expression regulating element, introduction of the EEE resulted in a 3.7-fold increase (FIG. 7A, left panel); in the modified vector the increase was 7-fold (FIG. 7A, right panel). Since the additional expression regulating element on its own results in a 40% increase, the data also indicate a synergistic effect between the EEE and the additional expression regulating element. Clonal lines were isolated from the Reference and EEE1-TEE pools as described previously. The best EEE1-TEE clones produced 3-fold higher HuMab2 titers as compared to the best Reference clones (FIG. 7B). These data indicate that the EEE1 element can be successfully applied in enhancing recombinant protein expression from stable cell lines.
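The synergy argument can be checked with a short calculation. The sketch below is an illustration, not the applicants' analysis: it assumes that two independent elements would combine multiplicatively, and because the text does not say which reference the 7-fold increase refers to, it evaluates both possible readings.

```python
EEE_ALONE = 3.7          # fold increase from EEE in the unmodified vector
TEE_ALONE = 1.4          # 40% increase from the additional regulating element alone
REPORTED_COMBINED = 7.0  # fold increase reported for the modified vector

# Assumption: independent elements would multiply their individual effects.
expected_if_independent = EEE_ALONE * TEE_ALONE

# Reading A: the 7-fold increase is relative to the unmodified Reference vector.
observed_reading_a = REPORTED_COMBINED

# Reading B: the 7-fold increase is relative to the vector that already
# contains the additional element, so the total effect includes its 1.4-fold.
observed_reading_b = REPORTED_COMBINED * TEE_ALONE

print(f"expected if independent: {expected_if_independent:.2f}-fold")
print(f"observed, reading A:     {observed_reading_a:.2f}-fold")
print(f"observed, reading B:     {observed_reading_b:.2f}-fold")
```

Under either reading the observed combined effect exceeds the multiplicative expectation, which is what a synergistic interaction would look like in this simple model.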
Example 6
[0318] The HuMab1 light chain gene of the constructs from Example 1 was replaced by the gene encoding secreted alkaline phosphatase (SeAP; SEQ ID NO: 72). The constructs were used to generate CHO-S pools as described previously. The SeAP activity was measured in the exhaust medium using the SEAP Reporter Gene Assay Kit, Abcam. The EEE1 pools showed 2-fold higher activity as compared to the Reference pools (FIG. 8). In the EEE1-TEE pools the increase was almost 4-fold as compared to the Reference pool. These data show that EEE1 enhances the expression of a single subunit non-antibody protein in a transfected cell line.
Example 7
[0319] The SeAP constructs used in Example 6 all comprised the CMV promoter. Two TEE vector variants were made that contained the human EF-1α promoter instead of CMV (SEQ ID NO: 79). The constructs were used to generate CHO-S pools as described previously. The SeAP activity was measured in the exhaust medium using the SEAP Reporter Gene Assay Kit, Abcam. The EEE1-TEE pool with EF-1α as intronic promoter produced 2.8-fold higher SeAP activity as compared to the Reference EF-1α promoter pool without EEE1. These data show that EEE1 enhances the expression of a protein in an intronic promoter construct when the intronic promoter is not the CMV promoter, such as the EF-1α promoter.
Example 8
[0320] The EEE1 element of the EEE1 SeAP-expression vector was replaced by the following variants: 1. EEE1-80 (SEQ ID NO: 60), with a 290 bp truncation from the 5'-end; 2. EEE1-60 (SEQ ID NO: 61), with a 580 bp truncation from the 5'-end; 3. EEE1-50 (SEQ ID NO: 62), with a 725 bp truncation from the 5'-end; 4. EEE1-Xt (SEQ ID NO: 59), with an 800 bp extension from the genomic C. griseus UBC sequence at the 5'-end; 5. EEE1-SL (SEQ ID NO: 63), with all major predicted splice donor and acceptor sites mutated. SeAP activity in the supernatant of cells transfected with the EEE1 element was set at 100%; without the EEE1 element, activity decreased to 39% (FIG. 9). The 5' truncations of the EEE1-80 and EEE1-60 constructs gradually decreased activity but still showed enhanced activity as compared to the No-EEE construct. The EEE1-50 element decreased SeAP activity to 40%, which is similar to the No-EEE construct. The data suggest that sequences with more than 50% identity to EEE1 can function as expression enhancing elements. The EEE1-Xt construct produced almost 40% increased activity as compared to the EEE1 construct, which shows that additional enhancer sequences reside in the region upstream of the genomic sequence from which EEE1 was taken. The activity of the EEE1 element is severely impaired by the 4 nt mutations of the EEE1-SL construct, which prevent correct intron splicing, resulting in a significant reduction in SeAP expression as compared to the EEE1 construct.
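The relation between the truncation sizes and the variant names can be sketched as follows. The calculation takes the EEE1 length of 1,449 bp from the sequence listing and the truncation sizes from the text; reading the numeric suffixes as the approximate percentage of EEE1 retained is an inference from the arithmetic, not something the text states.

```python
EEE1_LENGTH_BP = 1449  # length of SEQ ID NO: 1

# 5' truncation sizes in bp, as given in the text.
TRUNCATIONS_BP = {"EEE1-80": 290, "EEE1-60": 580, "EEE1-50": 725}


def retained_percent(full_length, truncated):
    """Percentage of the full-length element that remains after truncation."""
    if not 0 <= truncated <= full_length:
        raise ValueError("truncation must be between 0 and the full length")
    return 100.0 * (full_length - truncated) / full_length


retained = {
    name: retained_percent(EEE1_LENGTH_BP, cut) for name, cut in TRUNCATIONS_BP.items()
}

for name, percent in retained.items():
    remaining_bp = EEE1_LENGTH_BP - TRUNCATIONS_BP[name]
    print(f"{name}: {remaining_bp} bp retained ({percent:.1f}% of EEE1)")
```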
Example 9
[0321] The EEE1 element of the EEE1 SeAP-expression vector was replaced by 9 variants of the EEE1 element, which can be grouped based on 2 different types of mutations. Variants of the first type (EEE1-A) had more than 30 percent of the nucleotides mutated within one region of the EEE1 or EEE1-Xt element; each variant targeted a different one of 3 regions, each region consisting of at least 244 bp. Variants of the second type (EEE1-B) had the same size as the EEE1 element (1,449 bp) and at least 96 percent sequence identity to it, with mutations that targeted different functional sequences of the EEE1 sequence. The different mutations are listed in Table 3.
TABLE-US-00003
TABLE 3 Modifications of EEE1

Type A: More than 30% mutated in 3 regions of EEE1

Variant      SEQ ID NO:  Identity to EEE1 (1)  Modified EEE1 region  Size modified region (bp)
EEE1-A1      80          71.6% (2)             5' promoter region    1,526
EEE1-A2      81          95.0%                 3' promoter region    244
EEE1-A3      82          81.8%                 intron region         480

Type B: Mutations that target specific domains of EEE1

Variant      SEQ ID NO:  Identity to EEE1 (1)  Modification of EEE1 sequence
EEE1-B1      83          97.9%                 1: 7 nt changed in nt 144-152
                                               2: 4 nt changed in nt 612-615
                                               3: 4 nt changed in nt 667-670
                                               4: 6 nt changed in nt 816-823
                                               5: 5 nt changed in nt 1,106-1,112
                                               6: 5 nt changed in nt 1,432-1,438
EEE1-B2      84          96.5%                 50 single bp mutations = 50% of CG's mutated in nt 227-1,409;
                                               predicted transcription factor binding sites maintained
EEE1-B3      85          99.7%                 5 single bp mutations = 50% of CG's mutated in nt 549-603
EEE1-B4      86          96.5%                 50 single bp mutations = 50% of CG's mutated in nt 227-1,409;
                                               8 predicted sites for transcription factors SP1, HSF, and NFκB affected.
EEE1-B5      87          96.0%                 51 bp mutations in 12 regions with predicted transcription factor
                                               binding activity spanning nt 105-1,449 were mutated.
EEE1-B6 (3)  63          99.7%                 4 single bp mutations eliminating predicted and known splice-donor
                                               or acceptor sites, including known donor site (nt 970), nt 545 and
                                               552 in promoter region, nt 1,267 in intron region.

(1) Identity calculated using EMBOSS Needle Nucleotide Alignment algorithm with the following parameters: DNAfull matrix with the following gap penalties: open = 10; extend = 0.5
(2) % identity EEE1-A1 calculated relative to EEE1-Xt
(3) This is referred to as EEE1-SL in Example 8
[0322] SeAP activity in the supernatant of cells transfected with the different variants was measured. Activity of cells with EEE1 element was used as reference (100%). Without EEE1 element the activity was 24% in this experiment. SeAP activity of cells transfected with the EEE1-A2 and EEE1-A3 constructs was decreased to 75% and 48% relative to EEE1, respectively (Table 4). This is higher than the 24% activity observed with the No-EEE construct in this experiment. The EEE1-A1 construct decreased SeAP activity to 30% relative to the EEE1-Xt construct on which it is based, which is still higher than the No-EEE construct which produces only 18% of SeAP activity relative to the EEE1-Xt construct. The data show that EEE1 variants with as little as 72% overall identity and locally 50% identity to the genomic UBC sequence can function as expression enhancing elements.
[0323] SeAP activity of cells transfected with the EEE1-B1 to B6 constructs was decreased by up to 42% relative to the EEE1 construct (Table 4). The data show that mutations in regions with a predicted functionality in the intronic promoter activity of the EEE1 element can significantly limit the expression enhancement capability of the EEE1 element. For instance, mutating 4 nt involved in intron-splicing resulted in 38% decreased SeAP titers (EEE1-B6). Mutation of different sets of CpG's also resulted in decreased SeAP titers (B1, B2, B4).
TABLE-US-00004
TABLE 4 SeAP activity of EEE1 variants

Construct   Activity relative to EEE1 (%) (1)
No-EEE1     24
EEE1        100
EEE1-A2     75
EEE1-A3     48
EEE1-B1     58
EEE1-B2     60
EEE1-B3     73
EEE1-B4     58
EEE1-B5     84
EEE1-B6     62

Construct   Activity relative to EEE1-Xt (%) (1)
No-EEE1     18
EEE1-Xt     100
EEE1-A1     30

(1) Values represent the average activities of 4 pools derived from 2 independent transfections, measured using the SEAP Reporter Gene Assay Kit, Abcam. Pools were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium
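The percentages in Table 4 can be re-expressed in the two forms used in the text: the percent decrease relative to EEE1 and the fold enhancement over the No-EEE1 construct. The sketch below is a minimal illustration using only the values from the first panel of Table 4; the helper names are hypothetical.

```python
# Activity relative to EEE1 (%), first panel of Table 4.
ACTIVITY_VS_EEE1 = {
    "No-EEE1": 24, "EEE1": 100, "EEE1-A2": 75, "EEE1-A3": 48,
    "EEE1-B1": 58, "EEE1-B2": 60, "EEE1-B3": 73, "EEE1-B4": 58,
    "EEE1-B5": 84, "EEE1-B6": 62,
}


def percent_decrease(activity):
    """Decrease relative to the EEE1 construct, which is set at 100%."""
    return 100 - activity


def fold_over_no_eee(activity, baseline=ACTIVITY_VS_EEE1["No-EEE1"]):
    """Enhancement relative to the construct without any EEE1 element."""
    return activity / baseline


b_variants = {k: v for k, v in ACTIVITY_VS_EEE1.items() if k.startswith("EEE1-B")}
largest_b_decrease = max(percent_decrease(v) for v in b_variants.values())

print(f"largest decrease among B variants: {largest_b_decrease}%")
print(f"decrease for EEE1-B6: {percent_decrease(ACTIVITY_VS_EEE1['EEE1-B6'])}%")
for name, activity in ACTIVITY_VS_EEE1.items():
    print(f"{name}: {fold_over_no_eee(activity):.2f}-fold over No-EEE1")
```

Every variant in this panel remains above the No-EEE1 baseline, so each ratio printed by the loop is greater than or equal to 1.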
Example 10
[0324] An EEE1-TEE CHO-S clone from Example 3 was grown and cells were harvested in log-phase. Total RNA was isolated from the cells using the AllPrep DNA/RNA Mini Kit (Qiagen). cDNA was synthesized using the Epicentre Exact Start Eukaryotic mRNA 5' and 3' RACE Kit. First strand cDNA was amplified using the 5' RACE primer from the kit combined with a heavy chain specific primer (SEQ ID NO: 64) and ZymoTaq DNA polymerase. The PCR product was analyzed on 1.2% agarose gels, showing two discrete bands (FIG. 10A), which were separately isolated and inserted in a pCR4-TOPO vector (Life Technologies). Sequencing analysis revealed that the upper band seen on the agarose gel corresponds to the transcript initiated from the CMV promoter. The lower band corresponds to the transcript initiated from the UBC promoter. Both products have the predicted intronic sequence spliced out correctly. The differences in size correspond to the different lengths of the 5'-UTRs, as depicted in FIG. 10B. The data show that both promoters contribute to transcription.
Example 11
[0325] CHO-S pools stably transfected with constructs containing one of three different single promoters were compared by the SeAP activity in the supernatant. Pools were grown in 125 ml shake-flasks in 30 ml CD FortiCHO selection medium and SeAP activity was measured using the SEAP Reporter Gene Assay Kit (Abcam) in 4 pools per construct derived from 2 independent transfections. The constructs contained either the CMV promoter (Example 6), the EF-1α promoter (Example 7), or the UBC promoter (Example 11). The UBC promoter produced 2.7-fold higher SeAP activity as compared to the CMV promoter construct. The UBC promoter produced 6.0-fold higher SeAP activity as compared to the EF-1α promoter. The data show that the expression with the UBC promoter alone is higher as compared to the CMV promoter or the EF-1α promoter alone.
Example 12 Methanol Induced Secretion of IL8 in Pichia pastoris GS115 Integrative Transformants
[0326] Plasmids for stable transformation of Pichia pastoris with human interleukin 8 (hIL-8) expression constructs were generated in plasmid pPIC9K (Life Technologies). Insertion of the hIL-8 gene in pPIC9K resulted in plasmid pPNic384 (SEQ ID NO: 77), which contains the hIL-8 gene under control of the AOX1 promoter. The EEE1 sequence was inserted upstream of the AOX1 promoter as an AatII-AleI fragment (SEQ ID NO: 78) in pPNic384, resulting in plasmid pPNic602.
[0327] The expression vectors were linearized by digestion with SalI and transformed into P. pastoris strain GS115 using electroporation as recommended (Invitrogen, 2008). Transformants were plated on RDB agar plates (Regeneration Dextrose Medium, a medium lacking histidine). After incubation at 30 °C for 48 h, large colonies were observed. A control transformation without DNA was performed, resulting in no colonies. Ten clones per construct were picked at random from the transformation plate and grown to saturation in 800 µl BMG (Buffered minimal medium with 1% glycerol) in 2 ml deep well plates. The plate was kept in a shaking incubator (Infors-HT Microtron) set at 30 °C, 1000 rpm for 18 hours. The optical density of the culture was between 5 and 10 absorbance units at 600 nm. The cells were harvested and the medium was replaced by 800 µl of BMM (Buffered minimal medium with 0.5% methanol) in 2 ml deep well plates. The cells were grown in the shaking incubator and every 24 hours 0.5% methanol (final concentration) was added to the culture to maintain induction. After 72 hours of methanol induction the culture supernatants were collected and assayed for secreted hIL8 yields using the AlphaLISA hIL8 kit (Perkin Elmer). The data show (FIG. 11) that there is a significant difference between the IL8 yields of the reference and the EEE1 transformants, suggesting that the EEE1 sequence upstream of the promoter improves the hIL8 yields compared to the expression plasmid without the EEE1 sequence.
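The methanol feed described above can be sketched as a simple dilution calculation. The assumptions here are not stated in the text and are labeled as such: the percentage is taken as volume per volume, the stock is taken to be pure methanol, the methanol added previously is assumed to be fully consumed, and evaporation is ignored. The function name is hypothetical.

```python
def methanol_to_add_ul(culture_volume_ul, target_fraction, stock_fraction=1.0):
    """Volume of stock (µl) that brings a methanol-free culture to target_fraction (v/v).

    Solves stock_fraction * v / (culture_volume_ul + v) = target_fraction for v.
    """
    if not 0 < target_fraction < stock_fraction <= 1:
        raise ValueError("need 0 < target_fraction < stock_fraction <= 1")
    return culture_volume_ul * target_fraction / (stock_fraction - target_fraction)


CULTURE_VOLUME_UL = 800.0  # culture volume per well, from the text
TARGET_FRACTION = 0.005    # 0.5% final concentration, assumed to be v/v

volume = methanol_to_add_ul(CULTURE_VOLUME_UL, TARGET_FRACTION)
final_fraction = volume / (CULTURE_VOLUME_UL + volume)

print(f"add about {volume:.1f} µl of pure methanol per well")
print(f"resulting concentration: {final_fraction:.2%} (v/v)")
```

With a diluted stock the same function applies by passing a smaller `stock_fraction`, which increases the volume that has to be pipetted into each well.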
Sequence CWU
1
1
8811449DNAArtificial sequenceEEE1 1tttcaggcaa ccagagctac atagtgagat
cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg gttcaatctc ttaggtagct
aaatatgaac aaaatttggg aaatgtgacc 120ttttccttag tgacagtcag atagaacctt
ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac agcctggaaa ggtctagtgc
tatggggctt caggtcgaat gccaactgtt 240ttcaagaact gtgtggattt ttctgcctgt
aacgaattca gattcatttt tcaaaactcg 300gggagagttt tcccccttta taattttttt
tttaaattta ttaaactttg tttcgttccc 360cttgttttga gaattgcaga gtcatccacc
ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc ctggcggggc tgaaggggct
ggggaagcga gggctccaaa gggaccccag 480tgtggcagga gccaaagccc taggtcccta
gaacgcagag gccaccggga ccccccagac 540ggggtaagcg ggtgggtgtc tggggcgcga
agccgcactg cgcatgcgcc gaggtccgct 600ccggccgcgc tgatccaagc cgggttctcg
cgccgacctg gtcgtgattg acaagtcaca 660cacgctgatc cctccgcggg gccgcacagg
gtcacagcct ttcccctccc cacaaagccc 720cctactctct gggcaccaca cacgaacatt
ccttgagcgt gaccttgttg gctctagtca 780ggcgcctccg gtgcagagac tggaacggcc
ttgggaagta gtccctaacc gcatttccgc 840ggagggatcg tcgggagggc gtggcttctg
aggattatat aaggcgactc cgggcgggtc 900ttagctagtt ccgtcggaga cccgagttca
gtcgccgctt ctctgtgagg actgctgccg 960ccgccgctgg tgaggagaag ccgccgcgct
tggcgtagct gagagacggg gagggggcgc 1020ggacacgagg ggcagcccgc ggcctggacg
ttctgtttcc gtggcccgcg aggaaggcga 1080ctgtcctgag gcggaggacc cagcggcaag
atggcggcca agtggaagcc tgaggggata 1140ggcgagcggc cctgaggcgc tcgacggggt
tgggggggaa gcaggcccgc gaggcagctg 1200cagccgggaa cgtgcggcca accccttatt
ttttttgacg ggttgcgggc cgtaggtgcc 1260tccgaagtga gagccgtggg cgtttgactg
tcgggagagg tcggtcggat tttcatccgt 1320tgctaaagac ggaagtgcga ctgagacggg
aagggggggg agtcggttgg tggcggttga 1380acctggacta aggcgcacat gacgtcgcgg
tttctatggg ctcataatgg gtggtgagga 1440catttccct
144921228DNAArtificial sequenceEEE2
2gtaaagcaga tcacacagaa tatggcacac ttgagcactt gatgtgtact acattactct
60tagtgacgac tttaattatc gtgcgcattc ccagcgcttc ctatggtgcc caacacagag
120cggacgccta gagacaattt tgggggatgg ggcagatgct ctgcctcggg aaaaaaaaag
180cacacctgcc ctgacgttgg tggctgggtc tggaagatac gtggaaatta agctaaggat
240gtgtggcttc cagatcaaaa accgcaaaaa tctaacgccg tgactactga ctacggtcag
300agagcacaga ctggagcaac ctctcacggc ctgggctgtc tgcgcgtgcg tgagccagaa
360acccgagggg ctccctgggc ccgccctatc gatcgacccg atcggggatc gtcagcttgg
420ttctggccac agaggttgct cttctcgcga tgcttcagac ctggcggcag ggaaagggtg
480ggctaattgg agagccagga agagcgtgag gcggccccac gctgctttcc cagaaggctg
540tgcgtgctcc tcgcttcctc cgcggtcttc cgagcggtcg cgtgaactgc ttccagcagg
600ctggccatgg cgcttcacgt tcccaaggct ccgggctttg cccagatgct caaggaggga
660gcgaaagtaa gggctgaagg aaaggaatga ggtgggagcg tcagcatagg gctgcggcgg
720cggcggcgaa gtaggagggc ctactaacgg gctgagcgtg ctgccctggc tcagcggccg
780ggggaagaga agattccaga aagggaggtg attttggaag ggctcggcca ccggagcctg
840cgggcacttc tcttcttccg cgaccgggag aaggccgagg gatcggcggc acgatcgaca
900ttgtacacct tgaaggtgga cggatgtgaa gccgcgcgtg cgttttgcct ccatccgtaa
960atggggctaa ggcccgtcac ccttaaagga ggttgtgagg gtgaaattga ataacgtaga
1020tgaaattgtc ttgagaactg cgacgtcgat tatcacatag ctcgcgagtt gtaggatggg
1080gaagaacgag aactagccga tccagagaag agagtgggaa aaagggccgg gtcttggttg
1140cttgcttccc agtgagaaac atacggcttt cagcttagtt gacagaagcc atgcgttgta
1200gccaaatgag ttccggtccc aacttatg
12283173DNAArtificial sequenceTEE 3caagctctag caggaagaag aaataagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttgt gagagtaaaa aataaaactc
ccaaaaaaaa gaaaatcatc aaaaaaacaa 120atttcaaaaa gagtttttgt gtttggggat
taaagaataa aaaaaacaac gtc 1734175DNAArtificial sequenceTEE
4caagctctag caggaagaag aaataagaag aagaagaaga agaagaagaa gcgtctcctc
60ttcttcttgt gagagtaaaa aataaaactc ccaaaaaaaa gaaaatcatc aaaaaaacaa
120atttcaaaaa gagtttttgt gtttggggat taaagaataa aaaaaacaac gtccc
1755189DNAArtificial sequenceTEE 5agatcactag aagcttcaag ctctagcagg
aagaagaaat aagaagaaga agaagaagaa 60gaagaagcgt ctcctcttct tcttgtgaga
gtaaaaaata aaactcccaa aaaaaagaaa 120atcatcaaaa aaacaaattt caaaaagagt
ttttgtgttt ggggattaaa gaataaaaaa 180aacaacgcc
1896189DNAArtificial sequenceTEE
6agatcactag aagcttcaag ctctagcagg aagaagaaag aagaagaaga agaagaagaa
60gaagaagcgt ctcctcttct tcttgtgaga gtaaaaaaga aaactcccaa aaaaaagaaa
120atcatcaaaa aaacaaattt caaaaagagt ttttgtgttt ggggattaaa gaagaaaaaa
180aacaacgcc
1897191DNAArtificial sequenceTEE 7agatcactag aagcttcaag ctctagcagg
aagaagaaat aagaagaaga agaagaagaa 60gaagaagcgt ctcctcttct tcttgtgaga
gtaaaaaata aaactcccaa aaaaaagaaa 120atcatcaaaa aaacaaattt caaaaagagt
ttttgtgttt ggggattaaa gaataaaaaa 180aacaacaggc c
1918284DNAArtificial sequenceTEE
8ctttttcgca acgggtttgc cgccagaaca caggtgtcgt gaggaattag cttggtacta
60atacgactca ctatagggag acccaagctg gctaggtaag cttggtaccc aagctctagc
120aggaagaaga aataagaaga agaagaagaa gaagaagaag cgtctcctct tcttcttgtg
180agagtaaaaa ataaaactcc caaaaaaaag aaaatcatca aaaaaacaaa tttcaaaaag
240agtttttgtg tttggggatt aaagaataaa aaaaacaacg tccc
2849230DNAArtificial sequenceTEE 9aacccactgc ttactggctt atcgaaatta
atacgactca ctatagggag acccaagctc 60tagcaggaag aagaaataag aagaagaaga
agaagaagaa gaagcgtctc ctcttcttct 120tgtgagagta aaaaataaaa ctcccaaaaa
aaagaaaatc atcaaaaaaa caaatttcaa 180aaagagtttt tgtgtttggg gattaaagaa
taaaaaaaac aacctccacc 23010177DNAArtificial sequenceTEE
10caagctctag cagcaacaac aaataacaac aacaacaaca acaacaacaa gcgtctcctc
60ttcttcttgt gagagtaaaa aataaaactc ccaaaaaaaa gaaaatcatc aaaaaaacaa
120atttcaaaaa gagtttttgt gtttggggat taaagaataa aaaaaacaac ctccacc
17711191DNAArtificial sequenceTEE 11agatcactag aagcttcaag ctctagcagg
aagaagaaat aagaagaaga agaagaataa 60gaagaagcgt ctcgtcttct tcttgtgaga
gtaaaaaata aaactcccaa aaaaaataaa 120atcatcaaaa aaagaaattt caaaaagagt
ttttgtgttt ggggattaaa gaataaaaaa 180aacaacaggc c
19112189DNAArtificial sequenceTEE
12agatcactag aagcttcaag ctctagcagg aagaagaaat aataagaaga agaagaataa
60gaagaagcgt ctcctcttct tcttgtgaga gtaaaaaata aaactcccaa aaaaaataaa
120atcatcaaaa aaataaattt caaaaagagt ttttgtgttt ggggattaaa gaataaaaaa
180aacaacgcc
18913173DNAArtificial sequenceTEE 13caagctctag caggaagaag aaakaagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttgt gagagtaaaa aakaaaactc
ccaaaaaaaa kaaaatcatc aaaaaaacaa 120atttcaaaaa gagtttttgt gtttggggat
taaagaakaa aaaaaacaac gtc 17314274DNAArtificial sequenceUN2
14caagctctag caggaagaag aaakaagaag aagaagaaga agaagaagaa gcgtctcctc
60ttcttcttgt gagagtaaaa aakaaaactc ccaaaaaaaa kaaaatcatc aaaaaaacaa
120atttcaaaaa gagtttttgt gtttggggat taaagaakaa aaaaaacaac aggtgagtaa
180gcgcagttgt cgtctcttgc ggtgccgttg ctggttctca caccttttag gtctgttctc
240gtcttccgtt ctgactctct ctttttcgtt gcag
27415123DNAArtificial sequenceUN1dGAA 15ggcgtctcct cttcttcttg tgagagtaaa
aaataaaact cccaaaaaaa akaaaatcat 60caaaaaaaca aatttcaaaa agagtttttg
tgtttgggga ttaaagaaka aaaaaacaac 120gtc
12316257DNAArtificial sequenceUN2dGAA
16aaagtatcaa caaaaaagct tcgtctcctc ttcttcttgt gagagtaaaa aakaaaactc
60ccaaaaaaaa kaaaatcatc aaaaaaacaa atttcaaaaa gagtttttgt gtttgtaagt
120caggactcta gctttctact gtagtatcct ctaaaggact gctgttctgt gcaccccctt
180cctttgttta tcatagcgca cgacaagagt actaactaat taacttaggg ggattaaaga
240akaaaaaaaa caacaaa
25717173DNAArtificial sequenceR3 17caagctctag cacgtctcct cttcttcttg
tgagagtaaa aaakaaaact cccaaaaaaa 60akaaaatcat caaaaaaaca aatttcaaaa
agagtttttg tgtttgggga ttaaagaaka 120aaaaaaacaa ggaagaagaa akaagaagaa
gaagaagaag aagaagaagc ctc 17318384DNAArtificial sequencefUN1
18caagctctag caggaagaag aaataagaag aagaagaaga agaagaagaa gcgtctcctc
60ttcttcttgt gagagtaaaa aataaaactc ccaaaaaaaa gaaaatcatc aaaaaaacaa
120atttcaaaaa gagtttttgt gtttggggat taaagaataa aaaaaacaac gtctggacaa
180accacaacta gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct
240ttatttgtaa ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt
300atgtttcagg ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa
360tgtggtaaaa tcgataagga tccg
38419277DNAArtificial sequenceUN2-2 19caagctctag caggaagaag aaakaagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttgt gagagtaaaa aakaaaactc
ccaaaaaaaa gaaaatcatc aaaaaaacaa 120atttcaaaaa gagtttttgt gtttggggat
taaagaagaa aaaaaacaac aggtgagtaa 180gcgcagttgt cgtctcttgc ggtgccgttg
ctggttctca caccttttag gtctgttctc 240gtcttccgtt ctgactctct ctttttcgtt
gcaggcc 27720252DNAArtificial sequenceUN2-3
20aagctctagc aggaagaaga aakaagaaga agaagaagaa gaagaagaag cgtctcctct
60tcttcttgtg agagtaaaaa akaaaactcc caaaaaaaak aaaatcatca aaaaaacaaa
120tttcaaaaag agtaggtaag attatctctt cccaaaattg attacttttt tattgaacaa
180ttattaacca atcatggctt aacgaaaaac aggttttgtg tttggggatt aaagaakaaa
240aaaaacaaaa ca
25221254DNAArtificial sequenceUN2-4 21caagctctag caggaagaag aaakaagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttgt gagagtaaaa aakaaaactc
ccaaaaaaaa kaaaatcatc aaaaaaacaa 120atttcaaaaa gagtaggtaa gattatctct
tcccaaaatt gattactttt attattgaac 180aattactaac atttcatggc ttaacgaaaa
acaggttttg tgtttgggga ttaaagaaka 240aaaaaaacaa aaca
25422266DNAArtificial sequenceUN2-5
22caagctctag caggaagaag aaakaagaag aagaagaaga agaagaagaa gcgtctcctc
60ttcttcttgt gagagtaaaa aakaaaactc ccaaaaaaaa kaaaatcatc aaaaaaacaa
120atttcaaaaa gagtgaggta agattatcga tatttaaatt atttatttct tcttttccat
180ttttttggct aacattttcc atggttttat gatatcatgc aggtacgttt tgtgtttggg
240gattaaagaa kaaaaaaaac aaaaca
26623266DNAArtificial sequenceUN2-6 23caagctctag caggaagaag aaakaagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttgt gagagtaaaa aakaaaactc
ccaaaaaaaa kaaaatcatc aaaaaaacaa 120atttcaaaaa gagtgaggta agattatcga
tatttaaatt atttatttct tcttttccat 180ttttttggct aacattttcc taggttttat
tatatctagc aggtacgttt tgtgtttggg 240gattaaagaa kaaaaaaaac aaaaca
26624265DNAArtificial sequenceUN2-7
24aagctctagc aggaagaaga aakaagaaga agaagaagaa gaagaagaag cgtctcctct
60tcttcttgtg agagtaaaaa akaaaactcc caaaaaaaak aaaatcatca aaaaaacaaa
120tttcaaaaag agtgaggtaa gattatcgat atttaaatta tttatttctt cttttccatt
180tttttggcta acattttcct aggttttatt atatctagca ggtacgtttt gtgtttgggg
240attaaagaak aaaaaaaaca aaaca
26525266DNAArtificial sequenceUN2-8 25caagctctag caggaagaag aaakaagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttgt gagagtaaaa aakaaaactc
ccaaaaaaaa kaaaatcatc aaaaaaacaa 120atttcaaaaa gagtgaggta agattatcga
tatttaaatt atttatttct tcttttccat 180ttttttggct aacattttcc taggttttat
tatatctagc aggtacgttt tgtgtttggg 240gattaaagaa kaaaaaaaac aaaacc
26626287DNAArtificial sequenceUN2-9
26caagctctag caggaagaag aaakaagaag aagaagaaga agaagaagaa gcgtctcctc
60ttcttcttgt gagagtaaaa aakaaaactc ccaaaaaaaa kaaaatcatc aaaaaaacaa
120atttcaaaaa gagtttttgt gtttgtaagt caggactcta gctttctact gtagtatcct
180ctaaaggact gctgttctgt gcaccccctt cctttgttta tcatagcgca cgacaagagt
240actaactaat taacttaggg ggattaaaga akaaaaaaaa caacaaa
28727251DNAArtificial sequenceUN2-10 27caagctctag caggaagaag aaakaagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttgt gagagtaaaa aakaaaactc
ccaaaaaaaa kaaaatcatc aaaaaaacaa 120atttcaaaaa gagtttttgt gtttgggtaa
gtaattgcct tactcggaaa ataatcaatc 180atcatactaa cgcaagaggc gctgatattg
cggttataca gggattaaag aakaaaaaaa 240acaacgtcac c
25128143DNAArtificial sequenceUN1dGAA-2
28aaagtatcaa caaaaaagct tcgtctcctc ttcttcttgt gagagtaaaa aakaaaactc
60ccaaaaaaaa kaaaatcatc aaaaaaacaa atttcaaaaa gagtttttgt gtttggggat
120taaagaakaa aaaaaacaac aaa
14329123DNAArtificial sequenceUN1dGAA-3 29tcgtctcctc ttcttcttgt
gagagtaaaa aakaaaactc ccaaaaaaaa kaaaatcatc 60aaaaaaacaa atttcaaaaa
gagtttttgt gtttggggat taaagaakaa aaaaaacaac 120gtc
12330134DNAArtificial
sequenceUN1dGAA-4 30caagctctag cacgtctcct cttcttcttg tgagagtaaa
aaakaaaact cccaaaaaaa 60akaaaatcat caaaaaaaca aatttcaaaa agagtttttg
tgtttgggga ttaaagaaka 120aaaaaaacaa cgcc
13431140DNAArtificial sequenceUN1dGAA-5
31ggcaagctct agcacgtctc ctcttcttct tgtgagagta aaaaakaaaa ctcccaaaaa
60aaakaaaatc atcaaaaaaa caaatttcaa aaagagtttt tgtgtttggg gattaaagaa
120kaaaaaaaac aacctccacc
14032130DNAArtificial sequenceUN1dGAA-6 32caagcgtctc ctcttcttct
tgtgagagta aaaaakaaaa ctcccaaaaa aaakaaaatc 60atcaaaaaaa caaatttcaa
aaagagtttt tgtgtttggg gattaaagaa kaaaaaaaac 120aacctccacc
13033178DNAArtificial
sequenceUN2dGAA-2 33cgtctcctct tcttcttgtg agagtaaaaa akaaaactcc
caaaaaaaak aaaatcatca 60aaaaaacaaa tttcaaaaag agtttttgtg tttggggatt
aaagaakaaa aaaaacaacc 120tcgtgcgtgt tgccgattcg cgtacgaata cgccttgtgc
tgacacttct gtagcacc 17834238DNAArtificial sequenceUN2dGAA-3
34caagctctag cacgtctcct cttcttcttg tgagagtaaa aaakaaaact cccaaaaaaa
60akaaaatcat caaaaaaaca aatttcaaaa agagtttttg tgtttgggga ttaaagaaka
120aaaaaaacaa caggtgagta agcgcagttg tcgtctcttg cggtgccgtt gctggttctc
180acacctttta ggtctgttct cgtcttccgt tctgactctc tctttttcgt tgcaggcc
23835229DNAArtificial sequenceUN2dGAA-4 35atcaagctct agcacgtctc
ctcttcttct tgtgagagta aaaaakaaaa ctcccaaaaa 60aaakaaaatc atcaaaaaaa
caaatttcaa aaagagtgag gtaagattat cgatatttaa 120attatttatt tcttcttttc
catttttttg gctaacattt tcctaggttt tattatatct 180agcaggtacg ttttgtgttt
ggggattaaa gaakaaaaaa aacaaaaca 22936188DNAArtificial
sequenceUN1shuffle 36caagctctag cacgtctcct cttcttcttg tgagagtaaa
aaakaaaact cccaaaaaaa 60akaaaatcat caaaaaaaca aatttcaaaa agagtttttg
tgtttgggga ttaaagaaka 120aaaaaaacaa ggaagaagaa akaagaagaa gaagaagaag
aagaagaagg gcggccgccc 180ccttcacc
18837173DNAArtificial sequenceUN1shuffle-2
37caagctctag cacgtctcct cttcttcttg tgagagtaaa aaakaaaact cccaaaaaaa
60akaaaatcat caaaaaaaca aatttcaaaa agagtttttg tgtttgggga ttaaagaaka
120aaaaaaacaa ggaagaagaa akaagaagaa gaagaagaag aagaagaagc gcc
17338154DNAArtificial sequenceUN1shuffle-3 38gaagctctag cacgtctcct
cttcttcttg tgagagtaaa aaakaaaact cccaaaaaaa 60akaaaatcat caaaaaaaca
aatttcaaaa agagtttttg tgtttgggga ttaaagaaka 120aaaaaaacaa ggaagaagaa
gaagaagaag cgcc 15439179DNAArtificial
sequenceUN1shuffle-4 39ggcaagctct agcacgtctc ctcttcttct tgtgagagta
aaaaakaaaa ctcccaaaaa 60aaakaaaatc atcaaaaaaa caaatttcaa aaagagtttt
tgtgtttggg gattaaagaa 120kaaaaaaaac aaggaagaag aaakaagaag aagaagaaga
agaagaagaa gcctccacc 17940160DNAArtificial sequenceUN1shuffle-5
40ggcaagctct agcacgtctc ctcttcttct tgtgagagta aaaaataaaa ctcccaaaaa
60aaagaaaatc atcaaaaaaa caaatttcaa aaagagtttt tgtgtttggg gattaaagaa
120taaaaaaaac aaggaagaag aagaagaaga agcctccacc
16041177DNAArtificial sequenceUN1shuffle-6 41caagctctag cacgtctcct
cttcttcttg tgagagtaaa aaakaaaact cccaaaaaaa 60akaaaatcat caaaaaaaca
aatttcaaaa agagtttttg tgtttgggga ttaaagaaka 120aaaaaaacaa ggaagaagaa
akaagaagaa gaagaagaag aagaagaagc ctccacc 17742277DNAArtificial
sequenceUN2shuffle-1 42caagctctag cacgtctcct cttcttcttg tgagagtaaa
aaakaaaact cccaaaaaaa 60akaaaatcat caaaaaaaca aatttcaaaa agagtttttg
tgtttgggga ttaaagaaka 120aaaaaaacaa ggaagaagaa akaagaagaa gaagaagaag
aagaagaagc aggtgagtaa 180gcgcagttgt cgtctcttgc ggtgccgttg ctggttctca
caccttttag gtctgttctc 240gtcttccgtt ctgactctct ctttttcgtt gcaggcc
27743267DNAArtificial sequenceUN2shuffle-2
43atcaagctct agcacgtctc ctcttcttct tgtgagagta aaaaakaaaa ctcccaaaaa
60aaakaaaatc atcaaaaaaa caaatttcaa aaagagtgag gtaagattat cgatatttaa
120attatttatt tcttcttttc catttttttg gctaacattt tcctaggttt tattatatct
180agcaggtacg ttttgtgttt ggggattaaa gaakaaaaaa aacaaggaag aagaaakaag
240aagaagaaga agaagaagaa gaaaaca
26744264DNAArtificial sequenceCAA1 44caagctctac caccaagaac aaacaacaac
aacatatata aaacaacaac caccatctcc 60tcttcttctt gtcaactcca aaatcaaact
cccaaaaaaa agcaaatcat caaaagtgag 120gtaagattat cgatatttaa attatttatt
tcttcttttc catttttttg gctaacattt 180tcctaggttt tattatatct agcaggtacg
aaatttcaaa caacaacaac aaacaacaaa 240caacattaac atcatatcaa aacc
26445188DNAArtificial sequenceCAA2
45caacctctac caccaacaac aaacaacaac aacaacaaca acaacaacaa ccctctccac
60atctccctct cagagtaaaa aacaaaactc ccaaaaaaaa gaaaatcatc aaaaaaacaa
120atttcaaaaa gacttcttct cattccttat taaagaacaa aaaaaacaag gcggccgccc
180ccttcacc
18846207DNAArtificial sequenceCAA3 46gtatttttac aacaattacc aacaacaaca
aacaacaaac aacattacaa ttactattta 60caattacaag cgtctcctct tcttcttgtg
agagtaaaaa ataaaactcc caaaaaaaag 120aaaatcatca aaaaaacaaa tttcaaaaag
agtttttgtg tttggggatt aaagaataaa 180aaaaacaagg cggccgcccc cttcacc
20747188DNAArtificial sequenceCAA4
47caagctctac caccaagaac aaacaacaac aacatatata aaacaacaac caccatctcc
60tcttcttctt gtcaactcca aaatcaaact cccaaaaaaa agcaaatcat caaaaccaca
120aatttcaaac aacaacaaca aacaacaaac aacattaaca tcatatcaag gcggccgccc
180ccttcacc
18848175DNAArtificial sequenceCAA5 48caacctctac caccaacaac aaacaacaac
aacaacaaca acaacaacaa ccctctccac 60atctccctct cagagtaaaa aacaaaattg
acaaaaaaaa gattttataa taaaaacaaa 120tttcaaaaag aattcaactc attcaatatt
acaacaagaa caaaggaggt cacat 17549177DNAArtificial sequenceCAA6
49caagctctac caccaagaac aaacaacaac aacatatata aaacaacaac caccatctcc
60tcttcttctt gtcaactcca aaatcaaact cccaaaaaaa agcaaatcat caaaaccaca
120aatttcaaac aacaacaaca aacaacaaac aacattaaca tcatatcaac ctccacc
17750175DNAArtificial sequenceTATA1 50caagctctag caggaagaag aaataagaag
aagaagaaga agaagaagaa gcgtctcctc 60ttcttcttga cagagtaaaa aataactttt
ataataaaga aaatcatcaa aaaaacaaat 120ttcaaaaaga gtttttgtgt ttggggatta
aagaataaaa aaaaggaggt cacat 17551177DNAArtificial sequenceTATA2
51caagctctag caggaagaag aaataagaag aagtatataa aagaagaaga agcgtctcct
60cttcttcttg tgaagtaaaa aataaaactc ccaaaaaaaa gaaaatcatc aaaaaaacaa
120atttcaaaaa gagtttttgt gtttggggat taaagaataa aaaaaacaac ctccacc
17752315DNAArtificial sequencefragment 52ggagttccgc gttacataac ttacggtaaa
tggcccgcct ggctgaccgc ccaacgaccc 60ccgcccattg acgtcaataa tgacgtatgt
tcccatagta acgccaatag ggactttcca 120ttgacgtcaa tgggtggagt atttacggta
aactgcccac ttggcagtac atcaagtgta 180tcatatgcca agtacgcccc ctattgacgt
caatgacggt aaatggcccg cctggcatta 240tgcccagtac atgaccttat gggactttcc
tacttggcag tacatctacg tattagtcat 300cgctattacc atggt
31553303DNAArtificial sequencefragment
53ggcctccgcg ccgggttttg gcgcctcccg cgggcgcccc cctcctcacg gcgagcgctg
60ccacgtcaga cgaagggcgc agcgagcgtc ctgatccttc cgcccggacg ctcaggacag
120cggcccgctg ctcataagac tcggccttag aaccccagta tcagcagaag gacattttag
180gacgggactt gggtgactct agggcactgg ttttctttcc agagagcgga acaggcgagg
240aaaagtagtc ccttctcggc gattctgcgg agggatctcc gtggggcggt gaacgccgat
300gat
30354305DNAArtificial sequencefragment 54cgttacataa cttacggtaa atggcccgcc
tggctgaccg cccaacgacc cccgcccatt 60gacgtcaata atgacgtatg ttcccatagt
aacgccaata gggactttcc attgacgtca 120atgggtggag tatttacggt aaactgccca
cttggcagta catcaagtgt atcatatgcc 180aagtacgccc cctattgacg tcaatgacgg
taaatggccc gcctggcatt atgcccagta 240catgacctta tgggactttc ctacttggca
gtacatctac gtattagtca tcgctattac 300catga
305551428DNAArtificial sequencefragment
55cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc cccgcccatt
60gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc attgacgtca
120atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt atcatatgcc
180aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt atgcccagta
240catgacctta tgggactttc ctacttggca gtacatctac gtattagtca tcgctattac
300catgaattgg tttgatctga ttataaccta ggtcgaggaa ggtttcttca actcaaattc
360atccgcctga taattttctt atattttcct aaagaaggaa gagaagcgca tagaggagaa
420gggaaataat tttttaggag cctttcttac ggctatgagg aatttggggc tcagttgaaa
480agcctaaact gcctctcggg aggttgggcg cggcgaacta ctttcagcgg cgcacggaga
540cggcgtctac gtgaggggtg ataagtgacg caacactcgt tgcataaatt tgcctccgcc
600agcccggagc atttaggggc ggttggcttt gttgggtgag cttgtttgtg tccctgtggg
660tggacgtggt tggtgattgg caggatcctg gtatccgcta acaggtactg gcccgcagcc
720gtaacgacct tgggggggtg tgagaggggg gaatgggtga ggtcaaggtg gaggcttctt
780ggggttgggt gggccgctga ggggagggcg tgggggaggg gagggcgagg tgacgcggcg
840ctgggccttt ccgggacagt gggccttgtt gacctgaggg gggcgagggc ggttggcgcg
900cgcgggttga cggaaactaa cggacgccta accgatcggc gattctgtcg agtttacttc
960gcggggaagg cggaaaagag gtagtttgtg tggtttctgg aagcctttac tttggaatct
1020cagtgtgaga aaggtgcccc ttcttgtgtt tcaatgggat ttttatttcg cgagtcttgt
1080gggtttggtt ttgttttcag tttgcctaac accgtgctta ggtttgaggc agattggagt
1140tcggtcgggg gagtttgaat atccggaaca gttagtgggg aaagctgtgg acgattggta
1200agagagcgct ctggattttc cgctgttgac gttgaaacct tgaatgacga atttcgtatt
1260aagtgactta gccttgtaaa attgagggga ggcttgcgga atattaacgt atttaaggca
1320ttttgaagga atagttgcta attttgaaga atattaggtg taaaagcaag aaatacaatg
1380atcctgaggt gacacgctta tgttttactt ttaaactagg tcagcatg
1428561055DNAArtificial sequencefragment 56ggagttccgc gttacataac
ttacggtaaa tggcccgcct ggctgaccgc ccaacgaccc 60ccgcccattg acgtcaataa
tgacgtatgt tcccatagta acgccaatag ggactttcca 120ttgacgtcaa tgggtggagt
atttacggta aactgcccac ttggcagtac atcaagtgta 180tcatatgcca agtacgcccc
ctattgacgt caatgacggt aaatggcccg cctggcatta 240tgcccagtac atgaccttat
gggactttcc tacttggcag tacatctacg tattagtcat 300cgctattacc atggtcgagg
tgagccccac gttctgcttc actctcccca tctccccccc 360ctccccaccc ccaattttgt
atttatttat tttttaatta ttttgtgcag cgatgggggc 420gggggggggg gggggcgcgc
gccaggcggg gcggggcggg gcgaggggcg gggcggggcg 480aggcggagag gtgcggcggc
agccaatcag agcggcgcgc tccgaaagtt tccttttatg 540gcgaggcggc ggcggcggcg
gccctataaa aagcgaagcg cgcggcgggc gggagtcgct 600gcgcgctgcc ttcgccccgt
gccccgctcc gccgccgcct cgcgccgccc gccccggctc 660tgactgaccg cgttactaaa
acaggtaagt ccggcctccg cgccgggttt tggcgcctcc 720cgcgggcgcc cccctcctca
cggcgagcgc tgccacgtca gacgaagggc gcagcgagcg 780tcctgatcct tccgcccgga
cgctcaggac agcggcccgc tgctcataag actcggcctt 840agaaccccag tatcagcaga
aggacatttt aggacgggac ttgggtgact ctagggcact 900ggttttcttt ccagagagcg
gaacaggcga ggaaaagtag tcccttctcg gcgattctgc 960ggagggatct ccgtggggcg
gtgaacgccg atgatgcctc tactaaccat gttcatgttt 1020tctttttttt tctacaggtc
ctgggtgacg aacag 105557753DNAArtificial
sequenceCMV promoter 57tcaatattgg ccattagcca tattattcat tggttatata
gcataaatca atattggcta 60ttggccattg catacgttgt atctatatca taatatgtac
atttatattg gctcatgtcc 120aatatgaccg ccatgttggc attgattatt gactagttat
taatagtaat caattacggg 180gtcattagtt catagcccat atatggagtt ccgcgttaca
taacttacgg taaatggccc 240gcctggctga ccgcccaacg acccccgccc attgacgtca
ataatgacgt atgttcccat 300agtaacgcca atagggactt tccattgacg tcaatgggtg
gagtatttac ggtaaactgc 360ccacttggca gtacatcaag tgtatcatat gccaagtccg
ccccctattg acgtcaatga 420cggtaaatgg cccgcctggc attatgccca gtacatgacc
ttacgggact ttcctacttg 480gcagtacatc tacgtattag tcatcgctat taccatagtg
atgcggtttt ggcagtacac 540caatgggcgt ggatagcggt ttgactcacg gggatttcca
agtctccacc ccattgacgt 600caatgggagt ttgttttggc accaaaatca acgggacttt
ccaaaatgtc gtaataaccc 660cgccccgttg acgcaaatgg gcggtaggcg tgtacggtgg
gaggtctata taagcagagc 720tcgtttagtg aaccgtcaga tcactagaag ctt
7535869DNAArtificial sequenceminimal CMV promoter
58taggcgtgta cggtgggagg tctatataag cagagctcgt ttagtgaacc gtcagatcac
60tagaagctt
69592248DNAArtificial sequenceEEE1-Xt 59tggtgaccct gtctcaaaaa accctcaaaa
agtgttggga ttagtggcat gcaccaccat 60tcccaccaaa ggtttatttt taataatatg
tgtgtgagtg tgtatcacta tgagtatatg 120tcaatatgtg tcaatgtccc cagggacatt
taaagagccc ctgaagctgg agtcataggc 180cattatgaac tgcctgacat ggctaatggg
aattgaactc agattttctg gaagttatac 240ctgctcttac tgctgagcca tgtctctgaa
gaccccaggg attttttttt ttttttgaga 300caggtatttt ctgtatagcc ctggctgtcc
tgaaagcact ctctatatgt agaccaggct 360tgcctggagc ttggatatgc acctgcttct
gcctcaggaa tggtgggatt gaaggtgtgc 420accaccacat ccgctaacat gcacaattct
taatgggttt atatcttatt taatgaatga 480aaggtttggg ggatggatgt agcttaatgg
aaaatgactg aagatttcaa ttaaaaatct 540ggggcttagc tgcgcggtgg gtggtgcctg
cctttagtcc cagtactggg gaggcagagg 600aaggaggatc tctgtgagtt cgaggccagc
tggtctataa cgtgagttcc aggacagcca 660gagatacaca gacaaaccct gtctcaccaa
aacaaaacaa caacaacaac aacaaatctg 720ggacgtaggc ttggtgtggt ggcacacatt
ttgattccag cacttggaag gaagaggcct 780gcatggtcta catagcttgt ttcaggcaac
cagagctaca tagtgagatc ctgtctcaac 840aaaaataaaa taatctaagg cttcaaaggg
ttcaatctct taggtagcta aatatgaaca 900aaatttggga aatgtgacct tttccttagt
gacagtcaga tagaaccttc tcgagtgcaa 960ggacaccaag tgcaaacagg ctcaagaaca
gcctggaaag gtctagtgct atggggcttc 1020aggtcgaatg ccaactgttt tcaagaactg
tgtggatttt tctgcctgta acgaattcag 1080attcattttt caaaactcgg ggagagtttt
ccccctttat aatttttttt ttaaatttat 1140taaactttgt ttcgttcccc ttgttttgag
aattgcagag tcatccaccc tgtcacagtg 1200ccagggagct cagggatggg cccaggggcc
tggcggggct gaaggggctg gggaagcgag 1260ggctccaaag ggaccccagt gtggcaggag
ccaaagccct aggtccctag aacgcagagg 1320ccaccgggac cccccagacg gggtaagcgg
gtgggtgtct ggggcgcgaa gccgcactgc 1380gcatgcgccg aggtccgctc cggccgcgct
gatccaagcc gggttctcgc gccgacctgg 1440tcgtgattga caagtcacac acgctgatcc
ctccgcgggg ccgcacaggg tcacagcctt 1500tcccctcccc acaaagcccc ctactctctg
ggcaccacac acgaacattc cttgagcgtg 1560accttgttgg ctctagtcag gcgcctccgg
tgcagagact ggaacggcct tgggaagtag 1620tccctaaccg catttccgcg gagggatcgt
cgggagggcg tggcttctga ggattatata 1680aggcgactcc gggcgggtct tagctagttc
cgtcggagac ccgagttcag tcgccgcttc 1740tctgtgagga ctgctgccgc cgccgctggt
gaggagaagc cgccgcgctt ggcgtagctg 1800agagacgggg agggggcgcg gacacgaggg
gcagcccgcg gcctggacgt tctgtttccg 1860tggcccgcga ggaaggcgac tgtcctgagg
cggaggaccc agcggcaaga tggcggccaa 1920gtggaagcct gaggggatag gcgagcggcc
ctgaggcgct cgacggggtt gggggggaag 1980caggcccgcg aggcagctgc agccgggaac
gtgcggccaa ccccttattt tttttgacgg 2040gttgcgggcc gtaggtgcct ccgaagtgag
agccgtgggc gtttgactgt cgggagaggt 2100cggtcggatt ttcatccgtt gctaaagacg
gaagtgcgac tgagacggga agggggggga 2160gtcggttggt ggcggttgaa cctggactaa
ggcgcacatg acgtcgcggt ttctatgggc 2220tcataatggg tggtgaggac atttccct
2248601159DNAArtificial sequenceEEE1-80
60tcaaaactcg gggagagttt tcccccttta taattttttt tttaaattta ttaaactttg
60tttcgttccc cttgttttga gaattgcaga gtcatccacc ctgtcacagt gccagggagc
120tcagggatgg gcccaggggc ctggcggggc tgaaggggct ggggaagcga gggctccaaa
180gggaccccag tgtggcagga gccaaagccc taggtcccta gaacgcagag gccaccggga
240ccccccagac ggggtaagcg ggtgggtgtc tggggcgcga agccgcactg cgcatgcgcc
300gaggtccgct ccggccgcgc tgatccaagc cgggttctcg cgccgacctg gtcgtgattg
360acaagtcaca cacgctgatc cctccgcggg gccgcacagg gtcacagcct ttcccctccc
420cacaaagccc cctactctct gggcaccaca cacgaacatt ccttgagcgt gaccttgttg
480gctctagtca ggcgcctccg gtgcagagac tggaacggcc ttgggaagta gtccctaacc
540gcatttccgc ggagggatcg tcgggagggc gtggcttctg aggattatat aaggcgactc
600cgggcgggtc ttagctagtt ccgtcggaga cccgagttca gtcgccgctt ctctgtgagg
660actgctgccg ccgccgctgg tgaggagaag ccgccgcgct tggcgtagct gagagacggg
720gagggggcgc ggacacgagg ggcagcccgc ggcctggacg ttctgtttcc gtggcccgcg
780aggaaggcga ctgtcctgag gcggaggacc cagcggcaag atggcggcca agtggaagcc
840tgaggggata ggcgagcggc cctgaggcgc tcgacggggt tgggggggaa gcaggcccgc
900gaggcagctg cagccgggaa cgtgcggcca accccttatt ttttttgacg ggttgcgggc
960cgtaggtgcc tccgaagtga gagccgtggg cgtttgactg tcgggagagg tcggtcggat
1020tttcatccgt tgctaaagac ggaagtgcga ctgagacggg aagggggggg agtcggttgg
1080tggcggttga acctggacta aggcgcacat gacgtcgcgg tttctatggg ctcataatgg
1140gtggtgagga catttccct
115961869DNAArtificial sequenceEEE1-60 61cgcatgcgcc gaggtccgct ccggccgcgc
tgatccaagc cgggttctcg cgccgacctg 60gtcgtgattg acaagtcaca cacgctgatc
cctccgcggg gccgcacagg gtcacagcct 120ttcccctccc cacaaagccc cctactctct
gggcaccaca cacgaacatt ccttgagcgt 180gaccttgttg gctctagtca ggcgcctccg
gtgcagagac tggaacggcc ttgggaagta 240gtccctaacc gcatttccgc ggagggatcg
tcgggagggc gtggcttctg aggattatat 300aaggcgactc cgggcgggtc ttagctagtt
ccgtcggaga cccgagttca gtcgccgctt 360ctctgtgagg actgctgccg ccgccgctgg
tgaggagaag ccgccgcgct tggcgtagct 420gagagacggg gagggggcgc ggacacgagg
ggcagcccgc ggcctggacg ttctgtttcc 480gtggcccgcg aggaaggcga ctgtcctgag
gcggaggacc cagcggcaag atggcggcca 540agtggaagcc tgaggggata ggcgagcggc
cctgaggcgc tcgacggggt tgggggggaa 600gcaggcccgc gaggcagctg cagccgggaa
cgtgcggcca accccttatt ttttttgacg 660ggttgcgggc cgtaggtgcc tccgaagtga
gagccgtggg cgtttgactg tcgggagagg 720tcggtcggat tttcatccgt tgctaaagac
ggaagtgcga ctgagacggg aagggggggg 780agtcggttgg tggcggttga acctggacta
aggcgcacat gacgtcgcgg tttctatggg 840ctcataatgg gtggtgagga catttccct
86962724DNAArtificial sequenceEEE1-50
62tctctgggca ccacacacga acattccttg agcgtgacct tgttggctct agtcaggcgc
60ctccggtgca gagactggaa cggccttggg aagtagtccc taaccgcatt tccgcggagg
120gatcgtcggg agggcgtggc ttctgaggat tatataaggc gactccgggc gggtcttagc
180tagttccgtc ggagacccga gttcagtcgc cgcttctctg tgaggactgc tgccgccgcc
240gctggtgagg agaagccgcc gcgcttggcg tagctgagag acggggaggg ggcgcggaca
300cgaggggcag cccgcggcct ggacgttctg tttccgtggc ccgcgaggaa ggcgactgtc
360ctgaggcgga ggacccagcg gcaagatggc ggccaagtgg aagcctgagg ggataggcga
420gcggccctga ggcgctcgac ggggttgggg gggaagcagg cccgcgaggc agctgcagcc
480gggaacgtgc ggccaacccc ttattttttt tgacgggttg cgggccgtag gtgcctccga
540agtgagagcc gtgggcgttt gactgtcggg agaggtcggt cggattttca tccgttgcta
600aagacggaag tgcgactgag acgggaaggg gggggagtcg gttggtggcg gttgaacctg
660gactaaggcg cacatgacgt cgcggtttct atgggctcat aatgggtggt gaggacattt
720ccct
724631449DNAArtificial sequenceEEE1-SL 63tttcaggcaa ccagagctac atagtgagat
cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg gttcaatctc ttaggtagct
aaatatgaac aaaatttggg aaatgtgacc 120ttttccttag tgacagtcag atagaacctt
ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac agcctggaaa ggtctagtgc
tatggggctt caggtcgaat gccaactgtt 240ttcaagaact gtgtggattt ttctgcctgt
aacgaattca gattcatttt tcaaaactcg 300gggagagttt tcccccttta taattttttt
tttaaattta ttaaactttg tttcgttccc 360cttgttttga gaattgcaga gtcatccacc
ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc ctggcggggc tgaaggggct
ggggaagcga gggctccaaa gggaccccag 480tgtggcagga gccaaagccc taggtcccta
gaacgcagag gccaccggga ccccccagac 540ggggaaagcg gttgggtgtc tggggcgcga
agccgcactg cgcatgcgcc gaggtccgct 600ccggccgcgc tgatccaagc cgggttctcg
cgccgacctg gtcgtgattg acaagtcaca 660cacgctgatc cctccgcggg gccgcacagg
gtcacagcct ttcccctccc cacaaagccc 720cctactctct gggcaccaca cacgaacatt
ccttgagcgt gaccttgttg gctctagtca 780ggcgcctccg gtgcagagac tggaacggcc
ttgggaagta gtccctaacc gcatttccgc 840ggagggatcg tcgggagggc gtggcttctg
aggattatat aaggcgactc cgggcgggtc 900ttagctagtt ccgtcggaga cccgagttca
gtcgccgctt ctctgtgagg actgctgccg 960ccgccgctgc tgaggagaag ccgccgcgct
tggcgtagct gagagacggg gagggggcgc 1020ggacacgagg ggcagcccgc ggcctggacg
ttctgtttcc gtggcccgcg aggaaggcga 1080ctgtcctgag gcggaggacc cagcggcaag
atggcggcca agtggaagcc tgaggggata 1140ggcgagcggc cctgaggcgc tcgacggggt
tgggggggaa gcaggcccgc gaggcagctg 1200cagccgggaa cgtgcggcca accccttatt
ttttttgacg ggttgcgggc cgtaggtgcc 1260tccgaattga gagccgtggg cgtttgactg
tcgggagagg tcggtcggat tttcatccgt 1320tgctaaagac ggaagtgcga ctgagacggg
aagggggggg agtcggttgg tggcggttga 1380acctggacta aggcgcacat gacgtcgcgg
tttctatggg ctcataatgg gtggtgagga 1440catttccct
14496433DNAArtificial sequenceHC RACE
PRIMER 64gctggtgccc aggtccttag cgcaatagta cac
33654128DNAArtificial sequenceLight chain vector sequence without
the coding sequence 65tgcaggcggc cgctttcagg caaccagagc tacatagtga
gatcctgtct caacaaaaat 60aaaataatct aaggcttcaa agggttcaat ctcttaggta
gctaaatatg aacaaaattt 120gggaaatgtg accttttcct tagtgacagt cagatagaac
cttctcgagt gcaaggacac 180caagtgcaaa caggctcaag aacagcctgg aaaggtctag
tgctatgggg cttcaggtcg 240aatgccaact gttttcaaga actgtgtgga tttttctgcc
tgtaacgaat tcagattcat 300ttttcaaaac tcggggagag ttttccccct ttataatttt
ttttttaaat ttattaaact 360ttgtttcgtt ccccttgttt tgagaattgc agagtcatcc
accctgtcac agtgccaggg 420agctcaggga tgggcccagg ggcctggcgg ggctgaaggg
gctggggaag cgagggctcc 480aaagggaccc cagtgtggca ggagccaaag ccctaggtcc
ctagaacgca gaggccaccg 540ggacccccca gacggggtaa gcgggtgggt gtctggggcg
cgaagccgca ctgcgcatgc 600gccgaggtcc gctccggccg cgctgatcca agccgggttc
tcgcgccgac ctggtcgtga 660ttgacaagtc acacacgctg atccctccgc ggggccgcac
agggtcacag cctttcccct 720ccccacaaag ccccctactc tctgggcacc acacacgaac
attccttgag cgtgaccttg 780ttggctctag tcaggcgcct ccggtgcaga gactggaacg
gccttgggaa gtagtcccta 840accgcatttc cgcggaggga tcgtcgggag ggcgtggctt
ctgaggatta tataaggcga 900ctccgggcgg gtcttagcta gttccgtcgg agacccgagt
tcagtcgccg cttctctgtg 960aggactgctg ccgccgccgc tggtgaggag aagccgccgc
gcttggcgta gctgagagac 1020ggggaggggg cgcggacacg aggggcagcc cgcggcctgg
acgttctgtt tccgtggccc 1080gcgaggaagg cgactgtcct gaggcggagg acccagcggc
aagatggcgg ccaagtggaa 1140gcctgagggg ataggcgagc ggccctgagg cgctcgacgg
ggttgggggg gaagcaggcc 1200cgcgaggcag ctgcagccgg gaacgtgcgg ccaacccctt
attttttttg acgggttgcg 1260ggccgtaggt gcctccgaag tgagagccgt gggcgtttga
ctgtcgggag aggtcggtcg 1320gattttcatc cgttgctaaa gacggaagtg cgactgagac
gggaaggggg gggagtcggt 1380tggtggcggt tgaacctgga ctaaggcgca catgacgtcg
cggtttctat gggctcataa 1440tgggtggtga ggacatttcc ctgtttaaac ttaaacaagt
ttgtacaaaa aagcaggcta 1500gatcttcaat attggccatt agccatatta ttcattggtt
atatagcata aatcaatatt 1560ggctattggc cattgcatac gttgtatcta tatcataata
tgtacattta tattggctca 1620tgtccaatat gaccgccatg ttggcattga ttattgacta
gttattaata gtaatcaatt 1680acggggtcat tagttcatag cccatatatg gagttccgcg
ttacataact tacggtaaat 1740ggcccgcctg gctgaccgcc caacgacccc cgcccattga
cgtcaataat gacgtatgtt 1800cccatagtaa cgccaatagg gactttccat tgacgtcaat
gggtggagta tttacggtaa 1860actgcccact tggcagtaca tcaagtgtat catatgccaa
gtccgccccc tattgacgtc 1920aatgacggta aatggcccgc ctggcattat gcccagtaca
tgaccttacg ggactttcct 1980acttggcagt acatctacgt attagtcatc gctattacca
tagtgatgcg gttttggcag 2040tacaccaatg ggcgtggata gcggtttgac tcacggggat
ttccaagtct ccaccccatt 2100gacgtcaatg ggagtttgtt ttggcaccaa aatcaacggg
actttccaaa atgtcgtaat 2160aaccccgccc cgttgacgca aatgggcggt aggcgtgtac
ggtgggaggt ctatataagc 2220agagctcgtt tagtgaaccg tcagatcact agaagcttaa
tacgactcac tatagggaga 2280cccaagctgg ctagcgttta aacgggccct ctagtaacgg
ccgccagtgt gctggaattc 2340ggcttaactc tagaccatgg ggcgcgccgg ttcagcctcg
actgtgcctt ctagttgcca 2400gccatctgtt gtttgcccct cccccgtgcc ttccttgacc
ctggaaggtg ccactcccac 2460tgtcctttcc taataaaatg aggaaattgc atcgcattgt
ctgagtaggt gtcattctat 2520tctggggggt ggggtggggc aggacagcaa gggggaggat
tgggaagaca atagcaggca 2580tgctggggat gcggtgggct ctatggcttc tgaggcggaa
agaaccagct ggatccatcc 2640gttagatatc tgtggaatgt gtgtcagtta gggtgtggaa
agtccccagg ctccccagca 2700ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa
ccaggtgtgg aaagtcccca 2760ggctccccag caggcagaag tatgcaaagc atgcatctca
attagtcagc aaccatagtc 2820ccgcccctaa ctccgcccat cccgccccta actccgccca
gttccgccca ttctccggcc 2880catgcctgac taattttttt tatttatgca gaggccgagg
ccgcctctgc ctctgagcta 2940ttccagaagt agtgaggagg cttttttgga ggcctaggct
tttgcaaaaa gctcccggga 3000gcttgtatat ccattttcgg atctgatcaa gagacaggat
gaggatcgtt tcacatgatt 3060gaacaagatg gattgcacgc aggttctccg gccgcttggg
tggagaggct attcggctat 3120gactgggcac aacagacaat cggctgctct gatgccgccg
tgttccggct gtcagcgcag 3180gggcgcccgg ttctttttgt caagaccgac ctgtccggtg
ccctgaatga actgcaggac 3240gaggcagcgc ggctatcgtg gctggccacg acgggcgttc
cttgcgcagc tgtgctcgac 3300gttgtcactg aagcgggaag ggactggctg ctattgggcg
aagtgccggg gcaggatctc 3360ctgtcatctc accttgctcc tgccgagaaa gtatccatca
tggctgatgc aatgcggcgg 3420ctgcatacgc ttgatccggc tacctgccca ttcgaccacc
aagcgaaaca tcgcatcgag 3480cgagcacgta ctcggatgga agccggtctt gtcgatcagg
atgatctgga cgaagagcat 3540caggggctcg cgccagccga actgttcgcc aggctcaagg
cgcgcatgcc cgacggcgag 3600gatctcgtcg tgacacatgg cgatgcctgc ttgccgaata
tcatggtgga aaatggccgc 3660ttttctggat tcatcgactg tggccggctg ggtgtggcgg
accgctatca ggacatagcg 3720ttggctaccc gtgatattgc tgaagagctt ggcggcgaat
gggctgaccg cttcctcgtg 3780ctttacggta tcgccgctcc cgattcgcag cgcatcgcct
tctatcgcct tcttgacgag 3840ttcttctagg taccacgaga tttcgattcc accgccgcct
tctatgaaag gttgggcttc 3900ggaatcgttt tccgggacgc cggctggatg atcctccagc
gcggggatct catgctggag 3960ttcttcgccc accccaactt gtttattgca gcttataatg
gttacaaata aagcaatagc 4020atcacaaatt tcacaaataa agcatttttt tcactgcatt
ctagttgtgg tttgtccaaa 4080ctcatcaatg tatcttatca tgtctcaggt tgatgagcat
attttacc 4128662647DNAArtificial sequenceHeavy chain
vector sequence without the coding sequence 66tgcaggcggc cgctttcagg
caaccagagc tacatagtga gatcctgtct caacaaaaat 60aaaataatct aaggcttcaa
agggttcaat ctcttaggta gctaaatatg aacaaaattt 120gggaaatgtg accttttcct
tagtgacagt cagatagaac cttctcgagt gcaaggacac 180caagtgcaaa caggctcaag
aacagcctgg aaaggtctag tgctatgggg cttcaggtcg 240aatgccaact gttttcaaga
actgtgtgga tttttctgcc tgtaacgaat tcagattcat 300ttttcaaaac tcggggagag
ttttccccct ttataatttt ttttttaaat ttattaaact 360ttgtttcgtt ccccttgttt
tgagaattgc agagtcatcc accctgtcac agtgccaggg 420agctcaggga tgggcccagg
ggcctggcgg ggctgaaggg gctggggaag cgagggctcc 480aaagggaccc cagtgtggca
ggagccaaag ccctaggtcc ctagaacgca gaggccaccg 540ggacccccca gacggggtaa
gcgggtgggt gtctggggcg cgaagccgca ctgcgcatgc 600gccgaggtcc gctccggccg
cgctgatcca agccgggttc tcgcgccgac ctggtcgtga 660ttgacaagtc acacacgctg
atccctccgc ggggccgcac agggtcacag cctttcccct 720ccccacaaag ccccctactc
tctgggcacc acacacgaac attccttgag cgtgaccttg 780ttggctctag tcaggcgcct
ccggtgcaga gactggaacg gccttgggaa gtagtcccta 840accgcatttc cgcggaggga
tcgtcgggag ggcgtggctt ctgaggatta tataaggcga 900ctccgggcgg gtcttagcta
gttccgtcgg agacccgagt tcagtcgccg cttctctgtg 960aggactgctg ccgccgccgc
tggtgaggag aagccgccgc gcttggcgta gctgagagac 1020ggggaggggg cgcggacacg
aggggcagcc cgcggcctgg acgttctgtt tccgtggccc 1080gcgaggaagg cgactgtcct
gaggcggagg acccagcggc aagatggcgg ccaagtggaa 1140gcctgagggg ataggcgagc
ggccctgagg cgctcgacgg ggttgggggg gaagcaggcc 1200cgcgaggcag ctgcagccgg
gaacgtgcgg ccaacccctt attttttttg acgggttgcg 1260ggccgtaggt gcctccgaag
tgagagccgt gggcgtttga ctgtcgggag aggtcggtcg 1320gattttcatc cgttgctaaa
gacggaagtg cgactgagac gggaaggggg gggagtcggt 1380tggtggcggt tgaacctgga
ctaaggcgca catgacgtcg cggtttctat gggctcataa 1440tgggtggtga ggacatttcc
ctgtttaaac ttaaacaagt ttgtacaaaa aagcaggcta 1500gatcttcaat attggccatt
agccatatta ttcattggtt atatagcata aatcaatatt 1560ggctattggc cattgcatac
gttgtatcta tatcataata tgtacattta tattggctca 1620tgtccaatat gaccgccatg
ttggcattga ttattgacta gttattaata gtaatcaatt 1680acggggtcat tagttcatag
cccatatatg gagttccgcg ttacataact tacggtaaat 1740ggcccgcctg gctgaccgcc
caacgacccc cgcccattga cgtcaataat gacgtatgtt 1800cccatagtaa cgccaatagg
gactttccat tgacgtcaat gggtggagta tttacggtaa 1860actgcccact tggcagtaca
tcaagtgtat catatgccaa gtccgccccc tattgacgtc 1920aatgacggta aatggcccgc
ctggcattat gcccagtaca tgaccttacg ggactttcct 1980acttggcagt acatctacgt
attagtcatc gctattacca tagtgatgcg gttttggcag 2040tacaccaatg ggcgtggata
gcggtttgac tcacggggat ttccaagtct ccaccccatt 2100gacgtcaatg ggagtttgtt
ttggcaccaa aatcaacggg actttccaaa atgtcgtaat 2160aaccccgccc cgttgacgca
aatgggcggt aggcgtgtac ggtgggaggt ctatataagc 2220agagctcgtt tagtgaaccg
tcagatcact agaagcttaa tacgactcac tatagggaga 2280cccaagctgg ctagcgttta
aacgggccct ctagtaacgg ccgccagtgt gctggaattc 2340ggcttaactc tagaccatgg
ggcgcgccgg ttcagcctcg actgtgcctt ctagttgcca 2400gccatctgtt gtttgcccct
cccccgtgcc ttccttgacc ctggaaggtg ccactcccac 2460tgtcctttcc taataaaatg
aggaaattgc atcgcattgt ctgagtaggt gtcattctat 2520tctggggggt ggggtggggc
aggacagcaa gggggaggat tgggaagaca atagcaggca 2580tgctggggat gcggtgggct
ctatggcttc tgaggcggaa agaaccagct ggatccatcc 2640gttagat
2647
SEQ ID NO: 67; length: 235; type: PRT; organism: Artificial sequence; description: HuMab1 protein light chain
Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly 16
Val His Ser Ala Gln Asp Ile Gln Met Thr Gln Ser Pro Ser Ser Val 32
Ser Ala Ser Val Gly Asp Arg Val Thr Ile Thr Cys Arg Ala Ser Gln 48
Gly Ile Ser Ser Trp Leu Ala Trp Tyr Gln Gln Lys Pro Gly Lys Ala 64
Pro Lys Leu Leu Ile Tyr Ala Ala Ser Ser Leu Gln Ser Gly Val Pro 80
Ser Arg Phe Ser Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile 96
Ser Ser Leu Gln Pro Glu Asp Phe Ala Thr Tyr Tyr Cys Gln Gln Ala 112
Asn Asn Phe Pro Leu Thr Phe Gly Gly Gly Thr Lys Val Glu Ile Lys 128
Arg Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu 144
Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe 160
Tyr Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln 176
Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser 192
Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu 208
Lys His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser 224
Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys 235
SEQ ID NO: 68; length: 474; type: PRT; organism: Artificial sequence; description: HuMab1 protein heavy chain
68Met Gly Trp Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly1
5 10 15Val His Ser Glu Val Gln
Leu Leu Glu Ser Gly Gly Gly Leu Val Gln 20 25
30Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly
Phe Thr Phe 35 40 45Ser Asn Tyr
Ala Met Ser Trp Val Arg Gln Ala Pro Gly Lys Gly Leu 50
55 60Glu Trp Val Ser Ala Ile Ser Ala Ser Gly His Ser
Thr Tyr Leu Ala65 70 75
80Asp Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Asn Ser Lys Asn
85 90 95Thr Leu Tyr Leu Gln Met
Asn Ser Leu Arg Ala Glu Asp Thr Ala Val 100
105 110Tyr Tyr Cys Ala Lys Asp Arg Glu Val Thr Met Ile
Val Val Leu Asn 115 120 125Gly Gly
Phe Asp Tyr Trp Gly Gln Gly Thr Arg Val Thr Val Ser Ser 130
135 140Ala Ser Thr Lys Gly Pro Ser Val Phe Pro Leu
Ala Pro Ser Ser Lys145 150 155
160Ser Thr Ser Gly Gly Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr
165 170 175Phe Pro Glu Pro
Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser 180
185 190Gly Val His Thr Phe Pro Ala Val Leu Gln Ser
Ser Gly Leu Tyr Ser 195 200 205Leu
Ser Ser Val Val Thr Val Pro Ser Ser Ser Leu Gly Thr Gln Thr 210
215 220Tyr Ile Cys Asn Val Asn His Lys Pro Ser
Asn Thr Lys Val Asp Lys225 230 235
240Arg Val Glu Pro Lys Ser Cys Asp Lys Thr His Thr Cys Pro Pro
Cys 245 250 255Pro Ala Pro
Glu Leu Leu Gly Gly Pro Ser Val Phe Leu Phe Pro Pro 260
265 270Lys Pro Lys Asp Thr Leu Met Ile Ser Arg
Thr Pro Glu Val Thr Cys 275 280
285Val Val Val Asp Val Ser His Glu Asp Pro Glu Val Lys Phe Asn Trp 290
295 300Tyr Val Asp Gly Val Glu Val His
Asn Ala Lys Thr Lys Pro Arg Glu305 310
315 320Glu Gln Tyr Asn Ser Thr Tyr Arg Val Val Ser Val
Leu Thr Val Leu 325 330
335His Gln Asp Trp Leu Asn Gly Lys Glu Tyr Lys Cys Lys Val Ser Asn
340 345 350Lys Ala Leu Pro Ala Pro
Ile Glu Lys Thr Ile Ser Lys Ala Lys Gly 355 360
365Gln Pro Arg Glu Pro Gln Val Tyr Thr Leu Pro Pro Ser Arg
Glu Glu 370 375 380Met Thr Lys Asn Gln
Val Ser Leu Thr Cys Leu Val Lys Gly Phe Tyr385 390
395 400Pro Ser Asp Ile Ala Val Glu Trp Glu Ser
Asn Gly Gln Pro Glu Asn 405 410
415Asn Tyr Lys Thr Thr Pro Pro Val Leu Asp Ser Asp Gly Ser Phe Phe
420 425 430Leu Tyr Ser Lys Leu
Thr Val Asp Lys Ser Arg Trp Gln Gln Gly Asn 435
440 445Val Phe Ser Cys Ser Val Met His Glu Ala Leu His
Asn His Tyr Thr 450 455 460Gln Lys Ser
Leu Ser Leu Ser Pro Gly Lys465 470
SEQ ID NO: 69; length: 233; type: PRT; organism: Artificial sequence; description: HuMab2 protein light chain
Sequence 69:
Met Gly Trp Ser Cys Ile Ile Leu Phe
Leu Val Ala Thr Ala Thr Gly1 5 10
15Val His Ser Asp Ile Gln Met Thr Gln Ser Pro Ser Ser Leu Ser
Ala 20 25 30Ser Val Gly Asp
Arg Val Thr Ile Thr Cys Arg Ala Ser Gln Asp Val 35
40 45Asn Thr Ala Val Ala Trp Tyr Gln Gln Lys Pro Gly
Lys Ala Pro Lys 50 55 60Leu Leu Ile
Tyr Ser Ala Ser Phe Leu Tyr Ser Gly Val Pro Ser Arg65 70
75 80Phe Ser Gly Ser Arg Ser Gly Thr
Asp Phe Thr Leu Thr Ile Ser Ser 85 90
95Leu Gln Pro Glu Asp Phe Ala Thr Tyr Tyr Cys Gln Gln His
Tyr Thr 100 105 110Thr Pro Pro
Thr Phe Gly Gln Gly Thr Lys Val Glu Ile Lys Arg Thr 115
120 125Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro
Ser Asp Glu Gln Leu 130 135 140Lys Ser
Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe Tyr Pro145
150 155 160Arg Glu Ala Lys Val Gln Trp
Lys Val Asp Asn Ala Leu Gln Ser Gly 165
170 175Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys
Asp Ser Thr Tyr 180 185 190Ser
Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu Lys His 195
200 205Lys Val Tyr Ala Cys Glu Val Thr His
Gln Gly Leu Ser Ser Pro Val 210 215
220Thr Lys Ser Phe Asn Arg Gly Glu Cys225
230
SEQ ID NO: 70; length: 470; type: PRT; organism: Artificial sequence; description: HuMab2 protein heavy chain
Sequence 70:
Met Gly Trp
Ser Cys Ile Ile Leu Phe Leu Val Ala Thr Ala Thr Gly1 5
10 15Val His Ser Glu Val Gln Leu Val Glu
Ser Gly Gly Gly Leu Val Gln 20 25
30Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Ala Ser Gly Phe Asn Ile
35 40 45Lys Asp Thr Tyr Ile His Trp
Val Arg Gln Ala Pro Gly Lys Gly Leu 50 55
60Glu Trp Val Ala Arg Ile Tyr Pro Thr Asn Gly Tyr Thr Arg Tyr Ala65
70 75 80Asp Ser Val Lys
Gly Arg Phe Thr Ile Ser Ala Asp Thr Ser Lys Asn 85
90 95Thr Ala Tyr Leu Gln Met Asn Ser Leu Arg
Ala Glu Asp Thr Ala Val 100 105
110Tyr Tyr Cys Ser Arg Trp Gly Gly Asp Gly Phe Tyr Ala Met Asp Tyr
115 120 125Trp Gly Gln Gly Thr Leu Val
Thr Val Ser Ser Ala Ser Thr Lys Gly 130 135
140Pro Ser Val Phe Pro Leu Ala Pro Ser Ser Lys Ser Thr Ser Gly
Gly145 150 155 160Thr Ala
Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val
165 170 175Thr Val Ser Trp Asn Ser Gly
Ala Leu Thr Ser Gly Val His Thr Phe 180 185
190Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser
Val Val 195 200 205Thr Val Pro Ser
Ser Ser Leu Gly Thr Gln Thr Tyr Ile Cys Asn Val 210
215 220Asn His Lys Pro Ser Asn Thr Lys Val Asp Lys Lys
Val Glu Pro Pro225 230 235
240Lys Ser Cys Asp Lys Thr His Thr Cys Pro Pro Cys Pro Ala Pro Glu
245 250 255Leu Leu Gly Gly Pro
Ser Val Phe Leu Phe Pro Pro Lys Pro Lys Asp 260
265 270Thr Leu Met Ile Ser Arg Thr Pro Glu Val Thr Cys
Val Val Val Asp 275 280 285Val Ser
His Glu Asp Pro Glu Val Lys Phe Asn Trp Tyr Val Asp Gly 290
295 300Val Glu Val His Asn Ala Lys Thr Lys Pro Arg
Glu Glu Gln Tyr Asn305 310 315
320Ser Thr Tyr Arg Val Val Ser Val Leu Thr Val Leu His Gln Asp Trp
325 330 335Leu Asn Gly Lys
Glu Tyr Lys Cys Lys Val Ser Asn Lys Ala Leu Pro 340
345 350Ala Pro Ile Glu Lys Thr Ile Ser Lys Ala Lys
Gly Gln Pro Arg Glu 355 360 365Pro
Gln Val Tyr Thr Leu Pro Pro Ser Arg Asp Glu Leu Thr Lys Asn 370
375 380Gln Val Ser Leu Thr Cys Leu Val Lys Gly
Phe Tyr Pro Ser Asp Ile385 390 395
400Ala Val Glu Trp Glu Ser Asn Gly Gln Pro Glu Asn Asn Tyr Lys
Thr 405 410 415Thr Pro Pro
Val Leu Asp Ser Asp Gly Ser Phe Phe Leu Tyr Ser Lys 420
425 430Leu Thr Val Asp Lys Ser Arg Trp Gln Gln
Gly Asn Val Phe Ser Cys 435 440
445Ser Val Met His Glu Ala Leu His Asn His Tyr Thr Gln Lys Ser Leu 450
455 460Ser Leu Ser Pro Gly Lys465
470
SEQ ID NO: 71; length: 5428; type: DNA; organism: Artificial sequence; description: pcDNA3.1(+) cloning vector
Sequence 71:
gacggatcgg gagatctccc gatcccctat ggtgcactct cagtacaatc tgctctgatg
60ccgcatagtt aagccagtat ctgctccctg cttgtgtgtt ggaggtcgct gagtagtgcg
120cgagcaaaat ttaagctaca acaaggcaag gcttgaccga caattgcatg aagaatctgc
180ttagggttag gcgttttgcg ctgcttcgcg atgtacgggc cagatatacg cgttgacatt
240gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat agcccatata
300tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg cccaacgacc
360cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata gggactttcc
420attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta catcaagtgt
480atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc gcctggcatt
540atgcccagta catgacctta tgggactttc ctacttggca gtacatctac gtattagtca
600tcgctattac catggtgatg cggttttggc agtacatcaa tgggcgtgga tagcggtttg
660actcacgggg atttccaagt ctccacccca ttgacgtcaa tgggagtttg ttttggcacc
720aaaatcaacg ggactttcca aaatgtcgta acaactccgc cccattgacg caaatgggcg
780gtaggcgtgt acggtgggag gtctatataa gcagagctct ctggctaact agagaaccca
840ctgcttactg gcttatcgaa attaatacga ctcactatag ggagacccaa gctggctagc
900gtttaaactt aagcttggta ccgagctcgg atccactagt ccagtgtggt ggaattctgc
960agatatccag cacagtggcg gccgctcgag tctagagggc ccgtttaaac ccgctgatca
1020gcctcgactg tgccttctag ttgccagcca tctgttgttt gcccctcccc cgtgccttcc
1080ttgaccctgg aaggtgccac tcccactgtc ctttcctaat aaaatgagga aattgcatcg
1140cattgtctga gtaggtgtca ttctattctg gggggtgggg tggggcagga cagcaagggg
1200gaggattggg aagacaatag caggcatgct ggggatgcgg tgggctctat ggcttctgag
1260gcggaaagaa ccagctgggg ctctaggggg tatccccacg cgccctgtag cggcgcatta
1320agcgcggcgg gtgtggtggt tacgcgcagc gtgaccgcta cacttgccag cgccctagcg
1380cccgctcctt tcgctttctt cccttccttt ctcgccacgt tcgccggctt tccccgtcaa
1440gctctaaatc gggggctccc tttagggttc cgatttagtg ctttacggca cctcgacccc
1500aaaaaacttg attagggtga tggttcacgt agtgggccat cgccctgata gacggttttt
1560cgccctttga cgttggagtc cacgttcttt aatagtggac tcttgttcca aactggaaca
1620acactcaacc ctatctcggt ctattctttt gatttataag ggattttgcc gatttcggcc
1680tattggttaa aaaatgagct gatttaacaa aaatttaacg cgaattaatt ctgtggaatg
1740tgtgtcagtt agggtgtgga aagtccccag gctccccagc aggcagaagt atgcaaagca
1800tgcatctcaa ttagtcagca accaggtgtg gaaagtcccc aggctcccca gcaggcagaa
1860gtatgcaaag catgcatctc aattagtcag caaccatagt cccgccccta actccgccca
1920tcccgcccct aactccgccc agttccgccc attctccgcc ccatggctga ctaatttttt
1980ttatttatgc agaggccgag gccgcctctg cctctgagct attccagaag tagtgaggag
2040gcttttttgg aggcctaggc ttttgcaaaa agctcccggg agcttgtata tccattttcg
2100gatctgatca agagacagga tgaggatcgt ttcgcatgat tgaacaagat ggattgcacg
2160caggttctcc ggccgcttgg gtggagaggc tattcggcta tgactgggca caacagacaa
2220tcggctgctc tgatgccgcc gtgttccggc tgtcagcgca ggggcgcccg gttctttttg
2280tcaagaccga cctgtccggt gccctgaatg aactgcagga cgaggcagcg cggctatcgt
2340ggctggccac gacgggcgtt ccttgcgcag ctgtgctcga cgttgtcact gaagcgggaa
2400gggactggct gctattgggc gaagtgccgg ggcaggatct cctgtcatct caccttgctc
2460ctgccgagaa agtatccatc atggctgatg caatgcggcg gctgcatacg cttgatccgg
2520ctacctgccc attcgaccac caagcgaaac atcgcatcga gcgagcacgt actcggatgg
2580aagccggtct tgtcgatcag gatgatctgg acgaagagca tcaggggctc gcgccagccg
2640aactgttcgc caggctcaag gcgcgcatgc ccgacggcga ggatctcgtc gtgacccatg
2700gcgatgcctg cttgccgaat atcatggtgg aaaatggccg cttttctgga ttcatcgact
2760gtggccggct gggtgtggcg gaccgctatc aggacatagc gttggctacc cgtgatattg
2820ctgaagagct tggcggcgaa tgggctgacc gcttcctcgt gctttacggt atcgccgctc
2880ccgattcgca gcgcatcgcc ttctatcgcc ttcttgacga gttcttctga gcgggactct
2940ggggttcgaa atgaccgacc aagcgacgcc caacctgcca tcacgagatt tcgattccac
3000cgccgccttc tatgaaaggt tgggcttcgg aatcgttttc cgggacgccg gctggatgat
3060cctccagcgc ggggatctca tgctggagtt cttcgcccac cccaacttgt ttattgcagc
3120ttataatggt tacaaataaa gcaatagcat cacaaatttc acaaataaag catttttttc
3180actgcattct agttgtggtt tgtccaaact catcaatgta tcttatcatg tctgtatacc
3240gtcgacctct agctagagct tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg
3300ttatccgctc acaattccac acaacatacg agccggaagc ataaagtgta aagcctgggg
3360tgcctaatga gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc
3420gggaaacctg tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga gaggcggttt
3480gcgtattggg cgctcttccg cttcctcgct cactgactcg ctgcgctcgg tcgttcggct
3540gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg ttatccacag aatcagggga
3600taacgcagga aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc
3660cgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca aaaatcgacg
3720ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt ttccccctgg
3780aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc tgtccgcctt
3840tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc tcagttcggt
3900gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg
3960cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact tatcgccact
4020ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg ctacagagtt
4080cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta tctgcgctct
4140gctgaagcca gttaccttcg gaaaaagagt tggtagctct tgatccggca aacaaaccac
4200cgctggtagc ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca
4260agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta
4320agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa
4380atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg
4440cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg
4500actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc ccagtgctgc
4560aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa accagccagc
4620cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc agtctattaa
4680ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca acgttgttgc
4740cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat tcagctccgg
4800ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag cggttagctc
4860cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac tcatggttat
4920ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt ctgtgactgg
4980tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt gctcttgccc
5040ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc tcatcattgg
5100aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat ccagttcgat
5160gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca gcgtttctgg
5220gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga cacggaaatg
5280ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg gttattgtct
5340catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac
5400atttccccga aaagtgccac ctgacgtc
5428
SEQ ID NO: 72; length: 519; type: PRT; organism: Artificial sequence; description: SeAP protein
Sequence 72:
Met Leu Leu Leu Leu Leu Leu
Leu Gly Leu Arg Leu Gln Leu Ser Leu1 5 10
15Gly Ile Ile Pro Val Glu Glu Glu Asn Pro Asp Phe Trp
Asn Arg Glu 20 25 30Ala Ala
Glu Ala Leu Gly Ala Ala Lys Lys Leu Gln Pro Ala Gln Thr 35
40 45Ala Ala Lys Asn Leu Ile Ile Phe Leu Gly
Asp Gly Met Gly Val Ser 50 55 60Thr
Val Thr Ala Ala Arg Ile Leu Lys Gly Gln Lys Lys Asp Lys Leu65
70 75 80Gly Pro Glu Ile Pro Leu
Ala Met Asp Arg Phe Pro Tyr Val Ala Leu 85
90 95Ser Lys Thr Tyr Asn Val Asp Lys His Val Pro Asp
Ser Gly Ala Thr 100 105 110Ala
Thr Ala Tyr Leu Cys Gly Val Lys Gly Asn Phe Gln Thr Ile Gly 115
120 125Leu Ser Ala Ala Ala Arg Phe Asn Gln
Cys Asn Thr Thr Arg Gly Asn 130 135
140Glu Val Ile Ser Val Met Asn Arg Ala Lys Lys Ala Gly Lys Ser Val145
150 155 160Gly Val Val Thr
Thr Thr Arg Val Gln His Ala Ser Pro Ala Gly Thr 165
170 175Tyr Ala His Thr Val Asn Arg Asn Trp Tyr
Ser Asp Ala Asp Val Pro 180 185
190Ala Ser Ala Arg Gln Glu Gly Cys Gln Asp Ile Ala Thr Gln Leu Ile
195 200 205Ser Asn Met Asp Ile Asp Val
Ile Leu Gly Gly Gly Arg Lys Tyr Met 210 215
220Phe Arg Met Gly Thr Pro Asp Pro Glu Tyr Pro Asp Asp Tyr Ser
Gln225 230 235 240Gly Gly
Thr Arg Leu Asp Gly Lys Asn Leu Val Gln Glu Trp Leu Ala
245 250 255Lys Arg Gln Gly Ala Arg Tyr
Val Trp Asn Arg Thr Glu Leu Met Gln 260 265
270Ala Ser Leu Asp Pro Ser Val Thr His Leu Met Gly Leu Phe
Glu Pro 275 280 285Gly Asp Met Lys
Tyr Glu Ile His Arg Asp Ser Thr Leu Asp Pro Ser 290
295 300Leu Met Glu Met Thr Glu Ala Ala Leu Arg Leu Leu
Ser Arg Asn Pro305 310 315
320Arg Gly Phe Phe Leu Phe Val Glu Gly Gly Arg Ile Asp His Gly His
325 330 335His Glu Ser Arg Ala
Tyr Arg Ala Leu Thr Glu Thr Ile Met Phe Asp 340
345 350Asp Ala Ile Glu Arg Ala Gly Gln Leu Thr Ser Glu
Glu Asp Thr Leu 355 360 365Ser Leu
Val Thr Ala Asp His Ser His Val Phe Ser Phe Gly Gly Tyr 370
375 380Pro Leu Arg Gly Ser Ser Ile Phe Gly Leu Ala
Pro Gly Lys Ala Arg385 390 395
400Asp Arg Lys Ala Tyr Thr Val Leu Leu Tyr Gly Asn Gly Pro Gly Tyr
405 410 415Val Leu Lys Asp
Gly Ala Arg Pro Asp Val Thr Glu Ser Glu Ser Gly 420
425 430Ser Pro Glu Tyr Arg Gln Gln Ser Ala Val Pro
Leu Asp Glu Glu Thr 435 440 445His
Ala Gly Glu Asp Val Ala Val Phe Ala Arg Gly Pro Gln Ala His 450
455 460Leu Val His Gly Val Gln Glu Gln Thr Phe
Ile Ala His Val Met Ala465 470 475
480Phe Ala Ala Cys Leu Glu Pro Tyr Thr Ala Cys Asp Leu Ala Pro
Pro 485 490 495Ala Gly Thr
Thr Asp Ala Ala His Pro Gly Tyr Ser Arg Val Gly Ala 500
505 510Ala Gly Arg Phe Glu Gln Thr
515
SEQ ID NO: 73; length: 2522; type: DNA; organism: Artificial sequence; description: EEE1+CMV+TEE
Sequence 73:
tttcaggcaa ccagagctac
atagtgagat cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg gttcaatctc
ttaggtagct aaatatgaac aaaatttggg aaatgtgacc 120ttttccttag tgacagtcag
atagaacctt ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac agcctggaaa
ggtctagtgc tatggggctt caggtcgaat gccaactgtt 240ttcaagaact gtgtggattt
ttctgcctgt aacgaattca gattcatttt tcaaaactcg 300gggagagttt tcccccttta
taattttttt tttaaattta ttaaactttg tttcgttccc 360cttgttttga gaattgcaga
gtcatccacc ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc ctggcggggc
tgaaggggct ggggaagcga gggctccaaa gggaccccag 480tgtggcagga gccaaagccc
taggtcccta gaacgcagag gccaccggga ccccccagac 540ggggtaagcg ggtgggtgtc
tggggcgcga agccgcactg cgcatgcgcc gaggtccgct 600ccggccgcgc tgatccaagc
cgggttctcg cgccgacctg gtcgtgattg acaagtcaca 660cacgctgatc cctccgcggg
gccgcacagg gtcacagcct ttcccctccc cacaaagccc 720cctactctct gggcaccaca
cacgaacatt ccttgagcgt gaccttgttg gctctagtca 780ggcgcctccg gtgcagagac
tggaacggcc ttgggaagta gtccctaacc gcatttccgc 840ggagggatcg tcgggagggc
gtggcttctg aggattatat aaggcgactc cgggcgggtc 900ttagctagtt ccgtcggaga
cccgagttca gtcgccgctt ctctgtgagg actgctgccg 960ccgccgctgg tgaggagaag
ccgccgcgct tggcgtagct gagagacggg gagggggcgc 1020ggacacgagg ggcagcccgc
ggcctggacg ttctgtttcc gtggcccgcg aggaaggcga 1080ctgtcctgag gcggaggacc
cagcggcaag atggcggcca agtggaagcc tgaggggata 1140ggcgagcggc cctgaggcgc
tcgacggggt tgggggggaa gcaggcccgc gaggcagctg 1200cagccgggaa cgtgcggcca
accccttatt ttttttgacg ggttgcgggc cgtaggtgcc 1260tccgaagtga gagccgtggg
cgtttgactg tcgggagagg tcggtcggat tttcatccgt 1320tgctaaagac ggaagtgcga
ctgagacggg aagggggggg agtcggttgg tggcggttga 1380acctggacta aggcgcacat
gacgtcgcgg tttctatggg ctcataatgg gtggtgagga 1440catttccctg tttaaactta
aacaagtttg tacaaaaaag caggctagat cttcaatatt 1500ggccattagc catattattc
attggttata tagcataaat caatattggc tattggccat 1560tgcatacgtt gtatctatat
cataatatgt acatttatat tggctcatgt ccaatatgac 1620cgccatgttg gcattgatta
ttgactagtt attaatagta atcaattacg gggtcattag 1680ttcatagccc atatatggag
ttccgcgtta cataacttac ggtaaatggc ccgcctggct 1740gaccgcccaa cgacccccgc
ccattgacgt caataatgac gtatgttccc atagtaacgc 1800caatagggac tttccattga
cgtcaatggg tggagtattt acggtaaact gcccacttgg 1860cagtacatca agtgtatcat
atgccaagtc cgccccctat tgacgtcaat gacggtaaat 1920ggcccgcctg gcattatgcc
cagtacatga ccttacggga ctttcctact tggcagtaca 1980tctacgtatt agtcatcgct
attaccatag tgatgcggtt ttggcagtac accaatgggc 2040gtggatagcg gtttgactca
cggggatttc caagtctcca ccccattgac gtcaatggga 2100gtttgttttg gcaccaaaat
caacgggact ttccaaaatg tcgtaataac cccgccccgt 2160tgacgcaaat gggcggtagg
cgtgtacggt gggaggtcta tataagcaga gctcgtttag 2220tgaaccgtca gatcactaga
agcttcaagc tctagcagga agaagaaaga agaagaagaa 2280gaagaagaag aagaagcgtc
tcctcttctt cttgtgagag taaaaaagaa aactcccaaa 2340aaaaagaaaa tcatcaaaaa
aacaaatttc aaaaagagtt tttgtgtttg gggattaaag 2400aagaaaaaaa acaacaggtg
agtaagcgca gttgtcgtct cttgcggtgc cgttgctggt 2460tctcacacct tttaggtctg
ttctcgtctt ccgttctgac tctctctttt tcgttgcagg 2520cc
2522
SEQ ID NO: 74; length: 3321; type: DNA; organism: Artificial sequence; description: EEE1-Xt+CMV+TEE
Sequence 74:
tggtgaccct gtctcaaaaa accctcaaaa agtgttggga
ttagtggcat gcaccaccat 60tcccaccaaa ggtttatttt taataatatg tgtgtgagtg
tgtatcacta tgagtatatg 120tcaatatgtg tcaatgtccc cagggacatt taaagagccc
ctgaagctgg agtcataggc 180cattatgaac tgcctgacat ggctaatggg aattgaactc
agattttctg gaagttatac 240ctgctcttac tgctgagcca tgtctctgaa gaccccaggg
attttttttt ttttttgaga 300caggtatttt ctgtatagcc ctggctgtcc tgaaagcact
ctctatatgt agaccaggct 360tgcctggagc ttggatatgc acctgcttct gcctcaggaa
tggtgggatt gaaggtgtgc 420accaccacat ccgctaacat gcacaattct taatgggttt
atatcttatt taatgaatga 480aaggtttggg ggatggatgt agcttaatgg aaaatgactg
aagatttcaa ttaaaaatct 540ggggcttagc tgcgcggtgg gtggtgcctg cctttagtcc
cagtactggg gaggcagagg 600aaggaggatc tctgtgagtt cgaggccagc tggtctataa
cgtgagttcc aggacagcca 660gagatacaca gacaaaccct gtctcaccaa aacaaaacaa
caacaacaac aacaaatctg 720ggacgtaggc ttggtgtggt ggcacacatt ttgattccag
cacttggaag gaagaggcct 780gcatggtcta catagcttgt ttcaggcaac cagagctaca
tagtgagatc ctgtctcaac 840aaaaataaaa taatctaagg cttcaaaggg ttcaatctct
taggtagcta aatatgaaca 900aaatttggga aatgtgacct tttccttagt gacagtcaga
tagaaccttc tcgagtgcaa 960ggacaccaag tgcaaacagg ctcaagaaca gcctggaaag
gtctagtgct atggggcttc 1020aggtcgaatg ccaactgttt tcaagaactg tgtggatttt
tctgcctgta acgaattcag 1080attcattttt caaaactcgg ggagagtttt ccccctttat
aatttttttt ttaaatttat 1140taaactttgt ttcgttcccc ttgttttgag aattgcagag
tcatccaccc tgtcacagtg 1200ccagggagct cagggatggg cccaggggcc tggcggggct
gaaggggctg gggaagcgag 1260ggctccaaag ggaccccagt gtggcaggag ccaaagccct
aggtccctag aacgcagagg 1320ccaccgggac cccccagacg gggtaagcgg gtgggtgtct
ggggcgcgaa gccgcactgc 1380gcatgcgccg aggtccgctc cggccgcgct gatccaagcc
gggttctcgc gccgacctgg 1440tcgtgattga caagtcacac acgctgatcc ctccgcgggg
ccgcacaggg tcacagcctt 1500tcccctcccc acaaagcccc ctactctctg ggcaccacac
acgaacattc cttgagcgtg 1560accttgttgg ctctagtcag gcgcctccgg tgcagagact
ggaacggcct tgggaagtag 1620tccctaaccg catttccgcg gagggatcgt cgggagggcg
tggcttctga ggattatata 1680aggcgactcc gggcgggtct tagctagttc cgtcggagac
ccgagttcag tcgccgcttc 1740tctgtgagga ctgctgccgc cgccgctggt gaggagaagc
cgccgcgctt ggcgtagctg 1800agagacgggg agggggcgcg gacacgaggg gcagcccgcg
gcctggacgt tctgtttccg 1860tggcccgcga ggaaggcgac tgtcctgagg cggaggaccc
agcggcaaga tggcggccaa 1920gtggaagcct gaggggatag gcgagcggcc ctgaggcgct
cgacggggtt gggggggaag 1980caggcccgcg aggcagctgc agccgggaac gtgcggccaa
ccccttattt tttttgacgg 2040gttgcgggcc gtaggtgcct ccgaagtgag agccgtgggc
gtttgactgt cgggagaggt 2100cggtcggatt ttcatccgtt gctaaagacg gaagtgcgac
tgagacggga agggggggga 2160gtcggttggt ggcggttgaa cctggactaa ggcgcacatg
acgtcgcggt ttctatgggc 2220tcataatggg tggtgaggac atttccctgt ttaaacttaa
acaagtttgt acaaaaaagc 2280aggctagatc ttcaatattg gccattagcc atattattca
ttggttatat agcataaatc 2340aatattggct attggccatt gcatacgttg tatctatatc
ataatatgta catttatatt 2400ggctcatgtc caatatgacc gccatgttgg cattgattat
tgactagtta ttaatagtaa 2460tcaattacgg ggtcattagt tcatagccca tatatggagt
tccgcgttac ataacttacg 2520gtaaatggcc cgcctggctg accgcccaac gacccccgcc
cattgacgtc aataatgacg 2580tatgttccca tagtaacgcc aatagggact ttccattgac
gtcaatgggt ggagtattta 2640cggtaaactg cccacttggc agtacatcaa gtgtatcata
tgccaagtcc gccccctatt 2700gacgtcaatg acggtaaatg gcccgcctgg cattatgccc
agtacatgac cttacgggac 2760tttcctactt ggcagtacat ctacgtatta gtcatcgcta
ttaccatagt gatgcggttt 2820tggcagtaca ccaatgggcg tggatagcgg tttgactcac
ggggatttcc aagtctccac 2880cccattgacg tcaatgggag tttgttttgg caccaaaatc
aacgggactt tccaaaatgt 2940cgtaataacc ccgccccgtt gacgcaaatg ggcggtaggc
gtgtacggtg ggaggtctat 3000ataagcagag ctcgtttagt gaaccgtcag atcactagaa
gcttcaagct ctagcaggaa 3060gaagaaagaa gaagaagaag aagaagaaga agaagcgtct
cctcttcttc ttgtgagagt 3120aaaaaagaaa actcccaaaa aaaagaaaat catcaaaaaa
acaaatttca aaaagagttt 3180ttgtgtttgg ggattaaaga agaaaaaaaa caacaggtga
gtaagcgcag ttgtcgtctc 3240ttgcggtgcc gttgctggtt ctcacacctt ttaggtctgt
tctcgtcttc cgttctgact 3300ctctcttttt cgttgcaggc c
3321
SEQ ID NO: 75; length: 2232; type: DNA; organism: Artificial sequence; description: EEE1-80+CMV+TEE
Sequence 75:
tcaaaactcg gggagagttt tcccccttta taattttttt tttaaattta ttaaactttg
60tttcgttccc cttgttttga gaattgcaga gtcatccacc ctgtcacagt gccagggagc
120tcagggatgg gcccaggggc ctggcggggc tgaaggggct ggggaagcga gggctccaaa
180gggaccccag tgtggcagga gccaaagccc taggtcccta gaacgcagag gccaccggga
240ccccccagac ggggtaagcg ggtgggtgtc tggggcgcga agccgcactg cgcatgcgcc
300gaggtccgct ccggccgcgc tgatccaagc cgggttctcg cgccgacctg gtcgtgattg
360acaagtcaca cacgctgatc cctccgcggg gccgcacagg gtcacagcct ttcccctccc
420cacaaagccc cctactctct gggcaccaca cacgaacatt ccttgagcgt gaccttgttg
480gctctagtca ggcgcctccg gtgcagagac tggaacggcc ttgggaagta gtccctaacc
540gcatttccgc ggagggatcg tcgggagggc gtggcttctg aggattatat aaggcgactc
600cgggcgggtc ttagctagtt ccgtcggaga cccgagttca gtcgccgctt ctctgtgagg
660actgctgccg ccgccgctgg tgaggagaag ccgccgcgct tggcgtagct gagagacggg
720gagggggcgc ggacacgagg ggcagcccgc ggcctggacg ttctgtttcc gtggcccgcg
780aggaaggcga ctgtcctgag gcggaggacc cagcggcaag atggcggcca agtggaagcc
840tgaggggata ggcgagcggc cctgaggcgc tcgacggggt tgggggggaa gcaggcccgc
900gaggcagctg cagccgggaa cgtgcggcca accccttatt ttttttgacg ggttgcgggc
960cgtaggtgcc tccgaagtga gagccgtggg cgtttgactg tcgggagagg tcggtcggat
1020tttcatccgt tgctaaagac ggaagtgcga ctgagacggg aagggggggg agtcggttgg
1080tggcggttga acctggacta aggcgcacat gacgtcgcgg tttctatggg ctcataatgg
1140gtggtgagga catttccctg tttaaactta aacaagtttg tacaaaaaag caggctagat
1200cttcaatatt ggccattagc catattattc attggttata tagcataaat caatattggc
1260tattggccat tgcatacgtt gtatctatat cataatatgt acatttatat tggctcatgt
1320ccaatatgac cgccatgttg gcattgatta ttgactagtt attaatagta atcaattacg
1380gggtcattag ttcatagccc atatatggag ttccgcgtta cataacttac ggtaaatggc
1440ccgcctggct gaccgcccaa cgacccccgc ccattgacgt caataatgac gtatgttccc
1500atagtaacgc caatagggac tttccattga cgtcaatggg tggagtattt acggtaaact
1560gcccacttgg cagtacatca agtgtatcat atgccaagtc cgccccctat tgacgtcaat
1620gacggtaaat ggcccgcctg gcattatgcc cagtacatga ccttacggga ctttcctact
1680tggcagtaca tctacgtatt agtcatcgct attaccatag tgatgcggtt ttggcagtac
1740accaatgggc gtggatagcg gtttgactca cggggatttc caagtctcca ccccattgac
1800gtcaatggga gtttgttttg gcaccaaaat caacgggact ttccaaaatg tcgtaataac
1860cccgccccgt tgacgcaaat gggcggtagg cgtgtacggt gggaggtcta tataagcaga
1920gctcgtttag tgaaccgtca gatcactaga agcttcaagc tctagcagga agaagaaaga
1980agaagaagaa gaagaagaag aagaagcgtc tcctcttctt cttgtgagag taaaaaagaa
2040aactcccaaa aaaaagaaaa tcatcaaaaa aacaaatttc aaaaagagtt tttgtgtttg
2100gggattaaag aagaaaaaaa acaacaggtg agtaagcgca gttgtcgtct cttgcggtgc
2160cgttgctggt tctcacacct tttaggtctg ttctcgtctt ccgttctgac tctctctttt
2220tcgttgcagg cc
2232
SEQ ID NO: 76; length: 1942; type: DNA; organism: Artificial sequence; description: EEE1-60+CMV+TEE
Sequence 76:
cgcatgcgcc gaggtccgct
ccggccgcgc tgatccaagc cgggttctcg cgccgacctg 60gtcgtgattg acaagtcaca
cacgctgatc cctccgcggg gccgcacagg gtcacagcct 120ttcccctccc cacaaagccc
cctactctct gggcaccaca cacgaacatt ccttgagcgt 180gaccttgttg gctctagtca
ggcgcctccg gtgcagagac tggaacggcc ttgggaagta 240gtccctaacc gcatttccgc
ggagggatcg tcgggagggc gtggcttctg aggattatat 300aaggcgactc cgggcgggtc
ttagctagtt ccgtcggaga cccgagttca gtcgccgctt 360ctctgtgagg actgctgccg
ccgccgctgg tgaggagaag ccgccgcgct tggcgtagct 420gagagacggg gagggggcgc
ggacacgagg ggcagcccgc ggcctggacg ttctgtttcc 480gtggcccgcg aggaaggcga
ctgtcctgag gcggaggacc cagcggcaag atggcggcca 540agtggaagcc tgaggggata
ggcgagcggc cctgaggcgc tcgacggggt tgggggggaa 600gcaggcccgc gaggcagctg
cagccgggaa cgtgcggcca accccttatt ttttttgacg 660ggttgcgggc cgtaggtgcc
tccgaagtga gagccgtggg cgtttgactg tcgggagagg 720tcggtcggat tttcatccgt
tgctaaagac ggaagtgcga ctgagacggg aagggggggg 780agtcggttgg tggcggttga
acctggacta aggcgcacat gacgtcgcgg tttctatggg 840ctcataatgg gtggtgagga
catttccctg tttaaactta aacaagtttg tacaaaaaag 900caggctagat cttcaatatt
ggccattagc catattattc attggttata tagcataaat 960caatattggc tattggccat
tgcatacgtt gtatctatat cataatatgt acatttatat 1020tggctcatgt ccaatatgac
cgccatgttg gcattgatta ttgactagtt attaatagta 1080atcaattacg gggtcattag
ttcatagccc atatatggag ttccgcgtta cataacttac 1140ggtaaatggc ccgcctggct
gaccgcccaa cgacccccgc ccattgacgt caataatgac 1200gtatgttccc atagtaacgc
caatagggac tttccattga cgtcaatggg tggagtattt 1260acggtaaact gcccacttgg
cagtacatca agtgtatcat atgccaagtc cgccccctat 1320tgacgtcaat gacggtaaat
ggcccgcctg gcattatgcc cagtacatga ccttacggga 1380ctttcctact tggcagtaca
tctacgtatt agtcatcgct attaccatag tgatgcggtt 1440ttggcagtac accaatgggc
gtggatagcg gtttgactca cggggatttc caagtctcca 1500ccccattgac gtcaatggga
gtttgttttg gcaccaaaat caacgggact ttccaaaatg 1560tcgtaataac cccgccccgt
tgacgcaaat gggcggtagg cgtgtacggt gggaggtcta 1620tataagcaga gctcgtttag
tgaaccgtca gatcactaga agcttcaagc tctagcagga 1680agaagaaaga agaagaagaa
gaagaagaag aagaagcgtc tcctcttctt cttgtgagag 1740taaaaaagaa aactcccaaa
aaaaagaaaa tcatcaaaaa aacaaatttc aaaaagagtt 1800tttgtgtttg gggattaaag
aagaaaaaaa acaacaggtg agtaagcgca gttgtcgtct 1860cttgcggtgc cgttgctggt
tctcacacct tttaggtctg ttctcgtctt ccgttctgac 1920tctctctttt tcgttgcagg
cc 1942
SEQ ID NO: 77; length: 9513; type: DNA; organism: Artificial sequence; description: pPNic384
Sequence 77:
agatctaaca tccaaagacg aaaggttgaa tgaaaccttt ttgccatccg
acatccacag 60gtccattctc acacataagt gccaaacgca acaggagggg atacactagc
agcagaccgt 120tgcaaacgca ggacctccac tcctcttctc ctcaacaccc acttttgcca
tcgaaaaacc 180agcccagtta ttgggcttga ttggagctcg ctcattccaa ttccttctat
taggctacta 240acaccatgac tttattagcc tgtctatcct ggcccccctg gcgaggttca
tgtttgttta 300tttccgaatg caacaagctc cgcattacac ccgaacatca ctccagatga
gggctttctg 360agtgtggggt caaatagttt catgttcccc aaatggccca aaactgacag
tttaaacgct 420gtcttggaac ctaatatgac aaaagcgtga tctcatccaa gatgaactaa
gtttggttcg 480ttgaaatgct aacggccagt tggtcaaaaa gaaacttcca aaagtcgcca
taccgtttgt 540cttgtttggt attgattgac gaatgctcaa aaataatctc attaatgctt
agcgcagtct 600ctctatcgct tctgaacccc ggtgcacctg tgccgaaacg caaatgggga
aacacccgct 660ttttggatga ttatgcattg tctccacatt gtatgcttcc aagattctgg
tgggaatact 720gctgatagcc taacgttcat gatcaaaatt taactgttct aacccctact
tgacagcaat 780atataaacag aaggaagctg ccctgtctta aacctttttt tttatcatca
ttattagctt 840actttcataa ttgcgactgg ttccaattga caagcttttg attttaacga
cttttaacga 900caacttgaga agatcaaaaa acaactaatt attcgaagga tccaaacgat
gagatttcct 960tcaattttta ctgcagtttt attcgcggcc tcctcggcct tagctgctcc
agtcaacact 1020acaacagaag atgaaacggc acaaattccg gctgaagctg tcatcggtta
ctcagattta 1080gaaggggatt tcgatgttgc tgttttgcca ttttccaaca gcacaaataa
cgggttattg 1140tttataaata ctactattgc cagcattgct gctaaagaag aaggggtatc
tctcgagaaa 1200agagaggctg aagcttacgt agaattcgag ggtgctgtct tgcctagatc
cgctaaagaa 1260ttgagatgtc agtgtatcaa gacttactcc aagccattcc acccaaagtt
catcaaagag 1320ttgagagtta tcgagtccgg tccacactgt gctaacactg agatcatcgt
taagttgtcc 1380gacggtagag agttgtgttt ggacccaaaa gagaactggg ttcagagagt
tgttgagaag 1440ttcttgaaga gagctgagaa ctcctagtaa gcggccgcga attaattcgc
cttagacatg 1500actgttcctc agttcaagtt gggcacttac gagaagaccg gtcttgctag
attctaatca 1560agaggatgtc agaatgccat ttgcctgaga gatgcaggct tcatttttga
tactttttta 1620tttgtaacct atatagtata ggattttttt tgtcattttg tttcttctcg
tacgagcttg 1680ctcctgatca gcctatctcg cagctgatga atatcttgtg gtaggggttt
gggaaaatca 1740ttcgagtttg atgtttttct tggtatttcc cactcctctt cagagtacag
aagattaagt 1800gagaagttcg tttgtgcaag cttatcgata agctttaatg cggtagttta
tcacagttaa 1860attgctaacg cagtcaggca ccgtgtatga aatctaacaa tgcgctcatc
gtcatcctcg 1920gcaccgtcac cctggatgct gtaggcatag gcttggttat gccggtactg
ccgggcctct 1980tgcgggatat cgtccattcc gacagcatcg ccagtcacta tggcgtgctg
ctagcgctat 2040atgcgttgat gcaatttcta tgcgcacccg ttctcggagc actgtccgac
cgctttggcc 2100gccgcccagt cctgctcgct tcgctacttg gagccactat cgactacgcg
atcatggcga 2160ccacacccgt cctgtggatc tatcgaatct aaatgtaagt taaaatctct
aaataattaa 2220ataagtccca gtttctccat acgaacctta acagcattgc ggtgagcatc
tagaccttca 2280acagcagcca gatccatcac tgcttggcca atatgtttca gtccctcagg
agttacgtct 2340tgtgaagtga tgaacttctg gaaggttgca gtgttaactc cgctgtattg
acgggcatat 2400ccgtacgttg gcaaagtgtg gttggtaccg gaggagtaat ctccacaact
ctctggagag 2460taggcaccaa caaacacaga tccagcgtgt tgtacttgat caacataaga
agaagcattc 2520tcgatttgca ggatcaagtg ttcaggagcg tactgattgg acatttccaa
agcctgctcg 2580taggttgcaa ccgatagggt tgtagagtgt gcaatacact tgcgtacaat
ttcaaccctt 2640ggcaactgca cagcttggtt gtgaacagca tcttcaattc tggcaagctc
cttgtctgtc 2700atatcgacag ccaacagaat cacctgggaa tcaataccat gttcagcttg
agacagaagg 2760tctgaggcaa cgaaatctgg atcagcgtat ttatcagcaa taactagaac
ttcagaaggc 2820ccagcaggca tgtcaatact acacagggct gatgtgtcat tttgaaccat
catcttggca 2880gcagtaacga actggtttcc tggaccaaat attttgtcac acttaggaac
agtttctgtt 2940ccgtaagcca tagcagctac tgcctgggcg cctcctgcta gcacgataca
cttagcacca 3000accttgtggg caacgtagat gacttctggg gtaagggtac catccttctt
aggtggagat 3060gcaaaaacaa tttctttgca accagcaact ttggcaggaa cacccagcat
cagggaagtg 3120gaaggcagaa ttgcggttcc accaggaata tagaggccaa ctttctcaat
aggtcttgca 3180aaacgagagc agactacacc agggcaagtc tcaacttgca acgtctccgt
tagttgagct 3240tcatggaatt tcctgacgtt atctatagag agatcaatgg ctctcttaac
gttatctggc 3300aattgcataa gttcctctgg gaaaggagct tctaacacag gtgtcttcaa
agcgactcca 3360tcaaacttgg cagttagttc taaaagggct ttgtcaccat tttgacgaac
attgtcgaca 3420attggtttga ctaattccat aatctgttcc gttttctgga taggacgacg
aagggcatct 3480tcaatttctt gtgaggaggc cttagaaacg tcaattttgc acaattcaat
acgaccttca 3540gaagggactt ctttaggttt ggattcttct ttaggttgtt ccttggtgta
tcctggcttg 3600gcatctcctt tccttctagt gacctttagg gacttcatat ccaggtttct
ctccacctcg 3660tccaacgtca caccgtactt ggcacatcta actaatgcaa aataaaataa
gtcagcacat 3720tcccaggcta tatcttcctt ggatttagct tctgcaagtt catcagcttc
ctccctaatt 3780ttagcgttca acaaaacttc gtcgtcaaat aaccgtttgg tataagaacc
ttctggagca 3840ttgctcttac gatcccacaa ggtggcttcc atggctctaa gaccctttga
ttggccaaaa 3900caggaagtgc gttccaagtg acagaaacca acacctgttt gttcaaccac
aaatttcaag 3960cagtctccat cacaatccaa ttcgataccc agcaactttt gagttgctcc
agatgtagca 4020cctttatacc acaaaccgtg acgacgagat tggtagactc cagtttgtgt
ccttatagcc 4080tccggaatag actttttgga cgagtacacc aggcccaacg agtaattaga
agagtcagcc 4140accaaagtag tgaatagacc atcggggcgg tcagtagtca aagacgccaa
caaaatttca 4200ctgacaggga actttttgac atcttcagaa agttcgtatt cagtagtcaa
ttgccgagca 4260tcaataatgg ggattatacc agaagcaaca gtggaagtca catctaccaa
ctttgcggtc 4320tcagaaaaag cataaacagt tctactaccg ccattagtga aacttttcaa
atcgcccagt 4380ggagaagaaa aaggcacagc gatactagca ttagcgggca aggatgcaac
tttatcaacc 4440agggtcctat agataaccct agcgcctggg atcatccttt ggacaactct
ttctgccaaa 4500tctaggtcca aaatcacttc attgatacca ttattgtaca acttgagcaa
gttgtcgatc 4560agctcctcaa attggtcctc tgtaacggat gactcaactt gcacattaac
ttgaagctca 4620gtcgattgag tgaacttgat caggttgtgc agctggtcag cagcataggg
aaacacggct 4680tttcctacca aactcaagga attatcaaac tctgcaacac ttgcgtatgc
aggtagcaag 4740ggaaatgtca tacttgaagt cggacagtga gtgtagtctt gagaaattct
gaagccgtat 4800ttttattatc agtgagtcag tcatcaggag atcctctacg ccggacgcat
cgtggccgac 4860ctgcaggggg ggggggggcg ctgaggtctg cctcgtgaag aaggtgttgc
tgactcatac 4920caggcctgaa tcgccccatc atccagccag aaagtgaggg agccacggtt
gatgagagct 4980ttgttgtagg tggaccagtt ggtgattttg aacttttgct ttgccacgga
acggtctgcg 5040ttgtcgggaa gatgcgtgat ctgatccttc aactcagcaa aagttcgatt
tattcaacaa 5100agccgccgtc ccgtcaagtc agcgtaatgc tctgccagtg ttacaaccaa
ttaaccaatt 5160ctgattagaa aaactcatcg agcatcaaat gaaactgcaa tttattcata
tcaggattat 5220caataccata tttttgaaaa agccgtttct gtaatgaagg agaaaactca
ccgaggcagt 5280tccataggat ggcaagatcc tggtatcggt ctgcgattcc gactcgtcca
acatcaatac 5340aacctattaa tttcccctcg tcaaaaataa ggttatcaag tgagaaatca
ccatgagtga 5400cgactgaatc cggtgagaat ggcaaaagct tatgcatttc tttccagact
tgttcaacag 5460gccagccatt acgctcgtca tcaaaatcac tcgcatcaac caaaccgtta
ttcattcgtg 5520attgcgcctg agcgagacga aatacgcgat cgctgttaaa aggacaatta
caaacaggaa 5580tcgaatgcaa ccggcgcagg aacactgcca gcgcatcaac aatattttca
cctgaatcag 5640gatattcttc taatacctgg aatgctgttt tcccggggat cgcagtggtg
agtaaccatg 5700catcatcagg agtacggata aaatgcttga tggtcggaag aggcataaat
tccgtcagcc 5760agtttagtct gaccatctca tctgtaacat cattggcaac gctacctttg
ccatgtttca 5820gaaacaactc tggcgcatcg ggcttcccat acaatcgata gattgtcgca
cctgattgcc 5880cgacattatc gcgagcccat ttatacccat ataaatcagc atccatgttg
gaatttaatc 5940gcggcctcga gcaagacgtt tcccgttgaa tatggctcat aacacccctt
gtattactgt 6000ttatgtaagc agacagtttt attgttcatg atgatatatt tttatcttgt
gcaatgtaac 6060atcagagatt ttgagacaca acgtggcttt cccccccccc cctgcaggtc
ggcatcaccg 6120gcgccacagg tgcggttgct ggcgcctata tcgccgacat caccgatggg
gaagatcggg 6180ctcgccactt cgggctcatg agcgcttgtt tcggcgtggg tatggtggca
ggccccgtgg 6240ccgggggact gttgggcgcc atctccttgc atgcaccatt ccttgcggcg
gcggtgctca 6300acggcctcaa cctactactg ggctgcttcc taatgcagga gtcgcataag
ggagagcgtc 6360gagtatctat gattggaagt atgggaatgg tgatacccgc attcttcagt
gtcttgaggt 6420ctcctatcag attatgccca actaaagcaa ccggaggagg agatttcatg
gtaaatttct 6480ctgacttttg gtcatcagta gactcgaact gtgagactat ctcggttatg
acagcagaaa 6540tgtccttctt ggagacagta aatgaagtcc caccaataaa gaaatccttg
ttatcaggaa 6600caaacttctt gtttcgaact ttttcggtgc cttgaactat aaaatgtaga
gtggatatgt 6660cgggtaggaa tggagcgggc aaatgcttac cttctggacc ttcaagaggt
atgtagggtt 6720tgtagatact gatgccaact tcagtgacaa cgttgctatt tcgttcaaac
cattccgaat 6780ccagagaaat caaagttgtt tgtctactat tgatccaagc cagtgcggtc
ttgaaactga 6840caatagtgtg ctcgtgtttt gaggtcatct ttgtatgaat aaatctagtc
tttgatctaa 6900ataatcttga cgagccaagg cgataaatac ccaaatctaa aactctttta
aaacgttaaa 6960aggacaagta tgtctgcctg tattaaaccc caaatcagct cgtagtctga
tcctcatcaa 7020cttgaggggc actatcttgt tttagagaaa tttgcggaga tgcgatatcg
agaaaaaggt 7080acgctgattt taaacgtgaa atttatctca agatctctgc ctcgcgcgtt
tcggtgatga 7140cggtgaaaac ctctgacaca tgcagctccc ggagacggtc acagcttgtc
tgtaagcgga 7200tgccgggagc agacaagccc gtcagggcgc gtcagcgggt gttggcgggt
gtcggggcgc 7260agccatgacc cagtcacgta gcgatagcgg agtgtatact ggcttaacta
tgcggcatca 7320gagcagattg tactgagagt gcaccatatg cggtgtgaaa taccgcacag
atgcgtaagg 7380agaaaatacc gcatcaggcg ctcttccgct tcctcgctca ctgactcgct
gcgctcggtc 7440gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt
atccacagaa 7500tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc
caggaaccgt 7560aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga
gcatcacaaa 7620aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata
ccaggcgttt 7680ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac
cggatacctg 7740tccgcctttc tcccttcggg aagcgtggcg ctttctcaat gctcacgctg
taggtatctc 7800agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc
cgttcagccc 7860gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag
acacgactta 7920tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt
aggcggtgct 7980acagagttct tgaagtggtg gcctaactac ggctacacta gaaggacagt
atttggtatc 8040tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg
atccggcaaa 8100caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac
gcgcagaaaa 8160aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca
gtggaacgaa 8220aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac
ctagatcctt 8280ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac
ttggtctgac 8340agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt
tcgttcatcc 8400atagttgcct gactccccgt cgtgtagata actacgatac gggagggctt
accatctggc 8460cccagtgctg caatgatacc gcgagaccca cgctcaccgg ctccagattt
atcagcaata 8520aaccagccag ccggaagggc cgagcgcaga agtggtcctg caactttatc
cgcctccatc 8580cagtctatta attgttgccg ggaagctaga gtaagtagtt cgccagttaa
tagtttgcgc 8640aacgttgttg ccattgctgc aggcatcgtg gtgtcacgct cgtcgtttgg
tatggcttca 8700ttcagctccg gttcccaacg atcaaggcga gttacatgat cccccatgtt
gtgcaaaaaa 8760gcggttagct ccttcggtcc tccgatcgtt gtcagaagta agttggccgc
agtgttatca 8820ctcatggtta tggcagcact gcataattct cttactgtca tgccatccgt
aagatgcttt 8880tctgtgactg gtgagtactc aaccaagtca ttctgagaat agtgtatgcg
gcgaccgagt 8940tgctcttgcc cggcgtcaac acgggataat accgcgccac atagcagaac
tttaaaagtg 9000ctcatcattg gaaaacgttc ttcggggcga aaactctcaa ggatcttacc
gctgttgaga 9060tccagttcga tgtaacccac tcgtgcaccc aactgatctt cagcatcttt
tactttcacc 9120agcgtttctg ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg
aataagggcg 9180acacggaaat gttgaatact catactcttc ctttttcaat attattgaag
catttatcag 9240ggttattgtc tcatgagcgg atacatattt gaatgtattt agaaaaataa
acaaataggg 9300gttccgcgca catttccccg aaaagtgcca cctgacgtct aagaaaccat
tattatcatg 9360acattaacct ataaaaatag gcgtatcacg aggccctttc gtcttcaaga
attaattctc 9420atgtttgaca gcttatcatc gataagctga ctcatgttgg tattgtgaaa
tagacgcaga 9480tcgggaacac tgaaaaataa cagttattat tcg
9513
SEQ ID NO: 78, length 1720, DNA, Artificial sequence, pPNic602 insert
cctgacgtct
aagaaaccat tattatcatg acattaacct ataaaaatag gcgtatcacg 60aggccctttc
gtcttcaaga attaattctc atgtttgaca gcttatcatc gataagctga 120ctcatgttgg
tattgtgaaa tagacgcaga tcgggaacac tgaaaaataa cagttattat 180tcgtttcagg
caaccagagc tacatagtga gatcctgtct caacaaaaat aaaataatct 240aaggcttcaa
agggttcaat ctcttaggta gctaaatatg aacaaaattt gggaaatgtg 300accttttcct
tagtgacagt cagatagaac cttctcgagt gcaaggacac caagtgcaaa 360caggctcaag
aacagcctgg aaaggtctag tgctatgggg cttcaggtcg aatgccaact 420gttttcaaga
actgtgtgga tttttctgcc tgtaacgaat tcagattcat ttttcaaaac 480tcggggagag
ttttccccct ttataatttt ttttttaaat ttattaaact ttgtttcgtt 540ccccttgttt
tgagaattgc agagtcatcc accctgtcac agtgccaggg agctcaggga 600tgggcccagg
ggcctggcgg ggctgaaggg gctggggaag cgagggctcc aaagggaccc 660cagtgtggca
ggagccaaag ccctaggtcc ctagaacgca gaggccaccg ggacccccca 720gacggggtaa
gcgggtgggt gtctggggcg cgaagccgca ctgcgcatgc gccgaggtcc 780gctccggccg
cgctgatcca agccgggttc tcgcgccgac ctggtcgtga ttgacaagtc 840acacacgctg
atccctccgc ggggccgcac agggtcacag cctttcccct ccccacaaag 900ccccctactc
tctgggcacc acacacgaac attccttgag cgtgaccttg ttggctctag 960tcaggcgcct
ccggtgcaga gactggaacg gccttgggaa gtagtcccta accgcatttc 1020cgcggaggga
tcgtcgggag ggcgtggctt ctgaggatta tataaggcga ctccgggcgg 1080gtcttagcta
gttccgtcgg agacccgagt tcagtcgccg cttctctgtg aggactgctg 1140ccgccgccgc
tggtgaggag aagccgccgc gcttggcgta gctgagagac ggggaggggg 1200cgcggacacg
aggggcagcc cgcggcctgg acgttctgtt tccgtggccc gcgaggaagg 1260cgactgtcct
gaggcggagg acccagcggc aagatggcgg ccaagtggaa gcctgagggg 1320ataggcgagc
ggccctgagg cgctcgacgg ggttgggggg gaagcaggcc cgcgaggcag 1380ctgcagccgg
gaacgtgcgg ccaacccctt attttttttg acgggttgcg ggccgtaggt 1440gcctccgaag
tgagagccgt gggcgtttga ctgtcgggag aggtcggtcg gattttcatc 1500cgttgctaaa
gacggaagtg cgactgagac gggaaggggg gggagtcggt tggtggcggt 1560tgaacctgga
ctaaggcgca catgacgacg cggtttctat gggctcataa tgggtggtga 1620ggacatttcc
ctagatctaa catccaaaga cgaaaggttg aatgaaacct ttttgccatc 1680cgacatccac
aggtccattc tcacacataa gtgccaaacg
1720
SEQ ID NO: 79, length 455, DNA, Artificial sequence, EF1a promoter
ggatccttgg agctaagcca
gcaatggtag agggaagatt ctgcacgtcc cttccaggcg 60gcctccccgt caccaccccc
cccaacccgc cccgaccgga gctgagagta attcatacaa 120aaggactcgc ccctgccttg
gggaatccca gggaccgtcg ttaaactccc actaacgtag 180aacccagaga tcgctgcgtt
cccgccccct cacccgcccg ctctcgtcat cactgaggtg 240gagaagagca tgcgtgaggc
tccggtgccc gtcagtgggc agagcgcaca tcgcccacag 300tccccgagaa gttgggggga
ggggtcggca attgaaccgg tgcctagaga aggtggcgcg 360gggtaaactg ggaaagtgat
gtcgtgtact ggctccgcct ttttcccgag ggtgggggag 420aaccgtatat aagtgcagta
gtcgccgtga acgtt 455
SEQ ID NO: 80, length 2250, DNA, Artificial sequence, EEE1-A1
tgcagtgcca gacactacat agcctgatat actctagtgt taactgagaa
ggagcaccat 60tcccaccata ccttaaatat aatttaaaag agagagtgtg agaaactcaa
agtgaacaag 120actaaaagag actaagacac aaggcagaat aatacactcc ctcatggtgc
tgactttgcc 180caatttcatc aggcagtctt gggttaagcc taatcatcac tgttatacag
catgattttg 240gtccacattc aggtcaccca agacacagta cagggctggg ttttatatat
tgatatcaca 300gaggaaatat gtcttttggc ctgggtctgc agtatggagt gtgtttttct
tgtgcagggt 360aggcagcacc atgcttttcc tcctccttct gcctcagcat tggagggtta
gtaggagagg 420agctctagaa ccggttagaa ggagattaca taatcctttt ttatcctatt
atatgaatga 480aagatttggc ctatgttctt ggcttttggg ataatgactt acaatactac
ttatggttgt 540ggcccttacc agcgcggtgg gtggtgcgct gcttttagtc ctgatattgg
ggaggcagag 600gaaggaggat cactctcact acgtgggcac ctggtgtttt acgagactac
cacctctggc 660tgtgttacag tgaatagcac agaccctacc aatgcatata acaacaacta
gatctattgt 720gccacgaagg gtagctctcc agcctctctt atagttagga cctcatggaa
ggaagagggc 780aggaagctgt tctttggtag atacacctaa ctagagttca ttagtgagtg
catgcaccta 840atataaatat ttgatctaag gagcataggc tactaatctc ttaggtggtt
attttcactt 900ctatatttgg gaaagagtcc tataccctag ctaaagttac aaagaacctt
ctcgactcct 960aggtctggaa gtgcaaaagg gtgaaagaac agctgctatg ctgttgaggt
ttggcccctt 1020cccctcgtaa ggctagtcta tactacatca gagagcaata tacaggcaga
atcgtaatga 1080catacttata cctatacacg ggcacactat accccataaa attatacttt
agagcaataa 1140attatcatag atacgatggg catctatagg taattctatg accatccacc
ctgcatctgt 1200ggctgcctgg tgaggcaagc ccccaggggg cagccgggcc agtagcccct
ggggtaccgt 1260ggggtgctat gcctccccac tctgggaggt ggctatgggg ttgctgcttt
gtacggacag 1320ggctccggga cccccctgtc ggccttaccg gcagggagac aggcccgcgt
agccggagtc 1380cggaagcgcc gtggtgcggt gcgcccgcgc tgatcctacc cgctatgtcg
cgccgtcctg 1440gacgagttag tctactgaga gacgctgatc cctccgcggg gcccggagac
actgagaccc 1500tatgcccacc ctagattacc cctcaatctc tgggcaccac acacgaacat
tccttgagcg 1560tgaccttgtt ggctctagtc aggcgcctcc ggtgcagaga ctggaacggc
cttgggaagt 1620agtccctaac cgcatttccg cggagggatc gtcgggaggg cgtggcttct
gaggattata 1680taaggcgact ccgggcgggt cttagctagt tccgtcggag acccgagttc
agtcgccgct 1740tctctgtgag gactgctgcc gccgccgctg gtgaggagaa gccgccgcgc
ttggcgtagc 1800tgagagacgg ggagggggcg cggacacgag gggcagcccg cggcctggac
gttctgtttc 1860cgtggcccgc gaggaaggcg actgtcctga ggcggaggac ccagcggcaa
gatggcggcc 1920aagtggaagc ctgaggggat aggcgagcgg ccctgaggcg ctcgacgggg
ttggggggga 1980agcaggcccg cgaggcagct gcagccggga acgtgcggcc aaccccttat
tttttttgac 2040gggttgcggg ccgtaggtgc ctccgaagtg agagccgtgg gcgtttgact
gtcgggagag 2100gtcggtcgga ttttcatccg ttgctaaaga cggaagtgcg actgagacgg
gaaggggggg 2160gagtcggttg gtggcggttg aacctggact aaggcgcaca tgacgtcgcg
gtttctatgg 2220gctcataatg ggtggtgagg acatttccct
2250
SEQ ID NO: 81, length 1449, DNA, Artificial sequence, EEE1-A2
tttcaggcaa
ccagagctac atagtgagat cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg
gttcaatctc ttaggtagct aaatatgaac aaaatttggg aaatgtgacc 120ttttccttag
tgacagtcag atagaacctt ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac
agcctggaaa ggtctagtgc tatggggctt caggtcgaat gccaactgtt 240ttcaagaact
gtgtggattt ttctgcctgt aacgaattca gattcatttt tcaaaactcg 300gggagagttt
tcccccttta taattttttt tttaaattta ttaaactttg tttcgttccc 360cttgttttga
gaattgcaga gtcatccacc ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc
ctggcggggc tgaaggggct ggggaagcga gggctccaaa gggaccccag 480tgtggcagga
gccaaagccc taggtcccta gaacgcagag gccaccggga ccccccagac 540ggggtaagcg
ggtgggtgtc tggggcgcga agccgcactg cgcatgcgcc gaggtccgct 600ccggccgcgc
tgatccaagc cgggttctcg cgccgacctg gtcgtgattg acaagtcaca 660cacgctgatc
cctccgcggg gccgcacagg gtcacagcct ttcccctccc cacaaagccc 720cctactgtct
ggccatctct ctcgtagaat gcatcaccga gtgctagatg ccacaactga 780cccgcctccg
ctcctgtgtc agcataggcc ttgggaagta gtcctttagc ggaataccgc 840gcaccgttcg
acgtgagggc gtggctattg aggattatat aaggcgtcac cgcccggctg 900taacctagtt
ccgtcgcaca gccgagagta cccgccgctt ctcagagtgc agtccaggcg 960ccgccgctgg
tgaggagaag ccgccgcgct tggcgtagct gagagacggg gagggggcgc 1020ggacacgagg
ggcagcccgc ggcctggacg ttctgtttcc gtggcccgcg aggaaggcga 1080ctgtcctgag
gcggaggacc cagcggcaag atggcggcca agtggaagcc tgaggggata 1140ggcgagcggc
cctgaggcgc tcgacggggt tgggggggaa gcaggcccgc gaggcagctg 1200cagccgggaa
cgtgcggcca accccttatt ttttttgacg ggttgcgggc cgtaggtgcc 1260tccgaagtga
gagccgtggg cgtttgactg tcgggagagg tcggtcggat tttcatccgt 1320tgctaaagac
ggaagtgcga ctgagacggg aagggggggg agtcggttgg tggcggttga 1380acctggacta
aggcgcacat gacgtcgcgg tttctatggg ctcataatgg gtggtgagga 1440catttccct
1449
SEQ ID NO: 82, length 1449, DNA, Artificial sequence, EEE1-A3
tttcaggcaa ccagagctac
atagtgagat cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg gttcaatctc
ttaggtagct aaatatgaac aaaatttggg aaatgtgacc 120ttttccttag tgacagtcag
atagaacctt ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac agcctggaaa
ggtctagtgc tatggggctt caggtcgaat gccaactgtt 240ttcaagaact gtgtggattt
ttctgcctgt aacgaattca gattcatttt tcaaaactcg 300gggagagttt tcccccttta
taattttttt tttaaattta ttaaactttg tttcgttccc 360cttgttttga gaattgcaga
gtcatccacc ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc ctggcggggc
tgaaggggct ggggaagcga gggctccaaa gggaccccag 480tgtggcagga gccaaagccc
taggtcccta gaacgcagag gccaccggga ccccccagac 540ggggtaagcg ggtgggtgtc
tggggcgcga agccgcactg cgcatgcgcc gaggtccgct 600ccggccgcgc tgatccaagc
cgggttctcg cgccgacctg gtcgtgattg acaagtcaca 660cacgctgatc cctccgcggg
gccgcacagg gtcacagcct ttcccctccc cacaaagccc 720cctactctct gggcaccaca
cacgaacatt ccttgagcgt gaccttgttg gctctagtca 780ggcgcctccg gtgcagagac
tggaacggcc ttgggaagta gtccctaacc gcatttccgc 840ggagggatcg tcgggagggc
gtggcttctg aggattatat aaggcgactc cgggcgggtc 900ttagctagtt ccgtcggaga
cccgagttca gtcgccgctt ctctgtgagg actgctgccg 960ccgccgctgg tgaggtcaat
ccgccgcggt agccgattgt ctctctcggg gagggcccgc 1020gctgtcgtcc cctggggcgc
ggggacctcg aagacaaagc gacctgcgcg aggaaggcga 1080ctgtcctgag gcggaggtgc
ctccggcaag atggcgctgt tcacctttgg actccccatt 1140cccgtccggg ggactcccgg
acgtcgccca accccccctt cgtccggcgc gtccctccac 1200tgctcgcctt cgaccgccct
tggggaattt ttttttctcg ggttgcgggc cgatccaccc 1260agcgttcact ctcccgtggg
cgtttgtgac acgcctctcc acggtcgcta aaagtagcgt 1320tgctaaagac gcttcaccgt
gactgacggg aagggggggg agtcgcaacc acccggttct 1380tggacctgaa aggcggtgta
ctcgtcgcgc aaagataccc gactggggtt ttggtgagga 1440catttccct
1449
SEQ ID NO: 83, length 1449, DNA, Artificial sequence, EEE1-B1
tttcaggcaa ccagagctac atagtgagat cctgtctcaa caaaaataaa
ataatctaag 60gcttcaaagg gttcaatctc ttaggtagct aaatatgaac aaaatttggg
aaatgtgacc 120ttttccttag tgacagtcag ataactgcat tccgagtgca aggacaccaa
gtgcaaacag 180gctcaagaac agcctggaaa ggtctagtgc tatggggctt caggtcgaat
gccaactgtt 240ttcaagaact gtgtggattt ttctgcctgt aacgaattca gattcatttt
tcaaaactcg 300gggagagttt tcccccttta taattttttt tttaaattta ttaaactttg
tttcgttccc 360cttgttttga gaattgcaga gtcatccacc ctgtcacagt gccagggagc
tcagggatgg 420gcccaggggc ctggcggggc tgaaggggct ggggaagcga gggctccaaa
gggaccccag 480tgtggcagga gccaaagccc taggtcccta gaacgcagag gccaccggga
ccccccagac 540ggggtaagcg ggtgggtgtc tggggcgcga agccgcactg cgcatgcgcc
gaggtccgct 600ccggccgcgc ttgcacaagc cgggttctcg cgccgacctg gtcgtgattg
acaagtcaca 660cacgcttgca cctccgcggg gccgcacagg gtcacagcct ttcccctccc
cacaaagccc 720cctactctct gggcaccaca cacgaacatt ccttgagcgt gaccttgttg
gctctagtca 780ggcgcctccg gtgcagagac tggaacggcc ttgggctgag atacctaacc
gcatttccgc 840ggagggatcg tcgggagggc gtggcttctg aggattatat aaggcgactc
cgggcgggtc 900ttagctagtt ccgtcggaga cccgagttca gtcgccgctt ctctgtgagg
actgctgccg 960ccgccgctgg tgaggagaag ccgccgcgct tggcgtagct gagagacggg
gagggggcgc 1020ggacacgagg ggcagcccgc ggcctggacg ttctgtttcc gtggcccgcg
aggaaggcga 1080ctgtcctgag gcggaggacc cagcgtagag caggcggcca agtggaagcc
tgaggggata 1140ggcgagcggc cctgaggcgc tcgacggggt tgggggggaa gcaggcccgc
gaggcagctg 1200cagccgggaa cgtgcggcca accccttatt ttttttgacg ggttgcgggc
cgtaggtgcc 1260tccgaagtga gagccgtggg cgtttgactg tcgggagagg tcggtcggat
tttcatccgt 1320tgctaaagac ggaagtgcga ctgagacggg aagggggggg agtcggttgg
tggcggttga 1380acctggacta aggcgcacat gacgtcgcgg tttctatggg ctcataatgg
gagggtgtga 1440catttccct
1449
SEQ ID NO: 84, length 1449, DNA, Artificial sequence, EEE1-B2
tttcaggcaa
ccagagctac atagtgagat cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg
gttcaatctc ttaggtagct aaatatgaac aaaatttggg aaatgtgacc 120ttttccttag
tgacagtcag atagaacctt ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac
agcctggaaa ggtctagtgc tatggggctt caggtccaat gccaactgtt 240ttcaagaact
gtgtggattt ttctgcctgt aaccaattca gattcatttt tcaaaactgg 300gggagagttt
tcccccttta taattttttt tttaaattta ttaaactttg tttccttccc 360cttgttttga
gaattgcaga gtcatccacc ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc
ctgggggggc tgaaggggct ggggaagcca gggctccaaa gggaccccag 480tgtggcagga
gccaaagccc taggtcccta gaacccagag gccaccggga ccccccagac 540ggggtaaggg
ggtgggtgtc tggggcccca agccccactg cccatgggcc caggtcccct 600ccggccgcgc
tgatccaagc cgggttctgg cccccacctg gtggtgattg acaagtcaca 660cacgctgatc
cctccgcggg gccgcacagg gtcacagcct ttcccctccc cacaaagccc 720cctactctct
gggcaccaca caggaacatt ccttgagcct gaccttgttg gctctagtca 780ggggcctccg
gtgcagagac tggaacggcc ttgggaagta gtccctaacc gcatttcccc 840ggagggatcg
tcgggagggc gtggcttctg aggattatat aaggccactc ctggggggtc 900ttagctagtt
ccctgggaga ccccagttca gtcccccctt ctctgtgagg actgctgccc 960ccgccgctgg
tgaggagaag ccccctcgct tggcgtagct gagagacggg gagggggcgc 1020ggacacgagg
ggcagccccc tgcctggacc ttctgtttcc gtggcccgcg aggaaggcga 1080ctgtcctgag
gcggaggacc cagcggcaag atggcggcca agtggaagcc tgaggggata 1140ggccaggggc
cctgaggccc tccagggggt tgggggggaa gcaggcccgc caggcagctg 1200cagctgggaa
cctgcggcca accccttatt ttttttgacg ggttgggggc cgtaggtgcc 1260tccgaagtga
gagccgtggg cgtttgactg tcgggagagg tcggtcggat tttcatccgt 1320tgctaaagac
ggaagtgcga ctgagacggg aagggggggg agtcggttgg tgggggttga 1380acctggacta
aggcccacat gacgtccctg tttctatggg ctcataatgg gtggtgagga 1440catttccct
1449
SEQ ID NO: 85, length 1449, DNA, Artificial sequence, EEE1-B3
tttcaggcaa ccagagctac
atagtgagat cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg gttcaatctc
ttaggtagct aaatatgaac aaaatttggg aaatgtgacc 120ttttccttag tgacagtcag
atagaacctt ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac agcctggaaa
ggtctagtgc tatggggctt caggtcgaat gccaactgtt 240ttcaagaact gtgtggattt
ttctgcctgt aacgaattca gattcatttt tcaaaactcg 300gggagagttt tcccccttta
taattttttt tttaaattta ttaaactttg tttcgttccc 360cttgttttga gaattgcaga
gtcatccacc ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc ctggcggggc
tgaaggggct ggggaagcga gggctccaaa gggaccccag 480tgtggcagga gccaaagccc
taggtcccta gaacgcagag gccaccggga ccccccagac 540ggggtaagcg ggtgggtgtc
tggggcgcca agccccactg cccatgcgcc gaggtcccct 600cctgccgcgc tgatccaagc
cgggttctcg cgccgacctg gtcgtgattg acaagtcaca 660cacgctgatc cctccgcggg
gccgcacagg gtcacagcct ttcccctccc cacaaagccc 720cctactctct gggcaccaca
cacgaacatt ccttgagcgt gaccttgttg gctctagtca 780ggcgcctccg gtgcagagac
tggaacggcc ttgggaagta gtccctaacc gcatttccgc 840ggagggatcg tcgggagggc
gtggcttctg aggattatat aaggcgactc cgggcgggtc 900ttagctagtt ccgtcggaga
cccgagttca gtcgccgctt ctctgtgagg actgctgccg 960ccgccgctgg tgaggagaag
ccgccgcgct tggcgtagct gagagacggg gagggggcgc 1020ggacacgagg ggcagcccgc
ggcctggacg ttctgtttcc gtggcccgcg aggaaggcga 1080ctgtcctgag gcggaggacc
cagcggcaag atggcggcca agtggaagcc tgaggggata 1140ggcgagcggc cctgaggcgc
tcgacggggt tgggggggaa gcaggcccgc gaggcagctg 1200cagccgggaa cgtgcggcca
accccttatt ttttttgacg ggttgcgggc cgtaggtgcc 1260tccgaagtga gagccgtggg
cgtttgactg tcgggagagg tcggtcggat tttcatccgt 1320tgctaaagac ggaagtgcga
ctgagacggg aagggggggg agtcggttgg tggcggttga 1380acctggacta aggcgcacat
gacgtcgcgg tttctatggg ctcataatgg gtggtgagga 1440catttccct
1449
SEQ ID NO: 86, length 1449, DNA, Artificial sequence, EEE1-B4
tttcaggcaa ccagagctac atagtgagat cctgtctcaa caaaaataaa
ataatctaag 60gcttcaaagg gttcaatctc ttaggtagct aaatatgaac aaaatttggg
aaatgtgacc 120ttttccttag tgacagtcag atagaacctt ctcgagtgca aggacaccaa
gtgcaaacag 180gctcaagaac agcctggaaa ggtctagtgc tatggggctt caggtccaat
gccaactgtt 240ttcaagaact gtgtggattt ttctgcctgt aaggaattca gattcatttt
tcaaaactcg 300gggagagttt tcccccttta taattttttt tttaaattta ttaaactttg
tttccttccc 360cttgttttga gaattgcaga gtcatccacc ctgtcacagt gccagggagc
tcagggatgg 420gcccaggggc ctgggggggc tgaaggggct ggggaaggga gggctccaaa
gggaccccag 480tgtggcagga gccaaagccc taggtcccta gaacccagag gccaccggga
ccccccagag 540ggggtaagcg ggtgggtgtc tggggcccga agccccactg cgcatgcccc
gaggtcccct 600ccggccccgc tgatccaagc ctggttctcg ccccgacctg gtcgtgattg
acaagtcaca 660cacgctgatc cctccgcggg gccccacagg gtcacagcct ttcccctccc
cacaaagccc 720cctactctct gggcaccaca caccaacatt ccttgagcgt gaccttgttg
gctctagtca 780ggggcctccg gtgcagagac tggaagggcc ttgggaagta gtccctaacc
gcatttcccc 840ggagggatcc tcgggagggg gtggcttctg aggattatat aaggccactc
cgggcgggtc 900ttagctagtt ccctcggaga ccccagttca gtcgcccctt ctctgtgagg
actgctgccg 960cccccgctgg tgaggagaag cccccgccct tggggtagct gagagacggg
gagggggccc 1020ggacaggagg ggcagcccgg ggcctggacg ttctgtttcc ctggcccgcg
aggaaggcca 1080ctgtcctgag ggggaggacc cagcggcaag atggcggcca agtggaagcc
tgaggggata 1140ggcgaggggc cctgaggcgc tccacggggt tgggggggaa gcaggccccc
gaggcagctg 1200cagcctggaa cgtggggcca accccttatt ttttttgacg ggttgggggc
cgtaggtgcc 1260tcccaagtga gagccctggg cctttgactg tcgggagagg tcggtcggat
tttcatccct 1320tgctaaagag ggaagtggga ctgagacggg aagggggggg agtgggttgg
tggcggttga 1380acctggacta aggggcacat gacgtcccgg tttctatggg ctcataatgg
gtggtgagga 1440catttccct
1449
SEQ ID NO: 87, length 1449, DNA, Artificial sequence, EEE1-B5
tttcaggcaa
ccagagctac atagtgagat cctgtctcaa caaaaataaa ataatctaag 60gcttcaaagg
gttcaatctc ttaggtagct aaatatgaac aaaagtatta ggatgtgacc 120ttttccttag
tgacagtcag atagaacctt ctcgagtgca aggacaccaa gtgcaaacag 180gctcaagaac
agcctggaaa ggtctagtgc tatggggctt caggtcgaat gccaactgtt 240ttcaagaact
gtgtggattt ttctgcctgt aacgaattca gattcatttt tcaaaactcg 300gggagagttt
tcccccttta taattttttt tttaaattta ttaaactttg tttcgttccc 360cttgttttga
gaattgcaga gtcatccacc ctgtcacagt gccagggagc tcagggatgg 420gcccaggggc
ctggcggggc tgaaggggct ggggaagcga gggctccaaa gggaccccag 480tgtggcagga
gccaaagccc taggtcccta gaacgcagag gccaccgccc aggcccagac 540ggggtaagcg
ggtgggtgtc tggggcgcga agccgcactg cgcatgcgcc gaggtccgct 600ccggccgcgc
tgatccaagc cgggttctcg cgccgacctg gtcgtgattg acaagtcaca 660cacgctgatc
cctccgcggg gccgcacagg gtcacagcct ttcccctccc cacaaagccc 720cctactctct
gggcaccaca cacgaacatt ccttgagcgt gaccttgttg gctctagtca 780ggcgcctccg
gtgcagagac tggaacggcc tggtatggaa gtccctaacc gcatttccgc 840ggagggatcg
tcggggagcc ggggtttctg aggattatat aaggcgactc cgggcgggtc 900ttagctagtt
ccgtcggaga cccgagttca gtcgccgctt ctctgtgagg actgctgccg 960ccgccgctgg
tgaggagaag ccgccgcgct tggcgtagct gagagacgga gggggggcgc 1020ggacacgagg
ggcagcccgc ggcctggacg ttctgtttcc gtggcccgcg ggaggcacgt 1080gaggataggc
acggtgcacc cagcggcaag atggcggcca agtggaagcc tgaggggata 1140ggcgagcggc
cctgaggcgc tcgacggggt tgggggggaa gcaggcccgc gaggcagctg 1200cagccgggaa
cgtgcggcca accccttatt ttttttgacg ggttgcgggc cgtaggtgcc 1260tccgaagtga
gagccgtgtg cgtgtgactg tcgggagagg tcggtcggat tttcatccga 1320tagttacaga
cggaagtgcg actgaggcgg ggaggaggag agtcggttgg tggcggttga 1380acctggacta
aggcgcacat gacgtcgcgg tttctatggg ctcataatgg gtggtgaggt 1440cattcacct
1449
SEQ ID NO: 88, length 2617, DNA, Artificial sequence, Transcription regulating sequence
tttcaggcaa ccagagctac atagtgagat cctgtctcaa caaaaataaa ataatctaag
60gcttcaaagg gttcaatctc ttaggtagct aaatatgaac aaaatttggg aaatgtgacc
120ttttccttag tgacagtcag atagaacctt ctcgagtgca aggacaccaa gtgcaaacag
180gctcaagaac agcctggaaa ggtctagtgc tatggggctt caggtcgaat gccaactgtt
240ttcaagaact gtgtggattt ttctgcctgt aacgaattca gattcatttt tcaaaactcg
300gggagagttt tcccccttta taattttttt tttaaattta ttaaactttg tttcgttccc
360cttgttttga gaattgcaga gtcatccacc ctgtcacagt gccagggagc tcagggatgg
420gcccaggggc ctggcggggc tgaaggggct ggggaagcga gggctccaaa gggaccccag
480tgtggcagga gccaaagccc taggtcccta gaacgcagag gccaccggga ccccccagac
540ggggtaagcg ggtgggtgtc tggggcgcga agccgcactg cgcatgcgcc gaggtccgct
600ccggccgcgc tgatccaagc cgggttctcg cgccgacctg gtcgtgattg acaagtcaca
660cacgctgatc cctccgcggg gccgcacagg gtcacagcct ttcccctccc cacaaagccc
720cctactctct gggcaccaca cacgaacatt ccttgagcgt gaccttgttg gctctagtca
780ggcgcctccg gtgcagagac tggaacggcc ttgggaagta gtccctaacc gcatttccgc
840ggagggatcg tcgggagggc gtggcttctg aggattatat aaggcgactc cgggcgggtc
900ttagctagtt ccgtcggaga cccgagttca gtcgccgctt ctctgtgagg actgctgccg
960ccgccgctgg tgaggagaag ccgccgcgct tggcgtagct gagagacggg gagggggcgc
1020ggacacgagg ggcagcccgc ggcctggacg ttctgtttcc gtggcccgcg aggaaggcga
1080ctgtcctgag gcggaggacc cagcggcaag atggcggcca agtggaagcc tgaggggata
1140ggcgagcggc cctgaggcgc tcgacggggt tgggggggaa gcaggcccgc gaggcagctg
1200cagccgggaa cgtgcggcca accccttatt ttttttgacg ggttgcgggc cgtaggtgcc
1260tccgaagtga gagccgtggg cgtttgactg tcgggagagg tcggtcggat tttcatccgt
1320tgctaaagac ggaagtgcga ctgagacggg aagggggggg agtcggttgg tggcggttga
1380acctggacta aggcgcacat gacgtcgcgg tttctatggg ctcataatgg gtggtgagga
1440catttccctg actatagctt tccctcagtt gtaggacagg gtttgggcct cggcctcggg
1500ttaggctctc cagagtgggc aggaaccgga aatccagagg ggggaaaagt gagcctaaat
1560tgagttttgt ttcttgtcct atatggttta gagagagact cgctgcaaaa ccgtggctgg
1620cctggaactc tagaccagaa ccctggcctt tgccgaccca catgattaga ttcaaggcct
1680gtgccaccag cccaggcttt attattatgg tctgggattt ctgcgatttc atccctggtg
1740ttttgggatg atgacttgtg ggtcttccct cctccccctt actgtttctg tccatggcgt
1800gtgttctaac ccaagtttgt tcttttgggg gggtgggagg gttgcgataa aatgggatct
1860atctctgccc tcccaacttg agatctgcct gtcagaagtc tcagtgctga gaataaaggt
1920gtgcattggc tcagacctcg attttttttt tttttattat tttgtaggaa gtctgtagtc
1980cttacttgat acataagacc agacaggatc tgatttcctg cctatgaatg gtagatcctc
2040tcagtgactg cagtgtgaat ggggaccacg cttttctcca aactatgcag atagccatga
2100aagccatgaa atgactttca gccactggta ctgcaatatc cactcaccat ttattatatg
2160gaccaggttc accatgccta ggtggctttg cttttgagac acggtttctc tgtgtagcct
2220tggttatgtt tttttgtttg tttttttaat tatttttggt ttttcgagac agggtttctc
2280tgtgtagctt tggagcctat cctggcactt gctccggaga ccaggctggc ctccaactca
2340gatctgcctg cctctgcctc ccgactgctg ggattaaagt aaagccattc tgcaaccctg
2400aataccactc aataggtttc ttatttgaaa tgtggtttta tgatttttat ttctggattt
2460agaaaagaaa tcttcagaca gaagtcttca gacagaaact agctgtagtt tggctgtgtg
2520aactaaattg gcatccattt cacagcaatc caactgttag taccatacca cgaatatttg
2580tcattcctga cctgtttttt gtttgtgtgt gtgacag
2617