# Patent application title: Method for Deciding Whether a Sample is Consistent with an Established Production Norm for Heterogeneous Products

Inventors:
Giangiacomo Torri (Milano, IT)
Marco Guerrini (Saranno, IT)
Timothy Robert Rudd (Prenton Merseyside, GB)

Assignees:
Istituto di Ricerche Chimiche e Biochimiche G. Ronzoni
Anglo-Italian Chemometrics LTD.

IPC8 Class: AG01N2400FI

USPC Class:
702 28

Class name: Chemical analysis molecular structure or composition determination using radiant energy

Publication date: 2015-04-09

Patent application number: 20150100249

## Abstract:

A method of analysis of a heterogeneous product, for example heparin or heparin derivatives, to define whether said heterogeneous product is consistent with a library of verified heterogeneous samples (Library 1) by analysing the variation, natural or alien. The acceptable variation of the heterogeneous product is determined by comparing Library 1 with a second set of verified spectra (Library 2), by use of comparative two-dimensional correlation spectroscopic filtering (comparative 2D-COS-f). The method comprises obtaining a one-dimensional complex spectrum, for example a ^{1}H-NMR spectrum, of a heterogeneous product and testing whether it has features that are greater than the features found when testing a spectrum from Library 2 against Library 1. In a second embodiment, comparative 2D-COS-f with iterative random sampling (2D-COS-firs) is applied, which provides a more accurate and stable extraction of alien/unnatural features. The method defines whether a test sample is consistent with a library of production norms of heterogeneous products, determines the acceptance criteria to be considered as normal production for heterogeneous products, and detects species alien to the production norms of heterogeneous products.

## Claims:

**1.** Method of analysis of a heterogeneous product comprising:

a) obtaining a one-dimensional, complex spectrum of the heterogeneous product to be tested (Test sample),

b) obtaining a library of spectra of verified heterogeneous products (Library 1),

c) obtaining a second library of spectra of verified heterogeneous products (Library 2), wherein Library 2 contains a number of spectra x and Library 1 contains a number of spectra n, where n>x and n is more than 2, preferably more than 50,

d) comparing said Library 1 against said Library 2 by comparative two-dimensional correlation spectroscopic filtering (comparative 2D-COS-f),

e) comparing said Test spectra against said Library 1 by comparative 2D-COS-f,

f) identifying the features of said Test spectra which are not consistent with Library 1,

wherein the steps to perform comparative 2D-COS-f comprise:

i. mean-centering Library 1 ($x_{(library1)}$) by subtracting the mean spectra of Library 1 from each of the spectra in Library 1, obtaining the mean-centered data set $x$; $x = x_{(library1)\,ij} - x_{(library1)\,average\,i}$,

ii. determining the covariance matrix of the mean-centered Library 1 ($COV_{LIB}$), where $COV_{LIB} = \frac{1}{n-1}\,xx^{T}$,

iii. repeating steps i-ii with Library 1 plus one of the spectra from Library 2 obtaining the covariance matrix ($COV_{LIBTEST}$),

iv. subtracting $COV_{LIB}$ from $COV_{LIBTEST}$ obtaining the difference covariance matrix $\Delta COV_{LIBTEST-LIB}$,

v. repeating steps iii-iv for each of the spectra that are within Library 2,

vi. repeating steps i-ii with Library 1 plus the spectrum of the Test sample obtaining the covariance matrix ($COV_{TEST}$);

vii. subtracting $COV_{LIB}$ from $COV_{TEST}$ obtaining the difference covariance matrix $\Delta COV_{TEST-LIB}$,

wherein the Test sample is considered not consistent with the Library 1 of verified heterogeneous products when it has one or more features within $\Delta COV_{TEST-LIB}$ whose amplitude is greater than any of the features within $\Delta COV_{LIBTEST-LIB}$.

**2.** Method of analysis of a heterogeneous product comprising:

a) obtaining a one-dimensional, complex spectrum of the heterogeneous product to be tested (Test sample),

b) obtaining a library of spectra of verified heterogeneous products (Library 1),

c) obtaining a second library of spectra of verified heterogeneous products (Library 2), wherein Library 2 contains a number of spectra x and Library 1 contains a number of spectra n, where n>x and n is more than 2, preferably more than 50,

d) comparing said Library 1 against said Library 2 by comparative two-dimensional correlation spectroscopic filtering with iterative random sampling (2D-COS-firs),

e) comparing said Test spectra against said Library 1 by comparative 2D-COS-firs,

f) identifying the features of said Test spectra which are not consistent with Library 1,

wherein the steps to perform comparative 2D-COS-firs comprise:

i. mean-centering a randomly selected proportion of Library 1 by subtracting the mean spectra of said randomly selected proportion of Library 1 from each of the spectra in Library 1, obtaining the mean-centered data set $x$,

ii. determining the covariance matrix of the mean-centered randomly selected proportion of Library 1 ($COV_{LIB}$), where $COV_{LIB} = \frac{1}{n-1}\,xx^{T}$,

iii. repeating steps i-ii with said randomly selected proportion of Library 1 plus one randomly selected spectrum from Library 2 obtaining the covariance matrix ($COV_{LIBTEST}$),

iv. subtracting $COV_{LIB}$ from $COV_{LIBTEST}$ obtaining the difference covariance matrix $\Delta COV_{LIBTEST-LIB}$,

v. repeating steps i-iv a number j of times, wherein j is from 10 to 10000, preferably j>1000;

vi. repeating steps i-ii with said randomly selected proportion of Library 1 plus the spectrum of the Test sample obtaining the covariance matrix ($COV_{TEST}$);

vii. subtracting $COV_{LIB}$ from $COV_{TEST}$ obtaining the difference covariance matrix $\Delta COV_{TEST-LIB}$,

viii. repeating steps vi-vii of comparative 2D-COS-firs a number j of times, wherein j is from 10 to 10000, preferably j>1000;

further comprising determining the mean spectrum of the j repeats; determining a measure of the variation of Library 1 at each point of the spectra, preferably the 95% confidence interval at each point; and wherein the Test sample is considered not consistent with the Library 1 of verified heterogeneous products when the amplitude of any of the features within $\Delta COV_{TEST-LIB}$ is greater than the measure of the variation of Library 1 at each point of the spectra.

**3.**Method according to claim 1 further comprising: verifying the consistency of Library 1 and Library 2 by principal component analysis.

**4.**Method according to claim 1 wherein the one-dimensional, complex spectrum of the heterogeneous product to be tested is obtained by ^{1}H NMR.

**5.**Method according to claim 1 wherein the heterogeneous product is heparin, high-, low- or ultra-low-molecular weight, or heparin derivatives, wherein low molecular weight is comprised from 3000 to 7000 Da, preferably from 4000 to 6000 Da, and ultra-low molecular weight is comprised from 1200 to 3000 Da, preferably from 1600 to 2400 Da.

**6.**Method according to claim 1 wherein the output of any of comparative 2D-COS-f or 2D-COS-firs is further used in at least one statistical test.

**7.**Method according to claim 5 wherein the statistical test is at least one selected from the group containing principal component analysis, partial least squares, support vector machines.

## Description:

**BACKGROUND OF THE INVENTION**

**[0001]**Many areas of industrial production, including pharmaceutical production and food production, have to deal with structurally heterogeneous products, which are nevertheless regarded as one type of material. The property of a material of having a range of structures (heterogeneity) is an issue in industrial production because of the difficulty of monitoring the precise nature of the material.

**[0002]**In a molecule this heterogeneity can take several forms: it can comprise a range of different molecular weights in the case of a homopolymer (e.g. cellulose), varied sequence with the same molecular formula, varied sequence and molecular weight, as well as, in some cases, different degrees or types of substitution or branching.

**[0003]**Typically, the spread of structures can only be monitored overall. This is usually performed by using one or several physical techniques, which must be sensitive to one or more of the variable properties. These techniques measure the molecular weight or size, and report the average of this property for the material under examination.

**[0004]**However in many industrial processes, the ability to monitor the composition in more detail, for example to provide sequence information or the means of setting acceptance criteria for a particular measure of quality control, would be highly desirable.

**[0005]**A typical example of a heterogeneous product, related to the pharmaceutical industry, is the widely used anticoagulant agent heparin, which is a linear polysaccharide comprising a mixture of polysaccharide chains with both varied sequences and a spread of molecular weights. Moreover, since it is a natural product extracted from animal mucosa (at present), it is also subject to variation due to individual animal, regional variation and even seasonal differences. Furthermore, it can have additional structural modifications, which are introduced during the extraction and processing procedures. Heparin consists of 1,4 linked uronate-glucosamine units: the uronate residue is primarily α-L-iduronic acid (α-L-IdoA), but can also be the C-5 epimer β-D-glucuronic acid (β-D-GlcA). The uronic acid can be O-sulfated at position 2, while the α-D-glucosamine (α-D-GlcN) residue can be O-sulfated at positions 6 and 3, the latter being rarer. Furthermore, the glucosamine can have multiple modifications at position 2, being N-sulfated, N-acetylated or a free amine. The most common disaccharide is the tri-sulfated structure of 2-O-sulfated iduronic acid and 6-O-sulfated N-sulfated glucosamine.

**[0006]**Producers and regulatory authorities share an interest in knowing more about the composition of such materials for several reasons. The first is that it would provide the means by which a better-defined and reproducible product could be produced in the sense of it being more homogeneous. This would also help to provide a reference to which each production run could be compared. Second, more detailed information that can link structure and activity can be provided to the manufacturer. These are of considerable importance in the growing area of biotechnological production, including bio-similar compounds/agents, and generic products in the pharmaceutical industry.

**[0007]**The achievement of these aims is a considerable challenge as it is required to compare products, each of which consists of mixtures of material and whose compositions cannot be defined precisely owing to the difficulties of separation, identification and/or quantification of the components.

**[0008]**Several analytical techniques have been developed in order to analyse heterogeneous molecules, for example NMR analytical techniques.

**[0009]**One is principal component analysis (PCA), which decomposes a matrix of numerical data into a number of model features that, when recombined will reproduce closely the original dataset. This can be used to examine how different heterogeneous samples are related to each other, but lacks information on the alien features that can be present in a sample when compared to another.

**[0010]**Another is two-dimensional correlation spectroscopy (2D-COS), which is a means of elucidating correlated and uncorrelated changes in perturbed chemical systems; this perturbation may be mechanical or chemical. 2D-COS analysis can be performed on data generated by different forms of spectroscopy; it can be performed on a single dataset, as a perturbed chemical system observed by one form of spectroscopy (homo-correlations), or between spectroscopic data generated by two different forms of spectroscopy for the same system and then correlated together (hetero-correlations).

**[0011]**A development of 2D-COS is two-dimensional correlation spectroscopic filtering (2D-COS-f). In 2D-COS-f the spectrum of a heterogeneous sample is tested against a library of spectra of verified heterogeneous products (Library 1). Library 1 is used to "filter" the test sample spectrum, removing spectral features consistent with the library of verified heterogeneous products and leaving only alien features, if present. Any feature that remains is considered not to be consistent with the Library 1 of verified heterogeneous compounds. However, 2D-COS-f does not give information on whether the extracted alien features arise from variations due to natural heterogeneity or from unnatural signals. Indeed, since in a heterogeneous polymer no two samples are identical, it is conceivable that, if a bona fide heterogeneous test sample is analysed using 2D-COS-f against a library containing bona fide heterogeneous samples, spurious signals may be found. Thus a pass/fail criterion needs to be set that handles the natural variation within heterogeneous samples.

**[0012]**Therefore the need remains for a method of analysis of heterogeneous samples that is generally applicable and capable of providing an objective test for assessing the conformity of heterogeneous samples to set standards of production.

**BRIEF DESCRIPTION OF THE INVENTION**

**[0013]**The present invention provides a method of analysis of heterogeneous products, for example heparin, that can define whether said heterogeneous product is consistent with a library of verified heterogeneous samples by analysing the variation, whether natural or alien, within a set of heterogeneous samples.

**[0014]**In 2D-COS-f the spectrum of a heterogeneous product is filtered against a library of spectra of verified heterogeneous products (Library 1) and any feature that is not consistent with Library 1 is considered an alien feature.

**[0015]**The method of the present invention is a new development of 2D-COS-f, which makes use of a second set of verified spectra (Library 2) to determine the acceptable variation of the heterogeneous product. Said method is termed "comparative 2D-COS-f" since it compares a filtered test sample with a filtered library of bona fide heterogeneous samples.

**[0016]**In one embodiment the method comprises obtaining one-dimensional complex spectra of a heterogeneous product and applying comparative 2D-COS-f. The one-dimensional complex spectra/chromatograms are, for example, ^{1}H-NMR spectra, mass spectra, infrared spectra, Raman spectra, chromatograms produced by liquid/gas chromatography, near-infrared spectra and UV spectra. If a heterogeneous product tested against Library 1 has features that are greater than the features found when testing a spectrum from Library 2 against Library 1, the features are considered not to be consistent with those of Library 1.

**[0017]**In a second embodiment the method comprises obtaining one-dimensional complex spectra of a heterogeneous product and applying comparative 2D-COS-f with iterative random sampling (2D-COS-firs). The one-dimensional complex spectra/chromatograms are, for example, ^{1}H-NMR spectra, mass spectra, infrared spectra, Raman spectra, chromatograms produced by liquid/gas chromatography, near-infrared spectra and UV spectra. Each component is tested against the others randomly: a proportion of Library 1 is randomly selected and a randomly selected spectrum from Library 2 is used in each iteration, while the test sample remains constant. This embodiment provides a measure of the variation within the verified products spectra, to which the test sample can be compared. The iterative random sampling process provides a more accurate and stable extraction of alien/unnatural features.

**[0018]**In another embodiment of the invention, the output of any of comparative 2D-COS-f or 2D-COS-firs can be used in further statistical tests, for example principal component analysis, partial least squares or support vector machines, to identify common/known and alien features within the test sample.

**[0019]**Optionally the content of Library 1 and the content of Library 2 can be tested using principal component analysis in order to verify that they are consistent with each other.

**[0020]**In the present specifications, spectrum/spectra is/are defined as any complex one-dimensional datum/dataset.

**[0021]**Through the different embodiments of the invention it is possible to decide whether a test sample is consistent with a library of production norms of heterogeneous products, to determine the acceptance criteria to be considered as normal production for heterogeneous products and to detect species alien to the production norms of heterogeneous products.

**DESCRIPTION OF FIGURES**

**[0022]**FIG. 1: ^{1}H NMR spectrum of porcine intestinal mucosal heparin, within the region of 1.95-6.00 ppm; the water peak has been cut (~4.90-4.75 ppm).

**[0023]**FIG. 2: Principal component analysis of a library of bona fide heparin ^{1}H NMR spectra. Top left: Scree plot, the measure of the variation within the data set. Top right, bottom right and left: sequential loading plots of component one, two and three. The loadings are the proportion of each `ideal` spectrum contained within a sample's actual spectrum, as determined by principal component analysis.

**[0024]**FIG. 3: Principal component analysis of a library of bona fide heparin ^{1}H NMR spectra. Score plots for component one, two and three. The ideal features extracted by principal component analysis.

**[0025]**FIG. 4: Principal component analysis of a library of bona fide heparin ^{1}H NMR spectra. Hierarchical cluster analysis of the distance matrix of the loadings for component one, two and three.

**[0026]**FIG. 5: Principal component analysis of a library of bona fide heparin ^{1}H NMR spectra. Network analysis of the distance matrix of the loadings for component one, two and three.

**[0027]**FIG. 6. ^{1}H NMR spectrum of porcine intestinal mucosal heparin (solid line) and a porcine intestinal heparin adulterated with 10% (w/w) bovine intestinal mucosal heparin (dotted line), within the region of 1.95-6.00 ppm; the water peak has been cut (~4.90-4.75 ppm). Note that the two spectra are indistinguishable.

**[0028]**FIG. 7. Principal component analysis of a library of bona fide porcine intestinal heparin ^{1}H NMR spectra containing one sample adulterated with 10% (w/w) bovine intestinal mucosal heparin. Top left: Scree plot, the measure of the variation within the data set. Top right, bottom right and left: sequential loading plots of component one, two and three. The loadings are the proportion of each `ideal` spectrum contained within a sample's actual spectrum, as determined by principal component analysis. Note that the contaminated sample (open circle) is not clearly distinguishable from the bona fide porcine intestinal heparin.

**[0029]**FIG. 8. Principal component analysis of a library of bona fide heparin ^{1}H NMR spectra containing one sample adulterated with 10% (w/w) bovine intestinal mucosal heparin. Score plots for component one, two and three. The ideal features extracted by principal component analysis.

**[0030]**FIG. 9. 2D-COS-f analysis of a heparin sample adulterated with 10% (w/w) bovine intestinal mucosal heparin tested against a library of bona fide porcine intestinal heparin. A) covariance matrix of Library 1, the library of bona fide porcine intestinal heparin. B) covariance matrix of Library 1 plus the adulterated test sample. C) difference of B and A. The matrix in panel C contains the features that are not consistent with the spectral features contained within Library 1, i.e., the alien bovine heparin signals.

**[0031]**FIG. 10. The diagonal of FIG. 9 panel C. The power spectrum contains features of the alien bovine heparin (positive correlations [features above the x axis] and negative correlations [features below the x axis], dotted line). The diagonal is the variance of the covariance matrix. The amplitude is normalised to the maximum value of the covariance matrix of Library 1.

**[0032]**FIG. 11. 2D-COS-f analysis of a porcine intestinal mucosal heparin sample adulterated with 10% (w/w) bovine intestinal mucosal heparin tested against a library of bona fide porcine intestinal heparin (black line, positive correlations [features above the x axis] and negative correlations [features below the x axis]). In this circumstance, as well as the test sample being tested against Library 1, the definition of the heterogeneous sample, a bona fide heparin not contained within Library 1 is also tested against the library (+ symbol). This second test illustrates the acceptable variation of the heterogeneous product in question. If the amplitude of the filtered spectrum of the test sample is greater than this, it is considered to contain alien or non-consistent features.

**[0033]**FIG. 12. 2D-COS-f analysis of a porcine intestinal mucosal heparin sample adulterated with 10% (w/w) bovine intestinal mucosal heparin tested against a library of bona fide porcine intestinal heparin (+ symbol). In this circumstance, as well as the test sample being tested against Library 1, the definition of the heterogeneous sample, a second set of bona fide heparin, Library 2, not contained within Library 1 is also tested against Library 1 (black polygon). This second test illustrates the acceptable variation of the heterogeneous product in question. If the amplitude of the filtered spectrum of the test sample is greater than this, then it is considered to contain alien or non-consistent features. The black polygon is the pass/fail criterion for the sample being a verified heterogeneous sample. In this circumstance the black dotted line is the modulus of the 95% confidence interval ($|x \pm SE_{x} \times 1.96|$) for 1500 iterations of random sampling.

**[0034]**FIG. 13. The effect of Library 1 size on 2D-COS-firs. The size of Library 1, that defines the heterogeneous polymer, varied from 10 to 57 spectra, with a step size of 1. At each step a test sample (heparin contaminated with 1% bovine mucosal or ovine mucosal heparin) was filtered with 100 iterations, with the mean spectrum and the standard deviation at each point along the spectrum being recorded. A) The absolute response (area under the modulus of the power spectrum) at each step is plotted; open circles: randomly filtered spectrum from Library 2 (pass or fail criteria); black square: heparin contaminated with 1% ovine mucosal heparin; black circle: heparin contaminated with 1% bovine heparin. B) Standard deviation at a randomly chosen point [3.03 ppm]. C) The absolute response (area under the modulus of the power spectrum) plotted against number of iterations for 2D-COS-firs of porcine intestinal mucosal heparin contaminated with 5% (small contamination) [open triangle] and 20% (gross contamination) [open square] bovine mucosal heparin. D) Standard deviation at a randomly chosen point [3.03 ppm] for the filtered spectra of heparin adulterated with 30% to 1% bovine heparin.

**[0035]**FIG. 14. Principal component analysis of one porcine intestinal mucosal heparin sample that has been contaminated with 1% (w/w) bovine mucosal heparin and of ten porcine intestinal mucosal heparin samples. In this circumstance all the sample spectra have been filtered by Library 1, the definition of porcine intestinal mucosal heparin; this removes all signals from the spectra that are consistent with features contained within Library 1. In FIG. 7 it is difficult to differentiate a heparin sample contaminated with 10% (w/w) bovine intestinal mucosal heparin, whereas after filtering using 2D-COS-firs it is possible to differentiate a sample contaminated with a much lower amount of material.

**[0036]**FIG. 15. 2D-COS-firs of a generic LMWH, with Library 1 now containing Lovenox LMWH. Here 2D-COS-firs is used to illustrate the features within the generic LMWH that are not common with the Lovenox samples contained within Library 1. The upper panel shows the filtered result; the lower panel shows the test sample spectrum and an example spectrum of Lovenox.

**DETAILED DESCRIPTION OF THE INVENTION**

**[0037]**The present invention provides a method of analysis of heterogeneous products capable of defining whether a heterogeneous product, for example a natural or bio-manufactured product, is consistent with a library of verified heterogeneous samples by analysing the variation, whether natural or alien, within a set of heterogeneous samples.

**[0038]**Preferably the heterogeneous product is heparin, high-, low- (MW from 3000 to 7000 Da, preferably from 4000 to 6000 Da) and ultra-low- (MW from 1200 to 3000 Da, preferably from 1600 to 2400 Da) molecular weight. In other preferred embodiments the heterogeneous product consists typically of polymer chains which, even though they contain consistent levels of subunits (or within some range), nevertheless are characterised by chains in which the sequence of these sub-units is variable.

**[0039]**In a first embodiment a one-dimensional complex spectrum of the product to be tested (Test sample) is obtained and it is tested against a library of verified heterogeneous products (Library 1) by use of comparative 2D-COS-f as described hereafter. The one-dimensional complex spectra/chromatograms are, for example, ^{1}H-NMR spectra, mass spectra, infrared spectra, Raman spectra, chromatograms produced by liquid/gas chromatography, near-infrared spectra and UV spectra. Preferably the one-dimensional complex spectrum is a ^{1}H-NMR spectrum.

**[0040]**Before testing the heterogeneous product against Library 1 by use of comparative 2D-COS-f, a second library of verified products (Library 2) is tested against Library 1 by use of comparative 2D-COS-f in order to determine the acceptable variation within Library 1. Both Libraries 1 and 2 comprise bona fide samples of the heterogeneous product.

**[0041]**A suitable Library 1 contains more than 2 spectra, preferably more than 50 spectra.

**[0042]**Library 1 contains a number of spectra greater than the number of spectra of Library 2.

**[0043]**The contents of Library 1, that defines the features of the heterogeneous product, and the contents of Library 2, that measures the acceptable variation of the heterogeneous product, comply with the requisite regulations. The consistency of the members of Library 2 with Library 1 can also be confirmed using an explorative statistical technique such as principal component analysis.

**[0044]**Library 1, ($x_{(library1)}$), is mean-centred by subtracting the mean spectrum of Library 1 from each of the spectra in Library 1 (i) and a mean-centred data set $x$ is obtained ($x = x_{(library1)\,ij} - x_{(library1)\,average\,i}$). The covariance matrix of the mean-centred Library 1 ($COV_{LIB}$) is then determined (ii), where $COV_{LIB}$ is equal to the outer product matrix of $x$, scaled to the number of spectra $n$ in the dataset ($COV_{LIB} = \frac{1}{n-1}\,xx^{T}$); steps (i) and (ii) are then repeated on Library 1 plus one of the spectra from Library 2 and the covariance matrix $COV_{LIBTEST}$ is obtained (iii); $COV_{LIB}$ is subtracted from $COV_{LIBTEST}$ (iv), obtaining the difference covariance matrix $\Delta COV_{LIBTEST-LIB}$ ($COV_{LIBTEST} - COV_{LIB} = \Delta COV_{LIBTEST-LIB}$).

**[0045]**$\Delta COV_{LIBTEST-LIB}$ is a measure of the acceptable variation within the heterogeneous samples.

**[0046]**The same procedure (iii-iv) is repeated one by one for all the spectra that are within Library 2 (v): the resulting difference covariance spectra form the acceptance criteria for deciding whether or not the Test sample conforms to the library.
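By way of non-limiting illustration, steps (i) to (v) may be sketched in Python with NumPy. The sketch assumes that spectra are stored as columns of a (points x spectra) array, so that $xx^{T}$ is a points-by-points matrix, and that each covariance matrix is scaled by the number of spectra in its own data set minus one. The function names and the synthetic data are hypothetical and are not taken from the present specification.

```python
import numpy as np

def covariance(spectra):
    """Steps (i)-(ii): covariance of a (points x spectra) array."""
    # (i) mean-centre: subtract the mean spectrum from every spectrum
    x = spectra - spectra.mean(axis=1, keepdims=True)
    # (ii) outer product scaled by (number of spectra - 1)
    return x @ x.T / (spectra.shape[1] - 1)

def difference_covariance(library1, extra_spectrum):
    """Steps (iii)-(iv): COV(Library 1 + one spectrum) minus COV(Library 1)."""
    augmented = np.column_stack([library1, extra_spectrum])
    return covariance(augmented) - covariance(library1)

# Synthetic stand-in data (assumption): 40 spectral points, n = 12, x = 4
rng = np.random.default_rng(0)
points, n, x_count = 40, 12, 4
base = np.sin(np.linspace(0.0, 3.0, points))
library1 = base[:, None] + 0.01 * rng.standard_normal((points, n))
library2 = base[:, None] + 0.01 * rng.standard_normal((points, x_count))

# (v) one difference covariance matrix per spectrum of Library 2
delta_libtest = [difference_covariance(library1, library2[:, k])
                 for k in range(x_count)]
```

Each entry of `delta_libtest` is a symmetric points-by-points matrix corresponding to one $\Delta COV_{LIBTEST-LIB}$.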

**[0047]**Steps (i) and (ii) are then repeated on Library 1 plus the spectrum of the heterogeneous product to be tested (Test sample), obtaining the covariance matrix $COV_{TEST}$ (vi). $COV_{LIB}$ is then subtracted from $COV_{TEST}$ (vii), forming the difference covariance matrix $\Delta COV_{TEST-LIB}$ ($COV_{TEST} - COV_{LIB} = \Delta COV_{TEST-LIB}$), revealing the features of the Test sample that are not consistent with Library 1.
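A short derivation may help to show why the subtraction isolates the contribution of the added spectrum. It assumes, as the 1/(n-1) scaling suggests, that each covariance matrix is divided by the number of spectra in its own data set minus one (n-1 for Library 1, n for Library 1 plus one spectrum). The symbols $m$ (mean spectrum of Library 1), $s$ (the added spectrum) and $d = s - m$ are introduced here for illustration only and do not appear elsewhere in the specification.

$$
\begin{aligned}
COV_{TEST} &= \frac{n-1}{n}\,COV_{LIB} + \frac{1}{n+1}\,d\,d^{T},\\
\Delta COV_{TEST-LIB} &= COV_{TEST} - COV_{LIB} = \frac{1}{n+1}\,d\,d^{T} - \frac{1}{n}\,COV_{LIB}.
\end{aligned}
$$

Under these assumptions the difference matrix is dominated by the outer product of the deviation of the added spectrum from the library mean, together with a small negative term that shrinks as n grows. The same identity applies to the difference covariance matrix obtained with a spectrum of Library 2 in place of the Test sample.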

**[0048]**The following pass-fail criterion is applied in the analysis of the Test sample: if the amplitude of any of the features within $\Delta COV_{TEST-LIB}$ is greater than any of the features within $\Delta COV_{LIBTEST-LIB}$, then the Test sample fails the test (see scheme 1).

**[Scheme 1]**
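By way of non-limiting illustration, the pass-fail criterion of the first embodiment may be sketched in Python with NumPy. The sketch adopts one possible reading of the criterion, namely that the largest absolute amplitude in the Test difference matrix is compared with the largest absolute amplitude found over all Library 2 difference matrices; a point-by-point comparison would be an equally plausible reading. Spectra are assumed to be stored as columns, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def difference_covariance(library1, extra_spectrum):
    """COV(Library 1 + one spectrum) minus COV(Library 1); columns are spectra."""
    augmented = np.column_stack([library1, extra_spectrum])
    # np.cov treats rows as variables (spectral points) and scales by N - 1
    return np.cov(augmented) - np.cov(library1)

def fails_test(library1, library2, test_spectrum):
    """True when the Test sample shows a feature larger than any Library 2 feature."""
    # Acceptable variation: largest amplitude over all Library 2 comparisons
    threshold = max(
        np.abs(difference_covariance(library1, library2[:, k])).max()
        for k in range(library2.shape[1])
    )
    delta_test = difference_covariance(library1, test_spectrum)
    return bool(np.abs(delta_test).max() > threshold)

# Synthetic stand-in data (assumption): one native band plus noise
rng = np.random.default_rng(1)
points = 50
index = np.arange(points)
base = np.exp(-((index - 15) / 3.0) ** 2)
library1 = base[:, None] + 0.01 * rng.standard_normal((points, 30))
library2 = base[:, None] + 0.01 * rng.standard_normal((points, 5))

# A hypothetical adulterated sample carrying an extra (alien) band
alien_peak = 0.5 * np.exp(-((index - 35) / 1.5) ** 2)
adulterated = base + alien_peak + 0.01 * rng.standard_normal(points)

print(fails_test(library1, library2, adulterated))
```

With these synthetic data the alien band is far larger than the noise level, so the adulterated spectrum is flagged; real spectra would require appropriate pre-processing before such a comparison.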

**[0049]**In a second embodiment two-dimensional correlation spectroscopy filtering with iterative random sampling (2D-COS-firs) is applied.

**[0050]**2D-COS-firs provides a more accurate measure of the variation within the verified spectra (Library 2 against Library 1) and a more accurate extraction of alien/unnatural features from the spectrum of the test sample by using a randomly selected proportion of Library 1 with one randomly selected spectrum from Library 2 and iterating the procedure while the test sample remains constant.

**[0051]**In this second embodiment $\Delta COV_{LIBTEST-LIB}$ is determined for a randomly selected proportion of Library 1 and one randomly selected spectrum of Library 2 with the steps (i-iv) described for the first embodiment; these steps are repeated j times until the response is stable, where j is greater than 10, preferably j is from 10 to 8000, more preferably from 1000 to 2000. The mean is determined for the j filtered spectra and a measure of the variation is determined at each point along the spectra, for example the 95% confidence interval at each point (the mean value at a point ± the error of the mean at that point × 1.96). This forms the acceptance criteria for deciding whether or not the test spectrum conforms to the library (see scheme 2).

**[0052]**The covariance matrix $COV_{TEST}$ of the same randomly selected portion of Library 1 plus the test spectrum and the $\Delta COV_{TEST-LIB}$ are determined as in the first embodiment (steps vi and vii); these steps are repeated j times, until the response is stable, where j is greater than 10, preferably j is from 10 to 8000, more preferably from 1000 to 2000, determining the mean spectrum of the j repeats.

**[0053]**If the alien variation within the test sample (i.e. the amplitude of the spectrum determined by testing the test sample spectrum against Library 1) is greater than the natural variation of the library measured at each point of the spectra (i.e. the 95% confidence interval at each point), then the test sample is considered not to be consistent with the definition of the heterogeneous samples, Library 1.

**[Scheme 2]**
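By way of non-limiting illustration, the iterative random sampling of the second embodiment may be sketched in Python with NumPy. The sketch assumes that the filtered spectrum is the diagonal of the difference covariance matrix (as plotted in FIG. 10), that the acceptance limit is the modulus of the 95% confidence interval of the Library 2 runs, and that 80% of Library 1 is drawn in each iteration. The proportion, the default number of iterations, the function names and the synthetic data are all hypothetical choices, not values prescribed by the specification.

```python
import numpy as np

def difference_diagonal(subset, extra_spectrum):
    """Diagonal of COV(subset + one spectrum) - COV(subset); columns are spectra."""
    augmented = np.column_stack([subset, extra_spectrum])
    return augmented.var(axis=1, ddof=1) - subset.var(axis=1, ddof=1)

def firs(library1, library2, test_spectrum, fraction=0.8, iterations=200, seed=0):
    """Return the mean filtered Test spectrum and the point-wise acceptance limit."""
    rng = np.random.default_rng(seed)
    n = library1.shape[1]
    size = max(2, int(round(fraction * n)))
    lib_runs, test_runs = [], []
    for _ in range(iterations):
        # Randomly selected proportion of Library 1, shared by both comparisons
        subset = library1[:, rng.choice(n, size=size, replace=False)]
        # One randomly selected spectrum of Library 2
        pick = rng.integers(library2.shape[1])
        lib_runs.append(difference_diagonal(subset, library2[:, pick]))
        # The Test spectrum stays constant across iterations
        test_runs.append(difference_diagonal(subset, test_spectrum))
    lib_runs = np.asarray(lib_runs)
    mean_lib = lib_runs.mean(axis=0)
    sem = lib_runs.std(axis=0, ddof=1) / np.sqrt(iterations)
    # Modulus of the 95% confidence interval: mean +/- 1.96 x standard error
    limit = np.maximum(np.abs(mean_lib - 1.96 * sem), np.abs(mean_lib + 1.96 * sem))
    return np.mean(test_runs, axis=0), limit

# Synthetic stand-in data (assumption)
rng = np.random.default_rng(2)
points = 50
index = np.arange(points)
base = np.exp(-((index - 15) / 3.0) ** 2)
library1 = base[:, None] + 0.01 * rng.standard_normal((points, 30))
library2 = base[:, None] + 0.01 * rng.standard_normal((points, 5))
alien_peak = 0.5 * np.exp(-((index - 35) / 1.5) ** 2)
adulterated = base + alien_peak + 0.01 * rng.standard_normal(points)

mean_test, limit = firs(library1, library2, adulterated)
not_consistent = bool(np.any(np.abs(mean_test) > limit))
```

In this sketch `not_consistent` is set when the mean filtered Test spectrum exceeds the limit at any point; other measures of variation could be substituted for the confidence interval.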

**[0054]**To perform the principal component analysis the spectra are mean-centred; the covariance matrix of the mean-centred set of spectra $x$ is determined ($c = xx^{T}$, where $c$ is equal to the cross product matrix of $x$); the eigendecomposition/diagonalization of the covariance matrix is performed, which forms a new orthonormal coordinate system, the result of which is $c = T \Lambda T^{T}$, where $\Lambda$ is a diagonal matrix of eigenvalues while $T$ are the eigenvectors (loadings). The data set $x$ is then projected on to the new coordinate system by the following transformation, $S = xT$, where $T$ are the eigenvectors and $S$ are the component scores.
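By way of non-limiting illustration, the eigendecomposition described above may be sketched in Python with NumPy. The sketch assumes that the spectra are stored as rows, so that $xx^{T}$ is a sample-by-sample matrix and the eigenvectors give one loading per sample, in line with the figure captions. Under that orientation the projection has to be computed as `x.T @ T` for the dimensions to agree; this is a reading of the transformation $S = xT$ made for the purposes of the sketch, and the function name and synthetic data are hypothetical.

```python
import numpy as np

def pca_by_eigendecomposition(spectra):
    """PCA of a (spectra x points) array via eigendecomposition of x x^T."""
    # Mean-centre: subtract the mean spectrum from every spectrum
    x = spectra - spectra.mean(axis=0, keepdims=True)
    # Cross product matrix (samples x samples)
    c = x @ x.T
    # eigh is appropriate because c is symmetric; it returns ascending eigenvalues
    eigenvalues, T = np.linalg.eigh(c)
    order = np.argsort(eigenvalues)[::-1]      # largest component first
    eigenvalues, T = eigenvalues[order], T[:, order]
    # Project the data on to the new coordinate system (points x components)
    S = x.T @ T
    return eigenvalues, T, S

# Synthetic stand-in data (assumption): 8 spectra of 30 points each
rng = np.random.default_rng(3)
spectra = rng.standard_normal((8, 30))
eigenvalues, T, S = pca_by_eigendecomposition(spectra)
```

The eigenvalues, sorted in descending order, correspond to the quantities displayed in a scree plot; because `T` is orthonormal, the mean-centred data can be recovered as `T @ S.T`.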

**[0055]**In another embodiment of the invention, the output of any of comparative 2D-COS-f or 2D-COS-firs can be used in further statistical tests, for example principal component analysis, partial least squares or support vector machines, to identify common/known and alien features within the test sample.

**EXAMPLES**

**Comparative Example 1**

**Principal Component Analysis (PCA) of Porcine Intestinal Heparin Spectrum**

**[0056]**A ^{1}H NMR spectrum of porcine intestinal mucosal heparin is obtained. Porcine intestinal mucosal heparin is a heterogeneous carbohydrate, therefore its ^{1}H NMR spectrum contains many overlapping bands (FIG. 1). In FIG. 2 a library of 57 bona fide pharmaceutical porcine intestinal mucosal heparins is differentiated. These 57 spectra can be considered as a definition of the heterogeneous polymer heparin. Principal component analysis decomposes the spectra into ideal spectra (components) that can be linearly summed to obtain any spectrum within the test dataset. The scree plot within FIG. 2 indicates the importance of the derived components: in this case there is one major component. The loading plots in FIG. 2 illustrate how the test spectra are composed of different amounts of the PCA-derived components. The spectral features of each component are shown in the score plots (FIG. 3).

**Comparative Example 2**

**Method to Detect Ruminant Material Contaminants in Porcine Heparin**

**[0057]**Principal component analysis can be used to find oddities within a dataset. FIG. 6 contains the ^{1}H NMR spectra of bona fide pharmaceutical porcine intestinal mucosal heparin and a heparin sample contaminated with 10% bovine mucosal heparin (w/w). Visually it is difficult to differentiate the two (FIG. 6). FIGS. 7 and 8 contain the results of PCA of the 57 spectra, which are considered in this circumstance to constitute the definition of pharmaceutical porcine intestinal mucosal heparin, with the contaminated heparin sample, containing 10% (w/w) bovine intestinal mucosal heparin. As can be seen from this analysis, the contaminated sample is not clearly differentiated from the definition of pharmaceutical porcine intestinal mucosal heparin.

**Comparative Example 3**

**2D-COS-f Analysis of a Heparin Sample Adulterated with 10% (w/w) Bovine Intestinal Mucosal Heparin**

**[0058]**Instead of trying to decompose the entire dataset (test sample and bona fide pharmaceutical porcine intestinal mucosal heparin samples) into components, the 57 heparin spectra, which are considered to be an example of the definition of pharmaceutical porcine intestinal mucosal heparin, can be used to "filter" the test sample, removing spectral features consistent with bona fide pharmaceutical porcine intestinal mucosal heparin and leaving only alien features, when present. FIG. 9 illustrates the process of 2D-COS-f: FIG. 9A contains the covariance matrix formed from the 57 ^{1}H NMR spectra which comprise an example definition of pharmaceutical porcine intestinal mucosal heparin; this is a pseudo-TOCSY spectrum where features that change together within the dataset are linked together. The spectrum of the test sample is added to the 57 spectra and the covariance matrix is formed again (FIG. 9B). To filter the porcine intestinal mucosal heparin features from the test spectrum, the covariance matrix represented in FIG. 9A is subtracted from the covariance matrix represented in FIG. 9B; this leaves the difference covariance matrix (FIG. 9C), which contains the alien features present in the test sample, i.e. the features due to the bovine heparin contaminant in this circumstance. The diagonal of the difference covariance matrix (FIG. 9C) is shown in FIG. 10: the spectrum contains alien features attributed to bovine heparin, specifically in the anomeric region, which is due to the presence of a higher amount of de-6-O-sulfation within bovine mucosal heparin.
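
By way of illustration only, the filtering sequence of paragraph [0058] (covariance of the library, covariance of the library plus the test spectrum, subtraction, diagonal) might be sketched in Python as below. The 1/(n-1) normalisation of the covariance matrix, the function names and the three-point toy spectra are assumptions made for the sketch and are not taken from the specification.

```python
def mean_center(spectra):
    """Subtract the mean spectrum from every spectrum (rows are spectra)."""
    n = len(spectra)
    mean = [sum(column) / n for column in zip(*spectra)]
    return [[v - m for v, m in zip(row, mean)] for row in spectra]


def covariance(spectra):
    """Covariance matrix between spectral points (p x p)."""
    x = mean_center(spectra)
    n, p = len(x), len(x[0])
    # Assumption: 1/(n - 1) normalisation; the text does not state one.
    return [[sum(row[i] * row[j] for row in x) / (n - 1) for j in range(p)]
            for i in range(p)]


def filter_2dcos(library, test_spectrum):
    """Diagonal of cov(library + test) minus cov(library)."""
    c_library = covariance(library)                       # analogous to FIG. 9A
    c_with_test = covariance(library + [test_spectrum])   # analogous to FIG. 9B
    p = len(test_spectrum)
    difference = [[c_with_test[i][j] - c_library[i][j] for j in range(p)]
                  for i in range(p)]                      # analogous to FIG. 9C
    return [difference[i][i] for i in range(p)]           # analogous to FIG. 10


# Hypothetical toy library; the test spectrum carries an extra signal at index 2.
library = [[1.0, 2.0, 0.0], [1.1, 2.1, 0.0], [0.9, 1.9, 0.0], [1.0, 2.0, 0.0]]
test_spectrum = [1.0, 2.0, 0.5]
filtered = filter_2dcos(library, test_spectrum)
```

With this toy data the largest value of `filtered` falls at index 2, the only point where the test spectrum departs from the library. Points where the test spectrum sits close to the library mean give small negative values under the assumed normalisation, because adding a near-average spectrum lowers the estimated variance.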

**Example 1**

**Analysis of a Porcine Intestinal Mucosal Heparin Sample Adulterated with 10% (w/w) Bovine Intestinal Mucosal Heparin by Comparative 2D-COS-f**

**[0059]**According to the first embodiment of the invention, a test sample is tested (filtered) against Library 1 of bona fide porcine intestinal mucosal heparin, which defines the heterogeneous sample. A bona fide heparin, contained in a second library (Library 2) and not contained within Library 1, is also tested (filtered) against Library 1. This second test illustrates the acceptable variation of the heterogeneous product in question. The filtered spectrum of the test sample is then compared with the filtered spectrum of the bona fide porcine intestinal mucosal heparin sample from Library 2, both having been filtered by Library 1. If the amplitude of the filtered spectrum of the test sample is greater than the amplitude of the filtered spectrum of the bona fide heparin from Library 2, the test sample is considered to contain features alien to, or inconsistent with, porcine intestinal mucosal heparin. In this example the porcine intestinal mucosal heparin contaminated with 10% (w/w) bovine mucosal heparin failed the test.
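
By way of illustration only, the comparison of paragraph [0059] might be sketched in Python as below. The sketch assumes that the two filtered spectra are compared point by point using signed amplitudes, and that the covariance is normalised by 1/(n-1); neither detail is stated in the text. Because only the diagonal of the difference covariance matrix is needed, the variance at each point is computed directly. All names and data are hypothetical.

```python
def point_variances(spectra):
    """Variance at each spectral point (the diagonal of the covariance matrix)."""
    n = len(spectra)
    variances = []
    for column in zip(*spectra):
        mean = sum(column) / n
        variances.append(sum((v - mean) ** 2 for v in column) / (n - 1))
    return variances


def filtered_diagonal(library, spectrum):
    """Diagonal of cov(library + spectrum) minus cov(library)."""
    with_spectrum = point_variances(library + [spectrum])
    without = point_variances(library)
    return [a - b for a, b in zip(with_spectrum, without)]


def alien_points(library1, library2_sample, test_sample):
    """Indices where the filtered test sample exceeds the filtered Library 2 sample."""
    reference = filtered_diagonal(library1, library2_sample)
    candidate = filtered_diagonal(library1, test_sample)
    # Assumption: signed, point-by-point comparison of the two filtered spectra.
    return [i for i, (c, r) in enumerate(zip(candidate, reference)) if c > r]


# Hypothetical toy data: Library 1, one verified Library 2 spectrum, one test spectrum.
library1 = [[1.0, 2.0, 0.0], [1.1, 2.1, 0.0], [0.9, 1.9, 0.0], [1.0, 2.0, 0.0]]
library2_sample = [1.05, 1.95, 0.02]
test_sample = [1.0, 2.0, 0.5]
alien = alien_points(library1, library2_sample, test_sample)
```

With this toy data only index 2 is reported, the point at which the test spectrum carries a signal absent from both libraries.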

**Example 2**

**Analysis of a Porcine Intestinal Mucosal Heparin Sample Adulterated with 10% (w/w) Bovine Intestinal Mucosal Heparin by Iterative Random Sampling (2D-COS-firs)**

**[0060]**According to the second embodiment of the invention, random sampling is used to provide a stricter pass or fail criterion. This analysis requires three data sets: a library of bona fide porcine intestinal mucosal heparin which is considered to be the definition of porcine intestinal mucosal heparin (Library 1--containing 57 spectra in this example), a further library of bona fide porcine intestinal mucosal heparin which is a test library used to determine the natural variation within porcine intestinal mucosal heparin (Library 2--containing 12 spectra in this example), and finally the test sample. The pass or fail criterion is found by filtering a randomly selected sample from Library 2 by a random selection of Library 1 (containing one spectrum fewer than the number of samples in Library 1); this is repeated 1500 times and the resultant spectra are averaged to form a spectrum which encompasses the average natural variation within heparin. Here we determined the 95% confidence interval (x±SE_{x}×1.96) and used it as the pass or fail criterion (FIG. 12). The actual test sample then went through a similar process, being filtered by a random selection of Library 1 (again one spectrum fewer than the number of samples in Library 1) over 1500 iterations; the results are averaged and compared against the measure of natural variation, i.e. the pass or fail criterion. If any signal lies outside this measure of natural variation, it is considered to be alien to porcine intestinal mucosal heparin and the sample is considered to contain non-porcine intestinal mucosal heparin material. In the example shown here the porcine intestinal mucosal heparin contaminated with 10% (w/w) bovine mucosal heparin failed the test.
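
By way of illustration only, the iterative random sampling of paragraph [0060] might be sketched in Python as below. The sketch rests on assumptions that are not taken from the specification: the covariance is normalised by 1/(n-1), the standard error is the standard deviation over iterations divided by the square root of the number of iterations, and only excursions above the upper bound of the 95% interval are reported. The function names, the seed and the synthetic five-point spectra are hypothetical.

```python
import math
import random
import statistics


def point_variances(spectra):
    """Variance at each spectral point (the diagonal of the covariance matrix)."""
    n = len(spectra)
    variances = []
    for column in zip(*spectra):
        mean = sum(column) / n
        variances.append(sum((v - mean) ** 2 for v in column) / (n - 1))
    return variances


def filter_diagonal(library, spectrum):
    """Diagonal of cov(library + spectrum) minus cov(library)."""
    with_spectrum = point_variances(library + [spectrum])
    without = point_variances(library)
    return [a - b for a, b in zip(with_spectrum, without)]


def firs(library1, candidates, iterations, rng):
    """Per-point mean and standard error of repeatedly filtered spectra."""
    runs = []
    for _ in range(iterations):
        subset = rng.sample(library1, len(library1) - 1)  # n - 1 spectra of Library 1
        runs.append(filter_diagonal(subset, rng.choice(candidates)))
    means = [statistics.fmean(column) for column in zip(*runs)]
    errors = [statistics.stdev(column) / math.sqrt(iterations)
              for column in zip(*runs)]
    return means, errors


def points_above_norm(library1, library2, test_sample, iterations=1500, seed=0):
    """Indices where the averaged test result exceeds mean + 1.96 x SE."""
    rng = random.Random(seed)
    ref_mean, ref_se = firs(library1, library2, iterations, rng)
    test_mean, _ = firs(library1, [test_sample], iterations, rng)
    # Assumption: only excursions above the upper 95% bound are reported.
    return [i for i, v in enumerate(test_mean)
            if v > ref_mean[i] + 1.96 * ref_se[i]]


# Hypothetical synthetic data: noisy copies of one base spectrum.
data_rng = random.Random(1)
base = [1.0, 2.0, 3.0, 2.0, 1.0]
library1 = [[b + data_rng.gauss(0.0, 0.05) for b in base] for _ in range(12)]
library2 = [[b + data_rng.gauss(0.0, 0.05) for b in base] for _ in range(4)]
test_sample = list(base)
test_sample[3] += 0.5  # added "alien" signal at index 3
flagged = points_above_norm(library1, library2, test_sample)
```

With this synthetic data the returned list includes index 3, the point carrying the added signal. The interval is built from the standard error of the mean over many iterations and is therefore narrow, so the choice of which side of the interval to test is a design decision that materially affects the outcome.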

**Example 3**

**The Effect of Library 1 Size on the Analysis of Heparin Contaminated with 1% Bovine Mucosal or Ovine Mucosal Heparin by 2D-COS-firs**

**[0061]**The effect of varying the size of Library 1 and the number of iterations used for 2D-COS-firs is illustrated in FIG. 13. When the sample is iterated 1500 times, the standard deviation at any point of the spectrum becomes stable, and no improvement occurs if the number of iterations is increased beyond 1500 (FIGS. 13 C and D). While a Library 1 containing 57 spectra provides sensitive filtering, adding further spectra to Library 1 would improve the result (FIGS. 13 A and B); for a stable result at least 50 spectra are required.

**Example 4**

**Principal Component Analysis of One Porcine Intestinal Mucosal Heparin Contaminated with 1% Bovine Mucosal Heparin after 2D-COS-firs with 10 Bona Fide Porcine Intestinal Mucosal Heparin Spectra**

**[0062]**In this example principal component analysis is applied after all the sample spectra have been filtered by Library 1, the definition of porcine intestinal mucosal heparin, removing from the spectra all signals that are consistent with features contained within Library 1. As can be seen in FIG. 14, the removal of all the features that are consistent with Library 1 dramatically improves the separation of the spectra by principal component analysis. While in Comparative Example 2 it was difficult to differentiate porcine intestinal mucosal heparin contaminated with 10% (w/w) bovine mucosal heparin, here it is possible to differentiate a sample contaminated with a much lower amount of material (1% bovine mucosal heparin). Therefore, 2D-COS-firs can be used to improve the sensitivity of other statistical techniques.

**Example 5**

**Method to Differentiate LMWHs Produced by Different Manufacturers**

Here 2D-COS-firs is used to filter a generic LMWH against a Library 1 that contains Lovenox LMWH spectra (FIG. 15).

**[0063]**By filtering the generic LMWH test sample against the Lovenox-containing Library 1, all the features within the generic LMWH that are not consistent with Lovenox are revealed.

