Analytica Chimica Acta (v.642, #1-2)
CAC 2008
by Anna de Juan Guest Editor; Jean Michel Roger President of the Organizing Committee; Lutgarde Buydens Editor (pp. 1-2).
IUPAC project: A glossary of concepts and terms in chemometrics
by D. Brynn Hibbert; Pentti Minkkinen; N.M. Faber; Barry M. Wise (pp. 3-5).
A project has been initiated by the International Union of Pure and Applied Chemistry (IUPAC) to create a glossary of concepts and terms in chemometrics. This will be accomplished by consultation with the community through the means of a wiki – a web site that can be modified by users (see http://www.iupacterms.eigenvector.com/index.php?title=Main_Page). Over time new terms can be added, and consensus definitions arrived at. The definitions will be published as IUPAC recommendations.
Keywords: IUPAC; Chemometrics definitions; Chemometrics terms; Chemometrics concepts; Wiki
Contribution of external parameter orthogonalisation for calibration transfer in short waves—Near infrared spectroscopy application to gasoline quality
by S. Amat-Tosello; N. Dupuy; J. Kister (pp. 6-11).
The octane number rating of a gasoline indicates its performance under various engine conditions. Two different ratings are included: Research Octane Number (RON) and Motor Octane Number (MON). The standard laboratory method for octane number determination is the knock engine method, in which a gasoline is burned and its combustion characteristics are compared to known standards. This method is time consuming and labor intensive, and provides no ability for real-time control of production. NIR can be applied in real time directly in process monitoring or as a laboratory procedure. Near infrared spectra of gasoline samples were collected with four different short-wavelength near infrared analysers built with strictly the same technology. The aim of this study was to transfer the calibration built on one spectrometer to the other ones. We applied the external parameter orthogonalisation (EPO) correction to remove the influence of the apparatus on the information contained in the spectra. With this method, we managed to improve the predicted values of two major gasoline properties, i.e. Research and Motor Octane Number.
Keywords: Calibration transfer; Near infrared; External parameter orthogonalisation; Gasoline; Octane number; Research Octane Number; Motor Octane Number
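The EPO correction mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the commonly described form of EPO: the same samples are measured on two instruments, the difference matrix D is decomposed by SVD, and all spectra are projected orthogonally to its leading directions. The synthetic data, the function name `epo_projector` and the rank-one "instrument effect" are hypothetical and are not taken from the paper.

```python
import numpy as np

def epo_projector(D, n_components):
    """Return P = I - V V^T, where V holds the leading right singular vectors of D."""
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    V = Vt[:n_components].T                      # (n_wavelengths, n_components)
    return np.eye(D.shape[1]) - V @ V.T

rng = np.random.default_rng(0)
n_samples, n_wavelengths = 20, 50

# Hypothetical spectra of the same samples on a "master" and a "slave" instrument.
X_master = rng.normal(size=(n_samples, n_wavelengths))
instrument_effect = rng.normal(size=n_wavelengths)        # assumed additive signature
gains = rng.uniform(0.5, 1.5, size=(n_samples, 1))        # varies from sample to sample
X_slave = X_master + gains * instrument_effect

# Difference matrix between instruments, then projection of both data sets.
D = X_slave - X_master
P = epo_projector(D, n_components=1)
X_master_epo = X_master @ P
X_slave_epo = X_slave @ P
```

In this idealised, noise-free case the instrument effect spans a single direction, so the projected spectra of the two instruments coincide. With real data the number of components to remove is a tuning choice.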
Classification of Brazilian soils by using LIBS and variable selection in the wavelet domain
by Márcio José Coelho Pontes; Juliana Cortez; Roberto Kawakami Harrop Galvão; Celio Pasquini; Mário César Ugulino Araújo; Ricardo Marques Coelho; Márcio Koiti Chiba; Mônica Ferreira de Abreu; Beáta Emöke Madari (pp. 12-18).
This paper proposes a novel analytical methodology for soil classification based on the use of laser-induced breakdown spectroscopy (LIBS) and chemometric techniques. In the proposed methodology, linear discriminant analysis (LDA) is employed to build a classification model on the basis of a reduced subset of spectral variables. For the purpose of variable selection, three techniques are considered, namely the successive projection algorithm (SPA), the genetic algorithm (GA), and a stepwise formulation (SW). The use of a data compression procedure in the wavelet domain is also proposed to reduce the computational workload involved in the variable selection process. The methodology is validated in a case study involving the classification of 149 Brazilian soil samples into three different orders (Argissolo, Latossolo and Nitossolo). For means of comparison, soft independent modelling of class analogy (SIMCA) models are also employed. The best discrimination of soil types was attained by SPA–LDA, which achieved an average classification rate of 90% in the validation set and 72% in cross-validation. Moreover, the proposed wavelet compression procedure was found to be of value by providing a 100-fold reduction in computational workload without significantly compromising the classification accuracy of the resulting models.
Keywords: Brazilian soils; Laser-induced breakdown spectroscopy; Classification; Wavelet compression; Successive projections algorithm; Linear discriminant analysis
Chemometrics description of measurement error structure: Study of an ultrafast absorption spectroscopy experiment
by Lionel Blanchet; Julien Réhault; Cyril Ruckebusch; Jean Pierre Huvenne; Romà Tauler; Anna de Juan (pp. 19-26).
The error structure inherent to a transient absorption spectroscopy kinetic experiment is described through the examination of the related measurement error covariance matrices in the time and the spectral direction. The systematic approach proposed by Wentzell's group to characterize the measurement error covariance matrices is used for this purpose. The goals of this study are identifying and modelling the factors contributing to the error measurement and proposing a weighting error-based scheme that can be generally used when analysing data coming from this experimental setup. From the analysis performed, independent noise structure is associated with the time direction of the data sets, whereas the spectral direction is affected by two sources of correlated errors, which can be related to the probe spectrum and to the mean signal registered.
Keywords: Error covariance; Noise estimation; Transient absorption spectroscopy; Weighting schemes
Drift correction in multivariate calibration models using on-line reference measurements
by Paman Gujral; Michael Amrhein; Dominique Bonvin (pp. 27-36).
On-line measurements from first-order instruments such as spectrometers may be compromised by instrumental, process and operational drifts that are not seen during off-line calibration. This can render the calibration model unsuitable for prediction of key components such as analyte concentrations. In this work, infrequently available on-line reference measurements of the analytes of interest are used for drift correction. The drift-correction methods that include drift in the calibration set are referred to as implicit correction methods (ICM), while explicit correction methods (ECM) model the drift based on the reference measurements and make the calibration model orthogonal or invariant to the space spanned by the drift. Under some working assumptions such as linearity between the concentrations and the spectra, necessary and sufficient conditions for correct prediction using ICM and ECM are proposed. These so-called space-inclusion conditions can be checked on-line by monitoring the Q-statistic. Hence, violation of these conditions implies the violation of one or more of the working assumptions, which can be used, e.g. to infer the need for new reference measurements. These conditions are also valid for rank-deficient calibration data, i.e. when the concentrations of the various species are linearly dependent. A constraint on the kernel used in ECM follows from the space-inclusion condition. This kernel does not estimate the drift itself but leads to an unbiased estimate of the drift space. In a noise-free environment, it is shown that ICM and ECM are equivalent. However, in the presence of noise, a Monte Carlo simulation shows that ECM performs slightly better than ICM. A paired t-test indicates that this difference is statistically significant. When applied to experimental fermentation data, ICM and ECM lead to a significant reduction in prediction error for the concentrations of five metabolites predicted from infrared spectra.
Keywords: Spectral measurements; Multivariate calibration; Drift invariance; Orthogonal projection; Space inclusion
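The Q-statistic monitoring mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: Q is taken as the squared residual of a new spectrum after projection onto the PCA subspace of the calibration spectra, the data are noise-free and synthetic, and the names `pca_loadings` and `q_statistic` are hypothetical. It does not reproduce the ICM or ECM procedures of the paper.

```python
import numpy as np

def pca_loadings(X, n_components):
    """Mean and orthonormal PCA loadings of the calibration spectra X."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components].T

def q_statistic(x, mean, loadings):
    """Squared norm of the part of x that lies outside the calibration subspace."""
    centered = x - mean
    residual = centered - loadings @ (loadings.T @ centered)
    return float(residual @ residual)

rng = np.random.default_rng(1)
pure_spectra = rng.uniform(size=(3, 40))       # hypothetical pure-component spectra
concentrations = rng.uniform(size=(30, 3))
X_cal = concentrations @ pure_spectra          # linear, noise-free calibration set

mean, loadings = pca_loadings(X_cal, n_components=3)

x_ok = rng.uniform(size=3) @ pure_spectra      # new spectrum inside the calibration space
x_drift = x_ok + 0.5 * rng.normal(size=40)     # same spectrum with an unseen drift added

q_ok = q_statistic(x_ok, mean, loadings)
q_drift = q_statistic(x_drift, mean, loadings)
```

Here `q_ok` is numerically zero while `q_drift` is clearly positive, which is the kind of signal that could prompt a new reference measurement. In practice a control limit for Q would have to be estimated from noisy calibration residuals.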
Two new extensions of principal component transform to compute a PLS2 model between two wide matrices: PCT-PLS2 and segmented PCT-PLS2
by D. Jouan-Rimbaud Bouveresse; D.N. Rutledge (pp. 37-44).
The progress of analytical techniques has led to the possibility of acquiring a large number of data for each analysed sample. Moreover, the application of pre-treatment methods such as Contrast greatly increases the number of variables, yielding (very) wide data matrices. The computation of a PLS2 model between two such matrices may be slowed down, or made impossible, because of computer memory problems. The method presented in this article proposes an algorithm to solve this problem, and to enable the computation of a PLS2 model between two matrices containing a large number of variables. To do this, the PLS2 model is computed between the score matrices obtained by a PCA on each original matrix separately. After PLS2, a back-transformation to the original space is possible, and leads to results identical to those which would have been obtained in the original space. The method can be later extended, by segmenting the matrices, and computing the PC transform on each segment, before concatenating all the resulting score matrices and computing the PLS2 model on the obtained matrices.
Keywords: Partial least squares regression; Principal component transform; Segmented-principal component transform
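The equivalence claimed in the abstract above (a PLS2 model computed on PCA scores, then back-transformed, matches the model computed on the original wide matrices) can be checked numerically with a minimal sketch. The NIPALS routine below is a textbook PLS2 written for this illustration, not the authors' implementation; the function names, the synthetic data and the starting vector for the inner iteration are assumptions. The segmented variant is not shown.

```python
import numpy as np

def pc_transform(X, tol=1e-10):
    """Scores T and loadings V with X = T V^T, keeping all non-null components."""
    _, s, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[s > tol * s[0]].T
    return X @ V, V

def pls2_nipals(X, Y, n_components, max_iter=500, tol=1e-12):
    """Textbook NIPALS PLS2; returns B such that Y is approximated by X @ B."""
    X, Y = X.copy(), Y.copy()
    W, P, Q = [], [], []
    for _ in range(n_components):
        u = Y @ Y[[0]].T                 # start vector: first column of Y Y^T
        for _ in range(max_iter):
            w = X.T @ u
            w /= np.linalg.norm(w)
            t = X @ w
            q = Y.T @ t / (t.T @ t)
            u_new = Y @ q / (q.T @ q)
            converged = np.linalg.norm(u_new - u) <= tol * np.linalg.norm(u_new)
            u = u_new
            if converged:
                break
        p = X.T @ t / (t.T @ t)
        X -= t @ p.T                     # deflate both blocks
        Y -= t @ q.T
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = np.hstack(W), np.hstack(P), np.hstack(Q)
    return W @ np.linalg.solve(P.T @ W, Q.T)

# Two wide, column-centred matrices sharing a few latent factors (synthetic).
rng = np.random.default_rng(2)
n, p, q = 12, 300, 200
latent = rng.normal(size=(n, 3))
X = latent @ rng.normal(size=(3, p)) + 0.1 * rng.normal(size=(n, p))
Y = latent @ rng.normal(size=(3, q)) + 0.1 * rng.normal(size=(n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

B_direct = pls2_nipals(X, Y, n_components=3)        # model in the original space

Tx, Vx = pc_transform(X)                            # small score matrices
Ty, Vy = pc_transform(Y)
B_scores = pls2_nipals(Tx, Ty, n_components=3)      # model between the scores
B_back = Vx @ B_scores @ Vy.T                       # back-transformation
```

The score matrices have at most n - 1 columns here, so the PLS2 iterations work on 12 x 11 arrays instead of 12 x 300 and 12 x 200, while `B_back` agrees with `B_direct` to numerical precision.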
Exploring a physico-chemical multi-array explanatory model with a new multiple covariance-based technique: Structural equation exploratory regression
by X. Bry; T. Verron; P. Cazes (pp. 45-58).
In this work, we consider chemical and physical variable groups describing a common set of observations (cigarettes). One of the groups, minor smoke compounds (minSC), is assumed to depend on the others (minSC predictors). PLS regression (PLSR) of minSC on the set of all predictors appears not to lead to a satisfactory analytic model, because it does not take into account the expert’s knowledge. PLS path modeling (PLSPM) does not use the multidimensional structure of predictor groups. Indeed, the expert needs to separate the influence of several pre-designed predictor groups on minSC, in order to see what dimensions this influence involves. To meet these needs, we consider a multi-group component-regression model, and propose a method to extract from each group several strong uncorrelated components that fit the model. Estimation is based on a global multiple covariance criterion, used in combination with an appropriate nesting approach. Compared to PLSR and PLSPM, the structural equation exploratory regression (SEER) we propose fully uses predictor group complementarity, both conceptually and statistically, to predict the dependent group.
Keywords: Linear regression; Latent variables; Multi-block component regression model; PLS path modeling; PLS regression; Structural equation models; SEER
The best approaches in the on-line monitoring of batch processes based on PCA: Does the modelling structure matter?
by J. Camacho; J. Picó; A. Ferrer (pp. 59-68).
Much work has been devoted to the on-line multivariate statistical process control (MSPC) of batch processes based on bilinear models such as principal component analysis (PCA). The data collected from a batch process have 3-way structure and they have to be arranged in 2-way matrices prior to PCA. Several transformation methods have been proposed reporting contradictory conclusions regarding monitoring performance. The aim of this paper is to run an objective research to assess the influence of 3-way to 2-way transformations on the monitoring performance. Batch-wise, variable-wise, batch-dynamic, local and multi-phase approaches are compared using simulated data from a pair of ‘toy’ processes with quite different and perfectly known dynamics. Several realistic types of faults are simulated. The main conclusions regarding fault detection performance are: (i) the best monitoring approach depends more on the type of fault than on the process dynamics; (ii) the information considered in the model structure will determine which faults are detectable; (iii) in general, a good monitoring performance is achieved with a parsimonious model; (iv) proper identification of the changes in the correlation structure -the phases of the process- is crucial for parsimonious models; (v) it can be advisable to combine several unfolding methods to improve the detection ability of predefined types of faults. These results were coherently explained from the theoretically known features of the models and validate the conclusions in previous investigations with real batch processes and realistic simulations.
Keywords: Batch processes; On-line monitoring; Principal component analysis; Unfolding; Multi-phase
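The 3-way to 2-way rearrangements discussed in the abstract above can be illustrated for the two simplest cases, batch-wise and variable-wise unfolding. The array layout (batches x variables x time points) and the column ordering are assumptions made for this sketch, since conventions differ between authors. The batch-dynamic, local and multi-phase approaches are not shown.

```python
import numpy as np

n_batches, n_variables, n_times = 4, 3, 5
rng = np.random.default_rng(3)
X3 = rng.normal(size=(n_batches, n_variables, n_times))   # synthetic 3-way batch data

# Put time before variables so that reshaping groups the variables of each time point.
X_time_major = X3.transpose(0, 2, 1)                      # (batches, times, variables)

# Batch-wise unfolding: one row per batch, one column per (time, variable) pair.
X_batchwise = X_time_major.reshape(n_batches, n_times * n_variables)

# Variable-wise unfolding: one row per (batch, time) pair, one column per variable.
X_variablewise = X_time_major.reshape(n_batches * n_times, n_variables)
```

With this ordering, the value of variable j at time k in batch i sits at `X_batchwise[i, k * n_variables + j]` and at `X_variablewise[i * n_times + k, j]`. PCA on the first matrix models the whole trajectory of a batch, while PCA on the second models the correlation among variables irrespective of time.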
Near Infrared Spectroscopy and multivariate analysis methods for monitoring flour performance in an industrial bread-making process
by M. Li Vigni; C. Durante; G. Foca; A. Marchetti; A. Ulrici; M. Cocchi (pp. 69-76).
The present study is aimed at evaluating the possibility of predicting bread specifications, for an industrial bread-making process, on the basis of the properties of the flour employed in production. The flours delivered to the production plant, for which rheological and chemical properties were available, were analysed by means of Near Infrared Spectroscopy (NIRS). Based on the flour properties and NIR signals, multivariate control charts were constructed in order to detect flour batches leading to a bread with non-optimal behaviour. The results show that it is possible to distinguish flour batches leading to a product with a particularly negative performance, by modelling the properties commonly measured on flours and the acquired Near Infrared signals. In spite of the absence of monitoring of process variables, which could have offered a sounder basis for the interpretation, especially when false positives and negatives are detected, these results are of particular interest from the point of view of raw material evaluation in process monitoring. Also, the potential of Near Infrared Spectroscopy makes it possible to consider this approach for an on-line implementation in the control of incoming raw materials in this industrial process.
Keywords: Flour; Near Infrared Spectroscopy; Multivariate control charts; Bread-making
Variation patterns of nitric oxide in Catalonia during the period from 2001 to 2006 using multivariate data analysis methods
by M. Alier; M. Felipe-Sotelo; I. Hernàndez; R. Tauler (pp. 77-88).
Multivariate data analysis methods are applied to the study of the geographical and temporal distribution of nitric oxide (NO) in Catalonia (North-East Spain), measured during the period 2001–2006 at 50 sampling stations. Principal component analysis (PCA) and Multivariate Curve Resolution Alternating Least Squares (MCR-ALS) were applied for that purpose. The simultaneous analysis of NO data from sampling stations showed that its geographical distribution was rather uniform during the period considered. When three individual sampling stations were considered (two urban sites and one rural location), three different temporal patterns were resolved, with marked day–night changes mainly attributed to traffic and also important winter–summer seasonal variations. A decreasing trend in the levels of NO has also been observed in recent years. Comparison with nitrogen dioxide (NO2) profiles shows that the daily variation is quite similar to the NO variation; however, NO2 displays very little oscillation across the seasons and no reduction of its concentration was observed in recent years, in contrast with the NO trends. The use of MCR-ALS is confirmed to be a useful method to improve interpretability in atmospheric contamination studies. The use of non-negativity and trilinearity constraints is shown to provide improved interpretations of the different contamination patterns in environmental terms.
Keywords: Chemometrics; Principal component analysis; Multivariate curve resolution; Nitric oxide pollution
A Backward Variable Selection method for PLS regression (BVSPLS)
by Juan Antonio Fernández Pierna; Ouissam Abbas; Vincent Baeten; Pierre Dardenne (pp. 89-93).
Variable selection has been discussed in many papers and has become an important topic in areas such as chemometrics and in science in general. Here a backward iterative step-by-step wrapper method is proposed using PLS. The root-mean-square error of prediction (RMSEP) for an independent test set is used as the selection criterion to quantify the gain obtained using the selected range of variables. The method has been applied to different data sets and the results revealed that one can improve, or at least keep constant, the prediction performance of the PLS models compared to the full-spectrum models. Moreover, the number of variables is reduced, leading to an easier interpretation of the relationship between model and sample composition and/or properties. The aim is not to compare with other variable selection methods but to show that a simple one can improve, or at least keep constant, the prediction performance of the PLS models by using only a limited number of variables.
Keywords: Variable selection; Partial least squares; Regression; Near infrared spectroscopy
Correlation between sludge settling ability and image analysis information using partial least squares
by D.P. Mesquita; O. Dias; A.M.A. Dias; A.L. Amaral; E.C. Ferreira (pp. 94-101).
In recent years there has been an increase in research on activated sludge processes, mainly on the solid–liquid separation stage, which is considered of critical importance because of the different problems that may arise and affect the compaction and settling of the sludge. Furthermore, image analysis procedures are nowadays considered to be an adequate method to characterize both aggregated and filamentous bacteria, and are increasingly used to monitor bulking events in pilot plants. For this reason, in this work, image analysis routines were developed in the Matlab environment, allowing the identification and characterization of microbial aggregates and protruding filaments. Moreover, the large amount of activated sludge data collected with the image analysis implementation can be subsequently treated by multivariate statistical procedures such as PLS. In the current work the implementation of image analysis and PLS techniques has been shown to provide important information for better understanding the behavior of activated sludge processes, and to predict, to some extent, the sludge volume index. The results obtained explain the strong relationships between the sludge settling properties and the free filamentous bacteria contents, aggregate size and aggregate morphology, establishing relevant relationships between macroscopic and microscopic properties of the biological system.
Keywords: Activated sludge; Image analysis; Sludge volume index; Partial least squares
Recursive multimodel partial least squares estimation of mineral flotation slurry contents using optical reflectance spectra
by Olli Haavisto; Heikki Hyötyniemi (pp. 102-109).
In mineral flotation the X-ray fluorescence (XRF) grade measurements of the slurries typically give the most important on-line information on the state of the flotation process. It has been shown that the visual and near-infrared (VNIR) reflectance spectrum measurements of certain mineral slurries can be used to complete the sparse XRF slurry content information. This study focuses on the chemometrical analysis of the VNIR spectrum of the slurries and presents a new partial least squares (PLS)-based recursive multimodel approach with local orthogonal signal correction (OSC) for predicting the slurry contents. The advantage of the presented approach is that it can recursively adapt to real process data variations in normal operating conditions, and is still able to remember the rare process failure situations with notably different content values.
Keywords: Mineral flotation; Reflectance spectroscopy; Recursive partial least squares; Orthogonal signal correction; Local modeling
Support vector regression for functional data in multivariate calibration problems
by Noslen Hernández; Isneri Talavera; Rolando J. Biscay; Diana Porro; Marcia M.C. Ferreira (pp. 110-116).
Quantitative analyses involving instrumental signals, such as chromatograms, NIR, and MIR spectra have been successfully applied nowadays for the solution of important chemical tasks. Multivariate calibration is very useful for such purposes and the commonly used methods in chemometrics consider each sample spectrum as a sequence of discrete data points. An alternative way to analyze spectral data is to consider each sample as a function, in which a functional data is obtained. Concerning regression, some linear and nonparametric regression methods have been generalized to functional data. This paper proposes the use of the recently introduced method, support vector regression for functional data (FDA-SVR) for the solution of linear and nonlinear multivariate calibration problems. Three different spectral datasets were analyzed and a comparative study was carried out to test its performance with respect to some traditional calibration methods used in chemometrics such as PLS, SVR and LS-SVR. The satisfactory results obtained with FDA-SVR suggest that it can be an effective and promising tool for multivariate calibration tasks.
Keywords: Support vector regression; Functional Data Analysis; Multivariate calibration
Classification of nucleic acids structures by means of the chemometric analysis of circular dichroism spectra
by Joaquim Jaumot; Ramon Eritja; Susana Navea; Raimundo Gargallo (pp. 117-126).
DNA can adopt structures in solution apart from the well-known Watson–Crick double helix, ranging from disordered single strands to high-order structures such as triplexes or quadruplexes. Moreover, different topologies can be adopted depending on the polarity of the DNA strands. The elucidation of the structure and topology adopted by a DNA sequence is usually carried out by means of spectroscopic techniques, such as circular dichroism. In this work, the ability of several chemometric methods to efficiently classify DNA structures from circular dichroism data is tested. With this objective in mind, a dataset including 50 experimental spectra corresponding to different DNA structures (random coil, duplex, hairpin, reversed and normal triplex, parallel and antiparallel G-quadruplex, and i-motif) has been analyzed by means of unsupervised hierarchical clustering analysis, principal component analysis and partial least squares discriminant analysis. The results have shown that these methods allow the efficient classification of DNA structures from circular dichroism spectra. Moreover, these classification methods also provided the most characteristic wavelengths used in the classification procedures.
Keywords: DNA structure; Classification; Principal component analysis; Clustering; Partial least squares discriminant analysis; Circular dichroism spectroscopy
Using scattering and absorption spectra as MCR-hard model constraints for diffuse reflectance measurements of tablets
by Waltraud Kessler; Dieter Oelkrug; Rudolf Kessler (pp. 127-134).
One of the most often used tools in process analytical technology (PAT) is NIR spectroscopy, a non-destructive, fast and reliable method to identify and quantify active pharmaceutical ingredients (API) in tablets. Very little work has been devoted to analysing the effects of scatter on quantitative analysis of the chemical composition. A novel, more science-based approach to compensate for scatter in reflectance spectroscopy is presented here. The basic assumption is to determine in step 1 a separate scattering spectral fingerprint, denoted as S spectra, and an absorption spectral fingerprint, denoted as K spectra. In the second step, the two spectra may then be used as input to the alternating least square (ALS) algorithm in multivariate curve resolution (MCR) in order to account for the spectral distortions due to the interaction of scatter and absorption. Standard tablets with a mass of 1.5 g and a diameter of 20 mm (thickness approx. 3.4 mm, optically infinite) were prepared according to a central composite design by mixing theophyllin, magnesium stearate and cellactose at three different compactions of 31, 156 and 281 MPa. The samples were measured with a UV/Vis/NIR spectrometer equipped with an integrating sphere in the wavelength range from 500 to 2100 nm. The diffuse reflectance spectra of the center point sample with an optically infinite thickness R∞ as well as of a sample of finite thickness R0 (“optically thin”) are measured as reference for the S and K spectra, which are then calculated with the exponential solution of the Kubelka–Munk equation. After normalization, the S spectrum and the K spectrum of a single tablet are integrated as hard model constraints into the MCR–ALS procedure. In comparison to PLS modeling with EMSC pretreatment of the spectra, the hard model constrained MCR–ALS algorithm results in an improved prediction of the concentration of the API together with a higher robustness of the calibration models.
Keywords: Scatter correction; Scattering coefficient; Multivariate curve resolution; Multivariate curve resolution–alternating least square; Diffuse reflectance
Classification and analysis of non-isotropic images by Angle Measure Technique (AMT) with contour unfolding
by Sergey Kucheryavski; Ivan Belyaev (pp. 135-141).
Analysis and classification of non-isotropic images with Angle Measure Technique (AMT) is augmented with a new object contour unfolding. The new unfolding algorithm is presented and compared with earlier unfolding options in AMT. The new unfolding with Projection on Latent Structures (PLS) discriminant analysis was applied for classification of digital imagery of blood cells and analysis of diffusion-limited aggregation clusters.
Keywords: Angle Measure Technique; Image processing; Image analysis; Unfolding; Classification; Projection on Latent Structures (PLS) discriminant analysis; Feature extraction
Automatic adjustment of the relative importance of different input variables for optimization of counter-propagation artificial neural networks
by Igor Kuzmanovski; Marjana Novič; Mira Trpkovska (pp. 142-147).
In this work we present a quantitative structure–activity relationship study with 49 peptidic molecules, inhibitors of the HIV-1 protease. The modelling was performed using counter-propagation artificial neural networks (CPANN), an algorithm which has proven to be a valuable tool for data analysis. The initial pre-processing of the data involved auto-scaling, which gives equal importance to all the variables considered in the model. In order to enhance the influence of some of the variables that carry valuable information for improvement of the model, we introduce a novel approach for adjustment of the relative importance of different input variables. Using a genetic algorithm, the relative importance was adjusted during the training of the CPANN. The proposed approach is capable of finding simpler efficient models, when compared to the approach with the original, i.e. equally important, input variables. A simpler model is also more robust and less prone to overfitting; we therefore consider the proposed procedure a valuable improvement of the CPANN algorithm.
Keywords: Counter-propagation artificial neural networks; Genetic algorithms; Quantitative structure–activity relationship; HIV-1 protease inhibitors
Chemometric resolution of NIR spectra data of a model aza-Michael reaction with a combination of local rank exploratory analysis and multivariate curve resolution-alternating least squares (MCR-ALS) method
by Vanessa del Río; M. Pilar Callao; M. Soledad Larrechi; Lucas Montero de Espinosa; J. Carles Ronda; Virginia Cádiz (pp. 148-154).
The aza-Michael reaction, a variation of the Michael reaction in which an amine acts as the nucleophile, permits the synthesis of sophisticated macromolecular structures with potential use in many applications such as drug delivery systems, high performance composites and coatings. The aza-Michael product can be affected by a retro-Mannich-type fragmentation. A way of determining the reactions that are taking place and evaluating the quantitative evolution of the chemical species involved in the reactions is presented. The aza-Michael reaction between a modified fatty acid ester with α,β-unsaturated ketone groups (enone containing methyl oleate (eno-MO)) and aniline (1:1) was studied isothermally at 95 °C and monitored in situ by near-infrared spectroscopy (NIR). The number of reactions involved in the system was determined by analyzing the rank of the matrix of NIR spectral data recorded during the reaction. Singular value decomposition (SVD) and evolving factor analysis (EFA) adapted to analyze full rank augmented data matrices have been used. Under the experimental conditions, we found that the resulting aza-Michael adduct undergoes a retro-Mannich-type fragmentation, but the final products of this reaction were present in negligible amounts. This was confirmed by recording the 1H NMR spectra of the final product. By applying multivariate curve resolution-alternating least squares (MCR-ALS) to the NIR spectral data obtained during the reaction, it has been possible to obtain the concentration values of the species involved in the aza-Michael reaction. The performance of the model was evaluated by two parameters: ALS lack of fit (lof = 1.31%) and explained variance (R2 = 99.92%). Also, the recovered spectra were compared with the experimentally recorded spectra for the reagents (aniline and eno-MO) and the correlation coefficients (r) were 0.9997 for the aniline and 0.9578 for the eno-MO.
Keywords: Near-infrared spectroscopy; Evolving factor analysis; Multivariate curve resolution; Alternating least squares; Aza-Michael reaction; Ester fatty acid
MCR–ALS for sequential estimation of FTIR–ATR spectra to resolve a curing process using global phase angle convergence criterion
by Nicolás Spegazzini; Itziar Ruisánchez; M. Soledad Larrechi (pp. 155-162).
Curing reactions include side and consecutive reactions while the polymer is growing. In this paper, we used multivariate curve resolution–alternating least squares (MCR–ALS) to obtain quantitative information about the concentration of the chemical species involved in these reactions. The cationic curing reaction between diglycidyl ether of bisphenol A (DGEBA) and γ-valerolactone (γ-VL) was monitored by infrared spectroscopy (FTIR–ATR). If the MCR–ALS method is to be used with recorded spectral data, the rank deficiency usually present in the data matrix needs to be overcome, and the quality of the results depends on the initial estimates of the chemical species involved in the reaction. Our strategy was to sequentially apply MCR–ALS in the time intervals where there is selectivity for some reactions and to use the error criterion based on the global phase angle to identify the optimal number of iterations in ALS. The estimated spectra were sequentially incorporated into the data matrix to overcome the rank deficiency. The MCR–ALS results are evaluated by the residuals and by parameters such as lack of fit, the percentage of explained variance and, where possible, the coefficient of dissimilarity between the recovered spectra and the spectra of the pure species. For intermediate species, the correspondence between the ALS spectra solution and the chemical knowledge of this species was also qualitatively evaluated. The quality of the estimated concentration profiles of the two reagents, γ-VL and DGEBA, was evaluated by the correlation coefficient between the estimated profile and the profile obtained when the specific absorption bands were monitored at 1776 and 910 cm−1, respectively. The correlation coefficient values were 0.9954 and 0.9885, respectively.
Keywords: Fourier transform infrared–attenuated total reflection; Multivariate curve resolution–alternating-least squares; Global phase angle; Curing reaction
Study of jojoba oil aging by FTIR
by Y. Le Dréau; N. Dupuy; V. Gaydou; J. Joachim; J. Kister (pp. 163-170).
Because jojoba oil is used in the cosmetic, pharmaceutical, dietetic food, animal feed, lubrication, polishing and bio-diesel fields, it was important to study its oxidative aging at high temperature. In this work, an FT-MIR methodology was developed for monitoring the accelerated oxidative degradation of jojoba oils. Principal component analysis (PCA) was used to differentiate samples according to their origin and production process, and to differentiate the oxidative conditions applied to the oils. Two spectroscopic indices were calculated to summarize the oxidation phenomenon simply. The results were confirmed and extended by the multivariate curve resolution-alternating least squares method (MCR-ALS), which, following a SIMPLISMA pretreatment, allowed the chemical species produced or degraded during the thermal treatment to be identified.
Keywords: Fourier transform mid infrared spectroscopy (FT-MIR); Principal component analysis; Multivariate curve resolution-alternating least squares; Jojoba; SIMPLISMA; Aging
Determination of glucose and ethanol in bioethanol production by near infrared spectroscopy and chemometrics
by B. Liebmann; A. Friedl; K. Varmuza (pp. 171-178).
The concentrations of glucose and ethanol in substrates from bioethanol processes have been modeled by near infrared (NIR) spectroscopy data. NIR spectra were acquired in the wavelength range of 1100–2300 nm by means of a transflectance probe for measurements in liquid samples. For building the regression models, a genetic algorithm has been applied for variable selection, and partial least-squares (PLS) regression for the creation of linear models. A realistic estimation of the prediction performance of the models was obtained by repeated double cross-validation (rdCV). Reduced data sets with only 15 variables showed improved prediction qualities in comparison with models containing 235 variables, particularly for the determination of the ethanol concentration in distillation residues (stillages). The squared correlation coefficient, R², between the concentrations obtained by HPLC analysis and the concentrations derived from NIR data (using 15 selected wavelengths, test set samples) was 0.999 for ethanol in stillage, and 0.977 for glucose in mash. The standard deviation of prediction errors, SEP, obtained from test set samples was 0.6 g L−1 for ethanol (2% of the mean ethanol concentration), and 2.0 g L−1 for glucose (9.6% of the mean glucose concentration).
Keywords: Near infrared; Bioethanol; Partial least-squares; Repeated double cross-validation
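The two performance figures quoted in the abstract above, SEP and the squared correlation coefficient, can be computed from paired reference and predicted concentrations. The sketch below assumes SEP is the bias-corrected standard deviation of the prediction errors with n − 1 in the denominator; the function name and the example concentrations are hypothetical and are not data from the paper.

```python
import math

def sep_and_r2(reference, predicted):
    """Return (SEP, squared Pearson correlation) for paired values."""
    n = len(reference)
    errors = [p - r for p, r in zip(predicted, reference)]
    bias = sum(errors) / n
    # Standard deviation of the prediction errors around their mean (bias).
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))

    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((r - mean_ref) * (p - mean_pred)
              for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_ref * var_pred)
    return sep, r2

# Hypothetical reference (e.g. HPLC) and NIR-predicted values in g/L.
hplc = [10.0, 20.0, 30.0]
nir = [11.0, 19.0, 31.0]
sep, r2 = sep_and_r2(hplc, nir)
print(f"SEP = {sep:.3f} g/L, R2 = {r2:.4f}")
```

Dividing SEP by the mean reference concentration gives the relative figure reported in parentheses in the abstract.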
The use of net analyte signal (NAS) in near infrared spectroscopy pharmaceutical applications: Interpretability and figures of merit
by Mafalda Cruz Sarraguça; João Almeida Lopes (pp. 179-185).
Near infrared spectroscopy (NIRS) has been extensively used as an analytical method for quality control of solid dosage forms in the pharmaceutical industry. Pharmaceutical formulations can be extremely complex, typically containing one or more active product ingredients (API) and various excipients, yielding very complex near infrared (NIR) spectra. The interpretability of NIR spectra can be improved using the concept of net analyte signal (NAS). NAS is defined as the part of the spectrum unique to the analyte of interest. The objective of this work was to compare two different methods for estimating the API's NAS vector in different pharmaceutical formulations. The main difference between the methods is whether knowledge of the NIR spectra of API-free formulations is required. The two methods were compared both qualitatively and quantitatively. Results showed that both methods performed well in terms of the similarity between the NAS vector and the pure API spectrum, as well as in the ability to predict the API concentration of unknown samples. Moreover, figures of merit such as sensitivity, selectivity, and limit of detection were estimated in a straightforward manner.
Keywords: Near infrared spectroscopy; Net analyte signal; Multivariate calibration; Partial least squares; Figures of merit
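The definition of NAS as the part of the spectrum unique to the analyte can be illustrated by projecting a spectrum onto the space orthogonal to the interferents. The sketch below covers only the simplest case of a single known interferent spectrum, which is an assumption made for brevity; the two methods compared in the paper are not reproduced here, and all names and numbers are hypothetical.

```python
def dot(a, b):
    """Inner product of two equally long vectors."""
    return sum(x * y for x, y in zip(a, b))

def net_analyte_signal(spectrum, interferent):
    """Remove from `spectrum` its component along one interferent spectrum.

    With several interferents the same idea applies, but the projection
    uses the whole interferent space rather than a single vector.
    """
    scale = dot(spectrum, interferent) / dot(interferent, interferent)
    return [s - scale * k for s, k in zip(spectrum, interferent)]

# Hypothetical three-channel spectra.
mixture = [1.0, 1.0, 0.0]
excipient = [1.0, 0.0, 0.0]
nas = net_analyte_signal(mixture, excipient)
print(nas)
```

By construction the resulting vector is orthogonal to the interferent spectrum, which is what makes it specific to the analyte.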
Moisture content determination of pharmaceutical pellets by near infrared spectroscopy: Method development and validation
by J. Mantanus; E. Ziémons; P. Lebrun; E. Rozet; R. Klinkenberg; B. Streel; B. Evrard; Ph. Hubert (pp. 186-192).
The aim of the present study was to develop and validate a near infrared method able to accurately determine the moisture content of pharmaceutical pellets over the range of 1% to 8%, in order to check their moisture content conformity. A calibration set and a validation set were designed to build the method and evaluate its adequacy. An experimental protocol was then followed, involving two operators, independent production campaign batches and different temperatures for data acquisition. On the basis of this protocol, prediction models based on partial least squares (PLS) regression were built. Conventional criteria such as R², the root mean square errors of calibration and prediction (RMSEC and RMSEP) and the number of PLS factors enabled the selection of three preliminary models. However, such criteria did not clearly demonstrate the models' ability to give accurate predictions over the whole analyzed water content range. Consequently, a novel approach based on accuracy profiles, which allow the selection of the model best fitted for purpose, was used. According to this approach, the model using multiplicative scatter correction (MSC) pre-treatment was the most suitable. Indeed, the resulting accuracy profile clearly showed that this model was able to determine moisture content over the range of 1–8% with acceptable accuracy. The present study confirmed that NIR spectroscopy could be used in the PAT concept as a non-invasive, non-destructive and fast technique for moisture content determination in pharmaceutical pellets. In addition, given the limits of the classical and commonly used criteria, accuracy profiles proved to be a powerful decision tool for demonstrating the suitability of the proposed analytical method.
Keywords: Moisture; Near infrared spectroscopy; Pharmaceutical pellets; Validation; Accuracy profile; PAT
Identification and quantification of ciprofloxacin in urine through excitation-emission fluorescence and three-way PARAFAC calibration
by M.C. Ortiz; L.A. Sarabia; M.S. Sánchez; D. Giménez (pp. 193-205).
Due to the second-order advantage, calibration models based on parallel factor analysis (PARAFAC) decomposition of three-way data are becoming important in routine analysis. This work studies the possibility of fitting PARAFAC models with excitation-emission fluorescence data for the determination of ciprofloxacin in human urine. The finally chosen PARAFAC decomposition is built with calibration samples spiked with ciprofloxacin, and with other series of urine samples that were also spiked. One of the series of samples also contains another drug because the patient was taking mesalazine. Mesalazine is a fluorescent substance that interferes with ciprofloxacin. Finally, the procedure is applied to samples from a patient who was being treated with ciprofloxacin. The trueness has been established by the regression “predicted concentration versus added concentration”. The recovery factor is 88.3% for ciprofloxacin in urine, and the mean of the absolute value of the relative errors is 4.2% for 46 test samples. The multivariate sensitivity of the fitted calibration model is evaluated by a regression of the PARAFAC loadings linked to ciprofloxacin versus the true concentration in spiked samples. The multivariate capability of discrimination is near 8 μg L−1 when the probabilities of false non-compliance and false compliance are fixed at 5%.
Keywords: Ciprofloxacin; Parallel factor analysis; Urine; Fluorescence
Uncertainty and periodic behavior of process derived from online NIR
by Maaret Paakkunainen; Jarno Kohonen; Satu-Pia Reinikainen (pp. 206-211).
Past years have shown that near infrared (NIR) spectroscopy can be successfully applied in online process control. NIR measurements are commonly used because they are fast, versatile and relatively cost-effective. Online instruments produce an enormous amount of data, which, like any other online data, need to be analyzed for, e.g., reliability. Instrumental data containing a huge number of simultaneously determined variables are multivariate in nature, and this has to be taken into account when the data are analyzed. The aim of this study was to show that variographic analysis gives novel insight into online NIR data and that the total uncertainty, including variation arising from the process itself, can be estimated. It is shown that variographic analysis can be used to monitor the process dynamics as well as to optimize the sampling interval. The periodic behavior was identified with autocorrelation and fast Fourier transformation (FFT) as well as with variographic analysis. However, variographic analysis gave a more detailed insight into the process dynamics and enabled estimation of uncertainty as a function of sampling interval. These approaches are illustrated with real industrial data originating from a petrochemical plant. Similar periodic behavior could be detected by applying any of the three mathematical methods to the online variable sets containing either NIR or other process control variables. The total uncertainty of the NIR data was estimated by applying variographic analysis under the assumption that the different principal components (PC) are individual “error sources” causing uncertainty.
Keywords: Uncertainty; Periodic behavior; NIR measurements; Variographic analysis; Autocorrelation; Fast Fourier transformation
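The way a variogram exposes periodic behavior and its dependence on sampling interval can be sketched on a single time series. The code below assumes the basic empirical semivariogram, V(j) = Σ (x[i+j] − x[i])² / (2(N − j)); the exact variogram formulation used in the paper is not given in the abstract, and the function name and series are hypothetical.

```python
def variogram(series, max_lag):
    """Empirical semivariogram for lags 1..max_lag (max_lag < len(series))."""
    n = len(series)
    values = {}
    for lag in range(1, max_lag + 1):
        # Squared differences between all pairs of points `lag` steps apart.
        sq_diffs = [(series[i + lag] - series[i]) ** 2 for i in range(n - lag)]
        values[lag] = sum(sq_diffs) / (2 * len(sq_diffs))
    return values

# Hypothetical process signal with a period of two sampling intervals.
signal = [0.0, 1.0] * 6
for lag, v in variogram(signal, 4).items():
    print(lag, v)
```

For a periodic signal the variogram drops to a minimum whenever the lag equals a multiple of the period, so sampling at such an interval would hide the cyclic variation.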
Simultaneous determination of acetylsalicylic acid, paracetamol and caffeine using solid-phase molecular fluorescence and parallel factor analysis
by Julio Cesar L. Alves; Ronei J. Poppi (pp. 212-216).
This paper describes the determination of acetylsalicylic acid (ASA), paracetamol and caffeine in pharmaceutical formulations using solid-phase molecular fluorescence and second order multivariate calibration. This methodology is applicable even in the presence of unknown interferences and with spectral overlap of the components in the mixture. Parallel factor analysis (PARAFAC) was used for model development, whose effectiveness was demonstrated by analysis of variance (ANOVA). Errors below 10% were obtained for all compounds using an external validation set. Benefits of the new procedures not included in the reference methods such as low cost, no need of sample preparation, simple and fast analysis using fluorescence spectrometer and no generation of waste, make this method very attractive, allowing for the simultaneous determination of compounds with good reproducibility and accuracy.
Keywords: Solid-phase fluorescence spectroscopy; Parallel factor analysis; Acetylsalicylic acid; Paracetamol; Caffeine
Application of near infrared spectroscopy and multivariate control charts for monitoring biodiesel blends
by Ingrid Komorizono de Oliveira; Wérickson F. de Carvalho Rocha; Ronei J. Poppi (pp. 217-221).
Multivariate control charts in conjunction with near infrared spectroscopy were developed to verify the quality of biodiesel blends (2% of biodiesel and 98% of diesel). The control charts were built using the net analyte signal method, generating three charts: the NAS chart that corresponds to the analyte of interest (biodiesel in this case), the interference chart that corresponds to the contribution of other compounds in the sample (diesel in this case) and the residual chart that corresponds to nonsystematic variations. For each chart, statistical limits were developed using samples inside the quality specifications. It was possible to identify biodiesel blend samples that were out of specifications relative to biodiesel content, biodiesel contaminated with vegetable oil and diesel contaminated with naphtha.
Keywords: Biodiesel; Multivariate control charts; Net analyte signal; Near infrared spectroscopy
Quality control of packed raw materials in pharmaceutical industry
by O.Ye. Rodionova; Ya.V. Sokovikov; A.L. Pomerantsev (pp. 222-227).
The possibility of routine testing of pharmaceutical substances directly in warehouses is of great importance for manufacturers, especially taking into account the demands of PAT. The application of NIR instruments with a remote fiber optic probe makes these measurements simple and rapid. On the other hand, carrying out measurements through closed polyethylene bags is a real challenge. To make the whole procedure reliable, we propose a special trichotomy classification procedure. The approach is illustrated by a real-world example.
Keywords: NIR measurements; PAT; Fiber optic probe; Measurements through PE packaging; Classification limits; SIMCA
Hybrid hard- and soft-modelling applied to analyze ultrafast processes by femtosecond transient absorption spectroscopy: Study of the photochromism of salicylidene anilines
by C. Ruckebusch; M. Sliwa; J. Réhault; P. Naumov; J.P. Huvenne; G. Buntinx (pp. 228-234).
Multivariate curve resolution-alternating least squares (MCR-ALS) analysis of multi-experiment data was successfully applied to elucidate the photodynamics of N-(3-methylsalicylidene)-3-methylaniline by analyzing UV–vis femtosecond transient absorption spectra. The two-way data obtained present some specific difficulties linked to the nature of the transient spectra collected and to the overlap of the photodynamics with the solvent and other contributions at short time scales (below 1 ps). Advantage was taken of the flexibility of the hybrid hard–soft multivariate curve resolution (HS-MCR) approach to consider a non-absorbing contribution in the kinetic model and to provide a functional description of the solvent in soft-modelling. The results obtained confirm the existence of an intermediate excited state in the process, which is created just after the ESIPT. It was observed that this intermediate relaxes within a few hundred femtoseconds to the S1 fluorescent cis-keto excited state, and a decay time constant of 219 fs was found. These results confirm other femtosecond time-resolved fluorescence studies on salicylidene aniline molecules. A previous hypothesis on the formation of the trans-keto photoproduct from the S1 fluorescent cis-keto state (time constant 14 ps) is also confirmed.
Keywords: Multivariate curve resolution; MCR-ALS; Kinetic modelling; Femtosecond spectroscopy; Salicylidene aniline
Study of the influence of micro-oxygenation and oak chip maceration on wine composition using an electronic tongue and chemical analysis
by A. Rudnitskaya; L.M. Schmidtke; I. Delgadillo; A. Legin; G. Scollary (pp. 235-245).
The influence of micro-oxygenation (MOX) and maceration with oak chips treatments on wine was studied on wine samples from three vintages produced in the Yarra Valley, Australia. A full factorial design was employed where two factors (MOX and oak chips treatments) had two levels and one factor (vintage) had three levels. Three replicated treatments were run for each factor's setting. Wine samples were analysed using conventional laboratory methods with respect to the phenolic wine compounds and colour attributes since the phenolic fraction of wine is most affected by both MOX and oak maceration treatments. The same wine samples were measured with an electronic tongue based on potentiometric chemical sensors. The significance of treatments and vintage effects on wine phenolic compounds was assessed using ANOVA and ANOVA-Simultaneous Component Analysis (ASCA). Cross-validation was used for the ASCA sub-model optimisations and permutation test for evaluations of the significance of the factors. Main effects of vintage and maceration with oak chips were found to be significant for both physicochemical and the ET data. Main effect of MOX treatment was also found significant for the physicochemical parameters. The largest effect on the phenolic composition of wine was due to its vintage, which accounted for 70% and 33% of total variance in the physicochemical and ET data respectively. The ET was calibrated with respect to the total phenolic content, colour density and hue and chemical ages 1 and 2 and could predict these parameters of wine with good precision.
Keywords: Electronic tongue; Chemical sensors; Micro-oxygenation; Red wine; Oak maceration; Analysis of variance-Simultaneous Component Analysis; Permutation test
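The full factorial design described in the abstract above (two two-level factors, one three-level factor, three replicates per setting) can be enumerated directly. In this sketch the level labels are hypothetical placeholders, and the total of 36 runs is simple arithmetic from the stated design rather than a figure quoted in the abstract.

```python
from itertools import product

# Hypothetical level labels; the abstract only gives the number of levels.
mox_levels = ["no MOX", "MOX"]
oak_levels = ["no oak chips", "oak chips"]
vintage_levels = ["vintage 1", "vintage 2", "vintage 3"]
replicates = 3

# Every combination of factor levels: 2 x 2 x 3 = 12 settings.
settings = list(product(mox_levels, oak_levels, vintage_levels))

# Each setting is replicated three times.
runs = [(mox, oak, vintage, rep)
        for (mox, oak, vintage) in settings
        for rep in range(1, replicates + 1)]

print(len(settings), "settings,", len(runs), "runs")
```

A table like `runs` is the natural input layout for ANOVA or ASCA, where each row carries the factor levels for one measured sample.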
Local examination of skin diffusion using FTIR spectroscopic imaging and multivariate target factor analysis
by J. Tetteh; K.T. Mader; J.-M. Andanson; W.J. McAuley; M.E. Lane; J. Hadgraft; S.G. Kazarian; J.C. Mitchell (pp. 246-256).
In the context of trans-dermal drug delivery it is very important to have mechanistic insight into the barrier function of the skin's stratum corneum and the diffusion mechanisms of topically applied drugs. Spectroscopic imaging techniques that enable a spatial examination of various types of samples in a dynamic way are currently evolving. ATR-FTIR imaging opens up the possibility of monitoring spatial diffusion profiles across the stratum corneum of a skin sample. Multivariate data analysis methods based on factor analysis are able to provide insight into the large amount of spectroscopically complex and highly overlapping signals generated. Multivariate target factor analysis was used to resolve the spectra and to obtain local diffusion profiles over time through the stratum corneum. A model drug, 4-cyanophenol in polyethylene glycol 600 and water, was studied. Results indicate that the average diffusion profiles at spatially different locations are similar despite the heterogeneous nature of the biological sample and the challenging experimental set-up.
Keywords: Fourier transform infrared spectroscopy; Skin; Stratum corneum; Target factor analysis; Fourier transform infrared imaging
A chemometric study of chromatograms of tea extracts by correlation optimization warping in conjunction with PCA, support vector machines and random forest data modeling
by L. Zheng; D.G. Watson; B.F. Johnston; R.L. Clark; R. Edrada-Ebel; W. Elseheri (pp. 257-265).
A reverse phase high performance liquid chromatography (HPLC) separation was established for profiling water soluble compounds in extracts from tea. Whole chromatograms were pre-processed by techniques including baseline correction, binning and normalisation. In addition, peak alignment by correction of retention time shifts was performed using correlation optimization warping (COW), producing a correlation score of 0.96. To extract the chemically relevant information from the data, a variety of chemometric approaches were employed. Principal component analysis (PCA) was used to group the tea samples according to their chromatographic differences. Three principal components (PCs) described 78% of the total variance after peak alignment (64% before), and analysis of the score and loading plots provided insight into the main chemical differences between the samples. Finally, PCA, support vector machines (SVMs) and random forest (RF) machine learning methods were evaluated comparatively on their ability to predict unknown tea samples using models constructed from a predetermined training set. The best predictions of identity were obtained by using RF.
Keywords: Tea; Principal component analysis; Warping; Correlation optimization warping; Support vector machines; Random forest; Prediction
Likelihood ratio model for classification of forensic evidence
by G. Zadora; T. Neocleous (pp. 266-278).
One of the problems in the analysis of forensic evidence such as glass fragments is the determination of their use-type category, e.g. does a glass fragment originate from an unknown window or container? Very small glass fragments arise during various accidents and criminal offences, and could be carried on the clothes, shoes and hair of participants. It is therefore necessary to obtain information on their physicochemical composition in order to solve the classification problem. Scanning Electron Microscopy coupled with an Energy Dispersive X-ray Spectrometer and the Glass Refractive Index Measurement method are routinely used in many forensic institutes for the investigation of glass. A natural form of glass evidence evaluation for forensic purposes is the likelihood ratio, LR = p(E|H1)/p(E|H2). The main aim of this paper was to study the performance of LR models for glass object classification which considered one or two sources of data variability, i.e. between-glass-object variability and/or within-glass-object variability. Within the proposed model a multivariate kernel density approach was adopted for modelling the between-object distribution and a multivariate normal distribution was adopted for modelling within-object distributions. Moreover, a graphical method of estimating the dependence structure was employed to reduce the highly multivariate problem to several lower-dimensional problems. The analysis showed that the best likelihood ratio model was the one that includes information about both between- and within-object variability, with variables derived from elemental compositions measured by SEM-EDX and refractive index values determined before (RIb) and after (RIa) the annealing process, in the form dRI = log10|RIa − RIb|. This model gave better results than the model with only between-object variability considered. In addition, when dRI and variables derived from elemental compositions were used, this model outperformed two other classification methods in classifying test set observations into car or building windows.
Keywords: Multivariate physico-chemical data; Likelihood ratio; Graphical models; Glass; Evidence evaluation; Forensic sciences
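The two quantities defined in the abstract above, dRI = log10|RIa − RIb| and LR = p(E|H1)/p(E|H2), can be illustrated with a small calculation. This is only a sketch: the paper models the densities with a multivariate kernel density and a multivariate normal distribution, whereas here each hypothesis is represented by a univariate normal density as a stand-in, and every parameter value is hypothetical.

```python
import math

def d_ri(ri_before, ri_after):
    """dRI = log10|RIa - RIb| (refractive index after/before annealing)."""
    return math.log10(abs(ri_after - ri_before))

def normal_pdf(x, mean, sd):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def likelihood_ratio(evidence, h1, h2):
    """LR = p(E|H1) / p(E|H2), with each hypothesis given as (mean, sd).

    Univariate normal densities are an illustrative assumption only.
    """
    return normal_pdf(evidence, *h1) / normal_pdf(evidence, *h2)

# Hypothetical refractive indices and class parameters for dRI.
evidence = d_ri(1.5180, 1.5190)
window = (-3.0, 0.3)     # hypothetical (mean, sd) under H1
container = (-4.0, 0.3)  # hypothetical (mean, sd) under H2
lr = likelihood_ratio(evidence, window, container)
print(f"dRI = {evidence:.3f}, LR = {lr:.1f}")
```

An LR above 1 supports H1 over H2 and an LR below 1 supports H2, with the magnitude expressing the strength of that support.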
Evaluation of evidence value of glass fragments by likelihood ratio and Bayesian Network approaches
by G. Zadora (pp. 279-290).
Growing interest in applications of Bayesian Networks (BNs) in forensic science raises the question of whether BNs could be used in forensic practice for the evaluation of glass objects described by the results of physico-chemical analysis, especially the information obtained from analysis performed by the Glass Refractive Index Measurement technique. Comparison of glass fragments, i.e. could two glass samples (one recovered from, e.g., the suspect’s clothes and a control collected from the scene of crime) have originated from the same object, is one of the tasks in the evaluation of glass fragments for forensic purposes. The second problem is the determination of their use-type category, e.g. does an analysed glass fragment originate from an unknown window or container? This process, known as classification, is especially important when the analysed fragment was recovered from the suspect’s clothes and there was no control sample. A total of 111 glass objects (car windows, building windows, and containers) were measured in order to determine the refractive index (RI) before (RIb) and after (RIa) the annealing process, from which a new variable dRI = log10|RIa − RIb| was calculated. Results obtained by the application of BN models were compared to results obtained by the application of suitable likelihood ratio models commonly used in forensic practice today. The research showed that BN models could be satisfactorily applied to obtain the evidence value of glass fragments when RIb is used in the comparison problem. Use of BN with dRI in the classification problem also gave good results.
Keywords: Bayesian Networks; Likelihood ratio; Physico-chemical data; Glass; Evidence evaluation; Forensic sciences