APPLICATION OF MULTIVARIATE METHODS TO THE DIFFERENTIATION OF PERUVIAN WINES



ABSTRACT
This work presents the results of the sensing analysis of Peruvian wines of known brands (commercial wines) and handmade brands, using electronic noses (e-noses) consisting of an array of sensors based on tin oxide doped with Pd or Pt, some of them with a zeolite coating. The sensors were combined in order to obtain the best discrimination of the wines with the multivariate methods, with a high level of confidence and a good distribution of the results. The Principal Component Analysis (PCA), cluster and factor analysis results showed that the electronic noses made it possible to efficiently distinguish wines of known brands from those of handmade brands, revealing the way in which the wines were produced. In addition, the multivariate methods applied to the electronic noses made up of SnO2 sensors doped with palladium showed a clear differentiation of Borgoña-type wines from handmade-brand wines and evidenced the formation of agglomerations between red and rosé wines. The application of PCA, cluster and factor analysis in this study gave good results in the differentiation of wines, even with electronic noses formed with a low number of sensors.

INTRODUCTION
The real world is full of multivariate systems that require the simultaneous analysis of the different variables that could affect a process. For example, to analyze a food or a drink it is necessary to consider not only the chemicals that make up the product but also the many variables that could interact with each other, i.e., a matrix of relationships that may affect two or more variables at the same time. Among the most common methods for analyzing these systems, Principal Component Analysis (PCA) is the most frequently used statistical approach. It converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The method constructs a linear transformation that chooses a new coordinate system for the original data set, in which the largest variance of the data set is captured on the first axis (the first principal component), the second largest variance on the second axis, and so on (Gupta & Barbu, 2018; Johnson & Wichern, 2007).
A key aspect of the PCA method is the interpretation of the principal factors over which the variance of the initial data is distributed. There is no general methodology applicable to all types of data; the interpretation is deduced by observing the relationship between the principal factors and the initial data.
For this study, cluster and factor analysis were selected to corroborate the PCA results and to provide additional statistical information for the discrimination of wines according to either the brand (known or handmade) or the type of wine (red, rosé, Borgoña).

Principal Component Analysis
Principal Component Analysis (PCA) (Johnson & Wichern, 2007; Aldás & Uriel, 2017) is a statistical method used to reduce the dimensionality of a data set. It is usually used to find the causes of the variation in the data set and to order them by importance. It is primarily used in exploratory data analysis and to build predictive models. It involves obtaining the eigenvalues and eigenvectors of the covariance matrix, after centering the variables on their means.
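The procedure described above (centering, covariance matrix, eigenvalues and eigenvectors) can be sketched in a few lines. This is a minimal illustration with NumPy, not the software used in the study; the function name `pca` and the random data standing in for sensor responses are assumptions made only for the example.

```python
import numpy as np

def pca(X):
    """PCA by eigen-decomposition of the covariance matrix.

    X has one row per object (e.g. a wine measurement) and one column
    per variable (e.g. a sensor). Returns the scores, the components
    (one per column) and the fraction of variance explained by each.
    """
    Xc = X - X.mean(axis=0)                 # center each variable on its mean
    cov = np.cov(Xc, rowvar=False)          # covariance matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder from largest to smallest
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # coordinates in the new axes
    ratio = eigvals / eigvals.sum()         # explained variance per component
    return scores, eigvecs, ratio

# Hypothetical data: 9 objects, 4 variables, two of them strongly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(9, 4))
X[:, 1] = 2.0 * X[:, 0] + 0.1 * X[:, 1]

scores, components, ratio = pca(X)
print("explained variance per component:", np.round(ratio, 3))
```

The columns of `scores` are uncorrelated with each other, and `ratio` is ordered from the component that explains the most variance to the one that explains the least.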
Principal component analysis can be based on either the correlation matrix or the covariance matrix. With the correlation matrix, the same weight is given to each of the variables; this may be appropriate when all variables are considered equally important. The covariance matrix is used when all the variables have the same measurement units and when it is necessary to preserve the individual variance of each variable.
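The practical difference between the two matrices can be seen with a small sketch. The data below are invented for illustration: the third variable is simply given a scale 100 times larger than the others, which is an assumption and not a property of the sensors in this study.

```python
import numpy as np

def explained_variance(matrix):
    """Fraction of the total variance carried by each eigenvalue, largest first."""
    eigvals = np.linalg.eigvalsh(matrix)[::-1]  # eigvalsh returns ascending order
    return eigvals / eigvals.sum()

# Hypothetical data: 12 objects, 3 independent variables.
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 3))
X[:, 2] *= 100.0  # third variable expressed on a much larger scale

cov = np.cov(X, rowvar=False)        # keeps the original scales
corr = np.corrcoef(X, rowvar=False)  # every variable rescaled to unit variance

ratio_cov = explained_variance(cov)
ratio_corr = explained_variance(corr)

# The correlation matrix is the covariance matrix of the standardized data.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
same_matrix = np.allclose(np.cov(Z, rowvar=False), corr)

print("covariance-based: ", np.round(ratio_cov, 3))
print("correlation-based:", np.round(ratio_corr, 3))
```

With the covariance matrix the large-scale variable dominates the first component, whereas the correlation matrix weights the three variables equally.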
The principal components are obtained as linear combinations of the original variables and are ordered according to the percentage of variance they explain. One advantage of the method is that it retains the features of the data set that contribute most to its variance; the first component is the most important because it accounts for the highest percentage of the variance of the data.
In this study, the sensor response signal (voltage) measured over time (seconds) was considered for each of the objects, which in this case were the different types of wine, the volatile components of the wine, the type of sensor, the dopant metal in the sensor and the zeolite coating of the sensor. By applying the PCA method, it was possible to select the components that would later replace the original variables.

Cluster Analysis
Cluster analysis is the name of a group of multivariate techniques whose main purpose is to group objects based on their characteristics. Cluster analysis classifies objects in such a way that each object is very similar to the other objects in its cluster, with respect to some predetermined selection criterion. The resulting clusters must show a high degree of internal homogeneity (within the cluster) and a high degree of external heterogeneity (between clusters). Cluster analysis is especially useful when it is necessary to develop hypotheses concerning the nature of the data or to examine previously established hypotheses (Johnson & Wichern, 2007; Aldás & Uriel, 2017). Any number of rules can be used in cluster analysis, but the fundamental task is to assess the mean distance between the objects within each cluster: as this mean increases, the objects in the cluster become less similar.

Factor Analysis or Analysis of Common Factors
In many research studies it is not always possible to measure the variables directly, as is the case with qualitative variables such as level of intelligence or social class. In these cases, it is necessary to collect indirect measures that are related to the concepts of interest. The variables of interest are called latent variables, and the methodology that relates them to observed variables is called Factor Analysis. The Factor Analysis model is a multiple regression model that relates latent variables to observed variables. This method has many points in common with principal component analysis, and essentially looks for new variables or factors that explain the data. In principal component analysis only orthogonal transformations of the original variables are made, emphasizing the variance of the new variables, whereas in factor analysis the main interest is to explain the structure of the covariances between the variables (Johnson & Wichern, 2007).

MATERIALS AND METHODS
Sensors Preparation
In a previous work (Paredes-Doig et al., 2019), sensors based on SnO2 doped with palladium (0.1, 0.2, 0.3 and 0.5% Pd) or platinum (0.1, 0.2, 0.3 and 0.5% Pt) were prepared by the wet impregnation method. To increase the sensitivity of the sensors to the volatile chemicals present in the aroma of the Peruvian wines evaluated, some sensors were coated with zeolite Y.

Preparation of samples
A template was formed with adhesive tape to define the area to be covered by the metal-doped SnO2 (Pd or Pt) on the surface of an alumina plate. Subsequently, 0.1 g of tin oxide doped with Pd or Pt (0.1, 0.2, 0.3 or 0.5%) was combined with 0.02 g of ethylcellulose and 32 μL of α-terpineol to form a paste, which was deposited on the alumina substrate and then heat-treated in an oven for 15 min at 600 °C.

Sensors with zeolite Y coating
To a beaker containing 0.05 g of tin oxide doped with Pt or Pd, 0.01 g of ethylcellulose and 16 μL of α-terpineol were added. The substances were mixed uniformly to form a paste, which was deposited on an alumina surface containing two gold electrodes and then calcined at 600 °C for 10 min using a heating ramp of 3 °C/min.
To prepare a thin layer of zeolite Y, 1,2-propanediol was used as a solvent, following the procedure described by Vilaseca et al. (2008). Each mixture was stirred constantly until the zeolite Y was dispersed in the solvent. Once the system was homogeneous, a small quantity was extracted with a micropipette and deposited by microdripping on the surface of the tin oxide previously placed on the alumina sheet. Subsequently, the sensor was tested in the presence of the volatile compounds of the wine samples.
The sensing measurements of the volatile components contained in the aroma of each wine were performed in triplicate for each sample using the following measurement parameters:
Table 3 shows the list of the Peruvian wines used in the analysis and the nomenclature used.
It is also observed that the PCA can associate wines with similar characteristics, in this case by the type of wine. As can be seen in Figure 1, in some cases the circles intersect, showing dispersion in the results.
The PCA shows that the electronic nose made up of palladium-doped SnO2 sensors without zeolite coating gave a non-homogeneous distribution by type of wine.
An interclass variation of 66.19% and an intraclass variation of 33.81% were obtained from the hierarchical cluster. The results are not as clear as in the case of the PCA; only the OB wine is distinguished from the others. However, the interclass variation of 66% indicates that there is a good differentiation of the classes that group the wines. In other words, there is heterogeneity between classes and more homogeneity within classes.
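The text does not state how the interclass and intraclass percentages were computed. A common definition, assumed here, splits the total sum of squared deviations into a between-class part and a within-class part, so that the two percentages add up to 100. The function name and the four sample points below are hypothetical.

```python
import numpy as np

def variance_decomposition(X, labels):
    """Return (interclass %, intraclass %) for a given class assignment.

    Assumes the sum-of-squares definition: total = between + within.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    total = ((X - grand_mean) ** 2).sum()      # total sum of squares
    within = 0.0
    for k in np.unique(labels):
        members = X[labels == k]
        within += ((members - members.mean(axis=0)) ** 2).sum()
    between = total - within                   # what the class means explain
    return 100.0 * between / total, 100.0 * within / total

# Hypothetical data: two tight, well-separated classes.
X = np.array([[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8]])
labels = np.array([0, 0, 1, 1])

inter, intra = variance_decomposition(X, labels)
print(f"interclass: {inter:.2f} %, intraclass: {intra:.2f} %")
```

Under this definition, a high interclass percentage means that most of the variation lies between the class centers, which is the reading given to the 66.19% value above.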
After seeing the results with the hierarchical cluster method, the k-means cluster with only 2 fixed classes was applied, obtaining only 13.52% for interclass variation and 86.48% for intraclass variation. The interclass separation is very low; therefore, this method did not contribute to the discrimination of the analyzed wines.
[Figure caption, truncated: (0.1, 0.2, 0.3, 0.5% Pd) without zeolite coating]
In the following section, multivariate methods (PCA, cluster and factor analysis) were applied to some combinations of sensor arrays. These do not correspond to the total set of sensors used or to all samples, but rather to partial sets of such sensors.
a) Array of sensors: SnO2-Z, 0.1%Pd/SnO2-Z and 0.2%Pd/SnO2-Z
In Figure 2, a better differentiation by type of wine is observed, especially for the Borgoña type, because the region where the Borgoña-type wines are located is clearly separated from that of the handmade-brand wines and the other wines. On the other hand, the signals corresponding to the rosé and red wines are located in the same region, probably due to the formation of agglomerates.
Better results were achieved with the cluster method, using a nose formed with only 3 sensors. The interclass percentage reaches 87%, which means that a good class differentiation is observed. Furthermore, the Borgoña wines are grouped into a single class and almost all of the red and rosé wines are in another class. The same pattern was also observed with the factor analysis.
The k-means cluster and factor analysis approaches yielded 3 classes of wines: 1. commercial Borgoña wines, 2. commercial red and rosé wines and 3. handmade wines.
Figure 3 presents the PCA obtained with the combination of five sensors: SnO2-Z, SnO2 doped with platinum and all those with zeolite coating. The total variance level is 93.06%. In this case, a clear differentiation of the well-known brand wines from those of the handmade brand is observed. Moreover, a differentiation of the Borgoña-type wines is observed among the former. However, the signals of the red and rosé wines are in the same region, showing agglomeration, which indicates a medium distribution of the signals. The interclass percentage is quite high (86.07%), which indicates that there is a good separation between classes and closer proximity of objects within the agglomerates.
According to the results of this method, the wines are separated into three groups: 2 classes for handmade wines and 1 class for commercial wines. Therefore, with this nose it was possible to differentiate commercial wines from handmade ones. Thus, the brand is related to the composition of the wine that was monitored with the electronic nose.
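A partition into a fixed number of classes, as in the k-means step used in this work, can be sketched as follows. This is a plain Lloyd-type k-means, not necessarily the implementation used by the authors. The farthest-first initialization is an assumption chosen to keep the example deterministic, and the six two-dimensional points are invented stand-ins for sensor-derived features.

```python
import numpy as np

def farthest_first_init(X, k):
    """Deterministic start: first sample, then repeatedly the farthest sample."""
    centers = [X[0]]
    for _ in range(k - 1):
        dists = np.linalg.norm(X[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(X[dists.min(axis=1).argmax()])
    return np.array(centers)

def kmeans(X, k, n_iter=100):
    """Alternate assignment and center update until the centers stop moving."""
    centers = farthest_first_init(X, k)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)          # nearest center for each sample
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return labels, centers

# Hypothetical data: three well-separated groups of two samples each.
X = np.array([[0.0, 0.0], [0.0, 1.0],
              [10.0, 0.0], [10.0, 1.0],
              [5.0, 10.0], [5.0, 11.0]])

labels, centers = kmeans(X, k=3)
print("class of each sample:", labels)
```

The number of classes `k` is fixed in advance, which is why the text reports separate runs with 2 and with 3 classes.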
The factor analysis was applied to corroborate the results obtained with the PCA and the cluster method. Up to three groups can be observed, and the differentiation of the commercial wines from the handmade ones is quite clear. As with the previous nose, the application of a cluster method separated the wines into three classes: 2 classes for handmade wines and 1 for commercial wines.
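The text does not specify which factor extraction method was used. As an illustration only, the sketch below uses the principal component method of estimating factor loadings (described in Johnson & Wichern, 2007), in which the loadings are the leading eigenvectors of the correlation matrix scaled by the square roots of their eigenvalues. The function name and the simulated two-factor data are assumptions.

```python
import numpy as np

def principal_component_loadings(X, n_factors):
    """Factor loadings by the principal component method.

    Returns the loadings, the communalities (variance of each variable
    explained by the retained factors) and the uniquenesses (the rest).
    """
    R = np.corrcoef(X, rowvar=False)                # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)            # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_factors]   # keep the largest ones
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    communalities = (loadings ** 2).sum(axis=1)
    uniquenesses = 1.0 - communalities
    return loadings, communalities, uniquenesses

# Hypothetical data: 5 observed variables driven by 2 latent factors plus noise.
rng = np.random.default_rng(2)
factors = rng.normal(size=(40, 2))
weights = np.array([[1.0, 0.9, 0.8, 0.0, 0.1],
                    [0.0, 0.1, 0.2, 1.0, 0.9]])
X = factors @ weights + 0.3 * rng.normal(size=(40, 5))

loadings, communalities, uniquenesses = principal_component_loadings(X, n_factors=2)
print("communalities:", np.round(communalities, 3))
```

Variables that load strongly on the same factor vary together, which is how a factor plot can group objects in a way comparable to the PCA and cluster results.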
Given the considerable interclass distance (more than 83%) obtained with the previous electronic nose (e-nose), the differentiation of wines was also tested with the sensors doped with platinum. With this nose, a good differentiation between handmade and commercial wines was observed. With the two e-noses with platinum sensors

DISCUSSION
In recent years, great attention has been paid to the application of data analysis systems to artificial detection systems, to integrate responses with sensory and chemical data and to combine data from different technologies such as electronic noses, which serve to better replicate the human sensory system (Baldwin et al., 2011). This is why the present investigation has been carried out.
For the differentiation of samples in this type of system, chemometric tools and analyses have been used to extract the causes of the variance of the electronic nose readings and the multivariate distance (Casagrande Silvello & Alcarde, 2020). The most widely applied multivariate procedures are cluster analysis, factor analysis, multidimensional scaling, discriminant analysis, regression analysis, and artificial neural networks (García-González & Aparicio, 2002). In the present work, three multivariate methods have been used: PCA, cluster analysis and factor analysis.
PCA as a technique applied to chemistry has been used in other studies (Welke et al., 2013). For example, in previous works different types of wines such as Chardonnay, Merlot, Cabernet Sauvignon, Sauvignon Blanc and 50% Chardonnay/50% Pinot Noir have been studied, with total variances for the first two components lower than those found in the present work. Welke et al. (2013) also found that the red wines, Cabernet Sauvignon and Merlot, were in the same quadrant. Chardonnay and Sauvignon Blanc wines were separated by PC2, while Merlot, Cabernet Sauvignon and 50% Chardonnay/50% Pinot Noir wines were most influenced by variables related to PC1. In the present study, it was observed that handmade wines were in quadrants I and IV, while commercial wines were found in quadrants II and III. It is also important to note that the sweetest wines were influenced by PC1, whether commercial or handmade.
Sensors doped with platinum gave better results in the detection and discrimination of wines than tin oxide sensors doped with palladium. This behavior was seen in a previous work (Paredes-Doig et al., 2019). The incorporation of platinum into the bulk of tin oxide leads to an increase in the density of chemisorbed oxygen on the surface and, to some extent, increases the resistance of the MOS; however, its character as a dehydrogenation catalyst predominates, which is why it is used to increase the sensitivity of a sensor (Sevastyanova et al., 2012).
The zeolite films improved the detection of wines, as in the work of Vilaseca et al. (2008). The e-noses built with sensors coated with zeolite Y showed better results when the multivariate methods were applied.
E-noses can detect the adulteration of wines with methanol or ethanol (Penza & Cassano, 2004; Berna, 2010). Penza & Cassano (2004) tested three red, three white and three rosé wines from different Italian denominations of origin and vintages using a multisensor array that incorporated four metal oxide (WO3) semiconductor thin-film sensors. In the present study, the handmade wines behaved in a similar way to the adulterated wines studied by Penza and Cassano (2004).
Electronic systems can be used to discriminate wines produced from different grapes and with different techniques. This can be used to verify the authenticity of wines, in comparison with traditional techniques.
Di Natale et al. (1996) employed four MOS sensors to classify wines having the same geographic origin but coming from different vineyards. A comparable detection and differentiation, of commercial wines from handmade wines, was reached in the present work. Lozano et al. (2005) used an e-nose combining sixteen tin oxide thin-film sensors to recognize aromas in white and red wines. In the present investigation, sensor arrays of at most five sensors were used, and even the e-noses with three sensors gave good results in comparison with other studies. Cozzolino et al. (2009) reported that MOS sensors can discriminate between grape varieties and types of wine and may become an important tool for the standardization of wine quality. This was also found in the present study: with the e-noses it could be seen that wines made from a different type of grape (such as the Burgundy grape) were grouped in a separate class. Likewise, the better-quality wines from known brands were separated in the plot from the handmade wines.
For example, cluster analysis has been used to classify four types of coffee, while Pearce et al. (1993) used it to distinguish two types of lagers. Cluster analysis has also been used to study sensor similarities in order to select the sensors with the highest sensitivity from each batch and thus avoid redundancy (Chaudry et al., 2000; García-González & Aparicio, 2002). Although cluster analysis is not as widely used as PCA for this class of work, in the present study it contributed to and corroborated the classification of the wines. The factor analysis also verified the results obtained with the PCA and cluster methods.