X-ray fluorescence spectroscopy
In X-ray fluorescence analysis [8, 9], the surface of a sample is irradiated with an X-ray beam. If the beam energy is high enough, an inner-shell electron is ejected as a photoelectron. The resulting vacancy is then filled by an electron from an outer shell, and the energy difference is emitted as X-ray fluorescence radiation. The energies of the emission lines are characteristic of the elements present, while the intensity of the emission provides information about their concentration at the sample surface.
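The assignment of emission lines to elements can be illustrated with a minimal sketch. The line energies below are rounded textbook values, and the function name `assign_peak`, the tolerance and the example peak positions are assumptions made for illustration only; they are not taken from the instrument software.

```python
# Approximate characteristic line energies in keV (rounded textbook values).
LINES_KEV = {
    "Fe Ka": 6.40, "Ni Ka": 7.48, "Cu Ka": 8.05, "Zn Ka": 8.64,
    "Pb La": 10.55, "Ag Ka": 22.16, "Sn Ka": 25.27,
}

def assign_peak(energy_kev, tolerance_kev=0.12):
    """Return the closest tabulated line within the tolerance, or None.

    The default tolerance is an assumption, chosen to be roughly half of
    the detector resolution quoted in the instrument specifications.
    """
    label, ref = min(LINES_KEV.items(), key=lambda kv: abs(kv[1] - energy_kev))
    return label if abs(ref - energy_kev) <= tolerance_kev else None

# Hypothetical peak positions read from a spectrum (keV).
peaks = [8.03, 22.2, 15.0]
assignments = [assign_peak(e) for e in peaks]
print(assignments)
```

A peak that does not fall close to any tabulated line is left unassigned rather than forced onto the nearest element.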
The X-ray fluorescence technique provides quick, non-destructive analysis. It gives information about the composition of metallic and non-metallic surfaces without the need for any pretreatment. The result is independent of the chemical state of the elements; consequently, the technique gives no information about their chemical bonds or oxidation states. With an appropriate excitation source, all of the elements in the sample can be examined simultaneously in a single measurement. The method can be used to study both solid and liquid substances.
For our studies we used an INNOV-X Alpha handheld analyzer, which can readily measure the concentration of elements heavier than sodium with 0.01% precision in a wide variety of matrices. According to recent research, handheld devices can produce results as accurate as those of benchtop XRF analyzers in the study of coins [10].
Instrument specifications
- Excitation source: X-ray tube, W anode, 10-40 kV, 10-50 μA.
- Detector: Si PiN diode detector, <230 eV FWHM at 5.95 keV Mn K-alpha line.
- Standard elements: Pb, Cr, Hg, Cd, Sb, Ti, Mn, Fe, Ni, Cu, Zn, Sn, Ag, As, Se, Ba, Co, Zr, Rb.
Principal component analysis (PCA)
Principal component analysis [11–14] is known by several names in different areas of science; in articles it also appears as “eigenvector analysis” or “characteristic vector analysis”. PCA is an unsupervised method, so the samples are not assigned to classes before the analysis. The basic idea is to arrange the measured data in a data matrix (denoted X), in which the rows correspond to the samples (in this case, coins) and the columns represent the studied properties (here, metal concentrations). Before the decomposition the data are standardized: each column is first shifted by a constant and then shrunk or expanded, so that the arithmetic mean of each property vector becomes 0 and its standard deviation 1. The matrix can then be decomposed into the product of two matrices, the score matrix (T) and the factor loading matrix (P). There are infinitely many such decompositions, but with constraints such as orthogonality and normalization the solution becomes unique (apart from central mirroring, i.e. a simultaneous sign change of a score vector and the corresponding loading vector).
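The standardization and decomposition steps can be sketched as follows. This is a minimal illustration assuming the decomposition is computed by singular value decomposition with NumPy; the function name `pca` and the synthetic data are hypothetical and do not reproduce the software used in this study.

```python
import numpy as np

def pca(X, n_components):
    """Standardize the columns of X, then decompose Z ~ T @ P.T via SVD."""
    X = np.asarray(X, dtype=float)
    # Standardization: shift each column to mean 0, scale to unit std. dev.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    T = U[:, :n_components] * s[:n_components]   # score matrix
    P = Vt[:n_components].T                      # loading matrix, orthonormal columns
    explained = (s ** 2 / np.sum(s ** 2))[:n_components]
    return Z, T, P, explained

# Synthetic example: 20 samples, 4 properties, two of them strongly correlated.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
X[:, 1] = 2 * X[:, 0] + 0.1 * rng.normal(size=20)

Z, T, P, explained = pca(X, n_components=2)
print(T.shape, P.shape, explained)
```

Multiplying a column of T and the matching column of P by -1 leaves the product unchanged, which is the sign ambiguity that remains after the orthogonality and normalization constraints are imposed.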
PCA can be applied to identify outliers, to reduce the dimensionality of the dataset (which greatly eases work with large, complex datasets), and to build models that describe the behavior of a physical or chemical system and reveal patterns in the data. Such models can be used for prediction when new data (new samples measured in the same way) are introduced.
Linear discriminant analysis (LDA)
Like PCA, linear discriminant analysis (LDA) [15] is a dimension-reducing method, in which latent variables (called canonical variables or roots) are created as linear combinations of the variables of the original data matrix. LDA is a widely used pattern recognition technique. The main difference from PCA is that LDA is supervised, so the class memberships of the samples must be known before the analysis. For N classes, up to N-1 canonical variables can be created.
During LDA, an ellipse (an ellipsoid for three variables, or a hyperellipsoid for more than three) is drawn around each group of scattered points. The ellipse can be interpreted as a horizontal section (contour) of a Gaussian surface that encloses a given percentage of the points of the corresponding group; its center represents the maximum of the Gaussian surface. The discriminant function is given by the line connecting the intersections of the ellipses.
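The construction of canonical variables can be sketched with NumPy. The sketch assumes the classical formulation based on within-class and between-class scatter matrices and a nearest-centroid rule in the canonical space; the function name `lda_fit` and the synthetic three-class data are hypothetical, and the statistical package used in this study may differ in detail.

```python
import numpy as np

def lda_fit(X, y):
    """Return class labels, grand mean, canonical weights W and eigenvalues."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    n_vars = X.shape[1]
    Sw = np.zeros((n_vars, n_vars))   # within-class scatter
    Sb = np.zeros((n_vars, n_vars))   # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        Sw += (Xc - mean_c).T @ (Xc - mean_c)
        d = (mean_c - grand_mean)[:, None]
        Sb += len(Xc) * (d @ d.T)
    # Solve Sb w = lambda Sw w by whitening with the Cholesky factor of Sw.
    L_inv = np.linalg.inv(np.linalg.cholesky(Sw))
    evals, V = np.linalg.eigh(L_inv @ Sb @ L_inv.T)
    order = np.argsort(evals)[::-1][: len(classes) - 1]   # at most N-1 roots
    W = L_inv.T @ V[:, order]
    return classes, grand_mean, W, evals[order]

# Synthetic data: 3 classes, 4 variables, 30 samples per class.
rng = np.random.default_rng(1)
centers = np.array([[0, 0, 0, 0], [5, 0, 0, 0], [0, 5, 0, 0]], dtype=float)
y = np.repeat(np.array([0, 1, 2]), 30)
X = np.repeat(centers, 30, axis=0) + rng.normal(size=(90, 4))

classes, grand_mean, W, roots = lda_fit(X, y)
scores = (X - grand_mean) @ W                     # canonical variables
centroids = np.array([scores[y == c].mean(axis=0) for c in classes])
dist = np.linalg.norm(scores[:, None, :] - centroids[None, :, :], axis=2)
predicted = classes[np.argmin(dist, axis=1)]
accuracy = np.mean(predicted == y)
print(W.shape, accuracy)
```

Because the weights are scaled so that the within-class scatter becomes the identity matrix in the canonical space, the groups appear there as roughly circular clouds, and assigning a sample to the nearest group centroid is a simple stand-in for the ellipse-based picture.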
Classification and regression tree (CART)
CART [15, 16] is a recursive classification method that partitions the dataset by successive binary splits. The principle is to ask a series of yes-or-no questions when classifying the samples (i.e. when building the tree). The algorithm searches the possible variables and their threshold values for the split that gives the best separation. The starting node, which contains all of the samples, is the root of the tree; it is labeled with the group that has the most samples, and at the start the other groups are included in it as well. The algorithm then splits the samples step by step to achieve the most advantageous separation of the groups.
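A single step of this search can be sketched in a few lines. The sketch assumes the Gini impurity as the splitting criterion, which is a common choice for classification trees but is not stated in the text; the function names and the small table of hypothetical Ag and Cu percentages are for illustration only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0 for a pure node)."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, labels):
    """Try every variable and every midpoint threshold; return
    (weighted impurity, variable index, threshold) of the best split."""
    best = None
    n = len(rows)
    for j in range(len(rows[0])):
        values = sorted(set(r[j] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [lab for r, lab in zip(rows, labels) if r[j] <= thr]
            right = [lab for r, lab in zip(rows, labels) if r[j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, j, thr)
    return best

# Hypothetical compositions (Ag %, Cu %) of six coins from two groups.
rows = [(90.0, 8.0), (88.0, 10.0), (91.0, 7.5),
        (60.0, 38.0), (55.0, 43.0), (62.0, 36.0)]
labels = ["A", "A", "A", "B", "B", "B"]

score, variable, threshold = best_split(rows, labels)
print(score, variable, threshold)
```

Here the question "is the first variable at most 75?" already separates the two groups completely. A full tree would apply the same search recursively to each of the two resulting subsets until a stopping rule is met.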
Its expressiveness has made it very popular in various fields, such as data classification in medical diagnostics. Its theoretical basis was devised by Breiman, Friedman, Olshen and Stone in the 1980s [17].
Partial least-squares regression discriminant analysis (PLS-DA)
PLS-DA [18–20] is used for identifying outliers, excluding variables with low variance (thus easing further studies) and, mainly, examining groupings of samples. PLS-DA is closely related to multivariate linear regression, to PCA, and to principal component regression. A possible implementation of the method is to apply matrix decomposition to the original X and Y data matrices, which are thus expressed as a product of three matrices. In the case of PLS-DA, the data matrix Y contains the variables that encode group membership.
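The role of the Y matrix can be illustrated with a minimal NumPy sketch. It assumes that group membership is encoded as a dummy (indicator) matrix, that the weight vectors are taken from the singular value decomposition of the cross-product of the deflated X and Y matrices, and that a sample is assigned to the class with the largest predicted indicator value. The function names and the synthetic data are hypothetical; this is one possible variant of the method, not the implementation of the software used in this study.

```python
import numpy as np

def plsda_fit(X, labels, n_components):
    """Fit a PLS regression of a dummy class matrix Y on X."""
    X = np.asarray(X, dtype=float)
    classes, idx = np.unique(labels, return_inverse=True)
    Y = np.eye(len(classes))[idx]              # one indicator column per class
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean              # centered, deflated in the loop
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: dominant left singular vector of E^T F.
        w = np.linalg.svd(E.T @ F, full_matrices=False)[0][:, 0]
        t = E @ w                              # score vector
        p = E.T @ t / (t @ t)                  # X loading
        q = F.T @ t / (t @ t)                  # Y loading
        E = E - np.outer(t, p)
        F = F - np.outer(t, q)
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    B = W @ np.linalg.inv(P.T @ W) @ Q.T       # regression coefficients
    return classes, x_mean, y_mean, B

def plsda_predict(model, X):
    """Assign each sample to the class with the largest predicted indicator."""
    classes, x_mean, y_mean, B = model
    Y_hat = (np.asarray(X, dtype=float) - x_mean) @ B + y_mean
    return classes[np.argmax(Y_hat, axis=1)]

# Synthetic data: 3 groups, 5 variables, 30 samples per group.
rng = np.random.default_rng(2)
centers = np.array([[0, 0, 0, 0, 0], [6, 0, 0, 0, 0], [0, 6, 0, 0, 0]], dtype=float)
labels = np.repeat(np.array(["A", "B", "C"]), 30)
X = np.repeat(centers, 30, axis=0) + rng.normal(size=(90, 5))

model = plsda_fit(X, labels, n_components=2)
accuracy = np.mean(plsda_predict(model, X) == labels)
print(accuracy)
```

The accuracy printed here is computed on the training samples themselves, so it overstates how well new samples would be classified; in practice cross-validation or a separate test set would be used.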
Multivariate curve resolution with alternating least squares (MCR-ALS)
The method of multivariate curve resolution with alternating least squares (MCR-ALS) [21–23] is a chemometric method that can decompose the data matrix into profiles (composition profiles and pure elemental distribution profiles) with the use of certain constraints [24–26]. The usual assumption in multivariate resolution methods is that the experimental data follow a bilinear model similar to the Lambert-Beer law in absorption spectroscopy. In matrix form, this model can be described as
$$X = C\,E^{\mathrm{T}} \qquad (1)$$
where X is the response matrix, E is the elemental distribution profile matrix of the components, and C is the composition profile matrix for the samples.
Suitably chosen initial estimates of E or C are optimized by solving Eq. (1) iteratively by alternating least squares:
$$C = X^{*}\,(E^{\mathrm{T}})^{+}, \qquad E^{\mathrm{T}} = C^{+}\,X^{*} \qquad (2)$$
where the matrix X* is the reproduced data matrix obtained by principal component analysis for the selected number of components, and the superscript + denotes the pseudoinverse of the matrix to which it is applied [27]. Unfortunately, this decomposition is very often not unique because of rotational and intensity (scaling) ambiguities [28, 29]. The rotational ambiguities can be reduced or even eliminated if suitable constraints are applied [24–26]. Tauler and coworkers developed a Matlab code for MCR-ALS that implements some of these constraints [30].
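The alternating scheme can be sketched with NumPy under explicit assumptions: the bilinear model is written with samples in rows, so that X is approximated by C multiplied by the transpose of E; X* is obtained by truncated singular value decomposition; and non-negativity, imposed by simply clipping negative values, is the only constraint. The function name `mcr_als`, the noise-free synthetic profiles and the perturbed initial estimate are hypothetical; the sketch is not the Matlab code of Tauler and coworkers nor the PLS Toolbox routine.

```python
import numpy as np

def mcr_als(X, E0, n_iter=50):
    """Alternate least-squares updates of C and E with non-negativity clipping."""
    k = E0.shape[1]
    # X*: data matrix reproduced from the first k principal components.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    X_star = (U[:, :k] * s[:k]) @ Vt[:k]
    E = E0.copy()
    for _ in range(n_iter):
        C = np.clip(X_star @ np.linalg.pinv(E.T), 0.0, None)    # update C
        E = np.clip(np.linalg.pinv(C) @ X_star, 0.0, None).T    # update E
    lack_of_fit = np.linalg.norm(X_star - C @ E.T) / np.linalg.norm(X_star)
    return C, E, lack_of_fit

# Synthetic, noise-free example: 15 samples, 60 channels, 2 components.
rng = np.random.default_rng(3)
channels = np.arange(60)
E_true = np.column_stack([
    0.2 + np.exp(-0.5 * ((channels - 20) / 5.0) ** 2),
    0.2 + np.exp(-0.5 * ((channels - 40) / 5.0) ** 2),
])
C_true = rng.uniform(0.2, 1.0, size=(15, 2))
X = C_true @ E_true.T

# Hypothetical initial estimate: the true profiles perturbed by up to 5 %.
E0 = E_true * (1 + 0.05 * rng.uniform(-1, 1, size=E_true.shape))

C, E, lack_of_fit = mcr_als(X, E0)
print(lack_of_fit)
```

The resolved profiles reproduce the data but need not coincide with `C_true` and `E_true`: any invertible 2 × 2 matrix that keeps both factors non-negative yields an equally good fit, which is the rotational and scaling ambiguity mentioned above.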
Experimental
We examined 289 silver coins provided by the Déry Museum of Debrecen. 32 coins were omitted from this dataset because, where only a small number of coins (3 or 4 pieces) dated from the reign of a particular king, that set cannot be considered representative of the period. Four coins were identified as outliers by PCA in the early phase of the research, so they were also omitted. Each measurement (spectrum acquisition and calculation of elemental composition) was carried out three times, with 30 seconds of irradiation each. A prior investigation of several alloys had shown this time span to be an optimal compromise between short measurement time and precision. We used the mean of the three measurements where elemental composition data were needed, and the three results separately where X-ray spectra were needed. The amounts of the following elements were determined: Ti, Fe, Ni, Cu, Zn, Ag, Sn, Sb, Pb, Bi. The properties of the coins were summarized in two tables: one containing the mean values of elemental composition (257 × 10), and the other containing intensity values for all studied wavelengths (257 × 2048). Two data matrices were created accordingly and evaluated with the PCA, LDA, CART and PLS modules of the software STATISTICA 6.0; the MCR-ALS calculations were carried out with PLS Toolbox V6.7.
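The averaging of the three replicate measurements into one row per coin can be sketched as follows. The array layout (coins × replicates × elements), the variable names and the random numbers standing in for measured concentrations are assumptions for illustration only; they are not the measured data.

```python
import numpy as np

rng = np.random.default_rng(4)
n_coins, n_repeats, n_elements = 5, 3, 10   # hypothetical small example

# Random numbers standing in for measured concentrations (%).
replicates = rng.uniform(0.0, 100.0, size=(n_coins, n_repeats, n_elements))

composition = replicates.mean(axis=1)        # one row per coin, one column per element
spread = replicates.std(axis=1, ddof=1)      # replicate standard deviation

print(composition.shape, spread.shape)
```

With 257 coins and the ten elements listed above, the same operation yields a 257 × 10 composition matrix; the replicate standard deviation is an optional check on repeatability that is not described in the text.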