Highly sensitive terahertz non‐destructive testing technology for stone relics deterioration prediction using SVM-based machine learning models

The hollowing deterioration of stone relics requires effective non-destructive testing (NDT) methods for timely restoration and maintenance. To this end, a new NDT method based on terahertz (THz) technology and support vector machine (SVM)-based machine learning models was developed to assess and diagnose the hollowing deterioration of the Yungang Grottoes. Following the experimental design, a series of samples with various hollowing thicknesses were prepared and measured by THz time-domain spectroscopy (THz-TDS). Based on the THz-TDS results of 30 randomly selected samples, an SVM-based hollowing deterioration prediction model (SVM-HDPM) was established by analyzing the relationship between the hollowing thickness and the THz spectral information. The reliability and accuracy of the model were then verified using the THz spectral data of the remaining 10 samples. The experimental results show that the SVM-HDPM with the linear kernel function achieves the best prediction accuracy, implying that the model is feasible for predicting the hollowing deterioration of stone relics. Moreover, a data preprocessing step was introduced into the SVM-HDPM to meet the needs of field-based testing. The predicted results for five samples of different hollowing thickness with a different flaked stone thickness showed good performance, with a very low mean square error (MSE). Therefore, the proposed method can be regarded as an effective NDT technique with practical applications in analyzing cultural relics, and it is promising for inspecting stone relics and similar ancient heritage for hidden flaws.

*Correspondence: liutg2020@163.com. Institute of Solid State Physics, Shanxi Datong University, Xingyun Street, Datong, Shanxi 037009, China.

Introduction
Hollowing deterioration is one of the most common deterioration processes in open-air stone cultural relics. Delayed detection and repair can lead to irreparable damage, because more air becomes trapped inside the hollow structure over time and destroys the sculptures on the surface of the stone relics [1][2][3]. Therefore, it is very important to evaluate the state of hollowing deterioration in order to prevent damage. One promising way to diagnose the extent of the damage is to use non-destructive testing (NDT) methods such as X-ray diffraction (XRD), ultrasonic testing, VIS/NIR hyperspectral imaging, ground-penetrating radar and so on. Hard X-rays can be transmitted through materials of higher density and smaller thickness, such as bronze [4]. Soft X-rays can penetrate small cultural relics such as porcelain, lacquerware, calligraphy and painting [5]. Ultrasonic waves can be used to detect internal defects of stone cultural relics, thus providing information on their internal structure [6]. Ultraviolet and infrared rays are usually used to analyze the surface material composition of calligraphy, painting and textiles [7]. Raman spectroscopy can be used to analyze manuscripts, calligraphy and painting, the pigment composition of porcelain surfaces, and the composition of bronze rust products, thus providing a large amount of surface information on cultural relics [8].
Owing to its non-contact operation, high resolution, highly efficient detection potential and reliable evaluation methods for analyzing cultural relics, terahertz (THz) spectroscopy has been gaining an edge over conventional NDT methods [9][10][11][12][13][14]. To date, the THz NDT approach has been successfully applied to detect several kinds of cultural relics diseases. For example, as early as 2006, Jackson et al. first applied THz spectroscopy to various painted artifacts to detect overpainting, recognize hidden layers in oil paintings, and identify hidden portraits in easel paintings [15][16][17]. Then, Fukunaga Kaori et al. conducted extensive research on the deterioration of tempera paintings, printmaking, murals, painting layers, and ancient vases from the late Middle Ages using THz spectroscopy [9,18,19]. Recently, THz spectroscopy has been used to investigate ancient wooden structures, stone carvings, pottery, ancient mummies, bones, etc. [20][21][22][23][24][25][26]. For instance, Krügener et al. from Germany detected a hidden crack in a stone medallion in the Niedersachsen National Museum in Hannover, assessed the repair condition of a window sill in Trier Cathedral, and detected defects below the glaze argil layer in pottery from the 16th century. However, despite enormous efforts, applying THz technology to the hollowing disease of stone cultural relics remains a daunting challenge, for the following reasons: (1) Because stone has no fingerprint spectrum in the THz band, the hollowing disease of stone cultural relics cannot be characterized directly by THz technology. (2) The absence of a universal theoretical model greatly restricts the wide application of THz technology to the detection of diverse cultural heritage diseases.
Thus, the application scope of THz detection needs to be extended to all stone cultural relics by means of a more universal theoretical model, rather than one that fits only one or a few kinds of relics. However, the number of samples that can be tested is limited (cultural relics testing must be non-destructive, so multi-point and repeated testing is generally difficult), and a universal theoretical model derived from few experimental samples and data needs the help of machine learning to verify its correctness and universality. Compared with other types of machine learning, SVM can handle small-sample, nonlinear and high-dimensional problems and avoids the local-minimum problem of neural networks. More importantly, an SVM prediction model of hollowing deterioration, which is based on the Vapnik-Chervonenkis (VC) dimension theory and the structural risk minimization principle of statistical learning theory, is reported to effectively improve the universality of THz detection [27]. An SVM disease prediction model with excellent robustness is therefore urgently needed. New ideas and methods for disease detection in stone cultural relics should be pursued, and further theoretical work with machine learning is necessary to build a universal THz detection theory for stone relics. The hope is that, when detecting hollowing deterioration in a sandstone cultural relic, the degree of deterioration can be predicted just by measuring the THz reflection spectrum, regardless of the shape of the sample or the part of the object it comes from.
Furthermore, THz detection technology aided by the SVM disease prediction model can be more rapid, efficient and cost-saving in the actual detection of cultural relics diseases. In addition, the high sensitivity of the THz spectrum makes it possible to distinguish diseases of cultural relics that differ only minimally in deterioration.
In this study, THz NDT of hollowing deterioration in the Yungang Grottoes, a typical open-air sandstone cultural relic, was investigated experimentally and theoretically. Based on the relation between the relative time delay of the three reflected pulses and the hollowing thickness of the samples [28][29][30][31], the SVM-based hollowing deterioration prediction model (SVM-HDPM) for stone relics was established using LS-SVM. When the THz data of hollowing deterioration are input into the SVM-HDPM, the hollowing thickness in stone relics can be predicted accurately; the high accuracy and validity of the model are supported by the experimental results.

Hollowing deterioration samples
Because the flaked stone in most of the hollowing disease at the Yungang Grottoes, a typical open-air stone scenic spot, is 2 mm thick, the sandstones collected from the Yungang Grottoes site were cut and ground into 2 mm thick (d_1) samples with two parallel sides. Further, the thickness of the substrate layer (d_3) (stone wall) of the test samples was set to 6 mm to ensure optical opacity, which is similar to the actual rear wall behind hollowing deterioration in the stone relics of the Yungang Grottoes. The thickness of the air-trapping cavity (hollowing) formed under the relic surface is called the hollowing deterioration thickness (d_2). Typical samples with different d_2 values were prepared by changing the thickness of the air cavity under the sample surface. The schematic of hollowing deterioration of stone relics is shown in Fig. 1. The surface roughness of a hollow sample causes scattering and diffuse reflection of THz light, whose energy can be absorbed by the surrounding medium or dissipated directly into the air, but the effect of this dissipation is the same for all hollowing samples. Therefore, surface roughness does not affect how the time delay varies with the hollowing thickness, nor the delay values themselves. In addition, the method used in this study is single-point detection and the beam waist diameter of the THz beam is 2 mm. Based on the above facts, the effect of surface roughness is negligible.
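As a rough orientation for the layered geometry above, the expected spacing of the reflected pulses can be estimated from time of flight. The relations below are a sketch under assumptions that are not stated in the text: normal incidence, a refractive index of about 1 for the air gap, and a frequency-independent index n_1 for the flaked stone. The symbols \Delta t_{12}, \Delta t_{23} and n_1 are introduced here for illustration only.

```latex
% Delay between the reflections from the front and rear surfaces of the flaked stone
\Delta t_{12} = \frac{2\, n_1\, d_1}{c}
% Delay between the reflections from the rear surface of the flake and the stone wall
\Delta t_{23} = \frac{2\, d_2}{c}
\quad\Longrightarrow\quad
d_2 = \frac{c\, \Delta t_{23}}{2}
% With c \approx 0.2998 mm/ps:
%   d_2 = 0.1 mm  gives  \Delta t_{23} \approx 0.67 ps
%   d_2 = 4.0 mm  gives  \Delta t_{23} \approx 26.7 ps
```

Under these assumptions the delay of the wall reflection grows linearly with d_2, at roughly 6.7 ps per millimetre of air gap.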

THz non-destructive testing method
THz NDT was performed on 40 hollowing deterioration samples with d_2 between 0.1 and 4 mm at 0.1 mm intervals using a commercial THz time-domain spectroscopy system (BATOP THz-TDS 1008). A sample with d_2 = 0 mm has no hollowing deterioration. The THz-TDS 1008 system parameters were set as follows [32]: laser central wavelength 800 nm, pulse duration 100 fs, THz scanning range 340-420 ps, and step length 0.02 ps. Further, the samples were tested at room temperature (293 K) and 30 % relative humidity to simulate the actual deterioration environment of the Yungang Grottoes scenic spot.
To understand the behavior of the flaked stone of the hollowing samples at THz wavelengths, two flaked stones with typical thicknesses of 1.5 mm and 2.0 mm were tested by THz-TDS with a transmission sample stage. Figure 2a, b shows the time-domain pulses of the reference and of the signals transmitted through the flaked stones, and their spectra, respectively. In the time domain, the pulses transmitted through the flaked stones are stretched by the dispersion of the material; in the frequency domain, their amplitude is lower because of attenuation. The THz wave penetration properties of the two flaked stones were also investigated from the transmissivity spectra, as displayed in Fig. 2c. The results demonstrate that the THz wave still passes effectively through the flaked stones as the thickness increases from 1.5 to 2.0 mm, while a high signal-to-noise ratio (SNR) is maintained. In this case, THz signals with more than 50 % transmissivity can be used effectively to detect the hollowing samples: the THz wave passes through the flaked stone, crosses the air layer, reaches the stone wall of the hollowing deterioration sample, and is reflected back. A transmissivity above 50 % ensures that the THz-TDS system runs normally and maintains its high-SNR detection performance. In addition, the refractive indexes of the flaked stones are very stable in the THz band, as given in Fig. 2d. Therefore, given the stable refractive indexes and the two parallel sides, the flaked stones can be approximately regarded as an isotropic medium in the subsequent analysis. The data were processed using the model proposed by T. D. Dorney [33] for extracting the optical constants of materials by terahertz time-domain spectroscopy:
$$n(\omega) = \frac{\Phi(\omega)\,c}{\omega d} + 1, \tag{1}$$
where n(ω) is the refractive index of the sample at angular frequency ω, Φ(ω) is the phase delay of the THz wave through the sample, d is the sample thickness, and c is the speed of light in vacuum.
The THz time-domain signals of 10 hollowing deterioration samples are displayed in Fig. 3. The right side of Fig. 3 is an enlarged view of the THz time-domain signals in the time interval from 370 to 400 ps. As can be seen from Fig. 1, there are three evident reflected pulses in the THz reflection signals of the hollowing deterioration samples (except when d_2 = 0 mm). The first pulse is generated by the reflection of the THz pulse from the front surface of the flaked stone of the hollowing deterioration samples. The second and third reflected pulses are generated at the rear surface of the flaked stone and at the front surface of the stone wall of the hollowing deterioration samples, respectively. The time delay of the third pulse gradually increases with increasing d_2. Therefore, the time delay of the third pulse is sensitive to the value of d_2 in the hollowing samples. In addition, according to reports [28,31,34], the shifted peaks and the sub-peaks that follow the primary THz peak can be attributed to oscillations in the time-domain signal caused by the etalon effect in the samples and optical components. Having established that the THz signal can serve as a fingerprint for identifying the hollowing deterioration samples, our goal is to establish a functional relationship between the hollowing thickness (d_2) and the THz signal. However, the THz time-domain signal of the samples can be strongly affected by background noise, system noise, sample scattering, and the dispersion-based stretching of the THz pulse, so that the signal contains partially overlapping and phase-shifted pulses. These factors significantly obstruct correct conclusions, and therefore conventional denoising and deconvolution methods are employed to eliminate the noise interference and to separate the pulses in the detected THz signal [34,35].
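Eq. (1) can be evaluated numerically in a few lines. The sketch below is only an illustration: the function names are hypothetical, and the flake index of 2.0 at 0.5 THz is an assumed value used for a round-trip check, not a measured result from this work.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def refractive_index(phase_delay_rad, freq_hz, thickness_m):
    """Eq. (1): n(omega) = Phi(omega) * c / (omega * d) + 1."""
    omega = 2.0 * math.pi * freq_hz
    return phase_delay_rad * C / (omega * thickness_m) + 1.0


def phase_delay(n, freq_hz, thickness_m):
    """Inverse of Eq. (1); used here only to build a synthetic check value."""
    omega = 2.0 * math.pi * freq_hz
    return (n - 1.0) * omega * thickness_m / C


# Assumed example: a 2 mm flake with index 2.0 at 0.5 THz (hypothetical numbers).
phi = phase_delay(2.0, 0.5e12, 2e-3)
n_recovered = refractive_index(phi, 0.5e12, 2e-3)
print(n_recovered)
```

In practice Φ(ω) would come from the unwrapped phase difference between the sample and reference spectra; that step is not shown here.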
First, the THz time-domain signal of the reference (d_2 = 0 mm) is subtracted from that of each hollowing deterioration sample, which eliminates the influence of the first reflected pulse and of environmental factors. However, this step alone is not suitable when d_2 is small, because the closely lying second and third peaks of the THz pulse are hard to distinguish, as shown in Fig. 3. Second, the averaged second-pulse data of the THz time-domain signals of the 20 samples with d_2 from 2.1 mm to 4.0 mm are subtracted from the result of the first step. This leaves THz time-domain signals that contain only the third peak of the THz pulse. These signals are then denoised in the usual way, and the dependence of the time delay of the third pulse on the hollowing thickness (d_2) is obtained, as shown in Fig. 4.
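The two subtraction steps and the extraction of the third-pulse trough can be sketched as follows. This is a simplified illustration on synthetic Gaussian pulses, not the authors' processing code: the pulse positions, amplitudes and widths are hypothetical, and the denoising step is omitted.

```python
import math


def gaussian(t, center, amplitude, width=0.6):
    """Synthetic pulse shape used only for this illustration."""
    return amplitude * math.exp(-((t - center) / width) ** 2)


def mean_trace(traces):
    """Point-by-point average of several equally sampled traces."""
    return [sum(vals) / len(vals) for vals in zip(*traces)]


def isolate_third_pulse(sample, reference, second_pulse_avg):
    """Step 1: subtract the d_2 = 0 reference. Step 2: subtract the averaged second pulse."""
    return [s - r - p for s, r, p in zip(sample, reference, second_pulse_avg)]


def trough_time(times, trace):
    """Time (ps) at which the trace reaches its minimum."""
    idx = min(range(len(trace)), key=trace.__getitem__)
    return times[idx]


# Time axis matching the scan settings in the text: 340-420 ps in 0.02 ps steps.
times = [340.0 + 0.02 * i for i in range(4001)]

# Hypothetical pulses, chosen only so that the example runs.
reference = [gaussian(t, 350.0, 1.0) for t in times]  # first reflection
second_only = [[gaussian(t, 375.0, a) for t in times] for a in (-0.38, -0.40, -0.42)]
second_pulse_avg = mean_trace(second_only)
third = [gaussian(t, 385.0, -0.30) for t in times]
sample = [r + p + q for r, p, q in zip(reference, second_pulse_avg, third)]

residual = isolate_third_pulse(sample, reference, second_pulse_avg)
t3 = trough_time(times, residual)
print(round(t3, 2))
```

The trough time `t3` is the quantity that later serves as the model input.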

Support vector machine (SVM)
LS-SVM uses a least-squares loss function, which leads to a linear system instead of the quadratic programming problem of the conventional SVM, and is thus less complex than the conventional one. The loss function [27] can be solved by the Lagrange method, and the resulting classification discrimination function is expressed in terms of a_i, b and K(x_i, x_j), where a_i is the Lagrange coefficient, b is the classification threshold, and K(x_i, x_j) is a kernel function. Construction of the kernel function is the key step for SVM. The common kernel functions are the linear kernel K(x, x_i) = x^T x_i, the polynomial kernel K(x, x_i) = (γ x^T x_i + r)^p with γ > 0, and the radial basis function (RBF) kernel K(x, x_i) = exp(−γ||x_i − x||^2). Thus, three SVM prediction models of hollowing deterioration can be developed using these three kernel functions. The optimal model parameters are determined by the leave-one-out cross-validation (LOOCV) method to achieve optimal prediction results. They include the penalty coefficient (c), which controls the empirical risk, the insensitivity parameter (ε), which controls the error boundaries, and the radial basis coefficient (γ), which controls the sensitivity of the SVM to changes in the input variables [36].
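The three kernel functions named above can be written out directly. The sketch below uses plain Python lists as vectors; the default values of γ, r and p are placeholders for illustration and are not the optimised parameters of this work.

```python
import math


def linear_kernel(x, xi):
    """K(x, x_i) = x^T x_i."""
    return sum(a * b for a, b in zip(x, xi))


def polynomial_kernel(x, xi, gamma=1.0, r=1.0, p=2):
    """K(x, x_i) = (gamma * x^T x_i + r)^p, with gamma > 0."""
    return (gamma * linear_kernel(x, xi) + r) ** p


def rbf_kernel(x, xi, gamma=1.0):
    """K(x, x_i) = exp(-gamma * ||x_i - x||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-gamma * squared_distance)


u, v = [1.0, 2.0], [3.0, 4.0]
print(linear_kernel(u, v), polynomial_kernel(u, v), rbf_kernel(u, v, gamma=0.5))
```

For a single scalar feature such as a pulse delay, each vector is simply a one-element list.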

SVM hollowing deterioration prediction model
It can be observed from the left side of Figs. 3 and 4 that there is an approximately linear correlation between d_2 and the trough position of the third THz pulse of the hollowing deterioration samples. Therefore, the delay time of the third reflected pulse in the time-domain signal of the hollowing deterioration samples and d_2 can be used as the vectors for the SVM prediction model analysis. To develop an SVM-HDPM for arbitrary d_2, 40 THz time-domain signals of hollowing deterioration samples with different d_2 were selected. The time delay values of the troughs of the reflected wave were extracted from the time-domain signals and taken as the eigenvalues, which together with d_2 formed the feature vectors; these vectors were then translated into the SVM-light sample format [36]. In the next step, 30 feature vector data groups were randomly selected from the total dataset as training samples, and the best parameters for the SVM models of hollowing deterioration were found by the leave-one-out method. The leave-one-out procedure is as follows: 29 of the 30 datasets are used as the training set and the remaining one as the test set; the next dataset is then taken as the test set with the other 29 as the training set, and so on. In this manner, the SVM-HDPM is developed iteratively using LS-SVM. The remaining 10 groups of feature vector data are used as test samples to verify the reliability of the developed SVM-HDPM.
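The training and validation loop described above can be sketched without external libraries. The code below is an illustration under stated assumptions, not the authors' implementation: the 40 delays are synthetic (generated from an idealised delay of 2·d_2/c), the input is a single scalar feature with a linear kernel, the grid of candidate c values is hypothetical, and the model is the standard textbook LS-SVM regression system, in which the bias b and coefficients a_i solve [[0, 1^T], [1, K + I/c]] [b; a] = [0; y].

```python
import random

C_MM_PER_PS = 0.299792458  # speed of light in mm/ps


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_lssvm(xs, ys, c):
    """Linear-kernel LS-SVM regression on scalar inputs."""
    n = len(xs)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        A.append([1.0] + [xs[i] * xs[j] + (1.0 / c if i == j else 0.0) for j in range(n)])
    sol = solve(A, [0.0] + list(ys))
    return sol[0], sol[1:], list(xs)  # bias b, coefficients a_i, training inputs


def predict(model, x):
    b, a, xs = model
    return b + sum(ai * xi * x for ai, xi in zip(a, xs))


def mse(predicted, actual):
    return sum((p - t) ** 2 for p, t in zip(predicted, actual)) / len(actual)


def loocv_mse(xs, ys, c):
    """Leave-one-out: train on n-1 samples, test on the held-out one, repeat."""
    preds = []
    for i in range(len(xs)):
        model = fit_lssvm(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], c)
        preds.append(predict(model, xs[i]))
    return mse(preds, ys)


# Synthetic stand-in for the 40 samples: d_2 = 0.1 ... 4.0 mm, idealised delays in ps.
d2 = [round(0.1 * k, 1) for k in range(1, 41)]
delay = [2.0 * d / C_MM_PER_PS for d in d2]

order = list(range(40))
random.Random(0).shuffle(order)  # fixed seed so the split is reproducible
train, test = order[:30], order[30:]
x_train, y_train = [delay[i] for i in train], [d2[i] for i in train]

candidates = [0.1, 1.0, 10.0, 100.0]  # hypothetical grid for the penalty c
best_c = min(candidates, key=lambda c: loocv_mse(x_train, y_train, c))
model = fit_lssvm(x_train, y_train, best_c)
test_mse = mse([predict(model, delay[i]) for i in test], [d2[i] for i in test])
print(best_c, test_mse)
```

Because the synthetic data are noise-free and exactly linear, the test MSE here only reflects regularisation and rounding; measured delays would give a larger value.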
For a quantitative comparison of the three prediction models, the main modeling and evaluation parameters were chosen, including c, ε, γ, and the mean square error (MSE). The relevant parameters of the three prediction models are listed in Table 1. The SVM-HDPM using the linear kernel function has the lowest MSE, as low as 3.303 × 10^-4, which indicates that it has the best prediction accuracy among the three models. Moreover, it is better than the traditional curve fitting method, for which an MSE of 0.998 was reported by Feng [37]. The SVM-HDPM with the RBF kernel function ranks second in prediction accuracy, and the SVM-HDPM with the polynomial kernel function has the lowest prediction accuracy. Therefore, the linear kernel function can be regarded as the most suitable for developing the hollowing deterioration prediction model. In practical applications, the d_2 value is predicted by inputting into the SVM-HDPM the time delay of the third pulse in the THz time-domain signal of the hollowing sample. The SVM-HDPM can thus provide effective reference data for the timely restoration and maintenance of cultural relics.

Application of the SVM-HDPM
In practice, the flaked stone thickness of hollowing deterioration in sandstone cultural relics is rarely exactly equal to 2 mm. Therefore, to meet the needs of field-based testing, data preprocessing is a necessary step for data quality and reliability. Take a flaked stone thickness of 1.5 mm as an example. In our model, the second reflection pulse is generated at the rear surface of the flaked stone of the hollowing deterioration, that is, the time delay of the second reflection pulse expresses the flaked stone thickness (d_1). Thus, the difference in flaked stone thickness can be expressed by the time delay difference (Δt) of the second reflection pulse between the two kinds of hollowing deterioration samples, obtained from their THz time-domain signals as exhibited in Fig. 5. The spectra in the inset show the corresponding range of the THz spectra in the main panel, denoised and smoothed using the loess (locally weighted regression) method. The time delay of the third reflection pulse of the hollowing deterioration is extracted, Δt is added to it, and the result is translated into the SVM-light sample format and used as the input value. After this value is input into the SVM-HDPM, the predicted hollowing deterioration thickness is obtained as the output of the SVM-HDPM. The detailed process is displayed in Fig. 6. As can be seen from Fig. 6, five hollowing deterioration samples with different d_2 and a fixed d_1 = 1.5 mm were further tested by THz-TDS, and the values of d_2 predicted by the SVM-HDPM are in good agreement with the experimental ones. More importantly, the results after data processing have an MSE of 4.46 × 10^-4, indicating that the SVM-HDPM has good applicability and high accuracy even in a complicated application situation.
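The preprocessing step described above can be sketched as follows. The sign convention is an assumption: Δt is taken as the second-pulse delay of the 2 mm training case minus that of the field sample, so that a thinner flake gives a positive correction. The delay values and function names are hypothetical, and the output line follows the SVM-light layout of a target value followed by index:value pairs with indices starting at 1.

```python
def delta_t(t2_training_ps, t2_field_ps):
    """Assumed convention: second-pulse delay of the training case minus the field sample."""
    return t2_training_ps - t2_field_ps


def corrected_delay(t3_field_ps, dt_ps):
    """Shift the third-pulse delay so that it matches the training geometry."""
    return t3_field_ps + dt_ps


def to_svmlight_line(target, features):
    """Format one sample as '<target> 1:<v1> 2:<v2> ...'."""
    pairs = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{target} {pairs}"


# Hypothetical delays in ps for a 2.0 mm training flake and a 1.5 mm field flake.
dt = delta_t(375.0, 372.0)
x_input = corrected_delay(378.0, dt)
line = to_svmlight_line(1.0, [x_input])
print(line)
```

The target field is only a placeholder when the line is used for prediction rather than training.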
In addition, it is believed that the SVM-HDPM can be applied to a wide range of stone relics and yields accurate results despite variations in the hollowing thickness of the samples, whether or not the tests are randomized and double-blind.

Conclusions
The LS-SVM regression algorithm uses the THz spectral eigenvalues of hollowing deterioration of stone cultural relics as input values, and the optimal parameters are determined by LOOCV after an appropriate kernel function is chosen. On this basis, SVM prediction models of hollowing deterioration were proposed. The developed models can predict the hollowing deterioration thickness (d_2) accurately by simply inputting characteristic spectral data obtained from THz NDT. The proposed THz-based method, which uses non-contact and non-destructive THz signals reflected from the sample, has the merits of simple structure, fast operation, high stability, and repeatability when compared with conventional contact and invasive detection methods. Moreover, the prediction model not only provides a new method for non-destructive detection of deterioration in stone cultural relics but also has promising practical applications and future prospects.

Fig. 6 Diagram of the detailed process for predicting the hollowing deterioration thickness with the SVM-HDPM