
A color prediction model for mending materials of the Yuquan Iron Pagoda in China based on machine learning

Abstract

During the restoration of iron cultural relics, rust must be removed from the artifacts. However, rust removal can leave the local color of the relic inconsistent. To address this, mending materials are applied to the surface so that the local color becomes consistent. A significant challenge in this surface treatment is adjusting the color of the mending materials. The corrosion products of the Yuquan Iron Pagoda are mainly Fe3O4, γ-FeO(OH), α-FeO(OH), and α-Fe2O3, with contents of 13.1, 16.1, 40.2, and 30.6%, respectively. Because of their structural stability and suitable color, Fe3O4 and α-Fe2O3 are selected as the primary raw materials for the mending materials. This study employs machine learning methods to predict the color of mending materials from the contents of α-Fe2O3, Fe3O4, and epoxy resin. Artificial Neural Network (ANN), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) algorithms are used to develop models, and their predictive performance is compared. XGBoost exhibits the best prediction performance, achieving a coefficient of determination (R2) of 0.94238 and a mean absolute error (MAE) of 0.68485. Additionally, the SHapley Additive exPlanations (SHAP) method is employed to identify the raw material that most strongly affects the color of the mending materials, which is Fe3O4. The study illustrates the specific process of using the model by applying it to the surface treatment of the Yuquan Iron Pagoda, demonstrating its practicality. The model can also assist in the surface treatment of other iron cultural relics.

Introduction

The Yuquan Iron Pagoda

Iron cultural relics hold significant historical value and serve as crucial evidence in historical and cultural research. Among these relics, the Yuquan Iron Pagoda stands out as the tallest, heaviest, and most complete iron pagoda cultural relic in the country. Built in 1061, the Yuquan Iron Pagoda boasts a history of 963 years. The Yuquan Iron Pagoda stands at 16.945 m in height and weighs 26472.3 kg. Its surface features a total of 120 patterns, including 85 Buddha images, 2279 Buddha figures (2260 Buddha statues in existence), and 35 other decorative motifs. The intricate Buddha statues and decorative patterns highlight exceptional casting and carving craftsmanship. Additionally, there are 1762 inscriptions on the pagoda's surface, documenting its historical significance. Constructed in the form of a wooden pavilion, the pagoda's body is made of pig iron in sections without welding, relying solely on its weight for stability. The unique construction technology and design concept of the Yuquan Iron Pagoda offer valuable guidance in modern architectural design. As an outstanding material cultural heritage of the Chinese nation, the Yuquan Iron Pagoda serves as a repository of history and culture, representing a significant humanistic resource with profound social and artistic value.

Long-term exposure to the natural environment has damaged the Yuquan Iron Pagoda [1,2,3], as depicted in Fig. 1. When the pagoda was dismantled, it was found to comprise 54 components. Only 2 of these showed no obvious corrosion, while the remaining 52 exhibited varying degrees of corrosion, as illustrated in Fig. 2. The rust damage is severe, particularly on the inscriptions and cast images, and puts the historical information carried by the pagoda at risk of loss.

Fig. 1 The corrosion status of the Yuquan Iron Pagoda

Fig. 2 Schematic diagram of rust lesion grades

In an inland neutral environment, iron absorbs water, which forms a thin liquid film on its surface; this leads to electrochemical corrosion through the formation of primary corrosion cells. The process begins with the anodic dissolution of iron, which produces Fe2+ ions (Formula 1). Oxygen (O2) from the air then dissolves into the water film on the iron surface and is reduced at the cathode to OH− ions (Formula 2). The electrochemical corrosion of iron therefore proceeds mainly as oxygen-absorption corrosion. Under neutral and weakly alkaline conditions, the OH− ions produced by the cathodic reaction migrate toward the anode and combine with Fe2+ ions to form Fe(OH)2 (Formula 3) [4]. The specific reaction formulas are presented in Table 1.

Table 1 Electrochemical decay chemical reaction formula of Fe
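
In standard textbook notation, the three reactions described in the preceding paragraph can be written as follows. The labels assume that Formulas 1 to 3 of Table 1 follow the order in which the text introduces them (anodic dissolution, cathodic oxygen reduction, precipitation of the hydroxide).

$$\mathrm{Fe}\rightarrow {\mathrm{Fe}}^{2+}+2{e}^{-}\quad \text{(Formula 1)}$$
$${\mathrm{O}}_{2}+2{\mathrm{H}}_{2}\mathrm{O}+4{e}^{-}\rightarrow 4{\mathrm{OH}}^{-}\quad \text{(Formula 2)}$$
$${\mathrm{Fe}}^{2+}+2{\mathrm{OH}}^{-}\rightarrow \mathrm{Fe}{(\mathrm{OH})}_{2}\quad \text{(Formula 3)}$$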

The Fe(OH)2 produced during the initial stages of cast iron corrosion is characterized by its loose, weak structure, limited thermal stability, and susceptibility to rupture. It reacts with O2 in the thin water film to form γ-FeO(OH), a process characterized by a standard free energy change (∆G298) < 0, indicating it is a spontaneous and irreversible process [5]. However, γ-FeO(OH) exhibits poor thermal stability and initially dissolves and precipitates into amorphous hydroxyl iron oxide (FeOx(OH)3-2x) under neutral conditions. Rusty material in this state is highly unstable and undergoes a solid-state transformation to form the thermally more stable α-FeO(OH) [6]. Additionally, in humid nighttime conditions, the less stable γ-FeO(OH) transforms into the more stable Fe3O4 [5]. Conversely, under dry daytime exposure, some of the γ-FeO(OH) gradually loses H2O, leading to the formation of the more stable reddish-brown compound hematite α-Fe2O3 [7].

This corrosion not only damages the artistic value of the Yuquan Iron Pagoda but also jeopardizes its safety and stability. Hence, restoring the pagoda is of utmost importance. The first step in restoring an iron pagoda is rust removal. However, this process may leave certain areas of the surface too bright, creating a significant difference in local color. When iron cultural relics are restored, their original appearance should be preserved and no visible traces of repair should remain, so that the relic keeps its sense of history and original style. Therefore, the Yuquan Iron Pagoda requires surface treatment to achieve consistent local color and restore its historical character.

The surface treatment of the iron pagoda involves applying mending materials to the areas requiring treatment. These mending materials must both bond well to the iron pagoda and closely match its color. As illustrated in Fig. 1, the rust colors observed on the iron pagoda are predominantly yellow, red, and black. Yellow rust is considered harmful because of its poor stability and cannot serve as a raw material for mending materials. Fe3O4 and α-Fe2O3 are black and red, respectively; they resemble the rust of the iron pagoda and are stable [8]. Therefore, Fe3O4 and α-Fe2O3 can be used as raw materials for mending materials for iron cultural relics [9], and they have been employed in the surface treatment of iron anchors [10]. Additionally, epoxy resin, known for its excellent aging resistance and adhesion, can be used as a raw material for mending materials [11]. Mixed with mineral pigments, epoxy resin has been used to make mending materials for the surface treatment of an iron bell [12].

Currently, the colors of mending materials corresponding to different raw material contents are determined through manual testing, which is labor-intensive and time-consuming. Moreover, the production process of mending materials is subject to human factors such as technical expertise and experience level, which may introduce errors in the results. Therefore, there is a need to find a more efficient method to quickly identify the raw material content associated with different colors of mending materials.

Machine learning is increasingly employed as a computer-aided tool across various fields. It can handle complex data, extracting patterns and trends to aid in predictions and decision-making processes. By reducing the need for human intervention, machine learning enhances efficiency. Leveraging these advantages, machine learning algorithms can be utilized to predict the color of mending materials.

Machine learning in cultural heritage conservation

Machine learning has made significant contributions to the restoration and preservation of cultural relics. It aids in the classification of artifacts, enhancing the efficiency of cultural relic protection efforts. For instance, the ANN model can classify ceramic artifacts based on their origin [13]. Deep Neural Network (DNN) and Support Vector Machine (SVM) algorithms have been utilized to classify architectural heritage [14]. In the conservation of immovable artifacts, machine learning models like Relevance Vector Machine (RVM) predict diseases, while the Gray Model (GM) and Verhulst model forecast crack trends in immovable artifacts [15,16,17]. Moreover, machine learning techniques are employed in predicting the aging degree and remaining life of heritage buildings, using models such as ANN and logistic regression [18, 19]. The XGBoost model has been applied to predict fire risk levels in heritage buildings [20]. Convolutional Neural Networks (CNNs) coupled with artificial data enhance the documentation of heritage buildings [21]. In the analysis of damage to stone artifacts, models like Least Squares Support Vector Machine (LSSVM) and SVM-based models predict cracks and deterioration [22, 23]. The ANN model aids in eliminating potential human errors in identifying weathering of stone artifacts [24]. Machine learning techniques also facilitate the visualization of special cultural relics, where deep learning and computer vision are combined to detect damage in images of ruins [25]. Additionally, machine learning can detect climate change within heritage buildings, with XGBoost and CNN being used to predict such changes [26, 27]. The diverse applications of machine learning in cultural heritage are summarized in Table 2.

Table 2 The application of machine learning in cultural heritage

In this study, the colors of mending materials prepared with different contents of α-Fe2O3, Fe3O4, and epoxy resin were measured experimentally. These data were used to train three models: ANN, XGBoost, and LightGBM. The optimal model was selected by comparing the predictive performance of the three models. Subsequently, the SHapley Additive exPlanations (SHAP) method was applied to analyze the impact of the different raw materials on the color of the mending materials [28, 29]. The study explains the application of the model in detail and validates its practicability by using it to produce the mending materials of various colors required for the restoration of the Yuquan Iron Pagoda.

Research aim

This study seeks to employ a machine learning model to address the challenge of producing mending materials in different colors, with the ultimate goal of enhancing the efficiency of surface treatment for the Yuquan Iron Pagoda and providing valuable support for its restoration. The primary objectives include: (i) Increasing efficiency: The study aims to streamline the process of making mending materials, making the surface treatment of the Yuquan Iron Pagoda more efficient. (ii) Resource optimization: By leveraging machine learning, the study aims to reduce the wastage of manpower, material resources, and time associated with traditional methods of making mending materials. (iii) Restoration contribution: The overarching aim is to contribute significantly to the restoration efforts of iron cultural relics, with a specific focus on the Yuquan Iron Pagoda. By addressing these objectives, the study intends to make a meaningful impact on the restoration practices, ensuring a more efficient and resource-optimized approach to the surface treatment of cultural relics.

Materials and methods

Characterization of rust

To characterize the distribution of the main elements in the rust, an Energy Dispersive Spectrometer (EDS; Quanta 650, FEI, USA) was used for elemental identification. X-ray diffraction (XRD) analysis of the rust was conducted using an X-ray diffractometer (D8/ADVANCE). The diffraction angle (2θ) was scanned from 5° to 55°.

Materials

In this experiment, a square iron plate was chosen and divided into small grids measuring 1 × 1 cm. Different contents of α-Fe2O3, Fe3O4, and epoxy resin were thoroughly mixed and stirred to produce 524 sets of mending materials with varying compositions. These 524 groups of mending materials were applied to the iron plate, as illustrated in Fig. 3. Following a curing and drying period of 7 days, the color of the 524 sets of samples was measured using a DS-620 spectrophotometer. The color data for the 524 sets of mending materials are provided in Table S1.

Fig. 3 Iron plate with the mending materials

A spectrophotometer comprises a light source, integrating sphere, grating, and photodetector. During measurement, light reflected or transmitted by the object is dispersed into its component wavelengths by the grating. The photodetector then measures the intensity at each wavelength and converts it into digital signals, from which the color of the object is calculated [30].

This device operates based on the relationship between light wavelength and intensity. When white light strikes the sample, it absorbs certain wavelengths and reflects or transmits others. The spectrophotometer breaks down these spectra, detecting absorption and reflection intensities across different wavelengths [30]. These signals are then processed to derive color data. Key features of spectrophotometers include: (1) unrestricted testing positions; (2) simulation of various light sources; (3) high measurement precision; and (4) capability to measure the “reflectance curve” of each color point.

The color of a substance is expressed in terms of L, a, and b in the CIE Lab color space [31]. In this color space, L represents lightness, where a larger value indicates a brighter color; a denotes the red-green axis, with positive values indicating red and negative values indicating green; and b represents the yellow-blue axis, with positive values indicating yellow and negative values indicating blue. The total color value, denoted as E, summarizes the overall color of the substance and is calculated as follows:

$$E=\sqrt{{L}^{2}+{a}^{2}+{b}^{2}}$$
(1)
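
As a minimal sketch of Eq. 1, the total color value can be computed directly from one L, a, b reading. The function name `total_color_value` and the numbers in the example are hypothetical and are not taken from Table S1.

```python
import math


def total_color_value(L: float, a: float, b: float) -> float:
    """Return E = sqrt(L^2 + a^2 + b^2), the total color value of Eq. 1."""
    return math.sqrt(L * L + a * a + b * b)


# Hypothetical spectrophotometer reading (not a measured value from the study).
example_E = total_color_value(30.0, 4.0, 3.0)
print(round(example_E, 3))
```

Because L is usually much larger than a and b for dark, weakly saturated rust colors, E is dominated by lightness in such cases.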

Models

In this study, the approach for predicting the color of mending materials corresponding to different contents of Fe3O4, α-Fe2O3, and epoxy resin is outlined in Fig. 4. The strategy involves the following steps: (1) Experimental data collection: The colors of multiple groups of mending materials with different contents of Fe3O4, α-Fe2O3, and epoxy resin were obtained through experiments. (2) Model training: The experimental data were used to train three models (ANN, XGBoost, and LightGBM). (3) Model comparison: The prediction performance of the three models was compared, and the optimal model was selected based on the comparison results. (4) SHAP analysis: The SHAP method was employed to analyze which factors have the greatest impact on the color of the mending materials. By following this strategy, the study aims to develop an effective predictive model for the color of mending materials and to provide insight into the key factors influencing color variations.

Fig. 4 Modeling strategy for color prediction of mending materials

To compare the predictive performance of the three algorithms, the coefficient of determination (R2) and the mean absolute error (MAE) were used as metrics of the accuracy and effectiveness of the predictive models [32].

$${R}^{2}=1-\frac{\sum_{i=1}^{n}{({\widehat{y}}_{i}-{y}_{i})}^{2}}{\sum_{i=1}^{n}{(\overline{y }-{y}_{i})}^{2}}$$
(2)
$$MAE=\frac{1}{n}\sum_{i=1}^{n}\left|{\widehat{y}}_{i}-{y}_{i}\right|$$
(3)

In Eqs. 2 and 3, \({y}_{i}\) denotes the experimental value of sample \(i\), \({\widehat{y}}_{i}\) the corresponding estimated value, \(\overline{y }\) the mean of the experimental values, and \(n\) the number of samples.
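
The two metrics can be sketched in a few lines of plain Python. The function names and the sample values below are hypothetical and serve only to make the arithmetic easy to follow. Note that, with this definition, R2 becomes negative whenever a model predicts worse than simply returning the mean of the experimental values.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE of Eq. 3: average absolute deviation between prediction and experiment."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R2 of Eq. 2: one minus residual sum of squares over total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((mean_y - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical E-values (not from the study): experiment vs. model prediction.
measured = [30.0, 32.0, 34.0, 36.0]
predicted = [30.5, 31.5, 34.5, 35.5]
print(mean_absolute_error(measured, predicted), r_squared(measured, predicted))
```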

In this study, three robust machine learning algorithms (i.e., ANN, XGBoost, and LightGBM) were chosen to construct models for predicting the color of mending material. ANN models are renowned for their capability to handle intricate nonlinear relationships, making them applicable across diverse fields, including physical property modeling. XGBoost and LightGBM, both gradient boosting algorithms, are distinguished for their predictive accuracy. Moreover, they offer feature importance assessments, enabling researchers to discern the most influential input variables in the prediction process. This interpretability facilitates valuable insights into the underlying relationships between input and output variables. By harnessing the strengths of ANN, XGBoost, and LightGBM, the authors aimed to develop a comprehensive and precise predictive model for mending material color.

ANN

An ANN is composed of numerous interconnected neurons, each of which applies an output function known as an activation function. The connections between neurons carry weights, which scale the signal passing through each connection [33]. The processing units of an ANN are categorized into input units, hidden units, and output units, as illustrated in Fig. 5. The model was programmed in MATLAB. The input layer takes the contents of Fe3O4 powder, α-Fe2O3 powder, and epoxy resin, and the output layer gives the E-value of the mending materials. The model adopts a feedforward structure, with the Tansig transfer function in the hidden layer and the Purelin transfer function in the output layer [34]. The ANN model is trained using the Levenberg–Marquardt algorithm [35].
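
To make the network structure concrete, the following is a minimal Python sketch of the forward pass of such a network; it is not the MATLAB implementation used in the study. Tansig is mathematically the hyperbolic tangent and Purelin is the identity, so one hidden tanh layer followed by a linear output reproduces the structure described above. The weights, biases, and input values are placeholders, not fitted parameters, and any input or output scaling is omitted.

```python
import math


def ann_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: tanh (tansig) hidden layer, then linear (purelin) output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)
        for weights, bias in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Placeholder parameters (NOT fitted): 3 inputs -> 2 hidden neurons -> 1 output.
w_hidden = [[0.1, -0.2, 0.05], [-0.3, 0.1, 0.2]]
b_hidden = [0.0, 0.1]
w_out = [1.5, -0.7]
b_out = 30.0

# Hypothetical contents of (Fe3O4, alpha-Fe2O3, epoxy resin) in arbitrary units.
print(ann_forward([1.0, 2.0, 3.0], w_hidden, b_hidden, w_out, b_out))
```

Training would adjust `w_hidden`, `b_hidden`, `w_out`, and `b_out` to minimize the squared error on the training set; that step is not shown here.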

Fig. 5 The structure of the ANN model

XGBoost

XGBoost is an iterative boosting algorithm [36]. It combines a gradient boosting framework with decision trees, training a series of trees one after another to progressively improve prediction performance. In each iteration, a new tree is fitted to the residuals of the current ensemble and its output is added to correct the prediction. As illustrated in Fig. 6, XGBoost employs a level-wise growth strategy with depth constraints. This strategy facilitates multithreading optimization and reduces the risk of overfitting.
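
The residual-fitting idea can be illustrated with a deliberately simplified sketch: plain gradient boosting for squared error with depth-1 trees (stumps), written in standard-library Python. This is not XGBoost itself, which additionally uses second-order gradient information, regularization, and deeper trees. All function names and data values are hypothetical.

```python
def fit_stump(X, residuals):
    """Return (feature, threshold, left_value, right_value) minimizing squared error."""
    mean_all = sum(residuals) / len(residuals)
    best_sse, best_stump = float("inf"), (0, float("inf"), mean_all, mean_all)
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not right:
                continue  # a threshold at the maximum value does not split the data
            left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - left_mean) ** 2 for r in left)
                   + sum((r - right_mean) ** 2 for r in right))
            if sse < best_sse:
                best_sse, best_stump = sse, (j, threshold, left_mean, right_mean)
    return best_stump


def predict_stump(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_boosting(X, y, n_rounds=50, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the current residuals."""
    base = sum(y) / len(y)
    predictions = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, predictions)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * predict_stump(stump, row)
                       for p, row in zip(predictions, X)]
    return base, stumps, learning_rate


def predict_boosting(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Hypothetical rows of (Fe3O4, alpha-Fe2O3, epoxy) contents and E-values.
X = [[1.0, 1.0, 5.0], [2.0, 1.0, 5.0], [3.0, 2.0, 5.0], [4.0, 2.0, 5.0]]
y = [40.0, 37.0, 33.0, 30.0]
model = fit_boosting(X, y)
print(predict_boosting(model, [2.0, 1.0, 5.0]))
```

The learning rate shrinks each correction, so the training error falls gradually over the rounds rather than in one step.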

Fig. 6 The learning process of XGBoost

LightGBM

LightGBM is a gradient boosting framework that employs a histogram-based decision tree learning algorithm [37]. It combines two techniques for reducing the data and features that must be processed: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB) [38]. As illustrated in Fig. 7, LightGBM adopts a leaf-wise growth strategy with depth constraints, which improves accuracy while limiting the risk of overfitting.
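
As an illustration of the GOSS idea only, the sketch below keeps the samples with the largest absolute gradients, draws a random subset of the remaining samples, and up-weights that subset by (1 − a)/b, where a and b are the two sampling rates. It is a standard-library Python sketch, not LightGBM's implementation; the function name, default rates, and gradient values are hypothetical, and whether GOSS was enabled in this study is not stated.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) selected by Gradient-based One-Side Sampling."""
    n = len(gradients)
    # Rank samples by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top = order[:n_top]                      # always kept
    rest = order[n_top:]                     # small-gradient samples
    sampled = random.Random(seed).sample(rest, n_other)
    # Up-weight the sampled small-gradient rows to keep the gradient sum unbiased.
    amplify = (1.0 - top_rate) / other_rate
    indices = top + sampled
    weights = [1.0] * len(top) + [amplify] * len(sampled)
    return indices, weights


# Hypothetical gradients for 20 training samples.
gradients = [(-1) ** i * float(i) for i in range(20)]
indices, weights = goss_sample(gradients)
print(indices, weights)
```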

Fig. 7 The learning process of LightGBM

SHAP

The SHAP method is an increasingly used approach for explaining machine learning models. Rooted in the Shapley value from game theory, it quantifies the contribution of each feature to a model's prediction, thereby explaining the prediction outcomes [39]. The SHAP method offers several advantages, including interpretability, high efficiency, stability, and comprehensiveness. With it, users can gain deeper insight into the decision-making process of a model, which improves the model's interpretability and credibility. This method is well suited to analyzing the influence of the individual raw materials on the color of the mending materials in the present study.
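
For a model with only three inputs, Shapley values can be computed exactly by enumerating all feature coalitions, which shows what the attribution means. The sketch below does this in plain Python, replacing "absent" features with a fixed baseline; this is a simplifying assumption and differs from the SHAP library, which averages over background data and uses tree-specific algorithms. `toy_model` and all numbers are invented for illustration and are not the fitted XGBoost model.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    n = len(x)

    def value(coalition):
        row = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(row)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight of a coalition of this size that excludes feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(row):
    """Invented stand-in for a color model: E falls as Fe3O4 content rises."""
    fe3o4, fe2o3, epoxy = row
    return 40.0 - 2.0 * fe3o4 - 0.5 * fe2o3 + 0.1 * epoxy - 0.2 * fe3o4 * fe2o3


x = [2.0, 3.0, 5.0]          # hypothetical composition
baseline = [0.0, 0.0, 0.0]   # hypothetical reference composition
phi = shapley_values(toy_model, x, baseline)
print(phi, sum(phi), toy_model(x) - toy_model(baseline))
```

The contributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is shared equally between the two features involved.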

Results and discussion

Chemical analysis of rust

Table 3 lists the elements detected in the rust by EDS and their contents. Figure 8 shows the XRD pattern of the rust, and the contents of the different crystalline phases are given in Table 4.

Table 3 Energy spectrum results for rust

Fig. 8 XRD pattern of the rust

Table 4 XRD Semi-quantitative analysis results

As shown in Table 3, the rust is mainly composed of Fe, C, and O, of which oxygen is the most abundant. The surface of the Yuquan Iron Pagoda is mainly white cast iron, in which all of the carbon is present as cementite (Fe3C), so the carbon content is high.

Diffraction peaks of Fe3O4, γ-FeO(OH), α-FeO(OH), and α-Fe2O3 were detected in the rust. Among them, Fe3O4 has a spinel-type structure, α-FeO(OH) belongs to the orthorhombic crystal system, and α-Fe2O3 belongs to the trigonal crystal system; these three corrosion products are stable because of their structures.

In the early corrosion stage of the pagoda, iron corrodes mainly within the thin water film on its surface. In this process, the iron matrix supplies Fe, and the reaction of O2 with the thin water layer supplies OH−, so γ-FeO(OH) and α-FeO(OH) rusts keep forming. γ-FeO(OH) is unstable and gradually transforms into the stable α-FeO(OH), Fe3O4, and α-Fe2O3.

Models comparison and SHAP analysis

To mitigate the impact of model overfitting, data cleaning and random assignment were employed for data processing. Data cleaning involves identifying and removing outliers in the data to reduce the risk of overfitting. Random allocation entails randomly dividing the data into training, validation, and testing sets to prevent pseudo-overfitting. The 524 sets of data from Table S1 are randomly divided into three sets: 366 for training (70%), 79 for validation (15%), and 79 for testing (15%). The training set is used to train the model, the validation set aids in adjusting model parameters, and the testing set evaluates the prediction performance of the trained model. The MAE and R2 of the three models are compared to assess their predictive performance. A smaller MAE or a larger R2 suggests better prediction performance for the model.
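
A random split with these proportions can be sketched as follows. The function name, the seed, and the rounding rule (the training share is rounded down and the remainder is halved) are assumptions, chosen because they reproduce the 366/79/79 set sizes reported above for 524 samples; the actual procedure used in the study is not described in more detail.

```python
import random


def split_indices(n_samples, train_frac=0.70, seed=42):
    """Shuffle sample indices and cut them into training, validation, and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)      # fixed seed for reproducibility
    n_train = int(train_frac * n_samples)     # training share, rounded down
    n_val = (n_samples - n_train) // 2        # remainder halved
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(524)
print(len(train), len(val), len(test))
```

The three index lists are disjoint, so no sample used for training or parameter tuning is reused when the final prediction performance is evaluated.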

To optimize the ANN model, different numbers of neurons in the hidden layer are considered. Excessive neurons in the hidden layer may lead to overfitting, making it challenging for the model to train and converge [40]. As depicted in Fig. 9, when the number of neurons exceeds 8, the R2 of the testing set tends to be mostly negative, indicating overfitting. The performance of ANN models is compared with varying numbers of hidden layer neurons from 1 to 8. The results reveal that the optimal prediction performance for the ANN model is achieved when the number of hidden layer neurons is set to 2.

Fig. 9 Effect of numbers of neurons in the hidden layer on ANN prediction performance

Table 5 presents the MAE and R2 of the three algorithm models. As observed from Table 5, the R2 values for the testing sets of all three models are greater than 0.9, indicating relatively good prediction performance for all algorithms. Among them, the XGBoost model exhibits the highest R2 and the lowest MAE for the testing set, with values of 0.94238 and 0.68485, respectively. This highlights that the XGBoost model outperforms the other two models in terms of prediction accuracy.

Table 5 Comparison of prediction performance of three algorithm models

For a more detailed analysis of the prediction performance of the different models, the experimental values were compared with the values predicted by the three models, as illustrated in Fig. 10. In Fig. 10, most of the data points are clustered around the diagonal line, indicating relatively good prediction performance for all three models. Notably, the data points of the training set for the ANN model and of the validation set for the LightGBM model show some scatter, whereas the data points for the XGBoost model align closely with the diagonal. This indicates that the XGBoost model achieves the highest prediction accuracy among the three models.

Fig. 10 Comparison of predicted and experimental values via ANN (a), XGBoost (b), and LightGBM (c)

To further compare the prediction performance of the models, the errors between the experimental values and the predicted values were calculated, as depicted in Fig. 11. In the figure, the errors of most data points are close to zero, with the XGBoost model exhibiting the smallest errors. This confirms that the prediction performance of XGBoost is superior to that of the ANN and LightGBM models in this study.

Fig. 11 Relative errors between predicted and experimental values via ANN (a), XGBoost (b), and LightGBM (c)

To further verify the accuracy of the XGBoost model prediction, three sets of mending materials corresponding to Fe3O4, α-Fe2O3, and epoxy resin in different contents were produced. Subsequently, the colors of the three groups of mending materials were measured. Three sets of Fe3O4, α-Fe2O3, and epoxy resin content data were input into the trained XGBoost model and the predicted values were obtained. The experimental measurement values were compared with the predicted values. The comparison results show that the maximum relative error between the experimental value and the predicted value of the XGBoost model is 0.02, as shown in Table 6. These three sets of data confirm that the XGBoost model exhibits good prediction effects and high accuracy.

Table 6 Comparison of experimental and predicted values for three groups of mending materials

To study the influence of each raw material on the color of mending materials, SHAP feature analysis was performed on each raw material, as shown in Fig. S1. Fig. S1a, b show the relationship between the three raw materials and the color of mending materials. It can be seen from the figure that the order of the influence of the three raw materials on the color of mending materials is Fe3O4, α-Fe2O3, and epoxy resin. When the content of epoxy resin and α-Fe2O3 is fixed, as the Fe3O4 content increases, the E-value of the mending materials becomes smaller. Fe3O4 has almost no effect on the red-green and yellow-blue colors of mending materials. This shows that as the content of Fe3O4 raw materials increases, the color of mending materials becomes darker. Fig. S1c shows the effect of co-action between the raw materials on the color of the mending materials. It can be seen that the combined effect of Fe3O4 and α-Fe2O3 has a greater effect on the color of the mending materials. The effect of the epoxy resin together with the other two raw materials on the color of the mending materials is not significant.

Application of the XGBoost model

The model is helpful for the surface treatment of iron cultural relics. The process of using the XGBoost model to find the contents of α-Fe2O3, Fe3O4, and epoxy resin for a required mending material is shown in Fig. 12. First, the E-value data of mending materials with different contents of α-Fe2O3, Fe3O4, and epoxy resin are used to train the XGBoost model. As illustrated in the code provided in Table S3, the trained model is saved in a file with the suffix “joblib”, which can be loaded directly in the “Jupyter Notebook” application. Next, multiple hypothetical sets of α-Fe2O3, Fe3O4, and epoxy resin contents are assumed and fed into the trained model, which generates the E-value of the mending material corresponding to each set. These data are collected into a database. \({E}_{Database}\) is the E-value corresponding to a given set of α-Fe2O3, Fe3O4, and epoxy resin contents in the database.
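
The database-building step can be sketched as a loop over a grid of candidate compositions. In the sketch below, `placeholder_predict` is an invented stand-in so that the example runs on its own; in practice it would be replaced by a call to the trained model loaded from the “joblib” file. The function names, grid levels, units, and coefficients are all hypothetical and are not taken from Table S3.

```python
from itertools import product


def build_database(predict_e, fe2o3_levels, fe3o4_levels, epoxy_levels):
    """Return one record per candidate composition together with its predicted E-value."""
    database = []
    for fe2o3, fe3o4, epoxy in product(fe2o3_levels, fe3o4_levels, epoxy_levels):
        database.append({
            "fe2o3": fe2o3,
            "fe3o4": fe3o4,
            "epoxy": epoxy,
            "E": predict_e(fe2o3, fe3o4, epoxy),
        })
    return database


def placeholder_predict(fe2o3, fe3o4, epoxy):
    """Stand-in for the trained model; the coefficients are invented."""
    return 45.0 + 0.8 * fe2o3 - 1.5 * fe3o4 + 0.1 * epoxy


# Hypothetical candidate contents (arbitrary units).
database = build_database(
    placeholder_predict,
    fe2o3_levels=[1.0, 2.0, 3.0],
    fe3o4_levels=[1.0, 2.0, 3.0],
    epoxy_levels=[5.0, 10.0],
)
print(len(database), database[0])
```

A finer grid gives more candidate E-values to choose from, at the cost of more model evaluations, which are cheap compared with preparing and measuring physical samples.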

Fig. 12 Application of the XGBoost model

At the same time, the iron cultural relic is pretreated by rust removal. Areas whose color differs significantly from that of their surroundings are selected as experimental areas, and the color of the areas surrounding them is measured. \({E}_{Iron cultural relics}\) is the E-value of the areas surrounding the experimental areas.

The E-value of the areas surrounding the experimental areas is compared with the E-values in the database. If \(\frac{|{E}_{Database}-{E}_{Iron cultural relics}|}{|{E}_{Database}|}>0.01\) for every entry, new sets of α-Fe2O3, Fe3O4, and epoxy resin contents are assumed and input into the trained model. If \(\frac{|{E}_{Database}-{E}_{Iron cultural relics}|}{|{E}_{Database}|}\le 0.01\) for an entry, the contents of α-Fe2O3, Fe3O4, and epoxy resin corresponding to that \({E}_{Database}\) are the raw material contents of the required mending material.
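
The acceptance criterion can be sketched as a simple filter over the database. The function name, the record layout, and all values below are hypothetical; returning the matches sorted by deviation is an added convenience, not something the text prescribes. An empty result corresponds to the case in which new candidate contents have to be assumed.

```python
def find_matches(database, target_e, tolerance=0.01):
    """Return database records whose relative E deviation from the target is within tolerance."""
    matches = []
    for row in database:
        deviation = abs(row["E"] - target_e) / abs(row["E"])
        if deviation <= tolerance:
            matches.append((deviation, row))
    matches.sort(key=lambda item: item[0])   # closest match first
    return [row for _, row in matches]


# Hypothetical database records and target E-value of the surrounding area.
database = [
    {"fe2o3": 1.0, "fe3o4": 3.0, "epoxy": 5.0, "E": 30.0},
    {"fe2o3": 1.5, "fe3o4": 3.0, "epoxy": 5.0, "E": 30.2},
    {"fe2o3": 3.0, "fe3o4": 1.0, "epoxy": 5.0, "E": 35.0},
]
for row in find_matches(database, target_e=30.25):
    print(row)
```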

This study takes the surface treatment of the Yuquan Iron Pagoda as a case to illustrate the application of the model. The iron pagoda after rust removal is shown in Fig. 13a, c, and e. As can be seen in these three panels, the colors of areas A1, A2, and A3 are significantly different from those of their surrounding areas B1, B2, and B3, respectively. The diameters of areas A1, A2, and A3 are 29.43 mm, 28.11 mm, and 37.31 mm, respectively, and the diameters of areas B1, B2, and B3 are 43.01 mm, 34.67 mm, and 49.83 mm, respectively. Three points were selected within each of areas A1, A2, and A3, and the colors of these nine points were measured and recorded in Table 7. Table 7 shows that the L-values, a-values, b-values, and E-values of the three points W1.1, W1.2, and W1.3 within area A1 differ from one another by less than 2. Therefore, the averages of the L-values, a-values, b-values, and E-values of these three points are taken as the \(\overline{L}\)-value, \(\overline{a}\)-value, \(\overline{b}\)-value, and \(\overline{E}\)-value of area A1. Similarly, the \(\overline{L}\)-values, \(\overline{a}\)-values, \(\overline{b}\)-values, and \(\overline{E}\)-values of areas A2 and A3 are the averages of the L-values, a-values, b-values, and E-values of the three points in each area. The \(\overline{L}\)-values, \(\overline{a}\)-values, \(\overline{b}\)-values, and \(\overline{E}\)-values of areas A1, A2, and A3 are recorded in Table 8.

Fig. 13 The Yuquan Iron Pagoda before and after surface treatment

Table 7 Color of points in areas A1, A2, and A3
Table 8 Color of areas A and B

Three points were selected within each of the surrounding areas B1, B2, and B3, and the colors of these points were measured and recorded in Table 9. Table 9 shows that the L-values, a-values, b-values, and E-values of the three points within area B2 differ from one another by less than 2, so the averages of the L-values, a-values, b-values, and E-values of these three points are taken as the \(\overline{L}\)-value, \(\overline{a}\)-value, \(\overline{b}\)-value, and \(\overline{E}\)-value of area B2. Similarly, the \(\overline{L}\)-value, \(\overline{a}\)-value, \(\overline{b}\)-value, and \(\overline{E}\)-value of area B3 are the averages of the L-values, a-values, b-values, and E-values of points P3.1, P3.2, and P3.3. The L-value, a-value, b-value, and E-value of P1.3 in area B1 differ significantly from those of P1.1 and P1.2, which may be due to measurement error by the operator. Therefore, the \(\overline{L}\)-value, \(\overline{a}\)-value, \(\overline{b}\)-value, and \(\overline{E}\)-value of area B1 are the averages of the L-values, a-values, b-values, and E-values of points P1.1 and P1.2 only. The \(\overline{L}\)-values, \(\overline{a}\)-values, \(\overline{b}\)-values, and \(\overline{E}\)-values of areas B1, B2, and B3 are recorded in Table 8.

Table 9 Color of points in areas B1, B2, and B3

To perform surface treatment on areas A1, A2, and A3, the corresponding mending materials NA1, NA2, and NA3 need to be made. The E-values of mending materials NA1, NA2, and NA3 should be close to the \(\overline{E}\)-values of areas B1, B2, and B3, respectively. The α-Fe2O3, Fe3O4, and epoxy resin contents of these mending materials can be found in the database. \({E}_{Database}\) is the E-value corresponding to a given set of α-Fe2O3, Fe3O4, and epoxy resin contents in the database, \({E}_{B}\) is the \(\overline{E}\)-value of area B, and \({E}_{NA}\) is the E-value of mending material NA.

The database is then searched. If an entry satisfies \(\frac{|{E}_{Database}-{E}_{B}|}{|{E}_{Database}|}\le 0.01\), that \({E}_{Database}\) is taken as the \({E}_{NA}\) of the mending material NA required for area A, and the α-Fe2O3, Fe3O4, and epoxy resin contents associated with it are the raw material contents of that mending material. These data are presented in Table 10.
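The retrieval criterion can be illustrated with a minimal Python sketch. The `Recipe` class, the `find_recipes` function, and the target E-values used below are hypothetical names and inputs chosen for illustration; the three database rows are the E-values and contents reported in the Conclusion, whereas the actual database contains many more entries. Returning all matches sorted by closeness is a design choice of this sketch, since the text does not say how ties between several matching entries are resolved.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Recipe:
    e_value: float   # E-value of the mending material
    fe2o3_g: float   # alpha-Fe2O3 content in grams
    fe3o4_g: float   # Fe3O4 content in grams
    epoxy_g: float   # epoxy resin content in grams

def find_recipes(database, e_target, rel_tol=0.01):
    """Return entries with |E_Database - E_B| / |E_Database| <= rel_tol,
    closest match first. `e_target` plays the role of E_B."""
    matches = [
        r for r in database
        if r.e_value != 0
        and abs(r.e_value - e_target) / abs(r.e_value) <= rel_tol
    ]
    return sorted(matches, key=lambda r: abs(r.e_value - e_target))

# Rows taken from the values quoted in the Conclusion.
database = [
    Recipe(26.73, 0.11, 0.84, 2.62),
    Recipe(25.62, 0.60, 0.97, 2.38),
    Recipe(25.85, 0.54, 0.91, 2.40),
]

# 26.70 is a hypothetical target E-value, not a measured one.
for recipe in find_recipes(database, e_target=26.70):
    print(recipe)
```

Because the tolerance is relative, two database entries with nearby E-values (here 25.62 and 25.85) can both satisfy the criterion for the same target, so a practical workflow would still need a rule for choosing among them.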

Table 10 E-value of the mending materials with its corresponding α-Fe2O3, Fe3O4, and epoxy resin content

The mending materials required for areas A1, A2, and A3 were made according to the data shown in Table 10, and these mending materials were applied to areas A1, A2, and A3, respectively. The effects of the surface treatment of areas A1, A2, and A3 are shown in Fig. 13b, d, and f.

A comparison of the pagoda before and after surface treatment shows that the local color of the treated areas is now almost the same as that of their surroundings, indicating that the surface treatment was effective. The treated pagoda has a more uniform appearance, which better conveys its historical and artistic value.

Using this model to assist in the surface treatment of iron cultural relics reduces the manpower, materials, and time that would otherwise be spent on color matching, and reduces errors caused by human factors.

The symbols used in this section and their meanings are listed in Table 11.

Table 11 Letters and their corresponding meanings

Conclusion

In this study, chemical analysis of the rust from the Yuquan Iron Pagoda revealed the presence of several compounds, predominantly Fe3O4, γ-FeO(OH), α-FeO(OH), and α-Fe2O3, constituting approximately 13.1, 16.1, 40.2, and 30.6% of the rust, respectively. Due to their structural stability and suitable color characteristics, Fe3O4 and α-Fe2O3 were selected as the primary raw materials for the repair material.

Mending materials were prepared by mixing α-Fe2O3, Fe3O4, and epoxy resin in different contents. The colors of the resulting 524 groups of mending materials were measured experimentally and used as training data for machine learning models that predict the color of a mending material from its composition. Models based on the ANN, XGBoost, and LightGBM algorithms were constructed, and all three showed good predictive performance. The XGBoost model performed best, with an MAE of 0.68485 and an R2 of 0.94238. The SHAP analysis showed that the Fe3O4 content had the greatest impact on the color of the mending materials.

Applying this model to the surface treatment of the Yuquan Iron Pagoda has enhanced work efficiency and addressed issues related to resource waste. The E-values corresponding to the mending materials required for different parts of the Yuquan Iron Pagoda are 26.73, 25.62, and 25.85. By inputting these values into the database, we can swiftly retrieve the contents of α-Fe2O3, Fe3O4, and epoxy resin associated with each E-value. For instance, the mending material with an E-value of 26.73 corresponds to α-Fe2O3, Fe3O4, and epoxy resin contents of 0.11 g, 0.84 g, and 2.62 g, respectively. Similarly, the mending material with an E-value of 25.62 corresponds to α-Fe2O3, Fe3O4, and epoxy resin contents of 0.60 g, 0.97 g, and 2.38 g, respectively. Lastly, the mending material with an E-value of 25.85 corresponds to α-Fe2O3, Fe3O4, and epoxy resin contents of 0.54 g, 0.91 g, and 2.40 g, respectively. The success of this model in the Yuquan Iron Pagoda case suggests its potential application in the restoration of other iron cultural relics, such as the Cangzhou Iron Lion. Overall, this research introduces a novel method for surface treatment in iron cultural relic restoration, contributing to the preservation and restoration efforts in this field.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

ANN: Artificial neural network

XGBoost: eXtreme Gradient Boosting

LightGBM: Light Gradient Boosting Machine

R2: Squared correlation coefficient

MAE: Mean absolute error

SHAP: SHapley Additive exPlanations

References

  1. Dillmann P, Mazaudier F, Hœrlé S. Advances in understanding atmospheric corrosion of iron. I. Rust characterisation of ancient ferrous artefacts exposed to indoor atmospheric corrosion. Corros Sci. 2004;46:1401–29.

    Article  CAS  Google Scholar 

  2. Hœrlé S, Mazaudier F, Dillmann P, Santarini G. Advances in understanding atmospheric corrosion of iron. II. Mechanistic modelling of wet-dry cycles. Corros Sci. 2004;46:1431–65.

    Article  Google Scholar 

  3. Hu P, Jia M, Li M, Sun J, Cui Y, Hu D, Hu G. Corrosion behavior of ancient white cast iron artifacts from marine excavations at atmospheric condition. Materials. 2022;12:921–33.

    CAS  Google Scholar 

  4. Misawa T, Kyuno T, Suetaka W, Shimodaira S. The mechanism of atmospheric rusting and the effect of Cu and P on the rust formation of low alloy steels. Corros Sci. 1971;11:35–48.

    Article  CAS  Google Scholar 

  5. Oosterhout GW. The transformation γ-FeO(OH) to α-FeO(OH). J Inorg Nucl Chem. 1967;29:1235–8.

    Article  Google Scholar 

  6. Tanaka H, Mishima R, Hatanaka N, Ishikawa T, Nakayama T. Formation of magnetite rust particles by reacting iron powder with artificial α-, β- and γ-FeOOH in aqueous media. Corros Sci. 2014;78:384–7.

    Article  CAS  Google Scholar 

  7. Misawa T, Asami K, Hashimoto K, Shimodaira S. The mechanism of atmospheric rusting and the protective amorphous rust on low alloy steel. Corros Sci. 1974;14:279–89.

    Article  CAS  Google Scholar 

  8. Khanam J, Hasan MR, Biswas B, Jahan S, Sharmin N, Ahmed S, Al-Reza S. Development of ceramic grade red iron oxide pigment from waste iron source. Heliyon. 2023;9:12854–67.

    Article  Google Scholar 

  9. Jia M, Hu P, Hu G. Corrosion layers on archaeological cast iron from Nanhai I. Materials. 2022;15:4980–95.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. Liu T. Qing dynasty iron anchors in the collection of guangdong institute of cultural relics and archaeology protection and restoration. Hakka Cult Herit. 2022;1:36–43.

    Google Scholar 

  11. Wang Y, Liu K, Wang C, Zhou S. Influence of solution concentration and temperature on the repair effect for electrophoretic deposition of rust-cracked reinforced concrete. J Build Eng. 2022;56:104772–86.

    Article  Google Scholar 

  12. Li N, Guo J. Introduction to the conservation and restoration of the iron bells of the Jinzi Museum. World Antiq. 2017;6:74–8.

    Google Scholar 

  13. Barone G, Mazzoleni P, Spagnolo GV, Raneri S. Artificial neural network for the provenance study of archaeological ceramics using clay sediment database. J Cult Herit. 2019;38:147–57.

    Article  Google Scholar 

  14. Artopoulos G, Maslioukova MI, Zavou C, Loizou M, Deligiorgi M, Averkiou M. An artificial neural network framework for classifying the style of cypriot hybrid examples of built heritage in 3D. J Cult Herit. 2023;63:135–47.

    Article  Google Scholar 

  15. Liu B, Mu K, Ye F, Deng J, Wang J. Immovable cultural relics disease prediction based on relevance vector machine. Math Probl Eng. 2020;2020:1–9.

    CAS  Google Scholar 

  16. Liu B, Ye F, Mu K, Wang J, Zhang J. Wavelet correlation analysis relevance vector machine diseases prediction for immovable cultural relics. Evol Intel. 2021;15:2679–90.

    Article  Google Scholar 

  17. X. Zhang, H. Wang, Z. Wang, T. Ma, Q. Shang, W. Li, Open-air unmovable cultural relics health trend prediction, In: Proceedings of the 2016 International Forum on Management, Education and Information Technology Application (2016) 838–841.

  18. El-Fetouh AA, Mohamed H, Shawky M. A framework based on geo-information neural system (GINS) for predicting remaining life of heritage buildings assets. Int J Comput Appl. 2012;58:5–11.

    Google Scholar 

  19. Chen S, Chen J, Yu J, Wang T, Xu J. Prediction of deterioration level of heritage buildings using a logistic regression model. Buildings. 2023;13:1006–17.

    Article  CAS  Google Scholar 

  20. Lei Y, Shen Z, Tian F, Yang X, Wang F, Pan R, Wang H, Jiao S, Kou W. Fire risk level prediction of timber heritage buildings based on entropy and XGBoost. J Cult Herit. 2023;63:11–22.

    Article  Google Scholar 

  21. Monna F, Rolland T, Denaire A, Navarro N, Granjon L, Barbé R, Chateau-Smith C. Deep learning to detect built cultural heritage from satellite imagery. -Spatial distribution and size of vernacular houses in Sumba, Indonesia. J Cult Herit. 2021;52:171–83.

    Article  Google Scholar 

  22. Liu B, Ye F, Mu K, Wang J, Zhang J. Crack prediction based on wavelet correlation analysis least squares support vector machine for stone cultural relics. Math Probl Eng. 2021;2021:1–10.

    Google Scholar 

  23. Meng T, Huang R, Lu Y, Liu H, Ren J, Zhao G, Hu W. Highly sensitive terahertz non-destructive testing technology for stone relics deterioration prediction using SVM-based machine learning models. Herit Sci. 2021;9:1–9.

    Article  Google Scholar 

  24. Hatir ME, Barstuğan M, İnce İ. Deep learning-based weathering type recognition in historical stone monuments. J Cult Herit. 2020;45:193–203.

    Article  Google Scholar 

  25. Pathak R, Saini A, Wadhwa A, Sharma H, Sangwan D. An object detection approach for detecting damages in heritage sites using 3-D point clouds and 2-D visual data. J Cult Herit. 2021;48:74–82.

    Article  Google Scholar 

  26. Boesgaard C, Hansen BV, Kejser UB, Mollerup SH, Ryhl-Svendsen M, Torp-Smith N. Prediction of the indoor climate in cultural heritage buildings through machine learning: first results from two field tests. Herit Sci. 2022;10:176–88.

    Article  Google Scholar 

  27. Miglioranza P, Scanu A, Simionato G, Califano N. Machine learning and engineering feature approaches to detect events perturbing the indoor microclimate in Ringebu and Heddal stave churches (Norway). Int J Build Pathol Adapt. 2024;42:35–47.

    Article  Google Scholar 

  28. Wen X, Xie Y, Wu L, Jiang L. Quantifying and comparing the effects of key risk factors on various types of roadway segment crashes with LightGBM and SHAP. Accid Anal Prev. 2021;159:106261–72.

    Article  PubMed  Google Scholar 

  29. Ekanayake IU, Meddage DPP, Rathnayake U. A novel approach to explain the black-box nature of machine learning in compressive strength predictions of concrete using Shapley additive explanations (SHAP). Case Stud Constr Mater. 2022;16:01059–79.

    Google Scholar 

  30. K. Yuan, H. Zhou, Y. Wu, C. Wang, S. Jin, Spectrophotometric colorimeter based on LED light source and method for realizing the same: U.S. Patent 9243953, 2016–1–26.

  31. Malounas I, Lentzou D, Xanthopoulos G, Fountas S. Testing the suitability of automated machine learning, hyperspectral imaging and CIELAB color space for proximal in situ fertilization level classification. Smart Agric Technol. 2024;8:100437–48.

    Article  Google Scholar 

  32. Chicco D, Warrens MJ, Jurman G. The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation. PeerJ Comput Sci. 2021;7:623–46.

    Article  Google Scholar 

  33. Agatonovic-Kustrin S, Beresford R. Basic concepts of artificial neural network (ANN) modeling and its application in pharmaceutical research. J Pharm Biomed Anal. 2000;22:717–27.

    Article  CAS  PubMed  Google Scholar 

  34. Lei Y, Shu Y, Liu X, Liu X, Wu X, Chen Y. Predictive modeling on the surface tension and viscosity of ionic liquid-organic solvent mixtures via machine learning. J Taiwan Inst Chem Eng. 2023;151:105140–55.

    Article  CAS  Google Scholar 

  35. Mammadli S. Financial time series prediction using artificial neural network based on Levenberg-Marquardt algorithm. Procedia Comput Sci. 2017;120:602–7.

    Article  Google Scholar 

  36. T. Chen, C. Guestrin, XGBoost: A scalable tree boosting system, In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (2016) 785–794.

  37. Ke G, Meng Q, Finley T, Wang T, Chen W, Ma W, Ye Q, Liu TY. A highly efficient gradient boosting decision tree. Adv Neural Inform Process Syst. 2017;30:3146–54.

    Google Scholar 

  38. Park JH, Jo HS, See SH, Oh SW, Na MG. A reliable intelligent diagnostic assistant for nuclear power plants using explainable artificial intelligence of GRU-AE, LightGBM and SHAP. Nucl Eng Technol. 2022;54:1271–87.

    Article  CAS  Google Scholar 

  39. Lundberg S, Lee S. A unified approach to interpreting model predictions. Adv Neural Inf Process Syst. 2017;30:4768–77.

    Google Scholar 

  40. LeCun Y, Bengio Y, Hinton G. Deep learning. Nature. 2015;521:436–44.

    Article  CAS  PubMed  Google Scholar 

Download references

Acknowledgements

Not applicable.

Funding

This research was supported by the Open Project of Key Scientific Research Base of the National Cultural Heritage Administration for the Protection of Unearthed Wood Lacquerware (2021H10198, 2023H10017).

Author information

Contributions

XL and YL wrote the main manuscript text; KW and YZ revised it critically for important intellectual content; YL and YC made substantial contributions to the conception or design of the work; HA and MW prepared figures. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Yang Lei or Yuqiu Chen.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

40494_2024_1295_MOESM1_ESM.xlsx

Supplementary Material 1: Table S1 summarizes the experimental data measured by the colorimeter at different ratios of α-Fe2O3, Fe3O4, and epoxy resin. Table S2 contains the ANN code written in MATLAB 2019b. Table S3 contains the XGBoost and LightGBM code written in Jupyter Notebook. Fig. S1 shows the SHAP summary plot (a), the SHAP feature importance plot (b), and the SHAP interaction plot (c).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Liu, X., Liu, Y., Wang, K. et al. A color prediction model for mending materials of the Yuquan Iron Pagoda in China based on machine learning. Herit Sci 12, 183 (2024). https://doi.org/10.1186/s40494-024-01295-1

Keywords