A color prediction model for mending materials of the Yuquan Iron Pagoda in China based on machine learning

Liu, Xuegang; Liu, Yuhang; Wang, Ke; Zhang, Yang; Lei, Yang; An, Hai; Wang, Mingqiang; Chen, Yuqiu

doi:10.1186/s40494-024-01295-1

Research
Open access
Published: 06 June 2024

Heritage Science volume 12, Article number: 183 (2024)

Abstract

During the restoration of iron cultural relics, rust must be removed from the artifacts. However, rust removal may leave the local color of the iron relics inconsistent. To address this, mending materials are applied to the surface to restore a consistent local color. A significant challenge in this surface treatment lies in adjusting the color of the mending materials. The corrosion products of the Yuquan Iron Pagoda are mainly Fe₃O₄, γ-FeO(OH), α-FeO(OH) and α-Fe₂O₃, with contents of 13.1, 16.1, 40.2 and 30.6%, respectively. Due to their structural stability and suitable color characteristics, Fe₃O₄ and α-Fe₂O₃ are selected as the primary raw materials for the mending materials. This study employs machine learning methods to predict the color of mending materials corresponding to varying contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin. The Artificial Neural Network (ANN), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM) algorithms are used to develop the model, and the predictive performance of the three algorithms is compared. XGBoost exhibits the best prediction performance, achieving a coefficient of determination (R²) of 0.94238 and a mean absolute error (MAE) of 0.68485. Additionally, the SHapley Additive exPlanations (SHAP) method is employed to identify the raw material with the greatest effect on the color of the mending materials, which is Fe₃O₄. The study illustrates the specific process of using this model by applying it to the surface treatment of the Yuquan Iron Pagoda, demonstrating the practicality of the model. The model can also be applied to assist in the surface treatment of other iron cultural relics.

Introduction

The Yuquan Iron Pagoda

Iron cultural relics hold significant historical value and serve as crucial evidence in historical and cultural research. Among these relics, the Yuquan Iron Pagoda stands out as the tallest, heaviest, and most complete iron pagoda cultural relic in the country. Built in 1061, the Yuquan Iron Pagoda boasts a history of 963 years. The Yuquan Iron Pagoda stands at 16.945 m in height and weighs 26472.3 kg. Its surface features a total of 120 patterns, including 85 Buddha images, 2279 Buddha figures (2260 Buddha statues in existence), and 35 other decorative motifs. The intricate Buddha statues and decorative patterns highlight exceptional casting and carving craftsmanship. Additionally, there are 1762 inscriptions on the pagoda's surface, documenting its historical significance. Constructed in the form of a wooden pavilion, the pagoda's body is made of pig iron in sections without welding, relying solely on its weight for stability. The unique construction technology and design concept of the Yuquan Iron Pagoda offer valuable guidance in modern architectural design. As an outstanding material cultural heritage of the Chinese nation, the Yuquan Iron Pagoda serves as a repository of history and culture, representing a significant humanistic resource with profound social and artistic value.

Long-term exposure to the natural environment has damaged the Yuquan Iron Pagoda [1,2,3], as depicted in Fig. 1. When the Yuquan Iron Pagoda was dismantled, it was found to consist of 54 components. Of these, only 2 showed no obvious corrosion, while the remaining 52 exhibited varying degrees of corrosion, as illustrated in Fig. 2. The rusting of the Yuquan Iron Pagoda is severe and particularly affects the inscriptions and cast images on the pagoda, putting its historical information at risk of being lost.

In an inland neutral environment, iron adsorbs water, forming a thin liquid film on its surface; this leads to electrochemical corrosion through the formation of primary corrosion cells. The process begins with the anodic dissolution of iron, which produces Fe²⁺ ions (Formula 1). Oxygen (O₂) from the air then dissolves in the water film on the iron surface and is reduced at the cathode, producing OH⁻ ions (Formula 2). The electrochemical corrosion of iron thus proceeds mainly by oxygen-absorption corrosion. Under neutral and weakly alkaline conditions, the OH⁻ ions produced by the cathodic reaction migrate toward the anode and combine with Fe²⁺ ions to form Fe(OH)₂ (Formula 3) [4]. The specific reaction formulas are presented in Table 1.
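Written out in their standard textbook form (an assumption about the exact notation used in Table 1), the anodic dissolution, the cathodic oxygen reduction, and the precipitation step described above read:

$$\mathrm{Fe}\to {\mathrm{Fe}}^{2+}+2{e}^{-}$$

$${\mathrm{O}}_{2}+2{\mathrm{H}}_{2}\mathrm{O}+4{e}^{-}\to 4{\mathrm{OH}}^{-}$$

$${\mathrm{Fe}}^{2+}+2{\mathrm{OH}}^{-}\to \mathrm{Fe}{(\mathrm{OH})}_{2}$$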

Table 1 Electrochemical decay chemical reaction formula of Fe

The Fe(OH)₂ produced during the initial stages of cast iron corrosion has a loose, weak structure, limited thermal stability, and a tendency to rupture. It reacts with O₂ in the thin water film to form γ-FeO(OH); the standard free energy change of this reaction (∆G₂₉₈) is < 0, indicating a spontaneous and irreversible process [5]. However, γ-FeO(OH) has poor thermal stability and, under neutral conditions, first dissolves and re-precipitates as amorphous iron oxyhydroxide (FeOₓ(OH)₃₋₂ₓ). Rust in this state is highly unstable and undergoes a solid-state transformation to form the thermally more stable α-FeO(OH) [6]. Additionally, in humid nighttime conditions, the less stable γ-FeO(OH) transforms into the more stable Fe₃O₄ [5]. Conversely, under dry daytime exposure, some of the γ-FeO(OH) gradually loses H₂O, leading to the formation of the more stable reddish-brown hematite, α-Fe₂O₃ [7].

This corrosion not only damages the artistic value of the Yuquan Iron Pagoda but also jeopardizes its safety and stability. Hence, the restoration of the Yuquan Iron Pagoda is of utmost importance. The initial step in restoring an iron pagoda is rust removal. However, this process may leave certain areas of the pagoda surface too bright, creating a significant difference in the local color of the iron pagoda. When restoring iron cultural relics, efforts should be made to preserve their original appearance and leave no visible traces of repair. This approach maintains the sense of history and original style of the restored relics. Therefore, the Yuquan Iron Pagoda requires surface treatment to achieve a consistent local color and restore the historical character of the pagoda.

The surface treatment of the iron pagoda involves the application of mending materials that closely resemble the color of the pagoda onto the areas requiring treatment. These mending materials must not only exhibit excellent bonding with the iron pagoda but also possess a color similar to that of the pagoda. As illustrated in Fig. 1, the rust colors observed on the iron pagoda are predominantly yellow, red, and black. Yellow rust is deemed harmful due to its poor stability and cannot serve as a raw material for repair materials. The colors of Fe₃O₄ and α-Fe₂O₃ are black and red, respectively, resembling the rust of the iron pagoda and possessing stable properties [8]. Therefore, Fe₃O₄ and α-Fe₂O₃ can be utilized as raw materials for crafting mending materials for iron cultural relics [9]. These materials are employed in the surface treatment of iron anchors [10]. Additionally, epoxy resin, known for its excellent aging resistance and adhesion properties, can be used as a raw material for mending materials [11]. When mixed with mineral pigments, epoxy resin is utilized to create mending materials for the surface treatment of an iron bell [12].

Currently, the colors of mending materials corresponding to different raw material contents are determined through manual testing, which is labor-intensive and time-consuming. Moreover, the production process of mending materials is subject to human factors such as technical expertise and experience level, which may introduce errors in the results. Therefore, there is a need to find a more efficient method to quickly identify the raw material content associated with different colors of mending materials.

Machine learning is increasingly employed as a computer-aided tool across various fields. It can handle complex data, extracting patterns and trends to aid in predictions and decision-making processes. By reducing the need for human intervention, machine learning enhances efficiency. Leveraging these advantages, machine learning algorithms can be utilized to predict the color of mending materials.

Machine learning in cultural heritage conservation

Machine learning has made significant contributions to the restoration and preservation of cultural relics. It aids in the classification of artifacts, enhancing the efficiency of cultural relic protection efforts. For instance, the ANN model can classify ceramic artifacts based on their origin [13]. Deep Neural Network (DNN) and Support Vector Machine (SVM) algorithms have been utilized to classify architectural heritage [14]. In the conservation of immovable artifacts, machine learning models like Relevance Vector Machine (RVM) predict diseases, while the Gray Model (GM) and Verhulst model forecast crack trends in immovable artifacts [15,16,17]. Moreover, machine learning techniques are employed in predicting the aging degree and remaining life of heritage buildings, using models such as ANN and logistic regression [18, 19]. The XGBoost model has been applied to predict fire risk levels in heritage buildings [20]. Convolutional Neural Networks (CNNs) coupled with artificial data enhance the documentation of heritage buildings [21]. In the analysis of damage to stone artifacts, models like Least Squares Support Vector Machine (LSSVM) and SVM-based models predict cracks and deterioration [22, 23]. The ANN model aids in eliminating potential human errors in identifying weathering of stone artifacts [24]. Machine learning techniques also facilitate the visualization of special cultural relics, where deep learning and computer vision are combined to detect damage in images of ruins [25]. Additionally, machine learning can detect climate change within heritage buildings, with XGBoost and CNN being used to predict such changes [26, 27]. The diverse applications of machine learning in cultural heritage are summarized in Table 2.

Table 2 The application of machine learning in cultural heritage

In this study, mending materials with different contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin were prepared experimentally, and their colors were measured. These data were used to train three models: ANN, XGBoost, and LightGBM. The optimal model was selected by comparing the predictive performance of the three models. Subsequently, the SHapley Additive exPlanations (SHAP) method was applied to analyze the impact of the different raw materials on the color of the mending materials [28, 29]. The study explains in detail how the model is applied and validates its practicality by using it to produce the mending materials of various colors required for the restoration of the Yuquan Iron Pagoda.

Research aim

This study seeks to employ a machine learning model to address the challenge of producing mending materials in different colors, with the ultimate goal of enhancing the efficiency of surface treatment for the Yuquan Iron Pagoda and providing valuable support for its restoration. The primary objectives include: (i) Increasing efficiency: The study aims to streamline the process of making mending materials, making the surface treatment of the Yuquan Iron Pagoda more efficient. (ii) Resource optimization: By leveraging machine learning, the study aims to reduce the wastage of manpower, material resources, and time associated with traditional methods of making mending materials. (iii) Restoration contribution: The overarching aim is to contribute significantly to the restoration efforts of iron cultural relics, with a specific focus on the Yuquan Iron Pagoda. By addressing these objectives, the study intends to make a meaningful impact on the restoration practices, ensuring a more efficient and resource-optimized approach to the surface treatment of cultural relics.

Materials and methods

Characterization of rust

To characterize the distribution of the main elements in the rust, an Energy Dispersive Spectrometer (EDS) from the American company FEI (Quanta 650) was used for elemental identification. X-ray diffraction (XRD) analysis of the rust was conducted using an X-ray diffractometer (D8/ADVANCE). The scanning of the detected diffraction angle (2θ) ranged from 5 to 55°.

Materials

In this experiment, a square iron plate was chosen and divided into small grids measuring 1 × 1 cm. Different contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin were thoroughly mixed and stirred to produce 524 sets of mending materials with varying compositions. These 524 groups of mending materials were applied to the iron plate, as illustrated in Fig. 3. Following a curing and drying period of 7 days, the color of the 524 sets of samples was measured using a DS-620 spectrophotometer. The color data for the 524 sets of mending materials are provided in Table S1.

A spectrophotometer comprises a light source, integrating sphere, grating, and photodetector. During measurement, light reflected or transmitted from the object is split into different wavelengths by a beam splitter. The photodetector then measures the intensity of these wavelengths, converting them into digital signals to calculate the object’s color [30].

This device operates based on the relationship between light wavelength and intensity. When white light strikes the sample, it absorbs certain wavelengths and reflects or transmits others. The spectrophotometer breaks down these spectra, detecting absorption and reflection intensities across different wavelengths [30]. These signals are then processed to derive color data. Key features of spectrophotometers include: (1) unrestricted testing positions; (2) simulation of various light sources; (3) high measurement precision; and (4) capability to measure the “reflectance curve” of each color point.

Characterization of the color of the substance is expressed in terms of L, a, and b in the CIE Lab color space [31]. In this color space: L represents the brightness value, where a larger value indicates higher brightness. a denotes the red-green axis, with a positive value indicating red and a negative value indicating green. b represents the yellow-blue axis, where a positive value indicates yellow and a negative value indicates blue. The total color value, denoted as E, signifies the overall color condition of the substance. The formula for calculating E is as follows:

$$E=\sqrt{{L}^{2}+{a}^{2}+{b}^{2}}$$

(1)
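As a minimal sketch of Eq. 1, the total color value can be computed directly from a spectrophotometer reading. The function name and the sample reading below are illustrative assumptions, not values from Table S1.

```python
import math


def total_color_value(L: float, a: float, b: float) -> float:
    """Total color value E = sqrt(L^2 + a^2 + b^2), following Eq. 1."""
    return math.sqrt(L * L + a * a + b * b)


# Hypothetical reading (not taken from Table S1).
reading = {"L": 30.0, "a": 4.0, "b": 3.0}
E = total_color_value(reading["L"], reading["a"], reading["b"])
print(f"E = {E:.3f}")
```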

Models

In this study, the approach for predicting the color of mending materials corresponding to different contents of Fe₃O₄, α-Fe₂O₃, and epoxy resin is outlined in Fig. 4. The strategy involves the following steps: (1) Experimental data collection: Colors of multiple groups of mending materials corresponding to different contents of Fe₃O₄, α-Fe₂O₃, and epoxy resin were obtained through experiments. (2) Model training: The experimental data were used to train three models: ANN, XGBoost, and LightGBM. (3) Model comparison: The prediction performance of the three models was compared, and the optimal model was selected based on the comparison results. (4) SHAP analysis: The SHAP method was employed to identify the factors that have the greatest impact on the color of the mending materials. By following this strategy, the study aims to develop an effective predictive model for the color of mending materials and to provide insight into the key factors influencing color variations.

To compare the predictive performance of the three algorithms, the coefficient of determination (R²) and the mean absolute error (MAE) were used as metrics of the accuracy of the predictive models [32].

$${R}^{2}=1-\frac{\sum_{i=1}^{n}{({\widehat{y}}_{i}-{y}_{i})}^{2}}{\sum_{i=1}^{n}{(\overline{y}-{y}_{i})}^{2}}$$

(2)

$$MAE=\frac{1}{n}\sum_{i=1}^{n}\left|{\widehat{y}}_{i}-{y}_{i}\right|$$

(3)

In Eqs. 2 and 3, ${y}_{i}$ denotes the experimental value of sample $i$, ${\widehat{y}}_{i}$ the corresponding predicted value, $\overline{y}$ the mean of the experimental values, and $n$ the number of samples.
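The two metrics in Eqs. 2 and 3 can be sketched in a few lines of plain Python. The measured and predicted E-values below are made-up illustrative numbers, not results from this study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, Eq. 2: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((mean_y - t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error, Eq. 3."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured and predicted E-values.
measured = [30.0, 32.0, 34.0, 36.0]
predicted = [30.5, 31.5, 34.5, 35.5]
print(r_squared(measured, predicted), mean_absolute_error(measured, predicted))
```

Note that R² defined this way is not bounded below by zero: it becomes negative whenever a model predicts worse than simply returning the mean of the experimental values.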

In this study, three robust machine learning algorithms (i.e., ANN, XGBoost, and LightGBM) were chosen to construct models for predicting the color of mending material. ANN models are renowned for their capability to handle intricate nonlinear relationships, making them applicable across diverse fields, including physical property modeling. XGBoost and LightGBM, both gradient boosting algorithms, are distinguished for their predictive accuracy. Moreover, they offer feature importance assessments, enabling researchers to discern the most influential input variables in the prediction process. This interpretability facilitates valuable insights into the underlying relationships between input and output variables. By harnessing the strengths of ANN, XGBoost, and LightGBM, the authors aimed to develop a comprehensive and precise predictive model for mending material color.

ANN

ANN is composed of numerous neurons with interconnections between them, where each neuron represents an output function known as an activation function. The connections between neurons are represented by weights, indicating the weighted value of the signal passing through the connection [33]. The processing units in artificial neural networks are categorized into input units, hidden units, and output units, as illustrated in Fig. 5. In this model, MATLAB software was employed for programming. The input layer of the model includes the content of Fe₃O₄ powder, the content of α-Fe₂O₃ powder, and the content of epoxy resin. The output layer of the model represents the E-value of the mending materials. The model adopts a feedforward neural network structure, with the hidden layer applying the Tansig transfer function and the output layer utilizing the Purelin transfer function [34]. The ANN model is trained using the Levenberg–Marquardt algorithm [35].
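The forward pass of such a network can be sketched as follows. MATLAB's Tansig function is mathematically the hyperbolic tangent and Purelin is the identity, so plain Python suffices for illustration. The weights here are random and untrained, the input values are hypothetical, and Levenberg–Marquardt training is not shown; this is not the authors' MATLAB code.

```python
import math
import random


def forward(x, W1, b1, W2, b2):
    """One hidden layer with tanh ('tansig') units and a linear ('purelin') output."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + bias)
        for row, bias in zip(W1, b1)
    ]
    return sum(w * h for w, h in zip(W2, hidden)) + b2


rng = random.Random(0)
n_inputs, n_hidden = 3, 2  # three raw-material contents; hidden size is a free choice
W1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
b2 = rng.uniform(-1, 1)

# Hypothetical normalized contents of Fe3O4, alpha-Fe2O3, and epoxy resin.
x = [0.2, 0.3, 0.5]
print(forward(x, W1, b1, W2, b2))  # untrained output, meaningful only as a shape check
```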

XGBoost

XGBoost is an iterative boosting algorithm [36]. It combines a gradient boosting framework with decision tree models, iteratively training a series of decision trees to progressively improve prediction performance. In each iteration, a new tree is fitted to the residuals of the current ensemble, and its output corrects the current prediction. As illustrated in Fig. 6, XGBoost employs a level-wise growth strategy with depth constraints. This strategy facilitates multithreading optimization and mitigates the risk of overfitting.
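The residual-fitting idea can be illustrated with a deliberately simplified sketch: plain gradient boosting with squared loss and depth-1 trees (stumps). It omits XGBoost's regularization, second-order gradients, and level-wise multi-level trees, and the data are hypothetical, so it should be read as an illustration of the principle rather than of the XGBoost library.

```python
def fit_stump(X, residuals):
    """Best single-feature threshold split of the residuals (least squares)."""
    mean_all = sum(residuals) / len(residuals)
    best = (float("inf"), 0, float("inf"), mean_all, mean_all)  # fallback: no split
    for j in range(len(X[0])):
        for threshold in sorted({row[j] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[j] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[j] > threshold]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, j, threshold, lm, rm)
    return best[1:]


def predict_stump(stump, row):
    j, threshold, left_value, right_value = stump
    return left_value if row[j] <= threshold else right_value


def fit_boosted(X, y, n_rounds=50, learning_rate=0.3):
    """Each round fits a stump to the current residuals and adds a shrunken copy."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row) for p, row in zip(preds, X)]
    return base, stumps, learning_rate


def predict_boosted(model, row):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


# Hypothetical data: [Fe3O4 fraction, alpha-Fe2O3 fraction] -> E-value.
X = [[0.1, 0.4], [0.2, 0.3], [0.3, 0.3], [0.4, 0.2], [0.5, 0.1]]
y = [40.0, 37.0, 35.0, 32.0, 30.0]
model = fit_boosted(X, y)
fitted = [predict_boosted(model, row) for row in X]
```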

LightGBM

LightGBM is a gradient-boosting framework that employs a histogram-based decision tree learning algorithm [37]. It combines two techniques: Gradient-based One-Side Sampling (GOSS), which reduces the number of data instances used in each iteration, and Exclusive Feature Bundling (EFB), which reduces the number of features [38]. As illustrated in Fig. 7, LightGBM adopts a leaf-wise growth strategy with depth constraints, which improves accuracy while mitigating the risk of overfitting.
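The sampling step of GOSS can be sketched as follows: instances with the largest absolute gradients are always kept, a random subset of the remainder is drawn, and the sampled small-gradient instances are up-weighted by (1 − a)/b so that the gradient statistics stay approximately unbiased. The function name, the rates, and the gradient values are illustrative assumptions; this is not the LightGBM library's internal code.

```python
import random


def goss_sample(gradients, top_rate=0.2, other_rate=0.1, seed=0):
    """Return (indices, weights) chosen by Gradient-based One-Side Sampling."""
    n = len(gradients)
    # Sort instance indices by absolute gradient, largest first.
    order = sorted(range(n), key=lambda i: abs(gradients[i]), reverse=True)
    n_top = round(top_rate * n)
    n_other = round(other_rate * n)
    top = order[:n_top]    # always kept
    rest = order[n_top:]   # randomly subsampled
    sampled = random.Random(seed).sample(rest, n_other)
    amplify = (1.0 - top_rate) / other_rate  # compensates for the subsampling
    indices = top + sampled
    weights = [1.0] * n_top + [amplify] * n_other
    return indices, weights


# Hypothetical gradients for 20 training instances.
gradients = [(-1) ** i * float(i) for i in range(20)]
indices, weights = goss_sample(gradients)
```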

SHAP

The SHAP method is an approach for explaining the predictions of machine learning models. Rooted in Shapley values from cooperative game theory, it quantifies the contribution of each feature to a model's prediction, thereby explaining the prediction outcome [39]. The SHAP method offers several advantages, including interpretability, high efficiency, stability, and comprehensiveness. With the SHAP method, users can gain deeper insight into the decision-making process of the model, which enhances its interpretability and credibility. The method is therefore well suited to analyzing the influence of the various raw materials on the color of the mending materials in the present study.
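The underlying Shapley value can be computed exactly for a small number of features by averaging each feature's marginal contribution over all orderings. The sketch below does this for a toy value function with three features; the function, its numbers, and the feature labels are hypothetical, and practical SHAP implementations rely on faster model-specific algorithms rather than full enumeration.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: mean marginal contribution over all feature orderings."""
    contrib = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = set()
        for f in order:
            before = value_fn(frozenset(coalition))
            coalition.add(f)
            after = value_fn(frozenset(coalition))
            contrib[f] += after - before
    return {f: total / len(orderings) for f, total in contrib.items()}


def toy_value(coalition):
    """Hypothetical shift in E caused by the features present in the coalition."""
    v = 0.0
    if "Fe3O4" in coalition:
        v -= 6.0
    if "Fe2O3" in coalition:
        v -= 2.0
    if "epoxy" in coalition:
        v += 0.5
    if {"Fe3O4", "Fe2O3"} <= coalition:
        v -= 1.0  # interaction term, shared equally between the two features
    return v


phi = shapley_values(toy_value, ["Fe3O4", "Fe2O3", "epoxy"])
```

The contributions sum to the difference between the value with all features present and the value with none, which is the additivity property that gives SHAP its name.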

Results and discussion

Chemical analysis of rust

Table 3 lists the elements detected in the rust by EDS and their contents. Figure 8 shows the XRD patterns of the rust, and the contents of the different crystalline phases are given in Table 4.

Table 3 Energy spectrum results for rust

Table 4 XRD Semi-quantitative analysis results

As shown in Table 3, the rust is mainly composed of Fe, C, and O, of which oxygen is the most abundant element. The surface of the Yuquan Iron Pagoda is mainly white cast iron, in which the carbon is present almost entirely as cementite (Fe₃C), so the carbon content is high.

Diffraction peaks of Fe₃O₄, γ-FeO(OH), α-FeO(OH), and α-Fe₂O₃ were detected in the rust. Among them, Fe₃O₄ has a spinel-type structure, α-FeO(OH) belongs to the orthorhombic crystal system, and α-Fe₂O₃ belongs to the trigonal crystal system; these three rust products are stable owing to their structures.

In the early stage of corrosion of the pagoda, the corrosion of iron takes place mainly in the thin water film on its surface. In this process, the iron matrix supplies Fe, and the reaction of O₂ with the thin water layer supplies OH⁻. γ-FeO(OH) and α-FeO(OH) rusts form continuously. γ-FeO(OH) is unstable and gradually transforms into the stable α-FeO(OH), Fe₃O₄, and α-Fe₂O₃.

Models comparison and SHAP analysis

To mitigate the impact of model overfitting, data cleaning and random assignment were employed for data processing. Data cleaning involves identifying and removing outliers in the data to reduce the risk of overfitting. Random allocation entails randomly dividing the data into training, validation, and testing sets to prevent pseudo-overfitting. The 524 sets of data from Table S1 are randomly divided into three sets: 366 for training (70%), 79 for validation (15%), and 79 for testing (15%). The training set is used to train the model, the validation set aids in adjusting model parameters, and the testing set evaluates the prediction performance of the trained model. The MAE and R² of the three models are compared to assess their predictive performance. A smaller MAE or a larger R² suggests better prediction performance for the model.
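A random split with the stated set sizes can be sketched as below. The function name and the seed are arbitrary illustrative choices, and the indices stand in for the rows of Table S1.

```python
import random


def split_indices(n_samples, n_train, n_val, seed=42):
    """Shuffle sample indices and cut them into training, validation, and testing sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# 524 samples -> 366 training, 79 validation, 79 testing.
train_idx, val_idx, test_idx = split_indices(524, 366, 79)
print(len(train_idx), len(val_idx), len(test_idx))
```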

To optimize the ANN model, different numbers of neurons in the hidden layer are considered. Excessive neurons in the hidden layer may lead to overfitting, making it challenging for the model to train and converge [40]. As depicted in Fig. 9, when the number of neurons exceeds 8, the R² of the testing set tends to be mostly negative, indicating overfitting. The performance of ANN models is compared with varying numbers of hidden layer neurons from 1 to 8. The results reveal that the optimal prediction performance for the ANN model is achieved when the number of hidden layer neurons is set to 2.

Table 5 presents the MAE and R² of the three algorithm models. As observed from Table 5, the R² values for the testing sets of all three models are greater than 0.9, indicating relatively good prediction performance for all algorithms. Among them, the XGBoost model exhibits the highest R² and the lowest MAE for the testing set, with values of 0.94238 and 0.68485, respectively. This highlights that the XGBoost model outperforms the other two models in terms of prediction accuracy.

Table 5 Comparison of prediction performance of three algorithm models

For a more detailed analysis of the prediction performance of the different algorithm models, the experimental values were compared with the predicted values of the three models, as illustrated in Fig. 10. In Fig. 10, most of the data points are clustered around the diagonal line, indicating relatively good prediction performance for all three models. Notably, the data points of the training set for the ANN model and the validation set for the LightGBM model exhibit some scattering, whereas the data points for the XGBoost model align closely with the diagonal. This indicates that the XGBoost model achieves the highest prediction accuracy among the three models.

To further compare the prediction performance of the model, the errors between the experimental values and the predicted values were calculated, as depicted in Fig. 11. In the figure, the errors of most data points are close to zero, with the XGBoost model exhibiting the smallest errors. This confirms that the prediction performance of XGBoost is superior to the ANN model and LightGBM model in this study.

To further verify the accuracy of the XGBoost model prediction, three sets of mending materials corresponding to Fe₃O₄, α-Fe₂O₃, and epoxy resin in different contents were produced. Subsequently, the colors of the three groups of mending materials were measured. Three sets of Fe₃O₄, α-Fe₂O₃, and epoxy resin content data were input into the trained XGBoost model and the predicted values were obtained. The experimental measurement values were compared with the predicted values. The comparison results show that the maximum relative error between the experimental value and the predicted value of the XGBoost model is 0.02, as shown in Table 6. These three sets of data confirm that the XGBoost model exhibits good prediction effects and high accuracy.

Table 6 Comparison of experimental and predicted values for three groups of mending materials

To study the influence of each raw material on the color of mending materials, SHAP feature analysis was performed on each raw material, as shown in Fig. S1. Fig. S1a, b show the relationship between the three raw materials and the color of mending materials. It can be seen from the figure that the order of the influence of the three raw materials on the color of mending materials is Fe₃O₄, α-Fe₂O₃, and epoxy resin. When the content of epoxy resin and α-Fe₂O₃ is fixed, as the Fe₃O₄ content increases, the E-value of the mending materials becomes smaller. Fe₃O₄ has almost no effect on the red-green and yellow-blue colors of mending materials. This shows that as the content of Fe₃O₄ raw materials increases, the color of mending materials becomes darker. Fig. S1c shows the effect of co-action between the raw materials on the color of the mending materials. It can be seen that the combined effect of Fe₃O₄ and α-Fe₂O₃ has a greater effect on the color of the mending materials. The effect of the epoxy resin together with the other two raw materials on the color of the mending materials is not significant.

Application of the XGBoost model

The model supports the surface treatment of iron cultural relics. The process of finding the contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin for the required mending materials using the XGBoost model is shown in Fig. 12. The E-value data of mending materials corresponding to different contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin were used to train the XGBoost model. As illustrated in the code provided in Table S3, the trained model is saved in a file with the suffix “joblib”, which can be loaded directly in the “Jupyter Notebook” application. Multiple hypothetical sets of α-Fe₂O₃, Fe₃O₄, and epoxy resin contents are then generated and fed into the trained model, which returns the E-values of the mending materials corresponding to these hypothetical compositions. These data are collected into a database. $E_{Database}$ is the E-value corresponding to a given combination of α-Fe₂O₃, Fe₃O₄, and epoxy resin contents in the database.

At the same time, the iron cultural relics are pretreated by rust removal. The areas whose color differs significantly from that of their surroundings are selected as experimental areas. The color of the areas surrounding the experimental areas is measured. $E_{Iron cultural relics}$ is the E-value of the areas surrounding the experimental areas.

The E-value of the areas surrounding the experimental areas is compared with the E-values in the database. If $\frac{|{E}_{Database}-{E}_{Iron cultural relics}|}{|{E}_{Database}|}>0.01$, a new set of hypothetical α-Fe₂O₃, Fe₃O₄, and epoxy resin contents is chosen and input into the trained model. If $\frac{|{E}_{Database}-{E}_{Iron cultural relics}|}{|{E}_{Database}|}\le 0.01$, the contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin corresponding to $E_{Database}$ are the raw material contents of the required mending materials.
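The lookup described above can be sketched as follows. The linear `stand_in_model` is a hypothetical placeholder for the trained XGBoost model (it is not the model from Table S3), and treating the contents as fractions that sum to 1 is likewise an assumption made only for illustration.

```python
def relative_error(e_database, e_target):
    """Relative deviation |E_Database - E_target| / |E_Database| used in the 1% criterion."""
    return abs(e_database - e_target) / abs(e_database)


def find_matches(database, e_target, tolerance=0.01):
    """Return (composition, E) entries within the tolerance, closest first."""
    matches = [(comp, e) for comp, e in database if relative_error(e, e_target) <= tolerance]
    return sorted(matches, key=lambda item: relative_error(item[1], e_target))


def stand_in_model(fe2o3, fe3o4, epoxy):
    """Hypothetical placeholder predictor; NOT the trained XGBoost model."""
    return 45.0 - 20.0 * fe3o4 - 5.0 * fe2o3


# Build a database of hypothetical compositions and their predicted E-values.
steps = [i / 20 for i in range(1, 19)]
database = []
for fe2o3 in steps:
    for fe3o4 in steps:
        epoxy = 1.0 - fe2o3 - fe3o4
        if epoxy > 0.0:
            comp = {"Fe2O3": fe2o3, "Fe3O4": fe3o4, "epoxy": epoxy}
            database.append((comp, stand_in_model(fe2o3, fe3o4, epoxy)))

# Hypothetical target: E-value measured around an experimental area.
matches = find_matches(database, e_target=35.0)
```

Several compositions usually satisfy the criterion at once, which is why the sketch returns all of them sorted by deviation and leaves the final choice to the conservator.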

This study takes the surface treatment of the Yuquan Iron Pagoda as a case to illustrate the application of the model. The rust on the iron pagoda was removed, and the pagoda after rust removal is shown in Fig. 13a, c, and e. As these three panels show, the colors of areas A1, A2, and A3 differ significantly from those of their surrounding areas B1, B2, and B3, respectively. The diameters of regions A1, A2, and A3 are 29.43 mm, 28.11 mm, and 37.31 mm, respectively. The diameters of regions B1, B2, and B3 are 43.01 mm, 34.67 mm, and 49.83 mm, respectively. Three points were selected within each of areas A1, A2, and A3, and the colors of these nine points were measured and recorded in Table 7. Table 7 shows that the L-values, a-values, b-values, and E-values of the three points W1.1, W1.2, and W1.3 within area A1 differ by less than 2. Therefore, the averages of the L-values, a-values, b-values, and E-values of these three points are taken as the $\overline{L}$-value, $\overline{a}$-value, $\overline{b}$-value, and $\overline{E}$-value of area A1. Similarly, the $\overline{L}$-values, $\overline{a}$-values, $\overline{b}$-values, and $\overline{E}$-values of areas A2 and A3 are the averages of the L-values, a-values, b-values, and E-values of the three points in each area. The $\overline{L}$-values, $\overline{a}$-values, $\overline{b}$-values, and $\overline{E}$-values of areas A1, A2, and A3 are recorded in Table 8.

Table 7 Color of points in areas A1, A2, and A3

Table 8 Color of areas A and B

Three points were selected within each of the surrounding areas B1, B2, and B3, and the colors of these points were measured and recorded in Table 9. Table 9 shows that the L-values, a-values, b-values, and E-values of the three points within area B2 differ by less than 2, so the averages of the L-values, a-values, b-values, and E-values of these three points were taken as the $\overline{L}$-value, $\overline{a}$-value, $\overline{b}$-value, and $\overline{E}$-value of area B2. Similarly, the $\overline{L}$-value, $\overline{a}$-value, $\overline{b}$-value, and $\overline{E}$-value of area B3 are the averages of the L-values, a-values, b-values, and E-values of points P3.1, P3.2, and P3.3. The L-value, a-value, b-value, and E-value of P1.3 in area B1 differ significantly from those of P1.1 and P1.2, which may be due to measurement error introduced by the operator. The $\overline{L}$-value, $\overline{a}$-value, $\overline{b}$-value, and $\overline{E}$-value of area B1 are therefore the averages of the L-values, a-values, b-values, and E-values of points P1.1 and P1.2 only. The $\overline{L}$-values, $\overline{a}$-values, $\overline{b}$-values, and $\overline{E}$-values of areas B1, B2, and B3 are recorded in Table 8.
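The averaging procedure can be sketched for a single color coordinate as below. The text gives no explicit rejection rule, so two assumptions are made here: the threshold of 2 is borrowed from the consistency check on the three readings, and the reading discarded is the one farthest from the median. The sample readings are hypothetical, not values from Table 9.

```python
from statistics import mean, median


def area_mean(values, max_spread=2.0):
    """Average the readings; if they spread too far, drop the one farthest from the median."""
    if max(values) - min(values) < max_spread:
        return mean(values)
    mid = median(values)
    outlier = max(values, key=lambda v: abs(v - mid))
    kept = list(values)
    kept.remove(outlier)
    return mean(kept)


# Hypothetical E-value readings at three points of one area.
consistent = [30.0, 31.0, 30.5]
with_outlier = [30.0, 31.0, 40.0]
print(area_mean(consistent), area_mean(with_outlier))
```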

Table 9 Color of points in areas B1, B2, and B3

To perform surface treatment on areas A1, A2, and A3, the mending materials NA1, NA2, and NA3 for these areas A1, A2, and A3 need to be made. The E-values of mending materials NA1, NA2, and NA3 should be close to the ‾E-values of areas B1, B2, and B3. The corresponding α-Fe₂O₃, Fe₃O_4, and epoxy resin content of these mending materials NA1, NA2, and NA3 can be found in the database. E_Database is the E-value corresponding to the different contents of α-Fe₂O₃, Fe₃O_4, and epoxy in the database. E_B is the ‾E-value of the area B. E_NA is the E-value of the mending material NA.

The database is then searched. If $\frac{|{E}_{Database}-{E}_{B}|}{|{E}_{Database}|}\le 0.01$, that is, if a database entry deviates from the target by at most 1% in relative terms, then this E_Database is taken as the E_NA of the mending material NA required for area A. The contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin corresponding to E_NA are the raw material contents of the mending material NA required for area A. These data are presented in Table 10.
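The retrieval criterion can be sketched in Python as follows. This is a minimal illustration, not the implementation used in this study: the names `Recipe`, `relative_deviation`, and `find_recipes` are hypothetical, the three database rows are the recipes quoted in the Conclusion of this article (the full database contains many more entries), and returning the matches sorted by relative deviation is an assumption made here to handle the case in which several entries satisfy the 1% criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recipe:
    """One database entry: an E-value and the raw material contents in grams."""
    e_value: float
    fe2o3_g: float
    fe3o4_g: float
    epoxy_g: float


# Only the three rows quoted in the Conclusion; used here as a stand-in database.
DATABASE = [
    Recipe(26.73, 0.11, 0.84, 2.62),
    Recipe(25.62, 0.60, 0.97, 2.38),
    Recipe(25.85, 0.54, 0.91, 2.40),
]


def relative_deviation(e_database, e_b):
    """|E_Database - E_B| / |E_Database|, as in the criterion above."""
    return abs(e_database - e_b) / abs(e_database)


def find_recipes(e_b, database=DATABASE, tolerance=0.01):
    """Return all entries within the tolerance, closest match first."""
    matches = [
        r for r in database
        if relative_deviation(r.e_value, e_b) <= tolerance
    ]
    return sorted(matches, key=lambda r: relative_deviation(r.e_value, e_b))


if __name__ == "__main__":
    # 26.80 is a hypothetical target value, not a value from Table 8.
    for recipe in find_recipes(26.80):
        print(recipe)
```

Because the criterion is a tolerance band and not an exact match, more than one entry may qualify. With the stand-in rows above, a hypothetical target of 25.70 matches both the 25.62 and the 25.85 entries, and the sorted result places the closer one first.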

Table 10 E-value of the mending materials with its corresponding α-Fe₂O₃, Fe₃O₄, and epoxy resin content

The mending materials required for areas A1, A2, and A3 were made according to the data shown in Table 10, and these mending materials were applied to areas A1, A2, and A3, respectively. The effects of the surface treatment of areas A1, A2, and A3 are shown in Fig. 13b, d, and f.

Comparing the iron pagoda before and after surface treatment shows that the local color of the treated areas is almost the same as that of their surroundings, indicating that the surface treatment was effective. The treated pagoda has a more uniform appearance, which better presents its historical and artistic value.

Using this model to assist in the surface treatment of iron cultural relics saves manpower, material resources, and time, and reduces errors caused by human factors.

For clarity, the symbols used in this section are listed with their meanings in Table 11.

Table 11 Letters and their corresponding meanings

Conclusion

In this study, chemical analysis of the rust from the Yuquan Iron Pagoda revealed the presence of several compounds, predominantly Fe₃O₄, γ-FeO(OH), α-FeO(OH), and α-Fe₂O₃, constituting approximately 13.1, 16.1, 40.2, and 30.6% of the rust, respectively. Due to their structural stability and suitable color characteristics, Fe₃O₄ and α-Fe₂O₃ were selected as the primary raw materials for the repair material.

Machine learning models were developed to predict the color of mending materials prepared by mixing α-Fe₂O₃, Fe₃O₄, and epoxy resin in different contents. A total of 524 groups of mending materials were prepared, and their colors were measured experimentally and recorded as training data for the models. Models based on the ANN, XGBoost, and LightGBM algorithms were constructed, and all showed good predictive performance. The XGBoost model exhibited the best performance, with an MAE of 0.68485 and an R² of 0.94238. The SHAP analysis indicated that the content of Fe₃O₄ had the most significant impact on the color of the mending materials.
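For readers who wish to reproduce the two metrics, the following Python sketch shows one common way to compute them. It rests on an assumption: R² is computed here in its coefficient-of-determination form, 1 − SS_res/SS_tot, whereas this article describes R² as a squared correlation coefficient, and the two definitions can give different values. The function names are hypothetical, and the numbers in the usage note are invented for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Mean of the absolute differences between measured and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares between measurements and predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares of the measurements around their mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

For the invented measured values 25.0, 26.0, 27.0, 28.0 and predictions 25.5, 25.5, 27.5, 27.5, every prediction is off by 0.5, so the MAE is 0.5, while SS_res = 1.0 and SS_tot = 5.0 give R² = 0.8.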

Applying this model to the surface treatment of the Yuquan Iron Pagoda has enhanced work efficiency and addressed issues related to resource waste. The E-values corresponding to the mending materials required for different parts of the Yuquan Iron Pagoda are 26.73, 25.62, and 25.85. By inputting these values into the database, we can swiftly retrieve the contents of α-Fe₂O₃, Fe₃O₄, and epoxy resin associated with each E-value. For instance, the mending material with an E-value of 26.73 corresponds to α-Fe₂O₃, Fe₃O₄, and epoxy resin contents of 0.11 g, 0.84 g, and 2.62 g, respectively. Similarly, the mending material with an E-value of 25.62 corresponds to α-Fe₂O₃, Fe₃O₄, and epoxy resin contents of 0.60 g, 0.97 g, and 2.38 g, respectively. Lastly, the mending material with an E-value of 25.85 corresponds to α-Fe₂O₃, Fe₃O₄, and epoxy resin contents of 0.54 g, 0.91 g, and 2.40 g, respectively. The success of this model in the Yuquan Iron Pagoda case suggests its potential application in the restoration of other iron cultural relics, such as the Cangzhou Iron Lion. Overall, this research introduces a novel method for surface treatment in iron cultural relic restoration, contributing to the preservation and restoration efforts in this field.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

ANN: Artificial neural network
XGBoost: eXtreme Gradient Boosting
LightGBM: Light Gradient Boosting Machine
R²: Squared correlation coefficient
MAE: Mean absolute error
SHAP: SHapley Additive exPlanations

Acknowledgements

Not applicable.

Funding

This research was supported by the Open Project of Key Scientific Research Base of the National Cultural Heritage Administration for the Protection of Unearthed Wood Lacquerware (2021H10198, 2023H10017).

Author information

Authors and Affiliations

Jingzhou Conservation Center, 108 Jingbei Road, Jingzhou, 434020, China
Xuegang Liu
School of Chemistry and Chemical Engineering, Hubei Key Laboratory of Coal Conversion and New Carbon Materials, Wuhan University of Science and Technology, Wuhan, 430081, China
Yuhang Liu, Ke Wang & Yang Lei
School of History and Culture, Hubei University, Wuhan, 430062, China
Yang Zhang
Shanxi Academy of Ancient Building and Painted Sculpture & Fresco Preservation, Taiyuan, 030012, China
Hai An
Culture and Tourism Bureau of Dangyang City, Dangyang, 444100, China
Mingqiang Wang
Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware, 19716, USA
Yuqiu Chen

Contributions

XL and YL wrote the main manuscript text; KW and YZ revised it critically for important intellectual content; YL and YC made substantial contributions to the conception or design of the work; HA and MW prepared figures. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Yang Lei or Yuqiu Chen.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

40494_2024_1295_MOESM1_ESM.xlsx

Supplementary Material 1. Table S1 summarizes the experimental data measured by the colorimeter at different ratios of α-Fe₂O₃, Fe₃O₄, and epoxy resin. Table S2 contains the ANN code written in MATLAB 2019b. Table S3 contains the XGBoost and LightGBM code written in Jupyter Notebook. Fig. S1 shows the SHAP summary plot (a), the SHAP feature importance plot (b), and the SHAP interaction plot (c).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Liu, X., Liu, Y., Wang, K. et al. A color prediction model for mending materials of the Yuquan Iron Pagoda in China based on machine learning. Herit Sci 12, 183 (2024). https://doi.org/10.1186/s40494-024-01295-1

Received: 01 April 2024
Accepted: 22 May 2024
Published: 06 June 2024
DOI: https://doi.org/10.1186/s40494-024-01295-1
