Mixture optimization for mechanical, environmental, and economic objectives in grouting slurry for repairing earthen sites

Anchor and fissure grouting are used to repair earthen sites. However, the common method of obtaining the compressive strength of grouting slurry causes material, labor, and time losses. In addition to the material properties, environmental and economic benefits have gained increasing attention. This study proposes a design framework for multi-objective proportioning optimization based on machine learning and metaheuristics. The results indicated that the eXtreme Gradient Boosting (XGBoost) model, whose hyper-parameters were optimized by a genetic algorithm, can accurately predict the compressive strength of the slurries. The impact of the variables on the development of compressive strength can explain the internal reaction mechanisms. The analytical framework based on metaheuristics and the technique for order of preference by similarity to an ideal solution (TOPSIS) provided Pareto-optimal solutions in the design scenario of each sub-dataset. The framework proposed in this study can efficiently achieve the mechanical, environmental, and economic design objectives of anchor grouting and fissure grouting slurries for repairing earthen sites.


Introduction
Earthen sites have valuable historical information and are distributed throughout China [1]. However, natural factors such as wind, water, temperature, and earthquakes have created erosive patterns that are widely distributed at these sites. The most common are large-scale earthen site cracking and small-scale erosion fissures, which severely decrease the structural safety of earthen sites [2,3]. Anchor and fissure grouting are repair methods used for cracking and fissure erosion at earthen sites, respectively. Anchor grouting combines anchor rods, grouting slurries, and site soil to mobilize the entire earthen site to stabilize the damaged part. Fissure grouting injects grouting slurries that are compatible with the earthen site into the fissure to improve the overall mechanical strength. Both methods require suitable slurries characterized by a multiplicity of admixtures, which makes the prediction of their properties difficult [4][5][6]. Currently, three types of grouting slurries are commonly used to repair earthen sites in Northwest China: quicklime (Lime), calcined ginger nuts (CGN), and potassium silicate (PS) slurries. Their raw materials include traditional building materials such as Lime and CGN; natural materials such as loess (C), diatomite (Z), and quartz sand (S); supplementary cementitious materials such as fly ash (F) and bentonite (B); and binder materials such as PS, modified polyvinyl alcohol solution (SH), and sticky rice paste (N) [7][8][9][10][11][12][13][14][15][16][17][18][19][20][21]. Among them, the preparation and utilization of some traditional building materials often involve large carbon emissions, whereas some newer materials have higher costs [22,23]. Therefore, balancing the mechanical compatibility, environmental friendliness, and cost control of grouting slurries will become an important part of the research on, and promotion of, repairing historically valuable earthen sites.
In recent years, data-driven machine learning (ML) algorithms have accurately predicted the mechanical strength of materials. In particular, they can account for the non-linear relationship between strength and its influencing factors, which improves the efficiency of experimental analyses [24][25][26]. Çalışkan et al. developed regression methods based on extreme learning machines (ELM), support vector machines (SVM), and the group method of data handling (GMDH) to estimate the compressive strength of fly ash and nano-calcite-cemented mortars. These models not only predicted compressive strengths and ultrasonic pulse velocities (UPV) with high accuracy, but also minimized time, materials, labor, and cost [27]. Zhang et al. used a random forest (RF) optimized through a genetic algorithm (GA) to predict the compressive strength of alkali-activated materials. The optimized RF produced higher prediction accuracy than common algorithms, such as back propagation neural networks (BP), multiple linear regression (MLR), and SVM, and could also explain the effect of admixture compositions on mechanical strength [28]. Kang et al. used 11 ML models to predict the mechanical strength of steel fiber-reinforced concrete and found that boosting- and tree-based models provided high prediction accuracy, whereas the K-nearest neighbor, ridge regression, and lasso regressors performed the worst. In addition, the water-cement ratio and silica-ash content were the most influential factors in the prediction [29]. These studies showcase the utility of machine-learning-based research on the mechanical strength of concrete and cement mortar [30][31][32]. However, earthen site restoration should adhere to the principle of "minimum intervention, maximum compatibility", using materials with physical, mechanical, and hydraulic properties similar to those of the site soil. This makes it impossible for the cement mortar commonly used in modern construction to be used in earthen site restoration engineering. Therefore, it is necessary to conduct research on the prediction of the compressive strength of grouting slurries for repairing earthen sites. In addition, previous studies used the relative amount of each admixture as an input variable for their prediction models. They also mostly disregarded the effects of internal chemical reactions and the compactness of each material particle on the mechanical strength of the admixture materials, which led to a focus on data-driven results and poor interpretability and clarity regarding the internal mechanisms involved. This approach may lead to prediction results that are feasible only in theory [33,34]. Therefore, in the process of developing the dataset used for grouting slurries, it is necessary to both collect the data of each admixture dosage and consider the effects of the molar ratio of the main reacting oxides and the bulk density of the admixture materials on the compressive strength of the slurries.
It is necessary to develop targeted and intelligent design methods to comprehensively analyze the performance of different grouting slurries [35]. This study aims to establish a framework for designing the mixtures of grouting slurries. For this, a large dataset containing three sub-datasets was established, totaling 523 sets of data. Subsequently, after optimizing the hyper-parameters by the GA, ten ML algorithms were used to predict the compressive strength of each sub-dataset, and the model with the highest accuracy was selected. Based on the relative importance and SHAP analysis, the effect of the input variables on the compressive strength was obtained. Finally, the metaheuristic algorithm and TOPSIS were used to achieve multi-objective optimization of grouting slurries for repairing earthen sites, which must be analyzed after actual application. Compared with the existing research on optimizing admixture ratios using computational algorithms, this study provides three novelties. (1) An optimization model was developed for the ratios of admixture materials of anchor grouting and fissure grouting slurries on earthen sites, which employs a dataset containing 523 sets of data. (2) This study considers the effect of the molar ratio of the main reacting oxides in grouting slurries and the bulk density of admixture materials on the mechanical strength of the slurries. These represent the effects of internal chemical reactions and the compactness of each material particle on the obtained compressive strength, respectively. (3) Detailed objective functions for carbon emissions and costs during the production and transportation of each admixture were estimated; this, combined with the ML prediction model and multi-objective particle swarm optimization (MOPSO), permits the proposal of an intelligent framework for optimizing the admixtures of repair slurries with multiple objectives. This study provides insight into the prediction of the compressive strength and the mixture design optimization of anchor grouting and fissure grouting slurries for earthen sites, opening a new way for the application of slurries without the need for extensive experimental work.

Data processing and analysis
Relevant data must be collected to establish analytical models for predicting the compressive strength of slurries. Based on the research results on slurries for anchor grouting and fissure grouting at earthen sites in Northwest China, this study collected the compressive strength data of Lime, CGN, and PS slurries. SSF in Table 1 represents the sodium fluorosilicate selected as the curing agent in the PS slurry. The liquid-solid ratio was calculated based on the sum of the masses of the solid materials in the slurries [36].
The total testing dataset was divided according to the different slurry types into sub-dataset_1, sub-dataset_2, and sub-dataset_3.
To reflect the influence of each admixture on the compressive strength of the slurry more accurately, this study uses the molar ratio of the main reacting oxides and the bulk density of each mixture material as input variables. The molar ratio of oxides describes how the effective components involved in chemical reactions, which differ between data sources, affect the mechanical strength. It also reflects the composition and structure of the reaction products, which supports the subsequent mechanism analysis of the prediction models [34]. The bulk density of admixtures can be used to describe the different physical properties of the same admixture from different sources. When combined with the amount of admixture added, it describes the impact of this difference on the compressive strength of the slurry more accurately.
The main chemical reactions generated during the mixing and curing of quicklime slurries are given in Eqs. (1)-(4) [37]. The chemical reaction in Eq. (4) is difficult to quantify. Therefore, sub-dataset_1 uses the CaO content (X1), n(CaO)/n(H₂O) (X2), n(SiO₂)/n(CaO) (X3), and n(Al₂O₃)/n(CaO) (X4) to indicate the effect of the chemical reactions on the compressive strength. The molar ratio reflects the effect of the dosage on the final material performance. Only the CaO in quicklime, the SiO₂ and Al₂O₃ in fly ash, and the H₂O in the binders were considered in this sub-dataset. For the bulk density or dry density of each admixture material, its effect on the compressive strength must be analyzed in combination with the mass ratio of each admixture. Their respective expressions are $\rho_b(\mathrm{Lime}) \cdot \frac{m(\mathrm{Lime})}{m(\mathrm{Admixture})}$, $\rho_b(\mathrm{FA}) \cdot \frac{m(\mathrm{FA})}{m(\mathrm{Admixture})}$, and $\rho_d(\mathrm{Soil}) \cdot \frac{m(\mathrm{Soil})}{m(\mathrm{Admixture})}$, which are referred to as $\rho_b(\mathrm{Lime})$ (X5), $\rho_b(\mathrm{FA})$ (X6), and $\rho_d(\mathrm{Soil})$ (X7) for concision. For the binders, in addition to X2, this study reflects the effect of their dosage on the compressive strength using the liquid-solid ratio (L/S) (X8). m(Binder)/m(CaO) (X9), the mass ratio of the effective components in the binders to CaO, reflects how the polar hydroxyl groups in SH or the polysaccharide substances in sticky rice paste constrain and regulate the crystallization process of calcium-based materials [37,38]. The effect of the content and arrangement of the active ingredients within the binders on the compressive strength is indirectly reflected by η(Binder) (X10), which indicates the viscosity of the binders [39]. The curing time (X11) reflects the effect of the curing conditions on the compressive strength. These 11 variables form the input variables of sub-dataset_1. The molar ratios of the oxides were converted from the mass-ratio data in the original literature. Table 2 presents the statistical results of the input and output variables for sub-dataset_1.
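The conversion from mass data to molar ratios and the density-weighted variables described above can be sketched as follows. This is a minimal illustration: the molar masses are standard values, but the function names and the example masses and bulk density are hypothetical and are not taken from the dataset.

```python
# Standard molar masses in g/mol.
MOLAR_MASS = {"CaO": 56.077, "SiO2": 60.084, "Al2O3": 101.961, "H2O": 18.015}


def moles(mass_g: float, oxide: str) -> float:
    """Convert a mass in grams to an amount of substance in moles."""
    return mass_g / MOLAR_MASS[oxide]


def molar_ratio(mass_a: float, oxide_a: str, mass_b: float, oxide_b: str) -> float:
    """Return n(a)/n(b) computed from the masses of the two oxides."""
    return moles(mass_a, oxide_a) / moles(mass_b, oxide_b)


def density_feature(density: float, mass: float, total_solid_mass: float) -> float:
    """Bulk (or dry) density weighted by the material's mass fraction in the admixture."""
    return density * mass / total_solid_mass


# Hypothetical mix: 50 g of SiO2 (from fly ash) and 100 g of CaO (from quicklime).
ratio_si_ca = molar_ratio(50.0, "SiO2", 100.0, "CaO")

# Hypothetical quicklime: bulk density 0.9 g/cm3, 300 g out of 1000 g of solids.
rho_lime = density_feature(density=0.9, mass=300.0, total_solid_mass=1000.0)
```

Because the feature is a product of density and mass fraction, two sources of the same material with different bulk densities yield different input values even at the same dosage.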
In addition to the reactions above, the main chemical reactions of the slurries in sub-dataset_2 include Eq. (5) and Eq. (6) [40]. Table 3 presents the statistical results of the input and output variables for sub-dataset_2. In this table, SiO₂* and Al₂O₃* indicate their content in CGN, whereas SiO₂ and Al₂O₃ correspond to fly ash. In addition, only the CaO in CGN and the H₂O in the water or binder were considered in this sub-dataset. To reduce the number of input variables in the model, $\rho_b(\mathrm{FA})$ is used to describe the product of the bulk density and mass ratio of fly ash and bentonite in the slurry, and $\rho_d(\mathrm{Soil})$ is used to describe the product of the density and mass ratio of site soil and quartz sand. This grouping is reasonable because sub-dataset_2 contains limited data with bentonite and site soil, bentonite has good lubricity and particle exchange properties similar to those of fly ash, and the site soil and quartz sand do not participate in chemical reactions within the slurry.
The solidification mechanism of the slurries in sub-dataset_3 is that the silicate in PS gels with particles in fly ash and soil, and their combination forms a relatively stable structure [41]. Therefore, the effects of chemical reactions on the compressive strength can be disregarded in this sub-dataset. The soil contains loess and diatomite. SSF is the firming agent used for the slurries.
where $P_1$ and $P_2$ are the parameters considered for correlation, and $\bar{P}_1$ and $\bar{P}_2$ are their mean values, respectively. $\rho$ ranges from −1 to 1, where 1, 0, and −1 represent a strong positive correlation, no correlation, and a strong negative correlation between these parameters, respectively. Figures 1, 2, and 3 show the distributions and correlations of the different input and output factors in each sub-dataset. In these figures, dark blue indicates a positive correlation, and yellow indicates a negative correlation.
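Assuming the coefficient is the Pearson correlation coefficient, which matches the symbols and the range described, it can be written as follows. The sample index $i$ and sample count $n$ are notation introduced here for illustration.

$$\rho = \frac{\sum_{i=1}^{n}\left(P_{1,i}-\bar{P}_1\right)\left(P_{2,i}-\bar{P}_2\right)}{\sqrt{\sum_{i=1}^{n}\left(P_{1,i}-\bar{P}_1\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(P_{2,i}-\bar{P}_2\right)^2}}$$

In this form, $\rho = 1$ when $P_2$ increases exactly linearly with $P_1$, and $\rho = -1$ when it decreases exactly linearly.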
In sub-dataset_1, the numbers of positive and negative correlations between the input and output variables are approximately equal, with a strong positive correlation between X11 and Y and weak correlations of X3 and X4 with Y. In sub-dataset_2, the input variables were mostly inversely correlated with Y, with generally small correlation values, whereas there was a strong positive correlation between X13 and Y. In sub-dataset_3, there were mostly positive correlations between the input variables and Y, especially for X5. However, X1 and X7 exhibited weak correlations with Y.

Development and performance of models
The ten ML algorithms for predicting compressive strength in this study were developed using MATLAB. These algorithms involve several types of learning, including single, ensemble, tree-based, and deep learning. The GA can optimize the hyper-parameters of each model. Performance evaluation indices can be used to discern prediction models with high generalization and accuracy. Subsequently, a relative importance analysis and SHAP interpretation can identify the significant input variables for predicting the compressive strength and determine whether their effects are positive or negative [43,44]. A schematic diagram of each machine-learning algorithm and the GA is shown in Fig. 4.

Tuning of hyper-parameters
To guarantee high prediction accuracy with ML models, appropriate hyper-parameter values must be selected. However, because the arithmetic principle of each algorithm varies, the best hyper-parameter values must be obtained for the different algorithms under the same controlled settings. In this study, GA was used to optimize the hyper-parameters of each algorithm and provide their best combination. The optimization process requires the parameters of the GA to be set uniformly. For the sub-datasets, the population size was set to 20, the maximum number of generations to 100, the crossover probability to 0.4, and the mutation probability to 0.05. The hyper-parameters tuned for each algorithm were as follows: for ANN, the sizes of hidden layers 1, 2, and 3 (NodeNum_1, NodeNum_2, and NodeNum_3); for BP, the weight parameters (W1, W2) and bias parameters (B1, B2); for SVR, the coefficient of the penalty term (c) and the gamma value of the Gaussian kernel (g); for RF, the total number of trees (tree_num) and the minimum number of samples required at a leaf node (minleaf); for XGBoost, the maximum number of iterations (max_num_iters), the maximum depth of the tree (params.max_depth), and the learning rate (params.eta); for CNN, the result of neural network convolution on the input data (feature_map), the reduction factor of the learning rate (LearnRateDropFactor), and the number of rounds to reduce the learning rate (LearnRateDropPeriod); for LSTM, the number of hidden layer units (numHiddenUnits), LearnRateDropFactor, and LearnRateDropPeriod; for GRU, numHiddenUnits, LearnRateDropFactor, and LearnRateDropPeriod; for RBF, the expansion speed (Spread); and for ELM, the number of hidden layer nodes (L). The hyper-parameters of the BP are expressed in matrix form.
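A GA loop with the settings quoted above (population size 20, 100 generations, crossover probability 0.4, mutation probability 0.05) can be sketched as follows. The study used MATLAB; this Python version is only an illustration. The `validation_error` stand-in, the search ranges, and the choice of tournament selection, arithmetic crossover, and elitism are assumptions, not the authors' implementation.

```python
import random

# GA settings quoted in the text.
POP_SIZE, MAX_GEN, P_CROSS, P_MUT = 20, 100, 0.4, 0.05

# Hypothetical search ranges for two XGBoost-style hyper-parameters.
# max_depth is treated as continuous here; a real run would round it.
BOUNDS = {"max_depth": (2.0, 12.0), "eta": (0.01, 0.5)}


def validation_error(ind):
    """Stand-in for a validation RMSE; a real run would train the model here."""
    return (ind["max_depth"] - 6.0) ** 2 / 36.0 + (ind["eta"] - 0.1) ** 2 * 100.0


def random_individual(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def tournament(pop, rng, k=3):
    """Pick the best of k randomly chosen individuals."""
    return min(rng.sample(pop, k), key=validation_error)


def crossover(a, b, rng):
    """Arithmetic crossover, applied with probability P_CROSS."""
    if rng.random() >= P_CROSS:
        return dict(a)
    w = rng.random()
    return {k: w * a[k] + (1.0 - w) * b[k] for k in a}


def mutate(ind, rng):
    """Resample each gene uniformly within its bounds with probability P_MUT."""
    return {k: rng.uniform(*BOUNDS[k]) if rng.random() < P_MUT else v
            for k, v in ind.items()}


def run_ga(seed=0):
    rng = random.Random(seed)
    pop = [random_individual(rng) for _ in range(POP_SIZE)]
    best = min(pop, key=validation_error)
    for _ in range(MAX_GEN):
        children = [
            mutate(crossover(tournament(pop, rng), tournament(pop, rng), rng), rng)
            for _ in range(POP_SIZE - 1)
        ]
        pop = children + [best]  # elitism: carry the best individual forward
        best = min(pop, key=validation_error)
    return best


best = run_ga()
```

Using the same GA settings for every algorithm, as the text describes, keeps the tuning budget equal across models so that differences in accuracy are not an artifact of unequal search effort.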

Assessment of models
The model prediction results are shown in Figs. 5, 6, and 7. In Fig. 5, most data points in sub-dataset_1 are concentrated within the 20% error region for all ten models; for XGBoost in particular, most points lie almost on the diagonal. In Fig. 6, the data points in sub-dataset_2 are uniformly distributed on both sides of the centerline, except for some with small measured values. This shows the good predictive performance of these models. In Fig. 7, although the data points in sub-dataset_3 are more dispersed, the points for XGBoost remain generally close to the centerline, showing the high predictive accuracy of this model.
To evaluate the accuracy of the predictions for each model more intuitively, we used the metrics in Table 6 to assess these models. $R^2$ indicates the proximity between the predicted and measured values. The closer the value of $R^2$ is to 1, the higher the prediction accuracy of the model [34]. VAF indicates the degree of correlation between the predicted results of the model and the actual results. The closer the value of VAF is to 100, the higher the prediction accuracy of the model. RMSE and MAE indicate the deviation between the predicted results of the model and the actual results. COV indicates the degree of dispersion between the predicted results of the model and the actual results. The closer the values of RMSE, MAE, and COV are to 0, the higher the prediction accuracy of the model [42]. $y_i$, $y_i^P$, and $\bar{y}$ represent the measured value of sample i, the predicted value of sample i, and the average of all measured values, respectively. XGBoost had the best evaluation metrics in each sub-dataset. In sub-dataset_1, for both the training and testing sets, the $R^2$ and VAF values are higher than 0.99 and close to 100, respectively. The RMSE and MAE values were both less than 0.05. The COV value of XGBoost was significantly lower than those of the other models. In the other two sub-datasets, the testing set of sub-dataset_3 had the worst evaluation metrics, with $R^2$, RMSE, MAE, VAF, and COV values of 0.933, 1.105, 0.751, 93.35, and 31.40, respectively. However, these metrics were still significantly better than those of the other models. In summary, XGBoost has the best prediction accuracy and generalizability, yielding more robust results with lower uncertainty. In contrast, some widely used models, such as BP, SVR, and RF, did not perform as expected.
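The five metrics can be computed as in the sketch below. Table 6 is not reproduced here, so the formulas are commonly used definitions and should be read as assumptions; in particular, COV is taken as RMSE divided by the mean measured value, expressed in percent. The sample values are hypothetical.

```python
import math


def evaluate(y_true, y_pred):
    """Return R2, RMSE, MAE, VAF (%) and COV (%) for measured and predicted values."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    residuals = [yt - yp for yt, yp in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(r) for r in residuals) / n
    mean_r = sum(residuals) / n
    var_res = sum((r - mean_r) ** 2 for r in residuals) / n
    var_y = ss_tot / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": rmse,
        "MAE": mae,
        "VAF": (1.0 - var_res / var_y) * 100.0,
        "COV": rmse / mean_y * 100.0,  # assumed definition
    }


# Hypothetical measured and predicted compressive strengths (MPa).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
metrics = evaluate(measured, predicted)
```

For these sample values the residuals have zero mean, so $R^2$ and VAF/100 coincide; they differ whenever the predictions are systematically biased.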

Relative importance of the input variables
Based on the results of the model performance evaluation, XGBoost with GA-optimized hyper-parameters was selected to analyze the relative importance of the input variables. The datasets and hyper-parameters were imported into the model to obtain the importance score of each input variable in the three sub-datasets. These scores were then used to assess the contribution of each input variable to the model. The importance scores are compared visually in Fig. 11.
The most important input variables in sub-dataset_1 were n(H₂O)/n(CaO), curing time, m(Binder)/m(CaO), and L/S. Among these, n(H₂O)/n(CaO) reflects the main chemical process in which calcium hydroxide forms from quicklime and water in the slurries, which significantly impacts the compressive strength. The importance of the curing time shows that the increase in the compressive strength of the grouting slurries is closely related to time. The m(Binder)/m(CaO) and L/S ratios indicate that the effects of binders on the compressive strength are considerable.
Similarly, the most important input variables in sub-dataset_2 were the curing time, CaO content, n(H₂O)/n(CaO), and n(H₂O)/n(SiO₂*). In sub-dataset_3, they were L/S, c(PS), and the curing time. These results indicate that the calcium-based materials, binders, and curing time significantly impact the compressive strength. However, Fig. 11 shows that some input variables have a lower relative importance for the compressive strength in each sub-dataset. To simplify the analysis, this study neglected the variables with a small effect on the compressive strength and employed the remaining variables for re-modeling. However, instead of directly selecting only the input variables with a significant effect on the compressive strength, this process must retain at least one input variable for each admixture material to reflect its effect on the slurries. Figure 12 shows the final input variables with their relative importance in each sub-dataset, which were obtained through re-modeling using GA-XGBoost.
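The selection rule described above (drop low-importance variables, but keep at least one variable per admixture material) can be sketched as follows. The threshold, importance scores, and variable-to-material mapping are hypothetical; the text does not state the cut-off that was actually used.

```python
def select_features(importance, material_of, threshold):
    """Keep features at or above the threshold, plus the top feature of any
    material that would otherwise be left without a variable."""
    kept = {f for f, score in importance.items() if score >= threshold}
    covered = {material_of[f] for f in kept}
    for material in set(material_of.values()) - covered:
        candidates = [f for f in importance if material_of[f] == material]
        kept.add(max(candidates, key=importance.get))
    return sorted(kept, key=importance.get, reverse=True)


# Hypothetical importance scores and variable-to-material mapping.
importance = {
    "curing_time": 0.30, "n(H2O)/n(CaO)": 0.28, "L/S": 0.20,
    "rho_b(FA)": 0.03, "n(SiO2)/n(CaO)": 0.02, "rho_d(Soil)": 0.04,
}
material_of = {
    "curing_time": "curing", "n(H2O)/n(CaO)": "lime", "L/S": "binder",
    "rho_b(FA)": "fly ash", "n(SiO2)/n(CaO)": "fly ash", "rho_d(Soil)": "soil",
}

selected = select_features(importance, material_of, threshold=0.05)
```

In this example the fly ash and soil variables fall below the threshold, yet the highest-scoring variable of each is retained so that every material remains represented in the re-modeled dataset.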

SHAP analysis of the input variables
In the previous section, we selected the input variables for each sub-dataset. However, determining whether the effect of the variables on the compressive strength is positive or negative is difficult if only the relative importance analysis is considered. Therefore, Shapley additive explanations (SHAP) were introduced [48].
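To illustrate what a Shapley value measures, the sketch below computes exact Shapley values by enumerating feature coalitions for a toy two-variable model. It is not the SHAP package and not the GA-XGBoost model; `toy_strength`, its coefficients, and the input values are hypothetical, and exhaustive enumeration is practical only for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features outside a coalition are replaced by their baseline value.
    """
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_strength(features):
    """Hypothetical strength model: rises with curing time, falls with L/S."""
    curing_time, liquid_solid = features
    return 0.05 * curing_time - 2.0 * liquid_solid + 0.01 * curing_time * liquid_solid


x = [28.0, 0.5]        # sample to explain: 28 days, L/S = 0.5
baseline = [7.0, 0.3]  # reference sample
phi = shapley_values(toy_strength, x, baseline)
```

The contributions sum to the difference between the prediction for `x` and the prediction for `baseline`, and their signs show the direction of each variable's effect, which is the information a relative importance score alone cannot provide.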
The SHAP interpretation package in Python was used to obtain the analysis results. In Fig. 13, the plot of the curing time shows that with increasing time, the SHAP value first increases and then gradually stabilizes. Figures 14 and 15 show similar trends. However, the SHAP value in these two sub-datasets decreased in the later stage as the curing time increased. This indicates that the compressive strength of the slurries did not continuously increase over time. Therefore, determining the optimal curing time for each slurry was necessary. The variables L/S, n(H₂O)/n(CaO), and m(Binder)/m(CaO) have similar trends in the plots, showing a decreasing SHAP value when each variable increases. However, the downward trend in the latter two variables was not linear. These results indicate that, as the binder dosage increased, a large amount of water was added to the slurries, which deteriorated their compressive strength. In addition, the effective component in the binders deteriorates the compressive strength of the slurries by affecting the structure of the generated calcium carbonate. In the plots of CaO content, n(H₂O)/n(CaO), and m(Binder)/m(CaO), the trends of the three variables indicate that as the dosage of quicklime increases, the compressive strength of the slurries increases. However, when the dosages of both quicklime and the binder increased, the compressive strength of the slurries decreased. In addition, the plot of $\rho_d(\mathrm{Soil})$ shows that the SHAP value first increases and then decreases when the variable increases. This indicates that higher [...] decrease and then gradually stabilizes. Combined with the trend of the CaO content, these results indicate that the compressive strength of the slurries did not continuously increase for higher CGN dosages. These results and the trend of L/S indicate that the dosage of water or binder can increase the compressive strength of the slurries within two specific ranges. The variables $\rho_d(\mathrm{Soil})$ and n(SiO₂)/n(CaO) have similar trends; the SHAP value first increases and then decreases when each variable increases. This indicates that an increase in the dosage of soil, quartz sand, fly ash, or bentonite can increase the compressive strength of slurries within a certain range.

Fig. 7 Experimental and predictive values of the ten machine learning models for sub-dataset_3
In the plots of Fig. 15, the variables L/S, n(SiO₂)/n(K₂O), and c(PS) have similar trends; the SHAP value first increases and then decreases when each variable increases, indicating that with an increase in the dosage, modulus, and concentration of PS, the compressive strength of slurries can increase within a certain range. The variables $\rho_d(\mathrm{Soil})$ and $\rho_b(\mathrm{FA})$ have similar trends; the SHAP value first decreases, then increases, and finally decreases again when each variable increases. This indicates that the fly ash and soil or diatomite have higher mechanical strength when utilized alone in PS slurries. Finally, the SSF content variable has a complex, changing trend, which indicates that SSF contents between 0.01 and 0.03 in slurries lead to higher compressive strengths.

Objective functions
The optimization of the slurry ratio requires the analysis of three evaluation indicators: compressive strength, carbon emission coefficient, and material cost. In general, the objective functions must be set to minimize carbon emissions and costs. However, the objective function of compressive strength must be discussed along with the utilization of slurries on earthen sites. Presently, research on fissure grouting slurries requires the compressive strength to be slightly higher than that of the site soil. Anchor grouting slurries have higher compressive strength requirements so that they are compatible with the anchor rods and site soil. However, no research has proven that the compressive strength of grouting slurries must be much greater than that of the site soil for optimal compatibility [5,49]. In fissure grouting engineering, when the mechanical strength of the grouting slurry is less than that of the site soil, the part repaired by the slurry is more susceptible to erosion damage than the site soil, which causes secondary damage to earthen sites. Meanwhile, in anchor grouting engineering, if the mechanical strength of the grouting slurry is too low, the external forces acting on the earthen sites damage the slurry-soil interface before the slurry-rod interface is destroyed. This would lead to greater damage to the sites. In addition, the mechanical properties of earthen sites may vary significantly across regions, geological conditions, and construction periods, leading to a more complex analysis of the mechanical compatibility between slurries and earthen sites [50,51]. Therefore, this study favors slurries with higher compressive strengths, which are more widely utilized in the engineering of earthen site repairs. The optimization must maximize the mechanical strength and minimize the carbon footprint and cost.
We modeled the objective function of the compressive strength of the slurry using the GA-XGBoost model. Because carbon emissions mainly originate from the production and transportation of admixtures in the slurries, the objective function of the carbon footprint (EC) is expressed as the sum of the production and transportation footprints [52]. $EC_R$ and $EC_T$ represent the carbon footprints during production and transportation, respectively; $Q_i$ represents the content of each admixture in the slurry; $EF_i$ represents the carbon emission coefficient of each admixture; $D_i$ represents the transportation distance of each admixture; and $EF_T$ represents the carbon emission coefficient during transportation. The carbon emission coefficient of electricity consumption was 0.581 kgCO₂/(kW·h) based on the average value from the State Grid of China for 2022 [53]. Several studies were conducted in Lanzhou, whereas many field tests did not specify their test sites. Therefore, to ease calculation and analysis, the transportation distance of each material was set according to the shortest distance from its origin to the Langongping Campus at the Lanzhou University of Technology. For the same material with different origins, the average distance was taken as the transportation distance. The carbon emission coefficient during transportation was set to 0.18 kgCO₂/(ton·km) [54].
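One form of the carbon footprint that is consistent with these symbol definitions is given below. It is a reconstruction from the definitions and is not necessarily the exact expression used in [52].

$$EC = EC_R + EC_T, \qquad EC_R = \sum_i Q_i \, EF_i, \qquad EC_T = \sum_i Q_i \, D_i \, EF_T$$

With $Q_i$ in tons, $EF_i$ in kgCO₂ per ton, $D_i$ in km, and $EF_T$ in kgCO₂/(ton·km), both terms are in kgCO₂.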
Subsequently, the objective function for the production and transportation costs of the materials was defined in terms of the following quantities: $Q_i$ represents the content of each admixture in the slurry, $P_i$ is the price of each admixture, and $P_{Ti}$ represents the transportation cost of each admixture. Some materials require secondary processing after purchase, and the additional costs incurred during processing must be included. The transportation costs vary between regions; therefore, the cost data do not vary linearly with transportation mileage.
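The two linear objectives can be evaluated as in the following sketch. It assumes that the carbon footprint is the sum of $Q_i \, EF_i$ and $Q_i \, D_i \, EF_T$ over the admixtures and that the cost is the sum of $Q_i (P_i + P_{Ti})$; both forms are inferred from the symbol definitions. All material values are hypothetical and are not taken from Table 7.

```python
from dataclasses import dataclass

# Transportation coefficient quoted in the text; unit kgCO2/(ton*km) is assumed.
EF_TRANSPORT = 0.18


@dataclass(frozen=True)
class Material:
    emission_factor: float  # EF_i, kgCO2 per ton (hypothetical)
    distance_km: float      # D_i, transportation distance (hypothetical)
    price: float            # P_i, price per ton (hypothetical)
    transport_cost: float   # P_Ti, per ton; looked up, not linear in mileage


def carbon_footprint(mix, materials):
    """EC = EC_R + EC_T for a mix given as {material name: Q_i in tons}."""
    ec_r = sum(q * materials[m].emission_factor for m, q in mix.items())
    ec_t = sum(q * materials[m].distance_km * EF_TRANSPORT for m, q in mix.items())
    return ec_r + ec_t


def total_cost(mix, materials):
    """Cost = sum of Q_i * (P_i + P_Ti) over all admixtures."""
    return sum(q * (materials[m].price + materials[m].transport_cost)
               for m, q in mix.items())


materials = {
    "quicklime": Material(1000.0, 50.0, 400.0, 30.0),
    "fly_ash": Material(10.0, 20.0, 100.0, 15.0),
}
mix = {"quicklime": 0.3, "fly_ash": 0.7}

ec = carbon_footprint(mix, materials)
cost = total_cost(mix, materials)
```

Storing the transportation cost per material, rather than computing it from distance, mirrors the statement that the cost does not scale linearly with mileage.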
Table 7 lists the carbon emission factors and costs during the production and transportation of each admixture, where "\" denotes negligible parts. The prices of the raw materials were obtained based on purchasing experience and consultation with the material manufacturers. The transportation cost was obtained by consulting multiple transportation companies; this cost does not change linearly with the actual transportation mileage in different regions. The production price of the site soil was set based on labor costs from our research team's experience of on-site sampling, crushing, and screening. The binder concentration in parentheses is the concentration at which the binders were purchased; they may need to be diluted before use. The carbon emission coefficient of each material was obtained from the China Products Carbon Footprint Factors Database. The coefficients of some materials, such as SH, CGN, PS, and SSF, were estimated by analyzing the possible carbon emissions generated through their production processes; therefore, the results may be inaccurate [21,55-58].

Mixture optimization of the slurry
After determining the objective functions, the constraint conditions of each function must be set before conducting multi-objective optimization. The range constraints in this study were set based on the range of the amount of each slurry component in each sub-dataset. The proportional constraints were selected based on the liquid-solid ratios of the different slurries, as mentioned earlier.
In sub-dataset_1, the binder (liquid) represents SH and sticky rice paste, and the admixture (solid) represents quicklime, fly ash, and soil. In sub-dataset_2, the binder (liquid) represents water and sticky rice paste, and the admixture (solid) represents CGN, fly ash, quartz sand, bentonite, and soil. In sub-dataset_3, the binder (liquid) represents PS, and the admixture (solid) represents fly ash, soil, and diatomite. In each case, the ratio constraint bounds the liquid-solid ratio m(Binder)/m(Admixture); for sub-dataset_3, it is considered as follows:

$$0.10 \le \frac{m(\mathrm{Binder})}{m(\mathrm{Admixture})} \le 0.70 \tag{12}$$

In addition, when analyzing the objective functions, the compressive strength must be considered in both the early and late stages to ensure experimental practicality.
In summary, three multi-objective optimization scenarios were set. Sub-dataset_1 uses the 28-day compressive strength, 63-day compressive strength, carbon emissions, and cost as objective functions, with 33 sets of data. Sub-dataset_2 uses the 28-day compressive strength, 60-day compressive strength, carbon emissions, and cost as objective functions, with 20 sets of data. Sub-dataset_3 uses only the 28-day compressive strength (because of the lack of long curing-time data), carbon emissions, and cost as objective functions, with 85 sets of data. In these optimization scenarios, the compressive strength was calculated using the GA-XGBoost model, whereas the carbon emissions and cost were calculated using polynomials.
Optimization problems aim to maximize the compressive strength and minimize carbon emissions and cost. However, maximizing the compressive strength may increase the latter two; therefore, these objective functions have mutually constrained relationships, and optimal results cannot be achieved simultaneously [59].

Fig. 16 Optimization results of the different slurry mixture designs
The MOPSO results are represented using parallel coordinate plots, as shown in Fig. 16. Each vertical axis represents one objective function, each curve represents a non-dominated solution in the Pareto solution set, and the red line represents the ratio with the highest relative closeness [60]. Table 8 presents the optimal slurry ratio for each sub-dataset.
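The relative closeness used to select the red line can be illustrated with a generic TOPSIS calculation. The sketch below is a textbook formulation under stated assumptions: equal criterion weights and hypothetical Pareto solutions. Neither the weights nor the numbers are taken from this study, and the function name `topsis_closeness` is illustrative.

```python
import math
from typing import List, Sequence


def topsis_closeness(matrix: Sequence[Sequence[float]],
                     weights: Sequence[float],
                     benefit: Sequence[bool]) -> List[float]:
    """Relative closeness of each alternative to the ideal solution.

    matrix:  rows are alternatives, columns are criteria.
    benefit: True for criteria to maximize, False for criteria to minimize.
    Assumes no all-zero column and at least two distinct alternatives.
    """
    n_crit = len(weights)
    # Vector-normalize each column, then apply the criterion weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    columns = list(zip(*weighted))
    # Ideal and anti-ideal points, criterion by criterion.
    ideal = [max(col) if benefit[j] else min(col) for j, col in enumerate(columns)]
    anti = [min(col) if benefit[j] else max(col) for j, col in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)  # distance to the ideal point
        d_neg = math.dist(row, anti)   # distance to the anti-ideal point
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical Pareto solutions: (strength, carbon emissions, cost).
pareto = [(3.0, 0.50, 100.0), (2.5, 0.40, 80.0), (1.8, 0.30, 60.0)]
scores = topsis_closeness(pareto, weights=[1 / 3] * 3,
                          benefit=[True, False, False])
best_index = scores.index(max(scores))  # solution with highest closeness
```

A closeness of 1 means the alternative coincides with the ideal point on every criterion, and 0 means it coincides with the anti-ideal point. Different weights would generally change the ranking.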
In sub-dataset_1, the optimal mass ratio of quicklime in the Lime slurry was 0.195, which means that quicklime accounted for 0.3 of the solid materials in the slurry. This satisfies the requirement that the Lime slurry produce a slight expansion through a high lime mass ratio, which can improve the mechanical compatibility between the slurry and earthen sites. Compared with the mass ratio with the highest compressive strength, the carbon emission coefficient of the optimal ratio decreased by 2.1% and the cost decreased by 1.6%. Meanwhile, the optimal ratio resulted in a 12% increase in fly ash usage. Combining the SHAP analysis in Fig. 13 with the carbon emissions and cost values in Table 7 shows that a high mass ratio of fly ash in the slurry not only improves the mechanical strength of the slurry but also increases the utilization of low-carbon coal-fired by-products. On the other hand, the optimal mass ratio of site soil decreased by 14%. This is because collecting site soil involves high sampling costs and long-distance transportation.
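The two quicklime figures quoted above can be cross-checked against the liquid-solid ratio constraint of Eq. (12). The short calculation below is a sketch that assumes the slurry consists only of binder (liquid) and admixture (solid), and that 0.195 is the quicklime share of the total slurry mass. Both are interpretations for illustration, not statements from the study.

```python
# Values quoted in the text for the optimal Lime slurry.
quicklime_in_slurry = 0.195  # quicklime share of total slurry mass (assumed meaning)
quicklime_in_solids = 0.3    # quicklime share of the solid materials

# Assumption: total mass = solids + liquids, so the solid share follows directly.
solid_fraction = quicklime_in_slurry / quicklime_in_solids  # about 0.65
liquid_fraction = 1.0 - solid_fraction                      # about 0.35

# Liquid-solid ratio m(Binder)/m(Admixture), compared with the Eq. (12) bounds.
ratio = liquid_fraction / solid_fraction                    # about 0.54
LOWER, UPPER = 0.10, 0.70
feasible = LOWER <= ratio <= UPPER
```

Under these assumptions the implied liquid-solid ratio is about 0.54, which lies inside the 0.10 to 0.70 range of Eq. (12).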
In sub-dataset_2, the optimal mass ratios of CGN, fly ash, and quartz sand in the CGN slurry were 0.278, 0.208, and 0.208, respectively. Compared with the mass ratio with the highest compressive strength, both the carbon emission coefficient and the cost of the optimal mass ratio decreased by 19%. Meanwhile, the optimal mass ratio of CGN in the slurry decreased by 5.6%, whereas the usage of both fly ash and quartz sand increased by 4.2%. Because CGN is an important factor affecting carbon emissions and costs in this slurry, controlling the usage of CGN is the most direct way to optimize the carbon emission and economic objectives. Based on the results of the SHAP analysis shown in Fig. 14, the optimal ratio meets the requirements for enhancing the mechanical strength of the slurry. In a study by Ren et al., the CGN slurry ratio with the best physical and mechanical properties was the mass ratio with the highest compressive strength. However, the optimal mass ratio obtained in this study also meets the requirements of high mechanical strength, high density, and low shrinkage rate for the anchoring slurry used in earthen sites.
In sub-dataset_3, the optimal slurry was PS-C. Compared with the mass ratio with the highest compressive strength, the carbon emission coefficient of the slurry with the optimal mass ratio decreased by 39.3%, and the cost decreased by 43.6%. Meanwhile, the usage of PS decreased by 18.7%. Although the use of fly ash in the mass ratio with the highest compressive strength can reduce the carbon emissions and costs of the slurry, controlling the usage of PS is the most effective way to optimize the carbon emission and economic objectives, because PS usage is an important factor affecting carbon emissions and costs in this type of slurry. Based on the SHAP analysis results in Fig. 15, the PS modulus, PS concentration, and SSF content of the optimal slurry meet the requirements for enhancing the mechanical strength of the slurry. In a study by Li et al., the PS-C slurry not only met the physical and mechanical requirements for repairing earthen sites but also showed good frost resistance, acid and alkali resistance, and resistance to water disintegration.
The results above demonstrate that the proposed framework can not only identify solutions with better environmental and economic benefits but also reduce the usage of materials with high carbon emissions and high costs in the preferred solutions. Meanwhile, the optimized ratios meet the actual engineering needs of repairing earthen sites.

Conclusion
In this study, we propose an analytical framework for the intelligent mixture design of anchor and fissure grouting slurries used for repairing earthen sites in Northwest China. Based on this framework, the admixture ratio was optimized for three commonly used slurries, and different design schemes were proposed for the mechanical, environmental, and economic performance of the slurries. The following main conclusions were drawn:

(1) The relationship between the mechanical strength and its influencing factors in slurries is mostly nonlinear. Therefore, ML algorithms are needed to predict the compressive strength. The GA optimized the hyper-parameters of each algorithm for each sub-dataset, which allowed the models to be compared under relatively equal conditions and improved the prediction accuracy of each optimized model. Among all the ML models, GA-XGBoost had significantly better evaluation metrics in all three sub-datasets, indicating high accuracy and versatility in mechanical strength prediction.

(2) Based on the raw materials, mix composition, binder types, and curing conditions, the input variables affecting the mechanical properties were designed for modeling. A relative importance analysis showed that the calcium-based materials, binders, and curing time significantly impacted the compressive strength. However, SHAP analysis revealed that these impacts were sometimes negative.

(3) A design framework combining meta-heuristic optimization with TOPSIS decision-making was applied to the three sub-datasets to obtain multi-objective optimization results. The framework performed well and met the slurry-optimization requirements in the original studies. A comparison of the optimal solutions in the Pareto sets of each sub-dataset showed that the objectives of compressive strength, carbon emissions, and cost constrain one another and cannot all be optimized simultaneously.

Fig. 1 Distribution and correlation of variables for sub-dataset_1

Fig. 2 Distribution and correlation of variables for sub-dataset_2

Figures 8, 9, and 10 show the radar plots of the values of the evaluation metrics for each sub-dataset, which contain the training and test sets. The models with high R²

Fig. 3 Distribution and correlation of variables for sub-dataset_3

Fig. 4 Graphical representation of the ten ML models and GA

Fig. 5 Experimental and predictive values of the ten machine learning models for sub-dataset_1

Figures 13, 14, and 15 show the results of the global analysis and the SHAP value of each input variable in the three sub-datasets. The first plot of each figure presents the results of the global analysis. Each point corresponds to an input sample in the dataset and each row represents an input variable. Red points indicate larger SHAP values, whereas blue indicates smaller ones. The other plots in Figs. 13, 14, and 15 show the relationships between the variables and their SHAP values. The global analysis plots show that, except for a few variables in the top rows, the boundary between positive and negative contributions for the other variables is not obvious, which indicates that the relationship between the input variables and the predictions is complex. The other plots with input variables and their SHAP values show the

Fig. 6 Experimental and predictive values of the ten machine learning models for sub-dataset_2

Fig. 8 Radar chart of the evaluation indexes of the ten ML models for sub-dataset_1

Fig. 10 Radar chart of the evaluation indexes of the ten ML models for sub-dataset_3

Fig. 13 SHAP values of the global analysis and seven input variables in sub-dataset_1

Fig. 14 SHAP values of the global analysis and eight input variables in sub-dataset_2

Fig. 15 SHAP values of the global analysis and seven input variables in sub-dataset_3

$${} \le \frac{m(\text{Binder})}{m(\text{Admixture})} \le 0.68 \tag{13}$$

$$0.39 \le \frac{m(\text{Binder})}{m(\text{Admixture})} \le 1.25 \tag{14}$$

Table 1 lists the 523 data sets from different studies, including data sources, slurry names, numbers, admixture materials, ratios, and curing times. SSF in Table

Table 1 Database summary

Table 4 presents the statistical results of the input and output variables for sub-dataset_3. In this table, c(PS) indicates the mass concentration of PS, and n(SiO₂)/n(K₂O) indicates the modulus of PS, which refers to the molar ratio of SiO₂ to K₂O in the potassium silicate solution. This variable represents the content of effective bonding components in the PS solution.

Table 2 Statistical information for sub-dataset_1

Table 3 Statistical information for sub-dataset_2

Table 4 Statistical information for sub-dataset_3

Table 5 lists the best combinations of hyper-parameter values

Table 5 Hyper-parameters of the ML algorithm

Table 6 Statistical parameters used to evaluate the performance of different machine learning algorithms

Table 7 Characteristic parameters for slurry mixture ingredients

Table 8 Optimal solutions for the slurry mixture