Ancient mural dynasty recognition algorithm based on a neural network architecture search

Neural network models for ancient mural dynasty recognition currently have to be designed manually. To remove this requirement, this paper proposes an ancient mural dynasty recognition algorithm based on neural architecture search (NAS). First, the structural edge information of mural images is extracted to assist the neural network model in the mural recognition task. Second, a NAS algorithm based on contrast selection (CS) reduces the architecture search to an incremental contrast selection and searches for the optimal network architecture on the mural dataset. Finally, the optimal architecture found is trained and tested to complete the mural dynasty recognition task. The results show that the proposed method achieves a maximum accuracy of 88.10%, a recall rate of 87.52%, and a precision rate of 87.69% on the mural dataset. On every evaluation index, the searched model outperforms classical network models such as AlexNet and ResNet-50. Compared with NAS methods such as ASNG and MIGO, the proposed method improves the accuracy of mural dynasty recognition by an average of 4.27%. The proposed method is also verified on the CIFAR-10, CIFAR-100 and ImageNet16-120 datasets in the NAS-Bench-201 search space, where it achieves average accuracies of 93.26%, 70.73% and 45.34%, respectively.


Introduction
A mural is a type of painting with a clear social function. Murals produced in different periods differ in painting methods, material selection, layout and color, and each dynasty has its own mural style and characteristics. Therefore, the study of murals across different dynasties can help historical researchers better understand the history of the various dynasties, which has positive relevance for both the development of human history and the development of Chinese murals.
At present, image recognition is widely used in various fields [1][2][3][4][5][6], and convolutional neural networks (CNNs) are one of the core algorithms in image recognition. In 2012, the AlexNet [7] model was applied to ImageNet, which led to increasingly deeper neural networks, such as VGGNet [8], GoogLeNet [9], and ResNet [10], being proposed. In this context, CNNs were first used for mural image classification [11]. However, when processing some of the solid features in mural images, CNNs are unable to extract category features such as tone and texture; extracting them requires manually designed neural networks and a large amount of training data. For instance, Anitescu et al. [12] employed an artificial neural network and an adaptive collocation strategy to solve second-order boundary value problems, and, based on the concept of energy, Samaniego et al. [13] defined the approximation space through neural networks and used a gradient optimization algorithm to solve nonconvex optimization problems. Zou et al. [14] and Gao et al. [15] used the scale-invariant feature transform (SIFT) and cascade classification strategies, respectively, to construct different features for classifying Chinese mural paintings. However, SIFT produces errors when classifying murals of different styles and requires considerable manpower to construct features. Cao et al. [16] extracted deep features through Inception-v3 and used a cross-entropy loss function and an adaptive learning rate algorithm for training. The same team also made adaptive improvements to the feature capsule layer parameters of the original capsule network and adopted an adaptive optimization algorithm to optimize the parameters [17]. Nevertheless, in experiments, these algorithms have difficulty finding the best parameter configuration, their test performance fluctuates across categories, and considerable labor is consumed in constructing the network because of limited hardware computing power.
Designing a high-performance network architecture usually requires strong computing power and professional knowledge on the part of researchers. In recent years, automatically constructing a suitable network architecture by searching a large architecture space, which is referred to as neural architecture search (NAS) [18], has become popular. NAS has achieved good results in image classification [19,20], object detection [21] and other tasks, but these methods require the evaluation of many network architectures, and the necessary computing resources are too large. To improve the reproducibility of NAS algorithms, reduce the computational requirements of the search, and allow NAS algorithms to be assessed reliably, the first NAS benchmark, NAS-Bench-101, was proposed in [22]. The NAS-Bench-201 benchmark, which not only reduces the scope of the search space but also provides unified evaluation criteria for almost all of the latest NAS algorithms, was proposed in [23]. In addition, the chain search space, another popular search space, is organized differently from the cell-based search space: the cell-based search space stacks the same cell to generate the final network architecture, while the chain search space splices predefined candidate modules into the final network architecture through different permutations and combinations.
Inspired by the existing process of artificial network model construction, Akimoto et al. [24] proposed a stochastic natural gradient method with an adaptive step-size mechanism, which does not depend on parameter adjustment. Zheng et al. [25] proposed multinomial distribution learning for NAS. In their work, the search space is considered a joint multinomial distribution, NAS is transformed into a multinomial distribution learning problem, and the distribution is optimized to meet higher performance expectations. Tan and Le [26] addressed the problem of increasing the depth, width, and resolution of networks by proposing a compound model scaling method that improves efficiency and accuracy while reducing the number of parameters. Hu et al. [27] proposed the discrete stochastic neural architecture search (DSNAS), an efficient differentiable architecture search framework; the neural network model obtained from the DSNAS framework can be used immediately without retraining the network parameters. Xu et al. [28] proposed partial channel connections for memory-efficient architecture search (PC-DARTS), which improves memory utilization. Zheng et al. [29] proposed a new NAS framework that reduces the estimation error of the natural gradient in multivariate distributions, decreases the training cost in any hardware environment, and ensures the efficiency and feasibility of the framework in any search space. Sinha et al. [30] proposed a neural search space evolution scheme. Starting from a subset of the search space, the search space is developed by repeatedly searching for the optimized space within the subset and then refilling the subset. Chen et al. [31] proposed a novel framework called training-free NAS that ranks architectures by analyzing the spectrum of the neural tangent kernel and the number of linear regions in the input space.
Since ancient mural recognition currently relies on manually designed neural networks, it is not only time-consuming and labor-intensive but also requires professional knowledge. Combining the advantages of NAS in automatic network design with the adaptation to complex scenes that it has achieved in recent years, this paper proposes an ancient mural dynasty recognition algorithm based on NAS that addresses the problem of manually designing neural networks. At the same time, a multiscale edge fusion module is designed to enhance the recognition performance of the network. Finally, the multiscale edge fusion module is used as a candidate operation, and the NAS algorithm is used to construct a complete neural network model for ancient mural dynasty classification. The contributions of this paper are summarized as follows:

The remainder of this article is organized as follows. Related work is introduced in the first section; NAS-CS and the depth-separable edge selection module are described in the second section; the experiments and discussion are presented in the third section; and a summary is provided in the fourth section.

NAS-CS
Unlike other NAS methods, which treat architecture search as a probability distribution problem or as a differentiable architecture network, this paper proposes a NAS-CS algorithm that applies to any search space. First, a fixed sampling rule is set, so that the cell-based search space and the chain search space share a uniform sampling standard. That is, for the cell-based search space (taking NAS-Bench-201 as an example), the different edges of each cell select preset operations that do not arbitrarily change the information flow, and for the chain search space, each layer likewise selects candidate modules that do not arbitrarily change the information flow. Compared with other NAS methods, this sampling method has the following benefits for constructing a neural network architecture.
The feature information is not affected by any other candidate operations in this sampling method, and the classification accuracy is only related to the input feature information.
This method facilitates the layer-by-layer selection of candidate operations. The algorithm presented in this paper proceeds by determining the best candidate operation layer by layer. Completely random sampling, by contrast, increases the instability of the algorithm and cannot guarantee that the selected candidate operation is reliable.
The pseudocode of the algorithm proposed in this paper is as follows:

In the above pseudocode, edge represents the preset depth in the chain search space or the number of edges of the cell in the cell-based search space. The core idea is that, while a given layer is being decided, every candidate operation of that layer is trained under the fixed sampling rule, the recognition accuracy of each resulting model is recorded, and the best candidate operation is selected as that layer's operation in the final architecture. The time complexity of the algorithm is therefore O(edge × num), which depends only on edge and num (num is taken here to be the number of candidate operations per edge or layer).
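Because the pseudocode itself is not reproduced in this text, the following Python sketch illustrates one possible reading of the layer-by-layer contrast selection described above. The names `contrast_selection`, `evaluate` and `default_op`, as well as the toy scoring function, are assumptions introduced for illustration and are not taken from the paper. In practice, `evaluate` would train the sampled architecture and return its recognition accuracy.

```python
from typing import Callable, List, Sequence


def contrast_selection(
    num_edges: int,
    candidates: Sequence[str],
    evaluate: Callable[[List[str]], float],
    default_op: str,
) -> List[str]:
    """Pick the best candidate operation edge by edge (hypothetical sketch).

    Edges that are not decided yet use `default_op`, which stands in for the
    fixed sampling rule; decided edges keep their selected operation.
    """
    architecture = [default_op] * num_edges
    for edge in range(num_edges):
        best_op, best_acc = None, float("-inf")
        for op in candidates:
            trial = list(architecture)
            trial[edge] = op
            acc = evaluate(trial)  # one evaluation per (edge, candidate) pair
            if acc > best_acc:
                best_op, best_acc = op, acc
        architecture[edge] = best_op
    return architecture


# Toy stand-in for "train and measure accuracy": it counts evaluations and
# rewards agreement with a made-up target architecture. No real training.
calls = 0
target = ["conv3x3", "skip", "conv1x1", "conv3x3"]


def toy_evaluate(arch: List[str]) -> float:
    global calls
    calls += 1
    return sum(a == t for a, t in zip(arch, target)) / len(target)


ops = ["skip", "conv1x1", "conv3x3", "avg_pool"]
best = contrast_selection(len(target), ops, toy_evaluate, default_op="skip")
print(best, calls)
```

In this toy run the number of evaluations equals the number of edges times the number of candidates, which matches the O(edge × num) cost stated above; an exhaustive search over the same space would need num to the power of edge evaluations.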

Depth-separable edge selection module
The murals of various dynasties show different styles and characteristics based on different cultural beliefs and public aesthetics. For example, the caves from the Northern Wei Dynasty are mainly divided into three categories. The first category is the central pillar cave, the second category is the flat, square cave with a covered bucket roof, and the third category is the Zen cave with monk rooms on both sides of the main room. After the Tang Dynasty, the style of murals in grottoes changed to that of "sutra paintings." The scenes are grand, the colors are magnificent, and scenes portraying the pure land of bliss replace the Jataka-tale scenes of earlier periods. Because the styles and contents differ, the panoramic composition of a mural is of great significance in its identification. Therefore, a depth-separable edge selection module is constructed in this paper. This module extracts the structural edge information in a mural image and assists the neural network model in the mural dynasty recognition task.

Figure 1 shows the overall structure of the depth-separable edge selection module. First, for the input feature information $F \in \mathbb{R}^{H \times W \times C}$, high-resolution feature information $F_1 \in \mathbb{R}^{H \times W \times 4C}$ and low-resolution feature information $F_2 \in \mathbb{R}^{\frac{H}{2} \times \frac{W}{2} \times C}$ are obtained by depth-separable convolution $\phi(\cdot)$ and maximum pooling $\mathrm{pool}(\cdot)$. The specific formulas for these functions are as follows:

After the feature information at different resolutions is obtained, feature fusion is performed. First, bilinear interpolation ($\mathrm{Upsample}(\cdot)$) is used to upsample the low-resolution feature information so that the feature map size is consistent with that of the high-resolution feature information. Then, depthwise separable convolution is applied to further extract features, and the convolution results are fused by elementwise product to obtain $F_{fuse}$, which is expressed as follows.
where $F_{fuse}$ mainly suppresses the background noise and focuses attention on the region of interest by mutual weighting, but it also causes valuable clues to disappear. Therefore, residual aggregation is used to solve this problem. Its formulaic expression is as follows.
where $\mathrm{Cat}(\cdot)$ represents the concatenation operation, and $F_e$ represents the final edge information feature map, which continues along the information flow to assist the neural network model in recognition.
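The data flow of the module (pooling, bilinear upsampling, elementwise fusion, and concatenation) can be sketched in plain Python as follows. This is only an illustration under several assumptions: the exact formulas are not reproduced in this text, so the depth-separable convolution ϕ is replaced by an identity placeholder `phi`, the expansion to 4C channels is omitted, and the choice to concatenate the fused map with the high-resolution branch is a guess at the residual aggregation. All function names are hypothetical.

```python
from typing import Callable, List

Channel = List[List[float]]   # one H x W map
Feature = List[Channel]       # C channels


def max_pool2(x: Feature) -> Feature:
    """2x2 max pooling with stride 2 (H and W are assumed to be even)."""
    return [
        [
            [max(ch[i][j], ch[i][j + 1], ch[i + 1][j], ch[i + 1][j + 1])
             for j in range(0, len(ch[0]), 2)]
            for i in range(0, len(ch), 2)
        ]
        for ch in x
    ]


def _taps(n_out: int, n_in: int):
    """For each output index: two source indices and the blend weight."""
    taps = []
    for o in range(n_out):
        src = (o + 0.5) * n_in / n_out - 0.5      # half-pixel mapping
        src = min(max(src, 0.0), n_in - 1.0)      # clamp to the valid range
        lo = int(src)
        hi = min(lo + 1, n_in - 1)
        taps.append((lo, hi, src - lo))
    return taps


def upsample_bilinear(x: Feature, height: int, width: int) -> Feature:
    """Bilinear interpolation of every channel to height x width."""
    rows, cols = _taps(height, len(x[0])), _taps(width, len(x[0][0]))
    return [
        [
            [(1 - fy) * ((1 - fx) * ch[y0][x0] + fx * ch[y0][x1])
             + fy * ((1 - fx) * ch[y1][x0] + fx * ch[y1][x1])
             for x0, x1, fx in cols]
            for y0, y1, fy in rows
        ]
        for ch in x
    ]


def identity(x: Feature) -> Feature:
    """Placeholder for the depth-separable convolution (weights unknown)."""
    return x


def edge_selection(f: Feature, phi: Callable[[Feature], Feature] = identity) -> Feature:
    f1 = phi(f)                                          # high-resolution branch
    f2 = max_pool2(f)                                    # low-resolution branch
    up = upsample_bilinear(f2, len(f[0]), len(f[0][0]))  # back to H x W
    a, b = phi(f1), phi(up)
    fuse = [
        [[p * q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(a, b)
    ]                                                    # elementwise product
    return fuse + f1                                     # concatenate along channels


out = edge_selection([[[1.0, 2.0], [3.0, 4.0]]])
print(out)
```

With the identity placeholder, the fused channel is simply the input weighted by its upsampled local maximum, which shows how the product emphasizes strong responses while the concatenated branch preserves the original values.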

Module and algorithm combination
In designing the depth-separable edge selection module, the number of convolution channels is not specified, and nothing determines in advance where in the neural network model the module will have the best recognition effect. Traditional manual network design requires constant trial and error and adjustment, which not only takes a long time but also often fails to achieve satisfactory results. Treating the depth-separable edge selection module as a candidate operation in the search space enables the NAS method to quickly construct a complete neural network model. Moreover, the number of channels in the module's convolutions can be adjusted automatically according to the performance of the temporarily constructed model, and its position in the network model can be set. Therefore, in this paper, the depth-separable edge selection module is treated as a preset candidate operation, and the channel number strategy of each stage of the network model is fixed. The proposed NAS-CS is used to search for the optimal network architecture on a mural dataset. Finally, the optimal network architecture is used for training and testing to complete the mural dynasty recognition task.

Fig. 3 Sample mural images from different dynasties
The overall network structure proposed in this study is shown in Fig. 2. The number of channels is set for each level in the search space, and the depth-separable edge selection module is used as a candidate operation. When NAS-CS samples the depth-separable edge selection module, the number of channels is dynamically assigned to meet the connectivity needs of the constructed neural network model. The model then participates in training, which includes multiple sampling and complete architecture selection operations. Finally, the optimal architecture is used to complete the visual task of mural dynasty recognition.

Experimental environment and parameter settings
The experimental network is built using Python 3.8 and PyTorch 1.8.1. The hardware environment uses an Intel i5-9400F and an NVIDIA GeForce RTX 2060 SUPER with 16 GB of memory. The software environment includes Windows 10 and PyCharm. The number of initial channels in the network is preset to 16, and SGD is used to optimize the network weights W. The initial learning rate is 0.1, the momentum is 0.9, weight decay is applied, and the batch size is 16. In the search phase, the number of iterations is set to 100. After the optimal network model is found, 200 iterations are used to record its maximum accuracy. The hyperparameter settings are summarized in Table 1.
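To make the optimizer settings concrete, the following sketch applies SGD with momentum to a single scalar weight using the learning rate 0.1 and momentum 0.9 stated above. It follows the common formulation (the one PyTorch's SGD uses without dampening or Nesterov momentum). The function name is hypothetical, and `weight_decay` defaults to 0.0 only because its value is not given in this text.

```python
def sgd_momentum_step(weight, grad, velocity, lr=0.1, momentum=0.9, weight_decay=0.0):
    """One SGD-with-momentum update for a single scalar weight.

    weight_decay is an assumed parameter; its actual value is not stated here.
    """
    grad = grad + weight_decay * weight      # L2 weight decay folded into the gradient
    velocity = momentum * velocity + grad    # accumulate momentum
    weight = weight - lr * velocity          # parameter update
    return weight, velocity


w, v = 1.0, 0.0
w, v = sgd_momentum_step(w, grad=0.5, velocity=v)
first = (w, v)
w, v = sgd_momentum_step(w, grad=0.5, velocity=v)
second = (w, v)
print(first, second)
```

With a constant gradient, the second step is larger than the first because the velocity term carries over 90% of the previous update direction.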
The NAS-Bench-201 search space and the network based on MobileNet are used as a frame to construct the chain search space for verifying the effectiveness of the algorithm. Each deep convolution in the chain search space based on MobileNet contains convolution kernels of three different sizes {3, 5, 7} and expansion rates of two different sizes {3, 6}. In the mural search space, the constructed depth-separable edge selection module is integrated as a preset module. In the benchmark test of the NAS-Bench-201 and chain search spaces, four random number seeds {2, 4, 6, 8} are set to optimize the search space, and the search results are directly used to query

Comparison of model experiments on different datasets
The algorithm in this paper is designed to perform mural classification. In this section, its effectiveness in mural dynasty recognition is demonstrated on the mural dataset. To ensure fairness and show that the model does not overfit, the Caltech101 dataset is also used to verify the validity of the model.

Experimental verification on the mural dataset
Dunhuang murals from the literature [6] are used as the dataset. This dataset includes 9630 mural images from the Northern Wei Dynasty, Northern Zhou Dynasty, Sui Dynasty, Tang Dynasty, Five Dynasties and Western Wei Dynasty. The image distribution of murals from different dynasties is shown in Table 2, and example mural images from different dynasties are shown in Fig. 3. To verify the effectiveness of the method described in this paper, the proposed method is compared with other artificially designed network models and other NAS methods in a chain search space with the depth-separable edge selection module. The maximum accuracy, recall rate and precision rate are selected as evaluation indices. The recall rate is the ratio of the number of positive samples that were correctly classified to the total number of positive samples. The higher the recall rate is, the higher the probability that the correct mural dynasty is predicted.
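As an illustration of these evaluation indices, the sketch below computes accuracy, average recall and average precision from a multiclass confusion matrix. The macro-averaging over classes, the function name and the small three-class matrix are assumptions made for the example; the matrix is not data from this paper.

```python
def classification_metrics(cm):
    """Return (accuracy, macro recall, macro precision).

    cm[i][j] counts samples of true class i predicted as class j.
    Every class is assumed to have at least one true and one predicted sample.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(n))
    # Recall of class i: correct predictions divided by all true samples of i.
    recalls = [cm[i][i] / sum(cm[i]) for i in range(n)]
    # Precision of class i: correct predictions divided by all predictions of i.
    precisions = [cm[i][i] / sum(cm[r][i] for r in range(n)) for i in range(n)]
    return correct / total, sum(recalls) / n, sum(precisions) / n


# Made-up 3-class confusion matrix, for illustration only.
toy_cm = [
    [8, 2, 0],
    [1, 9, 0],
    [1, 0, 9],
]
accuracy, recall, precision = classification_metrics(toy_cm)
print(accuracy, recall, precision)
```

Recall is computed along the rows (true classes) and precision along the columns (predicted classes), so the two can differ even when the overall accuracy is the same.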
The experimental results of the algorithm reported in this paper and the artificially designed network models are shown in Fig. 4. As seen in the figure, the maximum accuracy of the proposed algorithm on the Dunhuang mural dataset is 88.10%, which is 6.42%, 4.85%, 8%, 9.04% and 17.94% higher than that of AlexNet [7], MobileNet_v2 [32], DenseNet [33], ResNet50 [10] and MNASNet [34], respectively. On the mural dataset, the recall rate and precision rate of the proposed algorithm also reach their highest values (87.52% and 87.69%, respectively), which are 5.88%, 3.21%, 7.04%, 8.46% and 15.83% higher than those of the five algorithms for the recall rate and 6.59%, 3.76%, 7.84%, 8.91% and 18.88% higher for the precision rate, respectively. These findings demonstrate the effectiveness of the proposed algorithm on the mural dataset. In addition, we investigated the dynasty recognition accuracy of the proposed method for murals from different dynasties, and the confusion matrix is shown in Fig. 5. Among the different dynasties, the recognition accuracy of the proposed method is the highest for the Sui Dynasty, reaching 100%, while those for the Northern Wei Dynasty, Northern Zhou Dynasty, Tang Dynasty, Five Dynasties and Western Wei Dynasty are 86.40%, 92.03%, 74.78%, 93.56% and 86.49%, respectively. The main reason for the low accuracy of Tang Dynasty mural recognition is that the style of Tang Dynasty murals is very similar to that of Five Dynasties murals. The recognition accuracies of murals from the Northern Zhou and the Five Dynasties are higher than the average accuracy rate.
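The per-dynasty figures can be checked with a few lines of arithmetic. The sketch below takes the six accuracies quoted above and compares each with their unweighted mean; treating the "average accuracy rate" as this unweighted mean is an assumption, since the text does not define it.

```python
# Per-dynasty recognition accuracies (%) as quoted in the text.
per_class = {
    "Northern Wei": 86.40,
    "Northern Zhou": 92.03,
    "Sui": 100.00,
    "Tang": 74.78,
    "Five Dynasties": 93.56,
    "Western Wei": 86.49,
}

# Unweighted mean over the six classes (an assumed definition of "average").
mean_acc = sum(per_class.values()) / len(per_class)
above = sorted(name for name, acc in per_class.items() if acc > mean_acc)
print(round(mean_acc, 2), above)
```

Under this reading the mean is about 88.88%, and the Sui Dynasty joins the Northern Zhou and Five Dynasties above it. The same three classes also exceed the overall maximum accuracy of 88.10%.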

Experimental verification on the Caltech101 dataset
The Caltech101 dataset contains 102 image categories (101 object categories plus one background category), including plane, water lily, and elephant, and the number of images in each category ranges from 40 to 800. An example image is shown in Fig. 7. To further verify the effectiveness of the algorithm, the classification ability of this method is evaluated on the Caltech101 classification dataset.
The comparison of the classification accuracy of different models is shown in Fig. 8. The recognition accuracy of different models is shown in Table 3. From Fig. 8 and Table 3, it can be seen that the classification accuracy of the AlexNet algorithm reaches its highest value of 73.62% after 118 iterations. The classification accuracy of the ResNet-50 algorithm reaches its highest value of 80.62% after 155 iterations. The classification accuracy of the MobileNetV2 algorithm reaches its highest value of 75.77% after 152 iterations. The classification accuracy of the DenseNet algorithm reaches its highest value of 82.84% after 166 iterations. The classification accuracy of the proposed algorithm reaches its highest value of 87.62% after 183 iterations. The accuracy of the proposed method is 4.78 percentage points higher than that of the DenseNet algorithm, which is the best-performing baseline, and 7.00 percentage points higher than that of ResNet-50.
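The margins can be verified directly from the peak accuracies quoted above. The short sketch below identifies the strongest baseline and computes the gap to the proposed method in percentage points; the variable names are illustrative only.

```python
# Peak accuracies (%) on Caltech101 as quoted in the text.
baselines = {
    "AlexNet": 73.62,
    "ResNet-50": 80.62,
    "MobileNetV2": 75.77,
    "DenseNet": 82.84,
}
proposed = 87.62

best_name = max(baselines, key=baselines.get)
# Gap between the proposed method and each baseline, in percentage points.
margins = {name: round(proposed - acc, 2) for name, acc in baselines.items()}
print(best_name, margins)
```

The 4.78-point gap corresponds to DenseNet, the strongest of the four baselines, while the gap to ResNet-50 is 7.00 points.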

Comparison of model experiments in different search spaces
Considering that other models are not designed for mural dynasty recognition, the CIFAR-10, CIFAR-100 and ImageNet16-120 datasets are used to fairly compare the effectiveness of the NAS method against that of the benchmark. The CIFAR dataset is a standard color image classification dataset containing natural scenes. The image size is 32 × 32 pixels. The training set and test set contain 50 K and 10 K images, respectively. The CIFAR-10 dataset contains 10 types of images. The CIFAR-100 dataset is similar to the CIFAR-10 dataset, but it contains 100 types of images. The ImageNet16-120 dataset contains a total of 151.7 K training images, 3 K validation images, and 3 K test images. The image size is 16 × 16 pixels, and there are a total of 120 classes. Some image categories in the CIFAR-10 dataset are shown in Fig. 9.

NAS-Bench-201 search space experimental verification
To show that the high recognition accuracy of the proposed model on the mural dataset is not caused by overfitting, the model is compared with eight different algorithms, namely, ASNG, MDE-NAS, DDPNAS, DSNAS, PC-DARTS, MIGO, EvNAS and TENAS, in the NAS-Bench-201 search space, and the results are shown in Table 4. For the CIFAR-10 dataset, the average test accuracy of this method is the highest at 93.26%, with a variance of 0.20. This result shows that the algorithm has good optimization ability, and the small variance shows that it is stable. The proposed algorithm also achieves average test accuracies of 70.73% and 45.34% on the CIFAR-100 and ImageNet16-120 datasets, respectively. Figure 10 shows a visual display of the recognition accuracies of different algorithms on the CIFAR-10, CIFAR-100 and ImageNet16-120 datasets in the NAS-Bench-201 search space.
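Results of the form "mean ± variance" are obtained by repeating the search with several random seeds and summarizing the resulting accuracies. The sketch below shows the computation with Python's `statistics` module. The accuracies are made-up placeholders rather than results from this paper, and the use of the population variance (instead of the sample variance) is an assumption.

```python
from statistics import mean, pvariance


def summarize(accuracies):
    """Return the mean and population variance of a list of accuracies."""
    return mean(accuracies), pvariance(accuracies)


# Seed -> test accuracy (%). Placeholder values for illustration only.
runs = {2: 90.0, 4: 91.0, 6: 92.0, 8: 93.0}
avg, var = summarize(list(runs.values()))
print(f"{avg:.2f} ± {var:.2f}")
```

With only four seeds, the sample variance (`statistics.variance`) would be noticeably larger than the population variance, so reported values are only comparable when the same estimator is used throughout.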

Chain search space experimental verification
Considering that the performance of other models on the mural dataset is poor, restricting the test data to the mural dataset may result in an unfair comparison. To address this, the algorithm proposed in this study is compared with four NAS algorithms, i.e., ASNG, MIGO, MDENAS and DDPNAS, to further demonstrate its search performance in the chain search space. The experimental results are shown in Table 5. Figure 11 is an intuitive display of the recognition accuracy of different algorithms on the CIFAR-10 and CIFAR-100 datasets in the chain search space. The comparison verifies the strong overall performance of the proposed algorithm, but a performance gap remains relative to individual comparison algorithms. For example, on the CIFAR-100 dataset, the average accuracy of the algorithm proposed in this paper is lower than that of the MIGO algorithm. However, on the CIFAR-10 dataset, the optimization result of the algorithm in this paper is superior, and the average accuracy reaches 88.64%.

Conclusion
This paper proposes an ancient mural dynasty recognition algorithm based on NAS, which removes the need to manually design a mural dynasty recognition network. To achieve this goal, a depth-separable edge selection module is designed to enhance the recognition performance of the network. Then, the depth-separable edge selection module is used as a candidate operation, and finally, the NAS algorithm is used to construct a complete neural network model for mural dynasty recognition.

However, based on the findings of this study, there is still work to do in the future. The accuracy of the model for mural dynasty identification can be improved to benefit future practical applications. Specifically, more advanced deep learning models should be explored and introduced, and the number of samples in the Dunhuang mural dataset should be further expanded to ensure that the model can accurately identify the dynasty information of murals across a wider range of scenarios. Furthermore, attention should also be given to accumulating diverse mural samples, including changes in lighting, angle, and size, to improve the robustness of the model. In addition, active cooperation with cultural heritage protection, art research and other fields is needed. By leveraging the knowledge and feedback of experts in these fields, the algorithm can be further optimized, and more reliable support can be provided for practical applications that protect and pass on humanity's valuable cultural heritage.

Fig. 1 The overall structure of the depth-separable edge selection module. Low-resolution and high-resolution features are obtained through depth-separable convolution of feature information F, the two are multiplied, and feature fusion is carried out according to element weighting. Multiple modules are combined to extract deep features

Fig. 2 The overall network structure proposed in this study

Fig. 4 Comparison of experimental results between the proposed method and other artificially designed network models using the mural dataset. AR indicates average recall

Fig. 6 Comparison of the dynasty recognition results of different NAS methods on the mural dataset

Fig. 7 Sample image from the Caltech101 dataset

Fig. 8 Comparison of the classification accuracy of different models on the Caltech101 dataset

Fig. 9 Examples of image categories in the CIFAR-10 dataset

Fig. 10 Comparison of the recognition accuracies of different architectures for the CIFAR-10, CIFAR-100 and ImageNet16-120 datasets in the NAS-Bench-201 search space

Fig. 11 Comparison of the recognition accuracies of different architectures for the CIFAR-10 and CIFAR-100 datasets in the chain search space

Table 2 Image distribution of murals across dynasties

Table 3 Recognition accuracy of different models on the Caltech101 dataset

Table 4 Comparison of the test accuracies of the proposed model and other NAS architectures in the NAS-Bench-201 search space (mean ± variance)

Table 5 Comparison of the test accuracies of the proposed algorithm and other NAS algorithms in the chain search space (mean ± variance)