Digital inpainting of mural images based on DC-CycleGAN

Located in Dunhuang, northwest China, the Mogao Grottoes are a cultural treasure of China and the world. However, after more than 2000 years of weathering and destruction, many murals have faded and been damaged, and this treasure of human art is in danger. Mural inpainting through deep learning can permanently preserve mural information. Therefore, a digital restoration method combining Deformable Convolution (DCN), ECANet, ResNet and the Cycle Generative Adversarial Network (CycleGAN) is proposed. We name it DC-CycleGAN. Compared with other digital image inpainting methods, the proposed DC-CycleGAN-based mural image color inpainting method has better inpainting effects and higher model performance. Compared with the current repair network, the Frechet Inception Distance (FID) value and the structural similarity (SSIM) value are improved by 52.61% and 7.08%, respectively. Image color inpainting of Dunhuang murals can not only protect and pass on Chinese culture, but also promote academic research and development in related fields.


Introduction
Dunhuang murals are a cultural and artistic treasure for China and the people of the world. In the 1940s and 1950s, the problem of the fading and discoloration of murals was noticed, and people could only use copying to restore mural colors. With the manual copying method, the restoration must be based on what is written in the ancient texts and the degree of fading of the murals. This method is time-consuming and requires several visits to the site, which is a tremendous challenge for conserving the murals and can even cause secondary damage.
In the 1980s and 1990s, computer image processing technology gradually replaced manual work and became the mainstream. This technology could be used for the virtual restoration of murals, mural conservation, mural copying, determining the virtual evolution of the mural disease process, and realistic virtual displays of murals [1]. Initially, the software used for mural color restoration could only quickly correct the photo hue and adjust the hue, color saturation, and brightness of the picture [2]. Early neural network restoration algorithms could only achieve a preliminary restoration of mural colors. Li et al. [3] proposed a model based on compressive total variation (CTV) that uses a priori information such as texture and structure to describe both the detailed information and the overall structure of an image. Yao et al. [4] introduced domain and structure optimization measures based on the Criminisi algorithm to solve the problem of mismatching. However, restoring mural colors in this way is a very tedious task that requires multiple software programs to work together to recover the original mural colors. Thus, the large-scale restoration of mural colors is not achievable with these methods.
The purpose of this paper is to apply color style transfer techniques to the color restoration of mural images. Specifically, we utilize the CycleGAN network embedded with ECANet modules and deformable convolution (DCN) kernels to transfer the color style from reference mural images to grayscale mural images, thus accomplishing the color restoration of mural images. Through experiments, we have validated the feasibility of this method.
*Correspondence: Zhigang Xu, xzg_cn@163.com. School of Computer and Communication, Lanzhou University of Technology, No.36 Pengjiaping Road, Qilihe District, Lanzhou 730050, Gansu, China
The rest of the paper is organized as follows: Sect. "Related studies" introduces the related work. Sect. "DC-CycleGAN" describes the proposed method. Sect. "Experimental analysis" presents the experiments conducted in this study. Finally, Sect. "Conclusions" provides a summary of the research topic.

Related studies
With the rapid development of artificial intelligence and deep learning technology, more and more fields are applying intelligent technology [5]. Deep learning has made significant breakthroughs in target recognition, target classification, mural segmentation, and target tracking [6]. For example, Qin et al. [7] proposed a restoration model based on multiscale attention networks to improve the authenticity of restored images by introducing multiscale attention groups. Zeng et al. [8] proposed a restoration network based on a context encoder to complete the restoration of broken images by encoding the contextual semantics of full-resolution inputs. Iizuka et al. [9] improved the local clarity of the repaired image by introducing a global discriminator and a local discriminator. Yan et al. [10] added a shift connection layer to the U-net model and introduced bootstrap loss to the decoder features to improve the accuracy of the repaired image. Zeng et al. [11] used a deep convolutional neural network to generate a rough repair map of the broken image. They then used nearest-neighbour pixel matching for controlled restoration, resulting in a more realistic image with richer high-frequency detail. Hu et al. [12] investigated an adaptive color-reduction method so that the reduced image retained both the general color information and the local texture information of the original picture. Gatys et al. [13] used convolutional neural networks to separate an image's structural and color-texture information before texture synthesis. Wang et al. [14] extracted advanced image features and optimised the perceptual loss function by training residual networks to generate high-quality images. Justin et al. [15] used a feedforward network architecture to transfer color from a given image to a target image. Zhang et al. [20] provided comprehensive empirical evidence showing that residual networks are easier to optimize and can gain accuracy from considerably increased depth.
How to permanently preserve mural culture and how to save these endangered artistic treasures are the shared goals of every generation of scholars. In the twenty-first century, many scholars have aimed to restore and preserve murals digitally. Xie et al. [27] proposed a better sample block-based image restoration algorithm and a sustainable restoration technique for digital virtualization. Gao [28] proposed a virtual restoration method based on minimum spanning trees to restore mural color. Fu et al. [29] proposed a novel enhanced white-out interactive system for mural image restoration. Zhou et al. [30] proposed an intelligent restoration technique for digital images of murals based on machine learning algorithms. Jiang et al. [31] proposed a computer-aided mural restoration solution based on brilliant line drawing generation.

DC-CycleGAN
Dunhuang murals, as the epitome of human civilization, have encountered various issues over time, such as severe damage and lack of color information. Moreover, each mural image is unique and requires special attention in its digital restoration: every detail must be carefully repaired. In 2017, Zhu et al. [17] proposed the CycleGAN network based on the principle of cycle consistency. Inspired by this idea, we designed the DC-CycleGAN model in this paper, making modifications to the original generator and discriminator. By adhering to the principle of cycle consistency, we accomplished the color restoration of mural images.
While traditional CycleGAN networks perform well in image translation tasks, they may encounter challenges in the color restoration of complex mural images. This is because mural images often possess intricate textures and details, and the lost color information is difficult to recover. Therefore, in our research, we introduce both ECANet modules and deformable convolution kernels to enhance the feature extraction and color restoration capabilities of the CycleGAN network for mural images (Fig. 1).

Efficient channel attention network
Most existing attention methods aim to develop more complex attention modules to achieve better performance, which inevitably increases the complexity of the model. This requires more parameters during the training process and significantly reduces efficiency. To overcome the trade-off between performance and complexity, in 2020, Wang et al. [16] proposed the Efficient Channel Attention Network (ECANet), which learns channel attention weights to adjust the importance of different channels for better modeling of image details and semantic information. Drawing inspiration from this idea, in this paper, we propose the ECANet-CycleGAN network. We add ECANet modules to the third layer of the encoder and decoder in the generator of the CycleGAN network, and also add ECANet modules to the third layer of the discriminator.
Through experiments, we have observed that this model achieves a certain improvement in feature extraction capabilities. Figure 2 is a schematic of ECANet.
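To make the channel attention step concrete, the following is a minimal NumPy sketch of an ECA-style block, not the implementation used in this paper. It assumes the usual ECA recipe: global average pooling, a 1-D convolution across neighbouring channels, a sigmoid gate, and an element-wise product with the input. The function names and the defaults `gamma=2` and `b=1` are illustrative assumptions.

```python
import math
import numpy as np

def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive 1-D kernel size: (log2(C) + b) / gamma, truncated and made odd."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1

def eca_forward(x, weights):
    """Apply channel attention to a feature map x of shape (C, H, W).

    weights is a 1-D array of odd length k shared by all channels
    (hypothetical values here; in a real network they are learned).
    """
    k = len(weights)
    pooled = x.mean(axis=(1, 2))                          # global average pooling -> (C,)
    padded = np.pad(pooled, k // 2)                       # zero padding keeps length C
    mixed = np.correlate(padded, weights, mode="valid")   # local cross-channel interaction
    attention = 1.0 / (1.0 + np.exp(-mixed))              # sigmoid gate in (0, 1)
    return x * attention[:, None, None]                   # element-wise product per channel
```

Because the only learned parameters are the `k` weights of the 1-D convolution, the block adds very little to the model size, which is the trade-off the paragraph above describes.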

The deformable convolution
Natural images are relatively smooth, whereas the brush strokes in mural images give them a rich and delicate texture. Traditional convolution cannot describe the texture in mural images well. Therefore, a convolution with a more flexible receptive field is needed for better feature extraction of object shape and size. In 2017, Dai et al. [19] proposed Deformable Convolutional Networks (DCN), which enhance the modeling capability for object deformation and local details by adding learned offsets to the sampling locations of the convolutional kernel, so that its receptive field can enlarge and adapt to the image content. Inspired by this idea, we propose the DCN-CycleGAN model in this paper, where we replace the 4-layer convolutional kernel in the encoder of the generator with deformable convolutional kernels. Experimental results demonstrate that this model can better adapt to objects of different scales and shapes, improving the model's perceptual and expressive capabilities. Compared with the original CycleGAN network, the proposed method greatly improves the detail extraction of mural images and can better restore the brush strokes and artistic styles in mural images.
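As an illustration of how a deformable kernel samples its input, the sketch below computes the response of a 3 × 3 deformable convolution at a single output location. It is a simplified single-channel example under stated assumptions: the offsets are passed in directly (in a deformable convolution layer they are predicted by an additional convolution), and the helper names are hypothetical.

```python
import numpy as np

def bilinear_sample(img, y, x):
    """Sample a 2-D array at a fractional (y, x); positions outside the image read as 0."""
    h, w = img.shape
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    value = 0.0
    for yi in (y0, y0 + 1):
        for xi in (x0, x0 + 1):
            if 0 <= yi < h and 0 <= xi < w:
                weight = (1.0 - abs(y - yi)) * (1.0 - abs(x - xi))
                value += weight * img[yi, xi]
    return value

def deformable_conv_at(img, kernel, offsets, cy, cx):
    """Response of a 3x3 deformable convolution centred at (cy, cx).

    offsets has shape (3, 3, 2) and holds one (dy, dx) shift per kernel tap.
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i, j]
            # regular grid position plus the per-tap offset
            out += kernel[i, j] * bilinear_sample(img, cy + i - 1 + dy, cx + j - 1 + dx)
    return out
```

With all offsets set to zero the function reduces to an ordinary 3 × 3 convolution at that location; non-zero offsets let each tap move toward strokes or contours that a fixed grid would miss.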

Loss function
The traditional GAN [18] is composed of a generator and a discriminator: the generator tries to generate a more realistic image of the mural, and the discriminator judges the authenticity of the mural. The two compete with each other to achieve dynamic equilibrium. The loss function of the GAN can be described as follows:
$$\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right] \tag{1}$$
where $x$ is a real image, $z$ is the input to the generator $G$, and $D$ is the discriminator.
The task that CycleGAN needs to accomplish is to migrate the style of the mural colors in domain A to domain B. Because two GANs are used to achieve asymmetric training of the data, not only do the adversarial loss functions of the two GANs need to be calculated, but a cycle-consistency loss function also needs to be defined to constrain the whole network. The sum of these two loss functions is the loss function of CycleGAN.
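For reference, the standard CycleGAN objective of Zhu et al. [17] can be written as below. This is a sketch of the commonly used formulation rather than the exact losses of this paper: the symbols $G_{AB}$, $G_{BA}$, $D_A$, $D_B$ and the weight $\lambda$ are notational assumptions, and the text above describes a plain sum, which corresponds to $\lambda = 1$.

```latex
% Adversarial loss for the mapping A -> B (the B -> A term is symmetric)
\mathcal{L}_{GAN}(G_{AB}, D_B) =
  \mathbb{E}_{b \sim p_{data}(b)}\left[\log D_B(b)\right]
  + \mathbb{E}_{a \sim p_{data}(a)}\left[\log\left(1 - D_B(G_{AB}(a))\right)\right]

% Cycle-consistency loss: translating there and back should return the input
\mathcal{L}_{cyc}(G_{AB}, G_{BA}) =
  \mathbb{E}_{a \sim p_{data}(a)}\left[\lVert G_{BA}(G_{AB}(a)) - a \rVert_1\right]
  + \mathbb{E}_{b \sim p_{data}(b)}\left[\lVert G_{AB}(G_{BA}(b)) - b \rVert_1\right]

% Full objective
\mathcal{L} = \mathcal{L}_{GAN}(G_{AB}, D_B) + \mathcal{L}_{GAN}(G_{BA}, D_A)
  + \lambda \, \mathcal{L}_{cyc}(G_{AB}, G_{BA})
```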
In summary, the main contributions of this paper are as follows:
A) Given the complex details and rich colors of mural images (and of their grayscale versions), we adopt a deformable convolution (DCN) strategy to better fit the detail and color characteristics of mural images and achieve better digital restoration results.
B) Given the insufficient performance of generative adversarial networks in learning mural details, we propose a CycleGAN network with ECANet to improve the overall performance of the model, optimize the extraction of image detail features and color features, and obtain better results.

Experimental analysis
In this section, we present our experiments in five parts and discuss the results.

Datasets
The dataset for this paper comes from the books Dunhuang Architecture Research [32], The Complete Collection of Dunhuang Grottoes [33], Interpreting Dunhuang-Creating Dunhuang [34], and The Complete Collection of Chinese Dunhuang Murals. Beiliang Northern Wei [35], from which we selected 1236 well-preserved mural images to use as the dataset. Relatively complete Dunhuang mural images are few and far between, so the dataset of this paper was built by scanning relatively complete images from these books. Because deep learning requires a large number of images, this paper uses a Python image-cropping script to expand the dataset. The artistic style of Chinese mural images is different from that of paintings from other countries, and there is often a lot of white space in the works, so cropping may produce images that are mostly blank. We manually screen the cropped images and keep those with image features as the dataset, as shown in Fig. 4. The training set of DC-CycleGAN includes two parts, trainA and trainB, and the test set also includes two parts, testA and testB. However, during the restoration process, we found that if we use original mural images, i.e., faded or damaged mural images, inadequate feature extraction and color deviation occur during restoration. Therefore, this paper uses DC-CycleGAN to color de-colored murals and takes the de-colored murals as the subject of restoration. All the images of the dataset are completely de-colored, so that the influence of image noise on the repair effect is reduced as much as possible. The de-colored murals in the dataset come from two sources:
1. Murals de-colored by artificial means. This paper uses Photoshop to modify the color scale, contrast and hue of the image, so as to remove the color.
2. Black and white mural images found by consulting different materials, such as the black and white mural images in the collection of the British Museum. The last part of the mural images comes from the black and white mural images in the book Century Dunhuang [36].
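The cropping and screening steps described above can be sketched as follows. This is a hedged illustration only: the paper screens tiles manually, whereas the sketch substitutes a simple near-white threshold, and the function names and the values `white_level=240` and `max_blank_fraction=0.8` are assumptions chosen for the example.

```python
import numpy as np

def slice_into_tiles(image, grid=3):
    """Split an (H, W, 3) image into grid x grid tiles (nine tiles for grid=3)."""
    h, w = image.shape[:2]
    th, tw = h // grid, w // grid
    return [image[r * th:(r + 1) * th, c * tw:(c + 1) * tw]
            for r in range(grid) for c in range(grid)]

def is_mostly_blank(tile, white_level=240, max_blank_fraction=0.8):
    """Flag tiles in which most pixels are near-white in every channel."""
    near_white = (tile >= white_level).all(axis=-1)
    return near_white.mean() > max_blank_fraction

def keep_informative_tiles(image, grid=3):
    """Return only the tiles that are not dominated by white space."""
    return [t for t in slice_into_tiles(image, grid) if not is_mostly_blank(t)]
```

An automatic filter of this kind would only be a first pass; a manual check is still needed for tiles whose white space is part of the composition.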
The training set is arranged such that the colored images are stored in the trainA folder and the de-colored images are stored in the trainB folder. Likewise, in the test set the colored images are placed in the testA folder and the de-colored images in the testB folder.
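A minimal sketch of how one colored tile could be paired with its de-colored counterpart under this folder layout is given below. Note that the paper de-colors images in Photoshop; the luma-weighted grayscale conversion used here is a swapped-in stand-in, and `decolorize` and `make_pair` are hypothetical helper names.

```python
import numpy as np

def decolorize(rgb):
    """Convert an (H, W, 3) uint8 RGB image to 3-channel grayscale (BT.601 luma weights)."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = np.rint(rgb.astype(np.float64) @ weights).clip(0, 255).astype(np.uint8)
    return np.stack([gray, gray, gray], axis=-1)

def make_pair(name, rgb, split="train"):
    """Map the colored image to <split>A/ and its de-colored version to <split>B/."""
    return {f"{split}A/{name}": rgb, f"{split}B/{name}": decolorize(rgb)}
```

Calling `make_pair("tile_0001.png", tile, split="test")` would produce the corresponding testA and testB entries.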
For the generated image, we also need the discriminator in the network to determine whether it is a real image or not. Since the whole network contains two generators and two discriminators that form a ring, the cycle consistency principle is used to form a cycle generative adversarial network. In the training process, batch_size is 2, and the learning rate is 0.0002. In the testing phase, the corresponding images are put into the corresponding test folders, and the generator weights obtained in the training phase are used to generate the final repaired image.

Analysis of results
The training and testing phases of this paper used PyTorch 1.9.0 and CUDA 11.3 on Ubuntu 20.04 LTS; the CPU was an Intel(R) Xeon(R) Gold 6330 CPU @ 2.00 GHz, and the graphics cards were two NVIDIA RTX 3090 with 24 GB of video memory each. The loss function curve during the experiments helps in understanding the issues encountered in training. In this study, we set the number of iterations to 10,000, and the training loss function of the model is depicted in Fig. 6.
From Fig. 6, it can be seen that as the model training progresses, the overall trend of the loss function is decreasing, but there are fluctuations. This demonstrates that the generator and discriminator are continuously engaged in a game. In particular, the loss function curve of DC-CycleGAN exhibits more pronounced fluctuations compared to the original CycleGAN, indicating a more intense adversarial process between the generator and discriminator in DC-CycleGAN. In summary, DC-CycleGAN has a stronger perception ability for image details compared to the original CycleGAN, making it more suitable for color restoration of mural images.
This experiment used objective evaluation for comparison. Objective assessments of image quality can be divided into the following categories:
1. Full-reference image quality evaluation, which evaluates the restored mural images against the original mural images.
2. Reduced-reference image quality evaluation, which compares the restored murals with information from a specific part of the original mural images and then evaluates the quality.
3. No-reference image quality evaluation, which evaluates the quality of the restored murals based only on some image characteristics.
The FID characterises the distance between the real images and the generated images in the feature space, and a lower FID score represents a higher-quality generated image. The calculation formula is shown in Eq. (4):
$$\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \mathrm{Tr}\left(\Sigma_r + \Sigma_g - 2\left(\Sigma_r \Sigma_g\right)^{1/2}\right) \tag{4}$$
where $\mu_r$ and $\Sigma_r$ are the mean and covariance matrix of the real dataset; $\mu_g$ and $\Sigma_g$ are the mean and covariance matrix of the generated dataset; and $\mathrm{Tr}$ denotes the trace, i.e., the sum of the elements on the diagonal of the matrix.
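The Fréchet distance between two sets of feature statistics can be computed with a few lines of NumPy, as sketched below. The sketch assumes the feature vectors have already been extracted (for FID they normally come from an Inception network, which is not included here), and it evaluates the trace of the matrix square root from the eigenvalues of the covariance product. The function names are illustrative.

```python
import numpy as np

def feature_statistics(features):
    """Mean vector and covariance matrix of an (N, D) array of feature vectors."""
    return features.mean(axis=0), np.cov(features, rowvar=False)

def frechet_distance(mu_r, sigma_r, mu_g, sigma_g):
    """Frechet distance between Gaussians N(mu_r, sigma_r) and N(mu_g, sigma_g)."""
    diff = mu_r - mu_g
    # The product of two positive semi-definite matrices has real, non-negative
    # eigenvalues up to numerical error, so Tr(sqrt(M)) = sum(sqrt(eigenvalues)).
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * trace_sqrt)
```

Identical statistics give a distance of zero, and the value grows as either the means or the covariances of the two image sets drift apart.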
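The SSIM compares the brightness, contrast and structure of two images to derive the similarity between them. The SSIM result is a number ranging from 0 to 1; the larger the value, the smaller the difference between the images. The calculation formula is shown in Eq. (5):
$$\mathrm{SSIM}(x, y) = \frac{\left(2\mu_x\mu_y + C_1\right)\left(2\sigma_{xy} + C_2\right)}{\left(\mu_x^2 + \mu_y^2 + C_1\right)\left(\sigma_x^2 + \sigma_y^2 + C_2\right)} \tag{5}$$
where $\mu_x$ and $\mu_y$ are the means of images $x$ and $y$, $\sigma_x^2$ and $\sigma_y^2$ are their variances, $\sigma_{xy}$ is their covariance, and $C_1$ and $C_2$ are small constants that stabilise the division.
The MSE is an evaluation index between the predicted and actual values, whose range is [0, +∞). When the predicted and actual values match exactly, the result is equal to 0, indicating a perfect model; the larger the error between the predicted and actual values, the larger the MSE. The calculation formula is shown in Eq. (6):
$$\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2 \tag{6}$$
where $y_i$ is the actual value, $\hat{y}_i$ is the predicted value, and $n$ is the number of values compared.
The RMSE is the square root of the MSE, $\mathrm{RMSE} = \sqrt{\mathrm{MSE}}$.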
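The three pixel-level metrics can be sketched in NumPy as follows. The SSIM here is a simplified single-window version computed over the whole image, whereas common implementations average the same expression over local sliding windows; the constants `k1=0.01` and `k2=0.03` and the function names are assumptions for illustration.

```python
import numpy as np

def mse(reference, restored):
    """Mean squared error between two images of the same shape."""
    diff = reference.astype(np.float64) - restored.astype(np.float64)
    return float(np.mean(diff ** 2))

def rmse(reference, restored):
    """Root mean squared error, the square root of the MSE."""
    return float(np.sqrt(mse(reference, restored)))

def global_ssim(x, y, data_range=255.0, k1=0.01, k2=0.03):
    """SSIM evaluated once over the whole image (no sliding window)."""
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    c1, c2 = (k1 * data_range) ** 2, (k2 * data_range) ** 2
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov_xy = ((x - mu_x) * (y - mu_y)).mean()
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(numerator / denominator)
```

Comparing a restored tile with its reference through these functions gives values on the same scales as those discussed above: 0 for a perfect MSE and RMSE, and 1 for a perfect SSIM.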

Subjective evaluation
To evaluate the method used in this paper, after 10000 rounds of training, the results generated by this method were compared with those of CycleGAN, ArtFlow [24], ChromaGAN [25] and UGAN [26]. The comparison results are shown in Fig. 7, from which the following conclusions can be drawn. The ArtFlow restoration in Fig. 7c has inadequate and incomplete color restoration, ghosting in the background, an overall light color, and color deviation in some places. The ChromaGAN restoration in Fig. 7d is dark overall, so it is also not perfect. The UGAN restoration in Fig. 7e has a more significant problem: the result is yellowish overall, and it is the worst of all of the methods used. Our result in Fig. 7f is closer to the actual mural image. The overall tone and the original style have been largely retained, and the color is fuller than in the other pictures. The details of the mural images are also preserved, and no ghosting occurs. Although it does not achieve the same effect as the reference image, the proposed method is still superior to the previous methods in terms of color inpainting and detail inpainting, and its details and colors are very close to the reference images. In summary, the proposed method is feasible.
The comparison of the FID values, SSIM values, MSE values, and RMSE values can better show the results of the various methods for mural image restoration. Table 1 shows the evaluation index data for the methods compared in this paper, including ArtFlow, ChromaGAN and UGAN.
As can be seen from Table 1, the method in this paper improves the FID index, i.e., it enhances the quality of image generation. It also markedly improves the SSIM value, bringing it closer to 1, i.e., it makes the error between images smaller. The MSE and RMSE values are also substantially reduced, which indicates a smaller error between the predicted and actual values. Among them, separate convolution can bring higher quality improvement while effectively reducing the number of parameters and the model size, because lowering the number of model parameters helps to reduce overfitting and thus improves accuracy.

Ablation experiments
To verify the effect of this improvement on the network model and its accuracy, four configurations were compared: the original CycleGAN (CycleGAN), the ECANet added to the generator only (net_G), the ECANet added to the discriminator only (net_D), and the ECANet added to both the generator and the discriminator (Ours). The comparative experimental results are shown in Fig. 8. As can be seen from Fig. 8c (CycleGAN), the restoration result has mosaic artifacts at the edge of the picture, and the details of the ring in the figure are not repaired. In summary, the repair results of the original CycleGAN are not ideal. Fig. 8d (net_G) shows the repair result after adding the ECANet to the generator only: the overall picture is dark, and the details are not completely repaired. Attending more closely to the detail features of the image improves the detail characteristics of the repaired image. Although there is still a certain gap from the reference image, the result of the proposed method is very close to it. In summary, the proposed method is successful in the color restoration of mural images. Table 2 shows the evaluation metrics data for CycleGAN, CycleGAN_netG, CycleGAN_netD, and the method proposed in this paper.
A) After adding the ECANet to the generator or discriminator only, the restoration effect and evaluation index values are not satisfactory and are even higher than those of the original CycleGAN method without the ECANet.
B) After adding the ECANet to both the generator and discriminator, the results are greatly improved: the FID value is much lower than that of the original CycleGAN method, the SSIM value is closer to 1, and the MSE and RMSE values are smaller. Therefore, the method proposed in this paper is better for recovering mural images.

Comparison experiments
In reality, most murals only have degraded or missing colors, so this paper designed an experiment that performs color restoration using faded murals as the restoration base plate and compares the results with those obtained using de-colored murals as the base plate. A comparison of the experimental results is shown in Fig. 9. Fig. 9c (Actual) shows the images restored using an actual mural image as the base plate, and Fig. 9d (Ours) shows the images restored using a de-colored mural image as the base plate. The images in Fig. 9c are cloudy and dim, whereas the images in Fig. 9d are brighter and clearer. In terms of details, the restored images in Fig. 9c show ghosting and unsuccessful colorization, such as the parts in the first and second sets of red boxes, and some restoration results have a color cast. In contrast, the inpainting results in Fig. 9d are repaired to the greatest extent in color and improved in details. Although there is still a gap between the inpainting result and the reference image, the result is very close to the reference image. Table 3 shows the evaluation indicators of the restoration results obtained using the authentic mural as the base plate and using the de-colored mural as the base plate.
It can also be concluded from the evaluation indexes that, when authentic murals are used as the restoration base plate, none of the indexes is better than those of the method proposed in this paper.
Over 10000 iterations of experiments, the time overhead of the restoration method using real murals as the base plate increased by 35%. In summary, the method using de-colored murals as the restoration base plate is feasible and outperforms the method using real murals as the restoration base plate in terms of restoration results and time overhead.

Application experiment
For the rigor of the experiments in this paper and for future application scenarios, we specially designed an application experiment. The application experiment takes real, randomly selected black and white de-colored murals as a dataset and directly conducts color inpainting experiments on them. The black and white de-colored murals selected randomly in this paper are partly from Century Dunhuang and partly from the black and white Dunhuang mural images collected by the British Museum.
After simple processing, each of these images was split into nine tiles to form the test dataset. Fig. 10 shows the experimental results.
From these experimental results, it can be observed that random black and white mural images have issues such as image blur and lack of detail.

Conclusions
In this paper, we propose a DC-CycleGAN method for mural image restoration that enhances the network's ability to extract the color of mural images and to obtain more realistic mural restoration images. The method outperforms the mainstream methods in terms of generation results: the FID index is significantly improved, the SSIM value is closer to 1, the MSE and RMSE are smaller than those of the other methods mentioned in this paper, and the details of the figures are restored to the maximum extent. Additionally, the lines are more precise, and the image background is cleaner. The digital restoration of murals is a long-term task that requires continuous iterations to gradually restore them to their original appearance. Although the method in this paper has achieved good results, it still has the following shortcomings:
1. The restoration of the color of the mural by our method is based entirely on the existing murals and on damaged copies made by researchers, so the randomness is very strong.
2. Our method can only approach the reference image. To go beyond the reference image, a lot of research and training is needed.
3. The existing Dunhuang murals are relatively rare, and the dataset is far smaller than those typically used to train deep learning networks.
In view of the above three problems, we will continue in-depth research in the future to solve them as soon as possible. Finally, we want to apply our method to more academic research, combine our approach with in-depth theoretical research, and make more academic contributions. We will maximise the advantages of our system and elevate the study of murals to the realm of cultural studies.
Figure 3 below shows the diagram of the Deformable Convolution kernel.

Fig. 3 Illustration of 3 × 3 Deformable Convolution

As shown in Fig. 4(a), after each work is processed into slice images, it is necessary to divide the training set and the test set. As shown in Fig. 4(b), the slice images with a lot of white space are labeled and deleted. Finally, through the above method, each image was sliced into nine parts, and a dataset of 10800 images was obtained. The training set and test set were divided at a ratio of 5:1, with 9000 images in the training set and 1800 in the test set, and all images were unified as 1024 × 1024 PNG pictures. The content of the training and test sets includes Buddha paintings, sutra change paintings, human portraits, decorative paintings, and story paintings.
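The 5:1 division reported above can be reproduced with a short standard-library sketch. The shuffling and the fixed seed are assumptions added for the example; the paper does not state how the images were assigned to each set.

```python
import random

def split_dataset(items, train_parts=5, test_parts=1, seed=0):
    """Shuffle the items and split them at a train_parts:test_parts ratio."""
    items = list(items)
    random.Random(seed).shuffle(items)   # seeded so the split is repeatable
    n_train = len(items) * train_parts // (train_parts + test_parts)
    return items[:n_train], items[n_train:]

# 10800 image indices -> 9000 for training and 1800 for testing
train_ids, test_ids = split_dataset(range(10800))
```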

As shown in Fig. 5, (a) shows the images stored in the trainA and testA folders, and (b) shows the images stored in the trainB and testB folders.

Fig. 4 Schematic diagram of making mural image slices

Fig. 5 Some examples from the dataset, where a and b are the nine segmented images, respectively

Fig. 6 Loss function curve

Fig. 8e (net_D) shows that adding the ECANet to the discriminator only has the same problem as the original CycleGAN: mosaic artifacts appear in the repair results, and the detail restoration is poor. Fig. 8f (Ours) shows that the method in this paper is ahead of the above methods in detail restoration, and fully restores the ribbon of the figure, the facial expression of the figure, and the details in the picture. Although the original CycleGAN has been very successful in color inpainting, it still lacks in details. The method in this paper adds the corresponding attention mechanism on this basis, and changes the ordinary convolution to the deformable convolution (DCN). The whole network pays more attention to the detail features of the image.

Fig. 7 Comparison results of different methods

Fig. 8 Comparison results of ablation experiments

Fig. 9 Comparison between the restoration of the actual mural image as a base plate and the restoration of the method in this paper

Fig. 10b (CycleGAN) shows the results using the original CycleGAN model, with an overall lack of image clarity and a color bias. Fig. 10c (ArtFlow) shows the results using the ArtFlow model: the overall color is biased toward blue, and the image details are not sufficiently recovered. Fig. 10d (ChromaGAN) shows the results using the ChromaGAN model, with an overall yellow color bias and inadequate color detail restoration. Fig. 10e (Ours) shows the results of the method proposed in this paper, which accomplishes the color restoration task and achieves excellent results even when faced with random grayscale images. Based on the above, it can be concluded that the method in this paper can be effectively applied to practical research.

Fig. 10 Experiments on color inpainting of random black and white mural images

Table 1 Comparison experimental FID indicators

Table 2 Ablation experiment evaluation indicators

Table 3 Evaluation indexes of actual design experiments