Zero-sample mural superresolution reconstruction for enhanced perceptual quality

Abstract

Aiming at the texture loss and poor perceptual quality of low-resolution mural images, this paper proposes a zero-sample mural superresolution reconstruction method, EPZSSR, that enhances perceptual quality by training an image-specific model on the input image itself. The algorithm takes the zero-shot superresolution method as its framework: it randomly crops the original image into 128 × 128 patches, applies Gaussian blur, downsamples the smoothed image with Lanczos interpolation to reduce artifacts, and optimizes the network structure with convolutional attention modules and skip connections. SmoothL1Loss is used to enhance the robustness of the model, and the PI value is introduced as the perceptual quality evaluation index. The experimental results show that, compared with other superresolution reconstruction algorithms, the peak signal-to-noise ratio of the proposed algorithm is higher by 0.98–3.23 dB on average, giving better mural texture reconstruction; the PI value is lower by 0.56 on average, giving better perceptual quality; and the running time is shorter by 89.68 s on average. The method therefore has practical value for mural superresolution reconstruction.

Introduction

In recent years, superresolution reconstruction technology has been widely used in many fields, including high-definition television (HDTV), image compression, infrared imaging [1], remote sensing imaging [2], medical imaging [3], sonar imaging [4], white blood cell imaging [5], video perception [6] and surveillance security [7]. The content of the Dunhuang murals is rich and fantastical, and their scenery is varied. With unique artistic techniques, they achieve an organic combination of religious culture and art, showing a distinct and singular beauty. As a masterpiece of culture and art, they represent the highest achievement of traditional Chinese mural art in their grand momentum and lofty historical and cultural value. To bring this cultural treasure further attention, development and utilization, mural superresolution reconstruction is of great significance.

Single-image superresolution reconstruction (SISR) [8] is a low-level vision task that reconstructs a high-resolution (HR) image with clear texture details from a single low-resolution (LR) image. Superresolution reconstruction algorithms fall into two families: traditional algorithms and deep learning algorithms. Traditional SISR methods are of three kinds: interpolation-based [9], reconstruction-based [10] and machine learning-based [11] single-image superresolution algorithms. Deep learning algorithms are mainly of two types: supervised and unsupervised SISR methods. Supervised SISR methods train a model on paired LR and HR images. They require large-scale datasets, and assembling the training sets consumes considerable manpower and material resources. Degradation estimation is time-consuming and carries estimation error, which leads to unsatisfactory reconstruction; moreover, such models depend heavily on a fixed degradation, and their performance drops sharply when the real degradation does not match it. Unsupervised SISR methods do not need paired LR and HR images, and the model can learn the image degradation found in real scenes. In 2014, Dong et al. [12, 13] proposed SRCNN, the first use of convolutional neural networks for superresolution tasks; since then, learning-based superresolution reconstruction has been a research hotspot. Before ResNet [14] was proposed, the academic community generally believed that the more network layers there were, the more complete the extracted image features and the better the learning effect; in practice, as the network deepens, the model is prone to gradient dispersion and accuracy degradation. In 2015, He et al. [14] proposed ResNet, which uses residual learning (RL) to solve gradient dispersion and accuracy degradation in deep networks. Kim et al. [15] proposed VDSR, which uses local residual connections to optimize the network structure and, to a certain extent, solves the information loss caused by overly deep stacks of traditional convolutional layers. In 2016, Shi et al. [16] proposed ESPCN based on SRCNN, using subpixel convolution for upsampling. Dong et al. [17] improved SRCNN into FSRCNN, which uses deconvolution for upsampling; by adding shrinkage and expansion layers, the number of calculations is significantly reduced and the network runs much faster. Kim et al. [18] proposed DRCN, which uses recursion to enlarge the receptive field and global skip connections to share low-frequency information, increasing network depth while limiting the number of parameters and improving performance. In 2017, inspired by ResNet, VDSR and DRCN, Ying et al. [19] proposed DRRN, which introduces global and local residual learning to preserve high-frequency information during network operation. Lim et al. [20] proposed EDSR and MDSR, which remove batch normalization (BN) and use a single network for multiscale superresolution to reduce computing resource consumption. Zhang et al. [21] proposed RCAN, which uses channel attention (CA) to strengthen the network's feature extraction. Also in 2017, Tong et al. [22] proposed dense skip connections, which link layers directly to shorten gradient paths and fuse information from different levels. In 2020, Guo et al. [23] proposed a dual regression scheme for single-image superresolution that uses dual regression mappings to estimate the downsampling kernel and reconstruct LR images; the network can learn directly from LR images, and the proposed SR model adapts to real data. In 2021, Tao et al. [24] performed SR reconstruction of LR images degraded by arbitrary blur kernels through precise kernel estimation, reducing the blur-kernel estimation error and enabling non-blind SR methods to work effectively in realistic settings. Also in 2021, Liu et al. [25] proposed a multihop connected residual attention network that makes full use of low- and high-frequency information to improve reconstruction performance. Shocher et al. [26] proposed zero-shot superresolution (ZSSR), the first unsupervised CNN-based superresolution method, which achieves the best results under nonideal conditions. ZSSR trains a specific model on a single image, avoiding the influence of dataset size on prediction performance. However, it obtains low-resolution images through a relatively monotonous bicubic downsampling degradation, which easily introduces aliasing artifacts, and its network structure is too simple, limiting its expressive power and its reconstruction accuracy.

Considering the above problems, this paper proposes a zero-sample mural superresolution reconstruction method that enhances perceptual quality. The main improvements are as follows: (1) bicubic interpolation is replaced with Lanczos interpolation for downsampling to reduce artifacts; (2) the network structure is optimized with skip connections and convolutional attention, yielding a clear performance gain while adding only a small number of parameters; (3) SmoothL1Loss is used to combine the advantages of the L1Loss and L2Loss loss functions, accelerating network convergence and enhancing model robustness; and (4) the PI value is introduced as the evaluation index of perceptual quality to better measure perceived picture quality.

Methodology

Related theory

Lanczos interpolation

The purpose of image superresolution reconstruction is to reduce or remove the degradation introduced while acquiring or processing images. To perform superresolution reconstruction, it is therefore necessary to clarify the causes of image degradation and reconstruct the image along the inverse of the degradation process. Efrat et al. [27] found that an accurate blur model is more important than a complex image prior. Many SR methods have made important progress, yet they use only simple bicubic interpolation to simulate image degradation; when the degradation assumed by the model does not match the degradation of the real image, model performance declines.

Sinc interpolation is theoretically the optimal interpolation algorithm and can fit any curve, but the assumptions it depends on are difficult to satisfy fully in practice. Lanczos interpolation approaches the quality of sinc reconstruction while improving on its truncation. Alvarez et al. [28] noted that the Lanczos algorithm can reduce frequency-domain aliasing as well as jagged edges and ringing. The Lanczos interpolation kernel is given in Eq. (1).

$$L(x) = \begin{cases} \operatorname{sinc}(x)\,\operatorname{sinc}(x/a) & \text{if } -a < x < a \\ 0 & \text{otherwise} \end{cases}$$
(1)

where a is a positive integer representing the size of the Lanczos kernel. When a is 2, the Lanczos2 algorithm is suitable for image reduction interpolation; when a is 3, the Lanczos3 algorithm is suitable for image magnification interpolation.
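To make Eq. (1) concrete, the kernel can be written directly in code. The following is a minimal NumPy sketch; the use of NumPy's normalized `sinc` is an implementation assumption, not taken from the paper:

```python
import numpy as np

def lanczos_kernel(x, a=2):
    """Eq. (1): L(x) = sinc(x) * sinc(x/a) for -a < x < a, else 0.
    np.sinc is the normalized sinc, sin(pi*x)/(pi*x), the convention
    used in image resampling. a=2 suits reduction, a=3 magnification."""
    x = np.asarray(x, dtype=np.float64)
    return np.where(np.abs(x) < a, np.sinc(x) * np.sinc(x / a), 0.0)
```

In practice, libraries such as OpenCV (`cv2.INTER_LANCZOS4`) and Pillow (`Image.LANCZOS`) provide ready-made Lanczos resampling, with kernel sizes that differ slightly from the a = 2 kernel recommended above for reduction.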

Convolutional attention

An attention mechanism shifts focus to the most important parts of the input and is characterized by few parameters, fast speed and good effect. In 2018, Woo et al. [29] proposed the lightweight convolutional block attention module (CBAM), which combines channel attention (CA) and spatial attention (SA). CA performs global maximum pooling and global average pooling on the input feature map in the spatial dimension, compressing the spatial dimensions to 1 while retaining the channel information, and extracts the pooled features with convolutions. The two pooled features are added, sigmoid activation yields the channel attention weights, and these weights are applied to the original convolution features by multiplicative weighting. SA first concatenates the results of max pooling and average pooling along the channel dimension, uses a convolution to extract the pooled features and compress the channel dimension to 1, and after sigmoid activation obtains the spatial attention weights, which again weight the original convolution features. CBAM is embedded into existing network architectures as a plug-and-play module and can improve the feature extraction ability of a network model without significantly increasing the number of computations or parameters.
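The description above maps directly onto a small PyTorch module. The sketch below follows it, with the channel-reduction ratio and the 7 × 7 spatial kernel taken from the CBAM paper's defaults rather than from this article:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """CA: squeeze spatial dims by avg- and max-pooling, score channels
    with a shared 1x1-conv MLP, merge by addition, gate with sigmoid."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1, bias=False),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1, bias=False))

    def forward(self, x):
        avg = self.mlp(torch.mean(x, dim=(2, 3), keepdim=True))
        mx = self.mlp(torch.amax(x, dim=(2, 3), keepdim=True))
        return x * torch.sigmoid(avg + mx)

class SpatialAttention(nn.Module):
    """SA: squeeze the channel dim by avg- and max-pooling, concatenate,
    convolve down to one map, gate with sigmoid."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size,
                              padding=kernel_size // 2, bias=False)

    def forward(self, x):
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))

class CBAM(nn.Module):
    """CA followed by SA, the serial order used in this paper."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, x):
        return self.sa(self.ca(x))
```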

Methods

The network structure of this paper is designed as follows. The network uses Lanczos interpolation to upsample the LR image to the required HR size and feeds the result into the network. Because the input and output images are similar, a long skip connection (LSC) directly links the input and output to save computation, so the network learns only the residual between them. The network uses a residual layer to extract the deep features of murals, which effectively avoids the vanishing or exploding gradients caused by an overly deep network. The residual layer consists of 8 convolution + LeakyReLU blocks, 2 convolutional block attention modules (CBAMs) and a short skip connection (SSC). The SSC combines shallow feature information with deep semantic information, ensuring feature reusability and effectively alleviating network degradation. The network structure is shown in Fig. 1.

Fig. 1 Network structure

In the network, the LSC directly adds the input to the output of the residual layer, fusing shallow and deep features. The SSC is responsible for fusing local deep and shallow features. The CBAMs, which combine CA and SA, adaptively reweight channel and spatial features to enhance the representational capacity of the convolution features, thereby improving the perceptual quality of the reconstructed mural.
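A hedged PyTorch sketch of this layout follows; the channel width, the LeakyReLU slope and the exact split of the eight convolution blocks around the SSC are assumptions, and `CBAM` refers to the module sketched in the previous section:

```python
import torch.nn as nn

class EPZSSRNet(nn.Module):
    """Sketch of the described structure: eight Conv+LeakyReLU blocks with
    a CBAM at the front and back ends of the body, a short skip connection
    (SSC) inside the residual layer, and a long skip connection (LSC) from
    network input to output."""
    def __init__(self, channels=64):
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                                 nn.LeakyReLU(0.2, inplace=True))
        self.head = block(3, channels)                       # 1st Conv+LeakyReLU
        self.cbam_front = CBAM(channels)
        self.body = nn.Sequential(
            *[block(channels, channels) for _ in range(7)])  # remaining 7 blocks
        self.cbam_back = CBAM(channels)
        self.tail = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, x):
        # x is the LR image already Lanczos-upsampled to the target HR size.
        f = self.cbam_front(self.head(x))
        f = f + self.body(f)    # SSC: fuse local shallow and deep features
        f = self.cbam_back(f)
        return x + self.tail(f) # LSC: learn only the output-input residual
```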

Improving image degradation to reduce texture loss

To reduce the texture loss of the reconstructed mural, this paper first applies isotropic and anisotropic Gaussian blur to the image and then uses Lanczos interpolation in place of bicubic interpolation for downsampling to reduce artifacts. Because the image darkens after shrinking, gamma correction is applied to adjust the brightness. Finally, Gaussian noise at a random noise level is added to the image. An example of image degradation is shown in Fig. 2.

Fig. 2 Example of image degradation
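A minimal NumPy/OpenCV sketch of this degradation pipeline is given below; the gamma value and blur sigma range are illustrative assumptions, while the 3 × 3 blur kernel and the 0.0125 noise standard deviation follow the experimental settings reported later:

```python
import cv2
import numpy as np

def degrade(hr, scale=2, gamma=1.2, noise_std=0.0125):
    """Degrade an HR mural patch as described above: Gaussian blur,
    Lanczos downsampling, gamma correction, additive Gaussian noise."""
    img = hr.astype(np.float32) / 255.0
    # 1. Gaussian blur with a 3x3 kernel (isotropic here; the paper also
    #    uses anisotropic kernels). The sigma range is an assumption.
    img = cv2.GaussianBlur(img, (3, 3), sigmaX=np.random.uniform(0.2, 2.0))
    # 2. Lanczos downsampling, which reduces aliasing relative to bicubic.
    h, w = img.shape[:2]
    img = cv2.resize(img, (w // scale, h // scale),
                     interpolation=cv2.INTER_LANCZOS4)
    # 3. Gamma correction to compensate for darkening after shrinking.
    img = np.power(np.clip(img, 0.0, 1.0), 1.0 / gamma)
    # 4. Additive Gaussian noise at a random level up to noise_std.
    img += np.random.normal(0.0, np.random.uniform(0.0, noise_std), img.shape)
    return np.clip(img * 255.0, 0.0, 255.0).astype(np.uint8)
```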

Using skip connections to mitigate network degradation

To mitigate network degradation, this paper uses long skip connections (LSCs) and short skip connections (SSCs) to optimize the network structure. The LSC directly adds the input to the output of the residual layer, fusing shallow and deep features, while the SSC fuses local deep features. The LSC and SSC structures are shown in Fig. 3.

Fig. 3 Long skip connection and short skip connection structures

Convolutional attention to improve perceptual quality

Because the ZSSR network structure is overly simple, this paper optimizes it by embedding two convolutional block attention modules (CBAMs) at the front and back ends of the network. CBAM applies CA and SA serially and can adaptively adjust channel and spatial characteristics. Multiplying the attention map with the input feature map activates important features and suppresses unimportant ones. Inserted into the network as an independent module, CBAM improves the representational ability of the convolution features, and thus the perceptual quality of the reconstructed murals, at a small additional computational cost. The convolutional attention module is shown in Fig. 4.

Fig. 4 Convolutional block attention module

Using SmoothL1Loss to improve model robustness

The loss function estimates the difference between SR and HR images and helps the network converge faster and achieve higher quality; the smaller the loss value is, the better the model performance. The original network uses L1Loss, but L1Loss is not differentiable at zero, which hampers convergence. Girshick [30] proposed SmoothL1Loss in the Fast R-CNN paper. SmoothL1Loss is insensitive to outliers and noise and removes the nonsmoothness of L1Loss at zero, so it is more robust and converges more readily to a local optimum. To make the visual effect of the reconstructed mural more realistic, this paper uses SmoothL1Loss to guide model training. The loss function is given in Eq. (2):

$$SmoothL1(x) = \begin{cases} 0.5x^{2} & \text{if } \left| x \right| \le 1 \\ \left| x \right| - 0.5 & \text{otherwise} \end{cases}$$
(2)

SmoothL1Loss is a piecewise function that combines the advantages of the L1Loss and L2Loss loss functions: it behaves like smooth L2Loss when |x| is small and like stable L1Loss when |x| is large.
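PyTorch provides this loss directly as `torch.nn.SmoothL1Loss`; a minimal usage sketch (tensor shapes are illustrative):

```python
import torch
import torch.nn as nn

# With beta=1.0 (the default transition point), SmoothL1Loss matches
# Eq. (2): quadratic for |x| <= 1, linear |x| - 0.5 beyond.
criterion = nn.SmoothL1Loss(beta=1.0)

sr = torch.rand(1, 3, 128, 128, requires_grad=True)  # reconstructed patch
hr = torch.rand(1, 3, 128, 128)                      # reference patch
loss = criterion(sr, hr)
loss.backward()  # gradients are smooth at zero, unlike plain L1Loss
```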

Experiment

Experimental design

The hardware environment for this experiment is 16 GB of memory and 6 GB of graphics memory, with an NVIDIA GeForce RTX 3060 graphics card and an Intel Core i7-12700H CPU. The software environment is the Windows 11 operating system; the experimental software is PyCharm and MATLAB R2022a, and the deep neural network library is PyTorch.

In this paper, the original image is randomly cropped into 128 × 128 patches, and each patch is Gaussian blurred with a 3 × 3 kernel. Lanczos interpolation downsamples the smoothed picture, gamma correction adjusts the brightness, and random Gaussian noise with a standard deviation of 0.0125 is added. The resulting low-resolution mural image is then input into the training network. The algorithm uses SmoothL1Loss and the Adam optimizer. The initial learning rate is set to 0.001, with a learning rate decay factor of 0.5. When the learning rate falls below 1e-6 or the number of iterations exceeds 3000, training stops and the reconstructed mural image is output.
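This schedule maps onto a short PyTorch training loop. The sketch below assumes the `EPZSSRNet` model sketched earlier and a hypothetical `sample_training_pair` helper that crops a patch, degrades it as described and upsamples it back to target size:

```python
import torch

model = EPZSSRNet()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# Halve the learning rate when the loss plateaus, per the 0.5 decay factor.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, factor=0.5)
criterion = torch.nn.SmoothL1Loss()

for step in range(3000):                         # iteration cap from the paper
    lr_patch, hr_patch = sample_training_pair()  # hypothetical helper
    optimizer.zero_grad()
    loss = criterion(model(lr_patch), hr_patch)
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())
    if optimizer.param_groups[0]["lr"] < 1e-6:   # stop once lr decays below 1e-6
        break
```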

Evaluation indicator

The peak signal-to-noise ratio (PSNR) is one evaluation index of image reconstruction quality: it quantifies reconstruction distortion by comparing pixel differences between images, and the higher the PSNR value, the smaller the distortion of the reconstructed image. The PSNR is calculated as shown in Eq. (3). As superresolution reconstruction has developed, it has been found that a higher PSNR or SSIM does not necessarily indicate better perceived reconstruction quality: even when the pixel error between SR and HR is small, the texture details of the reconstructed image may not accord with human visual habits. Mittal et al. [31] proposed NIQE, a no-reference image quality index built on human visual perception; the more natural the image, the smaller the NIQE value. Blau et al. [32] proposed the perceptual index (PI) in the PIRM2018-SR challenge to quantify image perceptual quality; the lower the PI value, the better the perceptual quality. To better measure the perceptual clarity of reconstructed murals, this paper uses MATLAB R2022a to compute the PI value as the perceptual quality evaluation index. The PI is calculated as shown in Eq. (4).

$$PSNR = 10 \times \lg \frac{255^{2} \times W \times H}{\sum\limits_{x = 1}^{W} \sum\limits_{y = 1}^{H} \left[ I_{x,y}^{HR} - I_{x,y}^{SR} \right]^{2}}$$
(3)
$$PI = \frac{1}{2}\left( NIQE(I^{SR}) + \left( 10 - Ma(I^{SR}) \right) \right)$$
(4)

where W and H represent the width and height of the image, $I^{HR}$ represents the high-resolution mural, and $I^{SR}$ represents the reconstructed mural.
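Eq. (3) is straightforward to compute; a minimal NumPy sketch follows. The PI of Eq. (4) depends on the NIQE and Ma scores, which the paper computes in MATLAB, so only its formula is noted in a comment:

```python
import numpy as np

def psnr(hr, sr, peak=255.0):
    """Eq. (3): PSNR in dB from the mean squared pixel error between the
    high-resolution reference hr and the reconstruction sr."""
    mse = np.mean((hr.astype(np.float64) - sr.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Eq. (4), for reference: PI = 0.5 * (NIQE(SR) + (10 - Ma(SR))).
```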

Results and discussion

Contrast experiment

In this paper, ancient mural images are taken as the experimental objects, and the BI algorithm, SRCNN algorithm [12], ESPCN algorithm [16], DRCN algorithm [18], RCAN algorithm [21], ZSSR algorithm [26], MASA-SR algorithm [33] and the algorithm in this paper are selected to perform 2× superresolution reconstruction on 9 local mural images. The experimental results are shown in Fig. 5. The quantitative analysis of the proposed EPZSSR model in terms of PI, PSNR, running time and iterations is summarized in Table 1.

Fig. 5 Contrast test results

Table 1 Quantitative analysis results of the EPZSSR model proposed in this study

The experimental results show that the reconstruction effect of the BI algorithm is the worst, with serious edge sawtooth problems. The SRCNN and ESPCN algorithms improve on the BI algorithm to a certain extent, but the image color reconstructed by SRCNN is dim, and the image reconstructed by ESPCN still contains noise. The DRCN algorithm can extract the deep feature information of murals but increases artifacts. The color contrast of the image reconstructed by the RCAN algorithm is enhanced, but the image is relatively blurred. The image reconstructed by the ZSSR algorithm has sharp edges but poor subjective perceptual quality. The MASA-SR algorithm can transfer the texture details with the highest matching degree between the reference image and the test image to the low-resolution image and reconstructs mural texture details well, but it produces artifacts. The algorithm in this paper effectively alleviates noise and artifacts in the mural; the reconstructed image has clearer structural details and brighter colors, its PI value is the lowest, its perceptual quality is significantly improved, and its overall visual effect is the best.

Ablation experiment

To prove the effectiveness of the network modules in this paper, ablation experiments are carried out on spatial attention (SA), channel attention (CA), the skip connections (SC) and the loss function (Loss), and the corresponding quantitative results for the PI value, PSNR value and running time are reported. The visual results of the module-effectiveness ablation study are shown in Fig. 6, and the quantitative results in Table 2.

Fig. 6 Visualization results of module effectiveness based on the ablation experiment

Table 2 Ablation study results of module effectiveness

Among them, configuration (3) is the network structure of this paper. Configurations (1), (2) and (3) verify the effectiveness of SmoothL1Loss; (3) and (4) verify the effectiveness of SA; (3) and (5) verify the effectiveness of CA; and (3) and (6) verify the effectiveness of SC. The results in Fig. 6 show that SmoothL1Loss accelerates network convergence and improves the robustness of the model to abnormal noise, and that using SA, CA and SC improves the network's feature extraction ability and reconstructs the texture details of the mural more clearly. Table 2 shows that although the running time of the algorithm in this paper is suboptimal (241.74 s), it obtains the optimal PI value (2.98) and PSNR value (30.42 dB), which verifies the effectiveness of the network modules in this paper.

To further verify the influence of the number and location of the convolutional block attention modules (CBAMs) on the reconstruction effect, this paper designs corresponding ablation experiments and reports the PI value, PSNR value and running time. The visual results of this ablation study are shown in Fig. 7, and the quantitative results in Table 3.

Fig. 7 Visualization results of the ablation study on the number and location of CBAM

Table 3 Ablation study results of the number and location of CBAM

Among them, configuration (5) is the CBAM number and placement used in this paper. Configurations (1) to (4) use a single CBAM, while (5) and (6) use two. Configurations (1)(3) and (2)(4) fix the relative order of SA and CA within CBAM and place the CBAM at the front end and back end of the network, respectively, to test the influence of the CBAM's absolute position; the results show that this influence is negligible. Configurations (5) and (6) test the influence of the relative order of SA and CA within CBAM; the results show that applying CA before SA reconstructs better than applying SA before CA. On this basis, configurations (1)(3)(5) and (2)(4)(6), ignoring the absolute position of the CBAM, test the influence of the number of CBAMs on the reconstruction effect. Table 3 shows that the algorithm in this paper obtains the optimal PI value (3.97) and PSNR value (36.7 dB), while configuration (4) obtains the shortest running time (145.18 s). The results show that placing one CBAM at each of the front end and back end of the network significantly improves the reconstruction effect at a small cost in running time.

Conclusion

Aiming at the texture loss and poor perceptual quality of low-resolution mural images, this paper proposes EPZSSR, a zero-shot mural superresolution reconstruction method with enhanced perceptual quality. The improved image degradation method preprocesses the mural images to reduce texture loss: the original image is randomly cropped, Gaussian blurred and downsampled with Lanczos interpolation; gamma correction adjusts the brightness; and random Gaussian noise is added. The resulting low-resolution mural image is input into the training network, and an image-specific model is obtained by training on the image itself. Skip connections fuse shallow and deep features, effectively alleviating network degradation. Convolutional attention optimizes the network structure and adaptively adjusts spatial and channel features, significantly improving the feature extraction ability of the model. SmoothL1Loss accelerates network convergence and enhances model robustness, and the PI value is introduced to measure perceived image quality. Compared with existing algorithms, the peak signal-to-noise ratio of the proposed algorithm is higher by 0.98–3.23 dB on average, the mural texture reconstruction is better, the PI value is lower by 0.56 on average, the perceptual quality is better, and the running time is shorter by 89.68 s on average.

A shortcoming of the experiment is that murals are reconstructed only at a scale factor of 2; the reconstruction effect at larger and multiple scales is untested, and the improvement is not obvious for murals with incomplete or unclear textures. The main next steps are as follows: stack the model in a cross-scale manner to obtain multiscale reconstructed murals for different needs, and optimize the network structure to improve training efficiency and reduce the number of iterations toward real-time mural superresolution reconstruction.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Abbreviations

HDTV: High-definition television
SISR: Single-image superresolution reconstruction
HR: High resolution
LR: Low resolution
RL: Residual learning
ResNet: Residual network
BN: Batch normalization
CA: Channel attention
SA: Spatial attention
ZSSR: Zero-shot superresolution
CBAM: Convolutional block attention module
LSC: Long skip connection
SSC: Short skip connection
SC: Skip connection
PSNR: Peak signal-to-noise ratio
PI: Perceptual index

References

  1. Zou Y, Zhang LF, Liu CQ, Wang BW, Hu Y, Chen Q. Super-resolution reconstruction of infrared images based on a convolutional neural network with skip connections. Opt Laser Eng. 2021;146: 106717.

  2. Liu X, Liu Y, Zhang C, Jin WQ. Resolution improvement and data processing of remote sensing images. Laser Optoelect Pro. 2019;56(08):98–108.

  3. Xi ZH, Hou CY, Yuan KP. Medical image super resolution reconstruction based on residual network. Comput Eng Appl. 2019;55(19):191–7.

  4. Wang DQ. Research on high resolution reconstruction method of side scan sonar image. Jiangsu Univ Sci Technol. 2021. https://doi.org/10.27171/d.cnki.ghdcc.2021.000449.

  5. Wang W, Hu T, Li XW, Shen SW, Jiang XM, Liu JY. Study on super-resolution image reconstruction of leukocytes. Comput Sci. 2021;48(04):164–8.

  6. Wei HG, Liu JQ, Lin LQ, Yang J, Chen WL. A rate-distortion optimization algorithm based on visual perception. Chinese J Sci Inst. 2022;43(05):175–82.

  7. Xiao SW, Hu RM, Xiao J. Face-oriented surveillance video compression method based on hybrid resolution in NB-IOT environment. Comput Appl Softw. 2022;39(02):150–6.

  8. Van Ouwerkerk JD. Image super-resolution survey. Image Vis Comput. 2006;24(10):1039–52.

  9. Keys R. Cubic convolution interpolation for digital image processing. IEEE Trans Acoust Speech Signal Process. 1981;29(6):1153–60.

  10. Kim KI, Kwon Y. Single image super-resolution using sparse regression and natural image prior. IEEE Trans Pattern Anal Mach Intell. 2010;32(6):1127–33.

  11. He H, Siu WC. Single image super-resolution using Gaussian process regression. CVPR. 2011;2011:449–56.

  12. Dong C, Loy CC, He K, Tang XO. Learning a deep convolutional network for image super-resolution. In: Fleet D, Pajdla T, Schiele B, Tuytelaars T, editors. Computer Vision, ECCV 2014: 13th European Conference Zurich. Cham: Springer; 2014.

  13. Dong C, Loy CC, He K, Tang XO. Image super-resolution using deep convolutional networks. IEEE Trans Pattern Anal Mach Intell. 2015;38(2):295–307.

  14. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. IEEE Conf Comput Vision Pattern Recognit. 2016:770–8.

  15. Kim J, Lee JK, Lee KM. Accurate image super-resolution using very deep convolutional networks. IEEE Conf Comput Vision Pattern Recognit. 2016.

  16. Shi WZ, Caballero J, Huszar F, Totz J, Aitken AP, Bishop R, et al. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network. IEEE Conf Comput Vision Pattern Recognit. 2016;11:1245.

  17. Dong C, Loy CC, Tang X. Accelerating the super-resolution convolutional neural network. In: European Conference on Computer Vision. Berlin: Springer; 2016.

  18. Kim J, Lee JK, Lee KM. Deeply-recursive convolutional network for image super-resolution. IEEE Conf Comput Vision Pattern Recognit. 2016;1:122.

  19. Ying T, Jian Y, Liu X. Image super-resolution via deep recursive residual network. IEEE Conf Comput Vision Pattern Recognit. 2017;11:145.

  20. Lim B, Son S, Kim H, Nah S, Lee KM. Enhanced deep residual networks for single image super-resolution. IEEE Conf Comput Vision Pattern Recognit. 2017;21:1132–40.

  21. Zhang YL, Li KP, Li K, Wang LC, Zhong BN, Fu Y. Image super-resolution using very deep residual channel attention networks. In: Proceedings of the 15th European Conference on Computer Vision (ECCV). Munich: Springer; 2018.

  22. Tong T, Li G, Liu XJ, Gao QQ. Image super-resolution using dense skip connections. Proceedings of the IEEE International Conference on Computer Vision. 2017:4799–807.

  23. Guo Y, Chen J, Wang J, Cheng Q, Cao J, Deng Z, et al. Closed-loop matters: dual regression networks for single image super-resolution. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020:5407–16.

  24. Tao G, Ji X, Wang W, Chen S, Lin C, Cao Y, Lu T, et al. Spectrum-to-kernel translation for accurate blind image super-resolution. Adv Neural Inf Process Syst. 2021;34:22643–54.

  25. Liu ZX, Zhu CJ, Huang J, Cai DJ. Image super resolution reconstruction of multi hop residual attention network. Computer Science. 2021;48(11):258–67.

  26. Shocher A, Cohen N, Irani M. Zero-shot super-resolution using deep internal learning. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2018:3118–26.

  27. Efrat N, Glasner D, Apartsin A, Nadler B, Levin A. Accurate blur models vs. image priors in single image super-resolution. IEEE Int Conf Comput Vision. 2013;12:2832–9.

  28. Alvarez V, Ponomaryov V, Sadovnychiy S. Image super-resolution via wavelet feature extraction and sparse representation. Radioengineering. 2018;27:602–9.

  29. Woo S, Park J, Lee JY, Kweon IS. CBAM: convolutional block attention module. 15th European Conference on Computer Vision (ECCV). 2018;11211:3–19.

  30. Girshick R. Fast R-CNN. Proceedings of the IEEE International Conference on Computer Vision (ICCV). 2015:1440–8.

  31. Mittal A, Soundararajan R, Bovik AC. Making a “completely blind” image quality analyzer. IEEE Signal Process Lett. 2012;20(3):209–12.

  32. Blau Y, Mechrez R, Timofte R, Michaeli T, Zelnik-Manor L. The 2018 PIRM challenge on perceptual image super-resolution. Proc Eur Conf Comput Vision Workshops. 2018:334–55.

  33. Lu LY, Li WB, Tao X, Lu JB, Jia JY. MASA-SR: matching acceleration and spatial adaptation for reference-based image super-resolution. IEEE Conf Comput Vision Pattern Recognit. 2021:6364–73.

Acknowledgements

Not applicable.

Funding

This study was funded by the Humanities and Social Sciences Research Project of the Ministry of Education (Planning Fund Project) (21YJAZH002).

Author information

Contributions

CJF devised the study plan and led the writing of the article, CJF and HXH conducted the experiment and collected the data. TY conducted the analysis, and CJF supervised the whole process and gave constructive advice. CJF was a major contributor in writing the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jianfang Cao.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Cite this article

Cao, J., Hu, X. & Tian, Y. Zero-sample mural superresolution reconstruction for enhanced perceptual quality. Herit Sci 11, 67 (2023). https://doi.org/10.1186/s40494-023-00907-6
