
Protection of Guizhou Miao batik culture based on knowledge graph and deep learning


In the globalization trend, China’s cultural heritage is in danger of gradually disappearing. The protection and inheritance of these precious cultural resources have become a critical task. This paper focuses on the Miao batik culture in Guizhou Province, China, and explores the application of knowledge graphs, natural language processing, and deep learning techniques in the promotion and protection of batik culture. We propose a dual-channel mechanism that integrates semantic and visual information, aiming to connect batik pattern features with cultural connotations. First, we use natural language processing techniques to automatically extract batik-related entities and relationships from the literature, and construct and visualize a structured batik pattern knowledge graph. Based on this knowledge graph, users can textually search and understand the images, meanings, taboos, and other cultural information of specific patterns. Second, for batik pattern classification, we propose an improved ResNet34 model. By embedding average pooling and convolutional operations into the residual blocks and introducing long-range residual connections, the classification performance is enhanced. By inputting pattern images into this model, their categories can be accurately identified, and the underlying cultural connotations can then be understood. Experimental results show that our model outperforms other mainstream models in evaluation metrics such as accuracy, precision, recall, and F1-score, achieving 94.46%, 94.47%, 93.62%, and 93.8%, respectively. This research provides new ideas for the digital protection of batik culture and demonstrates the great potential of artificial intelligence technology in cultural heritage protection.


Batik is an ancient manual dyeing technique found in many countries around the world, such as China, Indonesia, Malaysia, Singapore, India, and Japan. Influenced by multiple factors like social culture, geographical environment, and economy, batik from different regions presents distinct characteristics and styles in patterns, colors, and compositions [1,2,3]. Batik in China has a long history, dating back to the Qin and Han dynasties over two thousand years ago. It gained popularity during the Six Dynasties period and reached its peak in the Sui and Tang dynasties. However, with the passage of time and social development, this ancient handicraft has gradually declined. The impact of modernization, coupled with a lack of inheritors, has caused Chinese batik to face technological and cultural loss. Its development level lags far behind countries like India, Indonesia, and Malaysia, and it is on the verge of disappearing. This not only means the end of a traditional craft, but also the loss of related historical, cultural, social, and economic information.

In-depth research on the connotation of batik patterns is of great significance for understanding their historical evolution and cultural connotation, and can provide valuable references and inspiration for related fields such as art design and anthropology. At the same time, exploring new ideas and methods of digital preservation through the introduction of advanced computer technologies such as knowledge graphs, natural language processing (NLP), and deep learning can provide a systematic digital storage space for batik culture. Furthermore, constructing computational models can enable the automatic classification and identification of batik patterns, thereby revealing their intrinsic laws and relationships, and providing new opportunities for the inheritance and innovation of batik. This is of great significance for sustaining the vitality of batik culture and promoting the digital protection of cultural heritage.

Benefiting from the advantages of isolated terrain, Guizhou Miao batik has been relatively well protected and developed compared to other regions in China, enjoying the reputation of “the hometown of batik” [4,5,6]. The Miao people, an ethnic group with only spoken language but no written script, have made batik patterns an important carrier for inheriting history, religion, beliefs, customs, etc. [7]. With its mysterious style, beautiful patterns, clear themes, heavy connotations, and rich subjects, Guizhou Miao batik patterns have thrived through generations and become one of the most representative Chinese cultural heritages.

The production principle of batik is to use the hydrophobicity of wax to prevent certain areas of the fabric from being dyed, thus creating patterns. The materials used in the production of batik mainly include wax (e.g., beeswax or paraffin), white cotton cloth, indigo, and copper knives. As shown in Fig. 1, the production of batik mainly involves four steps. (i) Melting wax. The solid wax is heated to its melting point, which turns it into a liquid state and provides plasticity for the subsequent drawing process. (ii) Wax painting. A copper knife is used to scoop the melted wax and draw various patterns on the white cloth. (iii) Dyeing. The wax-painted cloth is immersed in indigo water, allowing the dye to penetrate the areas not protected by wax. (iv) Removing wax. The dyed cloth is placed in boiling water, where the wax melts at high temperatures and is washed away by the water flow, resulting in blue-and-white batik patterns. Indigo is fermented from the leaves of bluegrass, a common plant in the mountainous regions of Guizhou. It is worth mentioning that bluegrass is not only a dyestuff but also a medicinal plant. Its resistance to humidity and mold helps preserve the texture and color stability of batik fabrics in the humid climate of southern China.

Fig. 1
figure 1

The production process of batik

Batik fabrics are mainly used for decorating daily life, expressing beliefs, and conveying emotions. They are commonly used as bedsheets, door curtains, window curtains, skirts, and accessories. The patterns on the fabric are varied and have special meanings, some of which are illustrated in Fig. 2. The subjects of the patterns include plants, animals, and geometric shapes. Different patterns have different meanings. For example, the butterfly pattern embodies the Miao people's mythical story of the "蝴蝶妈妈" (Butterfly Mom), the fish pattern expresses a prayer for offspring, and the fish-bird pattern represents marital harmony. In addition to auspicious meanings, Miao culture also has certain taboos. To avoid conflicts between different cultures, the use of taboo patterns needs to be treated with caution. For example, the combination of the curled grass pattern and the horseshoe pattern represents guiding the way for the deceased.

Fig. 2
figure 2

Batik patterns

Batik has attracted widespread attention due to its high aesthetic and cultural value. In the innovative design of batik patterns, Tian et al. [8] proposed an automatic generation method for batik patterns based on fractal geometry. By introducing fractal theory, they realized the automatic generation and transformation of batik patterns, providing new ideas for batik pattern design. Lv et al. [9] introduced an interactive genetic algorithm (IGA-BPFIF) into the innovative design of batik patterns. By using user interaction feedback to guide the optimization process of the genetic algorithm, they generated new batik patterns that conform to users' aesthetic preferences. In their later research [10], they also proposed an improved collaborative filtering algorithm to recommend batik patterns. By analyzing user preferences, the algorithm recommends batik patterns of interest to users. Hu et al. [11] proposed a generative design method for batik patterns based on shape grammars. Considering the low abstraction level of traditional shape grammars in batik design, Ding et al. [12] improved them based on comprehensive predicate grammar encoding and used particle swarm optimization to optimize the parameters of predicate shape grammars, enhancing the diversity and artistry of batik pattern generation.

A deep understanding of the cultural background and spiritual connotation of batik patterns is the foundation for protecting them [13]. In the exploration of Miao batik culture, researchers have focused on studying the migration history, religious beliefs, aesthetic characteristics, and the meanings of specific patterns. For example, Zhennan and Yahaya [7] introduced in detail the origins and meanings of typical patterns such as fish, birds, butterflies, dragons, and plants, and summarized the Miao people's aesthetic preference for symmetry and fullness. In addition, some scholars have also discussed the artistic changes and regional differences of batik patterns [14]. Although these studies are highly practical, they have not organized this isolated knowledge into a systematic framework, making it difficult to accumulate, disseminate, and reuse. Considering the complexity, diversity, and sensitivity of culture, in this paper, we construct a knowledge graph to effectively store, organize, and manage batik pattern knowledge.

In terms of feature extraction from batik images, Chen and Cheng [15] proposed an extraction method based on morphological processing techniques and the Canny algorithm, obtaining independently editable pattern contours. In the retrieval of batik patterns, Yuan et al. [16] proposed a method combining global and local features. First, global features and local features are extracted through Zernike moments (ZMs) and curve transformations, respectively. Then, these two types of features are combined by matching weighted bipartite graphs, and the visual similarity between patterns is calculated through supervised distance measurement. Finally, patterns with similar shapes are retrieved. Since Miao batik images are highly abstract, it is difficult to accurately distinguish their subjects through casual observation. Moreover, culture is highly sensitive, and once a pattern is incorrectly identified and used, it may cause very serious consequences. Therefore, in this paper, we employ an improved ResNet34 model to extract features from batik images and automatically classify them.

This paper proposes to establish a dual-channel mechanism between batik images and culture to reduce the difficulty of understanding and applying batik knowledge. As shown in Fig. 3, in the direction from “culture to image,” users can retrieve pattern images by the names, meanings, and taboos of the patterns. For example, when designers want to use patterns that represent “having many children” in their products, they can search for this keyword in the knowledge graph. The system will filter out patterns that match the meaning, such as pomegranate patterns and fish patterns, and display a large number of vivid image examples. The rich batik image resources not only stimulate designers’ creative inspiration but also endow the products with cultural connotations. In the direction from “image to culture,” pattern images are input into the classification model to identify the category of the pattern and further explore its underlying cultural connotations. For example, when designers come across a beautiful batik pattern and want to use it in their products, but do not know the pattern's name, cultural meaning, historical origin, usage taboos, and other information, they dare not use it rashly. At this time, they can upload the pattern image to the model, which will identify the category of the pattern, such as butterfly pattern or dragon pattern, and then explore its culture through the knowledge graph. This process of starting from images, exploring, and perceiving the underlying cultural connotations can bring ordinary people closer to traditional handicrafts and enhance their cognition of cultural heritage, strengthening their willingness to apply batik elements in their work and life.

Fig. 3
figure 3

The dual-channel mechanism between batik images and culture

The main contributions of this paper can be summarized as follows:

  1. (i)

    For the organization and management of batik pattern knowledge, we adopt a method that integrates NLP and knowledge graphs to construct a Batik Pattern Knowledge Graph (BPKG). Entities and relationships of batik patterns are automatically extracted from a large amount of textual data and semantically associated. Compared with traditional knowledge base construction methods, our approach can organize and manage domain knowledge more comprehensively and systematically.

  2. (ii)

    For batik pattern classification, we propose an improved ResNet34 model. By embedding average pooling and convolutional operations into the residual blocks and introducing long-range residual connections, this structure can enhance the network's feature extraction and representation ability and effectively alleviate the gradient vanishing problem. Compared with traditional convolutional neural networks (CNN) and original residual networks, our model achieves superior performance on the batik pattern classification task.

  3. (iii)

    We created a large-scale batik pattern image dataset containing 15,148 images. The dataset covers eight subjects. This is the largest and most complete image dataset in the field of Chinese batik, which is of great significance in promoting the digitalization and intelligent development of batik culture.

In summary, we conduct research on the construction of the BPKG, the design of the classification model, and the creation of the batik dataset, providing new ideas and methods for the digital protection and inheritance of batik culture, with strong theoretical and practical value. Advanced technologies often take longer to become popular in the cultural heritage field [17]; herein lies the innovation of this paper: introducing these advanced computer technologies into batik and exploring the integration of traditional culture with artificial intelligence (AI) technology. The research results can not only promote the digital development of batik culture but also provide new perspectives and approaches for the intelligent protection and inheritance of other traditional cultural heritage, which has exemplary significance and promotional value.

The overall structure of this paper is as follows. In Sect. "Related works", we review the related works. In Sect. "Methods", we introduce the knowledge graph and ResNet model in detail. In Sect. "Experiments", we present the specific research work, first constructing the batik pattern knowledge graph, then improving the ResNet34 model, and finally applying it to batik image classification. Sect. "Conclusion" provides a summary.

Related works

Knowledge graph

The concept of the knowledge graph was first proposed by Google in 2012. It is a method of graphically representing knowledge and its relationships. Formal knowledge representation methods include predicate logic, frame-based, and ontology-based approaches. Among them, the predicate logic method uses predicates and logical connectives to describe concepts and relationships; the frame-based method uses frame structures to organize and describe concepts; and the ontology-based method uses elements such as classes, instances, attributes, and relationships to describe domain-specific knowledge. In comparison, the ontology-based method can ensure the uniqueness of knowledge understanding in the process of transfer and sharing, and can meet the requirements of diverse knowledge types and complex semantic relations, making it a widely used knowledge representation method.

In the automatic construction of knowledge graphs, with the help of NLP and machine learning techniques, some studies have realized the semi-automatic extraction and fusion of knowledge, greatly reducing the time and cost required for semantic processing and graph construction. The key technologies involved include entity extraction, relation extraction, and knowledge fusion. Entity extraction involves extracting entities and their attributes from text; commonly used methods include rule-based and machine learning-based methods [18, 19]. Rule-based methods identify entities by defining a series of matching patterns and rules, such as regular expressions and dictionary matching. Machine learning-based methods treat entity extraction as a sequence labeling problem and automatically identify entities by training sequence labeling models (e.g., HMM, CRF, BiLSTM-CRF).
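As a minimal illustration of the rule-based approach, dictionary matching can be sketched as follows. The entity dictionary here is a toy example, not the paper's actual gazetteer:

```python
import re

# Toy dictionary of batik-domain entity types; the terms are illustrative,
# not the actual vocabulary used to build the BPKG.
ENTITY_DICT = {
    "pattern": ["butterfly pattern", "fish pattern", "dragon pattern"],
    "meaning": ["praying for children", "marital harmony"],
}

def extract_entities(text):
    """Rule-based entity extraction via dictionary matching."""
    found = []
    for etype, terms in ENTITY_DICT.items():
        for term in terms:
            for m in re.finditer(re.escape(term), text):
                found.append((m.start(), term, etype))
    # return matches in order of appearance in the text
    return [(term, etype) for _, term, etype in sorted(found)]

sentence = "The butterfly pattern and the fish pattern symbolize marital harmony."
print(extract_entities(sentence))
```

In practice, such rules serve as a high-precision baseline that the sequence labeling models (HMM, CRF, BiLSTM-CRF) then generalize beyond.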

Relation extraction involves identifying the semantic relationships between entities in text, such as “located in” and “belongs to”. Commonly used methods include pattern matching, keyword extraction, and machine learning [20,21,22]. The pattern matching method extracts relationships by defining relation templates and trigger words. The keyword extraction method determines the relationships between entities by identifying the key verbs in sentences. The machine learning method treats relation extraction as a classification problem and predicts the relation types of entity pairs by training classifiers (e.g., CNN, RNN, attention mechanisms).
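The pattern matching method can likewise be sketched with a couple of trigger-word templates. The templates and relation names below are illustrative assumptions, not the paper's actual rules:

```python
import re

# Illustrative trigger-word templates mapping surface forms to relations.
PATTERNS = [
    (r"(?P<head>[\w\s]+?)\s+symbolizes\s+(?P<tail>[\w\s]+)", "symbolize"),
    (r"(?P<head>[\w\s]+?)\s+belongs to\s+(?P<tail>[\w\s]+)", "belong_to"),
]

def extract_relations(sentence):
    """Extract (head, relation, tail) triples via trigger-word matching."""
    triples = []
    for pattern, relation in PATTERNS:
        for m in re.finditer(pattern, sentence):
            triples.append((m.group("head").strip(), relation, m.group("tail").strip()))
    return triples

print(extract_relations("The fish pattern belongs to the animal patterns"))
```

Trigger-word rules like these are brittle to paraphrase, which is exactly the gap the classifier-based methods are meant to close.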

Knowledge fusion is the integration of entities and their relationships from different data sources into a unified knowledge graph to create a more comprehensive and richer knowledge base. It mainly includes two tasks: entity linking and knowledge merging [23, 24]. Entity linking connects entity mentions in text to existing entities or creates new ones; commonly used methods include collaborative filtering, random walks, rule-based, and deep learning methods. Knowledge merging identifies equivalent entities, relationships, and attributes in different data sources and merges them; commonly used methods include similarity-based clustering, logical rule-based reasoning, and ontology matching-based techniques.
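As a rough sketch of similarity-based merging, near-duplicate entity mentions can be clustered by string similarity. This is a crude stand-in for the methods named above; real systems also compare attributes and graph context:

```python
from difflib import SequenceMatcher

def merge_entities(mentions, threshold=0.85):
    """Map each mention to a canonical representative when string
    similarity exceeds a threshold (greedy, first-match clustering)."""
    canonical = []   # one representative per cluster
    mapping = {}     # mention -> representative
    for m in mentions:
        for c in canonical:
            if SequenceMatcher(None, m.lower(), c.lower()).ratio() >= threshold:
                mapping[m] = c
                break
        else:
            canonical.append(m)
            mapping[m] = m
    return mapping

print(merge_entities(["Butterfly pattern", "butterfly pattern ", "fish pattern"]))
```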

According to different functions and application scenarios, knowledge graphs can be divided into two categories: general knowledge graphs and domain knowledge graphs. General knowledge graphs aim to cover a wide range of fields and topics, while domain knowledge graphs focus on specific fields, industries, or topics, with higher requirements for the depth and quality of knowledge.

In the general domain, commonly used knowledge graphs include Wikidata, DBpedia, Freebase, YAGO, and the Google Knowledge Graph [25, 26]. Progress has also been made in domain knowledge graphs for fields such as medicine, social interaction, agriculture, aviation, product design, and cultural heritage [27,28,29]. For example, because knowledge in the cultural heritage domain is large in scale and fragmented, which is not conducive to knowledge dissemination and management, Carriero et al. [30] established ArCo, the Italian cultural heritage knowledge graph. Similarly, Dou et al. [31] used the Bi-GRU model to extract entity relations and constructed a knowledge graph of China’s Twenty-Four Solar Terms. Fan et al. [32,33,34] have conducted a series of studies on China’s intangible cultural heritage. They constructed unimodal and multi-modal knowledge graphs, integrating multi-source heterogeneous data such as text, images, and audio, providing new ideas for the multi-level and multi-angle presentation of intangible cultural heritage knowledge. Yang et al. [35] developed a public cultural knowledge graph platform that supports various application functions such as knowledge query, reasoning, clustering, ranking, similarity, classification, and visualization.

Although knowledge graphs have been widely applied in the cultural heritage domain, research on batik culture is still very limited. As one of the representative projects of China's intangible cultural heritage, batik culture contains rich knowledge in history, art, folklore, and other aspects. However, most of this knowledge exists in unstructured forms such as text and images, which is not conducive to effective knowledge management and utilization. Therefore, constructing a batik knowledge graph to structurally represent and organize batik-related knowledge is of great significance for the digital protection, dissemination, and innovative development of batik culture. Based on the batik knowledge graph, intelligent batik craft display systems, intelligent tools to assist batik pattern design, and batik culture popularization platforms for the public can be developed to effectively promote the dissemination and application of batik.

Image classification

Image classification is an important task in computer vision. Its goal is to assign input images to predefined categories or labels. Early image classification methods mainly relied on handcrafted features and traditional machine learning algorithms. These methods usually include two parts: feature extraction and classification. In the feature extraction stage, commonly used methods include grayscale histograms, edge histograms, texture features (such as Gabor filters), and color histograms; the extracted features are usually low-level or mid-level. In the classification stage, the extracted features are input into traditional statistical methods or machine learning algorithms, such as support vector machines (SVM), the k-nearest neighbors (KNN) algorithm, decision trees, and random forests [36, 37], to train classifiers.

However, traditional methods have some limitations, such as the dependence of feature engineering on expert knowledge and the limited feature representation capability. With the rise of deep learning, methods based on deep neural networks have gradually replaced traditional methods and become the mainstream technology in this field [38]. Unlike traditional methods, deep learning methods can automatically learn hierarchical feature representations without the need for handcrafted features. In these methods, feature extraction and classification tasks are usually performed end-to-end. End-to-end means that the entire process from input images to output classification results is completed in the same neural network, without the need for separate feature extraction and classification steps. Common deep learning image classification models include CNN, ResNet, RNN, and long short-term memory networks (LSTM). In addition, some studies attempt to combine traditional methods with deep learning methods, using deep networks to extract features and then using traditional machine learning algorithms for classification [39], to leverage the advantages of both types of methods.

In recent years, image classification techniques have shown broad application prospects in the field of cultural heritage [40, 41], especially in the classification of patterns and style recognition. Ding et al. [42] proposed a nearest-neighbor method-based image classification model for She clothing, which integrates the texture features and spatial layout features of the clothing texture to improve the classification accuracy. In their follow-up research [43], they introduced CNN and designed a color feature fusion strategy optimized by the flower pollination algorithm, making the classification model simultaneously consider multiple visual features such as color, texture, and space of clothing, achieving better classification results. Kong et al. [44] focused on the pattern classification problem of Yao ethnic clothing and brocade, proposing a multi-target classification method based on Faster R-CNN. This method can not only simultaneously identify multiple patterns in an image but also synchronize pattern classification and localization, providing support for fine-grained analysis and application of patterns. In addition, they also developed a corresponding mobile application, allowing users to understand the meaning of patterns through scanning, greatly promoting the dissemination of Yao culture. To improve the classification effect of CNN on specific datasets, Jia and Liu [45] improved CifarNet and proposed the CalicoNet model, which classified 12 types of blue calico patterns. Furthermore, Fang et al. [46] combined images with text descriptions and proposed a multi-modal image classification model MICMLF, which was validated on the New Year Print and Clay Figurine datasets.

In the research of batik image or pattern classification, scholars have also carried out some work. Nurhaida et al. [1] made an early attempt to apply machine learning to Indonesian batik image classification. They proposed a method based on the gray-level co-occurrence matrix and KNN classifier, achieving an accuracy of 85% on a self-built dataset. Danis et al. [2] designed a classification method based on CNN for Japanese batik images, obtaining an accuracy of 90.14% on a self-built dataset. Pramerdorfer et al. [3] adopted the idea of transfer learning, using the pre-trained VGG16 model to classify Indonesian batik images, achieving an accuracy of 89% on a public dataset. These studies show that both traditional machine learning methods and deep learning-based methods have made positive progress in the batik image classification task. However, reviewing the existing literature, we find that research on Chinese batik in the image classification field is still relatively scarce, with a large gap compared to other countries. In our survey, no research work specifically targeting Chinese batik image classification has been found. One reason for this situation may be that the digitization level of Chinese batik is relatively low, and large-scale, high-quality image datasets have not been constructed, which to a certain extent limits the application of machine learning and deep learning-based classification methods. This situation reflects, to a certain extent, that the development level of Chinese batik culture still lags behind other countries, and there is an urgent need to strengthen scientific and technological innovation to promote the digital protection and intelligent inheritance of Chinese batik culture.


Research framework

To help understand and apply batik knowledge, we comprehensively utilize knowledge graphs, NLP techniques, and an improved ResNet34 model to establish a dual-channel mechanism between batik images and culture. This mechanism integrates the image features and cultural semantic information of batik patterns. On the one hand, it can retrieve relevant images based on cultural information; on the other hand, it can automatically identify the category of a batik image and then associate it with the corresponding cultural semantics. Our research framework is shown in Fig. 4. In the knowledge graph part, we first take the text data from the literature as the object, then use NLP techniques such as Word2Vec and dependency parsing to automatically extract the key entities (such as "butterfly pattern" and "Butterfly Mom") and their relations (such as "mean" and "belong to") in the batik domain. Next, we refer to the seven-step method to construct the ontology, perform semantic modeling and normalization processing on the extracted entities and relations, and form a structured and semantic batik knowledge graph. Finally, we use the Neo4j graph database to store the knowledge and provide visualization and query interfaces. In the batik image classification model part, we first construct a large-scale batik pattern image dataset through data collection and annotation. Then, based on the classic ResNet34 model, we propose an improved structure that enhances the model's ability to extract and represent the features of batik patterns. Finally, we train and evaluate the improved ResNet34 model on the constructed batik pattern image dataset and apply it to the batik image classification task.

Fig. 4
figure 4

Research framework

Word embedding

Word embedding is a technique that maps words to vectors, representing the semantic relationships between them through their positions in the vector space. Compared to traditional one-hot representation, word embeddings can better capture the semantic information of words and have been widely used in NLP tasks. Considering the complexity of the task and the computational resources, we chose the lightweight model Word2Vec, which is sufficient for our task. The Word2Vec model includes two architectures: Continuous Bag-of-Words (CBOW) and Skip-gram models. Figure 5 shows the structures of these two models, which consist of an input layer, a projection layer, and an output layer.

Fig. 5
figure 5

Two training models of Word2Vec. a CBOW model b Skip-gram model

The training goal of CBOW is to predict the center word from its context words. Specifically, given a window of 2c + 1 words w_{t-c}, …, w_{t-1}, w_t, w_{t+1}, …, w_{t+c}, where w_t is the center word, c is the context window size, and T is the total number of words in the corpus, the objective function of CBOW can be expressed as:

$$L_{CBOW} = \frac{1}{T}\sum\nolimits_{t = 1}^{T} {\log p(w_{t} |w_{t - c} , \cdots ,w_{t - 1} ,w_{t + 1} , \cdots ,w_{{t + {\text{c}}}} )}$$

where p(w_t | w_{t-c}, …, w_{t-1}, w_{t+1}, …, w_{t+c}) represents the conditional probability of the center word w_t given its context words w_{t-c}, …, w_{t-1}, w_{t+1}, …, w_{t+c}.

Skip-gram predicts the context words through the center word. Its objective function is:

$$L_{SK} = \frac{1}{T}\sum\nolimits_{t = 1}^{T} {\sum\nolimits_{ - c \le j \le c,j \ne 0} {\log p(w_{t + j} |w_{t} )} }$$

where p(w_{t+j} | w_t) represents the conditional probability of the context word w_{t+j} given the center word w_t.
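Both objectives range over (center, context) pairs drawn from a sliding window over the corpus. Generating those pairs can be sketched as follows, using an illustrative toy corpus:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs as used by Skip-gram.

    CBOW trains on the same window contents with the roles reversed:
    the context words jointly predict the center word.
    """
    pairs = []
    for t, center in enumerate(tokens):
        for j in range(-window, window + 1):
            if j != 0 and 0 <= t + j < len(tokens):
                pairs.append((center, tokens[t + j]))
    return pairs

corpus = ["butterfly", "pattern", "symbolizes", "ancestors"]
print(skipgram_pairs(corpus, window=1))
```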

During training, CBOW and Skip-gram learn the vector representations of words by maximizing the objective function. However, since the size of the corpus is usually large, directly computing the conditional probability is costly. To improve training efficiency, the Word2Vec model employs a strategy called Negative Sampling to approximate the conditional probability.

The idea of Negative Sampling is that for each positive sample (w_t, w_c), K negative samples (w_t, w_k), k = 1, …, K, are randomly drawn, and then the following function is maximized:

$${\text{loss}} = \log \sigma (v_{{w_{c} }}^{^{\prime}T} v_{{w_{t} }} ) + \sum\nolimits_{k = 1}^{K} {{\rm E}_{{w_{k} \sim P_{n} (w)}} [\log \sigma ( - v_{{w_{k} }}^{^{\prime}T} v_{{w_{t} }} )]}$$

where w_t is the center word and w_c the context word; v_{w_t} and v'_{w_c} are their respective embeddings, and v'^T_{w_c} v_{w_t} is their dot product, which measures the similarity between them; P_n(w) is the negative-sampling distribution, usually taken as the word-frequency distribution of the training corpus raised to the power 3/4; E_{w_k ~ P_n(w)}[·] denotes the expectation of the bracketed expression over negative samples w_k drawn from P_n(w); and σ(·) is the sigmoid function, which maps a real value into the interval (0, 1) and can be interpreted as a probability.
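The negative-sampling objective for a single (w_t, w_c) pair can be computed directly from the embeddings. A minimal NumPy sketch, with random vectors standing in for learned embeddings:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def negative_sampling_loss(v_t, v_c_prime, v_neg_prime):
    """Negative-sampling objective for one (center, context) pair.

    v_t: embedding of the center word; v_c_prime: output embedding of
    the true context word; v_neg_prime: (K, d) matrix of K sampled
    negative words' output embeddings. Returns the quantity maximized
    during training (always <= 0, since log sigmoid <= 0).
    """
    positive = np.log(sigmoid(v_c_prime @ v_t))
    negative = np.sum(np.log(sigmoid(-(v_neg_prime @ v_t))))
    return positive + negative

rng = np.random.default_rng(0)
v_t = rng.normal(size=4)
v_c = rng.normal(size=4)
v_neg = rng.normal(size=(5, 4))   # K = 5 negative samples
print(negative_sampling_loss(v_t, v_c, v_neg))
```

Only K + 1 dot products are needed per update, instead of a softmax over the whole vocabulary, which is the efficiency gain described above.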

Knowledge graph

A knowledge graph is a structured knowledge representation that describes the relationships between entities in the form of a graph. In a knowledge graph, knowledge is stored and organized as "entity-relation-entity" triples, represented as G = (E, R, S). Here, E is the set of entities, containing n different entities, i.e., E = {e_1, e_2, …, e_n}. Each entity represents a concrete object or abstract concept, such as "animal pattern" and "butterfly pattern". R is the set of relations, containing m different relations, i.e., R = {r_1, r_2, …, r_m}. Each relation represents the connection between two entities, such as “belong to” and “symbolize”. S ⊆ E × R × E is the set of triples. Each triple (e_i, r_k, e_j) represents a piece of knowledge, indicating that the relation r_k holds between entity e_i and entity e_j. For example, the triple (butterfly pattern, symbolize, respecting ancestors) expresses that the butterfly pattern symbolizes respect for ancestors.
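The G = (E, R, S) formalization maps naturally onto a minimal in-memory triple store; a sketch using the paper's example entities and relations (real deployments use a graph database, as described next):

```python
# A minimal in-memory triple store illustrating the G = (E, R, S) model.
class TripleStore:
    def __init__(self):
        self.triples = set()   # the set S of (entity, relation, entity) triples

    def add(self, head, relation, tail):
        self.triples.add((head, relation, tail))

    def query(self, head=None, relation=None, tail=None):
        """Return triples matching the given components (None = wildcard)."""
        return [
            (h, r, t) for (h, r, t) in self.triples
            if (head is None or h == head)
            and (relation is None or r == relation)
            and (tail is None or t == tail)
        ]

kg = TripleStore()
kg.add("butterfly pattern", "belong to", "animal pattern")
kg.add("butterfly pattern", "symbolize", "respecting ancestors")
kg.add("fish pattern", "belong to", "animal pattern")

print(kg.query(relation="belong to", tail="animal pattern"))
```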

Knowledge graphs are usually stored in graph databases. A graph database is a NoSQL database that uses a graph structure (consisting of nodes and edges) to store and query data: entities are stored as nodes and relations as directed edges connecting two nodes. Each node and edge can carry its own properties to describe additional information about the corresponding entity or relation. Graph databases thus provide support for the storage, querying, and visualization of knowledge graphs.
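
Conceptually, the triple set S and a simple relation lookup can be sketched in Python (an in-memory toy stand-in for the graph database; the example triples come from the text above):

```python
# A minimal in-memory sketch of the triple set S in G = (E, R, S);
# a production system would store these in a graph database such as Neo4j.
triples = [
    ("butterfly pattern", "belong to", "animal pattern"),
    ("butterfly pattern", "symbolize", "respecting ancestors"),
    ("pomegranate pattern", "belong to", "plant pattern"),
]

def query(relation, store=triples):
    """Return all (head, tail) entity pairs connected by `relation`."""
    return [(h, t) for h, r, t in store if r == relation]
```

A graph database generalizes this lookup with indexed pattern matching over nodes and edges, plus per-node and per-edge properties.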

Improved ResNet34 model

In deep learning, CNN has become the mainstream method for image classification tasks. By stacking multiple convolutional layers and pooling layers, it can automatically learn and extract hierarchical features from images and integrate the feature representation with the classifier in an end-to-end architecture. However, traditional CNN models often encounter the problem of gradient vanishing or exploding when the network deepens to a certain extent, making it difficult to improve the classification performance.

To solve this problem, He et al. [47] proposed the residual network (ResNet). The idea of ResNet is to introduce the concept of identity mapping, i.e., adding "shortcut connections" that directly pass input information to the output in the network. In this way, even when the network is very deep, gradients can be directly propagated to shallower layers, thus alleviating the problem of gradient vanishing. The design principle of the ResNet is shown in Fig. 6.

Fig. 6
figure 6

Design principle of the ResNet

In Fig. 6, xa-1 is the input feature of the a-th layer of the network. It should be noted that ResNet has multiple variants, such as ResNet18, ResNet34, ResNet50, and ResNet101, with the number representing the number of layers in the model. H(xa-1) represents the base mapping, expressed as:

$$H(x_{a - 1} ) = F(x_{a - 1} ) + x_{a - 1}$$

where H(xa-1) is the ideal mapping we hope the network will learn. However, ResNet does not learn this mapping directly but lets the network fit a residual mapping F(xa-1), expressed as:

$$F(x_{a - 1} ) = H\left( {x_{a - 1} } \right) - x_{a - 1}$$

From Eq. (5), it can be seen that learning the base mapping H(xa-1) is equivalent to learning the residual mapping F(xa-1). However, the residual mapping is usually easier to optimize, especially when the network is deep: if the ideal mapping is the identity, the network only needs to drive the residual toward 0, whereas fitting the identity directly with a stack of nonlinear layers requires learning a complex function.

The basic component unit of ResNet is the basic-block, and Fig. 7 shows the structure composed of two basic-blocks.

Fig. 7
figure 7

Schematic diagram of two basic-blocks

In Fig. 7, Conv represents convolution; BN is batch normalization; ReLU is the activation function. The mathematical expression of this residual idea is as:

$$H_{a} = f(F(x_{a - 1} ,\omega_{a - 1} ) + H_{a - 1}^{\prime} )$$

where Ha represents the output feature of the residual structure; F(xa-1, ωa-1) represents the residual mapping, with xa-1 the input feature and ωa-1 the weights of the a-th layer; H′a-1 represents the (possibly nonlinearly mapped) output of the previous residual structure, ensuring that it can be added to the residual mapping; and f is the activation function.
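
The residual computation above can be illustrated with a minimal NumPy sketch, assuming a single-layer fully connected residual branch and omitting convolution and batch normalization for brevity:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, W):
    """Sketch of H_a = f(F(x_{a-1}, w_{a-1}) + x_{a-1}) with a one-layer
    residual branch F(x) = W @ x; Conv and BN are omitted for brevity."""
    return relu(W @ x + x)    # the shortcut adds the input to the branch output
```

With W = 0 the block reduces to the identity on non-negative inputs, which illustrates why a residual of zero is easy to learn when the ideal mapping is the identity.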

Although ResNet can effectively solve the problem of gradient vanishing, the original architecture still has shortcomings for the task of batik image classification. First, the residual blocks are weak at modeling global and long-distance dependencies. Second, as the network deepens, the stacking of residual blocks may cause the feature maps to lose resolution and semantic information. To address these issues, this paper proposes an improved ResNet model, as shown in Fig. 8. Balancing performance against computational efficiency, we select ResNet34 as the base architecture for improvement.

Fig. 8
figure 8

Improved ResNet34 model

Compared with the original ResNet34, the improvements in this paper are mainly reflected in two aspects:

  1. In the residual block, the 1×1 convolution is replaced by a combination of average pooling and convolution. The average pooling layer reduces the resolution of the feature map, thus introducing multi-scale features. It also increases the sparsity of the network, which helps to improve the model's generalization ability and robustness.

  2. After two residual blocks, a long-range residual connection is introduced to add the original input features to the output features of the residual block. This cross-layer feature reuse mechanism helps the network capture global and long-distance dependencies while alleviating the gradient vanishing problem.

Through the above improvements, the ResNet34 model in this paper can better adapt to the characteristics and requirements of the batik image classification task. On the one hand, the introduction of multi-scale features and the increase of sparsity enable the model to capture the rich texture and detail information in batik images. On the other hand, the addition of long-range residual connections enhances the model's ability to model global semantic information, which helps to distinguish similar batik categories. Figure 9 shows the overall classification model framework used in this paper, where the input image size is 3×256×256.
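
The two improvements can be sketched numerically, assuming NumPy arrays in (C, H, W) layout; `avg_pool2d` and `long_range_residual` below are simplified illustrative stand-ins for the actual layers, not the model implementation itself:

```python
import numpy as np

def avg_pool2d(x, k=2):
    """k x k average pooling (stride k) on a (C, H, W) feature map,
    standing in for the pooling added inside the residual block."""
    c, h, w = x.shape
    x = x[:, : h - h % k, : w - w % k]               # trim so H and W divide by k
    return x.reshape(c, h // k, k, w // k, k).mean(axis=(2, 4))

def long_range_residual(x, block_a, block_b):
    """Long-range connection: add the original input to the output of two
    stacked residual blocks (output shape is assumed to match the input)."""
    return block_b(block_a(x)) + x
```

Average pooling halves the spatial resolution (introducing a coarser scale), while the long-range connection lets features skip two whole blocks, shortening the gradient path.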

Fig. 9
figure 9

Batik image classification model framework


BPKG construction

Batik ontology model construction

Ontology originates from philosophy and is later used in computer science to describe conceptual entities and their relationships. As a key component of the batik knowledge graph, the batik ontology provides a clear structure and standardized semantics for the construction of the knowledge graph. We adopt the seven-step method [48] to construct the batik ontology model and define four ontology concepts: pattern, meaning, worship consciousness, and prototype source, as shown in Table 1. We also define nine relationships based on these ontology concepts, such as mean, belong to, worship, and origin from, as shown in Table 2.

Table 1 Definition of ontology concepts
Table 2 Definition of relationships between concepts

Batik knowledge extraction, fusion, and visualization

Considering the accuracy, richness, and scale of batik knowledge, we select literature and book data for research. We downloaded 100 journal papers and theses from CNKI and obtained 30 e-books from the Chaoxing platform and the Internet. We then extracted the text data from these documents and analyzed it using NLP techniques.

When processing Chinese text, word segmentation is required. We wrote a Python program that calls the Jieba library and adds a custom batik dictionary to improve the segmentation. For example, for the sentence "蝴蝶妈妈造就了苗族祖先姜央" (Butterfly Mother created the Miao ancestor Jiang Yang), without the dictionary the segmentation result is "蝴蝶/妈妈/造就/了/苗族/祖先/姜央" (Butterfly / Mother / created / the / Miao / ancestors / Jiang Yang); after adding the dictionary, the result becomes "蝴蝶妈妈/造就/了/苗族/祖先/姜央" (Butterfly Mother / created / the / Miao / ancestors / Jiang Yang). The latter aligns with the actual usage in Miao culture, because "Butterfly Mother" is a single complete term. Our batik dictionary contains 50 terms.
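
The effect of the custom dictionary can be illustrated with a simplified forward-maximum-matching segmenter (a toy stand-in for Jieba, which uses a more sophisticated algorithm; the dictionaries below are illustrative):

```python
def max_match(sentence, dictionary, max_len=4):
    """Forward maximum matching: at each position, take the longest
    dictionary word starting there, falling back to a single character."""
    tokens, i = [], 0
    while i < len(sentence):
        for j in range(min(len(sentence), i + max_len), i, -1):
            if sentence[i:j] in dictionary or j == i + 1:
                tokens.append(sentence[i:j])
                i = j
                break
    return tokens
```

Adding "蝴蝶妈妈" to the dictionary makes the segmenter emit it as one token instead of splitting it into "蝴蝶" and "妈妈", which mirrors the behavior described above.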

In the entity extraction task, we adopt a method based on the batik dictionary. Considering that Skip-gram achieves higher accuracy in word embedding training, this paper uses it for training. By calling the Word2vec model in the Gensim library with the parameter settings shown in Table 3, we obtain the embedding representations of the words.

Table 3 Word2vec parameter settings

Based on the trained word embeddings, we use the terms in the batik dictionary as center words and cluster the words in the text data by cosine distance, obtaining 50 clusters. We then manually filter and de-duplicate these clusters, finally obtaining an entity set containing 120 terms. Compared with rule-based and machine learning-based methods, the dictionary-based method is simple to implement and computationally efficient; its disadvantage is that it struggles to discover new entities outside the dictionary. The manual filtering of the clustering results therefore also serves to expand the entity set and improve the coverage of entity extraction.
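
The dictionary-centered clustering step can be sketched as follows, assuming the word embeddings have already been trained (toy 2-dimensional vectors and made-up words are used for illustration):

```python
import numpy as np

def cosine(u, v):
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

def cluster_by_center(word_vecs, center_words):
    """Assign every non-center word to the nearest dictionary center word
    by cosine similarity; embeddings would come from the Skip-gram model."""
    clusters = {c: [] for c in center_words}
    for w, v in word_vecs.items():
        if w in clusters:
            continue                                  # skip the centers themselves
        best = max(center_words, key=lambda c: cosine(v, word_vecs[c]))
        clusters[best].append(w)
    return clusters
```

Each resulting cluster gathers candidate entity terms semantically close to one dictionary term, ready for the manual filtering step described above.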

The relation extraction task is built upon entity extraction, aiming to extract the semantic relations between two or more entities from text data. In this research, we use the Language Technology Platform (LTP) developed by the Research Center for Social Computing and Information Retrieval at Harbin Institute of Technology for relation extraction. LTP provides various Chinese NLP functions, among which the dependency parsing module can automatically identify the dependency relations between words in a sentence, providing an important grammatical basis for relation extraction.

Specifically, we first input the entity pairs (entity1, entity2) obtained from entity extraction and their corresponding sentences into LTP. For example, for the sentence "石榴纹是一种常见植物纹" (Pomegranate pattern is a common plant pattern), which contains the entities "pomegranate pattern" and "plant pattern", we input it into LTP's dependency parsing module. The module automatically splits the sentence into the words "石榴纹" (pomegranate pattern), "是" (is), "一种" (a), "常见" (common), and "植物纹" (plant pattern), and performs part-of-speech tagging for each word. Next, LTP identifies the dependency relations between the words in the sentence based on predefined Chinese dependency parsing rules. For example, the rule "if a noun forms a subject-predicate relationship with '是' (is), then the noun is the subject" can be used to identify the subject-predicate relationship between "pomegranate pattern" and "is". Similarly, the rule "if an 'is' predicate exists between two nouns, and the second noun has a verb-object relationship with 'is', then the two nouns have an 'is' relationship" can identify the relationship between "pomegranate pattern" and "plant pattern".

Through the above analysis, LTP obtains the dependency parsing tree of the sentence, as shown in Fig. 10. It can be seen from the figure that there exists an SBV (subject-predicate relationship) between “pomegranate pattern” and “is”, and a VOB (verb-object relationship) between “is” and “plant pattern”. According to our predefined relation extraction rule, i.e., "if two entities in a sentence are connected through SBV and VOB relationships, then there exists an 'is' relationship between these two entities", we extract the triple (pomegranate pattern, is, plant pattern) from this parsing tree.
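
The SBV/VOB rule above can be sketched as a small function over dependency arcs. The arc format `(head_index, relation)` with 1-based head indices is an assumption made for illustration; the actual LTP output format may differ:

```python
def extract_is_triples(words, arcs):
    """Apply the rule: entity1 --SBV--> '是' and '是' --VOB--> entity2
    implies (entity1, 'is', entity2). arcs[i] = (head_index, relation)
    for word i, with 1-based head indices (an assumed LTP-like format)."""
    triples = []
    for i, (head_i, rel_i) in enumerate(arcs):
        if rel_i != "SBV" or words[head_i - 1] != "是":
            continue                                  # word i is not a subject of '是'
        for j, (head_j, rel_j) in enumerate(arcs):
            if rel_j == "VOB" and head_j == head_i:   # object of the same '是'
                triples.append((words[i], "is", words[j]))
    return triples
```

Applied to the parse of the example sentence, this rule yields the triple (pomegranate pattern, is, plant pattern).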

Fig. 10
figure 10

Dependency parsing results

Furthermore, we perform normalization processing on the extracted triples based on the entity attributes and relation patterns defined in the ontology. For example, the ontology stipulates that “plant pattern” is a pattern category, and if a pattern entity has an “is” relationship with it, then it can be converted to a “belong to” relationship. Therefore, we finally obtain the normalized triple (pomegranate pattern, belong to, plant pattern).

We repeat the above steps for all sentences containing entity pairs, obtaining a large number of triples. However, because some sentences have similar grammatical structures, the triples they generate may be duplicate; moreover, for the same entity pair, the same or similar semantic relations may be expressed in different sentences, resulting in the existence of multi-valued triples. To solve this problem, we adopt a de-duplication strategy, deleting completely duplicate triples; for multi-valued triples, we perform de-duplication through manual filtering, i.e., judging whether different triples express the same semantics, and keeping only one if they do.
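
The de-duplication of exactly repeated triples can be sketched as follows (semantically equivalent multi-valued triples still require the manual review described above):

```python
def deduplicate(triples):
    """Remove exactly duplicated triples while preserving first-seen order."""
    seen, unique = set(), []
    for t in triples:
        if t not in seen:
            seen.add(t)
            unique.append(t)
    return unique
```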

After a series of processing steps, we finally obtain 419 triples. To intuitively display these complex semantic relations, we use the Neo4j graph database for storage and generate the BPKG shown in Fig. 11. In this knowledge graph, nodes represent batik pattern entities, edges represent relations between entities, and different types of relations are distinguished by edges of different colors. Through this graphical method, we can more clearly and intuitively display the connections between batik patterns, which helps to better understand and apply this knowledge.

Fig. 11
figure 11

The structure of BPKG. Nodes of different colors represent different types of entities, and the directed edges between nodes represent the relationships between entities

Semantic retrieval

Semantic retrieval of the BPKG can be realized through Cypher, a declarative query language for querying and manipulating graph databases. Users can express complex semantic query needs by defining the matching of nodes and relations. In Cypher, nodes are represented by parentheses "()" and relations by square brackets "[]". For example, (ni)-[r]->(nj) denotes that node ni is connected to node nj by relation r. The commonly used clauses in Cypher include MATCH, WHERE, and RETURN. MATCH defines the basic matching pattern of the query and applies to both nodes and relations; WHERE specifies the matching conditions, usually constraining the properties of the variables appearing in the match; RETURN specifies the returned results.

For example, if users want to query all pattern types in the BPKG, they can use the following code: $ MATCH (n)-[r:属于]->(m) RETURN n, r, m. This query matches all node pairs (n, m) connected by the "属于" (belong to) relationship and returns them. n and m represent the start and end points of the "belong to" relation, respectively, and r represents the relation itself. The query results are shown in Fig. 12. Through this hierarchical classification, users can quickly understand how batik patterns are organized and browse the specific patterns under each category.

Fig. 12
figure 12

Belong to relationship. In BPKG, we divide batik patterns into three main categories: animal, geometric, and plant patterns. Among them, animal patterns are further divided into eight subcategories: butterfly, fish, bird, dragon, frog, chicken, human, and other animal patterns, containing a total of 43 specific patterns. Plant patterns are divided into three subcategories: flower, fruit, and leaf patterns, containing a total of 18 patterns. Geometric patterns are divided into 23 subcategories such as drum, vortex, and dot patterns

In addition to querying the classification of patterns, users can also retrieve patterns by their meaning. For example, to query all patterns that imply praying for having children, the following code can be executed: $ MATCH p = ()-[r:寓意]->(m:子嗣绵延) RETURN p. This query matches all pattern nodes that have a "子嗣绵延" (praying for having children) relationship and returns the results. Similarly, to query the meanings of all patterns, the query statement is: $ MATCH (n)-[r:寓意]->(m) RETURN n, r, m. The query results are shown in Fig. 13. It can be seen that each batik pattern carries specific cultural meanings. For example, the butterfly represents "respecting ancestors", the pear blossom means "hope", and the pomegranate pattern carries both "loving nature" and "praying for having children". Through the BPKG, the cultural connotations hidden behind the patterns are visualized, helping users to better understand and inherit batik culture.

Fig. 13
figure 13

Mean relationship. We summarize the meanings of batik patterns into 15 categories: 热爱自然 (loving nature), 敬重祖先 (respecting ancestors), 子嗣绵延 (praying for having children), 纳吉求福 (praying for blessing), 夫妻和睦 (marital harmony), 代表女性 (women), 代表男性 (men), 阴阳平衡 (yin-yang balance), 趋吉避凶 (seeking good fortune and avoiding misfortune), 民族迁徙情怀 (migration), 生生不息 (endlessness), 福寿延年 (blessing and longevity), 期盼丰收 (harvest), 宗族团结 (solidarity), and 希望 (hope)

In summary, we construct a BPKG containing 120 entities and 419 triples, covering multiple dimensions of information such as the classification, meaning, and worship, with high knowledge coverage and semantic richness. Through the Cypher query language, users can perform semantic retrieval to obtain the patterns and their semantic information of interest. This knowledge not only helps the digital protection of batik culture but also provides more materials and inspiration for the design and application of batik patterns.

Pattern classification


To construct the batik pattern dataset, we visited the Qiandongnan Miao and Dong Autonomous Prefecture, Guizhou Province, China, and collected more than 1,000 original images of batik objects using professional equipment. In the image preprocessing stage, we first manually filtered the original images based on the completeness and clarity of the patterns, removing some low-quality images. Then, we used Photoshop software to extract independent pattern samples from the original images, using the pattern boundaries as the cropping regions. Finally, we obtained a batik pattern dataset containing 15,148 images. This dataset covers the eight most representative batik patterns: bird (2853 images), butterfly (2064 images), dragon (2235 images), drum (2065 images), fish (2122 images), pomegranate (1803 images), roxburgh rose (2006 images), and plant (3145 images). Note that in this study, the pomegranate and roxburgh rose are excluded from the plant category. Some sample images are shown in Fig. 14.

Fig. 14
figure 14

Sample examples of the batik pattern dataset. a bird; b butterfly; c dragon; d drum; e fish; f plant; g pomegranate; h roxburgh rose

Due to differences in acquisition conditions and cropping regions, the samples in the dataset show significant heterogeneity in size, color, texture, and other aspects. At the same time, some samples have small inter-class differences and large intra-class differences, which interferes with the model's feature extraction and classification. To reduce this interference, we normalize all images to a size of 3×256×256, which largely retains the detailed features of the patterns without placing excessive hardware requirements on the deep learning model.

Experimental environment

The hardware and software configurations in the experiments are as follows: Intel(R) Core(TM) i7-11800H processor (8 cores, 16 threads, base frequency 2.3 GHz), 16 GB of RAM, and a GeForce RTX 3080 GPU with 12 GB of video memory; the deep learning framework is PyTorch 1.9.1 with Torchvision 0.10.1.

To evaluate the performance of the proposed model, we select five classic CNNs as benchmark models, namely AlexNet, VGG16, Inception v3, ResNet34, and ResNet50. Among them, AlexNet is the first deep network structure that achieved breakthrough results in large-scale image classification tasks. It promoted the development of deep learning in computer vision. VGG16 increases the depth of the network while keeping the receptive field unchanged by using a series of small-sized convolution kernels. This improves the ability of feature representation. Inception v3 adopts multi-scale convolutional kernels and a parallel convolutional structure, which improves the network's adaptability to scale. By introducing residual connections, ResNet34 and ResNet50 effectively solve the problem of difficult training in deep networks, making the training of ultra-deep networks possible. These five models have been widely used and validated in many visual tasks such as image classification and object detection, making them very suitable as benchmarks for measuring the performance of new models. In the experiments, to ensure fairness of comparison, we do not use the pre-trained weights of these models on other large-scale datasets, but train them from scratch on our own dataset, making their starting state consistent with our model.

We randomly divide the dataset into training and test sets in a ratio of 8:2, i.e., 11,538 training samples and 3,610 test samples. This division ratio is a common empirical value, which can ensure a sufficient number of training samples while leaving enough samples for performance evaluation in the test set. The training parameters of our model are shown in Table 4, using the Adam optimizer for training, with 100 iterations and a batch size of 8.

Table 4 Training parameter setting

Experimental results and analysis

To comprehensively evaluate the performance of the classification model, we adopt four commonly used indicators: accuracy, precision, recall, and F1-score. The calculation formulas of these indicators are as follows:

$$Accuracy = \frac{TP + TN}{{TP + TN + FP + FN}}$$
$$Precision = \frac{TP}{{TP + FP}}$$
$$Recall = \frac{TP}{{TP + FN}}$$
$$F1 - Score = \frac{2 \times Recall \times Precision}{{Recall + Precision}}$$

where TP represents the number of samples whose label is true and whose prediction is also true; TN represents the number whose label is false and whose prediction is also false; FP represents the number whose label is false but whose prediction is true; and FN represents the number whose label is true but whose prediction is false.
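
Given a confusion matrix like those in Fig. 15, the four indicators can be computed as follows (a NumPy sketch; macro averaging over the eight classes is an assumption about the averaging scheme):

```python
import numpy as np

def macro_metrics(cm):
    """Accuracy and macro-averaged precision/recall/F1 from a confusion
    matrix cm, where cm[i, j] = samples of true class i predicted as j."""
    tp = np.diag(cm).astype(float)
    precision = tp / cm.sum(axis=0)                   # per-class TP / (TP + FP)
    recall = tp / cm.sum(axis=1)                      # per-class TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, precision.mean(), recall.mean(), f1.mean()
```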

Inputting the test set samples into the six trained classification models, we obtain the confusion matrices shown in Fig. 15. It can be seen that different models show different advantages on the task of classifying eight batik patterns.

Fig. 15
figure 15

Confusion matrix diagrams of classification models. a AlexNet; b VGG16; c Inception v3; d ResNet50; e ResNet34; f Our model

From an overall perspective, there are some differences in the classification effects of the eight patterns. We calculated the accuracy of different patterns based on the confusion matrices, as shown in Table 5. Among them, the drum pattern achieved the best performance, with an average accuracy of 98.79% across the six models. The likely reason is that the texture features of the drum pattern are more prominent and the differences between its samples smaller, making it easier to recognize. In comparison, the classification effects of pomegranate and roxburgh rose are slightly lower, averaging 64.54% and 73.47%, respectively; the accuracy of the pomegranate pattern is only 43.77% under AlexNet. Further analysis of the confusion matrices shows that pomegranate and roxburgh rose patterns had the most samples misclassified as the plant pattern, with 165 and 306 misclassified samples, respectively. This is because pomegranate and roxburgh rose are themselves plants and differ little from the plant samples in shape and texture, which poses a greater challenge to feature extraction and classification.

Table 5 Accuracy comparison of different patterns

Specifically, in bird classification, ResNet34 performs best with 571 correctly classified samples, and our model follows closely with 570, only one fewer. In contrast, AlexNet performs the worst, correctly classifying only 519 samples. Similarly, in butterfly classification, our model classifies only one sample fewer than ResNet34. In dragon classification, ResNet34 ranks first with 437 correctly classified samples, and our model follows closely with 431; AlexNet again performs the worst, with 325. In drum classification, our model, ResNet34, and ResNet50 all achieve the best performance, each correctly classifying 413 samples. In fish classification, our model achieves the highest number of correctly classified samples (425), outperforming the other five models, while AlexNet manages only 343. In plant classification, our model performs best with 566 correctly classified samples, whereas AlexNet correctly classifies only 486. Likewise, in pomegranate classification, our model performs best (266) and AlexNet worst (158). In roxburgh rose classification, our model (329) clearly outperforms ResNet34 (300). These results indicate that our model and ResNet34 show the best performance across the eight pattern classification tasks, followed by ResNet50, Inception v3, and VGG16, while AlexNet performs the worst.

To evaluate the performance of each model under different classification thresholds, we plot the ROC curves shown in Fig. 16. The ROC curve reflects the relationship between the true positive rate (TPR) and the false positive rate (FPR) under different classification thresholds. Among them, TPR represents the proportion of correctly classified positive samples among all positive samples, and FPR represents the proportion of incorrectly classified negative samples among all negative samples. The closer the curve is to the upper left corner, the better the classification performance of the model. The area under the curve (AUC) is a commonly used comprehensive evaluation indicator, with values ranging from 0 to 1, and larger values indicating better classifier performance.
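
The TPR/FPR pairs that make up an ROC curve can be computed with a short threshold sweep (a toy binary sketch; the multi-class curves in Fig. 16 would be computed one class versus the rest):

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs over all score thresholds for a binary problem;
    plotting these points yields the ROC curve."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in sorted(set(scores), reverse=True):       # sweep thresholds high to low
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return [(0.0, 0.0)] + points
```

The AUC is then the area under the polyline through these points; a perfect ranking passes through (0, 1).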

Fig. 16
figure 16

ROC curves of classification models. a AlexNet; b VGG16; c Inception v3; d ResNet50; e ResNet34; f Our model

From Fig. 16, we can see that our model achieves excellent performance on all patterns, with its ROC curve being the closest to the upper left corner and having the highest AUC value of 0.9920. ResNet34 and ResNet50 also performed well, with average AUC values of 0.9895 and 0.9879, respectively. In comparison, the ROC curves of Inception v3, VGG16, and AlexNet are relatively farther from the upper left corner, with average AUC values of 0.9797, 0.9767, and 0.9564, respectively, indicating their relatively poorer classification performance. Especially in the pomegranate and roxburgh rose classification, the advantage of our model is more obvious. This may be because they contain a large number of similar features, and our model, by fusing local and global information, better captures and utilizes these key features, thus obtaining more accurate classification results.

To further evaluate the six models, we calculate their accuracy, precision, recall, and F1-score, with detailed data shown in Table 6. Our model achieves the best results on all indicators, with an accuracy of 0.9446, precision of 0.9447, recall of 0.9362, and F1-score of 0.938. ResNet34 and ResNet50 also perform well, but our model outperforms ResNet50 by more than 1% on all indicators. This indicates that although deeper CNNs can learn higher-dimensional feature representations, ResNet50 begins to show slight overfitting on the batik dataset, possibly due to the limited sample size and longer gradient backpropagation paths. Inception v3 and VGG16 perform worse than ResNet50 but better than AlexNet, which performs the worst, with values below 0.79 on all indicators. These results further verify the effectiveness and superiority of our model on the batik pattern classification task.

Table 6 Comparison of classification indicators


Batik, as a unique decorative art, has played an important role in the history of Chinese ethnic minorities. Batik patterns not only have aesthetic value but also carry the cultural memory and emotional identity of the Miao people. However, amid the changes of modern society, batik culture faces the dilemma of insufficient inheritance and accelerating loss.

To promote the dissemination and protection of Chinese batik, this paper comprehensively applies knowledge graphs, NLP, deep learning, and other artificial intelligence technologies to construct a dual-channel mechanism connecting batik images and cultural knowledge. First, entities and relationships are extracted from a large number of documents through NLP techniques, then stored and visualized with the Neo4j database. Second, an improved ResNet34 model is proposed to raise the accuracy of image classification. Finally, the improved model is applied to the automatic classification of batik images, providing a tool for understanding them.

This research reduces the difficulty for non-professional users to understand and apply batik knowledge, opening up new paths for the cross-border dissemination and innovative application of batik culture. Whether they are professional designers or the general public interested in batik, they can use this mechanism to cognize, understand, appreciate, and innovate batik from different perspectives and at different levels. In future research, we will strive to further expand the scale of BPKG, enrich its semantic associations, optimize the retrieval and question-answering capabilities, and develop a cultural popularization platform, hoping to provide more reliable theoretical foundations and practical support for the dissemination and development of cultural heritage.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request.


  1. Nurhaida I, Noviyanto A, Manurung R, Arymurthy AM. Automatic Indonesian's batik pattern recognition using SIFT approach. Proc Comput Sci. 2015;59:567–76.

  2. Mardani DA, Pranowo P, Santoso AJ. Deep learning for recognition of Javanese batik patterns. In: AIP Conf Proc. 2020.

  3. Agastya IM, Setyanto A. Classification of Indonesian batik using deep learning techniques and data augmentation. In: 2018 3rd International Conference on Information Technology, Information System and Electrical Engineering (ICITISEE). 2018;13:27–31.

  4. Han D, Cong L. Miao traditional patterns: the origins and design transformation. Vis Stud. 2023;38(3–4):425–32.

  5. Dong B, Bai K, Sun X, Wang M, Liu Y. Spatial distribution and tourism competition of intangible cultural heritage: take Guizhou, China as an example. Herit Sci. 2023;11(1):64–80.

  6. Chen Z, Ren X, Zhang Z. Cultural heritage as rural economic development: batik production amongst China's Miao population. J Rural Stud. 2021;81:182–93.

  7. Zhennan LY, Yahaya SR. An aesthetic study on traditional batik design of Miao ethnicity in China. Kupas Seni. 2021;9(2):12–25.

  8. Tian G, Yuan Q, Hu T, Shi Y. Auto-generation system based on fractal geometry for batik pattern design. Appl Sci. 2019;9(11):2383–403.

  9. Lv J, Zhu M, Pan W, Liu X. Interactive genetic algorithm oriented toward the novel design of traditional patterns. Information. 2019;10(2):36–50.

  10. Ding N, Lv J, Hu L. Application of improved collaborative filtering algorithm in recommendation of batik products of Miao nationality. In: IOP Conf Ser. 2019;677(2):022038–48.

  11. Hu T, Xie Q, Yuan Q, Lv J, Xiong Q. Design of ethnic patterns based on shape grammar and artificial neural network. Alex Eng J. 2021;60(1):1601–25.

  12. Ding N, Lv J, Hu L. Research on national pattern reuse design and optimization method based on improved shape grammar. Int J Comput Intell Syst. 2020;13(1):300–9.

  13. Yang L, Li J. Research on the creation of Chinese national cultural identity symbols based on visual images. Math Probl Eng. 2022.

  14. Bo H. Study on the batik patterns and crafts of the Miao costumes in Northwestern Guizhou Province. In: 2014 2nd International Conference on Advances in Social Science, Humanities, and Management. 2014:250–254.

  15. Chen D, Cheng P. A method to extract batik fabric pattern and elements. J Text Inst. 2021;112(7):1093–9.

  16. Yuan Q, Xu S, Jian L. A new method for retrieving batik shape patterns. J Am Soc Inf Sci. 2018;69(4):578–99.

  17. Fiorucci M, Khoroshiltseva M, Pontil M, Traviglia A, Del Bue A, James S. Machine learning for cultural heritage: a survey. Pattern Recogn Lett. 2020;133:102–8.

  18. Kim G, Lee C, Jo J, Lim H. Automatic extraction of named entities of cyber threats using a deep Bi-LSTM-CRF network. Int J Mach Learn Cybern. 2020.

  19. Liu X, Wei F, Zhang S, Zhou M. Named entity recognition for tweets. ACM Trans Intell Syst Technol. 2013;4(1):1–5.

  20. Wen H, Zhu X, Zhang L, Li F. A gated piecewise CNN with entity-aware enhancement for distantly supervised relation extraction. Inf Process Manage. 2020;57(6):102373–87.

  21. Wang H, Qin K, Zakari RY, Lu G, Yin J. Deep neural network-based relation extraction: an overview. Neural Comput Appl. 2022.

  22. Geng Z, Chen G, Han Y, Lu G, Li F. Semantic relation extraction using sequential and tree-structured LSTM with attention. Inf Sci. 2020;509:183–92.

  23. Yang B, Liao YM. Research on enterprise risk knowledge graph based on multi-source data fusion. Neural Comput Appl. 2022;34(4):2569–82.

  24. Nguyen HL, Vu DT, Jung JJ. Knowledge graph fusion for smart systems: a survey. Inf Fusion. 2020;61:56–70.

  25. Vrandečić D, Krötzsch M. Wikidata: a free collaborative knowledge base. Commun ACM. 2014;57(10):78–85.

  26. Lehmann J, Isele R, Jakob M, Jentzsch A, Kontokostas D, Mendes PN, Hellmann S, Morsey M, Van Kleef P, Auer S, Bizer C. Dbpedia–a large-scale, multilingual knowledge base extracted from wikipedia. Semantic Web. 2015;6(2):167–95.

    Article  Google Scholar 

  27. Li L, Wang P, Yan J, Wang Y, Li S, Jiang J, Sun Z, Tang B, Chang TH, Wang S, Liu Y. Real-world data medical knowledge graph: construction and applications. Artif Intell Med. 2020;103:101817–51.

    Article  PubMed  Google Scholar 

  28. Bharadwaj AG, Starly B. Knowledge graph construction for product designs from large CAD model repositories. Adv Eng Inform. 2022;53: 101680.

    Article  Google Scholar 

  29. Díaz-Rodríguez N, Lamas A, Sanchez J, Franchi G, Donadello I, Tabik S, Filliat D, Cruz P, Montes R, Herrera F. EXplainable neural-symbolic learning (X-NeSyL) methodology to fuse deep learning representations with expert knowledge graphs: the MonuMAI cultural heritage use case. Information Fusion. 2022;79:58–83.

    Article  Google Scholar 

  30. Carriero VA, Gangemi A, Mancinelli ML, Nuzzolese AG, Presutti V, Veninata C. Pattern-based design applied to cultural heritage knowledge graphs. Semantic Web. 2021;12(2):313–57.

    Article  Google Scholar 

  31. Dou J, Qin J, Jin Z, Li Z. Knowledge graph based on domain ontology and natural language processing technology for Chinese intangible cultural heritage. J Vis Lang Comput. 2018;48:19–28.

    Article  Google Scholar 

  32. Fan T, Wang H. Research of Chinese intangible cultural heritage knowledge graph construction and attribute value extraction with graph attention network. Inf Process Manage. 2022;59(1): 102753.

    Article  Google Scholar 

  33. Fan T, Wang H, Hodel T. CICHMKG: a large-scale and comprehensive Chinese intangible cultural heritage multimodal knowledge graph. Herit Sci. 2023;11(1):115–33.

    Article  Google Scholar 

  34. Fan T, Wang H. Multimodal sentiment analysis of intangible cultural heritage songs with strengthened audio features-guided attention. J Inf Sci. 2022.

    Article  Google Scholar 

  35. Yang Y, Zhang G, Wang J, Ye S, Hu J. Public cultural knowledge graph platform. In2017 IEEE 11th International Conference on Semantic Computing (ICSC) 2017, 322–327.

  36. Zhang L, Song M, Liu X, Sun L, Chen C, Bu J. Recognizing architecture styles by hierarchical sparse coding of blocklets. Inf Sci. 2014;254:141–54.

    Article  Google Scholar 

  37. Li J, Wang JZ. Studying digital imagery of ancient paintings by mixtures of stochastic models. IEEE Trans Image Process. 2004;13(3):340–53.

    Article  PubMed  Google Scholar 

  38. Zhao Y, Hao K, He H, Tang X, Wei B. A visual long-short-term memory based integrated CNN model for fabric defect image classification. Neurocomputing. 2020;380:259–70.

    Article  Google Scholar 

  39. Janković BR. A comparison of methods for image classification of cultural heritage using transfer learning for feature extraction. Neural Comput Appl. 2023.

    Article  Google Scholar 

  40. Paolanti M, Frontoni E. Multidisciplinary pattern recognition applications: a review. Comput Sci Rev. 2020;37:100276–99.

    Article  Google Scholar 

  41. Liu Q. Technological innovation in the recognition process of Yaozhou Kiln ware patterns based on image classification. Soft Comput. 2023.

    Article  PubMed  PubMed Central  Google Scholar 

  42. Ding X, Zou C, Chen J, Zou F. Extraction and classification of She nationality clothing via visual features. Text Res J. 2016;86(12):1259–69.

    Article  CAS  Google Scholar 

  43. Ding X, Li T, Chen J, Ma L, Zou F. Research on the clothing classification of the she ethnic group in different regions based on FPA-CNN. Appl Sci. 2023;13(17):9676–93.

    Article  CAS  Google Scholar 

  44. Kong Q, Shi Z, Feng Y, Yang M, Zhang M, Zeng S, Li R, Yu K, Shen J. Classification method of ethnic minority patterns based on faster R-CNN. J Phys. 2020;1575(1):012137–43.

    Article  Google Scholar 

  45. Jia X, Liu Z. Element extraction and convolutional neural network-based classification for blue calico. Text Res J. 2021;91(3–4):261–77.

    Article  CAS  Google Scholar 

  46. Fan T, Wang H, Deng S. Intangible cultural heritage image classification with multimodal attention and hierarchical fusion. Expert Syst Appl. 2023;231: 120555.

    Article  Google Scholar 

  47. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, 770–778.

  48. Luo Z, Deng M, Yongjian L, Jingling Y. An ontology construction method for educational domain. In2013 Fourth International Conference on Intelligent Systems Design and Engineering Applications, 2013, 99–102.



Acknowledgements

The authors thank the anonymous reviewers for their insightful comments.

Funding

This work was supported by the Guizhou Provincial Basic Research Program (Natural Science) under Grant No. ZK[2023]029 and the Joint Open Fund of the Guizhou Provincial Department of Education [2022]436.

Author information

Authors and Affiliations



Contributions

H.Q. and Y.L. designed the study, performed the experiments, analyzed the results, and wrote the main manuscript. D.L. provided valuable insights and suggestions on the methodology and knowledge graph construction. Y.Z. prepared the dataset used in the study. All authors reviewed the manuscript.

Corresponding author

Correspondence to Yiting Li.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Quan, H., Li, Y., Liu, D. et al. Protection of Guizhou Miao batik culture based on knowledge graph and deep learning. Herit Sci 12, 202 (2024).
