Skip to main content

A study on the construction of knowledge graph of Yunjin video resources under productive conservation


Nanjing Yunjin, a highly representative Chinese silk weaving handicraft, was included in the Representative List of Intangible Cultural Heritage of Humanity in 2009. However, due to modern silk weaving technology advancements, aesthetic style evolution, and inadequate public recognition of Yunjin culture, the art faces a decline in market recognition and practitioners, posing a risk to its preservation. Addressing this issue necessitates product innovation, efficient knowledge storage, management, and utilization, and enhancing public cultural identity for Yunjin. Following the government’s “productive conservation” concept for intangible cultural heritage (ICH) projects in the handicraft category, this study uses Yunjin video resources as the primary data source. It constructs a domain knowledge graph (DKG) using an ontological approach to effectively and systematically preserve Yunjin knowledge. Furthermore, the study leverages Neo4j network topology to reveal intricate and diverse relationships within Yunjin knowledge, uncovering rich cultural connotations. Lastly, Cypher is employed for semantic queries, graph visualization, and domain expert evaluation. Evaluation results indicate that the constructed Yunjin DKG meets quality standards, supporting the development of products that align with market aesthetics while preserving Yunjin’s intrinsic cultural values. This approach fosters a complementary relationship between economic benefits and ICH. Additionally, the Yunjin DKG application presents a technical path for knowledge interconnection, integration, and discovery within ICH projects in the handicraft category.


In recent years, domain knowledge graphs (DKGs) have gained prominence in preserving and developing intangible cultural heritage (ICH) [1, 2]. DKGs refine, extract, correlate, and integrate data related to ICH projects, helping audiences interpret and appreciate the essence of ICH [3,4,5]. Currently, the primary focus of ICH project DKGs is to effectively organize information resources and link multi-source heterogeneous data [6]. Examples include the Europeana project [7], ICHPEDIA project [8], and I-Treasures project [9]. These projects’ semantic links describe traditional knowledge, enabling audiences to understand ICH culture. DKGs provide a technical path to address this challenge. However, the current application of DKGs in ICH projects primarily focuses on conventional knowledge such as “bearer,” “time,” and “institution,” neglecting the importance of addressing the practical issue of demonstrating cultural connotations and national spirit through ICH products. In response to the aforementioned limitations, this study focuses on developing a DKG centered on ICH products, aiming to improve public comprehension and appreciation of the cultural connotations and national spirit inherent in these vital heritage artifacts. Consequently, this contributes to the ongoing preservation and development of ICH.

In China, there are 629 handicraft ICH projects following the “productive conservation” development concept. This concept emphasizes authenticity, integrity, and inheritance while transforming ICH projects into socially valuable resources [10]. The “productive conservation” concept aligns with the World Heritage Convention’s “Inclusive Economic Development” advocacy. This study selects Nanjing Yunjin, a highly representative Chinese ICH handicraft, to construct a DKG under the “productive conservation” concept. Yunjin’s history spans over 1600 years, representing the highest level of silk weaving technology and ranking among the top four famous Chinese brocades. Yunjin embodies the core themes of Chinese auspicious culture, expressing power, fortune, prosperity, longevity, happiness, and wealth. UNESCO inscribed the Nanjing Yunjin hand-weaving technique on the Representative List of Intangible Cultural Heritage of Humanity in 2009.

The primary contributions of this study are:

Constructing a DKG for ICH handicraft projects based on ontological approach and “productive conservation,” demonstrating the production process, product value, market feedback, and cultural connotation of Yunjin products [11]. This assists bearers in product innovation, production operators in management and promotion, and consumers in understanding the intrinsic cultural connotation, thereby improving market recognition.

Providing a paradigm for the framework construction of domain ontology and DKGs for ICH handicraft projects by normalizing and constraining Yunjin domain data through migration learning and reusing mature ontology-related concepts.

Utilizing video resources as the main information source for the knowledge graph and applying the optimized frame difference method to significantly improve the efficiency of instance extraction by domain experts, addressing the difficulties of data extraction and story interpretation of ICH handicraft projects.

The rest of this paper is organised as follows: “Related work” section discusses the work related to the construction of DKG. “Methodology” section describes the methodological structure and process of this study in detail. “Empirical analysis” section presents the empirical evidence and the evaluation by domain experts. “Conclusion” section summarizes the work related to this study, and describes the next research directions and contents.

Related work

The DKG, with its robust semantic processing and open interconnectivity, has found extensive applications in numerous cultural heritage (CH) digital projects. These applications can be broadly classified into two categories: resource integration, classification, and retrieval; and the application of digital and intelligent technologies to enhance data and facilitate the production of cultural memory through user-oriented services.

The Europeana project, for example, classifies and retrieves European cultural collections by extracting elements such as themes, time, and institutions from digital heritage like music, books, and artworks.Footnote 1 The Ichpedia project constructs an ICH encyclopedia system based on web data, enabling users to search and relate ICH elements through various search methods [12]. Carriero et al. built a Knowledge Graph (KG) of Italian CH, allowing users to retrieve relationships between resources and enrich the connotations behind CH [13]. The CultureSampo project recreates Finland’s collective national memory in the 19th century by bringing together scattered memory resources [14].

As KG technology develops in the CH field, the organization of CH knowledge has shifted from traditional coarse-grained approaches to the integration of internal characteristics of information resources. Ontology theory is commonly applied to organize CH knowledge resources [15]. Domain ontologies are usually constructed manually or through semi-automatic construction by reusing available ontologies. Domain-generic ontologies like the CIDOC Conceptual Reference Model (CIDOC CRM),Footnote 2 the Historical Context Ontology (HiCO)Footnote 3 and the Europeana Data Model (EDM),Footnote 4 have been used for information integration, exchange, sharing, and reuse in the CH domain. Project-applied ontologies, such as ArCo ontology (ArCo)Footnote 5 and CrossCult ontology (CrossCult),Footnote 6 have been developed for specific projects. Top-level ontologies like Friend of a Friend (FOAF),Footnote 7 GeoNames,Footnote 8 Visual Representation Ontology (VRO),Footnote 9 Time ontology (Time),Footnote 10 and the Ontology for Media Resource (Ma-ontology)Footnote 11 describe concepts independent of a defined domain.

Various data models and descriptive frameworks have been proposed for domain-specific issues according to the “construct validity” theory of ontologies. Examples include Vincenzo’s Drammar integrated ontology model for theater elements [16]. Kalita et al.’s ontology for traditional dance practices of the Rabha tribe in northeast India [17]. Isa WMW et al.’s knowledge base for ICH of copperware category [18]. Isa WMW et al.’s knowledge base for ICH of copperware category [19].

However, no specific ontology models have been designed for ICH projects of handicraft categories requiring “productive conservation”. This study addresses this gap by constructing domain ontologies and DKGs based on existing results and ICH project needs, using CIDOC CRM as the main framework and incorporating concepts from Time and Ma-ontology models. Yunjin, an ICH handicraft project, transmits skills and culture through physical products circulating in the market [20]. Utilizing mining techniques, this study extracts Yunjin features and performs secondary clustering of the knowledge system. Informed by analogous cultural ontologies and core conceptual principles, the Yunjin domain ontology was established, emphasizing representative top-level concepts. The core concept determination process for the Yunjin domain ontology is depicted in Fig. 1.

Fig. 1
figure 1

The process of determining the core concept of ontology of Yunjin domain

The specific construction processes of DKGs, including processing video resources, concept sets, concept hierarchies, attribute relationships, and visual presentation of graph databases, are detailed in the following section.


The data from refined processing allows for optimal production decisions [21]. This study constructs a KG of the Yunjin project to sort out and visualize the Yunjin knowledge chain, efficiently store and utilize different project data, thereby ultimately achieving a spiral development channel of production-consumption-transmission. It takes “productive conservation” as the core concept, with video resources as the basic data source, and the main body is divided into four parts, including data resource layer, schema layer, physical data layer and application layer. The data resource layer selects videos related to Yunjin products from multiple channels, such as promotional documentaries, social videos and replicated images, extracts the key frames through the optimized frame difference method, while converting them into structured data by semantic annotation correction by Yunjin experts. The schema layer selects the Protege tool to study and migrate similar ontologies in accordance with the Yunjin cultural knowledge system and the characteristics of productive conservation products, followed by continuous iterative modification of multiple ontology construction methods in accordance with the Yunjin knowledge architecture and cultural connotations, to construct a base of ontology about the Yunjin projects, define attribute relationships, categories and hierarchical structures. The physical data layer uses the graph database Neo4j to store the annotated and extracted data instances in key frames. The application layer uses Cypher for the query validation of the required knowledge units and the visual presentation of all the associated elements of the KG. The organizational flow chart of the dkg of Yunjin is shown in Fig. 2.

Fig. 2
figure 2

DKG framework for semantic organization of Yunjin video resources

Organization of the data resource layer

Data extraction and annotation

Since the ICH projects of fabric handicraft techniques are complicated and delicate, and the operation is dynamic and variable, the video resources is capable of carrying a variety of information, this study primarily selects the video resources as the main data source of the KG. The video resources of Yunjin are divided into three major categories, including promotional documentaries, social videos and replicated images. Yunjin cultural relics are primarily showcased in documentaries, while Yunjin modern commercial works and a large number of cultural and creative works are primarily marketed through new media platforms such as Weibo, Red and Douyin through social videos, which are objective and detailed records of techniques and bearers in replicated images. The processing of video resources is divided into three main steps. The first is to obtain video resources, which are primarily derived from the Yunjin Research Institute, the Yunjin finished product showroom and the new media promotion platform. The second step is to obtain the textual description information corresponding to the video resources, which are primarily derived from archival literature and the professional guidance of the Yunjin ICH project bearers. The third step is to perform key frame filtering, semantic annotation and instance extraction on the screened Yunjin video content, which ultimately results in a raw dataset with Yunjin video resources as the main source.

Extraction and semantic annotation of key frames

Each video is composed of two-dimensional images in frame by frame with unstructured time series, with key frame to characterize the key information content in the video, which can explicitly and completely record the core of video representation. The extraction of key frame can decrease the amount of video data indexing, avoid a substantial amount of repeated data use, thereby improving the efficiency of semantic annotation. The scenes of Yunjin videos are primarily concentrated on physical objects (machines, raw materials, artistic conception drawing, works and so forth), characters (bearers, consumers, researchers and so forth), events (exhibitions, interpretations, creations and so forth) and material (humanities, history, nature and so forth). Each frame is likely to encompass multiple thematic concepts, which require refinement of granularity layer by layer and iterative extraction of combinations [22].

The video is normally composed of images, narration and music, which means that the complete data source should be extracted by decomposing the images of Yunjin video, thereby extracting the corresponding narration [23]. The similarity [24] between frames is relatively sensitive to the subtitle and other factors, which is conducive to extracting the narration. For this reason, a key frame extraction algorithm based on the frame difference method was adopted in this study. For the sake of avoiding the calculation of complex key frame extraction process, pixels with different weights and position information were added to be considered [25], in which the difference data were smoothed by moving window smoothing [26], and the maximum value of the interval was selected as one of the key frames, with the algorithm workflow shown in Fig. 3.

Fig. 3
figure 3

Framework of key frame extraction operation based on optimized frame difference

The basic principle of key frame extraction consists of small number of frames, large amount of information and strong expressive power [27]. In view of the specialization of Yunjin, automatic machine decomposition cannot fully realize the extraction of key frames of Yunjin video resources at the finest granularity. At the present time, it is necessary to rely on manual re-screening of the key frames after machine processing, followed by semantic annotation of their patterns, materials, emotions and other features based on video narration, Yunjin bearers, historical literature and multiple examinations, as well as the triad semantic description of attributes and relationships through OWL language.

Video semantic annotation is categorized into bottom semantics (video features and data attributes), middle semantics (object attributes and relationship attributes) and top semantics (theme and emotion). Among them, the data attributes in bottom semantics are primarily the basic features of video and frame rate; the association between people, events and objects in middle semantics requires manual perception and judgment based on contextual information of video resources, while the theme and emotion in top semantics require Yunjin experts to perform abstract concept understanding and refinement. A video resource content is aggregated by video knowledge elements, which are logical video units with the smallest amount of units, allowing the characterization of a certain knowledge concept of Yunjin. These are composed of one or more semantic triad combinations with indivisibility, integrity and relevance.

Organization of the schema layer

Construction method of ontology

The construction of domain ontology model has standardized the explicit description of the information related to Yunjin knowledge domain, while constraining Entity, Concept and Relation, which is the core link of DKG. This study selects CIDOC CRM as the base framework for ontology reuse, with reference to some attributes in ontology such as Ma-ontology and TIME as the core categories of ontology in this study. In the meantime, the relevant attributes such as customized “knowledge element”, “market information feedback” and “product value” were reused and extended by ontology as appropriate. The details of the categories and object attributes of Yunjin video resource ontology model (YJVO) are shown in Table  1.

Table 1 List of class and attributes of ontology for Yunjin video resources

Construction of core classes for ontology

  • Video resources class. A Yunjin product requires workflows such as artistic conception drawing, weaving and marketing, during which events and information are recorded in a series of video resources to characterize Yunjin domain knowledge. Yunjin experts manually identify and define key frames, while taking advantage of RDF triads for formal representation of extracted video data information, and multiple semantic triads (resource identifier—attribute–attribute value) are logically combined to form video knowledge elements. The abstract knowledge element concept is formalized as the smallest knowledge unit reflecting Yunjin knowledge, which inherits basic attributes and data attributes such as name, duration, resolution and frame rate of the video resource. Meanwhile, the video knowledge element is closely related to the core classes of objects, characters, time, place and events in ontology, which are combined with each other to form a complete knowledge network. This study reuses the concept extension of MediaResource class in Ma-ontology, describes key frames with Image class, in addition to customizing Yunjin work VideElement.

  • Production class. The high market recognition and value embodiment are the internal driving force for the sustainable inheritance of Yunjin skills. This study reuses E12 Production in the framework of ICH project of CIDOC CRM, while customizing two sub-categories of market intelligence and product value in accordance with the inherent characteristics of the skill items of ICH projects. The information presented in the market information feedback such as product sales, market share, consumer preferences, changes in the aesthetic context, the response to the promotion of activities and the situation of competing products are the precursors to the production and management strategies of Yunjin. Under productive protection, Yunjin products need to feature artistic value, practical value and symbolic value, with different purchasing needs of customer groups for Yunjin product value determining the different elements that constitute the product value.

  • Agent class. The marketed Yunjin products require a full understanding of the interests and preferences of consumers, and the concept of person relationship between producers-consumers and consumers-consumers in forming interest circles on product requirements is analogous to the construction theory of Six Degrees of Separation. As a consequence, this study reuses FOAF to describe person relationships based on semantic web technologies such as RDF and XML, with a parent class Agent, and sub-classes Person, Group and Organization. In the meantime, ontology also draws on the attributes of Relationships to describe the existence of powerful relationships such as mentorship and cooperation in the weaving, promotion and sales of Yunjin, such as the object attribute of Mentor of, and the object attribute of Influenced by, in which different consumer groups in the market maximize the benefits of Yunjin industry as a result of the “star effect”.

  • Time class. Yunjin of “every inch of gold” requires more than 120 steps just for the weaving process, from the preliminary market research and concept design, to the mid-term pattern drawing and fabric weaving, to the final product inspection and sales promotion, which is time-consuming and labor-intensive. The most skilled technicians work together to weave only 5 or 6 cm a day on a grand flowers floor loom, which demonstrates the complexity of the process. In this study, the TIME is used to define the interval of Yunjin weaving and the instance of the occurrence of events. Since Yunjin has been handed down for more than 1,500 years, which was even a royal fabric in Yuan, Ming and Qing dynasties, some patterns and works need to describe the dynastic changes, thereby inheriting Chinese historical dynasties under the time period.

  • Geography and spatial class. In this study, E44 Place Appellation and its sub-classes E45 Address, E47 Spatial Coordinates and E48 Place Name were reused to describe the occurrence locations of events and activities related to Yunjin, the collection places of natural objects, the creation places of man-made objects and the creation places of conceptual objects and the described spatial locations, so as to facilitate the perception of geographic space by the application layer of KG.

  • Event class. In this study, the events and social changes related to the development process of Yunjin are defined as E5 Event, while the practical activities and representational behaviors of people such as weaving, performance and promotion are defined as subclass Activity. As mentioned above, the finished products of Yunjin are a systematic technical project, thereby reusing E63 Beginning of Existence to describe the temporal reasoning of the enduring project, which defines E12 Production and E65 Creation as mapping to physical entities and conceptual objects. The specific events in the production and promotion of Yunjin trigger market feedback that drives production, thereby defining the object properties of Effect by.

  • Physical object class. In this study, E70 Thing was reused to describe the physical object class, and most of the precious raw materials in Yunjin weaving are taken from nature. Yunjin is a cooked jacquard silk fabric, which does not require dyeing and printing after weaving. This requires the technicians to process the raw materials taken from nature manually, such as dyeing and tautening silk to meet the requirements, and twisting the fibers of male peacock feathers into peacock feather threads. In this study, the materials were classified into E18 Physical Thing and E71 Man-Made Thing, and has value object attributes to manifest different product values of different objects in brocade weaving. Yunjin has not only left behind a dazzling array of physical works in the long history, but also a large number of well-known cultural works. For this reason, E24 Physical Man-Made Thing was reused to primarily describe Yunjin-related physical works, and E28 Conceptual Object was adopted to describe the products derived from Yunjin with logical thinking after consideration by the characters.

Construction of attribute relationship model for ontology

Object attributes of the core classes in the defined ontology model are constrained by Domain and Range, while the knowledge network is topologized around the concept of “productive conservation”. Since video knowledge elements can be recorded figuratively and expressed conceptually, each entity is associated with video knowledge elements. There are 33 groups of object attributes, such as Has value, Apply to and Participate in, among which the main relationship attributes are shown in Table 2.

Table 2 Main relationship attributes of ontology model of Yunjin video resources.

The ontology editing tool Protege [28] was employed to edit and enter the primary attributes and classes of Yunjin, while the ontograf plug-in was applied to perform the visualization of the association relationships between core classes and sub-classes. The dotted line is the mapping of the attributes of the 33 objects in the 7 core classes, while the solid line is the Domain pointing to Range, which is shown in Fig. 4.

Fig. 4
figure 4

Ontology model of semantic organization for Yunjin video resources

Organization of the physical data layer

The ontology penetrates into the knowledge level, which constructs a domain knowledge description system by defining concepts and their relationships. Nevertheless, ontology model is merely applicable to the association structure with fixed and uncomplicated relationship types between concepts. In this study, the knowledge element theory is introduced to conduct a complete fine-grained revelation of the complex, multi-dimensional and dynamically changing web-like multi-semantic relationships among the multidimensional knowledge of Yunjin. According to the Resource Description Framework (RDF) proposed by the World Wide Web Consortium (W3C), a triadic model in the form of subject-predicate-object was adopted to describe specific knowledge. Considering the combination of N (\(N \ge 1\), N is a positive integer) semantic triads with certain logical relationships as more coarse-grained knowledge elements with certain topological structure. The domain identification terms of knowledge elements were taken as trigger words to obtain knowledge items of knowledge elements and specific semantic association relations among knowledge elements, and eventually the knowledge elements were linked to form the knowledge network and organization system of ICH projects. The process of capturing the triadic set of semantic relations of knowledge elements in Yunjin video resources would involve two parts, namely information extraction and knowledge fusion.

Information extraction is the process of entity extraction, attribute extraction and relationship extraction of the data stored in the Yunjin video resource database, corresponding to the class, data attribute, object attribute and category hierarchy relationships in ontology.

  • Entity extraction is the process of defining real entities in the raw data. The accuracy and recall of extracted entities is closely linked to the effectiveness of subsequent tasks such as relationship extraction [29]. In this study, the core concept of productive conservation is used to develop the explicit representation of the associated knowledge of Yunjin products, so as to extract events, characters, objects and time in the topological knowledge network for named entity identification, with production and market as the focus.

  • Attribute extraction is the process of annotating entities with attributes and enriching them [30]. For instance, the attributes of the project, rank and speciality of the Yunjin bearer, the gender, age and preference of the consumer are used to create a three-dimensional image of the character, which will lead to more precise innovation and market positioning of the work.

  • Relationship extraction is the process of defining the semantic relationships of inter-entity relatedness in the raw data, as well as implementing semantic links between entities [31]. The core class and sub-classes are associated with subordinate relationships and other types of relationships defined under semantic associations, such as E12 Production as the core class, the first level sub-classes YJVO: market intelligence and YJVO: product value, the second level sub-classes YJVO: artistic value, YJVO: practical value and YJVO:Symbolic value, with customised semantic relations “Reflect to”, “Has Time” and “Apply to” and so forth.

Knowledge fusion aligns the same conceptual entities from different data sources and referent items, followed by integrating them into a distinct knowledge identity in the KG, which solves the issues of semantic ambiguity and data redundancy, thereby improving the consistency of knowledge association [32]. As a result of the sparsity of keyword extraction results from video resources, domain experts and web data are needed to supplement them. This has also resulted in multi-source heterogeneity of the source data, with some data appearing to be synonymous and heteronymous, which requires similarity entity matching to improve the accuracy of selection. For instance, the Zhuanghua hand-weaving techniques of Yunjin have various expressions in different video sources, such as “Nanjing Yunjin wooden machine Zhuanghua hand-weaving”, “Zhuanghua” and “Wahua pan-weaving” and so forth, which are standardized as “Zhuanghua” in the KG to improve the efficiency of KG retrieval and association.

Entity extraction, attribute extraction and relationship extraction refer to the process of acquiring factual knowledge such as entities and their relationships from multi-source data, so as to form a knowledge triad, which is a fundamental and necessary part of knowledge extraction and KG construction. Entity disambiguation, co-reference disambiguation and knowledge merging eliminate synonymous entities from multi-source heterogeneous knowledge bases and remove data redundancy to achieve knowledge fusion, which is the guarantee for improving the quality of KGs. For instance, the address of both the Yunjin Research Institute and the Yunjin Museum is 240 Chating East Street, Jianye District, Nanjing, and the Spatial Coordinates are 2° 02′ 11.5″ N 118° 44′ 42.0″ E. The use of co-reference disambiguation suggests that the 2 nouns point to a unified reference body in the real world, making them still valid out of context. Nevertheless, given the lack of a mature framework model for the Yunjin DKG, the specificity of domain entity constructions, the sparsity of annotated data, the overlap of relationships and the heterogeneity of multiple sources of complementary data sources, this study cannot rely entirely on machine learning for deep data processing. It requires research related to self-training semi-supervised learning Yunjin entity extraction, Yunjin domain relationship extraction and multi-feature similarity entity alignment with the support of domain experts, so as to achieve the construction and improvement of the DKG.

Organization of the application layer

Neo4j is the most widely used high performance graph database [33] for building DKGs. It supports distributed cluster deployments, which allows it to scale easily in response to data requirements, with high efficiency in data search, capable of reaching hundreds of millions of retrievals per second [34]. This study combines ontology and Neo4j to apply to the domain of ICH projects, which is a further deepening of the existing digital humanities (DH) research. On the one hand, the graph storage structure of ontology data structure was performed based on the mapping rules of design ontology, so as to improve the query efficiency of the domain ontology. On the other hand, the description of Yunjin digital resources from different dimensions, the organization of association and multidimensional knowledge representation between different types of Yunjin digital resources were achieved, which is conducive to further exploration of the development and utilization of Yunjin video resources.

The previous chapters focus on the excavation of Yunjin cultural features and the construction of the requirement system, normalize the core concepts, attributes and subfacets of Yunjin domain, as well as complete the design of ontology structure. On this basis, this chapter carries out data mapping and knowledge linking to construct the visualization framework of Yunjin KG, which is shown in Fig. 5. The four elements of Class, Instance, Object Property and Data Property in the ontology of Yunjin resource constructed above are mapped with the four elements of Label, Node, Relationship and Property of Neo4j database respectively. Among them, the nodes represent the concept objects in the Yunjin video resources, and different nodes represent different concept objects. Through the Cypher query language, semantic queries between nodes and relationships can be made to obtain the full view of the KG [35], while node relationship queries can also be made to aggregate entities of a certain topic and visualize the form of association and presentation. If there are complex relationships, the semantic representation of the KG can be enhanced by increasing the attributes of nodes or relationships to achieve the multidimensional organization and representation of knowledge.

Fig. 5
figure 5

Visualization framework of Yunjin KG

Empirical analysis

In the empirical section, the case of Da Guan Yuan Gown was selected. The Yunjin Research Institute integrated the traditional techniques of Yunjin into modern fashion while setting trends, followed by driving sales of the VGRASS “Da Guan Yuan” women’s collection for spring/summer in 2022. This is a classic case study of Yunjin’s philosophy of “productive conservation”. This study adopts the high-end technique of Zhuanghua hand-weaving techniques of Yunjin and the precious material of peacock feather, which is sampled from one of the four famous Chinese novels Dream of the Red Chamber, with the newly promoted movie queen walking in the show to propagate it. With the perfect collision of traditional culture and modern business, it has attracted numerous attentions.

Processing of video data

Selection of video resources

The empirical video resources selected for this study are classified into two parts, one is the production process of Da Guan Yuan Gown, which demonstrates the high level of techniques of Yunjin, while the other is the promotion activities of Da Guan Yuan Gown under the star effect, which embodies the more contemporary atmosphere of Yunjin products. The main video is the video clip of the beautiful burst scene of Da Guan Yuan Gown that the “Bearer Jianshun Yang Produced Da Guan Yuan Gown Weaving” and “Golden Rooster Award” unveiling ceremony of the best actress winner Zhang Xiao Pei wearing a dazzling Yunjin peacock feather Zhuanghua hand-weaving techniques satin. Furthermore, a series of video clips of Zhuanghua hand-weaving techniques of Grand Flowers Floor Loom, Da Guan Yuan high-end customized gown exhibition and artistic craftsman were selected to supplement the information.

Extraction of key frames of the video

First and foremost, the original video was read, while the inter-frame similarity was calculated by traversing each frame of the original video to obtain the inter-frame difference matrix. Afterwards, the inter-frame difference data was smoothed by setting the size of the smoothing window. Lastly, the smoothed inter-frame difference matrix was extracted from the maximum value (or minimum value) in accordance with the time series. In this study, the extracted values were extracted by the maximum value method to determine the location of key frames and output all key frames. The smoothed inter-frame difference of the video clip of “Bearer Jianshun Yang Produced Gown Weaving” is shown in Figure  6.

Fig. 6
figure 6

“Bearer Jianshun Yang produced gown weaving” video. a Inter-frame differential distribution. b Inter-frame differential after smoothing

This video is 3 min 10 s long, with resolution of 852 × 480, which consists of 5700 frames. The key frames of 90 frames were extracted from the original video with frame rate of 30 fps by frame difference method, so as to obtain the key information of the video with high efficiency. The following is the result of extracting 24 key frames from 1400 frames of this video, as shown in Figure  7.

Fig. 7
figure 7

Display of partial framing results of the video of bearer Jianshun Yang produced gown

As a result of the presence of transitions, shifts and other images [36], the extracted key frames still need to be manually reviewed after automatic extraction to eliminate the key frames that do not have knowledge elements. The experts of Yunjin conducted a second manual review of the extracted key frames after optimizing the frame difference method to minimize the redundancy of the key frames, thereby reducing the difficulty of manual semantic annotation of the data. The final key frame extraction results of the five related videos are shown in Table  3.

Table 3 Key frame extraction of video resources related to “Da Guan Yuan Gown”

Formal representation of video knowledge elements

The extracted key frames and video resources were subject to manual semantic annotation, while experts in Yunjin domain corrected and enriched the attribute contents. Meanwhile, they formed a complete video knowledge element by deconstructing and reorganizing the contextual information described from different perspectives around the same topic. For instance, the key frames of video 0001 and 0004 were extracted to precisely record and describe the knowledge concept of the process of weaving “Da Guan Yuan Gown by Grand Flowers Floor Loom”, while experts refined the character attributes of the bear Jianshun Yang. Weaving is a continuous event involving multiple object attributes and relational attributes of categories such as characters, objects and activities. The expert analyzed and reasoned that the subject matter and emotions expressed by the high-level semantics of the knowledge concept are abundant in artistic value and oriental aesthetics, which are logically combined by multiple semantic triads [37], with attribute descriptions and value ranges determined.

Visualization of KGs

In this study, key frames underwent semantic annotation, and data information was extracted to generate a triad through knowledge extraction and fusion processing. This data was converted to CSV format and imported into the Neo4j graph database for deep aggregation of knowledge units, adhering to the Efficiency Principle. This process resulted in the visualization of the Yunjin KGs, consisting of 300 entity nodes and 495 relationships.

Users can input query content, which invokes the corresponding Cypher query statement match (n) return (n) in the back-end. The query result presents a knowledge network with interconnected mappings, offering a more intelligent and user-friendly experience.

The Cypher statement enables fine-grained retrieval of information resources, with entities distinguishable by color. Users can extract and enlarge the information of single or multiple entities for personalized knowledge services. For Yunjin product consumers, the KGs provide implied correlations among Yunjin works through matching and querying, assisting in expanding their knowledge. Production operators and bearers of Yunjin products can gain a comprehensive understanding of basic and content information through the KGs, facilitating product management and innovation.

Quality evaluation

The constructed DKG targets the Yunjin project, requiring high domain knowledge depth and accuracy. Domain experts used Cypher queries to evaluate its usage capability. The DKG includes entities, entity-attribute values, and entity-entity relationships, assessed by accuracy, consistency, completeness, and timeliness.

Accuracy assessment is vital for DKGs, ensuring effective KG application. This study used authoritative video resources from Nanjing Yunjin Research Institute and combined machine learning with manual extraction to process data. The dataset relied on expert identification and government-provided original data resources, ensuring data quality. “Grand Flowers Floor Loom” was evaluated using MATCH (nname: “Grand Flowers Floor Loom”) − (m)RETURN (nname: ’Grand Flowers Floor Loom’) − (m); composite Fig. 8a shows the query results.Domain experts reviewed knowledge nodes and relational attributes to confirm accuracy.

Consistency evaluates whether knowledge expressions in KGs are conflict-free.Using “Jianshun Yang” as an example and executing a MATCH query, Composite Fig. 8b shows that both Peony Woven Gold Color Velvet with Blue Ground and Da Guan Yuan Gown consistently point to Jianshun Yang as their bearer. The goal is to eliminate knowledge inconsistencies.

Completeness assesses the extent of knowledge coverage in KGs within the Yunjin domain. Composite Fig. 8c demonstrates the available knowledge for Da Guan Yuan Gown, showing 16 direct nodes and 9 relationship types. Domain experts confirmed the KGs’ high completeness. However, as Yunjin isn’t an absolutely closed domain and KGs require continuous supplementation, achieving absolute completeness is challenging. Based on the current sampling results, the knowledge related to common attributes and relationships is relatively complete.

Fig. 8
figure 8

Composite image of query results

Timeliness assesses the currency of knowledge in the DKG and query performance. As knowledge is dynamic, real-time updates are crucial to capture the latest Yunjin exhibitions, collaborations, and market trends. The Yunjin DKG should enable easy modifications and rapid queries.

This study used Neo4j for organizing Yunjin knowledge, taking advantage of its graph storage structure for flexible connectivity. This structure allows adding, modifying, and deleting node knowledge and relationship attributes. Load CSV and Cypher statements were utilized to manage entity-relationship data, with Load CSV enabling bulk import of nodes and relationships from CSV files, creating the Yunjin DKG. Neo4j’s index-free adjacency strategy ensures efficient graph traversal, providing minimal query latency and real-time results, making it the ideal choice for constructing a timely DKG.

This study constructed and evaluated a DKG for the Nanjing Yunjin project, aligning with the productive conservation concept of Yunjin. Domain experts assessed the DKG’s quality in terms of accuracy, consistency, completeness, and timeliness using Cypher query language, and the results met the requirements. The Nanjing Yunjin knowledge graph serves as a valuable resource for producers, bearers, consumers, and other stakeholders, with potential for ongoing expansion and improvement in future research.


In this study, the domain ontology was combined with the Neo4j graph database, resulting in the construction method of the DKG of Yunjin oriented by the concept of “productive conservation”, which can not only facilitate the knowledge storage and knowledge representation of Yunjin resources in an effective manner, but also serve as a reference for bearers and institutions to innovate Yunjin products. Meanwhile, it is also in a good position to deliver efficient, intelligent and visualized knowledge services to users.

In the upcoming period, the construction and improvement of the DKG of Yunjin will require long-term systematic work. The improvements and explorations will be further carried out in the aspects of data extraction of video resources, knowledge discovery and in-depth knowledge services, with a view to introducing new ideas for the construction of the knowledge platform of ICH projects of handicraft category.

  • The key frame extraction of video resources is characterized by data sparsity, which calls for multi-channel data to supplement. The entity alignment of the semantic annotation and multi-source heterogeneous data sets of video key frames is currently mostly dependent on manual means, which will be further probed in the fields of natural language processing and machine learning, thereby progressively operating in the direction of manual recognition-semi-automatic recognition-machine full recognition for the data processing of video resources of ICH projects.

  • At the present stage of the research, the descriptive annotation of Yunjin video knowledge elements has been inadequate, the development of the application layer of the data resources of KGs is insufficient. In this context, the research on linked data, multi-modal narrative framework construction and visualization techniques will be further reinforced, while the empirical exploration will also be performed on the visualization, narrativization and immersive experience of the data resources of ICH projects.

Availability of data and materials

Data available on request from the authors.















Intangible cultural heritage


Domain knowledge graph


Knowledge graphs


Cultural heritage


Resource Description Framework


Web ontology language


World Heritage Convention


Digital humanities


Yunjin video resource ontology model


International Committee for Documentation of the International Council of Museums Conceptual Reference Model


Korean Cultural Heritage Data Model


Europeana Data Model


Friend of a Friend


Time ontology


Ontology for Media Resource


World Wide Web Consortium


  1. Mosbach S, Menon A, Farazi F, Krdzavac N, Kraft M. Multiscale cross-domain thermochemical knowledge-graph. J Chem Inf Model. 2020;60(12):6155–66.

    Article  CAS  Google Scholar 

  2. Kejriwal M, Szekely P. Knowledge graphs for social good: an entity-centric search engine for the human trafficking domain. IEEE Trans Big Data. 2017;8(3):592–606.

    Article  Google Scholar 

  3. Wang M, Wang H, Qi G, Zheng Q. Richpedia: a large-scale, comprehensive multi-modal knowledge graph. Big Data Res. 2020;22(10): 100159.

    Article  Google Scholar 

  4. Abu-Salih B, Al-Tawil M, Aljarah I, Faris H, Wongthongtham P, Chan KY, Beheshti A. Relational learning analysis of social politics using knowledge graph embedding. Data Min Knowl Discov. 2021;35(4):1497–536.

    Article  Google Scholar 

  5. Fan T, Wang H. Research of Chinese intangible cultural heritage knowledge graph construction and attribute value extraction with graph attention network. Inf Process Manag. 2022;59(1): 102753.

    Article  Google Scholar 

  6. Tian H, Zhang X, Wang Y, Zeng D. Multi-task learning and improved TextRank for knowledge graph completion. Entropy. 2022;24(10):1495.

    Article  Google Scholar 

  7. Isaac A, Haslhofer B. Europeana linked opendata-data. europeana. eu. Semant Web. 2013;4(3):291–7.

    Article  Google Scholar 

  8. Park SC. ICHPEDIA, a case study in community engagement in the safeguarding of ICH online. Int J Intang Herit. 2014;9:69–82.

    Google Scholar 

  9. Charisis V, Hadjidimitriou S, Hadjileontiadis LJ. FISEVAL—a novel project evaluation approach using fuzzy logic: the paradigm of the i-treasures project. Expert Syst Appl. 2022;202: 117260.

    Article  Google Scholar 

  10. Winthrop RH. The strange case of cultural services: limits of the ecosystem services paradigm. Ecol Econ. 2014;108:208–14.

    Article  Google Scholar 

  11. Monaco D, Pellegrino MA, Scarano V, Vicidomini L. Linked open data in authoring virtual exhibitions. J Cult Herit. 2022;53:127–42.

    Article  Google Scholar 

  12. Merillas OF, Rodríguez MM. An analysis of educational designs in intangible cultural heritage programmes: the case of Spain. Int J Intang Herit. 2018;13:190–202.

    Google Scholar 

  13. Carriero VA, Gangemi A, Mancinelli ML, Nuzzolese AG, Presutti V, Veninata C. Pattern-based design applied to cultural heritage knowledge graphs. Semant Web. 2021;12(2):313–57.

    Article  Google Scholar 

  14. Koho M, Ikkala E, Leskinen P, Tamper M, Tuominen J, Hyvönen E. WarSampo knowledge graph: Finland in the second world war as linked open data. Semant Web. 2021;12(2):265–78.

    Article  Google Scholar 

  15. Kejriwal M, Sequeda JF, Lopez V. Knowledge graphs: construction, management and querying. Semant Web. 2019;10(6):961–2.

    Article  Google Scholar 

  16. Lombardo V, Pizzo A, Damiano R. Safeguarding and accessing drama as intangible cultural heritage. J Comput Cult Herit. 2016;9(1):1–26.

    Article  Google Scholar 

  17. Kalita D, Deka D. Ontology for preserving the knowledge base of traditional dances (OTD). Electron Libr. 2020;38(4):785–803.

    Article  Google Scholar 

  18. Isa WMW, Zin NAM, Rosdi F, Sarim HM, Wook TSMT, Husin S, Jusoh S, Lawi SK. An ontological approach for creating a brassware craft knowledge base. IEEE Access. 2020;8:163434–46.

    Article  Google Scholar 

  19. Li H, Zhu L, Shen W, Du X, Guan S, Deng J. Research on knowledge organization and visualization of historical events in the Republic of China era. Libr Trends. 2020;69(1):138–63.

    Article  Google Scholar 

  20. Bortolotto C. Commercialization without over-commercialization: normative conundrums across heritage rationalities. Int J Herit Stud. 2021;27(9):857–68.

    Article  Google Scholar 

  21. Tasias KA. Integrated quality, maintenance and production model for multivariate processes: a Bayesian approach. J Manufact Syst. 2022;63:35–51.

    Article  Google Scholar 

  22. Qiao C, Hu X. A neural knowledge graph evaluator: combining structural and semantic evidence of knowledge graphs for predicting supportive knowledge in scientific QA. Inf Process Manag. 2020;57(6): 102309.

    Article  Google Scholar 

  23. Muhammad K, Hussain T, Del Ser J, Palade V, De Albuquerque VHC. DeepReS: a deep learning-based video summarization strategy for resource-constrained industrial surveillance scenarios. IEEE Trans Ind Inform. 2019;16(9):5938–47.

    Article  Google Scholar 

  24. Mahum R, Irtaza A, Nawaz M, Nazir T, Masood M, Shaikh S, Nasr EA. A robust framework to generate surveillance video summaries using combination of zernike moments and r-transform and deep neural network. Multimed Tools Appl. 2022;82:1–25.

    Article  Google Scholar 

  25. Bi Y, Li D, Luo Y. Combining keyframes and image classification for violent behavior recognition. Appl Sci. 2022;12(16):8014.

    Article  CAS  Google Scholar 

  26. Li Y, Wang J, Sun X, Li Z, Liu M, Gui G. Smoothing-aided support vector machine based nonstationary video traffic prediction towards B5G networks. IEEE Trans Veh Technol. 2020;69(7):7493–502.

    Article  Google Scholar 

  27. Jiang Z-G, Shi X-T. Application research of key frames extraction technology combined with optimized faster R-CNN algorithm in traffic video analysis. Complexity. 2021.

    Article  Google Scholar 

  28. Cheng Y-J, Chou S-L. Using digital humanity approaches to visualize and evaluate the cultural heritage ontology. Electron Libr. 2022;40(1/2):83–98.

    Article  Google Scholar 

  29. He S, Sun D, Wang Z. Named entity recognition for Chinese marine text with knowledge-based self-attention. Multimed Tools Appl. 2022;81:1–15.

    Article  Google Scholar 

  30. Liu W, Cheng X, Xie S, Yu Y. Learning high-order structural and attribute information by knowledge graph attention networks for enhancing knowledge graph embedding. Knowl-Based Syst. 2022;250: 109002.

    Article  Google Scholar 

  31. Yu H, Li H, Mao D, Cai Q. A relationship extraction method for domain knowledge graph construction. World Wide Web. 2020;23:735–53.

    Article  Google Scholar 

  32. Holzinger A, Malle B, Saranti A, Pfeifer B. Towards multi-modal causability with graph neural networks enabling information fusion for explainable AI. Inf Fusion. 2021;71:28–37.

    Article  Google Scholar 

  33. Lopez-Rodriguez V, Ceballos HG. Modeling scientometric indicators using a statistical data ontology. J Big Data. 2022;9(1):1–17.

    Article  Google Scholar 

  34. Zhu Z, Zhou X, Shao K. A novel approach based on neo4j for multi-constrained flexible job shop scheduling problem. Comput Ind Eng. 2019;130:671–86.

    Article  Google Scholar 

  35. Rizvi SZR, Fong PW. Efficient authorization of graph-database queries in an attribute-supporting rebac model. ACM Trans Priv Secur. 2020;23(4):1–33.

    Article  Google Scholar 

  36. Mounika Bommisetty R, Khare A, Siddiqui TJ, Palanisamy P. Fusion of gradient and feature similarity for keyframe extraction. Multimed Tools Appl. 2021;80:15429–67.

    Article  Google Scholar 

  37. Hogan A, Blomqvist E, Cochez M, d’Amato C, Melo GD, Gutierrez C, Kirrane S, Gayo JEL, Navigli R, Neumaier S. Knowledge graphs. ACM Comput Surv. 2021;54(4):1–37.

    Article  Google Scholar 

Download references


This research was supported by experimental data created in the Nanjing YunJing Institute. We also thank the ICH bearers of Nanjing YunJing for semantic annotation and data verification for this study. At the same time, we would like to express our deepest gratitude to the anonymous reviewers, whose comments improved this work tremendously.


This research was supported by the Jiangsu Provincial Social Science Foundation Project “Research on Knowledge Extraction and Organization of Nanjing Cloud Brocade Video Resources Based on Knowledge Meta” (Project No. 21TQC006). Jiangsu Provincial University Social Science Foundation Project “Research on the Implementation Path of the Living Heritage of Cloud Brocade Values in the DH Perspective” (Project No. 2021SJA0135).

Author information

Authors and Affiliations



Conceptualization: LL; investigation: LL, CW, XL; methodology: LL, LlJ; software: LL; data preparation: LL, CC; writing—original draft preparation: LL; writing—review and editing: LL, YGT. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Lu Lu.

Ethics declarations

Competing interests

The authors claim there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and Permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Lu, L., Liang, X., Yuan, G. et al. A study on the construction of knowledge graph of Yunjin video resources under productive conservation. Herit Sci 11, 83 (2023).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • Intangible cultural heritage
  • Productive conservation concept
  • Domain knowledge graph
  • Video resources