Skip to main content

Using knowledge graphs and deep learning algorithms to enhance digital cultural heritage management

Abstract

Cultural heritage management poses significant challenges for museums due to fragmented data, limited intelligent frameworks, and insufficient applications. In response, a digital cultural heritage management approach based on knowledge graphs and deep learning algorithms is proposed to address the above challenges. A joint entity-relation triple extraction model is proposed to automatically identify entities and relations from fragmented data for knowledge graph construction. Additionally, a knowledge completion model is presented to predict missing information and improve knowledge graph completeness. Comparative simulations have been conducted to demonstrate the effectiveness and accuracy of the proposed approach for both the knowledge extraction model and the knowledge completion model. The efficacy of the knowledge graph application is corroborated through a case study utilizing ceramic data from the Palace Museum in China. This method may benefit users since it provides automated, interconnected, visually appealing, and easily accessible information about cultural heritage.

Introduction

Cultural heritage management, which involves documentation, conservation, restoration, interpretation, and education, plays a critical role in preservation and communication of its significance to a broad range of audiences [1, 2]. Global initiatives like the Charter on the Preservation of Digital Heritage by UNESCO [3] and the Ten Perspectives on the Future of Digital Culture presented by Europeana [4] have encouraged a shift to digital methods, a transition that has been significantly accelerated by innovations in artificial intelligence, blockchain [5], ontology [6], and virtual reality [7]. This move towards digitization, seen in the digital transformation of collections by institutions like the Hamburg Museum of Art [8] and the Metropolitan Museum of Art [9], fosters increased accessibility and enhanced conservation. Concurrently, the development of databases like the Global Fashion Style Cultural Database in Japan [10] and the Intangible Cultural Heritage Database in China [11] reflects optimization efforts in digital heritage management. Standardized data models and representation languages like Extensible Markup Language (XML) [12], Web Ontology Language (OWL) [13], and Resource Description Framework (RDF) [14] further facilitate data management and sharing, promoting interoperability and meaningful information exchange. Given the significance of cultural heritage, digital technologies can potentially improve access, preservation, and interpretation.

Fig. 1
figure 1

The illustration of digital cultural heritage management that uses knowledge graphs and deep learning algorithms for the Chinese Palace Museum https://digicol.dpm.org.cn/?category=6

However, digital cultural heritage management encounters numerous challenges that should be addressed. Despite the existence of diverse methods proposed by researchers for cultural heritage management [14], many of them still necessitate manual efforts in the collection and processing of massive cultural heritage data [12, 13], consequently leading to high labor costs and the absence of automatic updates. To enhance the scalability and efficiency of digital cultural heritage management, it is crucial to develop intelligent frameworks that leverage advanced technologies. Secondly, traditional approaches still need to address the lack of visualisation, fragmentation, and limited data reasoning. Cultural heritage databases and platforms [10] often suffer from fragmentation, where information is stored in silos, leading to poor correlation between data. As a result, it becomes challenging to identify relationships between cultural heritage objects, limiting analysis and interpretation. Overcoming fragmentation is vital to improving data reasoning and analysing cultural heritage data effectively. Thirdly, digital cultural heritage management has predominantly focused on preservation and management rather than providing adequate support for using scenarios. For example, when museums may store and manage artefacts digitally, it often lacks effective search, interpretation, storytelling, collaboration, accessibility, and inclusivity, making it difficult to derive the required information from cultural heritage resources. Consequently, addressing intelligent frameworks, fragmentary data, and insufficient application are critical to promoting the effective utilisation and accessibility of cultural heritage. In response, introducing new technologies to improve automation, increase data visualisation capabilities, and improve data application capabilities can solve the above challenges.

Regarding the application of digital technologies in the management of cultural heritage, knowledge graphs shows promise by integrating disparate data sources, bridging structured and unstructured data, and providing a visual representation of information [15, 16]. The knowledge graph applications span search engines, e-commerce [17], intelligent manufacturing [18], and electronic healthcare due to their superior interconnection, visualization, and data integration abilities. Knowledge graphs show significant advantages over traditional methods [12,13,14] in terms of visualization [19] and reasoning [20]. Furthermore, knowledge graphs can be effectively employed as input and output in various deep learning models. Therefore, using knowledge graphs and deep learning algorithms can make cultural heritage management more intelligent, address fragmentary cultural data, and enhance visualization capabilities.

The knowledge graph applied to digital cultural heritage management is shown in Fig. 1. The existing Palace Museum website presents ceramic information in list format. This approach limits visual engagement during browsing and provides limited insights into the similarities and differences among ceramic pieces. On the right of Fig. 1, the knowledge graph is utilized to enhance digital ceramic management to address this limitation. The knowledge graph create a highly automated, interconnected, visually appealing, and readily accessible informational structure for ceramics. In the historical research about knowledge graphs applied to digital cultural heritage management, Charles et al. [21] designed a cultural heritage knowledge network that incorporates information about places, people, and concepts. Given the metadata format of the network, identifying matching cultural heritage datasets poses a challenge. Hyvonen et al. [22] proposed an approach integrating a knowledge graph for identifying semantic relationships between different cultural heritage data. However, due to the limitations of transformation rules and missing relations between entities, locating the required cultural heritage knowledge resources can be challenging and inefficient. Debruyne et al. [23] created a knowledge graph for Ireland’s history, and further research will focus on determining the relationship between different attributes and the evolution of names for places. Meanwhile, deep learning algorithms have been used in conjunction with knowledge graphs to refine digital cultural heritage management further. Dou et al. [24] proposed the Chinese intangible cultural knowledge graph, where the Att-BiLSTM and Bi-GRU machine learning models are employed as the entity extraction and relation extraction models, respectively. Nevertheless, these models are not pipelines, leading to an accumulation of errors. Additionally, Bobasheva et al. [25] used machine learning to enhance cultural metadata and information retrieval, but the model struggles with overlapping entities and relations, which diminishes the extraction effect. In summary, the research presented above may need to be improved due to missing links, an appropriate metadata format, the relevance of datasets to cultural heritage, the availability of training sets, and the complexity of scenes. Furthermore, the pipeline of extraction tasks and overlapping triples may reduce the effectiveness of machine learning models.

Knowledge graph construction involves domain ontology construction, knowledge extraction (has two steps of entity extraction and relation extraction), knowledge completion, knowledge fusion, knowledge recommendation, and the knowledge graph application, of which knowledge extraction and knowledge completion are key steps [26]. Specifically, entity extraction has been developed for a long time, which involves statistical algorithms and machine learning algorithms, such as support vector machines [27, 28], Conditional Random Fields [29, 30], BiLSTM-CRF [31], large-scale pre-training models BERT [32]. Relation extraction is the next step after entity extraction and can be challenging when there are overlapping entities and relations [33]. Relation extraction initially utilises semantic rules and templates before machine learning algorithms such as BiLSTM [34] and Lattice LSTM [35]. To address the problem of overlapping entity-relation triples, joint entity-relation triple extraction methods, such as the Seq2Seq model with a copy mechanism [36], reinforcement learning [37], Graph Convolutional Networks [38], Multi-Turn QA models [39], and the CASREL model [40], have been proposed. To solve the problem of overlapping entity-relation triples in knowledge extraction, this study proposes a cascade binary tagging framework inspired by a novel cascade framework from CASREL [40], which shows promising results in addressing overlapping triples. The second core step of knowledge graph construction involves knowledge completion, which can be divided into rule-based, description logic-based, and case-based methods [41]. Afterwards, knowledge embedding is introduced in the knowledge completion task, which can be classified into four categories: combination models (e.g., RESCAL [42], NTN [43], HOLE [44]), translation models (e.g., TransE [45], DISTMULT [46]), convolutional neural network-based models (e.g., ConvE [47], ConvKB [48]), and graph-based models (e.g., R-GCN [49]). While the first three types process each triple independently, the graph-based models perform convolution operations on neighbour nodes, considering the knowledge graph as a whole and utilising the implicit information in the relationship path and neighbour nodes to analyse the semantic relationship between entities. The proposed model in this study enhances the GATs by incorporating multi-order relations and adjacent node features for knowledge completion. The proposed model employs an improved graph attention layer and ConvKB [48].

Fig. 2
figure 2

The framework of digital cultural heritage management using knowledge graphs and deep learning algorithms

Throughout the literature review of knowledge graphs applied to cultural heritage, knowledge extraction models, and knowledge completion models, four challenges are identified as follows:

  1. (1)

    A knowledge graph framework specific to digital cultural heritage management should be established, as generic frameworks from other areas may not capture the unique features of the local heritage. This issue could potentially be resolved through interdisciplinary research that combines expertise in cultural heritage management with knowledge graphs to co-develop a tailored framework.

  2. (2)

    It is challenging to develop a knowledge extraction model compatible with cultural heritage data due to overlapping triples. One potential approach to this could involve the joint entity-relation extraction model to identify and resolve these overlapping triples, thereby enhancing the compatibility of the model.

  3. (3)

    A knowledge graph completion model appropriate for cultural heritage data is required. The knowledge completion model should consider the specific nuances of local cultural heritage data. The development of the model might necessitate the integration of machine learning methods with the detailed study of cultural heritage data properties.

  4. (4)

    There is an evident necessity for research to pivot towards the knowledge graph application, which is intuitively responsive to a variety of using scenarios in digital cultural heritage management tasks. These scenarios encompass a diverse range from searching and interpretation to storytelling and collaboration, as well as ensuring accessibility and inclusivity. Enhanced interactivity and user-friendly design in the knowledge graph visualization can lead to increased user support and engagement.

Therefore, this study introduces a cultural heritage management method that utilises knowledge graphs and deep learning algorithms to overcome the abovementioned obstacles. Specifically, a generic knowledge graph-driven cultural heritage management framework is proposed to facilitate digital cultural heritage management. In addition, a knowledge extraction model is developed to extract entities and relations jointly from massive fragmentary cultural text automatically. Moreover, a knowledge completion model is introduced to predict missing knowledge and complete the knowledge graph. The effectiveness of the proposed approach is demonstrated in a case study using ceramic data from the Palace Museum in China. The proposed knowledge graph-driven cultural heritage management framework allows for acquiring interconnected and visualised cultural heritage information.

Accordingly, the study is structured as follows: "Methodology" Section describes a knowledge graph framework for digital cultural heritage management. The knowledge extraction model and knowledge completion model are also proposed in "Methodology" Section. "Case study: The ceramic knowledge graph for the Palace Museum in China" "Discussion" Section illustrates the case study using ceramic data from the Palace Museum in China. Section 4 discusses the application value and limitations. Finally, "Conclusion" Section 5 is the conclusion and future works.

Methodology

The framework of digital cultural heritage management using knowledge graphs and deep learning algorithms

Fig. 3
figure 3

Knowledge extraction model. The text in orange is the subject, and the text in blue represents an object. The lines with arrows are relationships between subjects and objects

This study proposes a framework which uses knowledge graphs and deep learning algorithms to support digital cultural heritage management to address the above challenges, as depicted in Fig. 2. Furthermore, a joint entity-relation triple extraction model is developed to extract entities and relations from fragmented cultural heritage data. This model automatically identifies entities and relations defined by experts, enabling efficient data integration. Additionally, a knowledge completion model is presented to deal with the issue of missing information in cultural heritage data. The knowledge completion model enables the completion of missing information, further enhancing the quality of the knowledge graph. Finally, a cultural heritage knowledge graph is constructed to satisfy cultural heritage management requirements.

Knowledge extraction model

Task description

Knowledge extraction is the process of identifying subjects, relations, and objects represented as (subject, relation, object). For instance, the sentence Pastel nine peach vase, its whole body painted peach tree and flat peach contains the entity-relation triples (Pastel nine peach vase, has pattern of, peach tree) and (Pastel nine peach vase, has pattern of, flat peach). Knowledge extraction becomes a fixed extraction task when working with a pre-defined set of entity-relation triples. However, conventional knowledge extraction models have challenges in overlapping circumstances, as shown in Fig. 3. Previous research has treated relations as discrete labels attached to entity pairs [50, 51], an approach that exhibits constrained efficacy in the context of multiple overlapping entity-relation triples.

The framework of knowledge extraction model

To solve the problem of overlapping entity-relation triples in knowledge extraction, this study proposes a cascade binary tagging framework inspired by a novel cascade framework from CASREL [40]. Unlike the approach of assigning discrete labels to entity pairs [50], triples are modelled directly and define training objectives at the triple level. The proposed method shows promising results in addressing overlapping triples.

The objective of the joint entity-relation triple extraction model is to maximize the data likelihood for the training set D, given an annotated sentence \(x_{j}\) from D and overlapping triples \(T_{j}=\left\{ (s,r,o)\right\}\) in \(x_{j}\):

$$\begin{aligned}\prod _{j=1}^{|D|}\left[ \prod _{(s, r, o) \in T_j} p\left( (s, r, o) \mid x_j\right) \right] \\ =\prod _{j=1}^{|D|}\left[ \prod _{s \in T_j} p\left( s \mid x_j\right) \prod _{(r, o) \in T_j \mid s} p\left( (r, o) \mid s, x_j\right) \right] \\ =\prod _{j=1}^{|D|}\left[ \prod _{s \in T_j} p \!\!\left( s \!\! \mid \!\! x_j\right) \!\! \prod _{r \in T_j \mid s} p_r\left( o \!\! \mid s \!, \! x_j\right) \!\! \prod _{r \in R \backslash T_j \mid s} \!\! p_r\left( \!\! o_{\varnothing } \!\! \mid \!\! s, \! x_j\right) \right] . \end{aligned}$$
(1)

where the notation \(T_{j}\) represents a set of triples, where each subject \(s \in T_j\) appears in at least one triple. To be precise, \(T_j \mid s\) represents the triples led by s in \(T_{j}\), and \((r, o) \in T_j \mid s\) represents a (ro) pair led by the subject s in \(T_{j}\). R represents all possible relationships. \(R \backslash T_j \mid s\) denotes all relations except those led by s in \(T_{j}\). \(o_{\varnothing }\) denote a ‘null’ object.

The chain rule of probability is applied in this formulation, as shown in Eq. 1. The structure is designed to deal with the problem of overlapping triples without making assumptions regarding how multiple triples may share entities within a sentence. Moreover, this formulation inspires a novel tagging scheme for triple extraction. A subject tagger is learned \(p\left( s \mid x_j\right)\) that recognizes subject entities. For relation r, an object tagger is learned \(p_{r}\left( o\mid s, x_j\right)\) to recognize relation-specific objects for a given subject. In contrast to classifying relations for pairs of subjects and objects, this approach treats each relation as a function mapping a subject to an object. The tagger allows multiple triples to be extracted simultaneously, as depicted in Fig. 3. A subject tagger is applied to identify all subjects. The relation-specific object taggers for each subject are used to identify all relevant relations and their corresponding objects. This study uses binary taggers over a deep bidirectional transform, namely the BERT [32], to instantiate the subject tagger and relation-specific object taggers.

The contribution of the proposed formulation has several advantages. Firstly, the final evaluation criteria are directly optimized at the triple level of the data likelihood by starting from the triple level. Secondly, it handles overlapping triples without making assumptions about how they share entities in a sentence. Finally, the proposed novel tagging scheme enables multiple triple extraction simultaneously.

BERT encoder

Fig. 4
figure 4

The transformer framework in BERT pre-training model

The encoder module extracts feature information \({\textbf {x}}_{j}\) from \(x_{j}\), and feature information is fed into the following tagging modules. BERT is a multi-layer bidirectional Transformer-based language representation model that learns deep representations based on each word’s left and right contexts, as depicted in Fig. 4. It comprises a stack of N Transformer blocks of the same type. Each Transformer block performs multi-head self-attention, layer normalization, and feedforward neural network operations as follows:

$$\begin{aligned}{} & {} {\textbf{h}}_0={\textbf{S}} {\textbf{W}}_s+{\textbf{W}}_p. \end{aligned}$$
(2)
$$\begin{aligned}{} & {} {\textbf{h}}_\alpha ={Trans}\left( {\textbf{h}}_{\alpha -1}\right) , \alpha \in [1, N]. \end{aligned}$$
(3)

In this equation, \({\textbf {S}}\) is a matrix of one-hot vectors of sub-word indices, \({\textbf {W}}_{s}\) represents the sub-word embedding matrix, \({\textbf {W}}_{p}\) is the positional embedding matrix, and p is the position index. \({\textbf {h}}_{\alpha }\) denotes the hidden state vector, which is the context representation of the input sentence at the \(\alpha -th\) layer. N is the total Transformer blocks’ number. Since a sentence is considered as input rather than a pair of sentences, segmentation embedding is not considered in Eq. 3. For a more detailed explanation of the Transformer structure, refer to [52].

Cascade decoder

Based on the previous formulation, the cascade binary tagging scheme is instantiated through a two-step cascade process: a subject tagger followed by a set of relations-specific object taggers, as illustrated in Fig. 3. First, subjects are detected in the input sentence. Afterwards, all possible relations related to each candidate subject are examined to determine whether they can associate objects in the sentence with that subject.

To recognise potential subjects, the subject tagger module decodes the encoded vector \({\textbf {h}}_{N}\) obtained from BERT. The module utilises two identical binary classifiers to detect subjects’ starting and ending positions. For this purpose, each token is assigned a binary tag (0/1) that signifies whether it represents a subject’s start or end position as follows:

$$\begin{aligned}{} & {} p_i^{start\_s}=\sigma \left( {\textbf{W}}_{ {start}} {\textbf{x}}_i+{\textbf{b}}_{{start}}\right) . \end{aligned}$$
(4)
$$\begin{aligned}{} & {} p_i^{end\_s}=\sigma \left( {\textbf{W}}_{{end }} {\textbf{x}}_i+{\textbf{b}}_{{end }}\right) . \end{aligned}$$
(5)

where \(p_i^{start\_s}\) and \(p_i^{end\_s}\) represent the probability of the i-th token identification in the input sequence based on the start and end positions of the subject. The corresponding token is assigned ‘1’ if the probability reaches a certain threshold, it is assigned ‘0’ otherwise. \({\textbf{x}}_i\) is the encoded representation of token i in the input sequence. \({\textbf{W}}_{(\cdot )}\) is the trainable weight, \({\textbf{b}}_{(\cdot )}\) is the bias, and \(\sigma\) is the sigmoid activation function.

The subject tagger optimises the following likelihood function as Eq. 6 to identify the span of a subject s given representation \({\textbf{x}}\). Using the results of the start-end position taggers, a subject’s span is determined using the nearest start-end pair match principle.

$${p_\theta }(s\mid {\bf{x}}) = \prod\limits_{t \in \{ start\_s,end\_s\} } {\prod\limits_{i = 1}^L {{{\left( {p_i^t} \right)}^{{\bf{I}}\left\{ {y_i^t = 1} \right\}}}} } {\left( {1 - p_i^t} \right)^{{\bf{I}}\left\{ {y_i^t = 0} \right\}}}$$
(6)

where L is the sentence length. \({\textbf{I}}\{z\}=1\) when z is true, and \({\textbf{I}}\{z\}=0\) when it is false. \(y_i^{{start\_s}}\) represents the start position for the i-th token in \({\textbf{x}}\), and \(y_i^{{end\_s}}\) represents the end position. The parameters \(\theta =\left\{ {\textbf{W}}_{ {start }}, {\textbf{b}}_{{start }}, {\textbf{W}}_{{end }}, {\textbf{b}}_{{end }}\right\}\).

Objects and their relationships with subjects are simultaneously identified in relation-specific object taggers, as shown in Fig. 3. Each object tagger identifies the corresponding object(s) for every detected subject. Contrary to the subject tagger, the relation-specific object tagger considers the subject’s characteristics. On each token, the relation-specific object tagger performs the following operations:

$$\begin{aligned}{} & {} p_i^{ {start\_o }}=\sigma \left( {\textbf{W}}_{{start }}^r\left( {\textbf{x}}_i+{\textbf{v}}_{{sub }}^k\right) +{\textbf{b}}_{{start }}^r\right) . \end{aligned}$$
(7)
$$\begin{aligned}{} & {} p_i^{ {end\_o}}=\sigma \left( {\textbf{W}}_{{end }}^r\left( {\textbf{x}}_i+{\textbf{v}}_{{sub }}^k\right) +{\textbf{b}}_{{end}}^r\right) . \end{aligned}$$
(8)

where \(p_i^{{start\_o }}\) and \(p_i^{{end\_o}}\) are the probabilities of identifying an object’s start and end positions. \({\textbf{v}}_{{sub}}^k\) is the encoded representation vector of the k-th subject.

Each subject is decoded iteratively using the same process. To make the additions of \({\textbf{x}}_i\) and \({\textbf{v}}_{{sub}}^k\) in Eq. 7 and Eq. 8, the dimensions of the two vectors should remain consistent. Therefore, the averaged vector representation is calculated between the start and end tokens of the k-th subject as \({\textbf{v}}_{{sub}}^k\).

For relation r, the object tagger optimises the following likelihood function to find out the scope of the object o in a sentence \({\textbf{x}}\) and a subject s:

$${p_{{\phi _r}}}(o\mid s,{\bf{x}}) = \prod\limits_{t \in \{ start\_o,end\_o\} } {\prod\limits_{i = 1}^L {{{\left( {p_i^t} \right)}^{{\bf{I}}\left\{ {y_i^t = 1} \right\}}}} } {\left( {1 - p_i^t} \right)^{{\bf{I}}\left\{ {y_i^t = 0} \right\}}}$$
(9)

where \(y_i^{{start\_o }}\) represents the binary tag of the object’s start position for the i-th token in \({\textbf{x}}\), and \(y_i^{{end\_o }}\) represents the tag of the object’s end position. In the case of a ‘null’ \(o_{\varnothing }\), the tags \(y_i^{{start\_o}}=y_i^{ {end\_o}}=0\). The parameters \(\phi _r=\left\{ {\textbf{W}}_{{start }}^r, {\textbf{b}}_{{start }}^r, {\textbf{W}}_{{end }}^r, {\textbf{b}}_{ {end }}^r\right\}\).

Training objective

Taking log of Eq. 1, the training objective \(J(\Theta )\) is:

$$\begin{aligned} \sum _{j=1}^{|D|}&{\left[ \sum _{s \in T_j} \log p_\theta \left( s \mid {\textbf{x}}_j\right) +\sum _{r \in T_j \mid s} \log p_{\phi _r}\left( o \mid s, {\textbf{x}}_j\right) \right. } \\&\left. +\sum _{r \in R \backslash T_j \mid s} \log p_{\phi _r}\left( o_{\varnothing } \mid s, {\textbf{x}}_j\right) \right] . \end{aligned}$$
(10)

where \(p_\theta (s \mid {\textbf{x}})\) is defined in Eq. 6 and \(p_{\phi _r}(o \mid s, {\textbf{x}})\) is defined in Eq. 9. The parameters \(\Theta =\left\{ \theta ,\left\{ \phi _r\right\} _{r \in R}\right\}\). A stochastic gradient descent algorithm trains the model by maximizing \(J(\Theta )\) over shuffled mini-batches.

Fig. 5
figure 5

Graph attention

Fig. 6
figure 6

Knowledge completion model

Knowledge completion model

Task description

The task of knowledge completion is to predict the missing relationships between entities within a knowledge graph. Conventional methodologies like Graph Attention Networks (GATs) [53] have exhibited impressive performance in knowledge completion tasks, primarily due to their capacity to capture first-order neighbour node information. Nevertheless, given their disregard for relational characteristics, these methodologies may not be directly transferable to knowledge graphs.

The framework of knowledge completion model

The proposed model in this study enhances the GATs by incorporating multi-order relations and adjacent node features for knowledge completion. The proposed model employs an improved graph attention layer and ConvKB [48]. By leveraging the enhanced graph attention mechanism with multi-order relations and adjacent node features, the model captures the complex relations between entities and generates predictions for missing relations in the knowledge graph. As a result, the model can generate accurate predictions for missing relations in the knowledge graph. The knowledge completion model is shown in Fig. 6.

The novelty of this work lies in the enhancement of GATs by incorporating multi-order relations and adjacent node features. Furthermore, this model employs an improved graph attention layer with a ConvKB decoder. These enhancements enable the model to capture complex relations between entities, significantly contributing to knowledge graph completion.

Graph attention encoder

To obtain the embedding of an entity \(e_{i}\), it is necessary to learn the embeddings of triples which have relationships with \(e_{i}\). The operation is illustrated in Fig. 5. The linear transformation used here is performed by concatenating entity and relation feature vectors associated with a specific triple \(t^{k}_{ij}=(e_{i},r_{k},e_{j})\):

$$\begin{aligned} \mathbf {{c}_{ijk}}={\textbf{W}}_{1}\left[ \overrightarrow{h_{i}}\left\| \overrightarrow{h_{j}}\right\| \overrightarrow{g_{k}}\right] . \end{aligned}$$
(11)

where \(\mathbf {{c}_{ijk}}\) is the vector representation of the triple \(t^{k}_{ij}\). \(\overrightarrow{h_{i}}\), \(\overrightarrow{h_{j}}\) and \(\overrightarrow{g_{k}}\) are the embeddings of \(e_{i},e_{j}\) and \(r_{k}\). Additionally, \(\varvec{{\textbf {W}}}_{1}\) represents the linear transformation matrix. Each triple is assigned an importance score \(b_{ijk}\). The attention value of the triple is obtained by applying a linear transformation parameterized by a weight matrix \(\varvec{{\textbf {W}}}_{2}\), followed by the LeakyReLU non-linearity:

$$\begin{aligned} b_{ijk}=\text { LeakyReLU }\left( {\textbf{W}}_{2} c_{ijk}\right) . \end{aligned}$$
(12)

Accordingly, the softmax function is applied to \(b_{ijk}\) to obtain relative attention values. Figure 5 illustrates the calculation of the relative attention value \(\alpha _{ijk}\):

$$\begin{aligned} \alpha _{i j k}={\text {softmax}}_{j k}\left( b_{ijk}\right) =\frac{\exp \left( b_{i j k}\right) }{\sum _{n \in N_{i}} \sum _{r\in R_{in}} \exp \left( b_{i n r}\right) }. \end{aligned}$$
(13)

where \({N}_{i}\) represents the neighborhood of entity \(e_{i}\) and \({R}_{ij}\) represents relationship between entities \(e_{i}\) and \(e_{j}\). Based on attention values for each triple representation, the updated embedding of entity \(e_{i}\) is as follows:

$$\begin{aligned} \mathbf {{h}^{\prime }_{i}}=\sigma \left( \sum _{j \in N_{i}} \sum _{k \in R_{ij}} \alpha _{i j k}\mathbf {c_{ijk}}\right) . \end{aligned}$$
(14)

To stabilize the learning process and capture features about the neighbourhood, multi-head attention [52] is used, and representation of embeddings is obtained by concatenating M attention mechanisms:

$$\begin{aligned} &h_{i}^{\prime }=\Vert _{m=1}^{M} \sigma \left( \sum _{j \in N_{i}} \alpha _{i j k}^{m} c_{i j k}^{m}\right) . \end{aligned}$$
(15)

As shown in Fig. 5, this transformation is performed on the input relation embedding matrix \({\textbf {G}}\) which is parameterized by a weight matrix \(\varvec{{\textbf {W}}}^{R} \in {{\mathbb {R}}^{T\times T^{\prime }}}\), where \(T^{\prime }\) represents the dimension of the output relation embedding. The linear function can be calculated as follows:

$$\begin{aligned}&G^{\prime }=G\cdot {\varvec{{\textbf {W}}}^{R}}. \end{aligned}$$
(16)

Average embedding is used to obtain the final embedding vectors for entities rather than concatenating embedding from multiple heads. While learning an entity’s embedding, it loses the information associated with its initial embedding. To overcome this issue, \(\varvec{{\textbf {H}}^{i}}\) is linearly transformed to obtain \(\varvec{{\textbf {H}}^{t}}\) using a weight matrix \(\varvec{{\textbf {W}}}^{E} \in {{\mathbb {R}}^{T^{i} \times T^{f}}}\), and embedding generated from the final attention layer:

$$\begin{aligned}&h_{i}^{\prime }=\sigma \left( \frac{1}{M} \sum _{m=1}^{M} \sum _{j \in {N}_{i}} \sum _{k \in R_{ij}} \alpha _{i j k}^{m} c_{i j k}^{m}\right) . \end{aligned}$$
(17)
$$\begin{aligned} \varvec{{\textbf {H}}}^{\prime \prime }={\textbf {W}}^{E}\varvec{{\textbf {H}}}^{t}+\varvec{{\textbf {H}}}^{f}. \end{aligned}$$
(18)

The proposed model introduces an auxiliary relation for n-hop neighbours between two entities, extending the concept of an edge to a directed path. In this case, the embedding of the auxiliary relation consists of all other embeddings along the path. Iteratively, the model gathers information from distant neighbours of an entity. Figure 6 illustrates the aggregation process for learning entity embeddings and introduces an auxiliary edge between n-hop neighbours. Entity embeddings are normalized after each generalized GAT layer and before every main iteration.

Training objective

This model uses a translational scoring function from [45], which learns embeddings such that the condition \(\textbf{h}_{i}+\textbf{g}_{k} \approx \textbf{h}_{j}\) holds for a given valid triple \(t^{k}_{ij}=(e_{i},r_{k},e_{j})\). To minimize the L1-norm dissimilarity measure, \(d_{t_{i j}}=\left\| \overrightarrow{h_{i}}+\overrightarrow{g_{k}}-\overrightarrow{h_{j}}\right\| _{1}\) is used to learn entity and relation embeddings. The model is trained using the hinge loss given by Eq. 19. The latter set consists of triples obtained by replacing valid triples’ head or tail entity with entities unrelated to the relation. The calculation is shown in Eq. 20.

$$\begin{aligned} &L\left( \Omega \right) =\sum _{t_{ij} \in S} \sum _{t\prime _{ij} \in S^{\prime }} \max \left\{ d_{t^{\prime }_{ij}}-d_{t_{ij}}+\gamma , 0\right\} . \end{aligned}$$
(19)
$$\begin{aligned} S^{\prime }=\underbrace{\left\{ t_{i^{\prime } j}^k \mid e_i^{\prime } \in \varepsilon \backslash e_i\right\} }_{\text{ replace } \text{ head } \text{ entity } } \cup \underbrace{\left\{ t_{i j^{\prime }}^k \mid e_j^{\prime } \in \varepsilon \backslash e_j\right\} }_{\text{ replace } \text{ tail } \text{ entity } }. \end{aligned}$$
(20)

where \(\gamma >0\) is a margin hyperparameter, S is a set of valid triples, and \(S^{\prime }\) represents a set of invalid triples.

Fig. 7
figure 7

The process of the case study

Fig. 8
figure 8

Example of instance

ConvKG decoder

ConvKB [48] is used as a decoder in the model. The convolutional layer analyzes \(t^{k}_{ij}\)’s global embedding properties on each dimension and generalizes transitional properties in the knowledge prediction model. Score functions with multiple feature maps can be expressed as follows:

$$\begin{aligned} f\left( t_{i j}^k\right) =\left( \Vert _{m=1}^{\Omega } {\text {ReLU}}\left( \left[ \textbf{h}_i, \textbf{g}_k, \textbf{h}_j\right] * \omega ^m\right) \right) \cdot {\textbf{W}}. \end{aligned}$$
(21)

where \(\omega ^{m}\) is the \(m^{th}\) convolutional filter, \(\Omega\) is a hyperparameter indicating the number of filters to be used, \(*\) is a convolution operator, and \({\textbf{W}} \in {\mathbb {R}}^{\Omega k \times 1}\) represents the linear transformation matrix.

The model is trained using soft-margin loss defined:

$$\begin{aligned} {\mathcal {L}}=\sum _{t_{i j}^k \in \left\{ S \cup S^{\prime }\right\} } \!\!\log \! \left( 1+\exp \left( l_{t_{i j}^k} \cdot f\left( t_{i j}^k\right) \right) \right) +\frac{\lambda }{2}\Vert {\textbf{W}}\Vert _2^2. \end{aligned}$$
(22)

where \(l_{t_{i j}^{k}}= {\left\{ \begin{array}{ll}1 &{} \text{ for } t_{i j}^{k} \in S \\ -1 &{} \text{ for } t_{i j}^{k} \in S^{\prime }\end{array}\right. }\).

Case study: The ceramic knowledge graph for the Palace Museum in China

The Palace Museum, located in the heart of Beijing in China, is one of the most prominent museums in China. The museum’s extensive collection of ancient Chinese art objects, including ceramics, reflects the country’s rich cultural heritage and long history. To increase accessibility to the collection, the museum has developed a digital heritage database that currently contains digital records of 10,624 ceramics. Given the considerable quantity of artefacts and the restricted visualisation capabilities of the database, it necessitates a novel knowledge management approach to enhance information acquisition.

This case study aims to address these issues by using real ceramics from the Palace Museum in China to assist cultural heritage professionals and enthusiasts in retrieving ceramic data more efficiently. This study comprises five critical steps: ceramic ontology definition, ceramic dataset construction, knowledge graph extraction, knowledge graph completion, and the knowledge graph application. Figure 7 depicts the case study process. The study employed Python 3.6, PyTorch 2.7, CUDA 10.1, Neo4j Graph database, and SQL Server. The study was conducted using Ubuntu 16.04 and Windows 10 operating systems using a CPU I7 7700K CPU and a GPU 1080Ti 11 G.

Ceramic ontology definition

Fig. 9
figure 9

The process of dataset construction

A ceramic ontology is a structure of information that helps form a structured knowledge graph. A ceramic cultural heritage ontology was designed in this study. To ensure the systematic development of this ceramic ontology, we adhered to well-established ontology engineering principles [54]. Ceramic ontology definition involves the definition of domain and scope, the reuse of existing ontologies, the enumeration of critical terms, the definition of concepts and properties, and the instantiation of these definitions.

Specifically, the study defines the ontology domain and scope of ceramic cultural heritage and refers to relevant historical ontologies for reference. The ceramic ontology was developed using the CIDOC Conceptual Reference Model (CIDOC CRM) [55], widely accepted and established as the ontology model in the cultural heritage domain. It is worth noting that the CIDOC CRM, consisting of 99 classes and 188 properties. In light of the specific scope of this study, we included a portion of the CIDOC CRM to form the foundation of the ontology.

Then, the core classes of the ontology were determined through extensive consultations with a multidisciplinary team. Essential terms were enumerated based on CIDOC CRM and the Palace Museum digital platform. The multidisciplinary team for this study comprises cultural heritage researchers and computer programmers. Multidisciplinary teams are selected because they should have worked for at least five years to ensure this study’s validation. Cultural heritage researchers classify creative culture design, such as chronology, processing techniques, shape, function and excavation sites. Computer programmers are responsible for computer technology for the ceramic knowledge graph. This study analysed the domain ontology with five cultural heritage researchers and two programmers. These researchers worked together to develop the domain ontology. The process of domain ontology construction was as follows: (1) Opinions on the ontology classification were collected from a multidisciplinary team, and the following topics were discussed: What are the components of knowledge of a ceramic object? What framework can be developed? (2) Implementing focus groups. The multidisciplinary team discussed the topics, and the researcher recorded the verbal and non-verbal information of the participants using video. (3) The concept classification is derived by analysing and organising information using content analysis and framework structure methods. Finally, domain ontology concepts include name, dynasty, pattern, shape, colour, glaze, and function. Name refers to the names of objects, people, and organisations. For example, ‘Blue and white bowl with pine, bamboo and plum pattern’ is the name of porcelain. Dynasty, pattern, shape, and colour include ‘Qing dynasty’, ‘Bamboo pattern’, ‘round shape’, and ‘blue and white’. The glaze is an impervious layer of a vitreous substance, which can serve to colour, decorate, underlying design or texture.

Additionally, the properties of the concepts were defined by incorporating the inverse property feature and the object-centric nature of CIDOC CRM ontology, which provided flexibility in the ontology construction. The relations between the concepts were established using a top-down approach, and 32 relations were defined. After the ontological model was constructed, it was filled with data obtained from the website of the Palace Museum in China. The ceramic ontology can provide a unified and structured framework for the ceramic knowledge graph, as shown in Fig. 8. We used the standard Web Ontology Language (OWL) for formalisation language, ensuring our ontology’s compatibility and interoperability with various digital platforms. With a balance between knowledge graphs and deep learning algorithms, we aim to offer a robust methodology for capturing, representing, and exploring complex ceramic data.

Ceramic dataset construction

After defining the ceramic ontology, the ceramic dataset was constructed. The procedure of ceramic dataset construction is illustrated in Fig. 9. Firstly, ceramic data was scraped from the website of the Palace Museum [56] using Beautiful Soup. The scraping process entailed identifying tag names and classes, extracting full text, and locating hrefs. The statistical overview of the ceramic corpus is shown in Table 1. The ceramic corpus includes 1151 porcelain pieces, 4658 images, 639,676 words and 3674 sentences. The ceramic corpus was then annotated using BRAT software [57]. Finally, the corpus was transformed into a dataset for knowledge graph extraction and a dataset for knowledge graph complementation, as shown in Fig. 9(d). The statistical overview of the dataset for knowledge graph extraction is shown in Table 3. Each row of the dataset used for knowledge graph extraction includes sentences and entity-relation triples contained in the sentences, with a total of 3,674 sentences and 8,689 triples. The dataset was partitioned into training, validation, and test sets in a ratio of 8:1:1, respectively. The statistical overview of the dataset for knowledge graph completion is shown in Table 2. Each row of the dataset used for knowledge graph complementation is an entity-relation triple and contains entity ID, relation ID, entity embedding, and relation embedding. In the early experiments, we tried out several methods of dividing the dataset. The knowledge extraction model showed improved stability and generalisation when we used an 8:1:1 partitioning. The knowledge complementation model performed better with a 7:2:1 split. With a larger validation set, we could more robustly fine-tune models’ hyperparameters and reduce overfitting.

For knowledge graph extraction task, the dataset comprises sentences and entity-relation triples. Each sentence contains some entity-relation triples, each head entity and tail entity labelled with ‘name’, ‘type’, etc., and each relation labelled with ‘have name of’, ‘have pattern of’, etc. These labels are the ground truth, annotated by a multidisciplinary team manually. The knowledge extraction model is trained to predict these labels based on the sentences and triples, and their performance is evaluated based on how well they match these ground truth labels. For knowledge graph completion task, the dataset comprises entity-relation triples, and each is labelled with entity ID, relation ID, entity embedding and relation embedding. These labels are the ground truth, annotated by a multidisciplinary team manually. The knowledge graph completion model is trained to predict these labels, and their performance is evaluated based on how well they match these ground truth labels. In order to evaluate the consistency of the annotation, we selected 100 sentences randomly and calculated the F-value. The results were calculated to be 96.2% for entity consistency and 93.5% for relationship consistency, demonstrating the reliability of the annotation results.

$$\begin{aligned} &P=\frac{\text{ Consistent } \text{ annotations: } A_{1} \text { and } A_{2} }{\text{ Number } \text{ of } \text{ annotations } \text{ from } A_{2} },\\&R=\frac{\text{ Consistent } \text{ annotations: } A_{1} \text { and } A_{2} }{\text{ Number } \text{ of } \text{ annotations } \text{ from } A_{1} }, \\&F=\frac{2 \times P \times R}{P {+} R}. \end{aligned}$$
(23)
Table 1 Statistical overview of the crawled ceramic corpus
Table 2 Statistical overview of the ceramic dataset for knowledge graph extraction
Table 3 Statistical overview of the ceramic dataset for knowledge graph completion
Table 4 The parameters of knowledge extraction model
Fig. 10
figure 10

The comparative results of knowledge extraction

Fig. 11
figure 11

Test results of knowledge extraction

Table 5 The parameters of knowledge completion model
Table 6 The comparative results of knowledge completion
Fig. 12
figure 12

Entity embedding visualization

Fig. 13
figure 13

Training process of knowledge completion model

Knowledge graph extraction

This section discusses the automated extraction of ceramic knowledge and the evaluation of the knowledge extraction model proposed in Section 3.2, which was trained through a ten-fold cross-validation method on the dataset (Table 2). The experiment of knowledge graph extraction employed Python 3.6, PyTorch 2.7, CUDA 10.1, and Ubuntu 16.04 using CPU I7 7700K and GPU 1080Ti. The validation set was utilized to determine the thresholds for the two layers and the parameters, detailed in Table 4. A stochastic gradient descent algorithm was used to train the model using the Adam optimizer, and early stopping was implemented to prevent overfitting. The pre-training model contained 12 million parameters and 12 bidirectional Transformer blocks with 256 hidden states and four attention heads. To ensure a fair comparison, input sentences were limited to 100 words [50, 58]. The training process stopped after 13 epochs without improvement, the total training time of about 40 h for 13 epochs. Results show that the model achieved an F1-score of 86.7%, a precision of 86.4%, and a recall of 87.1%. Although the loss value was high at the start, it reduced rapidly and approached zero after six epochs, with precision results stabilizing as the number of epochs increased.

A comparison was made between the proposed model and state-of-the-art knowledge extraction models, which we detail as follows: (a) NovelTagging [50]: A sequence tagging joint learning model. Its architecture comprises a bidirectional Long Short-Term Memory (BiLSTM) encoding layer and an LSTM decoding layer. (b) GraphRel [58]: A graph neural joint learning model. Its structure includes a BiLSTM encoding layer and a bidirectional Graph Convolutional Network (Bi-GCN) layer for decoding dependency trees. (c) Our Proposed Model (BiLSTM): Noting that the encoding layers of the above models use BiLSTM, the encoding layer of our model was adjusted to BiLSTM for ablation experiments.

We evaluated all models on the constructed ceramic dataset for the knowledge extraction task. For (a)-(c) models, we used 300-dimensional word vectors created using the skip-gram variant of the word2vec algorithm [59], trained on the Chinese Wikipedia corpus for ablation experiments. Word2Vec is an algorithm designed to transform words into numerical word vectors. It redefines high-dimensional word features into a low-dimensional space, which allows machines to better understand words and their semantic relationships. Normally, a 300-dimensional word vector balances computational efficiency and word information. By representing words as vectors, we can feed them into neural network models (such as BiLSTM or Bi-GCN) and train these models to recognize patterns across the 300-dimensional space.

Figure 10 shows that the proposed model outperformed the other comparative models. NovelTagging employs sequence labelling, which is unable to eliminate overlapping triples. As a result, it achieves the lowest F1 scores compared to the other three models. The decoding layer of GraphRel introduces the graph and enhances the knowledge extraction process. Thus, the F1-scores (55.2%) are slightly higher than NovelTagging (40.3%). The proposed model achieves F1-scores of 86.7% on the ceramic dataset, an improvement of 57.1% over the GraphRel. Despite not having pre-training, the proposed model (Our Model BiLSTM) achieves an F1-score of 82.2%, which improved by approximately 48.9% compared to Multi-Head. The proposed model, which uses BERT pre-training, performed approximately 5.5% better than the proposed model with a BiLSTM encoder. As a result, the comparison demonstrates the proposed model’s superiority compared to NovelTagging and GraphRel.

Fig. 14
figure 14

The knowledge graph construction and application

Fig. 15
figure 15

The home pages of the ceramic knowledge graph

Fig. 16
figure 16

The searching pages of the ceramic knowledge graph

Fig. 17
figure 17

The pages of collection, comparison, and management for ceramic treasures

Fig. 18
figure 18

The detailed pages of ceramic treasures

Knowledge graph completion

This subsection evaluates the knowledge completion model proposed in Section 3.3 on the ceramic dataset (Table 3). The performance assessment for knowledge graph completion task was conducted utilizing the identical hardware and software configurations as those detailed in Section 3.3 for the knowledge graph extraction process (Fig. 11). The embeddings of entity and relation were generated using 100-dimensional word vectors generated by skip-gram of word2vec. In Fig. 12, words of greater relevance or the same type are close, which illustrates the effective capturing of semantic information of the words.

The knowledge completion model was trained using a ten-fold cross-validation method by first training the graph attention layer for encoding and then the ConvKB decoder for relation prediction. The parameters were determined using the validation set, as presented in Table 5. In order to prevent overfitting, the stochastic gradient descent algorithm with Adam as the optimizer was used for training. The proposed model was evaluated using commonly-used knowledge completion evaluation methods, including the average rank (MR) of 139, the average reciprocal rank (MRR) of 47.7%, and the Hits@1, Hits@3, and Hits@10 values of 41.7%, 49.9%, and 59.0%, respectively. The total training time is approximately 3 h. The training process is illustrated in Fig. 13, which shows that the loss value decreases as the training progresses and eventually stabilizes, indicating the robustness of the model.

A comparison was made with other outstanding models to evaluate the proposed knowledge completion model. The comparative models are as follows: TransE [45], ConvKB [48], and R-GCN [49]. As shown in Table 6, the proposed model outperforms TransE, ConvKB, and R-GCN. Specifically, R-GCN based on the graph convolutional neural network demonstrates poor knowledge completion performance, reflected in its low values across all five items. Although ConvKB based on the convolutional neural network shows the highest MR value, other results perform worse than the proposed model. In contrast, the method presented in this study demonstrates relatively stable and robust results, with four items achieving the highest scores.

The knowledge graph application

The proposed framework and deep learning models were used in the case study to construct the ceramic knowledge graph. This study employed Python 3.6, PyTorch 2.7, CUDA 10.1, Neo4j Graph database, and SQL Server. The study was conducted using Ubuntu 16.04 and Windows 10 operating systems using a CPU I7 7700K and a GPU 1080Ti 11 G. The Flask web framework was employed to facilitate the automatic implementation of the knowledge extraction function, knowledge completion function, and the development of the cultural heritage knowledge graph. Initially, the relevant Flask classes were imported, and instances of these classes were created to serve as the foundation for the WSIG (Web Server Gateway Interface) application. In addition, the instantiated object was utilised as a decorator. This allowed it to wrap another function and ultimately return a different function to the browser. In order to deploy the model, the configuration of Gunicorn and Nginx was necessary. Following the successful deployment of the knowledge extraction model and knowledge completion model, the functionalities pertaining to knowledge extraction and knowledge completion within the cultural heritage domain were effectively realised. The ceramic knowledge graph was constructed with a technical architecture consisting of four layers: data infrastructure storage, knowledge graph architecture, knowledge graph service, and the knowledge graph application (Fig. 14). The data infrastructure layer provides a relational database, knowledge-searching functions, graph data, and a big data platform. The knowledge graph architecture layer produces and manages the knowledge graph. The knowledge graph service layer supports semantic query and retrieval services, reasoning services, and knowledge production services. Finally, the knowledge graph application layer offers a platform for users to search, collect, compare, manage knowledge, etc. It is noticed that the process model training did not require a significant amount of computational resources.

The users of the ceramic knowledge graph application can be diverse, ranging from academic researchers and students to cultural heritage professionals, policymakers, and the general public. Each group has unique needs and requirements. Academic researchers and students might utilise the ceramic knowledge graph as a tool for deepening their understanding of historical artefacts. Therefore, the primary need is a convenient and accurate representation of artefacts, contexts, and relationships. Cultural heritage professionals, such as museum curators and heritage site managers, might benefit from using the ceramic knowledge graph to catalogue, organise, and manage collections. In addition, it would be beneficial to have features such as easy data editing, and robust searching. It is possible for policymakers to use the ceramic knowledge graph to inform their decision-making processes. It would be beneficial to them if the data could be visualised and interpreted easily. Public members may be interested in exploring the ceramic knowledge graph out of personal interest or for informal learning purposes. Improving the user-friendliness and interactivity of the knowledge graph visualizations could facilitate more effective user engagement. Considering the diversity of user types, including cultural and museum researchers, students, and the general public, we have conducted a summary of user scenarios for the main needs with expert advice, including searching, collection, comparison, management, and detailed information.

In the knowledge graph application section, the interface of the home page is discussed (Fig. 15). The home page provides a convenient and intuitive method for users to navigate through interconnected ceramics information and images simply by left-clicking and dragging to alter the viewing scope. The knowledge graph application represents ceramic information in a highly connected and visually compelling format.

The search page allows users to search for specific terms such as ‘underglaze colour’ and ‘twig pattern’ (Fig. 16). Using a combination of entity-relation triple extraction and text-matching, this model identifies entities and relations from the input sentences and matches related knowledge with those entities and relations. For instance, if a user queries ‘bottles with short neck and plump shoulder’, the knowledge graph application platform will identify the entities ‘short neck’ and ‘plump shoulder’. It will match them with related bottles, such as ‘Yongzheng style bucket colour wrapped branch pattern plum bottle’ and ‘Blue and white wrapped branches flower plum bottle’. The result is a visual representation of associated cases, enabling users to acquire information in a relational, visual, and intuitive way.

The collection, comparison, and management page helps users to obtain detailed ceramic information (Fig. 17). The knowledge graph visualisation indicates related entities and relationships by clicking on a text or picture. The collection, comparison, and management page allows users to learn more through the ceramic knowledge graph visualisation, such as ‘round belly’, ‘enamel’, ‘bottle’, and ‘flower’. Compared with other cultural heritage knowledge acquisition and management methods, the knowledge graph visualisation presents information in an interconnected, visual, and intuitive way.

The detailed analysis page allows users to obtain further decomposed information (Fig. 18). It is possible to extract information about patterns, functional types, colours, glaze classifications, and shapes using the joint entity-relation triple extraction model of the knowledge graph.

This case study illustrates the effectiveness of the knowledge graph application in acquiring ceramic knowledge for researchers focused on intangible cultural heritage. The knowledge graph application provides a visual representation of interconnected ceramic knowledge. It allows automatic entity and relation extraction, knowledge graph completion, relevant knowledge matching, and visualised case correlations. The ceramic knowledge graph application streamlines the process of acquiring ceramic knowledge, making it a valuable tool for researchers in digital cultural heritage management.

Discussion

The knowledge graph application for digital cultural heritage management

The knowledge graph application has many benefits, including enhanced search and discovery, improved interpretation and storytelling, facilitated collaboration and data integration, increased accessibility and inclusivity, and better preservation and conservation, as follows (Table 7).

(1) Enhanced search and discovery: The knowledge graph application enables museums to provide more accurate and personalised search results, improving the overall user experience. For example, museums have implemented knowledge graph-based search engines that allow users to search for artefacts by different criteria, such as materials, periods, and regions, and obtain a visual representation of the search results. Additionally, the knowledge graph application plays a significant role in managing cultural heritage data by organising and categorising vast amounts of information, aiding efficient data retrieval and exploration.

(2) Improved interpretation and storytelling: The knowledge graph application helps museums present their collections more engagingly by connecting different artefacts and their contexts. For example, museums use a knowledge graph to create digital stories that illustrate the life and work of Vincent van Gogh. By incorporating cultural heritage data into knowledge graphs, museums can effectively showcase the historical significance and narratives associated with each artefact, enhancing the interpretive experience for visitors.

(3) Facilitated collaboration and data integration: The knowledge graph application enables museums to share and integrate their data more efficiently, both internally and with external partners. This capability is particularly beneficial in managing cultural heritage data, as it allows for the seamless integration of diverse information sources related to art history, including archives, catalogues, and bibliographies.

(4) Increased accessibility and inclusivity: The knowledge graph application can help museums provide more inclusive and accessible experiences for visitors with different backgrounds and interests. For example, museums create a knowledge graph that allows users to explore their collections through different keywords, such as colour, glaze, and shape. By incorporating knowledge graphs in the management of cultural heritage data, museums can foster collaboration between different stakeholders and ensure comprehensive access to valuable information.

(5) Better preservation and conservation: The knowledge graph application can help museums manage their collections more efficiently and effectively by providing a more comprehensive and interconnected view of the objects and their metadata. For example, museums develop a knowledge graph that incorporates information about the physical and chemical properties of their artworks and their historical and cultural contexts to support preservation and conservation efforts.

The knowledge graph application assists in the organisation and management of cultural heritage data, enabling museums to better understand the relationships between artefacts, make informed decisions regarding conservation methods, and ensure the long-term preservation of cultural heritage for future generations.

Table 7 Digital outcomes of the ceramic knowledge graph and potential applications

Limitations of knowledge graphs for digital cultural heritage management

Although knowledge graphs significantly improve digital cultural heritage management, several limitations should be addressed.

The first limitation is that the ceramic knowledge graph platform requires a large amount of labelled data. As with any machine learning model, the quality of the results depends on the quantity and quality of the training data. A substantial amount of labelled data must be fed to the knowledge extraction model and the knowledge completion model so that they can recognise and predict entities and relations within the ceramic domain.
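
For illustration, a labelled training instance for the extraction model might look like the sketch below: a sentence paired with its gold entity-relation triples, including overlapping triples that share a head entity. The JSON-like schema is an assumption for illustration, not the exact annotation format exported from the brat tool [57].

```python
# Hypothetical labelled training instance for the joint extraction
# model: a sentence paired with its gold triples, including
# overlapping triples that share the same head entity.

example = {
    "text": "The plum bottle has a short neck, a plump shoulder, "
            "and a wrapped branch pattern.",
    "triples": [
        {"head": "plum bottle", "relation": "has_shape",
         "tail": "short neck"},
        {"head": "plum bottle", "relation": "has_shape",
         "tail": "plump shoulder"},
        {"head": "plum bottle", "relation": "has_pattern",
         "tail": "wrapped branch pattern"},
    ],
}
```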

The second limitation is the incompleteness of the existing ontology framework compared with CIDOC CRM. The ontology framework used in the knowledge graph platform captures only part of the nuances and complexities of the ceramic domain, which could limit the accuracy and completeness of knowledge representation. It may therefore be necessary to modify the ontology framework so that it aligns more closely with the ceramic domain and provides a more comprehensive representation of its knowledge.
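
As a hedged illustration of such a modification, the following rdflib sketch derives a ceramic-specific class from a CIDOC CRM class; the URIs, namespace, and class names are assumptions, not the platform's actual ontology.

```python
from rdflib import Graph, Namespace, Literal, RDF, RDFS

# Hypothetical sketch: extending a CIDOC CRM class with a
# ceramic-specific subclass. URIs and class names are illustrative.
CRM = Namespace("http://www.cidoc-crm.org/cidoc-crm/")
CER = Namespace("http://example.org/ceramics#")

g = Graph()
g.bind("crm", CRM)
g.bind("cer", CER)

# Declare a Ceramic class as a specialisation of a CRM artefact class.
g.add((CER.Ceramic, RDF.type, RDFS.Class))
g.add((CER.Ceramic, RDFS.subClassOf, CRM["E22_Human-Made_Object"]))
g.add((CER.Ceramic, RDFS.label, Literal("Ceramic artefact")))

print(g.serialize(format="turtle"))
```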

The third limitation is the need for more data verification and testing in ceramics and other tangible and intangible cultural heritage. While the ceramic knowledge graph application has shown promise in providing visualised and interconnected ceramic knowledge, its effectiveness and accuracy in other cultural heritage domains may require further testing and verification. Further research is needed to demonstrate the feasibility of applying knowledge graphs and the proposed deep learning models to other heritage resources, such as textiles, paintings, and sculptures.

Despite the benefits of the knowledge graph application in managing digital cultural heritage, this study lacks robust user testing and evaluation. In practice, user testing often falls short due to constraints such as time, cost, and the availability of representative user groups. To address this, a comprehensive user-testing framework that accommodates the diversity of the user population, together with regular user feedback cycles, should be adopted to continually refine and improve the system.

The final limitation is the model's dependence on textual data. Although integrating machine learning techniques for multimedia data analysis can enhance knowledge graphs, it adds complexity: it requires significant alterations to the current data pipeline, from multimedia data ingestion to feature extraction and potential schema restructuring. Despite these difficulties, further research into multimodal data integration and machine learning for knowledge graphs is critical.

Conclusion

Effective digital cultural heritage management is often hampered by fragmented data, limited intelligent frameworks, and insufficient applications, which pose challenges to accessing and utilising valuable cultural heritage information. This study proposes a digital cultural heritage management approach that combines knowledge graphs and deep learning algorithms to address these issues. Specifically, a knowledge graph-driven framework is proposed to automatically manage cultural heritage data, along with a joint entity-relation triple extraction model that extracts entities and relations for knowledge graph construction. The joint entity-relation triple extraction model uses a BERT-based encoder to contextualise input sentences and extracts entities and relations simultaneously through a cascade decoder, overcoming the challenge of overlapping triples in cultural heritage knowledge extraction. Additionally, a knowledge completion model with graph attention is proposed to complete missing knowledge. The results demonstrate that the proposed joint entity-relation extraction model and knowledge completion model outperform well-known models. The proposed approach was validated through a case study of ceramic data at the Palace Museum in China, showing that it significantly enhances search and discovery, improves interpretation and storytelling, facilitates collaboration and data integration, and increases accessibility, inclusivity, and preservation.
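
To make the cascade decoding idea concrete, the sketch below shows CasRel-style cascade binary tagging [40] in PyTorch: a subject tagger scores start/end positions over the encoder output, and a relation-specific object tagger, conditioned on each detected subject, scores object positions per relation, which is what allows one subject to participate in several overlapping triples. This is a minimal illustration, not the paper's implementation; the dimensions, the 0.5 thresholds, and the random tensor standing in for BERT output are placeholders.

```python
import torch
import torch.nn as nn

class CascadeTagger(nn.Module):
    """Minimal CasRel-style cascade decoder sketch (cf. ref. [40]).

    A subject tagger predicts start/end probabilities per token; an
    object tagger, conditioned on a detected subject's representation,
    predicts start/end probabilities per token *per relation*, so one
    subject can take part in several overlapping triples.
    """

    def __init__(self, hidden, num_relations):
        super().__init__()
        self.subj_heads = nn.Linear(hidden, 2)               # start/end logits
        self.obj_heads = nn.Linear(hidden, 2 * num_relations)
        self.num_relations = num_relations

    def forward(self, enc):
        # enc: (seq_len, hidden) contextual encoding of one sentence,
        # e.g. the last hidden states of a BERT encoder.
        subj_probs = torch.sigmoid(self.subj_heads(enc))      # (seq_len, 2)
        subj_mask = subj_probs[:, 0] > 0.5                    # subject starts only,
        triples = []                                          # for brevity
        for pos in torch.nonzero(subj_mask).flatten():
            # Condition the object tagger on the subject representation.
            conditioned = enc + enc[pos]
            obj_logits = self.obj_heads(conditioned)          # (seq_len, 2R)
            obj_probs = torch.sigmoid(
                obj_logits.view(-1, self.num_relations, 2))   # (seq_len, R, 2)
            starts = torch.nonzero(obj_probs[:, :, 0] > 0.5)  # (token, relation)
            for tok, rel in starts.tolist():
                triples.append((pos.item(), rel, tok))        # (subj, rel, obj)
        return triples

enc = torch.randn(12, 768)   # random stand-in for BERT output
model = CascadeTagger(hidden=768, num_relations=5)
print(model(enc))
```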

Future work will address the completeness of the ontology framework in the cultural heritage domain and further validate the proposed approach on a wider range of cultural artefacts. Future implementations should also include user evaluation. Additionally, incorporating multi-modal data, such as audio, video, and image files, will be crucial to capturing a more complete representation of cultural heritage.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author following the appropriate copyright acquisition procedure.

References

  1. Zhou J, Wang W, Zhou J, Zhang Z, Lu Z, Gong Z. Management effectiveness evaluation of world cultural landscape heritage: a case from China. Herit Sci. 2022;10:22. https://doi.org/10.1186/s40494-022-00660-2.

  2. Mekonnen H, Bires Z, Berhanu K. Practices and challenges of cultural heritage conservation in historical and religious heritage sites: evidence from North Shoa Zone, Amhara Region Ethiopia. Herit Sci. 2022;10:172. https://doi.org/10.1186/s40494-022-00802-6.

  3. UNESCO. Charter on the preservation of digital heritage. 2003. https://en.unesco.org/about-us/legal-affairs/charter-preservation-digital-heritage. Accessed 15 Dec 2021.

  4. Europeana. The Europeana platform; 2018. https://pro.europeana.eu/page/ten-perspectives-on-the-future-of-digital-culture-dan-cohen. Accessed 4 Dec 2018.

  5. Trček D. Cultural heritage preservation by using blockchain technologies. Herit Sci. 2022;10:6. https://doi.org/10.1186/s40494-021-00643-9.

  6. Ranjgar B, Sadeghi-Niaraki A, Shakeri M, Choi SM. An ontological data model for points of interest (POI) in a cultural heritage site. Herit Sci. 2022;10:13. https://doi.org/10.1186/s40494-021-00635-9.

  7. Fan Z, Chen C, Huang H. Immersive cultural heritage digital documentation and information service for historical figure metaverse: a case of Zhu Xi, Song Dynasty China. Herit Sci. 2022;10:148. https://doi.org/10.1186/s40494-022-00749-8.

  8. Hamburger Kunsthalle. Hamburg Museum of Art; 2018. https://www.hamburger-kunsthalle.de/en. Accessed 13 April 2022.

  9. Metropolitan Museum of Art. Metropolitan Museum of Art. 2022. https://www.metmuseum.org/. Accessed 13 April 2022.

  10. Kataoka H, Satoh Y, Abe K, Minoguchi M, Nakamura A. Ten-million-order human database for world-wide fashion culture analysis. In: Proceedings of the IEEE/CVF Conference on computer vision and pattern recognition workshops. Long Beach, USA: IEEE; 2019. p. 305–312. https://doi.org/10.1109/CVPRW.2019.00040.

  11. Chinese National Academy of Arts. China intangible Cultural Heritage Database; 2022. https://www.ihchina.cn/tuji_list.html. Accessed 13 April 2022.

  12. Bloomfield R, Mazhari E, Hawkins J, Son YJ. Interoperability of manufacturing applications using the core manufacturing simulation data (CMSD) standard information model. Comput Ind Eng. 2012;62(4):1065–79. https://doi.org/10.1016/j.cie.2011.12.034.

  13. Hu H, Liu Y, Lu WF, Guo X. A knowledge-based approach toward representation and archiving of aesthetic information for product conceptual design. J Comput Inf Sci Eng. 2022;22(4): 041011. https://doi.org/10.1115/1.4053674.

  14. Wu Z, Liao J, Song W, Mao H, Huang Z, Li X, et al. Semantic hyper-graph-based knowledge representation architecture for complex product development. Comput Ind. 2018;100:43–56. https://doi.org/10.1016/j.compind.2018.04.008.

  15. Bi Z, Wang S, Chen Y, Li Y, Yoon Kim J. A knowledge-enhanced dialogue model based on multi-hop information with graph attention. CMES. 2021;128(2):403–26.

  16. Liang C, Wu Z, Huang W, Giles CL. Measuring prerequisite relations among concepts. In: Proceedings of the 2015 conference on empirical methods in natural language processing. Lisbon, Portugal: ACL; 2015. p. 1668–1674. https://doi.org/10.18653/v1/D15-1193.

  17. Lai B, Zhao W, Yu Z, Guo X, Zhang K. A multi-domain knowledge transfer method for conceptual design combine With FBS and knowledge graph. In: International Design Engineering Technical Conferences and Computers and Information in Engineering Conference. St. Louis, USA: American Society of Mechanical Engineers; 2022. https://doi.org/10.1115/DETC2022-89700.

  18. Lyu M, Li X, Chen CH. Achieving knowledge-as-a-service in IIoT-driven smart manufacturing: a crowdsourcing-based continuous enrichment method for industrial knowledge graph. Adv Eng Inform. 2022;51: 101494. https://doi.org/10.1016/j.aei.2021.101494.

  19. Li H, Wang Y, Zhang S, Song Y, Qu H. KG4Vis: a knowledge graph-based approach for visualization recommendation. IEEE Trans Vis Comput. 2021;28(1):195–205. https://doi.org/10.1109/TVCG.2021.3114863.

  20. Zhao X, Liu Y, Xu Y, Yang Y, Luo X, Miao C. Heterogeneous star graph attention network for product attributes prediction. Adv Eng Inform. 2022;51: 101447. https://doi.org/10.1016/j.aei.2021.101447.

  21. Charles V, Manganinhas H, Isaac A, Freire N, Gordea S. Designing a multilingual knowledge graph as a service for cultural heritage - some challenges and solutions. In: International Conference on Dublin Core and Metadata Applications. Porto, Portugal: DCMI; 2018. p. 29–40. https://doi.org/10.5555/3308533.3308538.

  22. Hyvönen E, Rantala H. Knowledge-based relational search in cultural heritage linked data. Digit Scholarsh. 2021;36(suppl 2):155–64. https://doi.org/10.1093/llc/fqab042.

  23. Debruyne C, Munnelly G, Kilgallon L, O’Sullivan D, Crooks P. Creating a knowledge graph for Ireland’s lost history: Knowledge engineering and curation in the beyond 2022 project. J Comput Cult. 2022;15(2):1–25. https://doi.org/10.1145/3474829.

  24. Dou J, Qin J, Jin Z, Li Z. Knowledge graph based on domain ontology and natural language processing technology for Chinese intangible cultural heritage. J Vis Lang Comput. 2018;48:19–28. https://doi.org/10.1016/j.jvlc.2018.06.005.

  25. Bobasheva A, Gandon F, Precioso F. Learning and reasoning for cultural metadata quality: coupling symbolic AI and machine learning over a semantic web knowledge graph to support museum curators in improving the quality of cultural metadata and information retrieval. J Comput Cult. 2022;15(3):1–23. https://doi.org/10.1145/3485844.

  26. Bao Q, Zhao G, Yu Y, Zheng P. A node2vec-based graph embedding approach for unified assembly process information modeling and workstep execution time prediction. Comput Ind Eng. 2022;163: 107864. https://doi.org/10.1016/j.cie.2021.107864.

  27. Du B, Lu Y, Cheng X, Zhang W, Zou X. The object-oriented dynamic task assignment for unmanned surface vessels. Eng Appl Artif Intell. 2021;106: 104476. https://doi.org/10.1016/j.engappai.2021.104476.

  28. Hearst MA, Dumais ST, Osuna E, Platt J, Scholkopf B. Support vector machines. IEEE Intell Syst. 1998;13(4):18–28. https://doi.org/10.1109/5254.708428.

  29. Wallach HM. Conditional random fields: an introduction. Technical Reports (CIS). 2004; p. 22. http://www.inference.org.uk/hmw26/papers/crf_intro.pdf. Accessed 31 Jan 2022.

  30. Du B, Lin B, Zhang C, Dong B, Zhang W. Safe deep reinforcement learning-based adaptive control for USV interception mission. Ocean Eng. 2022;246(4): 110477. https://doi.org/10.1016/j.oceaneng.2021.110477.

  31. Li L, Wang P, Yan J, Wang Y, Li S, Jiang J, et al. Real-world data medical knowledge graph: construction and applications. Artif Intell Med. 2020;103: 101817. https://doi.org/10.1016/j.artmed.2020.101817.

  32. Devlin J, Chang MW, Lee K, Toutanova K. Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv. 2018. https://doi.org/10.48550/arXiv.1810.04805.

  33. Huang Y, Yu S, Chu J, Su Z, Zhu Y, Wang H, et al. Design knowledge graph-aided conceptual product design approach based on joint entity and relation extraction. J Intell Fuzzy Syst. 2022;44(3):5333–55.

  34. Rönnqvist S, Schenk N, Chiarcos C. A recurrent neural model with attention for the recognition of Chinese implicit discourse relations. arXiv. 2017. https://doi.org/10.18653/v1/P17-2040.

  35. Zhang Y, Yang J. Chinese NER using lattice LSTM. arXiv. 2018. https://doi.org/10.48550/arXiv.1805.02023.

  36. Zeng X, Zeng D, He S, Liu K, Zhao J. Extracting relational facts by an end-to-end neural model with copy mechanism. In: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics. vol. 1. Melbourne, Australia: ACL; 2018. p. 506–514. https://doi.org/10.18653/v1/P18-1047.

  37. Zeng X, He S, Zeng D, Liu K, Liu S, Zhao J. Learning the extraction order of multiple relational facts in a sentence with reinforcement learning. In: Proceedings of the 2019 conference on empirical methods in natural language processing and the 9th international joint conference on natural language processing (EMNLP-IJCNLP). Hong Kong, China: ACL; 2019. p. 367–377. https://doi.org/10.18653/v1/D19-1035.

  38. Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv. 2016. https://doi.org/10.48550/arXiv.1609.02907.

  39. Li X, Yin F, Sun Z, Li X, Yuan A, Chai D, et al. Entity-relation extraction as multi-turn question answering. In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. Florence, Italy: ACL; 2019. p. 1340–1350. https://doi.org/10.18653/v1/P19-1129.

  40. Wei Z, Su J, Wang Y, Tian Y, Chang Y. A novel cascade binary tagging framework for relational triple extraction. In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics. Seattle, USA: ACL; 2020. p. 1476–1488. https://doi.org/10.18653/v1/2020.acl-main.136.

  41. Zamini M, Reza H, Rabiei M. A review of knowledge graph completion. Information. 2022;13(8):396. https://doi.org/10.3390/info13080396.

  42. Nickel M, Tresp V, Kriegel HP. A three-way model for collective learning on multi-relational data. In: the 28th International Conference on Machine Learning. Washington, USA: Omnipress; 2011. p. 809–816. https://doi.org/10.5555/3104482.

  43. Socher R, Chen D, Manning CD, Ng A. Reasoning with neural tensor networks for knowledge base completion. Adv Neural Inf Process Syst. 2013;26:926–34. https://doi.org/10.5555/2999611.2999715.

  44. Nickel M, Rosasco L, Poggio T. Holographic embeddings of knowledge graphs. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 30. Phoenix, USA: AAAI Press; 2016. p. 1955-1961. https://doi.org/10.1145/3471165.

  45. Bordes A, Usunier N, Garcia-Duran A, Weston J, Yakhnenko O. Translating embeddings for modeling multi-relational data. Adv Neural Inf Process Syst. 2013;26:2787–95. https://doi.org/10.5555/2999792.2999923.

  46. Yang B, Yih WT, He X, Gao J, Deng L. Embedding entities and relations for learning and inference in knowledge bases. arXiv. 2014. https://doi.org/10.48550/arXiv.1412.6575.

  47. Dettmers T, Minervini P, Stenetorp P, Riedel S. Convolutional 2d knowledge graph embeddings. In: Proceedings of the AAAI conference on artificial intelligence. vol. 32. New Orleans, USA: AAAI Press; 2018. p. 1811-1818. https://doi.org/10.48550/arXiv.1707.01476.

  48. Nguyen DQ, Nguyen TD, Nguyen DQ, Phung D. A novel embedding model for knowledge base completion based on convolutional neural network. arXiv. 2017. https://doi.org/10.18653/v1/N18-2053.

  49. Schlichtkrull M, Kipf TN, Bloem P, Van Den Berg R, Titov I, Welling M. Modeling relational data with graph convolutional networks. In: The Semantic Web: 15th International Conference (ESWC). vol. 10843. Heraklion, Greece: Springer; 2018. p. 593–607. https://doi.org/10.1007/978-3-319-93417-4_38.

  50. Zheng S, Wang F, Bao H, Hao Y, Zhou P, Xu B. Joint extraction of entities and relations based on a novel tagging scheme. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics. Vancouver, Canada: ACL; 2017. p. 1227–1236. https://doi.org/10.18653/v1/P17-1113.

  51. Miwa M, Sasaki Y. Modeling joint entity and relation extraction with table representation. In: Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). Doha, Qatar: ACL; 2014. p. 1858–1869. https://doi.org/10.3115/v1/D14-1200.

  52. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Adv Neural Inf Process Syst. 2017. https://doi.org/10.48550/arXiv.1706.03762.

  53. Veličković P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y. Graph attention networks. arXiv. 2017. https://doi.org/10.48550/arXiv.1710.10903.

  54. Asunción GP, Mariano FL, Oscar C. Ontological Engineering: with examples from the areas of knowledge management, e-commerce and the semantic web. London: Springer; 2006.

  55. Doerr M. The CIDOC conceptual reference module: an ontological approach to semantic interoperability of metadata. AI Mag. 2003;24(3):75–92. https://doi.org/10.5555/958671.958678.

  56. Palace Museum in China. Ceramic collection in the Palace Museum in China; 2022. https://www.dpm.org.cn/collection/ceramics.html?fully_types=990861. Accessed 13 April 2022.

  57. Brat annotation program; 2021. https://brat.nlplab.org/index.html. Accessed 15 Dec 2021.

  58. Fu TJ, Li PH, Ma WY. GraphRel: Modeling text as relational graphs for joint entity and relation extraction. In: Proceedings of the 57th annual meeting of the association for computational linguistics. Florence, Italy: ACL; 2019. p. 1409–1418. https://doi.org/10.18653/v1/P19-1136.

  59. Rong X. Word2vec parameter learning explained. arXiv. 2014. https://doi.org/10.48550/arXiv.1411.2738.

Acknowledgements

We express our deepest gratitude to the Palace Museum in China for the ceramic data, Yaokang Zhu for conducting the data analysis, Qian Li for the interface design, and students in Key Laboratory of Industrial Design and Ergonomics, Ministry of Industry and Information Technology for data collection and annotation. We also sincerely thank the anonymous reviewers whose comments improved this study tremendously.

Funding

This study is supported by the Chinese Special Projects of the National Key Research and Development Plan (2019YFB1405702).

Author information

Contributions

Conceptualisation: YH, SY, JC, and BD; methodology: YH, JC, HF, and BD; analysis: YH and BD; software: YH and BD; data preparation: YH, HF, and BD; writing-original draft preparation: YH; writing-review and editing, YH, SY, JC, HF, and BD; visualisation: YH and BD; supervision: SY and JC; project administration: SY and JC; funding acquisition: SY and JC. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jianjie Chu.

Ethics declarations

Competing interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Huang, Y.Y., Yu, S.S., Chu, J.J. et al. Using knowledge graphs and deep learning algorithms to enhance digital cultural heritage management. Herit Sci 11, 204 (2023). https://doi.org/10.1186/s40494-023-01042-y

Keywords