An ontological data model for points of interest (POI) in a cultural heritage site
Heritage Science volume 10, Article number: 13 (2022)
Cultural heritage (CH) reflects the history of a society and its traditions and is treated as a nation’s memory and identity. Digitization and the web, besides their benefits, have brought challenges in disseminating and retrieving CH information, whose heterogeneous content varies widely in type and properties yet encompasses rich semantic links. Semantic web technologies, especially ontologies, provide a common understanding of a domain that supports knowledge sharing and interoperability. Compared to relational databases, they can support better information retrieval because they take the semantics of information into account, guarantee reusability, and make information machine-readable, which offers more flexibility to intelligent services and applications. The CH community was one of the first to make use of semantic web technologies to deal with this issue. CIDOC CRM, an ISO standard since 2006, is the most widely used and best-known ontology in the CH domain. Heritage sites are composed of many points of interest (POIs) that attract visitors. However, information about a particular POI is complex and interconnected with people, events, and objects. In this paper, we aim to develop a POI-based data model for heritage sites in Iran using concepts from CIDOC CRM integrated with GeoSPARQL, the standard ontology in the geospatial field, to combine spatial semantics with heritage information. This way, users can freely explore the information they prefer about the places they choose, which makes it possible to use the data model for location-based services and applications in heritage sites.
After World War II, following the destruction of valuable cultural heritage, there was a substantial need to protect and preserve these monuments. To this end, the UNESCO World Heritage Convention (WHC) was created in 1972 to identify, register, and protect both natural and cultural heritage. Iran is one of the countries with a large number of entries on the UNESCO World Heritage List (WHL), ranked 11th with more than 20 registered sites and a number of others on the waiting list. This reflects Iran’s rich culture and the need for more attention and action. Cultural heritage has a high socio-economic potential, with tourism as the most prominent example [2, 3]. The problem of cultural resource management cannot be addressed by one organization because of the complexity of the tasks and the vastness of the information. Therefore, numerous organizations take control of different parts of the work. However, because of differences in the provision, updating, and maintenance of data, as well as differing data standards and data policies, information on cultural heritage is distributed across organizations, and heterogeneity problems arise among the trustees managing cultural heritage [4,5,6]. Information about a particular heritage site is also complex and interconnected with people and events; that is, cultural and historical heritage are intertwined. This requires a structure in which all information about a place or point of interest (POI) is provided in an integrated and interconnected manner.
Despite the massive amount of information available on cultural heritage, the multitude of trustee organizations, the discrepancies in their tasks and policies, the differences in documentation, data collection, and storage, and the distribution of the data have resulted in low interoperability. This makes it difficult to manage cultural resources, share knowledge within the community, and provide the services required by various users. Semantic web (SW) technologies, and ontologies in particular, are capable of establishing a common understanding of a domain. Ontologies formally represent a thorough understanding of a domain by connecting its entities and pieces of information with respect to the functions and content of the domain, and therefore eliminate informal, partial, and personal terms and viewpoints. This brings a shared understanding of the domain, which makes it easier to share knowledge and increases interoperability within the community. Here we list the benefits of using ontologies for knowledge management rather than other, traditional methods:
Although relational databases (RDBs) are capable of dealing with large amounts of data, they are not designed to preserve the semantics of that data. This makes it difficult to exchange the information and integrate it with other sources, and therefore results in low interoperability [10, 11].
As discussed, ontologies provide a shared vision of a particular domain of interest, which is the key element for sharing knowledge.
An important advantage promised by ontologies is their reusability. Case studies show that once an ontology is developed for a domain, it can be used many times, since it captures the mechanisms and content of that domain, and its problems can be solved through continuous evaluation [14, 15]. Possible shortcomings can be addressed by extending the domain ontology with further specifications. This avoids the cost of reimplementing a knowledge management system from scratch.
When domain knowledge is represented formally by means of a common and shared language, it becomes understandable not only for humans but also for automated computer systems and web agents. As a result, web services and search engines can exploit the semantically enriched information to retrieve information faster and more accurately and to provide smart, context-aware applications.
Generally, location-based services (LBSs) are designed for special uses, and therefore the data and the service are tightly coupled through predefined and non-extendable schemas. Another problem is the concept of place within these services. A location is often defined manually beforehand by its geographic coordinates only. However, a place is much more than a latitude and longitude pair. The topology of a place and its semantics should be considered in LBSs so that such applications are not limited to predefined POIs and are able to harvest and integrate various kinds of information [20, 21]. In this study, we propose a data model that integrates both CH information and location semantics to support an intelligent location-based user guide for tourists in a heritage site. This enables users to explore the places they are interested in based on spatial semantics. For example, a user entering a complex of heritage sites and museums might ask, “Where is the nearest place that has oil paintings from 1600?” A system built on the proposed model could answer such semantic questions.
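The example question combines a semantic filter (object type and date) with a spatial criterion (nearest). A minimal sketch of that combination is given here in plain Python rather than over an ontology; the POI names, coordinates, and holdings are hypothetical placeholders, and great-circle distance is only one possible reading of "nearest".

```python
import math

# Hypothetical POIs: name, WGS84 latitude/longitude, and (object type, year) holdings.
POIS = [
    {"name": "Museum A", "lat": 35.8160, "lon": 51.4230,
     "objects": [("oil painting", 1600)]},
    {"name": "Museum B", "lat": 35.8185, "lon": 51.4215,
     "objects": [("sculpture", 1971)]},
    {"name": "Museum C", "lat": 35.8120, "lon": 51.4300,
     "objects": [("oil painting", 1600), ("carpet", 1850)]},
]


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    radius = 6371000.0  # mean Earth radius in metres
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def nearest_poi_with(user_lat, user_lon, obj_type, year, pois=POIS):
    """Return the closest POI holding an object of the given type and year, or None."""
    # Semantic filter first, then the spatial ranking.
    candidates = [p for p in pois if (obj_type, year) in p["objects"]]
    if not candidates:
        return None
    return min(candidates,
               key=lambda p: haversine_m(user_lat, user_lon, p["lat"], p["lon"]))


match = nearest_poi_with(35.8170, 51.4225, "oil painting", 1600)
print(match["name"] if match else "no matching place")
```

In an ontology-backed system, the filter and the distance ranking would be expressed in a query over the knowledge base instead of over an in-memory list.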
Using semantic web technologies and the knowledge management systems built on them, large national and cross-country initiatives have been established to collect distributed heritage data and to preserve and present history in a broader sense and as a whole. This matter is also pressing in our country, Iran, which has many tangible and intangible cultural resources at the international and national levels. This study attempts to design a spatial POI-based data model for heritage sites by reusing two standard ontologies, CIDOC CRM and GeoSPARQL. We use concepts from CIDOC CRM to model the cultural heritage information related to a POI and connect it to GeoSPARQL through a mediating class to incorporate spatial semantics. The goal of this study is therefore to create a knowledge base for heritage sites that enables the ontological data model to be used in semantic location-based service (SLBS) applications, such as user guides and recommender systems, and makes the information ready to be used in linked data platforms.
In recent decades, there has been a worldwide effort to collect, harmonize, and integrate cultural heritage information through SW technologies, especially ontologies, and this is still an active field. In fact, cultural heritage was one of the first domains to adopt SW tools and recommendations, and it has evolved along with them [22,23,24]. It started from simple knowledge organization systems (SKOS), such as vocabularies and thesauri, for example, the Getty vocabularies (AAT, TGN, ULAN, and CONA), which contain structured terminology for art, architecture, decorative arts, cultural and archival materials, visual proxies, the names of geographic places, the names of artists, and bibliographies. Then there are metadata schemas, such as VRA Core and CDWA. The Visual Resources Association (VRA) Core Categories were developed based on Dublin Core (DC) to describe visual cultural material as well as the pictorial surrogates that represent and document it. The Categories for the Description of Works of Art (CDWA) is a set of procedures and a metadata schema for the description and classification of works of art, architecture, groups and collections of works, and related images.
The CIDOC Conceptual Reference Model is perhaps the most widely acknowledged ontology in the CH domain. It provides descriptions and a formal structure for defining the implicit and explicit concepts and relationships used in CH documentation. The CIDOC CRM is a top-level ontology intended to promote a shared understanding of CH information by providing a common and extensible semantic framework that facilitates the integration, mediation, and exchange of heterogeneous cultural heritage information. It can provide the “semantic glue” necessary to mediate between different sources of CH information, such as items published by galleries, libraries, archives, and museums (also called GLAMs). Currently, CIDOC CRM is the only data model that is an ISO standard (ISO 21127:2006) in the CH area.
In some countries, data models have been developed based on the CRM. CRM-EH was developed by English Heritage with the intention of capturing detailed excavation and analysis procedures. In Korea, the Korean Cultural Heritage Data Model (KCHDM) was developed mainly based on CIDOC CRM. It is an ontological model for integrating heterogeneous heritage data from different institutions in Korea, and it serves as a mediating means for collecting and connecting various database systems. For the CultureSampo (Finnish culture on the semantic web) project, Hyvönen et al. developed a national ontology based on the thesauri of their own country in the FinnONTO project. They employed content-independent W3C recommendations, such as RDF, SKOS, and OWL, but converted their national ISO-abiding thesauri into lightweight ontologies and created the national KOKO ontology infrastructure, which consists of one high-level mediating ontology called YSO and 14 other field-specific ontologies. In the Europeana project, which aimed to collect, enrich, and provide access to cultural heritage information from institutes all over Europe, a data model called the Europeana Data Model (EDM) was developed. This top-level ontological model was created to replace the older, flat Europeana Semantic Elements (ESE) metadata because of the general shortcomings of metadata schemas. The model reuses constructs from other standards, such as DC and FOAF, to which institutions can map their data. The MONument Damage Information System (MONDIS) is an ontological framework developed to capture and reason over the documentation of damage, interventions, changes, and natural disaster occurrences in built heritage, in order to diagnose the current condition of buildings, which can be helpful for their conservation. More recently, the HEritage Resilience Against CLimate Events on Site (HERACLES) ontology has been under development in the course of a project with the same name. It aims at better management and monitoring of the health of built heritage by modeling the effects of climate change and the different types of damage it can cause to various types of materials through specific mechanisms. It is still in the early stages, going through tests and awaiting acceptance by experts and stakeholders.
Nowadays, with the advent of smartphones equipped with various sensors (e.g., GPS, accelerometer, gyroscope, compass, and light sensor), context-aware services, and especially location-based services (LBSs), have attracted a great deal of interest. LBSs offer information customized to the location of the user, which gives it added value. They have also been of interest in the CH field for providing recommender systems and user guides. For example, the SMARTMUSEUM project developed a mobile recommender system for users interested in cultural heritage in three scenarios: outdoor, indoor, and web-based. The system is built on top of the Finnish KOKO ontology described above, which resulted in a better user experience through accurate recommendations. Another work presents a mobile augmented reality (AR) application based on linked open data (LOD) published within its project, which retrieves CH information for POIs near the user. Kim et al. developed an AR mobile application based on the KCHDM ontology described above. They provided multimedia information for three POIs inside a palace using a visual location detection method.
Having discussed the needs and aims of this study and introduced the fundamentals of data modeling and the trends in the CH domain, we present in this section the methodology and first design steps of this research. First, the chosen study area and the selected POIs are discussed. Then the methodology and the issues involved in developing the spatially enabled, POI-based ontological data model, with its required classes and properties, are presented. The section ends with a discussion of how the cultural heritage data were extracted and applied to the developed data model.
The study area and POIs
The Sa’dabad complex is a historical and cultural complex of palaces built by the Qajar dynasty at the beginning of the nineteenth century. After the Qajars, the kings of the Pahlavi dynasty resided there and added more palaces to it. The complex covers an area of 110 ha with 180 ha of natural forests, gardens, springs, and rivers in the north of Tehran, Iran. It contains 18 palaces, which belonged to royal families. After the 1979 revolution, the place was turned into a set of museums and galleries for public exhibition, and it is run under the responsibility of the Cultural Heritage Organization of Iran. With its many museums and galleries, the complex holds a vast number of cultural heritage objects and a variety of information associated with events that occurred during the monarchy period, which makes it an appropriate case for this research. Amongst the many palaces in the complex, the Mellat museum and the museum of fine arts were chosen as the POIs for this study. They are located on the south side of the complex area shown in Fig. 1.
Mellat palace, also called the White palace for its color, is the largest building in the complex, with 54 units. It used to be the summer residence of Mohammad Reza Shah, the second king of the Pahlavi dynasty. It was also used for official affairs and meetings. The museum of fine arts is another of the magnificent buildings in the complex. It was used as a royal court from 1968 to 1979, but after the revolution it was named the museum of fine arts because of the great collection of paintings from the Safavid, Afshar, Zand, and Qajar periods collected by Mohammad Reza’s last wife, Farah.
Ontological data model development approach
There are numerous approaches to ontology design and development. In this study, we followed a seven-step procedure from the literature to create our data model. An overview of the steps is shown in Fig. 2; they are discussed in the following subsections.
Step one: domain and scope
The goal of this study is to create a POI-based data model for heritage sites. The largest part of the domain is clearly CH, but it is not the only part. As stated before, we want to give the model a spatial reasoning capability so that it can be used in an LBS system; therefore, we have to extend the domain and give it a geospatial scope too. As a result, this data model should involve a mediation between the two domains.
Step two: reusing existing ontologies
As discussed in the previous part, the scope of our data model involves two domains. Therefore, we reuse an ontology from each domain. We selected CIDOC CRM from the CH domain and GeoSPARQL from the geospatial domain, since both are established standards (an ISO standard and an OGC standard, respectively) that have been verified, validated, and used many times in other projects and have matured over time. The CRMgeo extension has also combined the GeoSPARQL and CIDOC CRM ontologies. However, there are differences between this extension and the data model intended in this study. The CIDOC CRM is an event-centric ontology, and events take place in a specific place at a specific time, and thus in a specific spacetime volume. CRMgeo attempts to use the geospatial standard GeoSPARQL to define the spacetime necessary for historical events. The intention of this study, on the other hand, is to develop a data model for historical POIs. Therefore, it combines both object-centric and event-centric modeling views, with the POI in the center. The CIDOC CRM is used to model historical data related to the POIs, and GeoSPARQL is used to add the spatial semantics necessary for LBSs.
There are a number of ontologies in the CH domain, but probably the most trusted one is the CIDOC CRM. This data model has been through a long and intensive development process since its first version. By adapting itself to the various needs and functions of the CH community, CIDOC CRM has been used in projects ranging from large-scale to local and small-scale ones. It has become an ISO standard and is currently the only ISO data model in the CH and archaeology fields; it has therefore gained an appropriate level of credibility. In this research, we use concepts from this ontology to develop our data model. At the time of writing, CIDOC CRM is in version 6.2.3 and has 99 classes and 188 properties, which is quite large for this study, as it covers all aspects of archaeology and cultural heritage. Since we want to develop a data model for presenting multimedia information related to the events, people, and objects of a heritage POI to its visitors, we have to use a lightweight ontology that contains only the classes and properties needed for the available data. The CIDOC CRM ontology has several major concepts, shown in Fig. 3, which is a qualitative schema of the overall model. CIDOC CRM is based on event-centric information modeling, which means that other classes, such as persons, concepts, and places, are connected to each other via events [25, 40,41,42]. Events are temporal entities that connect the other major concepts together, as can be seen in Fig. 3.
Temporal entity is a top class that contains concepts such as event and activity. It is at the center of this data model, acting as the glue that holds everything together. Event is a general concept referring to historical happenings, whereas activity is a subclass of event that refers to actions carried out by humans, such as construction, creation, and production. Actors are people, historic and influential figures, or groups that participated in the events. Physical and conceptual objects have specific locations, and they witnessed or were influenced by temporal entities. There is a type concept that can be applied to all classes in order to refine the kind of subject in that class. There is also a similar concept, appellation, which covers all sorts of names that are or were given to a particular entity to refer to or identify it. This is the overall schema of the CIDOC CRM; the other subclasses further elaborate on these concepts and add more specific details. The data model that we use in this study is developed from the classes and properties of this ontology according to the information gathered.
GeoSPARQL, on the other hand, is an Open Geospatial Consortium (OGC) standard for modeling, representing, querying, and accessing spatial data on the SW [43, 44]. This ontology has three main classes, shown in Fig. 4.
Spatial object is the top concept, and feature and geometry are its subclasses. Any entity in the world that has a spatial location, such as a school, park, police station, or museum, can be an instance of feature. The geometric characteristics of these features are stored in geometry, which can be further defined using simple feature vocabularies, such as point, lineString, polygon, and surface. Spatial entities are related to each other in some way, for example, by overlapping, being within, or covering one another; these relations are called topological relations in spatial science. GeoSPARQL incorporates the eight 2D topological relations, also called RCC8 or Egenhofer relations, for spatial objects, which are essential for spatial semantics.
Step three: enumerate important terms
In this step, we have to point out the prominent terms used in the information related to POIs in order to decide which classes are needed from the ontologies, especially CIDOC CRM. CH data in Iran has several problems. First of all, there has not been any effort to make the data machine-readable, link the data, or provide SPARQL endpoints or raw data dumps. In addition, there is no dedicated portal providing a large amount of data about heritage sites in Iran. The heritage information is mostly at a preliminary stage and mostly kept privately by the heritage organization. Fortunately, there is one online database for the Sa’dabad complex that lists the objects at each museum with images and textual descriptions (Fig. 5). However, there is no uniform and well-structured metadata for the object descriptions, and the information is not rich enough. Therefore, we used manual corpus extraction to extract information and identify the entities in the textual descriptions so that we could select the classes and properties needed from the CIDOC CRM ontology to model this data. In addition, using the identified keywords, we searched the web for other related information to aggregate and integrate into our data model.
The data finding, extraction, and gathering were done manually, as there was no standard database or structured data in any form. Manual data discovery has certain limitations: it takes a lot of time and effort to extract the information needed. Nevertheless, it was the only way to collect information, given the lack of structured databases, portals, or endpoints to ingest from.
Step four: define classes and the class hierarchy
After enumerating the terms found in the information, we have to define and select appropriate classes for them. Figure 6 shows the classes selected from CIDOC CRM and GeoSPARQL, the mediating class, and the class hierarchy.
As can be seen, E1 CRM Entity is the superclass of all the classes from CIDOC CRM. It is needed so that a property applicable to all other classes can take it as its domain. E5 Event holds the historical events as its individuals. Then there is E7 Activity, the subclass of E5 Event. The distinction between these two classes is that events bring instantaneous changes of state, whereas activities are human actions, and are therefore carried out by instances of the class E39 Actor, that bring changes to an object. In CIDOC CRM, E5 Event has three subclasses; of these, E64 End of Existence is not used here because the available information did not call for it. The other subclass used here is E63 Beginning of Existence, whose instances are the events that bring anything into existence. It has its own subclasses, such as birth, but they are not used here. Amongst the subclasses of E7 Activity, E10 Transfer of Custody and E12 Production are used. Then there is E24 Physical Man-Made Thing, which includes all persistent physical things that are created by humans for a purpose. Here we use this class for the buildings of the POIs. Its subclass, E22 Man-Made Object, is used for the physical and conceptual objects that are held in the heritage places. E36 Visual Item comprises all visual things that are recognizable intellectual or conceptual signs, marks, and images. E38 Image is the subclass of E36 Visual Item for visual objects with form, tone, and color on the surface of photos, paintings, prints, sculptures, or even directly on electronic media. E39 Actor is the class that holds human beings, individually or in groups, who have intentionally performed actions for which they can be held responsible. Its subclass, E21 Person, is for individual real persons who lived or are at least assumed to have lived. E52 Time-Span is used to define the temporal extent of instances of E5 Event and any of its subclasses that are valid for a certain time. E53 Place comprises extents in space on the surface of the earth. It does not have to be given as exact coordinates; it is a more general notion of place that can be the position of any physical reference. E55 Type is used to categorize and classify instances of all CRM classes. Terms from other thesauri and controlled vocabularies can be used for this purpose, which makes this class an interface that connects CIDOC CRM to other knowledge organization systems. E57 Material is a specialization of the class E55 Type and comprises the concepts of materials. On the GeoSPARQL side, Spatial Object is the superclass of all the classes. Feature and Geometry are the two main classes of GeoSPARQL. Feature is for any entity that has a spatial location, which is the POI in our case. These features have geometric characteristics, which are stored through the class Geometry. Geometry has sixteen subclasses, two of which, point and polygon, are used here. POIs can be represented both by a single point and by a polygon for a finer spatial reference. Place feature, a subclass of Feature, is the mediating class that we developed to create a link between the two ontologies; it is explained later in detail.
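The class hierarchy described above can be sketched as a simple parent map. This is an illustrative simplification: intermediate CIDOC CRM classes are collapsed, so the parent links follow the description in the text rather than the full standard, and the dictionary and helper names are hypothetical.

```python
# Parent links as described in the text (intermediate CRM classes omitted).
SUBCLASS_OF = {
    # CIDOC CRM side
    "E5 Event": "E1 CRM Entity",
    "E7 Activity": "E5 Event",
    "E63 Beginning of Existence": "E5 Event",
    "E10 Transfer of Custody": "E7 Activity",
    "E12 Production": "E7 Activity",
    "E24 Physical Man-Made Thing": "E1 CRM Entity",
    "E22 Man-Made Object": "E24 Physical Man-Made Thing",
    "E36 Visual Item": "E1 CRM Entity",
    "E38 Image": "E36 Visual Item",
    "E39 Actor": "E1 CRM Entity",
    "E21 Person": "E39 Actor",
    "E52 Time-Span": "E1 CRM Entity",
    "E53 Place": "E1 CRM Entity",
    "E55 Type": "E1 CRM Entity",
    "E57 Material": "E55 Type",
    # GeoSPARQL side, including the mediating class
    "Feature": "SpatialObject",
    "Geometry": "SpatialObject",
    "Place feature": "Feature",
    "Point": "Geometry",
    "Polygon": "Geometry",
}


def ancestors(cls):
    """Return the chain of superclasses of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls, target):
    """True if cls is target or one of its (transitive) subclasses."""
    return cls == target or target in ancestors(cls)


print(ancestors("E12 Production"))
print(is_a("Place feature", "SpatialObject"))
```

A reasoner over the OWL file performs the same transitive lookup, which is what lets a query for events also return productions and transfers of custody.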
Step five: define the properties (slots) of classes and step six: define the facets of the slots
Since defining properties involves defining their facets (domain and range), we discuss both steps in one part. However, before we define the properties, we need to decide how to design our data model. One of the features of the CIDOC CRM ontology is that it has an inverse property for each property; that is, for each property there is another one with the opposite domain and range. This gives us flexibility in designing our data model.
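To illustrate what inverse properties offer, the following sketch materializes the inverse of each stated triple, so that a link can be followed from either end. The two property pairs are CIDOC CRM properties, but whether the data model uses exactly these is an assumption, and the instance names (`production_1`, `artist_1`) are hypothetical.

```python
# Forward property -> inverse property (two CIDOC CRM pairs, chosen for illustration).
INVERSE = {
    "P14 carried out by": "P14i performed",
    "P108 has produced": "P108i was produced by",
}
# Make the lookup symmetric so inverses of inverses are found too.
INVERSE.update({inv: fwd for fwd, inv in list(INVERSE.items())})


def with_inverses(triples):
    """Return the triples plus the inverse statement of each one that has an inverse."""
    result = set(triples)
    for subject, predicate, obj in triples:
        if predicate in INVERSE:
            # Swap subject and object, which swaps domain and range.
            result.add((obj, INVERSE[predicate], subject))
    return result


stated = {
    ("production_1", "P108 has produced", "Kolo Dance"),
    ("production_1", "P14 carried out by", "artist_1"),
}
for triple in sorted(with_inverses(stated)):
    print(triple)
```

With both directions available, the model can be drawn outward from the object or outward from the event without changing the information it holds.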
There are two main data modeling approaches in the CH domain: event-centric and object-centric. When all the information is attached to the object to describe it, the modeling method is object-centric. In event-centric data modeling, the information is connected through events. This allows all the events and activities that an object was involved in to be modeled in a machine-readable way rather than as a textual description of the object, and other entities, such as actors, time periods, locations, and related details, can be linked to the object via events. The chain of activities and changes of the object can also be modeled and thus reasoned over by machines, which is not possible with object-centricity. Therefore, event-centric modeling is more expressive than the object-centric approach. Since we are using concepts from the CIDOC CRM ontology, our data model has the event-centric characteristic. However, as the objective of this study is to integrate the information related to a POI as a whole to form a knowledge graph of the place, object-centricity is also necessary in our data model. As a result, the data model in this study combines both modeling approaches (Fig. 6).
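The difference between the two approaches can be made concrete with a small sketch. The statue name comes from the case study, but every other value, and the choice of properties linking the nodes, is an illustrative assumption rather than the content of the actual knowledge graph.

```python
# Object-centric: everything hangs off the object as flat attributes.
object_centric = {
    "name": "Kolo Dance",
    "type": "sculpture",
    "maker": "artist_1",       # hypothetical value
    "made_in": "timespan_1",   # hypothetical value
}

# Event-centric: maker and date are reached through a production event.
TRIPLES = {
    ("Kolo Dance", "P2 has type", "sculpture"),
    ("Kolo Dance", "P55 has current location", "place_1"),
    ("Kolo Dance", "P108i was produced by", "production_1"),
    ("production_1", "P14 carried out by", "artist_1"),
    ("production_1", "P4 has time-span", "timespan_1"),
}


def objects_of(subject, predicate, triples=TRIPLES):
    """All values o such that (subject, predicate, o) is stated."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def producers_of(item, triples=TRIPLES):
    """Follow item -> production event -> actor."""
    makers = []
    for event in objects_of(item, "P108i was produced by", triples):
        makers.extend(objects_of(event, "P14 carried out by", triples))
    return makers


print(object_centric["maker"])     # direct attribute lookup
print(producers_of("Kolo Dance"))  # two-hop traversal through the event
```

The event node costs one extra hop, but it gives further events (for example a transfer of custody) a place to attach their own actors, times, and places.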
As can be seen, the POI is in the center of the model and the other concepts are connected to it, which shows the object-centricity of the model. In addition, the POI is linked to GeoSPARQL concepts via Place feature. The Place concept in CIDOC CRM refers to “immobile” objects such as cities, rivers, buildings, ships, etc. Places can also be related to spatial features, which have geometry; therefore, we defined the property isFeature to connect E53 Place to the class Place feature. Geometry can be defined in two ways in GeoSPARQL: Well-Known Text (WKT) or Geography Markup Language (GML). We used WKT for storing the geometry of the POIs. Moreover, the POIs are encoded as both point and polygon features. Although LBS applications mostly use simple point geometry, places are polygons in the real world. Furthermore, some complexes, such as our study area (Sa’dabad), include many POIs, and polygon geometry can be useful in such cases.
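As a sketch of why polygon geometry matters, the code below parses WKT point and polygon literals and tests whether a user position falls inside a POI footprint, a simplified stand-in for a "within" topological relation. The coordinates are hypothetical, the longitude-latitude axis order is an assumption, and the parser handles only a single-ring polygon without holes; a spatially enabled triple store would evaluate this through GeoSPARQL functions instead.

```python
import re

NUM = r"[-+]?\d+(?:\.\d+)?"


def parse_wkt_point(wkt):
    """Parse 'POINT(x y)' into an (x, y) tuple of floats."""
    m = re.fullmatch(rf"\s*POINT\s*\(\s*({NUM})\s+({NUM})\s*\)\s*", wkt, re.IGNORECASE)
    if m is None:
        raise ValueError(f"not a WKT point: {wkt!r}")
    return float(m.group(1)), float(m.group(2))


def parse_wkt_polygon(wkt):
    """Parse a single-ring 'POLYGON((x y, ...))' into a list of (x, y) tuples."""
    m = re.fullmatch(r"\s*POLYGON\s*\(\(\s*(.*?)\s*\)\)\s*", wkt, re.IGNORECASE)
    if m is None:
        raise ValueError(f"not a single-ring WKT polygon: {wkt!r}")
    return [tuple(float(v) for v in pair.split()) for pair in m.group(1).split(",")]


def point_in_ring(point, ring):
    """Ray-casting test: True if point lies strictly inside the ring."""
    x, y = point
    inside = False
    for i in range(len(ring)):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % len(ring)]
        # Count edges that a horizontal ray from the point to +x crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical POI footprint and user position (lon lat order assumed).
footprint = parse_wkt_polygon(
    "POLYGON((51.4220 35.8165, 51.4230 35.8165, 51.4230 35.8175, "
    "51.4220 35.8175, 51.4220 35.8165))"
)
user = parse_wkt_point("POINT(51.4225 35.8170)")
print(point_in_ring(user, footprint))
```

A point-only encoding could answer "how far is the POI" but not "am I inside it", which is why both geometries are stored.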
In Table 1, all the properties used in the model are shown with their domain, range, and description. The classes of CRM are denoted by “E” and its properties are denoted by “P”. The inverse properties of CIDOC CRM have an “i” as an indicator. The classes and properties of GeoSPARQL are denoted by “ogc”.
The property “has url” is not a part of CIDOC CRM; it was added to the model to link web resources to visual entities such as videos and images. As can be understood from the class hierarchy and the properties of the data model, there can be many more links between classes than those depicted in Fig. 7. However, it would be tedious to put all the possible connections in the model figure, so only the main idea of the design of the data model is shown. Finally, the characteristics of the data model are summarized in Table 2.
Step seven: create instances
After designing and constructing the structure of the ontological data model, it is time to populate it with data. The key information, such as the events and objects related to the POIs, was extracted from the official website of the Sa’dabad palace (sadmu.ir), and based on it, related multimedia and information were gathered from the web. This information was organized according to the model into its specific classes, and possible links to other classes were made to form the knowledge graph of the POIs. As previously discussed, the necessary design measures were taken to make this model suitable for representing the information and multimedia related to a place, since the CIDOC CRM is a top-level ontology and is not designed for one specific purpose. Figure 8 shows a partial view of the knowledge graph of the museum of fine arts.
In the graph, the building of the Museum of Fine Arts is an instance of E24 Physical Man-Made Thing, and it is linked to an instance of E53 Place, which is the place where the building is located and which carries the same name as the building. The place is in turn linked to its point and polygon geometries, which are defined in the WGS84 Coordinate Reference System (CRS) and encoded in WKT format. The start of construction of the building is modelled as an instance of E63 Beginning of Existence, with its time recorded as an instance of E52 Time-Span. The construction was motivated by the first king of the Pahlavi dynasty, Reza Shah; it was later interrupted, but was resumed by order of the king’s son, Mohammadreza Shah. No event is recorded for this POI, because no information was available about events that occurred in this place.
Three instances of E22 Man-Made Object are shown in the graph. The images representing them are recorded as instances of E38 Image. Thumbnails are drawn in the graph to indicate this; in the data, however, each image is linked by its URL through the property “has url”. The textual descriptions of the items, which are in Persian, are attached with the data property “P3 has note”. To make the items easier for users to recognize and select according to their preferences, they are classified using the class E55 Type; in the figure, one of the objects is a sculpture and another is a painting. In addition, the materials of the objects are recorded as instances of E57 Material. The production of each object is recorded as an instance of E12 Production, so that related information such as the producer, the producer’s image, a short textual biography of the artist, and the time and place of the production can be linked to it, which is a benefit of event-centric data modelling. The history of an object can also be recorded as contextual information. For example, the statue “Kolo Dance” witnessed the celebration of the 2500th anniversary of the Persian Empire, as it was a gift from the representative of the former Yugoslavia to Iran.
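The object-level part of such a graph can be sketched as plain subject-predicate-object tuples. This is only an illustrative Python sketch, not the authors’ OWL file: the instance identifiers (Kolo_Dance, Kolo_Dance_production, and so on) and the underscore spelling of the property names are assumptions, while the CIDOC CRM classes and the properties P2, P45, P108i and P138i are the ones named in the article.

```python
# Illustrative sketch: a fragment of the POI knowledge graph as plain tuples.
# All instance identifiers below are hypothetical; only the CIDOC CRM class
# and property numbers are taken from the article.
TRIPLES = [
    ("Kolo_Dance", "rdf:type", "E22_Man-Made_Object"),
    ("Kolo_Dance", "P2_has_type", "Sculpture"),
    ("Sculpture", "rdf:type", "E55_Type"),
    ("Kolo_Dance", "P45_consists_of", "Kolo_Dance_material"),
    ("Kolo_Dance_material", "rdf:type", "E57_Material"),
    ("Kolo_Dance", "P138i_has_representation", "Kolo_Dance_image"),
    ("Kolo_Dance_image", "rdf:type", "E38_Image"),
    ("Kolo_Dance", "P108i_was_produced_by", "Kolo_Dance_production"),
    ("Kolo_Dance_production", "rdf:type", "E12_Production"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# Everything asserted directly about the statue.
statue_facts = match(TRIPLES, subject="Kolo_Dance")

# All instances of E22 Man-Made Object in this fragment.
objects = [
    s for (s, _, _) in match(TRIPLES, predicate="rdf:type", obj="E22_Man-Made_Object")
]
```

The wildcard pattern in `match` mirrors, in a very reduced form, how a SPARQL triple pattern with variables selects parts of the graph.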
After the data model has been designed with the required classes and properties and applied to the information related to the POIs, it must be formalized in an appropriate syntax. The Web Ontology Language (OWL) was developed and is recommended by the W3C’s Web Ontology Working Group for ontology development. The OWL files of the POIs can be ingested into a triple store for storage, updating and querying. The triple store exposes a SPARQL endpoint on a server, so that potential LBSs can formulate user requests as SPARQL queries, retrieve the information over HTTP, and augment it on the user’s real-world view. The overall system architecture is presented in Fig. 9.
The Protégé tool was used to build the designed data model in this research. Protégé is a widely known graphical software tool for developing and maintaining ontologies. This ontology editor is free and open source and is used in many projects and studies. A screenshot of the data model developed in the Protégé environment is shown in Fig. 10. For each POI, an OWL file was created that contains the ontology together with the individuals and instances related to that POI. The OWL language is built upon the Resource Description Framework (RDF) and RDF Schema. RDF is a model for describing web resources with a triple structure of subject, predicate and object. However, RDF itself does not define domain-meaningful terms and is neutral in that sense; OWL addresses this limitation, which makes it suitable for ontology formalization. OWL has several syntaxes that are used by the various tools for working with ontologies. RDF/XML and Turtle are two RDF-based syntaxes for OWL, and there are non-RDF-based ones such as OWL/XML and the Manchester OWL syntax, which are handled by libraries such as the OWL API. The syntax chosen in this research is RDF/XML, since it is widely supported by tools related to OWL and ontologies.
Content of the data model
Table 3 gives an overview of the data and content ingested into the data model. For each POI, it lists the number of event, actor and object instances and the amount of multimedia (images, videos, and text). Only events that happened in the POI are counted as events of that place, because the user wants to know what happened there; events that are not directly related to the place are therefore excluded from the event count of a POI. Activities that are directly related to a POI are also counted as events. Instances of person are counted as actors, since person is a subclass of actor and persons are important in the history of the places.
Triple store and SPARQL endpoint server
The Apache Jena Fuseki server provides a triple store based on TDB and a SPARQL server that supports the SPARQL 1.1 query language. It was therefore used to store the prepared OWL files in RDF/XML syntax, and queries were executed against the SPARQL endpoint through the server UI. Other applications can use the endpoint to issue their queries over HTTP in a REST-style format. Fig. 11 shows an example of a SPARQL query in the interface of the Fuseki server (version 3.12.0).
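To illustrate what issuing a query over HTTP can look like for a client application, the following Python sketch builds, but does not send, a request in the "query via GET" form of the SPARQL 1.1 Protocol. The endpoint URL, including the host, port and the dataset name "poi", is a hypothetical placeholder, and the query is a generic one rather than any of the queries used in the study.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint: host, port and dataset name ("poi") are assumptions.
ENDPOINT = "http://localhost:3030/poi/sparql"

# Generic example query, not one of the article's queries.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_sparql_get(endpoint, query):
    """Build (but do not send) a SPARQL 1.1 Protocol 'query via GET' request."""
    # The query text travels URL-encoded in the 'query' parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_get(ENDPOINT, QUERY)
# With a running server, the request could be sent with
# urllib.request.urlopen(request) and the JSON body parsed with json.load.
```

Keeping request construction separate from sending makes the sketch checkable without a running server; a real LBS client would add error handling and timeouts.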
Validating results using semantic queries
When the user selects an item in a potential tour guide application built on this data model, the application has to run semantic queries against the knowledge base to retrieve the selected information. These queries are formed from the properties and links between entities. In this study, a scenario was designed in which information is delivered to the user in a convenient and understandable way, so that the user can easily select the type of information they want to access. We then ran the queries involved in the scenario to check whether the data model returns the desired information. The semantic queries were written in the SPARQL query language and executed through the Apache Jena Fuseki web interface. Table 4 lists the prefixes for the different URLs used in the queries. In the first step, when the user is in front of one of the POIs, the anticipated LBS application identifies the place using GPS and other sensors of the user’s smartphone. The app can then display the name and image of that place, together with the three items “event”, “object”, and “actor” for the user to select from in a potential UI. The query for the name and image of a POI is shown in Table 5.
As can be seen, the query is run against the White Palace knowledge base, and the rest of the scenario continues with this POI. Once the name of the place is known, the user can choose among the three options “event”, “object”, and “actor” for that place. If the user selects “event”, they should see a list of the events that happened in that place. The query for this step is given in Table 6.
As explained earlier, both the activities and the events that happened in the POI are considered when the user selects “event”. According to Table 3, the White Palace has witnessed three events, and the query result correspondingly contains three events for this POI and no activities.
The user might then select an event to find out more about it. The system must therefore be able to navigate the user to other information related to that event, such as its videos and images, the actors involved, and other instances such as its time-span and textual description. Here we assume that the user has chosen the first event in the list (Amuzegar_taking_office); the query for retrieving its properties is shown in Table 7.
An event can be involved in many triples, but only those that come from the data model are needed here, so a filter is used in the query for this purpose. The result contains a P3 property, which links the event to its textual description; two P11 properties, which indicate the actors involved in the event; two P67i properties, which connect the visual items referring to the event; and a P4 property, which describes the time-span of the event. The user can view the instance behind any of them by selecting the related property. To make that possible in the SPARQL code, “?instances” is added to the SELECT clause so that the instances the properties point to are retrieved; for the visual items, the pattern (?instances spOnto:hasURI ?url) is also needed in the WHERE clause to retrieve the URL of the multimedia file. In Table 8, for example, it is assumed that the user wants to see the visual items referring to this event.
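A query of this shape can be sketched as follows. This is a hedged reconstruction, not the query from the article’s tables: the namespace IRI bound to the spOnto prefix is a placeholder, the assumption that the event instance lives in the spOnto namespace is ours, and the STRSTARTS-based filter is only one possible way to restrict the result to properties of the data model. The `?instances spOnto:hasURI ?url` pattern is the one quoted in the text.

```python
# Hypothetical namespace IRI for the spOnto prefix; the real one is listed in
# the article's prefix table and is not reproduced here.
SPONTO_NS = "http://example.org/spOnto#"


def event_properties_query(event_local_name, with_urls=False):
    """Build a SPARQL query listing the data-model properties of one event."""
    lines = [
        f"PREFIX spOnto: <{SPONTO_NS}>",
        "SELECT ?property ?instances" + (" ?url" if with_urls else ""),
        "WHERE {",
        # Assumes the event instance is identified in the spOnto namespace.
        f"  spOnto:{event_local_name} ?property ?instances .",
        "  # Keep only properties defined in the data model's namespace.",
        "  FILTER(STRSTARTS(STR(?property), STR(spOnto:)))",
    ]
    if with_urls:
        # Pattern quoted in the text: resolves a visual item to its file URL.
        lines.append("  ?instances spOnto:hasURI ?url .")
    lines.append("}")
    return "\n".join(lines)


properties_query = event_properties_query("Amuzegar_taking_office")
visual_items_query = event_properties_query("Amuzegar_taking_office", with_urls=True)
```

Because the URL pattern is not wrapped in OPTIONAL, the second query returns only those instances that actually have a URL, which matches the use case of listing the visual items of an event.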
When the user chooses “object”, a list of the objects currently held in the POI must be retrieved. For objects, however, a preliminary filtering level can be applied so that users gain better insight into the objects in the place. This filtering can be done on both the type and the material of the items. In Table 9, for example, the types of objects in the Museum of Fine Arts are queried.
To filter the objects by material instead, the pattern (?X spOnto:P45_consists_of ?materials) should be used in the WHERE clause of the query above. Continuing the scenario, we assume that the user has chosen “sculpture”. A list of the objects of type sculpture must then be retrieved. The SPARQL code for this step and its results are shown in Table 10.
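The two-level selection described here (first list the available facet values, then list the objects that have the chosen value) can be illustrated on a small in-memory set of triples. This is a plain-Python stand-in for the SPARQL queries, under stated assumptions: the object identifiers, the material placeholders and the property spelling P2_has_type are hypothetical, and only P45_consists_of is spelled as quoted in the text.

```python
# Hypothetical miniature of the museum data; identifiers are illustrative.
TRIPLES = [
    ("Kolo_Dance", "P2_has_type", "sculpture"),
    ("Object_B", "P2_has_type", "sculpture"),
    ("The_Dream_Painting", "P2_has_type", "painting"),
    ("Kolo_Dance", "P45_consists_of", "material_1"),
    ("Object_B", "P45_consists_of", "material_2"),
    ("The_Dream_Painting", "P45_consists_of", "material_3"),
]


def facet_values(triples, prop):
    """Distinct values of a facet property, e.g. all object types."""
    return sorted({o for (_, p, o) in triples if p == prop})


def objects_with(triples, prop, value):
    """Objects whose facet property has the chosen value."""
    return sorted(s for (s, p, o) in triples if p == prop and o == value)


# First menu: which types of objects exist in the place?
types = facet_values(TRIPLES, "P2_has_type")
# Second menu: which objects are of the type the user picked?
sculptures = objects_with(TRIPLES, "P2_has_type", "sculpture")
```

Switching the facet from type to material only changes the property name passed in, which parallels swapping a single triple pattern in the WHERE clause.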
The user now chooses one of the objects from the list above to find out more about it. Table 11 shows how to retrieve the properties linked to a selected object.
This object has six properties linked to it, two of which are the images that represent the object. The P7i property indicates that the object was involved in an event. The P3 property links the object to its textual description, and the P2 property indicates its type. Finally, the P45 property indicates the material of the object. As previously shown, by selecting “?instances” the system can retrieve the instances to which these properties are linked. The P7i_witnessed property is different, however, as it links the object to an event at which it was present. This can give the user further information about the objects in a place by showing their connections to other concepts in other places, and can therefore give the user a complete and thorough history of the item. Table 12 shows how to find this event and the information related to it.
This object thus witnessed the ceremony for the 2500th anniversary of the Persian Empire, and the user can find out more about this event through its properties. The P3 property links the event to its textual description, which explains how the object is related to the event and gives other details about the event. As previously shown, the system can retrieve the instances to which the properties are linked by selecting “?instances”, with one additional pattern using the hasurl property in the WHERE clause to retrieve the URLs of the visual items referring to the event.
Finally, the user can see a list of the people and groups related to the POI by selecting “actor”. As already explained, “person”, a subclass of actor, is also included in this list. Table 13 shows how to retrieve this list, together with the results for the Museum of Fine Arts.
The user should then be able to find out more about each result in the list above by selecting it, and to see how this person or actor is related to the place. We assume that the user has selected “Salvadore_Dali”; Table 14 shows how to retrieve the properties in which this instance is involved. Since this person might be involved in an activity or event, there can be two kinds of properties: the instance can be either the domain or the range of a property.
As we can see, the two domain properties link to a textual description and an image that describe the person; how to retrieve these instances has been shown before. The range property, on the other hand, shows that this person was involved in the production of an object in the POI. Table 15 shows how the system can retrieve this object and its related information.
The object is thus “the dream painting”, and four properties describe it. The P2 property denotes the type of the object, which is painting. The P3 property links to the textual description of the object, and the P138i property links the object to its image. The P108i property connects the object to its production activity, which was discussed before.
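The distinction between properties for which an instance is the domain and those for which it is the range corresponds to outgoing and incoming edges in the graph. The sketch below illustrates this on a few in-memory triples. It is a simplified Python stand-in for the SPARQL queries, with assumptions stated plainly: the identifiers Dali_biography_text, Dali_image, Dream_production and The_Dream_Painting are hypothetical, and the use of CIDOC CRM’s P14 carried out by to link the production to the artist is our assumption, since the text does not name that property.

```python
# Hypothetical fragment around the actor; identifiers are illustrative.
TRIPLES = [
    ("Salvadore_Dali", "P3_has_note", "Dali_biography_text"),
    ("Salvadore_Dali", "P138i_has_representation", "Dali_image"),
    # Assumed linking property (CIDOC CRM P14 carried out by).
    ("Dream_production", "P14_carried_out_by", "Salvadore_Dali"),
    ("The_Dream_Painting", "P108i_was_produced_by", "Dream_production"),
]


def outgoing(triples, node):
    """Properties for which the node is the subject (domain side)."""
    return [(p, o) for (s, p, o) in triples if s == node]


def incoming(triples, node):
    """Properties for which the node is the object (range side)."""
    return [(s, p) for (s, p, o) in triples if o == node]


def produced_objects(triples, actor):
    """Follow actor <- production <- object to find what the actor produced."""
    productions = {s for (s, _) in incoming(triples, actor)}
    return sorted(
        s
        for (s, p, o) in triples
        if p == "P108i_was_produced_by" and o in productions
    )


domain_side = outgoing(TRIPLES, "Salvadore_Dali")
range_side = incoming(TRIPLES, "Salvadore_Dali")
works = produced_objects(TRIPLES, "Salvadore_Dali")
```

In SPARQL terms, the outgoing case places the instance in the subject position of a triple pattern and the incoming case places it in the object position.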
As discussed earlier, we integrated the model with GeoSPARQL, the standard spatial ontology, to give it spatial semantics capability, which is needed in user guide and recommendation systems for heritage sites. This integration makes it possible to use various spatial filters and functions, such as ogcf:buffer and ogcf:distance, as well as the eight 2D topological relations, such as ogc:sfContains, ogc:sfWithin and ogc:sfOverlaps. Pseudocode is given for two different spatial semantic queries built from the concepts of the developed data model. In the first one, the goal is to return a number of POIs that contain, for example, paintings, and to sort them by their distance from the user (Table 16).
In the second one, the aim is to retrieve the POIs that lie within a specified buffer around the user (Table 17).
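The logic of these two spatial queries can be illustrated outside the triple store with a small Python sketch. It is not GeoSPARQL and does not reproduce the article’s pseudocode: a haversine great-circle distance on WGS84 points stands in for what ogcf:distance and ogcf:buffer would compute on the server, and the POI names, coordinates and object types are all made up for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, spherical approximation


def parse_wkt_point(wkt):
    """Parse 'POINT(lon lat)' (WGS84, longitude first) into (lon, lat)."""
    inner = wkt.strip()[len("POINT"):].strip().strip("()")
    lon, lat = (float(v) for v in inner.split())
    return lon, lat


def haversine_m(p, q):
    """Great-circle distance in metres between two (lon, lat) pairs."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Hypothetical POIs: names, coordinates and object types are made up.
POIS = {
    "POI_A": {"wkt": "POINT(51.4205 35.8152)", "types": {"painting", "sculpture"}},
    "POI_B": {"wkt": "POINT(51.4260 35.8190)", "types": {"painting"}},
    "POI_C": {"wkt": "POINT(51.4215 35.8160)", "types": {"sculpture"}},
}
USER = (51.4200, 35.8150)  # hypothetical user position (lon, lat)


def nearest_with_type(pois, user, wanted):
    """POIs holding objects of the wanted type, nearest first."""
    hits = [
        (haversine_m(user, parse_wkt_point(d["wkt"])), name)
        for name, d in pois.items()
        if wanted in d["types"]
    ]
    return [name for _, name in sorted(hits)]


def within_buffer(pois, user, radius_m):
    """POIs whose point lies within radius_m metres of the user."""
    return sorted(
        name
        for name, d in pois.items()
        if haversine_m(user, parse_wkt_point(d["wkt"])) <= radius_m
    )


by_distance = nearest_with_type(POIS, USER, "painting")
nearby = within_buffer(POIS, USER, 200)
```

In a deployed system the same filtering and ordering would normally be pushed into the SPARQL query so that the triple store evaluates the spatial functions; the sketch only makes the intended semantics concrete.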
The semantic web (SW) has shown great promise in helping the CH industry in terms of education, entertainment and business. More and more initiatives are being started and supported to promote the memory organizations of society at national and international scale. Iran is a rich country in terms of CH and its number of world heritage sites. However, almost no research has been carried out on applications of SW in Iran’s CH. The aim of this study was to design and develop a spatially enabled, POI-based data model for heritage sites in Iran by reusing CIDOC CRM, the ISO ontology in the CH domain, and integrating it with GeoSPARQL, the standard geospatial ontology. The results showed that the data model is capable of incorporating the heritage information of the sites together with the related spatial semantics. In addition, the ontological data model was easy to adopt for other POIs, which demonstrates the reusability of the ontology. Because it can handle spatial semantics, this data model could serve as a basis for LBS applications. This notion of combined object-centric and event-centric modelling can be applied to other historical places, with possible extensions and class modifications to fit the information related to each place. The information about the POI is modelled using a domain ontology. Ontologies can help prevent biased information, and such a data model allows visitors to explore places freely according to their preferences. Moreover, because extensibility is a core feature of ontologies, the information can be extended easily and is therefore not limited to one-time use.
We used GeoSPARQL to incorporate the spatial semantics. Possible alternatives should be investigated and their performance compared. Furthermore, only local information was used in the data model; the possibility of integrating and enriching it with linked open data and global knowledge bases should be studied. A limitation of this work is that only Persian-speaking users were considered: the data model does not support multiple languages, and dates are recorded in the Persian calendar, both of which need consideration in further studies. In future work, we will apply the data model to other POIs in the Sa’adabad complex and develop a location-based user-guide system with a friendly UX/UI that uses this data model as its backbone and semantic query engine. In this way we can further validate our data model and better show the advantages of employing SW in CH.
Availability of data and materials
Data available on request from the authors.
United Nations Educational, Scientific and Cultural Organization
World heritage list
World heritage convention
Point of interest
Simple knowledge organization system
Art & Architecture Thesaurus
Thesaurus of Geographic Names
Union List of Artist Names
Cultural Objects Name Authority
Visual Resource Association
Categories for the Description of Works of Art
CIDOC CRM: International Committee for Documentation of the International Council of Museums Conceptual Reference Model
Galleries, libraries, archives and museums
Korean Cultural Heritage Data Model
World Wide Web Consortium
Resource Description Framework
Web Ontology Language
European Data Model
Europeana Semantic Elements
Friend of a Friend
MONument Damage Information System
HEritage Resilience Against CLimate Events on Site
Global Positioning System
Semantic location-based services
Open Geospatial Consortium
Geography Markup Language
Coordinate Reference System
Universal Resource Locator
Universal Resource Identifier
Application Programming Interface
EXtensible Markup Language
User Experience/User Interface
We would like to express our deepest gratitude to the anonymous reviewers, whose comments improved this work tremendously.
This research was financially supported by the Ministry of Science and ICT (MSIT), Korea, under the Information Technology Research Center (ITRC) support program (IITP-2021-2016-0-00312) supervised by the Institute for Information & communications Technology Planning & Evaluation (IITP) and the Ministry of Trade, Industry and Energy (MOTIE) and Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program. (Project No. P0016038).
The authors claim there is no conflict of interest.
Cite this article
Ranjgar, B., Sadeghi-Niaraki, A., Shakeri, M. et al. An ontological data model for points of interest (POI) in a cultural heritage site. Herit Sci 10, 13 (2022). https://doi.org/10.1186/s40494-021-00635-9
- Cultural heritage
- Semantic web
- Spatial semantics