
Scene clusters, causes, spatial patterns and strategies in the cultural landscape heritage of Tang Poetry Road in Eastern Zhejiang based on text mining


The burgeoning field of digital humanities has provided important modern technological means for text mining in literary works. Chinese classical poetry, a treasure of world art, holds significant value for recognizing world cultural heritage. In this study, taking the 1589 Tang poems from the Tang Poetry Road in Eastern Zhejiang as an example, we constructed a research framework that explores the aesthetics of classical Chinese poetry landscapes and spatial imagery at the urban agglomeration scale using geographic and analytical tools such as Python programming, Gephi co-occurrence semantic networks, and GIS kernel density analysis. The framework exhibits three key innovations: (1) a text processing approach that treats individual characters as semantic units in ancient poetry texts, (2) a combination of Python programming techniques and the Gephi visualization tool for social network analysis, and (3) a focus on integrating the textual and spatial aspects of literary landscape heritage corridors at the urban cluster scale. The constructed framework greatly enhances the efficiency and accuracy of Tang poetry text mining. It enables the extraction of natural and cultural landscape spatial imagery along the Tang Poetry Road, the construction of scene depictions, the identification of key regions within the scenes, and the derivation of location-specific strategies. This study broadens the scope of exploring the cultural heritage value of Tang poetry literature and provides practical guidance for the development of cross-regional heritage corridors.


In comparison to traditional approaches, digital humanities [1] emphasizes the comprehensive analysis of vast repositories of humanistic and social science resources, with a focus on uncovering macroscopic and trend-driven patterns. Digital humanities has sparked a vibrant intellectual revolution, heralding a reinvigorated mode of knowledge production [2]. Statistical data reveal that over 180 institutions or projects worldwide currently bear the name “digital humanities,” and scholars from diverse domains, including archaeology, history, literature, and art, are actively advancing this field [3,4,5,6,7]. As an essential component of digital humanities research, text mining leverages information retrieval and extraction, computational linguistics, and natural language processing techniques to convert unstructured textual data efficiently and extract valuable insights from it [8].

Numerous scholars consider literary works as crucial materials for digital humanities research. For instance, Baumard explored the cultural evolution of love in literary history by establishing a database of ancient world literature novels [9]. Hussain conducted textual analysis of ancient Malay manuscripts, folklore, and traditional poetry to uncover the landscape elements of Malay culture [10]. Alex employed natural language processing techniques to create a geographical resolver and validated it using Edinburgh literary texts [11]. These studies demonstrate that literary works serve as significant carriers that reflect landscape features, human activities, and local elements. In this regard, China's ancient poetry, with its rich history spanning three thousand years and a vast corpus of millions of samples, undoubtedly holds a significant position within the world's literary treasury. Undertaking digital humanities research using Chinese classical poetry as the material holds crucial significance in uncovering the value of world cultural heritage. However, compared to the Indo-European literary traditions, ancient Chinese poetry is written in a language of the Sino-Tibetan family, which often relies on word order and function words as the primary means of expressing grammatical meaning [12]. This characteristic brings about complex variations in word meanings, posing substantial challenges to the quantitative mining of knowledge and information in classical poetry texts.

In response to this challenge, several studies focusing on text mining of classical Chinese poetry have been undertaken. For instance, Xu [13] analyzed landscape information in historical poetry texts from Mount Lu, investigating the multi-scale spatial patterns of landscapes and their influencing mechanisms. Xi [14] used cluster network visualization techniques to establish social network relationships between poets and scenic spots using the Tang poetry texts from the Eastern Zhejiang Tang Poetry Road. Barbado [15] employed NLP methods to conduct sentiment and semantic analysis of Spanish sonnets, assigning psychological and emotional labels to each poem. However, these studies still have issues that merit attention in terms of pre-processing of classical poetry word entries, text clustering visualization, and integration of textual and spatial aspects:

(1) Standardizing the term entries for classical poetry, which involves merging multiple forms of term entries with the same semantic meaning, is one of the crucial preprocessing steps in text mining [16]. This step effectively avoids redundant and loose parameters while enhancing the statistical significance of the results. Current research mostly involves creating a word entry database for classical poetry texts through word segmentation and then manually determining the “many-to-one” correspondence between source terms and target terms [17,18,19]. This approach requires a considerable amount of effort and lacks generalizability.

In ancient Chinese, particularly in the composition of classical poetry, monosyllabic words at the character level account for over 80% of the vocabulary [20]. Furthermore, in term entries, nouns primarily provide semantic sources, while words expressing color, quantity, direction, and other qualifiers serve as auxiliary components [21]. Thus, we believe that single-character nouns are key to capturing the core semantics of classical poetry. Extracting single-character nouns from poetic lines attains an effect tantamount to the standardization of specialized vocabulary in classical poetry. Furthermore, the implementation of this process depends on computer techniques such as machine segmentation and part-of-speech tagging. In summary, our research introduces an innovative approach that replaces the manual merging of variant term entries with the automated extraction of monosyllabic nouns. This approach greatly enhances the efficiency and reusability of preprocessing classical poetry corpora, offering a more automated and robust solution.

(2) Dividing a text collection into clusters based on word associations and visualizing the text clustering in a network form has become a significant achievement in text mining research [22]. Previous studies have often utilized tools like ROST CM6 to visualize semantic feature networks [23, 24]. However, when dealing with large-scale text data, ROST CM6 struggles to effectively represent dense network nodes and complex network connections. In comparison, leveraging technologies such as OpenGL, UI, and NetBeans, Gephi provides 12 visualization layout models and 16 built-in network algorithms, allowing smooth handling of networks with millions of elements [25]. In particular, the models optimized through community detection algorithms and the Fruchterman-Reingold layout algorithm offer stronger visual and analytical capabilities for clustering [26]. Therefore, we use the Gephi network visualization tool for the visual analysis of complex text clustering.

(3) The spatial attributes of textual information have emerged as a focus in processing historical text semantics [27,28,29]. Visualizing these attributes holds potential for guiding the preservation of historical cultural heritage, the revitalization of urban and rural aesthetics, and the development of distinctive tourism industries [30]. Previous studies on the integration of spatial information in ancient poetry texts have often focused on exploring relevant poems at the scale of specific scenic spots to uncover local landscape characteristics [31, 32]. However, little research has conducted cross-regional comparisons of literary landscape characteristics at the scale of urban clusters. Hence, this study seeks to contribute to the comprehensive coordination and zoning control of the natural and cultural aesthetics in the eastern Zhejiang region by contrasting the Tang poetry landscape features in three cities within the region.

Therefore, this research embarks on an exploration from the perspective of landscapes, utilizing the poetic works of the Tang Dynasty along the Tang Poetry Road in eastern Zhejiang as its corpus. Through a series of steps including text acquisition, text preprocessing, text flow linguistics processing, text flow mathematical processing, feature extraction, and feature selection (Fig. 1), it aims to unearth and present the natural and cultural heritage landscapes depicted in Tang poems of eastern Zhejiang. The research objectives are as follows: (1) to uncover the elements of natural and cultural landscapes within the Tang poetry texts and construct a database of Tang poetry landscape elements specific to eastern Zhejiang, (2) to extract the aesthetic imagery of Tang poetry landscapes in Shaoxing, Taizhou, and Ningbo cities, thereby promoting the visual transmission of poetic aesthetic heritage, and (3) to identify key regions constituting distinctive characteristics of landscape imagery and conduct causal analyses, thereby guiding the development of differentiated landscapes in different regions.

Fig. 1

Text mining process flowchart


To establish a semantic system of poetic landscapes from a landscape perspective, this study presents a research framework (Fig. 2) for mining literary landscapes in Tang poetry corpora. The framework consists of three main components: collection and preprocessing of original texts, extraction of scenic imagery, and spatial analysis. These components culminate in the derivation of research conclusions.

Fig. 2

Research framework for mining Tang poems based on NLP techniques

Research area

The Tang Poetry Road stretches over 300 km and covers an approximate area of 30.534 km². It is regarded as a significant cultural trail, like the Silk Road and the Ancient Tea Horse Road, with substantial natural and cultural heritage value. The research therefore focuses on eastern Zhejiang, which includes parts of the Hangzhou, Ningbo, Shaoxing, Zhoushan, and Taizhou administrative regions [33] (Fig. 3). However, the construction of the poetry road still faces low resource sharing and utilization among counties and cities, a lack of integration of culturally characteristic pearls, weak brand appeal, and insufficient influence [34].

Fig. 3

Source: Base maps from standard maps.

Research Area—eastern Zhejiang.

Data source

A collection of 1,589 Tang poems from eastern Zhejiang, written by 376 poets, was used as the source material. Two hundred and eighty-three (283) poems with no clear geographic information were excluded, as were 11 poems written in Zhoushan and Hangzhou with strong text contingency. In the end, the research material consisted of 888 poems written in Shaoxing, 311 poems written in Taizhou, and 85 poems written in Ningbo (Additional file 1).

Research methods

Text mining of Tang poems based on character-level segmentation techniques

The Python Jieba module was used for named entity recognition, followed by code built on tools such as Replace, Collections, and Posseg, which performed character-level segmentation, frequency analysis, and part-of-speech tagging. Subsequently, low-frequency characters (those appearing fewer than two times) and meaningless characters were removed to generate a collection of single characters representing Tang poem landscapes. Finally, following the method used by Zhang Zheng [34] and Li Xianfeng [19], we categorized 596 scenic elements and 1038 historical allusions of Tang poems into eight categories: skyscape, landscape, waterscape, animals, plants, people, construction, and allusions (Fig. 4) (Additional files 2 and 3).
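The character-level pipeline described above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: it uses only the standard library, and the small `ASSUMED_NOUNS` set is a hypothetical stand-in for the part-of-speech tagging that the study obtains from Jieba's Posseg. The function name, the punctuation list, and the sample lines are likewise assumptions.

```python
from collections import Counter

# Hypothetical stand-in for part-of-speech tagging. In the study this role is
# played by Jieba's Posseg tagger; here a tiny hand-made noun set is assumed.
ASSUMED_NOUNS = set("云山水月风花松石")
PUNCTUATION = "，。！？、；： \n"


def extract_landscape_characters(poems, min_count=2, nouns=ASSUMED_NOUNS):
    """Split poems into single characters, count them, and keep noun
    characters that occur at least `min_count` times."""
    counts = Counter()
    for poem in poems:
        for ch in poem:
            if ch not in PUNCTUATION:
                counts[ch] += 1
    return {ch: n for ch, n in counts.items() if n >= min_count and ch in nouns}


# Illustrative input only; these are not lines from the corpus.
sample_poems = ["白云青山", "青山流水", "明月松风"]
kept = extract_landscape_characters(sample_poems)
print(kept)
```

With `min_count=2`, characters appearing fewer than two times are dropped, which mirrors the low-frequency filter described in the text. Frequent characters outside the assumed noun set are dropped as well.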

Fig. 4

Chinese character segmentation process flowchart

Co-occurrence semantic network based on Fast-Newman algorithm and Fruchterman-Reingold layout algorithm

A co-occurrence matrix was constructed based on the frequency of co-occurring single-character elements in each poem after cleaning. By using the modularity, average weighted degree, and filtering functions of the Gephi 0.10.1 software, the scenic elements were automatically clustered to generate a co-occurrence semantic network. This allowed the extraction of the Tang poetry scenes from Shaoxing, Ningbo, and Taizhou cities, respectively. The Fast-Newman algorithm (F-N algorithm) was then used to evaluate the quality of the clustering. The core indicator of the F-N algorithm, the modularity (Q-value), measures the quality of the clustering. The formula is as follows:

$$Q = \frac{1}{2m}\sum\limits_{vw} \left[ A_{vw} - \frac{k_{v} k_{w}}{2m} \right] \delta\left( c_{v}, c_{w} \right)$$

where m represents the number of edges in the network, v and w represent any two nodes, and A_vw equals 1 if there is an edge between the two nodes and 0 otherwise. c_v represents the community to which node v belongs, k_v and k_w represent the degrees of nodes v and w, respectively, and δ(c_v, c_w) equals 1 if nodes v and w belong to the same community and 0 otherwise. The Q-value ranges from -0.5 to 1; a Q-value between 0.3 and 0.7 indicates a good clustering effect (Fig. 5).
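To make the modularity formula concrete, the following minimal sketch builds co-occurrence edges from per-poem element lists and evaluates Q for a given partition. It is an illustration under stated assumptions and not the Gephi implementation: co-occurrence counts are used as edge weights in place of the 0/1 adjacency (the two coincide when every weight is 1), and the function names and sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(poems_as_elements):
    """For each unordered pair of elements, count the poems containing both."""
    edges = Counter()
    for elements in poems_as_elements:
        for a, b in combinations(sorted(set(elements)), 2):
            edges[(a, b)] += 1
    return edges


def modularity(edges, community):
    """Q = (1/2m) * sum_vw [A_vw - k_v * k_w / 2m] * delta(c_v, c_w).

    Assumption: A_vw is the co-occurrence count (weighted form), and k_v is
    the weighted degree of node v.
    """
    degree = Counter()
    for (a, b), w in edges.items():
        degree[a] += w
        degree[b] += w
    two_m = sum(degree.values())
    q = 0.0
    # Observed term: each undirected edge appears twice in the ordered sum.
    for (a, b), w in edges.items():
        if community[a] == community[b]:
            q += 2 * w
    # Expected term over all ordered pairs inside the same community.
    for v in degree:
        for u in degree:
            if community[v] == community[u]:
                q -= degree[v] * degree[u] / two_m
    return q / two_m


# Illustrative data only.
poems = [["cloud", "mountain"], ["cloud", "mountain"], ["moon", "boat"]]
edges = cooccurrence_edges(poems)
partition = {"cloud": 0, "mountain": 0, "moon": 1, "boat": 1}
q_value = modularity(edges, partition)
print(round(q_value, 4))
```

In this toy network the partition separates two disconnected pairs, so Q is positive. Placing every node in one community gives Q = 0.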

Fig. 5

Co-occurrence semantic network principle diagram

The Fruchterman-Reingold layout algorithm (F-R layout algorithm) is an improved energy model that generalizes the spring model. It reduces line crossings and achieves an overall uniform layout by combining aesthetic standards, making it suitable for most network datasets. The formula is as follows:

$$E_{s} = \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{n} \frac{1}{2} k \left( d(i,j) - s(i,j) \right)^{2}$$
$$E = E_{s} + \sum\limits_{i = 1}^{n} \sum\limits_{j = 1}^{n} \frac{r w_{i} w_{j}}{d(i,j)^{2}}$$

where i and j represent two nodes, d(i, j) represents the Euclidean distance between them, s(i, j) represents the natural length of the spring between them, k is the spring force coefficient, r is the constant of the Coulomb force between the two nodes, and w_i and w_j are the weights of nodes i and j.
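As a rough illustration of the energy model above, the sketch below evaluates E for a fixed set of node positions. It only scores a layout and is not the iterative F-R layout procedure used in Gephi. Two simplifying assumptions are made: a single natural spring length is used for every pair, and self-pairs (i = j) are skipped, because d(i, i) = 0 would make the repulsion term undefined. All names are hypothetical.

```python
import math


def layout_energy(positions, weights, spring_length=1.0, k=1.0, r=1.0):
    """Return E = E_s + repulsion for a layout, summed over ordered pairs.

    positions: dict mapping node -> (x, y)
    weights:   dict mapping node -> node weight w_i
    """
    spring = 0.0
    repulsion = 0.0
    for i in positions:
        for j in positions:
            if i == j:
                continue  # assumption: self-pairs are excluded
            d = math.dist(positions[i], positions[j])
            spring += 0.5 * k * (d - spring_length) ** 2
            repulsion += r * weights[i] * weights[j] / d ** 2
    return spring + repulsion


# Illustrative two-node layout.
positions = {"cloud": (0.0, 0.0), "mountain": (2.0, 0.0)}
weights = {"cloud": 1.0, "mountain": 1.0}
energy = layout_energy(positions, weights)
print(energy)
```

A layout algorithm would move the nodes to lower this energy: the spring term pulls pairs toward their natural length, while the repulsion term keeps nodes from overlapping.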

Spatial differentiation characteristics based on Kernel density estimation (KDE)

Location information for the poems on the Tang Poetry Road in the Complete Works of Tang Poetry was batch-retrieved using the Python Requests module. From the locations returned by the Baidu API for the 1,284 Tang poems, we constructed an integrated dataset of spatial information for the Tang poetry texts of each scene category. GIS kernel density analysis tools were then used to calculate the spatial differentiation features of the scene imagery of the Tang Poetry Road in Zhejiang Province. KDE is a non-parametric method for probability density estimation: it models the probability density function from the observed data itself, without assuming any particular distribution for data whose distribution is unknown. The formula is as follows:

$$\hat{f}(y) = \frac{1}{nh}\sum\limits_{i = 1}^{n} K\left( \frac{y - Y_{i}}{h} \right), \quad y \in \mathbb{R}$$

Here, n represents the number of samples of known independent identically distributed random variables. K(·) is the kernel function that determines the effect of each sample data point Yi(i = 1,2,…,n) when estimating the density of the random variable y, and h is the bandwidth that affects the smoothness of the probability density estimation.
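The estimator above can be written out directly. The sketch below follows the one-dimensional formula as given, whereas the GIS tool in the study operates on two-dimensional point locations. The Gaussian kernel is an assumption chosen for illustration, since the text does not state which kernel function was used, and the sample values are made up.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as an assumed kernel K."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(y, samples, h):
    """Kernel density estimate f_hat(y) = 1/(n*h) * sum K((y - Y_i) / h)."""
    n = len(samples)
    return sum(gaussian_kernel((y - yi) / h) for yi in samples) / (n * h)


# Illustrative samples: three observations close together and one far away.
samples = [0.0, 0.2, 0.4, 5.0]
dense = kde(0.2, samples, h=0.5)
sparse = kde(2.5, samples, h=0.5)
print(dense > sparse)
```

A larger bandwidth h spreads each sample's contribution more widely and yields a smoother surface. A smaller h produces sharper peaks around individual samples.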


Extraction of spatial image of Poetry Road

Through text network analysis, a category cluster graph of natural and cultural sources was obtained for Shaoxing, Taizhou, and Ningbo cities (Fig. 6) (Additional file 4). The size of the nodes and text corresponded to the weighted degree of the natural and cultural landscape elements, i.e., the importance of the element in the text. The thickness of the lines represented the edge weight between the source and target elements, i.e., how closely a pair of elements was correlated. The different colors represented different clusters, and elements within the same cluster were more closely correlated. To further clarify the specific direction of the sources of imagery, the corresponding specific imagery was extracted in the same way to obtain a cluster graph of single characters and words. Finally, based on this cluster of Tang poems, categories of behavior and emotion were extracted to obtain a behavior-emotion word frequency analysis, ultimately achieving a systematic excavation of natural and cultural sources, behavior, and psychology (Figs. 7, 8, 9, 10, 11, 12, 13, 14 and 15) (Additional file 5). According to the variables in Table 1, Shaoxing City had the highest number of surviving poems on the Tang Poetry Road in eastern Zhejiang, with the most clusters and nodes and the highest average weighted degree, while Ningbo had the fewest (less than 10%). Specifically, Kuaiji ancient prefecture in Yuezhong (now the main urban area of Shaoxing City), as the Southeastern Metropolis, had an unprecedentedly prosperous economy and a convergence of people and traffic, making it the area where Tang poetry was most densely distributed. Taizhou and Ningbo formed secondary cores owing to the breakthrough development of transportation and commerce in their prefectural government Tiantai County and county-level residence Yuyao, respectively [35].

Fig. 6

Co-occurrence semantic network of natural-cultural landscape units in each city

Fig. 7

Natural-cultural-behavioral psychological element analysis of Shaoxing City (cluster 1)

Fig. 8

Natural-cultural-behavioral psychological element analysis of Shaoxing City (cluster 2)

Fig. 9

Natural-cultural-behavioral psychological element analysis of Shaoxing City (cluster 3)

Fig. 10

Natural-cultural-behavioral psychological element analysis of Shaoxing City (cluster 4)

Fig. 11

Natural-cultural-behavioral psychological element analysis of Taizhou City (cluster 5)

Fig. 12

Natural-cultural-behavioral psychological element analysis of Taizhou City (cluster 6)

Fig. 13

Natural-cultural-behavioral psychological element analysis of Taizhou City (cluster 7)

Fig. 14

Natural-cultural-behavioral psychological element analysis of Ningbo City (cluster 8)

Fig. 15

Natural-cultural-behavioral psychological element analysis of Ningbo City (cluster 9)

Table 1 The status quo of Tang Poetry Road Remains in each city

The spatial image of Poetry Road in Shaoxing City

Shaoxing City had four clusters, which were ranked according to influence based on the average weighted degree as follows: Cluster 1 (424) > Cluster 3 (204) > Cluster 2 (174) > Cluster 4 (67) (Figs. 7, 8, 9 and 10).

Cluster 1

The most common behavior in Cluster 1 in Shaoxing was meeting friends (18%), often accompanied by extroverted emotions such as free and unfettered and joyful as well as introverted emotions such as yearning and loneliness (> 5%). Further exploration revealed that skyscapes, waterscapes, plants, and landscape elements such as clouds, springs, smoke, water, streams, creeks, flowers, grass, pines, mountains, rocks, and dust were the most prominent features (average weighted degree is more than 424). Specifically, these were represented by images such as white clouds, spring breeze, misty clouds, running water, Shan Creek, peach blossoms, fragrant grass, the sound of pines, green mountains, white stones, and dust-free. This suggested that spring, represented by images such as peach blossoms and fragrant grass, was the ideal time for poets to engage in outdoor activities such as meeting friends, poetry recitation, scaling heights, and angling, which triggered outgoing emotions such as free and unfettered and joyful. Meanwhile, typical images of animals, people, construction, and allusions included white cranes, mountain birds, ape cries, and human world, fairies, Zhigong, mountain temples, pine paths, monk's rooms, and Liu Lang and Wuling, which resonated with the legends of immortals such as “Liu Ruan Meets Immortals” and “Peach Blossom Spring” in related studies of the Tang Poetry Road in eastern Zhejiang [36]. This deepened the connotations of peach blossoms and the artistic conception of spring and inspired more inward emotions.

Cluster 2

Spiritual practice (43%) was the most frequently recurring behavior in Cluster 2, often accompanied by introverted emotions such as quiet, worried, and ethereal (> 5%). Further investigation revealed that the features of allusions, people, landscape, and constructions such as Taoist Zen words, mount pungent, and fairy feather man, as well as the themes of guests, Tao, master and valley, caves, fields, and gates, mansions, palaces, were particularly evident (average weighted degree is more than 174). These features were specifically manifested as happy excursion, Penglai mountain, and immortal, as well as Zen guest and Taoist master, caves, flat farmland; out of doors, county offices, and dragon palace. Caves were the primary places for spiritual cultivation in Taoist beliefs. Additionally, typical images of skyscapes, waterscapes, and animals, such as early morning, the morning sun, water source, and spirit birds, together formed the imagery of dawn spiritual practices.

Cluster 3

Farewell (29%) was a unique and predominant behavior in Cluster 3 and was often accompanied by extroverted emotions such as romantic and free and unfettered, and by introverted emotions such as loneliness and melancholy (> 5%). Further exploration revealed that skyscape and waterscape elements including wind, moon, autumn; river, sea, lake, tide, and people and construction elements such as gentlemen, boats, barca, and towers were particularly prominent features (average weighted degree is more than 204). These were reflected in the phrases autumn wind, wind and rain, bright moon; river, sea, sea tides, Jinghu lake; and gentlemen’s meeting, gentleman’s recollection, small boat, expeditionary sail, and gate tower, which created typical cultural images that symbolized the loneliness and hardships of being away from home. Meanwhile, the terms both shores, single hill, and sand cay in landscapes, cicadas sound and autumn goose in animals, and red leaf, lotus, and flower in plants suggested the seasonal change of autumn. Representatives of the elegance and suavity of the Wei and Jin Dynasties, such as Xie An and Wang Youjun, had become important subjects for poets to sing and write about.

Cluster 4

Exploring historic landmarks (24%) was the most common behavior in Cluster 4, and market trading was a behavior unique to it. Both were often accompanied by extroverted emotions such as refreshment and leisure time, as well as introverted emotions such as melancholy and loneliness (> 5%). Further exploration revealed that the characteristic elements of construction and allusions such as city, house, pavilion, and carriage, as well as Emperor Yu and Xi Shi, were more prominent (average weighted degree is more than 67). They were specifically reflected in phrases like city walls, neighboring households, palaces, village gate, Dayu Cave, and washing gauze, which described the status of the southeast metropolis and the prosperity of urban and rural areas in the Tang Dynasty, thus triggering extroverted emotions of refreshment and leisure. Meanwhile, the region of Kuaiji in eastern Zhejiang was where the Xia culture originated and rose and where the Yue state established its hegemony. The widespread relics of Yu (tomb, temple, cave, and well) and the Xishi Mountain were important materials poets used to express their feelings. The typical images of the star, Dayu Cave, mountains and rivers, river fish, peach branches, and natives of Wu together rendered the vastness of time and space and strengthened the charm of the Wu and Yue cultures.

The spatial image of Poetry Road in Taizhou City

Taizhou City had three clusters, which were ranked according to their influence based on the average weighted degree as follows: Cluster 5 (126) > Cluster 6 (124) > Cluster 7 (72) (Figs. 11, 12 and 13).

Cluster 5

Farewell (17%), the main behavior in Cluster 5, was mainly associated with introverted emotions such as yearning and melancholy (> 5%). Further investigation revealed that landscape, skyscape, and waterscape elements such as mountain, road, ridge; wind, moon, night, autumn; sea, spring, river, and tide were particularly prominent features (average weighted degree is more than 126). They were specifically expressed as lofty mountain, long road, green mountain; wind blowing, moonlight, midnight, autumn in mountain; and seaside, waterfall and spring, great river, and wave tide. Together, they depicted the wild grandeur of the Zhejiang Tide and set off a distant and melancholy atmosphere. Meanwhile, typical images of animals, plants, people, construction, and allusions such as ape crying, carp; pine branch, tree of heaven; zen monk, gentlemen’s send-off and Ruan Ji’s wine jointly created a tranquil and profound environment, with Ruan Ji’s wine implying the mood of drinking alone under the moon and missing loved ones.

Cluster 6

The phrase spiritual practice (38%) was the most frequent activity in Cluster 6 and was often accompanied by introverted emotions such as homesickness, yearning, and indifference (> 5%). Further investigation revealed that elements such as clouds, sky, sun; humans, guests, recluses, water, stream, and pond were more prominent in the skyscapes, people, and waterscapes (average weighted degree is more than 124). They were specifically manifested as a white cloud, cloudy sky, sunset; no man’s land, mountain guest, mountains and rivers, and spiritual stream, creating a secluded forest where few people have ventured. Plants such as slippery moss, and moss green often appeared as symbols of temples in Tang poetry, while the landscapes, animals, and allusions such as the peaks, Hanyan rock, thousands of ravines; spiritual bird, and Penglai Mountain, Qiongtai fairyland, and the immortals collectively referred to the natural landscape and strong Taoist atmosphere of the forest in the mountains.

Cluster 7

The phrase spiritual practice (36%), which was the most common activity in Cluster 7, was often accompanied by introspective emotions such as boundless, ruthless, and wandering (> 5%). Further investigation revealed that certain elements of landscape, allusions, and construction such as stone, cave, and dust as well as myths and legends, Taoist zen words; and bridges, platforms, and cities were more prominent in this category (average weighted degree is more than 72). These elements were specifically reflected in the stone bridge, tree of heaven, the Yellow Emperor, immortality as well as the platform of tower, and city wall, and imbued with a strong Taoist cultural flavor. Additionally, typical images in the categories of skyscapes, waterscapes, animals, plants, and people, such as wind and thunder; waterfalls, running water; fish and dragon, white crane; pines and firs, and the immortals, frequently occurred together, indicating the close relationship between the pursuit of the Tao and the practice of meditation.

The spatial image of Poetry Road in Ningbo City

Ningbo City had two clusters, which were ranked according to their influence based on the average weighted degree as follows: Cluster 8 (49) > Cluster 9 (27) (Figs. 14 and 15).

Cluster 8

Spiritual practice (31%) was the most frequently observed behavior in Cluster 8, often accompanied by introverted emotions such as solitude, separation, and loneliness (> 5%). Further exploration revealed that elements related to people, landscapes, and waterscapes such as people, visitors; stone, peaks, mountains; and sea, streams, and pools had distinctive features (average weighted degree is more than 49). This was reflected in phrases such as common people, exiled recluse; stone cave, green mountain, Cloud Peak, and sea tide, which confirmed the mountainous landscape of Zhejiang Province, characterized as "a thousand peaks competing, ten thousand gullies contending" (千岩竞秀, 万壑争流). Additionally, typical images related to skyscapes, animals, plants, construction, and allusions included snowy clear, whooping crane, birdsong; pine blossom, underwood; city of God, house, and Master Xuedou, where mythical creatures like cranes and birds, immortal plants like pine trees, and mystical cities expressed the poets' longing for enlightenment and immortality.

Cluster 9

The phrases meeting friends (19%) and exploring historic landmarks (19%) were the most frequently recurring behaviors in Cluster 9 and were often accompanied by introverted emotions such as solitude, deep longing, and seclusion (> 5%). Further exploration revealed that certain elements related to skyscapes and waterscapes such as clouds, breeze, moon, sky, autumn, and night, as well as water, spring, river, and stream, were particularly prominent in evoking these emotions (average weighted degree is more than 27). The desolate temporal imagery of white clouds, refreshing breeze, bright moon, and autumn night, and the expansive spatial imagery of spring water, river flip, and jade stream vividly expressed the inward feelings of solitude and seclusion. Additionally, typical imagery of landscapes, plants, people, construction, and allusions, such as hundreds of valleys, wisteria hanging, the true immortal; window grilles, door catch, and Zen mind created the picturesque scene of secluded mountain retreats, making them an excellent destination for poets seeking to explore the past.

Credibility assessment of the clustering results

In addition to providing supporting evidence by comparing specific poetic verses with ancient literary references and corroborating the spatial positioning outcomes, this research assessed the credibility of the clustering using the holdout method from machine learning [37]. Using a 50/50 split, the textual datasets from Shaoxing City, Taizhou City, and Ningbo City were each randomly divided into test sets A and B. The credibility of the clustering division was evaluated by comparing the congruence of each test set's results with those of the original dataset.

The validation results (Fig. 16) demonstrate that the test sets from Shaoxing City, Taizhou City, and Ningbo City all exhibited a congruence of over 70% with the original results, and the average accuracy of the test sets was 79.2%. These findings further support the robustness of the clustering division in this study (Additional file 6).
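The holdout check described above can be sketched as follows. This is a hedged illustration and not the authors' procedure: the text does not specify how congruence was computed, so the measure below is an assumption (the share of common elements that fall in the best-matching original cluster). The function names and sample labels are hypothetical.

```python
import random


def split_half(items, seed=0):
    """Randomly divide items into two halves (test sets A and B)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def congruence(original, subset):
    """Assumed congruence measure between two clusterings.

    original, subset: dicts mapping element -> cluster label.
    Each subset cluster is matched to the original cluster it overlaps most,
    and the matched elements are counted as congruent.
    """
    shared = [e for e in subset if e in original]
    if not shared:
        return 0.0
    overlap = {}
    for e in shared:
        key = (subset[e], original[e])
        overlap[key] = overlap.get(key, 0) + 1
    best = {}
    for (sub_cluster, _orig_cluster), count in overlap.items():
        best[sub_cluster] = max(best.get(sub_cluster, 0), count)
    return sum(best.values()) / len(shared)


# Illustrative labels only.
original = {"cloud": 1, "spring": 1, "moon": 3, "tide": 3, "boat": 3}
test_set_a = {"cloud": "a", "spring": "a", "moon": "b", "tide": "b", "boat": "a"}
score = congruence(original, test_set_a)
set_a, set_b = split_half(range(10))
print(score)
```

In practice, each half of the poems would be passed through the same co-occurrence and clustering steps before its cluster labels are compared with those of the full dataset.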

Fig. 16

Cluster credibility assessment

Analysis of the causes of spatial images of Poetry Road

The comparison and induction of the scenes of Poetry Road

By comparing the overlap of natural-cultural landscape units and behavioral-emotional words of the nine major clustering scenes in the three cities (Shaoxing, Taizhou, and Ningbo), we explored the differences and commonalities between these cities' clustering scenes (Table 2).

Table 2 The comparison and induction of the scenes of Poetry Road

The results indicated that the clustering scenes in Cluster 2 (Shaoxing), Cluster 7 (Taizhou), and Cluster 8 (Ningbo) shared main landscape elements, including typical landscape features such as a stone cave and a stone bridge, allusions such as a tree of Heaven and Penglai Mountain, and people such as Taoist masters and immortals. These elements induced the main behavior of spiritual practice and often triggered inward emotions such as quiet and melancholy. Therefore, these scenes can be categorized under the theme of meditation and enlightenment. The common main landscape elements in Cluster 3 (Shaoxing), Cluster 5 (Taizhou), and Cluster 9 (Ningbo) included typical elements of skyscapes, waterscapes, and construction, such as autumn wind and bright moon; river flip and sea tide; and small boat and expeditionary sail. These induced the main behaviors of farewell and exploring historic landmarks and often triggered inward emotions of melancholy and loneliness. Therefore, these scenes were categorized under the theme of autumn tides and contemplation. Cluster 1 was unique to Shaoxing and included typical elements of skyscapes, waterscapes, and plants, such as spring breeze and white cloud; running water and Shanxi Creek; and peach flowers and fragrant grass. This scene induced various behaviors such as meeting friends, religious practice, poetry recitation, scaling heights, and angling. It often triggered a mix of inward emotions, such as yearning and loneliness, and outward emotions, such as feeling free and unfettered. Therefore, this scene can be summarized under the theme of spring-seeking immortals. Cluster 4 was also unique to Shaoxing and included typical elements of construction, such as city walls and village gates, and allusions such as Dayu Cave and washing gauze. This scene induced typical behaviors such as exploring historic landmarks and often triggered a mix of outward emotions, such as refreshment and leisure, and inward emotions, such as melancholy and loneliness. Therefore, this scene can be summarized under the theme of visiting an ancient prefectural city. Cluster 6 was unique to Taizhou and included typical elements of skyscapes, such as white clouds and cloudy sky; people, such as no man’s land and mountain visitors; and waterscapes, such as mountains and rivers and spiritual stream. This scene induced the typical behavior of spiritual practice and often triggered inward emotions of homesickness and indifference. Therefore, this scene was summarized under the theme of seclusion in the forest.

Scene 1—meditation and enlightenment: the evolution of mountain faith and its blend with poetic monks

The system of divine caves and the construction of temples and Taoist institutes have been refurbished. The cultural landscape of the mountains in Jiangnan evolved from the worship of mountain gods during the Qin and Han dynasties to the cultivation of Taoism in the mountains and the movement of Buddhism into the mountain forests during the Six Dynasties [38]. By the Tang Dynasty, cave heavens and Buddhist temples, relying on the mountainous environment of the eastern Jiangnan region, accounted for 36.4% and 35.8% of the country's total sacred spaces, respectively. This made it the most densely distributed area of Buddhist and Taoist architecture in the country [38]. The Chronicles of Daoism in Successive Dynasties (《历代崇道记》) recorded that more than 15,000 Taoist priests were ordained since the founding of the Tang Dynasty, while during the reign of Emperor Xuanzong, there were 5,338 temples, with the most concentrated numbers found in the eastern Jiangnan region. The emergence of physical spaces such as temples and palaces heralded the comfort of the practice environment, transforming individualistic ascetic practices in the mountains into group practices bound together by “teacher-student-disciple” relationships [39]. Simultaneously, a comfortable environment for practice required worldly support and patronage. As such, the connection between cultural groups in the mountains and the outside world continued to strengthen, especially through the medium of literary scholars and poetic monks who, through the "Eastern Zhejiang Alliance" of singing in harmony and scattering in four corners (分声唱和, 名散四陬), became symbols of the blend of poetic sentiment and Zen inspiration in eastern Zhejiang [41]. The image of the poetic monk has become a unique cultural icon through the interpretation of poets.

Scene 2—autumn tides and contemplation: the transformation of coastal waterways and the incorporation of nocturnal anchorage imagery

According to the research on the waterline of Hangzhou Bay in the Atlas of Chinese History(《中国历史地图集》), it can be inferred that the mouth of the Qiantang River gradually silted and formed northward after the Tang Dynasty [40]. Therefore, during the Sui and Tang dynasties, the inland areas of Zhejiang Province were closer to the river and sea, and the view of the tides was wider. In the Shanzhong area during the Tang Dynasty, there were numerous lakes and swamps, and the ground elevation was close to sea level [41], making the sound of autumn tides more empty and remote. Supplement to the History of the Tang Dynasty(《唐国史补》) recorded that there is no city or town in the southeast that is not connected to water (“东南郡邑, 无不通水”). Additionally, as early as in the Book of Songs (《诗经》) and Songs of Chu (《楚辞》), the image of night anchoring had become a symbol of lonely and ethereal emotions, as seen in the verses Float in a cedar boat, Drift along the stream. Can’t sleep at night, as if suffering from hidden sorrows and I ride my horse in the morning to the Jiang Gao, I cross the river in the evening at Xi Zhi (“泛彼柏舟, 亦泛其流。耿耿不寐, 如有隐忧” “朝驰余马兮江皋, 夕济兮西澨”). During the peak of poetry in the Tang Dynasty, the image of night anchoring became even more prevalent in poetry. When combined with the legendary stories of Wei and Jin literati traveling and singing in the area, such as Wang Ziyou's Night Visit to Dai and Fan Li's Reclusive Journey in the River, and Xie Lingyun's creek walk in Yue (“王子猷雪夜访戴” “范蠡江隐” “谢灵运越岭溪行”), the connotations of night anchoring became even more profound.

Scene 3—spring seeking immortals: the cultivation of swampy mountains and the development of springtime traditions

During the Wei and Jin dynasties, the immigrant community who had crossed the Yangtze River settled in the hilly areas of the Cao'e River basin in Zhejiang, as they competed with the local gentry. The development of mountainous estates in this area led to the dredging of lakes and canals and the construction of roads [42], while the gathering of the Wang and Xie clans in this region attracted attention from famous figures. This created a celebrity effect and built complete environmental facilities. Consequently, literati from later generations came here to "check-in" and enjoy the complete environment. In addition, artistic families often held literary meetings with the support of the royal court, expanding the content of the gatherings to include the traditional water-side sacrificial ceremonies and purification rituals, as well as activities such as climbing high for spring scenery and drinking by the water [46]. By the Tang dynasty, the Shangsi Festival had become more widely celebrated and developed into one of the important three holidays (Zhonghe Festival, Shangsi Festival, and Chongyang Festival), becoming an auspicious day for literati to participate in purification rituals, travel, feast, and poetry composition [43].

Scene 4—visiting ancient prefectural city: the progression of county functions and the rise of sightseeing and antiquarianism

Although cities first emerged as political strongholds, by the Tang Dynasty the prefectures and counties in eastern Zhejiang had already taken on economic functions, becoming regional economic centers [35]. This gave rise to the gathering of poets who sought to explore the mountains and waters of the region. According to the Complete Prose of the Tang (《全唐文》), the seven prefectures of eastern Zhejiang were producing silk, collecting taxes on cocoons, fishing and salt-making, and providing half of the food and clothing for the entire empire (“茧税鱼盐, 衣食半天下”). Particularly in the commercial center of Yue Prefecture, a wide variety of goods, such as hats, caviar, silk, and celadon porcelain, were found [35]. In the late Tang Dynasty, the commercial function of the cities in eastern Zhejiang transformed, breaking through the night-time trade ban and introducing night markets. Additionally, ancient relics and legends related to Yu the Great’s water management of Kuaiji, his burial in Kuaiji, Emperor Qin Shi Huang’s southern tour of Kuaiji, and Sima Qian’s exploration of Yu’s Tomb were extensively distributed and disseminated. This gave rise to the historical perspective of “Yue as the descendant of Yu,” which laid the foundation for the initial identity of the Kuaiji people [44]. As a result, ancient relics such as those related to Yu the Great became the core scenic spots sought after by poets.

Scene 5—seclusion in the forest: the propagation of both Buddhism and Taoism and the pursuit of leisurely seclusion and wanderlust

During the Sui, Tang, and Five Dynasties periods, which formed the third warm period and the longest humid period in Chinese history, the natural ecological environment in Zhejiang was favorable, and the mountainous areas maintained their original appearance [35]. According to the Song Biographies of Eminent Monks (《宋高僧传》), there was a record of "when visiting Chanlin Temple for morning porridge, many tigers and leopards followed to the temple gate" (“侵星赴禅林寺晨粥, 而多虎豹随到寺门”). In addition, The Chronological History of Immortals and Taoism Through the Ages (《历世真仙体道通鉴》) recorded that the Yellow Emperor visited Tiantai Mountain and received the "Golden Liquid Divine Pill," indicating that Tiantai had legends of “receiving pills and refining pills” (“(轩辕)黄帝尝往天台山, 受金液神丹”) as early as ancient times. During the early Tang Dynasty, the Daoist patriarch Sima Chengzhen lived in seclusion on Tiantai Mountain for thirty years, constructing a complete system of caverns of heaven and place of blessing and founding the Shangqing Tiantai school. He advocated the "Buddhist-Taoist Dual Cultivation" theory of cultivating the heart and nourishing the qi, which further promoted the popularization of Daoist beliefs [41]. Sima was highly respected by the Tang emperors, which attracted many literati to Tiantai seeking to be recommended for official positions. For example, Li Bai visited Sima Chengzhen on his way to Tiantai Mountain and presented him with poems and articles for his feedback. Sima praised him as "having a divine aura and Daoist spirit, capable of traveling to the ends of the earth" (“有仙风道骨, 可与神游八极之表”), and he was later recommended for a position in the Hanlin Academy [45].

Identification of critical areas and deduction of indigenous strategies

Scene 1—meditation and enlightenment

Scene 1 was widely distributed in the eastern Zhejiang region (Fig. 17, 18), with significant concentrations at scenic spots such as Siming Mountain, Wanwei Mountain, Tiantai Mountain, Chicheng Mountain, and Xuedou Temple. The imagery of Buddhists and Daoists in poems such as The colorful banners and flower petals, along with the canopy of the Buddha, fill the clear river and When the book of gods and spirits is opened, from Three Caverns they'll slide (“幡花宝盖满青川” “箓开三洞鬼神惊”) coexisted among the three mountains of Tiantai, Siming, and Kuaiji. This further confirmed that during the Sui and Tang periods, famous mountains had become highlands for the integration and development of religious beliefs. Furthermore, three-quarters of the verses were related to the exchange of poetry recitation between the poet monks in this scene, while two-thirds of the verses were composed at places such as Xuedou Temple, Chenxin Temple, and Longxing Temple. This highlighted the important role of temples and monks in radiating the influence of famous mountains and promoting the integration of Zen and poetry. It also provided important references for the selection and layout of scenes that featured both poetry and Zen today.

Fig. 17 Kernel density distribution of five scene images

Fig. 18 Deduction of indigenous scene images

Scene 2—autumn tides and contemplation

When compared to Scene 1, Scene 2 exhibited a linear distribution in eastern Zhejiang (Figs. 17, 18). A further comparison revealed that poets not only favored singing about famous mountains such as Tiantai, Chicheng, and Longshan, but also frequently composed poems in political and cultural centers such as the capital of Yuezhou (now within the moat of Shaoxing city), the capital of Taizhou (now the ancient city of Linhai), and Shanzhong (now in Xinchang County and Shengzhou City), as well as along waterways such as Jinghu and Shanxi. The statistics showed that three-quarters of Tang poems in Scene 2 contained farewell terms such as "chou" (replying), "jian" (farewell feast), and "zeng" (presenting) (“酬” “饯” “赠”), among which four-fifths of the poems were located at river landing places, thus confirming that Tang poets mainly relied on waterways to travel to and from eastern Zhejiang. Moreover, the "autumn night" depicted in verses such as On an autumn night in Dongyue, a white-haired traveler laments the passage of time and In the place where Xie once sang, his illustrious name has remained unmatched for a thousand years (“东越秋城夜, 西人白发年” “谢公吟处依稀在, 千古无人继盛名”) had the power to inspire poetic sentiments, particularly stirring a poet's nostalgia and lament for the Wei and Jin dynasties' legacy in the eastern Zhejiang region. Therefore, in the planning of Tang poetry-themed tourism projects in present-day eastern Zhejiang, combining a night cruise in autumn with reminiscences of the cultural heritage of the Wei and Jin dynasties could reinforce the cultural and educational significance of the tourism activities.

Scene 3—spring-seeking immortals

In the Shaoxing region, Scene 3 was dispersed among several centers, with Jian Lake and Yunmen Mountain as the core and Taoyuan, Wuzhou Mountain, Ruoye Creek, and Wanwei Mountain as secondary centers (Figs. 17, 18). According to the statistics, two-thirds of the poems with outing themes used keywords such as "deng" (ascend), "you" (stroll), and "fan" (boat ride) (“登” “游” “泛”). Among them, poems composed about creeks and marshes, such as Jinghu Lake, Ruoye Creek, and Shanyin Creek, accounted for about 40%, while about 60% of the poems were written about mountainous areas such as Yunmen Mountain, Wuzhou Mountain, and Wanwei Mountain. In conjunction with phrases like treading the curves of the spring river together and drinking wine in the wind pavilion, indulging in the pleasure of appreciation (“共踏春江曲” “禊饮风亭恣赏心”), these verses confirmed that activities such as climbing, boating, purifying rituals, and literary gatherings were popular among literati groups during the spring outings. Therefore, by further exploring traditional spring festivals such as the Last Day of the Lunar Year, the Dragon Head-raising Festival, and the Double Third Festival, and cultural activities such as water banquets, boating gatherings, and climbing, outings can be organized to activate the multi-faceted benefits of the spring scenery tourism theme. This can also help shift cultural heritage protection and inheritance from materialization to vitalization.

Scene 4—visiting an ancient prefectural city

In the Shaoxing area, Scene 4 had one main center and two secondary centers (Figs. 17, 18). The city of Yuezhou was the main center, and the mountains of Wozhou in Xinchang County and the Huansha River in Zhuji County served as the secondary centers. Further statistical analysis revealed that poems that described the pre-Qin historical sites such as the “Dayu Cave,” “Dayu Temple,” and “Yao Cottage” accounted for three-quarters of all poems, with the majority located in the Dayu Mausoleum Scenic Area in Yuezhou. Additionally, poems that described the historical events of the Wu and Yue kingdoms, such as “The War between Wu and Yue,” “Wuxu,” and “Concubine of Wu,” accounted for two-fifths, and were concentrated in the Huansha River Scenic Area in Zhuji City. Poems that paid tribute to the cultural relics of the Wei and Jin dynasties, such as “Wang and Xie,” “Shining Villa,” and “Wang Yu,” accounted for another two-fifths, and were concentrated in the Wozhou Mountain Scenic Area in Xinchang County. Further, phrases such as Boats travel through the sea straits, and fields surround the city, and Small markets are bustling with activity, and fishermen are separated by deep reeds (“舟船通海峤, 田种绕城隅” “亥茶阗小市, 渔父隔深芦”) indicated that the places to visit not only included secluded scenic spots but also prosperous rural landscapes and bustling marketplaces. Therefore, in the development of the Shaoxing section of the Tang Poetry Road in eastern Zhejiang, the cultural specialties of the poetry route, such as rural landscapes, night markets, and the historical sites of Dayu, can be combined to create a modern version of the "Visiting Ancient Cities" painting, highlighting the immersive tourism experience and situational science education of the ancient city.

Scene 5—seclusion in the forest

Scene 5 was concentrated in mountain sites such as Tiantai Mountain, Chicheng Mountain, Tongbai Mountain, and Shiqiao Temple in Taizhou (Figs. 17, 18). According to research on the traces of poets in Examination of the Poetics of Tang Poetry Road (《浙东唐诗之路诗人行迹考》) [46], two-thirds of the poets in this cluster were hermits and Taoists who lived in seclusion in the mountains and temples; 95% of them composed poems in the context of the Caverns of Heaven and Place of Blessing. Evidently, the mountainous caves and temples with fairy-tale colors were the most popular attractions for reclusive poets. In addition, the frequent appearance of famous monks such as Ge Xuan (葛玄), Zhiyi (智顗), and Zhidun (支遁) in poems such as Following the Footsteps of Zhiyi and Support from the Eminent Monk Zhi Dun, Someday I May Lead the Way (“应齐智者踪” “支遁他年识领军”) indicated that mountainous temples and caves in this area had already established a "cultural brand" of immortal seclusion during the Sui and Tang dynasties. Hence, tourism planning for the Taizhou section of the Tang Poetry Road in eastern Zhejiang can rely on the excellent environment of the mountainous caves and temples, fully tapping into these resources for vacation and health care. These include features such as medicinal food, traditional Chinese medicine, and places such as Caverns of Heaven and Place of Blessing, which can present a modern version of the theme of forest health.


This study also shows that:

  1. (1)

    This study takes the Tang poems of the Tang Poetry Road in Eastern Zhejiang as an example to affirm the significance of literary works as records of behaviors and emotions and as reflections of regional characteristics and social contexts. This role of literary works in revealing the socio-cultural background finds corroboration in Alves' research on the narrative language of Brazilian novels and films [47]. By further analyzing the poets' activities in Figs. 7–15, it is observed that reclusive behavior constitutes the highest proportion, averaging 41.6%. Exploring the biographical information of the poets, it is revealed that 76.3% of them experienced banishment or seclusion, reflecting the cultural and political environment of coexisting Confucianism, Buddhism, and Taoism during the Tang dynasty. This aligns with Teng's research on Chinese Tang dynasty education [48]. Thus, the reclusive behavior of Tang poets not only served as spiritual solace under Confucian, Buddhist, and Taoist influences but also became a significant means to achieve officialdom. Therefore, in-depth exploration of the socio-cultural environment is beneficial for a comprehensive understanding of cultural heritage construction.

This study verifies the significant correlation between emotional expression in literary works and the occurrence of specific behaviors. This response mechanism is supported by Gendolla's exploration of the mutual influence of emotions and actions [49]. By further analyzing the connection between behaviors and emotions from Figs. 8, 9, 10, 11, 12, 13, 14 and 15, “solitude” emerges as the most prevalent emotion, accounting for 3.5%, followed by “free and unfettered” at 3.2%. They are predominantly associated with the behaviors of “boating” and “spiritual practice” at rates of 4/7 and 3/5, respectively. This reflects the emotional connotation of solitude often associated with boating in the context of Tang poets, while seemingly carefree spiritual practice is the externalization of the dichotomy between officialdom and seclusion. This finding is similar to Chen [50] and Shao’s [51] quantitative analysis of the “boat” imagery in Tang poems and the study of the officialdom and seclusion mentality of typical Tang poets. It can be said that this “solitary boating” and “freedom of spiritual practice” have already become classic poetic imagery over a thousand years ago. Therefore, emphasis should be placed on the influence of different behavior activities in shaping the emotions of the participants.

This study confirms the significant association between the charm of celebrities and the formation of specific landscape imagery. Gao’s research on the long-term impact of historical and cultural celebrities on tourism economy also reveals the driving role of the “celebrity effect” [52]. Further statistical analysis of Scene 2 “Autumn Tides and Contemplation” shows that the average edge weight between the nodes “moon,” “night,” “wind,” and “boat” is the highest at 60.0. Moreover, the co-occurrence rate of literary allusions such as Wang Ziyou (王子猷), Fan Li (范蠡), and Xie Lingyun (谢灵运) with the aforementioned elements reaches up to 70%. This reflects the existence of night boating imagery as early as the Jin dynasty, which has been influencing the travel behavior of Tang literati. This aligns with Li's analysis of the night boating imagery on the Zhejiang poetry road [53]. It can be said that the accumulation of literary allusions such as Wang Ziyou's snowy night visit to Dai, Fan Li's seclusion on the river, and Xie Lingyun's creek walk in Yue gradually shaped the night boating imagery with multiple meanings, including a sense of freedom, solitude, and joy. Therefore, in text analysis of literary works, the interpretation of local specific terms such as allusions, legends, and characters is indispensable.

This study also demonstrates that different geographical environmental characteristics often induce different behavioral activities of the poets. Wei et al.’s investigation of the spatial distribution characteristics of tourism reveals the spatial heterogeneity of tourist activities [54]. By comparing Figs. 17 and 18, it is further found that poets in Shaoxing City are more engaged in historical visits, accounting for 24.0%; poets in the mountainous areas of Taizhou City are more involved in interactions with poet-monks, at 16.0%; poets along the Yanxi River are more likely to engage in farewell activities, at 17.7%. This reflects that historical relics are the main attraction for historical visits, mountains and forests are the natural basis for nurturing groups of poet-monks, and flowing rivers are important media for farewell activities. This is in line with Huang's research on the cultural market of the Tang dynasty in China [55]. It can be seen that the ancient city of Shaoxing, with over 2000 years of history, the widespread monastic sites in Taizhou, and the convenient water transportation along the Yanxi River jointly gave rise to diverse travel activities during the Sui and Tang periods. Therefore, the exploration of geographical spatial elements is a crucial aspect of semantic mining in literary works.

  2. (2)

    This study also possesses several noteworthy technical advantages. Firstly, semantic comparative sampling shows that the approach, which treats individual characters as semantic units in the processing of ancient poetry texts, achieves an accuracy rate of 88.3%, demonstrating a high level of reliability. Moreover, the study successfully identified 1,634 landscape nodes, which is 33 times more than previous research [19, 56]. This indicates that this method exhibits greater speed, accuracy, and reusability in the context of classical Chinese text mining. Additionally, this study has constructed a comprehensive knowledge base containing Zhejiang's Tang poetry texts, a collection of landscape terms, a collection of literary allusions, a collection of clusters, and a collection of behavioral emotions, which future researchers can directly consult and reuse (see the attached appendix). Furthermore, Chinese classical poetry, as a precious heritage of world literature, harbors inherent artistic, historical, social, and cultural values that warrant further in-depth exploration in future studies. Therefore, the method of processing individual character semantics holds significant potential and value for wider application.

Secondly, this study utilized Python and the Gephi tool to construct semantic networks and analyze text clustering. 18,860 co-occurrence relationships were identified, which is 35 times more than previous research [19, 56], and they were classified into nine clusters, visually presented in the form of social networks. By applying the principle of holdout method, the clustering results of the text dataset were subjected to credibility testing, resulting in an average test set accuracy of 79.2%, demonstrating the reliability of the clustering results. The combination of Python programming and Gephi tool exhibits remarkable advantages in extracting complex networks from classical poetry, as confirmed by existing research on social network analysis [57,58,59]. Therefore, the application of well-established social network analysis methods from the field of sociology to text mining of classical poetry proves to be a beneficial endeavor.

  3. (3)

    Different research scales have a significant impact on various aspects such as food security levels [60], landscape structure [61], and land policies [62]. Urban agglomerations, as the primary spatial form for carrying developmental elements, serve as the main region for China's economic and social development and play a crucial role in the global shift of economic gravity [63]. Therefore, starting from the perspective of urban clusters is more conducive to integrated planning and zoning control of the basic environment [64].

This study explores the commonalities and individualities in the distribution of literary landscape scenes within Shaoxing, Taizhou, and Ningbo cities, which helps establish accurate connections between regions in terms of heritage and provides guidance for the development of local characteristics. Specifically, the current Tang Poetry Road in Eastern Zhejiang has not yet established a unified and effective coordination mechanism, leading to independent decision-making by counties and cities, as well as a low rate of resource sharing and utilization [33]. Therefore, it is possible to enhance the comprehensive heritage corridor experience for residents and tourists by promoting regional collaboration among literary landscape scenes with similar themes and establishing local connections between historical landscape elements that share the same theme [65]. At the same time, the Tang Poetry Road faces the risk of lacking distinctiveness. Compared with cultural sites that have distinct regional characteristics and hold great cultural significance, it has not received adequate exploration and preservation measures [33]. Therefore, it is advisable to adopt a “characterization” approach in areas with significant variations in literary landscape characteristics, fully utilizing and developing local resources and planning thematic tourism products and routes related to cultural experiences, landscape sightseeing, and eco-leisure, to drive the sustainable development of local literary tourism [66].
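
The character-level co-occurrence counting mentioned in point (2) can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: it assumes that two landscape characters co-occur when they appear in the same verse line, and the `vocabulary` set and the sample lines are hypothetical. The output uses `Source`, `Target`, and `Weight` columns, a layout that Gephi's spreadsheet importer can read as an edge table.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def cooccurrence_edges(lines, vocabulary):
    """Count how often two vocabulary characters appear in the same line.

    The per-line co-occurrence window is an assumption made for this sketch.
    """
    edges = Counter()
    for line in lines:
        # Keep each character once per line and sort so (a, b) has a stable order.
        present = sorted({ch for ch in line if ch in vocabulary})
        for a, b in combinations(present, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical sample lines and landscape vocabulary.
lines = ["明月照孤舟", "秋夜月如霜", "孤舟泊夜月"]
vocabulary = set("月舟夜秋")
edges = cooccurrence_edges(lines, vocabulary)

# Serialize as an edge table with Source, Target, and Weight columns.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target", "Weight"])
for (a, b), weight in sorted(edges.items()):
    writer.writerow([a, b, weight])
edge_csv = buffer.getvalue()
```

Treating each character as a unit avoids dependence on a word segmenter, at the cost of splitting multi-character terms; a curated term list could be matched first if such terms matter.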

The study had a few limitations as follows:

  1. (1)

    Because the study was restricted to the Poetry Road, the research object was limited to Tang Dynasty poetry. In the future, a wider range of poetic content and samples covering a longer time span could be added to the research data to unlock more poetic heritage secrets.

  2. (2)

    The Tang Dynasty existed over 1400 years ago, and some Tang poems no longer have accurate information regarding their original composition locations. Poems with such blurred geographical references had to be excluded from this study. For instance, for 283 poems, only their connection to the Eastern Zhejiang region can be determined, without precise identification of the specific cities or key areas.

  3. (3)

    Chinese ancient poetry extensively employs techniques such as metaphor and allusion. However, the usage of these specialized terms varies significantly across different dynasties and regions. Consequently, in the process of identifying these special terms, it is not feasible to rely on pre-existing specialized vocabulary databases for bulk processing, as it would result in significant inaccuracies. Therefore, based on an extensive study of numerous monographs reflecting the natural and cultural conditions of Zhejiang Province during the Tang Dynasty, as well as investigations into the contemporary landscape and cultural trends of the region, we established a proper-noun vocabulary corpus (see Appendix) by manually selecting words after the Jieba word segmentation process.

  4. (4)

    There is a profound interconnectedness among various art forms, including calligraphy, painting, music, and ancient poetry, all of which constitute humanity's remarkable cultural heritage. Numerous studies have employed machine learning techniques such as Computer Vision and NLP to achieve batch identification of ancient painting creation eras [67], automated recognition of music emotions [68], and identification and evaluation of calligraphy styles [69]. However, due to the specific focus on Eastern Zhejiang's Tang Poetry Road, the study has only explored the heritage value of poetic literature and has not ventured into the constructive exploration of other art forms like painting.


We developed a text-mining framework for the construction of landscape scenes and the identification of key areas in poetic works, focusing on the Poetry Road in Shaoxing, Taizhou, and Ningbo.

  1. (1)

    In the conservation and development of the Eastern Zhejiang Poetry Road, Shaoxing has the strongest influence on the poetic route scenes (Average weighted degree = 664.897). Among them, the scene of “Spring Seeking Immortals” exhibits the highest regional recognition (Average weighted degree = 424), followed by the scene of “Visiting Ancient Prefectural City” (Average weighted degree = 67). These are concentrated in the natural mountain-and-water landscapes and the historic ancient counties of Shaoxing, respectively. Therefore, targeted planning could be introduced for countryside exploration during springtime and the organization of ancient city markets and festivals to uncover the social and living space values, strengthening the themes of rural living and ancient city tourism in Shaoxing.

  2. (2)

    Taizhou's “Seclusion in the forest” scene exhibits a distinctive allure (Average weighted degree = 124), mainly distributed in the religious mountains of Taizhou, aligning with the regional brand of Taizhou's Buddhist and Taoist traditions. Therefore, it is possible to combine forest nature therapy with traditional spiritual practices, transforming these hidden paradises into more romantic, high-quality, and therapeutic modern immortal caves.

  3. Although Ningbo may not exhibit distinct regional Poetry Road characteristics, it has notable collaborative advantages that can strengthen cooperation with other counties and cities. This supports the integration of the overall scenic style of “Meditation and Enlightenment” and “Autumn Tides and Contemplation” throughout the Poetry Road of Eastern Zhejiang.

  4. Comparing the regional commonalities of the three cities' poetic scenes shows that the “Meditation and Enlightenment” and “Autumn Tides and Contemplation” scenes are distributed along the entire route. The “Autumn Tides and Contemplation” scene follows the water transportation hubs in a linear pattern and could be enhanced through shared waterfront corridor construction between counties and cities, creating a modern version of the Eastern Zhejiang Water Gallery. The “Meditation and Enlightenment” scene is scattered across renowned mountain retreats and could be strengthened by integrating the cultural essence of Buddhist and Taoist themes, enhancing the appeal of the “Buddhist and Taoist traditions” brand.

This study broadens the research perspective on the landscape inheritance of literary heritage and provides a basis and guidance for the cross-regional conservation and development of heritage corridors.


  • The Collection of Tang Poetry of the Tang Poetry Road was compiled by Zhu Yuebing (1935–2019), the initiator of the Tang Poetry Road in Zhejiang Province, based on the 1960 edition of The Tang Poetry published by Zhonghua Book Company and the 1992 supplementary edition of The Complete Collection of Tang Poetry also published by Zhonghua Book Company. The book contains a detailed collection of 1,589 Tang poems related to the Tang Poetry Road in eastern Zhejiang, arranged in order along the road, making it the most comprehensive work on Tang poetry in Eastern Zhejiang.

  • Average weighted degree is a metric that measures the overall importance and connection density of the nodes in a clustered graph. It is calculated by summing the weighted degrees of all nodes and dividing by the total number of nodes. In this article, we use this metric to measure the overall importance and density of the nodes.
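The calculation described in this note can be illustrated with a minimal Python sketch. It assumes an undirected co-occurrence graph given as a list of weighted edges; the function name `average_weighted_degree` and the toy character pairs and weights are hypothetical and are not taken from the study's data. Gephi's own statistic may differ in details such as the handling of directed graphs.

```python
from collections import defaultdict


def average_weighted_degree(edges):
    """Mean weighted degree of an undirected graph.

    edges: iterable of (node_a, node_b, weight) tuples, where weight is
    the co-occurrence count (or any non-negative edge weight).
    """
    weighted_degree = defaultdict(float)
    for node_a, node_b, weight in edges:
        # Each edge contributes its weight to both endpoints.
        weighted_degree[node_a] += weight
        weighted_degree[node_b] += weight
    if not weighted_degree:
        return 0.0
    # Sum of all nodes' weighted degrees divided by the number of nodes.
    return sum(weighted_degree.values()) / len(weighted_degree)


# Hypothetical toy co-occurrence counts between single characters.
toy_edges = [("山", "水", 3), ("山", "云", 1), ("水", "月", 2)]
result = average_weighted_degree(toy_edges)
print(result)
```

In this toy graph the weighted degrees are 4, 5, 1, and 2, so the average over the four nodes is 3.0. A higher value indicates that the characters in a cluster co-occur more often on average.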

Availability of data and materials

The datasets used and analysed during the current study are available from the corresponding author on reasonable request.



NLP:

Natural language processing

F-N Algorithm:

Fast-Newman algorithm

F-R Layout Algorithm:

Fruchterman-Reingold layout algorithm

KDE:

Kernel density estimation


  1. Busa R. The annals of humanities computing: the index thomisticus. Medicina nei secoli. Comput Hum. 1980;17(3):443–59.

  2. Lambrechts W, Sinha S, Mosoetsa S. Colonization by algorithms in the fourth industrial revolution. IEEE Access. 2022;10:11057–64.

  3. Allen RB. Collaborative research in the digital humanities. Electronic Library. 2013;32(4):588–9.

  4. Earley T. Spatial history, deep mapping and digital storytelling: archaeology’s future imagined humanities. J Archaeol Sci. 2017;84:95–102.

  5. Nyhan J, Flinn A, Welsh A. Oral history and the hidden histories project: towards histories of computing in the humanities. Lit Linguist Comput. 2013;30(1):71–85.

  6. Finger A. Digital German Studies: Digital Humanities in Linguistics and Literature Studies. Seminar: A Journal of Germanic Studies. 2021;57(3):319–21.

  7. Marcos AF. Digital art: When artistic and cultural muse merges with computer technology. IEEE Comput Graphics Appl. 2007;27(5):98–103.

  8. Justicia de la Torre C, Sanchez D, Blanco I, Mart MJ. Text mining: techniques, applications, and challenges. Int J Uncertain Fuzziness Knowledge-Based Syst. 2018;26(4):553–82.

  9. Baumard N, Huillery E, Hyafil A, Safra L. The cultural evolution of love in literary history. Nat Hum Behav. 2022;6(4):506–22.

  10. Hussain MA, Yunos MYM, Ismail NA, Ismail S. A review of the elements of nature and the Malay cultural landscape through Malay literature. Sustainability. 2020;12(6):2154.

  11. Alex B, Grover C, Tobin R, Oberlander J. Geoparsing historical and contemporary literary text set in the city of Edinburgh. Lang Resour Eval. 2019;53(4):651–75.

  12. Zhang HZ, Ji T, Pagel M, Mace R. Dated phylogeny suggests early Neolithic origin of Sino-Tibetan languages. Sci Rep. 2020;10(1):20792.

  13. Xu Y, Tao ZM, Rong HF. A study of the spatial and temporal characteristics of typical landscapes and the aesthetics of tourism in the poetry of Mount Lushan (庐山诗词中典型景观时空特征及旅游审美研究). J Tourism. 2022;37(5):57–68.

  14. Xs XI, An XR, Zhang GM, Liang SF. Spatial patterns, causes and characteristics of the cultural landscape of the Road of Tang Poetry based on text mining: take the Road of Tang Poetry in Eastern Zhejiang as an example. Herit Sci. 2022;10(1):1–20.

  15. Barbado A, Fresno V, Riesco AM. DISCO PAL: diachronic Spanish sonnet corpus with psychological and affective labels. Lang Resour Eval. 2022;56(2):501–42.

  16. Wang GJ, Ye ZL, Zhao HX, et al. Analysis of hyper-network properties in Tang poetry and Song lyrics. Comput Appl. 2021;41(8):2432–9.

  17. Xiao Y, Liu L. Royal land use and management in Beijing in the Qing dynasty. Land. 2021;10(10):1093.

  18. Li CL, Li XG, Zhao W. Cognition of rural landscape based on the interpretation of words in ancient poems-a case study of Chengdu plain (基于古诗词语义解析的乡村景观认知——以成都平原为例). Chin Landsc Archit. 2020;36(5):76–81.

  19. Li Y, Li XF. The “Poetic” reproduction of landscape imagery-landscape perception of Buddhist temple gardens in Ming dynasty Beijing(风景意象的“诗化”再现——明代北京佛寺园林的景观认知). Landsc Archit. 2022;29(4):128–33.

  20. Wang L. Common knowledge of ancient Chinese(古代汉语常识). Beijing: Beijing United Publishing Company; 2019. p. 103–6.

  21. Barthes R. Principles of semiotics (Éléments de sémiologie). Beijing: China Renmin University Press; 2008.

  22. Zheng Y, Cheng XC, Huang RH, Man Y. A Comparative study on text clustering methods. Advanced Data Mining and Applications: Second International Conference, ADMA 2006, Xi’an, China, August 14-16, 2006 Proceedings 2. Springer Berlin Heidelberg, 2006:644-651.

  23. Cheng Z, Wang NN, Zhao YT, Cheng L, Song T. Water policy evaluation based on the multi-source data-driven text mining: a case study of the strictest water resource management policy in China. Water. 2022;14(22):3694.

  24. Liu YH, Lai LP, Yuan J. Research on Zhanjiang’s leisure sports tourism development strategy in coastal recreational areas. J Coastal Res. 2020;111:248–52.

  25. The Open Graph Viz Platform, a paradigm appeared in the Visual Analytics field of research. 2023. Accessed 9 June 2023.

  26. Gajdoš P, Ježowicz T, Uher V, et al. A parallel Fruchterman-Reingold algorithm optimized for fast visualization of large graphs and swarms of data. Swarm Evol Comput. 2016;26:56–63.

  27. Liu QQ, Tang XL, Li K. Do historic landscape images predict tourists’ spatio-temporal behavior at heritage sites? a case study of West Lake in Hangzhou, China. Land. 2022;11(10):1643.

  28. Griffiths HM. Enacting memory and grief in poetic landscapes. Emot Space Soc. 2021;41:100822.

  29. Zhang YK, Wu B, Tan LF. Information visualization analysis based on historical data. Multimedia Tool Appl. 2022;81(4):4735–51.

  30. Yu XJ, Xu HG. Ancient poetry in contemporary Chinese tourism. Tour Manage. 2015;54:393–403.

  31. Bai XF, Xu H. A study on the spatial and temporal distribution and evolution of historical landscape resources in the Nanjing Zhongshan scenic area in modern times (近代南京钟山风景区历史景观资源时空分布与演变研究). Chin Landsc Archit. 2022;38(7):139–44.

  32. Song XY, Huo XN, Liu YP, et al. An analysis of the temporal and spatial trajectories of relegated poets in the Quan Tang poems from a digital humanities perspective. Libr Intell Work. 2022;66(7):26–34.

  33. Notice of the People's Government of Zhejiang Province on the Issuance of the Development Plan of the Poetry Road Cultural Belt of Zhejiang Province (浙江省人民政府关于印发浙江省诗路文化带发展规划的通知). Zhejiang Provincial People's Government Gazette. 2019; (Z4):4–34.

  34. Zhang Z, Zhong L, Wu BS, et al. A Study of landscape impressions of Tang dynasty gardens based on the temporal and spatial information analysis of garden records (基于园记时空信息解析的唐代园林景观印象研究). Chin Landsc Archit. 2021;37(11):139–44.

  35. Li ZT. General history of Zhejiang Sui, Tang and five dynasties (浙江通史 第4卷 隋唐五代卷), vol. 4. Hangzhou: Zhejiang People’s Press; 2005. p. 26–188.

  36. Lu SJ, Li MR. The development of the Eastern Zhejiang poetry road in the early Tang dynasty(初唐浙东诗路的发展). J Jiangxi Normal Univ Philos Soc Sci Edit. 2022;55(4):87–94.

  37. Bax E. Validation of k-nearest neighbor classifiers. IEEE Trans Inform Theory. 2012;58(5):3225–34.

  38. Bin W. The sacred imagination of mountains and its spatial influence in early medieval China: the case of Mount Tiantai. Soc Sci China Abingdon Routledge J. 2018;39(1):132–64.

  39. Wei B. The history of the six dynasties in the mountains(“山中”的六朝史). Literature, History and Philosophy. 2017. p. 115–168.

  40. Tan QX. Historical Atlas of China(中国历史地图集). Shanghai: China Map Press; 1996. p. 43–4.

  41. Zhu YB. An overview of the road to Tang Poetry(唐诗之路综论). Beijing: China Literature and History Press; 2003. p. 4–6.

  42. Wang ZB. General History of Zhejiang Qin, Han and six dynasties(浙江通史 第3卷 秦汉六朝卷), vol. 6. Zhejiang: Zhejiang People’s Press; 2006. p. 375–8.

  43. Liu LT, Wan MC. Sources, types and main features of festival culture in the Yangtze river basin during the Tang and Song dynasties(唐宋时期长江流域节日文化的源流、类型与主要特质). Jianghan Forum. 2021;5:95–105.

  44. Lin HD, He CW. Re-discussion on Shaoxing Kuaiji and Dayu(再论绍兴会稽与大禹). Zhejiang J. 1995;4:20–3.

  45. Chen ZY. Taoist culture and poetic imagery-A Tang poem about Taoism in the Tiantai Mountains(道教文化与诗歌意象——以有关天台山道教的唐诗为对象). J Shanghai Normal Univ Philos Soc Sci Edit. 2012;41(5):65–70.

  46. Zhu YB. The road of Tang Poetry: a study of the travels of Tang dynasty Poets(唐诗之路唐代诗人行迹考). Beijing: China Literature and History Press; 2004. p. 231–75.

  47. Alves WS. Borders and intersections between aesthetic and politic in the movie Rear Window, by Alfred Hitchcock, and the short-story “Sessão das quarto”, by Roberto Drummond. Estudos de Literatura Brasileira Contemporânea. 2012;39:151–80.

  48. Teng J Y. The research about the education of the poets in the most prosperous period of tang dynasty. Northeast Normal University. 2011.

  49. Gendolla GHE. Comment: do emotions influence action?—of course, they are hypo-phenomena of motivation. Emot Rev. 2017;9(4):348–50.

  50. Chen W. On boat: a magnificent panorama of river Basin in Tang dynasty. Heliyon. 2023.

  51. Shao MZ. Official career and seclusion: typical case studies of tang and song dynasties. East China Normal University. 2011.

  52. Gao YY, Su W. The long-run tourism effect of historical celebrities: evidence from one of the most influential literatus in China. Tour Econ. 2022.

  53. Li JL. Night boat and Zhejiang poetry road(夜航船与浙江诗路). Zhejiang J. 2021.

  54. Wei J, Zhong YD, Fan JL. Estimating the spatial heterogeneity and seasonal differences of the contribution of tourism industry activities to night light index by POI. Sustainability. 2022;14(2):692.

  55. Huang L. Research on culture market of tang dynasty. Huazhong Normal University. 2009.

  56. Wang Y, Li CY, Huang Y. A study on the artistic conception and landscape architecture of pingyuan rural landscape in Western Sichuan under the value path of poetry text. Furnit Interior Decor. 2023;30(04):6–11.

  57. Li ZY, Wang YX, Shi ZH, Huang X, Cui R. What do cancer medical tourists care about? Content analysis based on network texts. Curr Issues Tour. 2022.

  58. Zhang M, Su HH, Wen JH. Analysis and mining of Internet public opinion based on LDA subject classification. J Web Eng. 2021.

  59. Huang Y, Liu R, Huang S, et al. Imbalance and breakout in the post-epidemic era: research into the spatial patterns of freight demand network in six provinces of central China. Plos one. 2021;16(4):e0250375.

  60. Qiao JM, Cao Q, Zhang Z, et al. Spatiotemporal changes in the state of food security across mainland China during 1990–2015: a multi-scale analysis. Food and Energy Security. 2022;11(1):e318.

  61. Jackson HB, Fahrig L. Are ecologists conducting research at the optimal scale? Glob Ecol Biogeogr. 2015;24(1):52–63.

  62. Liu X, Wang Y, Li Y, et al. Changes in arable land in response to township urbanization in a Chinese low hilly region: Scale effects and spatial interactions. Appl Geogr. 2017;88:24–37.

  63. Fang C, Yu D. Urban agglomeration: an evolving concept of an emerging phenomenon. Landsc Urban Plan. 2017;162:126–36.

  64. Liu C, Wang T, Guo Q. Factors aggregating ability and the regional differences among china’s urban agglomerations. Sustainability. 2018;10(11):4179.

  65. Hoppert M, Bahn B, Bergmeier E, et al. The Saale-unstrut cultural landscape corridor. Environ Earth Sci. 2018;77:1–12.

  66. Chen Y, Dang A, Peng X. Building a cultural heritage corridor based on geodesign theory and methodology. J Urban Manag. 2014;3(1–2):97–112.

  67. Zou Q, Cao Y, Li QQ, et al. Chronological classification of ancient paintings using appearance and shape features. Pattern Recogn Lett. 2014;49:146–54.

  68. Gómez-Cañón JS, Cano E, Eerola T, et al. Music emotion recognition: toward new, robust standards in personalized and context-sensitive applications. IEEE Signal Process Mag. 2021;38(6):106–14.

  69. Gao PC, Gu G, Wu JQ, et al. Chinese calligraphic style representation for recognition. Int J Doc Anal Recogn. 2017;20:59–68.



We would like to express our profound gratitude to the many pioneers who have devoted their efforts to the scholarly study of the Tang Poetry Road in Zhejiang Province. In particular, we extend our sincere appreciation to Mr. Zhu, the visionary founder of the Tang Poetry Road in Zhejiang Province, whose seminal publications, including Review of Tang Poetry Road, Examination of the poetics of Tang Poetry Road, and Collection of Tang Poems of Tang Poetry Road, have provided invaluable support and guidance for our research. Numerous Chinese reference materials also inspired and informed the writing of this manuscript. We therefore include them as supplementary information (Additional file 7), in the hope that they will assist fellow researchers in related studies.


This research was funded by Natural Science Foundation of Beijing Province (Grant No. 8222022); Beijing Forestry University Science and Technology Innovation Plan Project (Grant No. 2019JQ03010); The Hot Spot Tracking Project of Beijing Forestry University (Grant No. 2022BLRD08); Special Funds for Basic Scientific Research Funds of Central Universities (Grant No. BLX202111).

Author information



Jiayan Li was responsible for completing the writing of the manuscript, data collection and analysis; TX was responsible for project conceptualization and methods; PY and MS were responsible for revising the logical framework of the paper and for guiding and making suggestions on the methods for analyzing the example cases; Jiayan Li, XG and Jingyuan Lin were responsible for formal analysis and text proofreading. TX, ML, PT and XD were responsible for investigation of cases. Jiayan Li and TX contributed equally to this work and should be considered co-first authors. PY and MS contributed equally to this work and should be considered co-corresponding authors. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Peng Yao or Ming Shao.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1. Textual database of research materials.

Additional file 2. Landscape elements.

Additional file 3. Allusions elements.

Additional file 4. Parameters for clustering.

Additional file 5. Parameters related to behavior-emotion correlations.

Additional file 6. Results of cluster credibility testing.

Additional file 7. Supplementary Chinese literature references.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Li, J., Xu, T., Gu, X. et al. Scene clusters, causes, spatial patterns and strategies in the cultural landscape heritage of Tang Poetry Road in Eastern Zhejiang based on text mining. Herit Sci 11, 212 (2023).
