RESEARCH ON ENTITY RELATIONSHIPS IN THE KNOWLEDGE GRAPH OF DISEASE MONITORING IN GROTTO TEMPLES

ABSTRACT: Grotto temples, carved into cliffs and widely distributed, are a significant part of China's cultural heritage. However, they face severe threats of damage and collapse from natural disaster risks in their environment. Nearly seventy percent of grotto temples are located in regions prone to earthquakes and water hazards, leading to varying degrees of damage to cultural artifacts. Preventive measures are therefore necessary to reduce the impact of natural disasters on grotto temples. A knowledge graph, a structured semantic knowledge base describing concepts and their relationships in the physical world, plays a crucial role in knowledge organization and content representation. Entity relationships are the core of knowledge, serving both as foundational data and as a key task in constructing knowledge graphs and processing unstructured text. In the field of grotto temple disease monitoring, data continues to grow, yet the correlations within textual data remain underexplored. This paper adopts the BiLSTM-CRF method to extract entity relationships and matches them with the grotto temple monitoring knowledge graph. Finally, Neo4j is used to build and display the knowledge graph, with the aim of enhancing the efficiency of natural disaster risk management and cultural heritage protection for grotto temples.


INTRODUCTION
China has numerous, widely distributed cave temples that integrate sculpture, painting, and architecture, making them a crucial component of historical and cultural heritage (Liu Shijie, 2022). However, influenced by factors such as climate, geographical location, and geological conditions, these cave temples exhibit diverse forms of deterioration. Corresponding protective measures are difficult to standardize, and the specific protection of each cave temple lacks reference and support from a scientific and technological system (Wang Jinhua, 2021). In recent years, as China's cultural heritage protection strategy has gradually shifted towards emphasizing both rescue and prevention, the demand for new scientific and technological approaches to the preventive protection of cave temples has grown steadily (An Cheng, 2020). The scientific protection of cave temples involves the intersection of archaeology, geography, and surveying science and technology (Liu Yingnan, 2022), forming a typical complex knowledge structure. Structuring and linking existing knowledge is one of the directions of artificial intelligence development, with knowledge graphs being a primary theoretical tool (Fu Shan, 2019). Knowledge graphs have been widely applied in fields such as transportation, healthcare, and finance.
Constructing a knowledge graph for cave temples and their risk monitoring can establish a structured knowledge system, which in turn can provide scientific and rational technological support for the risk monitoring of cave temples. The basic unit of a knowledge graph can be represented as a directed triple [Entity1]->[Relation r]->[Entity2], where entities represent abstract concepts of objective things and relations connect two or more entities, giving them meaningful connections (Wang Zhijin, 2013). Relations encompass semantic and syntactic relationships, with semantic relationships reflecting the associations between entity concepts in the knowledge graph (Wang Zhijin, 2014). Research on semantic relationships began in 1987, when Landis and others defined five semantic relationships: antonymy, synonymy, class-inclusion, part-whole, and event (Kallio, 1988). Chinese scholars such as Wang Zhijin, Zhang Xueying, and Jiang Ting have studied semantic relationships in their respective fields, defining types of semantic relationships in information organization, geographic information, information retrieval, and other domains (Wang Zhijin, 2013; Zhang Xueying, 2012; Jiang Ting, 2017).

In addition to research on relationship types, the extraction of entity relationships is a key technology for identifying and understanding semantic relationships. Numerous relationship extraction algorithms have been proposed, with deep learning methods in particular becoming a focus of current research (Xun H, 2013). Yang Yanyun (Yang Yanyun, 2023), Chen Zhongliang (Chen Zhongliang, 2022), Ren Ming (Ren Ming, 2020), Peng Bo (Peng Bo, 2021), and others have used BiLSTM methods to extract entity relationships in fields such as medicine, geology, genealogy, and cultural relics. However, there is currently no evidence of the construction of a knowledge graph or the extraction of entity relationships specifically for cave temples and their risk monitoring. Therefore, considering the limited textual data in this field and focusing on the risk monitoring of cave temples and stone carvings, this study addresses entity relationship extraction concerning risk types, causation mechanisms, monitoring theories, and monitoring methods. To improve accuracy, the paper experiments with the supervised BiLSTM method combined with a conditional random field (CRF) model. A corresponding knowledge graph is also constructed to establish a knowledge system for the monitoring of cave temple diseases, providing effective technological support for the preventive protection of cave temples.

KNOWLEDGE GRAPH FRAMEWORK CONSTRUCTION
Knowledge graphs represent knowledge nodes and their semantic relationships in a structured manner, composed of basic units in the form of "entity-relation-entity" or "entity-property-property value" triplets (Liu Qiao, 2016). Addressing the risk monitoring of immovable cultural relics, Hu Yungang et al. constructed a knowledge graph for the risk monitoring of architectural heritage. The knowledge graph pattern is divided into five levels: monitoring objects and content, monitoring projects and methods, monitoring principles, basic data, and monitoring relationships (Hu Yungang, 2022).

Based on the accumulated research on semantic relationships and the construction of the knowledge graph for cave temple risk monitoring, this study summarizes an organizational structure of semantic relationships with two major categories and three tiers. The two major categories are conceptual semantic relationships and logical semantic relationships; the three tiers subdivide them further. Conceptual semantic relationships include hierarchical relationships and parallel relationships. Hierarchical relationships are further subdivided into subclass, whole-part, attribute, and instance relationships. Logical semantic relationships include causal, temporal, spatial, and monitoring relationships. These two major categories are mainly used to express relationships between entities within the pattern layers of monitoring objects and content, monitoring projects and methods, monitoring principles, and basic data, as well as the mapping relationships between these four pattern layers.

Monitoring is not only a logical semantic relationship but also a research method: it helps investigate the causes of diseases in cave relics, understand trends in disease development, and provide corresponding protective measures. Based on investigation reports and literature from Chen Jianping (Chen Jianping, 2019) and others, monitoring types are classified into routine monitoring, long-term monitoring, and warning monitoring according to attributes such as monitoring cycle, monitoring time, urgency, and cultural relic value. Routine monitoring refers to continuous monitoring over time. Long-term monitoring refers to prolonged and continuous monitoring. Warning monitoring is a proactive method that detects problems before diseases cause serious impacts, aiming to prevent them.
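One way to make the two-category, three-tier organization concrete is as a nested mapping. The sketch below is illustrative only: the dictionary layout, the English key names, and the helper `relation_path` are assumptions rather than part of the original framework, and placing the three monitoring types beneath the monitoring relationship is one possible reading of the text.

```python
# Hypothetical encoding of the relationship taxonomy described in the text.
# Tier 1: category, tier 2: relationship, tier 3: subtype (may be empty).
RELATION_TAXONOMY = {
    "conceptual": {
        "hierarchical": ["subclass", "whole-part", "attribute", "instance"],
        "parallel": [],
    },
    "logical": {
        "causal": [],
        "temporal": [],
        "spatial": [],
        # Assumption: monitoring types are treated as subtypes of "monitoring".
        "monitoring": ["routine", "long-term", "warning"],
    },
}


def relation_path(name, taxonomy=RELATION_TAXONOMY):
    """Return the tier path leading to a relationship name, or None if absent."""
    for category, second_tier in taxonomy.items():
        for relation, subtypes in second_tier.items():
            if name == relation:
                return (category, relation)
            if name in subtypes:
                return (category, relation, name)
    return None


whole_part = relation_path("whole-part")
causal = relation_path("causal")
```

A lookup such as `relation_path("warning")` can then be used to check that a relationship label assigned during annotation belongs to the taxonomy before it is written to the graph.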

3.2 The description and expression of semantic relationships and monitoring types
To express the organizational structure of semantic relationships and monitoring types between entities more accurately, we use the basic building unit of a knowledge graph, the triplet. A triplet consists of three elements: the subject ($e_1$), the relationship indicator ($r_i^{*}$), and the object ($e_2$). The formula is:

$$S = \{(e_1, r_i^{*}, e_2) \mid e_1, e_2 \in E,\ r_i^{*} \in (I \mapsto R)\}$$

In this formula, the superscript $*$ of $r_i^{*}$ denotes the relationship type, and the subscript $i$ denotes the indicator word corresponding to that relationship type. The subscripts of $e_1$ and $e_2$ distinguish two different entities or classes. $S$ is the set of triplets. $E$ is the set of entities. $R$ is the set of relationship indicators connecting entities, that is, the vocabulary describing relationships between entities and classes. $I$ is the set of relationship types, that is, the categories defining specific relationships between entities and classes. $I \mapsto R$ represents the mapping of indicator words to relationship types.

By revealing the semantic relationships and monitoring types between entities, a knowledge graph can store information in an orderly way and retrieve it efficiently. This plays a crucial role in the application of knowledge graphs (Ma Chaolong, 2010), enabling researchers to determine the correct monitoring methods and sequences quickly and clearly, and thereby helping to prevent diseases in cave temples.

In constructing a knowledge graph, the expression of semantic relationships and monitoring types relies on systematically chosen indicator words, which form the middle element of each triplet. The selection of indicator words is not only a tool for linguistic expression but also the key to ensuring that relationship descriptions are accurate and complete. In investigation reports and related literature, indicator words are carefully selected for expressing relationships, so that the triplets of the knowledge graph convey information clearly and accurately. For example, words such as "分为" (divided into) and "包括" (including) indicate a subclass relationship, while words such as "因此" (therefore) and "因为" (because) indicate a causal relationship. Monitoring types, in turn, are determined from disease types and monitoring methods, and their precise definition also relies on indicator words representing monitoring content, such as "渗漏水" (water leakage), "温湿度" (temperature and humidity), and "风化" (weathering). The selection of indicator words therefore directly influences the quality of the relationships in the knowledge graph and provides a reliable foundation for its construction. Accurately extracting and appropriately using indicator words is an indispensable part of ensuring data quality and the precision of relationship descriptions.
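The mapping from indicator words to relationship types can be illustrated with a small lookup table. The following is a minimal sketch, not the extraction method used in this paper: the function `extract_triples`, the example sentence, and the entity list are hypothetical, and the sketch does not model the direction of causal relationships.

```python
# Indicator words and relationship types taken from the examples in the text.
INDICATOR_TO_TYPE = {
    "分为": "subclass",
    "包括": "subclass",
    "因此": "causal",
    "因为": "causal",
}


def extract_triples(sentence, entities):
    """Build (e1, (indicator, relation_type), e2) triples by indicator lookup.

    Assumption: entities left of the indicator word are subjects and
    entities right of it are objects.
    """
    triples = []
    for indicator, relation_type in INDICATOR_TO_TYPE.items():
        pos = sentence.find(indicator)
        if pos == -1:
            continue
        left = sentence[:pos]
        right = sentence[pos + len(indicator):]
        heads = [e for e in entities if e in left]
        tails = [e for e in entities if e in right]
        for e1 in heads:
            for e2 in tails:
                triples.append((e1, (indicator, relation_type), e2))
    return triples


# Hypothetical sentence: "diseases include weathering".
triples = extract_triples("病害包括风化", ["病害", "风化"])
```

Each resulting triple pairs the indicator word (an element of R) with its relationship type (an element of I), mirroring the middle element of the triplet definition.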

BiLSTM-CRF Method and Entity Relation Extraction
The BiLSTM-CRF method is a model based on Bidirectional Long Short-Term Memory (BiLSTM) and Conditional Random Field (CRF). It is capable of capturing bidirectional semantic dependencies, modeling the entire sentence globally, and obtaining rich semantic information. This results in a more comprehensive and accurate representation of word vectors (Liao Tao, 2021). The method has therefore been widely applied in entity relation recognition tasks.
The input gate determines which relevant textual information about cave temple disease monitoring should be included in the analysis at the current time step. The formula is:

$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i) \quad (2)$$

The output gate controls the generation of the analysis results at the current moment with respect to disease monitoring text. The formula is:

$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o) \quad (3)$$

Here $\sigma$ is the sigmoid function, $h_{t-1}$ is the hidden state of the previous time step, $x_t$ is the current input, and $W$ and $b$ are the weight matrix and bias of the corresponding gate.

The design of LSTM enables it to capture long-term dependencies in text, allowing contextual information from the past to be carried forward. This is crucial for understanding the temporal relationships and background information in disease monitoring text. LSTM can also perform well on relatively small annotated datasets, which matters in domains like cave temple disease monitoring where large-scale datasets may not be available.

Compared with the LSTM model, BiLSTM considers forward and backward contextual information simultaneously. This helps capture hidden relevant information and dependencies in the text, leading to a more comprehensive understanding of the content and better performance in recognizing context in cave temple disease monitoring text (Liu K, 2020; Zhuang Chuanzhi, 2019; Xie Yanhong, 2021; Yang Yun, 2022). Its structure is illustrated in Figure 4.

The CRF layer considers the dependency relationships between labels and the observed features for each label. It uses dynamic programming to calculate the optimal path and thereby determine the optimal label sequence (Huang Z, 2015; Fan R, 2019). In entity relation recognition for cave temple disease monitoring text, the combination of label dependency modeling, observation feature integration, and optimal path decoding helps the BiLSTM model identify the various entities in the text more accurately, improving the performance and accuracy of the task. The BiLSTM-CRF model integrates the contextual understanding ability of BiLSTM with the label dependency modeling capability of CRF. Compared with manual extraction, it significantly improves efficiency, performs well, and helps avoid extraction errors caused by fatigue. This model is therefore adopted for the entity relation extraction experiments in this study.
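The optimal-path decoding performed by a CRF layer is commonly implemented with the Viterbi algorithm. The sketch below illustrates that dynamic programming step in plain Python under stated assumptions: the tag set, the emission scores (standing in for BiLSTM outputs), and the transition scores are invented for illustration and are not values from the experiment.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag sequence.

    emissions: list of {tag: score}, one dict per token.
    transitions: {(previous_tag, current_tag): score}.
    """
    # best[t][tag] is the best score of any path ending in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[(p, cur)])
            scores[cur] = best[-1][prev] + transitions[(prev, cur)] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))


tags = ["B", "E", "O"]
# Hypothetical transition scores: reward B->E, penalize impossible moves.
transitions = {(p, c): 0.0 for p in tags for c in tags}
transitions.update({("B", "E"): 1.0, ("O", "E"): -5.0,
                    ("B", "B"): -5.0, ("E", "E"): -5.0})
# Hypothetical per-token scores for a two-character relation word.
emissions = [
    {"B": 1.0, "E": 0.0, "O": 0.5},
    {"B": 0.9, "E": 0.8, "O": 0.1},
]
decoded = viterbi(emissions, transitions, tags)
greedy = [max(tags, key=lambda tag: scores[tag]) for scores in emissions]
```

In this toy input, choosing the best tag for each token independently (`greedy`) yields two consecutive B tags, whereas the transition scores steer `decoded` towards a well-formed B followed by E, which is the kind of label dependency the CRF layer contributes.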

5.1 Data Acquisition and Dataset Construction
Currently, annotated samples of text data in the field of cave temple risk monitoring are severely lacking. It is therefore necessary to read the literature carefully and seek assistance from experts to obtain information about relevant entities and relationship types. In the experiment, taking the CNKI literature database as an example, a search covering 2000 to 2023 was conducted using keywords such as "石窟寺" (cave temple), "石窟寺保护" (cave temple protection), "石窟寺病害" (cave temple diseases), "石窟寺风险监测" (cave temple risk monitoring), and others. A total of 289 documents were obtained, which provided sufficient data for entity relation extraction. The model runs in a Python 3.7 environment with PyTorch 1.10. The parameter settings are batch_size=16, lr=0.0005, and epochs=4. The model achieved an accuracy of 0.814. The experiment successfully extracted indicator words such as "诱发" (induce), "间歇性" (intermittent), and "季节性" (seasonal), as shown in Table 4. These extraction results provide clearer and richer indicator vocabulary for the subsequent construction of relationship models.

Visualization and Application
The Yungang Grottoes, taken here as an example, are among the first national key cultural relics protection units and are also a UNESCO World Cultural Heritage site. Over the centuries, they have suffered significant damage from natural forces and human activities. Monitoring is crucial for the scientific protection of this precious cultural heritage, especially for preventive protection. Through investigation and analysis of the Yungang Grottoes, a knowledge graph can be established to provide a foundation for research on the prevention and control of diseases. Constructing the knowledge graph involves connecting categories and entity nodes, establishing nodes and relationships according to the relationship types determined by this research, storing them in Neo4j, and displaying the completed knowledge graph of cave temple disease monitoring, as shown in Figure 5.
When constructing the knowledge graph, nodes and relationships are created in Neo4j with Cypher CREATE statements.

CONCLUSIONS
A systematic knowledge structure is indispensable in practical applications, and the relationships within it play a crucial role in knowledge organization and content presentation, forming the core of a knowledge system. In the knowledge graph of cave temple disease, the classification of relationships does not negate traditional semantic relationships but builds on them by adding more detailed, domain-specific relationship types. This approach aligns with the trend towards in-depth research on knowledge systems across various fields. By exploring the relationships between monitoring methods and vulnerable entities, as well as the relationships between the structures of cave temples, one can quickly identify suitable monitoring methods, providing new insights for the prevention of cave temple diseases and potentially enhancing the efficiency of preserving cave temples and stone carvings. As technology advances, the cave temple disease monitoring system needs continuous updates and should acquire relevant knowledge through various channels to expand the knowledge graph. Only in this way can theoretical knowledge be applied efficiently in practice, achieving the scientific protection and effective management of cave temple cultural heritage.

Figure 1. Organizational Structure of Semantic Relationships

Figure 2. Structural Diagram of BiLSTM-CRF Model

Long Short-Term Memory (LSTM) is a type of neural network model developed on the foundation of Recurrent Neural Networks (RNNs). It addresses the challenge of handling long-term dependencies and effectively deals with the issues of

Figure 3. Structure Diagram of LSTM Unit

A single LSTM unit comprises a forget gate (f), an input gate (i), and an output gate (o). The forget gate indicates whether the model needs to forget or weaken certain information unrelated to disease monitoring. The formula is:

$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \quad (1)$$
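Each gate of an LSTM unit is conventionally computed as a sigmoid applied to an affine function of the previous hidden state concatenated with the current input. The sketch below evaluates one such gate in plain Python, assuming that standard formulation; the helper names and the numeric weights are invented for illustration and are not trained parameters from this study.

```python
import math


def sigmoid(z):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(weights, bias, h_prev, x_t):
    """Compute sigmoid(W . [h_prev, x_t] + b), one output per weight row."""
    concat = h_prev + x_t  # list concatenation, i.e. [h_{t-1}, x_t]
    return [
        sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
        for row, b in zip(weights, bias)
    ]


# Hypothetical two-unit forget gate with a one-dimensional input.
h_prev = [0.0, 0.0]
x_t = [1.0]
W_f = [[0.0, 0.0, 0.0], [0.5, -0.5, 2.0]]
b_f = [0.0, 3.0]
f_t = gate(W_f, b_f, h_prev, x_t)
```

A gate value near 1 keeps the corresponding component of the cell state, while a value near 0 discards it; the same function can be reused for the input and output gates with their own weights and biases.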

Figure 4. Schematic Diagram of BiLSTM Structure

The annotation format for cave temple risk monitoring corpora is the "BIOE" format. The open-source software doccano is used for annotation, transforming the problem of joint extraction of entity relationships into a sequence labeling problem. The annotated data is then converted programmatically into a format like "岩 B-AFF、体 M-AFF、结 M-AFF、构 E-AFF..." for training the BiLSTM model. The annotation results are shown in Figure 5.
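The conversion from span annotations to character-level tags can be sketched as follows. This is an illustrative assumption about the conversion step, not the script used in the study: the input is assumed to be (start, end, label) character offsets with an exclusive end, the function name `spans_to_tags` is hypothetical, and single-character spans are given a B- tag by assumption. The middle tag follows the "M-" prefix of the example above.

```python
def spans_to_tags(text, spans):
    """Convert (start, end, label) spans into per-character tags.

    `end` is exclusive. Characters outside every span receive "O".
    """
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        if end - start < 2:
            continue  # assumption: a one-character span is tagged B- only
        for i in range(start + 1, end - 1):
            tags[i] = f"M-{label}"
        tags[end - 1] = f"E-{label}"
    return list(zip(text, tags))


# "岩体结构" is the example from the text; "稳定" is added as hypothetical context.
tagged = spans_to_tags("岩体结构稳定", [(0, 4, "AFF")])
```

Each (character, tag) pair in `tagged` corresponds to one line of a sequence labeling training file.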

Figure 5. Display of Data Annotation Effect

Creating nodes: CREATE (<node_name>:<label_name> {name: 'node_name'}), where node_name is the name of the node and label_name is the label assigned to the node.

Creating relationships: CREATE (a)-[:<relation_name>]->(b), where relation_name is the relationship between entity a and entity b.

As shown in the diagram on the right, searching for water leakage in the knowledge graph retrieves six corresponding monitoring methods. In literature and investigation reports, the connection between water leakage diseases and monitoring methods may be expressed with relationship indicator words such as "可以用" (can use), "应用" (apply), and "使用" (use). Based on attributes such as monitoring period, monitoring time, urgency, and cultural heritage value, it can be determined that long-term monitoring is needed for water leakage diseases.
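Node and relationship statements of this kind can be generated from extracted triples before being sent to Neo4j. The sketch below only builds Cypher strings and does not connect to a database; the function names, the labels `Disease` and `MonitoringMethod`, the relationship type `MONITORED_BY`, and the example method name are hypothetical. In practice, query parameters are preferable to string formatting for values that may contain quotes.

```python
def node_statement(var, label, name):
    """Build a Cypher statement that creates one labeled node."""
    return f"CREATE ({var}:{label} {{name: '{name}'}})"


def relationship_statement(a_name, relation, b_name):
    """Build a Cypher statement linking two existing nodes by name."""
    return (
        f"MATCH (a {{name: '{a_name}'}}), (b {{name: '{b_name}'}}) "
        f"CREATE (a)-[:{relation}]->(b)"
    )


# Hypothetical triple: a disease linked to one of its monitoring methods.
statements = [
    node_statement("n1", "Disease", "water leakage"),
    node_statement("n2", "MonitoringMethod", "humidity sensor"),
    relationship_statement("water leakage", "MONITORED_BY", "humidity sensor"),
]
```

The resulting strings can be pasted into the Neo4j browser or executed through a client library of choice.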

Figure 5. Visualization of the Knowledge Graph for Risk Monitoring in Grotto Temples

In conclusion, this article has constructed the entity relationship model for the monitoring of cave temple diseases and created the corresponding knowledge graph. The dataset is labeled in BIOE format, and the BiLSTM-CRF method is employed for entity relationship extraction. The knowledge graph for monitoring cave temple diseases is built with Neo4j and shows accurate results and good experimental performance. Future research could focus on

Table 1. Classification of Data Layer in Risk Monitoring Knowledge Graph of Grotto Temples and Stone Carvings

Following this five-level pattern (see Table 1), a knowledge graph for the risk monitoring of cave temples and stone carvings was established. However, further research, such as relationship classification and extraction, is needed at the monitoring relationship level to complete the knowledge graph system.

Table 2. Summary of Literature Relations

Table 3. Description and Expression of Semantic Relations

Additionally, BIOE labels need to be defined for character data, where B (Beginning) marks the beginning of an entity relationship word, I (Inside) marks the middle part of an entity relationship word, O (Outside) marks non-entity relationship words, and E (End) marks the end of an entity relationship word. This enables the automatic extraction of entity relationships, and suitable tools such as doccano can be selected to assist in the entity annotation process (Han Tao, 2021).
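Once a model has predicted BIOE tags, the tagged characters need to be grouped back into relationship words. The following is a minimal decoding sketch under stated assumptions: the function name `tags_to_spans` is hypothetical, both "I" and "M" are accepted as the middle prefix, and any sequence that does not run from a B tag to a matching E tag is discarded.

```python
def tags_to_spans(chars, tags):
    """Group characters tagged B-x ... E-x into (word, label) pairs."""
    spans, start, label = [], None, None
    for i, tag in enumerate(tags):
        prefix, _, tag_label = tag.partition("-")
        if prefix == "B":
            start, label = i, tag_label
        elif prefix == "E" and start is not None and tag_label == label:
            spans.append(("".join(chars[start:i + 1]), label))
            start, label = None, None
        elif prefix in ("I", "M") and start is not None and tag_label == label:
            continue  # still inside the current word
        else:
            start, label = None, None  # "O" or a malformed sequence
    return spans


# Hypothetical prediction for the characters of "岩体结构稳定".
spans = tags_to_spans(
    list("岩体结构稳定"),
    ["B-AFF", "I-AFF", "I-AFF", "E-AFF", "O", "O"],
)
```

The decoded `(word, label)` pairs are the candidate indicator words and entities that can then be matched against the knowledge graph.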

Table 4. Display of Indicator Word Extraction Results