Automatic Enrichment of Semantic 3D City Models using Large Language Models
Keywords: 3D City Models, CityGML 3.0, Urban Digital Twins, Semantic Enrichment, 3DCityDB, Large Language Models
Abstract. Semantic 3D city models have become an essential component of city planning and digital twin applications. While standards like CityGML have enabled the structured representation of buildings and infrastructure, publicly available CityGML datasets often lack critical semantic attributes such as construction year, usage type, or refurbishment status, and sometimes carry outdated building function values. These gaps hinder the application of 3D models in areas like energy demand analysis or infrastructure planning. Much of the missing data can be found in alternative sources such as municipal records, OpenStreetMap, or other APIs. Yet integrating this heterogeneous and often unstructured information into the CityGML schema remains a complex task that requires geospatial expertise and detailed knowledge of the CityGML data model. In this paper, we explore the use of Large Language Models (LLMs) to automatically extract relevant information from sources such as PDFs, APIs, and Volunteered Geographic Information (VGI) platforms like OpenStreetMap, and to map it into CityGML. The enriched semantic data is stored and managed in a spatial database such as 3DCityDB, for both building and street use cases. We propose a framework based on two LLM agents, one for data enrichment and one for querying, which will enable non-experts to enrich and interact with 3D city models more effectively. Our approach aims to reduce reliance on domain-specific knowledge and make the usage of semantic 3D city models accessible to everyone.