AUGMENTING 3D CITY MODEL COMPONENTS BY GEODATA JOINS TO FACILITATE AD-HOC GEOMETRIC-TOPOLOGICALLY SOUND INTEGRATION

Virtual 3D city models are integrated, complex compositions of spatial data of different themes, origin, quality, scale, and dimensions. In this paper, we address the problem of spatial compatibility of geodata, aiming to support the ad-hoc integration of virtual 3D city models that include geodata of different sources and themes such as buildings, terrain, and city furniture. In contrast to related work, which deals with the integration of redundant geodata structured according to different data models and ontologies, we focus on the integration of complex 3D models of the same representation (here: CityGML), with regard to the geometric-topologically consistent matching of non-homologous objects (e.g. a building connected to a road) and their geometric homogenisation. We therefore present an approach including a data model for a Geodata Join and the general concept of an integration procedure using the join information. The Geodata Join aims to bridge the lack of information between fragmented geodata by describing the relationship between adjacent objects from different datasets. The join information includes the geometrical representation of those parts of an object which have a specific, known topological or geometrical relationship to another object. This part is referred to as a Connector and is described either by points, lines, or surfaces of the existing object geometry or by additional join geometry. In addition, the join information includes the specification of the connected object in the other dataset and the description of the topological and geometrical relationship between both objects, which is used to aid the matching process. Furthermore, the Geodata Join contains object-related information such as accuracy values and restrictions on movement and deformation, which are used to optimize the integration process.
Based on these parameters, a functional model including a matching algorithm, transformation methods, and conditioned adjustment methods can be established in order to facilitate ad-hoc 3D homogenisation for consistent 3D city models.


INTRODUCTION
Established mass market applications such as navigation systems and visualization tools like Google Earth and MS Bing Maps have shown the potential of using geoinformation for navigation purposes and for the virtual inspection of locations in various fields. However, the potential of 3D spatial information goes well beyond visualization. More complex applications arise in various disciplines which require analysis and simulation functionalities, e.g. showing all roof surfaces with a certain photovoltaic production potential with regard to their spatial properties. Complex queries like this require geodata that carry not only geometric but also semantic information, and in 3D. Semantic 3D city models are geo-referenced urban information models which decompose a city into objects according to logical and spatial criteria. Urban information models provide an integrative frame for data from different disciplines and application fields, both in terms of city inventory taking and planning, and facilitate a large number of analytic and simulation applications in disciplines such as city and infrastructure planning, strategic energy and environmental planning, and disaster management.
When integrating spatial data of different themes, quality, scale, and dimensions, spatial inconsistencies typically occur, which are due to the separate acquisition and continuation of different thematic data [Kampshoff and Benning, 2005]. The fragmentation of geodata leads to a lack of information between objects and datasets. Therefore, the integration of multiple geodatasets into a consistent model is a highly diverse problem that is not yet fully solved, due to the complexity of the syntactic, semantic, and spatial interoperability to be considered. Though standards for geodata infrastructures ensure syntactic and, to a certain degree, semantic interoperability, there are still difficulties w.r.t. the spatial integration. However, the continuous establishment of new and more complex geodata applications requires structurally, semantically, and spatially consistent base data in order to facilitate ad-hoc integration of provided datasets, in the sense of the plug&play principle, to create usable 3D urban models with respect to spatial consistency. Up to now, this requirement is not fulfilled, and time-consuming and cost-intensive manual post-processing is frequently needed. The inconsistency of geodata from different providers is one major reason why the development and establishment of an active geodata market has been far slower than expected in the past [Micus, 2003]. A feasibility study concerning the realisation of the EU environmental noise directive in the state of North Rhine-Westphalia, Germany, has shown that the largest fraction of time, and thus of costs, was used for the generation, preparation, and integration of geometry and attribute data (90%), as opposed to a small fraction of time for the actual noise calculation and noise mapping. In particular, a high proportion of time was used for the homogenisation and processing of geometry data (30%) [Plümer et al., 2006].

City Geography Markup Language (CityGML)
As mentioned above, complex applications like analysis and simulations require virtual 3D city models that include, beside the geometric representation, also coherent semantic information and application-relevant parameters of the objects. The City Geography Markup Language (CityGML) is an international standard for the representation and exchange of semantic 3D city and landscape models issued by the Open Geospatial Consortium (OGC) in 2008 [Open Geospatial Consortium, 2008]. CityGML defines a common information model and data exchange format for the most relevant topographic objects in cities and regional models. It specifies a common definition of classes, attributes, and relations in terms of the ontology of 3D city models, with respect to their geometrical, topological, semantic, and appearance properties. This thematic information goes beyond simple 3D visualization and is required for sophisticated analysis tasks in different application domains like simulations, urban data mining, facility management, and thematic inquiries. CityGML is implemented as an application schema of the Geography Markup Language (GML) 3.1.1, the extensible international standard for geodata exchange and encoding issued by the OGC and the ISO TC211 [Kolbe, 2008].

Geodata Integration
During the integration of spatial data, very different problems occur depending on the foundation and lineage of the data and on the objective. Sester [2007] describes the characteristics of the integration of two-dimensional data from various sources and different data models. A prerequisite for the interoperability of these data is that the structural, the geometrical, and the semantic differences between the data sources are considered. Structural interoperability is achieved through the currently established standards of the International Organization for Standardization (ISO) and the OGC. According to Sester [2007], the different representations and content of geodatasets, and thus the semantic differences, are more problematic. In order to facilitate a meaningful integration of those data, the semantics have to be made comparable. If, for example, an object is denoted by the term "lake" in one dataset and by the term "pond" in another, the equality of the object types can only be identified by a semantic analysis [Sester, 2007]. Sester [2007], Duckham and Worboys [2005], and Volz [2006] have used an approach based on the idea that objects at the same location with similar geometric structures have a semantic relationship.
The use of a standardized semantic data model for spatial data facilitates consistent data integration with regard to the structural and, to a certain degree, also the semantic and spatial interoperability. Although one object could be modelled using different semantic classes or resolutions due to varying interpretations by different data producers, the variations are limited by the semantic data model of the standard used. Based on the semantic information, assumptions about the meaning of objects can be made and thus implicit geometric relationships between objects of different models can be derived. These relationships can be formulated by rules or conditions facilitating the semantic and spatial interoperability. Stadler and Kolbe [2007] describe the benefits of the integration of geodata using semantically decomposed models with strict observance of the coherence between the semantics and the geometry. Koch and Heipke [2006] present methods for the integration of two-dimensional GIS data with a Digital Terrain Model (DTM), e.g. the integration of water or traffic areas with a DTM. Geometrical relationships between the different thematic objects from different models are derived from the semantics of the objects. The geometrical relationships are formalized by integration rules and conditions and flow into adjustment procedures to facilitate a spatio-semantically consistent overall model in which water surfaces are restricted to be horizontal and roads do not exceed a certain slope [Koch and Heipke, 2006].
When integrating thematic objects from different datasets, e.g. a footpath with a building, the actual relationship between these objects can often not be determined by assumptions from the implicit semantics or by statistical tests. For example, an existing gap between a building model and a traffic area model may describe a geometric inconsistency between both datasets, or it may really exist in the form of a narrow grass or gravel strip between the objects. The semantic data model of CityGML already includes a first concept for the explicit description of relationships between city objects and a digital terrain model. The Terrain Intersection Curve is an explicitly modelled line which represents the connecting line between the geometry of buildings or other city objects and the terrain surface, and which facilitates spatial consistency between these classes [Open Geospatial Consortium, 2008]. Emgard and Zlatanova [2008a, 2008b] present an information model which intends to integrate geographic features on the earth surface as well as above and below the earth surface into a common semantic-geometric model, and which includes an extension of the idea of the CityGML Terrain Intersection Curve to describe relationships between objects of different dimensions and the terrain. CityGML version 2.0 will come up with a new concept to specify the relationship of objects with the terrain surface. The new attributes relativeToTerrain and relativeToWater are available for every _CityObject and specify the feature's location with respect to the surrounding terrain or water surface by qualitative (not quantitative) expressions, e.g. entirelyAboveTerrain or substantiallyBelowTerrain.
The discussed concepts address the problems of geodata integration, where the structural, geometrical, and semantic differences have to be considered in order to facilitate data interoperability. The structural interoperability can be achieved by standards for modelling geodata. The semantic and geometrical interoperability can be aided by using a standardized semantic data model and by assumptions from the implicit semantics. However, there are still many cases, especially when integrating adjoining objects of different themes, where the actual relationship cannot be assumed and explicit linkage information is needed. The approach presented in this paper adopts and extends the first concepts of making relationships between city objects and the terrain explicit. We intend to provide a model which allows specifying the relationship between adjoining objects of arbitrary theme in different datasets.

DEVELOPMENT OF THE GEODATA JOIN
The goal of the approach presented in this paper is to overcome the lack of information between fragmented geodata by introducing a so-called Geodata Join model which interfaces objects in different CityGML datasets of arbitrary themes and different origin, scale, dimension, and quality. The Geodata Join makes it possible to attach additional information to objects, describing the linkage to another object as well as further object-related information including accuracy values and inner geometric conditions. During the creation of geodatasets, e.g. during the extraction and modelling of buildings, this information is often known or can be derived from the base data; however, it is not yet explicitly considered. This is mainly due to the separate thematic acquisition and modelling of spatial data and, moreover, due to the fact that semantic data models do not provide the possibility to store spatial linkage information between objects of different datasets in a unified way.
In preparation for modelling the Geodata Join, some considerations have to be made. CityGML defines classes for the thematic classification of city objects and determines their geometric modelling. However, CityGML provides some flexibility, which leads to a variety of possible semantic and geometrical representations of the same object in different datasets, depending on the data provider, the geometric base data, the semantic interpretation, and the desired level of detail. Hence, the Geodata Join has to reflect a certain degree of abstraction in order to be applicable to all thematic classes in CityGML and to allow for the different geometric and semantic modelling of the same object type in different datasets. The Geodata Join shall be usable for each city object within a dataset, e.g. buildings, roads, and city furniture, describing the relationship to one or more adjoining objects in other datasets.
The Geodata Join comprises information about the linkage of an object to another object within another dataset and further object-related parameters, as shown in figure 1. The linkage information covers three aspects: 1) the part of an object which holds a topological or geometrical relation to the other object, 2) the object type of the connected object, and 3) the kind of relationship between both objects. The object-related information includes object accuracy values and geometric conditions.
Figure 1. Geodata Join information shown by way of example for a building

Linkage Information
The first piece of linkage information is referred to as a Connector. It describes the part of an object which has a topological or geometrical relation to another object. Connectors can be described by points, lines, and polygons of the existing object geometry or by introducing specific connector geometry. The advantage of using the existing object geometry is that no additional geometry is needed. However, this approach means that in some cases the actual connection geometry between two objects may not be explicitly described by the object geometry, as shown in figure 2b. The example in figure 2 shows that the connection between two objects can be described by connectors of different dimensions depending on the object geometry, e.g. line-line, face-line, or point-line. If a building is modelled in such a way that the lower edge of the wall surface represents the intersection with the terrain, this part can be used to describe the connector curve geometrically, as shown in figure 2a. If a building is modelled as extending into the ground, e.g. including the cellar, the connection curve with the footpath lies within the wall surface, so that the entire surface represents the connector, as shown in figure 2b.
In order to facilitate a semantically correct connection of objects, the linkage information includes the specification of the connection object type based on the semantic model of CityGML. The connection object is specified by the top-level class of the corresponding thematic module in CityGML. In the example in figure 2, the Geodata Join of the building model would specify the connection object road by the CityGML type TransportationObject. The relationship between objects is defined by the Dimensionally Extended 9-Intersection Model (DE-9IM) [Clementini and Di Felice, 1996] and, in case of a gap between objects, by geometrical values quantifying the distance.
The provision of linkage information for the integration process allows an explicit matching of neighbouring objects which have a topological or geometrical relationship. However, the Geodata Join does not explicitly model the matching between two objects, e.g. by using an identifier. The advantage is that the dataset which is to be enriched by the Geodata Join is independent of other datasets and includes general linkage information which is based on the semantic class of the object to be integrated. Moreover, the Geodata Join only needs to be available on one side. Although the actual matching has to be established during the integration, the process will be simplified and matching ambiguity will be reduced.
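As a minimal sketch of how such a relationship could be checked during matching, the following assumes that the relation is stored as a nine-character DE-9IM string in the usual order (interior, boundary, exterior of the first object against those of the second). This encoding, the function name, and the sample matrices are illustrative assumptions and are not taken from the Geodata Join schema.

```python
def de9im_matches(matrix: str, pattern: str) -> bool:
    """Check a 9-character DE-9IM matrix against a pattern.

    Matrix cells: 'F' (empty) or '0', '1', '2' (dimension of the intersection).
    Pattern cells: 'T' (any non-empty), 'F', '*' (don't care), or '0', '1', '2'.
    """
    if len(matrix) != 9 or len(pattern) != 9:
        raise ValueError("DE-9IM strings must have exactly 9 characters")
    for cell, want in zip(matrix.upper(), pattern.upper()):
        if want == "*":
            continue
        if want == "T":
            if cell not in "012":
                return False
        elif cell != want:
            return False
    return True


# One of the standard 'touches' patterns: interiors disjoint, boundaries meet.
TOUCHES = "F***T****"
DISJOINT = "FF*FF****"

# Hypothetical matrices for two areal footprints:
shared_edge = "FF2F11212"   # building and footpath share a boundary line
with_gap = "FF2FF1212"      # a gap (e.g. a gravel strip) separates them

print(de9im_matches(shared_edge, TOUCHES))   # True
print(de9im_matches(with_gap, TOUCHES))      # False
print(de9im_matches(with_gap, DISJOINT))     # True
```

In the gap case the topological relation alone is not sufficient, which is why the join additionally carries a distance value.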

Object-Related Information
Additionally, accuracy values and geometric conditions of an object are defined by the Geodata Join to enhance the quality of the integration result. The object accuracy can be quantified separately by the absolute horizontal, the absolute vertical, and the inner object accuracy. This is due to the fact that the accuracy values of the different dimensions are often very heterogeneous. Geometric conditions can be defined by restricting the movement of the entire model in horizontal and vertical direction and by restricting the deformation in horizontal and vertical direction. If a deformation is allowed, the angles and distance ratios can be preserved and the introduction of new line break points can be allowed.

Geodata Model
The Geodata Join was modelled and implemented as a CityGML Application Domain Extension (ADE) [Kaden, 2009].

CONCEPT OF INTEGRATION USING GEODATA JOINS
Within this chapter, the basic concept for the integration of geodata using the Geodata Join is presented. Inconsistencies like gaps and intersections between objects which adjoin in the real world are to be automatically detected and removed. Integrated models shall represent, ad hoc, a consistent mapping of the real world with the highest possible accuracy w.r.t. the accuracies of the input models.

Semantically Based Matching
A prerequisite for the geometric harmonization is the matching of adjoining geometry parts of neighbouring objects within different datasets. Through the linking of these entities over the entire model, inconsistencies between integrated datasets can be determined and removed. Based on the linkage information in the Geodata Join, a matching algorithm determines the connection between the respective objects, as sketched in figure 4. In a first step, the matching algorithm interprets the Geodata Join within the building model on the left side and determines whether there is a connection to another object, of which type the connected object is, and which relationship exists between both objects. In a second step, the correct connection object in the city model on the right side is determined from all objects of the corresponding class by statistical analysis of the object geometries. The matching algorithm is a dynamic part of the integration algorithm and defines the matching rules, which can be adapted to the data or user specifications, e.g. the quality and the semantic and geometric modelling characteristics of the input datasets. After the correct matches have been identified, the inconsistencies are determined in the form of inconsistency vectors between the object geometries, as shown in figure 5. Inconsistency vectors are calculated between object points and their perpendicular foot points on the connected geometry of the other object.
Figure 5. Examples for the geometric inconsistency and inconsistency vectors between (a) a building and a footpath and (b) two buildings
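The computation of a single inconsistency vector can be sketched as follows. This is a simplified 2D illustration under the assumption that the connected geometry is a straight line segment; the function names and coordinates are hypothetical, and a real implementation would work on 3D CityGML geometry.

```python
from typing import Tuple

Point = Tuple[float, float]


def foot_of_perpendicular(p: Point, a: Point, b: Point) -> Point:
    """Closest point to p on the segment a-b.

    The parameter t is clamped to [0, 1], so beyond the segment ends the
    result is the nearest end point rather than a true perpendicular foot.
    """
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return a  # degenerate segment: a and b coincide
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def inconsistency_vector(p: Point, a: Point, b: Point) -> Point:
    """Vector from object point p to its foot point on the segment a-b."""
    fx, fy = foot_of_perpendicular(p, a, b)
    return (fx - p[0], fy - p[1])


# Hypothetical building corner lying 0.3 m off a footpath edge along the x-axis.
corner = (2.0, 0.3)
edge_start, edge_end = (0.0, 0.0), (10.0, 0.0)
print(inconsistency_vector(corner, edge_start, edge_end))
```

For this sample the vector points straight down onto the footpath edge with a length of 0.3 m.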

Elimination of Systematic Inconsistencies
Inconsistencies between objects of two different spatial datasets contain a global systematic, a local systematic, and a random component [Kampshoff and Benning, 2005]. By interpreting the inconsistency vectors over all linked objects, the systematic components of the model inconsistency can be determined. If a systematic component can be identified, the direction and the magnitude of the vector of the systematic component between both models are to be determined, as shown in figure 6a. The difficulty is that inconsistency vectors do not link identical points (like tie points in the photogrammetric bundle block adjustment) but coincident lines or surfaces, e.g. between a building and a footpath. This means that the magnitude of the systematic component for each single inconsistency vector depends on its orientation w.r.t. the direction of the systematic component vector of the model inconsistency. The position of the common target system is to be calculated by weighting the starting positions according to the accuracies of the input datasets, e.g. as given in the Geodata Join.
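One possible way to handle this orientation dependence is sketched below. It is not the authors' algorithm: it assumes a purely translational systematic component in 2D, equal weights, and that each inconsistency vector only observes the projection of the unknown shift onto its own direction. The shift is then obtained from a small least-squares problem; the function name and sample values are hypothetical.

```python
import math
from typing import Iterable, Tuple

Vec = Tuple[float, float]


def estimate_systematic_shift(vectors: Iterable[Vec]) -> Vec:
    """Least-squares translation t such that n_i . t = |v_i| for all v_i,

    where n_i is the unit direction of the inconsistency vector v_i.
    Normal equations: (sum n_i n_i^T) t = sum v_i.
    """
    n11 = n12 = n22 = 0.0
    r1 = r2 = 0.0
    for vx, vy in vectors:
        length = math.hypot(vx, vy)
        if length == 0.0:
            continue  # no direction available; skipped in this sketch
        nx, ny = vx / length, vy / length
        n11 += nx * nx
        n12 += nx * ny
        n22 += ny * ny
        r1 += vx
        r2 += vy
    det = n11 * n22 - n12 * n12
    if abs(det) < 1e-12:
        raise ValueError("all vectors are parallel: shift not fully determined")
    return ((n22 * r1 - n12 * r2) / det, (n11 * r2 - n12 * r1) / det)


# Three vectors observed against differently oriented edges (made-up values,
# constructed from a shift of 0.3 m east and 0.4 m south).
observed = [(0.3, 0.0), (0.0, -0.4), (-0.084, -0.112)]
shift = estimate_systematic_shift(observed)
print(shift, math.hypot(*shift))
```

The error raised for parallel vectors reflects the point made above: if all linked edges run in the same direction, only the shift component across them can be observed.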

Elimination of Randomly Distributed Inconsistencies
As a result of an over-determined transformation, residuals typically remain between the linked objects, which are due to the randomly distributed measurement errors of the objects. The randomly distributed components of the object inconsistencies are to be eliminated by a geodetic homogenization process including least squares adjustment calculations. A functional model is to be applied to all linked objects of the integrated datasets, rather than a separate adjustment between two connected objects, since an object can be linked to multiple objects and the connected object may have a connection to a third object.
Although the residuals are primarily related to that part of an object which has been linked, the homogenization process has to consider the complete object and even those objects with no topological connection to another object. Changing only the linked geometry parts of objects to eliminate the inconsistencies affects both the inner object geometry and the distance-dependent correlation of objects within a dataset, which follows the so-called principle of neighbourhood. Since the neighbourhood accuracy (relative accuracy) is in general considerably higher than the absolute positional accuracy of two points, the distance-dependent correlation has to be considered within the adjustment. The same is true for the inner object geometry, which is often subject to object-geometric conditions, e.g. rectangularity, parallelism, and distance ratios, and which has to be introduced into the adjustment calculations by corresponding equations.
Both the distance-dependent correlation and the object-geometric conditions are to be defined based on additional object-related information, e.g. as given in the Geodata Join. Connected objects will be moved and deformed in order to minimize all residuals between all connected objects, subject to the given constraints.

Quality Measures of the Integration Result
Based on the harmonization process and the involved functional algorithms, quality measures and accuracy values are to be determined. The goal is to facilitate the comparability of integrated models, e.g. virtual 3D city models, regarding their model quality and accuracy. Quality measures of the integration are on the one hand the degree of the achieved geometric-topological consistency between the linked objects and on the other hand the degree of preserved geometric-topological relations of objects within the input datasets. In addition, accuracy values for the objects of the integrated model are to be determined using the laws of error propagation and the accuracy values of the input models. Information about the quality of virtual 3D city models will be useful for many applications, especially for analyses and simulations. Moreover, these values can be used for further integrations, so that uncontrolled error propagation will be avoided.
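For a linear (or linearised) integration step, the law of error propagation takes the standard form below. The symbols are introduced here for illustration only: x denotes input coordinates with covariance matrix Σ_xx, F the matrix of the transformation, and σ_A, σ_B the standard deviations of two uncorrelated input positions x_A, x_B that are merged by inverse-variance weighting.

```latex
\[
\mathbf{y} = \mathbf{F}\,\mathbf{x}
\quad\Longrightarrow\quad
\Sigma_{yy} = \mathbf{F}\,\Sigma_{xx}\,\mathbf{F}^{T}
\]
\[
\hat{x} = \frac{\sigma_B^{2}\,x_A + \sigma_A^{2}\,x_B}{\sigma_A^{2} + \sigma_B^{2}},
\qquad
\sigma_{\hat{x}}^{2} = \frac{\sigma_A^{2}\,\sigma_B^{2}}{\sigma_A^{2} + \sigma_B^{2}}
\]
```

The second line is the simplest special case: the merged position is never less accurate than the better of the two inputs, which is the kind of value that could be passed on to later integrations.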

SUMMARY AND OUTLOOK
Based on the semantic model of CityGML, a concept was introduced with the aim of supporting the ad-hoc integration of semantic 3D city models with various thematic contents and origins in order to generate consistent overall models. In this paper, the problem of integrating complex 3D models of the same representation (here: CityGML) with respect to homogenization and consistent geometric-topological linkage has been discussed. An essential part of the concept is the development and implementation of the new object-related Geodata Join, which describes the topological or geometrical relationship between an object of a dataset and another object in another dataset, e.g. a building connected to a footpath. Based on the join information, a matching process can identify connected objects and determine the inconsistency vectors. The inconsistency vectors can be analysed, and appropriate homogenisation algorithms can be applied to separately eliminate the systematic component and the randomly distributed inconsistencies. The Geodata Join includes not only the linkage information but also additional object-related information, i.e. spatial accuracy values and geometric conditions of the object. Considering the spatial accuracy of the input models facilitates a weighted transformation in order to eliminate the systematic components of the inconsistency and allows setting up a stochastic model for the optimal adjustment of randomly distributed inconsistencies.
The geometric conditions include movement and deformation restrictions of an object, which facilitate a conditional adjustment during the integration. The additional join information about the object accuracy and the geometric conditions allows the integration result to be optimized with regard to geometric-topological consistency and the best possible accuracy. Finally, quality measures will be derived which represent the topological and geometrical quality of the integrated model.
The development of the Geodata Join and the integration algorithm is an ongoing process. Further work will include the generation of test datasets enriched with Geodata Joins and the implementation of the matching and integration algorithms. Practical tests will provide useful insights to further develop and optimize the join information and the Geodata Join model. The amount of additional join information is to be kept as low as possible so that an automated annotation of the models can be done at a reasonable extra cost. The case of contradictory or wrong Geodata Join information also has to be investigated. One challenge is to develop a common evaluation function which takes into account the geometrical residuals on the one hand and the achieved and preserved topological relations on the other hand.
Although the introduction of the Geodata Join initially requires some extra effort from data providers, the (manual) effort during the data integration will be significantly reduced. The aim is to significantly reduce the time, and thus the cost, for the preparation and integration of geodata by slightly enriching the input datasets with information about connectivity and additional object accuracy and geometric conditions. The Geodata Join can be seen as a powerful generalization of the concept of tie points in photogrammetric data integration, i.e. the "Passpoint 2.0".
Figure 2. Connection of a building with a road object represented by (a) a line connector and (b) a surface connector at the building

Figure 3 shows the UML diagram of the modelled Geodata Join. The class Interface is modelled as an additional complex structured property. Since Interface is modelled as an (optional) component of the CityGML class _CityObject, all concrete subclasses like Building, CityFurniture, and Road inherit the new Geodata Join properties. The class Interface aggregates the five component classes _Connector, TopologicalGeometricalRelation, ConnectionObject, ObjectAccuracy, and _ObjectIntegrationBehavior, which include the five groups of join information. The connector geometry is represented within the abstract class _Connector. It includes the child classes PointConnector, CurveConnector, and SurfaceConnector, which are geometrically described by GML3 geometry types. The component class ConnectionObject specifies the type of object which is connected. The CityGML object class is specified by the attribute classType. The class TopologicalGeometricalRelation includes attributes to describe the topological or geometrical relationship between two connectors using the Clementini matrix [Clementini and Di Felice, 1996] and geometrical values to quantify any distances between two objects. The class ObjectAccuracy includes the attributes innerAccuracy, absoluteHorizontalAccuracy, and absoluteVerticalAccuracy.
For a unique indication of the accuracy, the values are classified within the ExternalCodeList. The AccuracyClassificationType list includes the definition of the accuracy levels according to the German ALKIS data model. The component class ObjectIntegrationBehavior defines the restrictions on the manipulation of the object geometry. The Movement type restricts the movement of the entire object by the attributes positionFix and heightFix. With the value "true", the horizontal position and the elevation of an object can be fixed separately. The Deformation type restricts the deformation of the inner object geometry by the attributes horizontalFix and verticalFix. If the deformation is allowed, the angles and distance ratios can be preserved by the attributes preserveAngle and preserveRatio. The attribute breakPoints allows the deformation of the object by the introduction of new break points into the object geometry. The concept and the model of the Geodata Join are explained and demonstrated in detail within the master thesis of the first author.
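To make the structure described above easier to follow, the classes can be mirrored in a few Python dataclasses. This is only an illustrative in-memory sketch, not the ADE schema itself: attribute types, default values, the snake_case names, the sample coordinates, and the accuracy code strings are all assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Coord = Tuple[float, float, float]


@dataclass
class Connector:
    """Stand-in for _Connector and its Point/Curve/Surface child classes."""
    kind: str                 # "point", "curve" or "surface"
    coordinates: List[Coord]


@dataclass
class TopologicalGeometricalRelation:
    de9im: str                        # assumed 9-character matrix string
    distance: Optional[float] = None  # gap width in metres, if any


@dataclass
class ConnectionObject:
    class_type: str  # CityGML class of the connected object (classType)


@dataclass
class ObjectAccuracy:
    # Values would come from a code list; the strings used below are made up.
    inner_accuracy: str
    absolute_horizontal_accuracy: str
    absolute_vertical_accuracy: str


@dataclass
class Movement:
    position_fix: bool = False
    height_fix: bool = False


@dataclass
class Deformation:
    horizontal_fix: bool = False
    vertical_fix: bool = False
    preserve_angle: bool = True
    preserve_ratio: bool = True
    break_points: bool = False


@dataclass
class ObjectIntegrationBehavior:
    movement: Movement = field(default_factory=Movement)
    deformation: Deformation = field(default_factory=Deformation)


@dataclass
class Interface:
    """The Geodata Join attached to one city object."""
    connector: Connector
    relation: TopologicalGeometricalRelation
    connection_object: ConnectionObject
    accuracy: ObjectAccuracy
    behavior: ObjectIntegrationBehavior = field(
        default_factory=ObjectIntegrationBehavior)


# Hypothetical join for a building whose lower wall edge touches a road object.
building_join = Interface(
    connector=Connector("curve", [(0.0, 0.0, 52.1), (12.0, 0.0, 52.3)]),
    relation=TopologicalGeometricalRelation(de9im="FF2F11212"),
    connection_object=ConnectionObject("TransportationObject"),
    accuracy=ObjectAccuracy("level-1", "level-2", "level-3"),
    behavior=ObjectIntegrationBehavior(movement=Movement(height_fix=True)),
)
print(building_join.connection_object.class_type)
```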

Figure 4. Schematic representation of the matching algorithm including the two steps

Figure 6. (a) Inconsistency vectors between the buildings and the path and the derived vector of the systematic component of the model inconsistency with bearing of 180° and magnitude "m"; (b) weighted transformation of the building model and the traffic area model towards each other

The systematic component of the model inconsistency is to be eliminated by applying a similarity transformation to the entire input datasets, as shown in figure 6b. Both datasets are transformed along the determined systematic component vector and moved towards each other into a common target system.
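How the two datasets might be moved towards each other can be sketched for the simplest case, a pure translation. The following assumes that the systematic shift points from dataset A to dataset B and that one standard deviation per dataset is available (e.g. from the accuracy values in the Geodata Join); the shares are derived from inverse-variance weighting. The function name and numbers are hypothetical, and rotation and scale of a full similarity transformation are deliberately left out.

```python
from typing import Tuple

Vec = Tuple[float, float]


def split_shift(shift: Vec, sigma_a: float, sigma_b: float) -> Tuple[Vec, Vec]:
    """Split a systematic shift (from A to B) into one move per dataset.

    The less accurate dataset takes the larger share, so that both end up
    at the inverse-variance weighted mean of their starting positions.
    """
    var_a, var_b = sigma_a ** 2, sigma_b ** 2
    if var_a + var_b == 0.0:
        raise ValueError("at least one standard deviation must be non-zero")
    share_a = var_a / (var_a + var_b)
    move_a = (share_a * shift[0], share_a * shift[1])
    move_b = (-(1.0 - share_a) * shift[0], -(1.0 - share_a) * shift[1])
    return move_a, move_b


# Building model (sigma 0.1 m) lies 0.4 m north of the traffic area model
# (sigma 0.3 m): the systematic vector points south, as in figure 6a.
move_buildings, move_traffic = split_shift((0.0, -0.4), 0.1, 0.3)
print(move_buildings, move_traffic)
```

With these sample values the building model moves by about 0.04 m and the traffic area model by about 0.36 m in the opposite direction, which together close the 0.4 m offset.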