FUSION OF DIRECT GEOREFERENCED AERIAL DRONE WITH TERRESTRIAL LASER SCANNER DATA: THE CASE OF THE ROMAN BATHS OF AMATHUS, CYPRUS

Abstract: The fusion of geomatic techniques with different accuracy and resolution has been applied in recent years to the geometric documentation of several archaeological sites. In this research article, the case of the Roman Baths of Amathus in Limassol, Cyprus is investigated. Aerial photogrammetry and Terrestrial Laser Scanning (TLS) were used to produce the final product. The photogrammetric data were based on Post-Processed Kinematic (PPK) imagery, while the TLS georeferencing was based on targets. The main research objective of this paper is to examine the reliability of PPK photogrammetric data without the use of Ground Control Points (GCPs) and how well they can be integrated with TLS data. In addition, the two acquisition techniques were compared, and the comparison indicated that aerial images with 0.005 m Ground Sampling Distance (GSD) and scans with 0.0061 m point spacing at 10 m range can be fused with good quality. The paper presents the entire methodology up to the generation of the final 3D photorealistic product.


INTRODUCTION
Cultural Heritage (CH) is precious and irreplaceable, a fact that charges modern civilization with the responsibility of preserving and safeguarding it. The conservation, restoration, monitoring and protection of CH sites should be a common goal for all of us. In recent years, several geospatial techniques and applications have been developed that can help monitor such important CH sites. In the last decade, the creation of Three-Dimensional (3D) models of the real world has been a very active topic in digital photogrammetry and Structure from Motion (SfM) technologies, as well as in Light Detection And Ranging (LiDAR) applications. This development in the field of geomatics formed the basis for planning protection, conservation and restoration works at CH sites, which is very important in the field of archaeology. In CH, data fusion, or integration, is essentially the process of merging 3D data from sensors, possibly with different resolutions, to reconstruct a real object in a way that is complete, consistent, accurate and useful for further study. The fusion is performed to exploit the advantages of each method and to minimize the weaknesses of each (Ramos, M. M., & Remondino, F., 2015). For example, it can be done by fusing Point Clouds (PCs) derived from images taken by an Unmanned Aerial Vehicle (UAV) with PCs from TLS. Essentially, data fusion aims to produce a final deliverable that is better than the data provided by each method separately. The present application takes place in a part of the archaeological site of Amathus, located on the seafront of Limassol, Cyprus. The current paper deals with the fusion of data from a Real Time Kinematic (RTK) capable UAV and TLS. RTK technology in UAVs is relatively new, and the goal is to see how reliable its data are when integrated with data from other sources. The paper also describes and analyzes the actions followed in the various stages in order to produce the final product, which is a 3D photorealistic model of the area under study.

* dimitrios.skarlatos@cut.ac.cy; phone +357 25002360

Case study
The archaeological site of Amathus is a UNESCO world heritage site. It is located 10 km east of Limassol center and it is one of the most important historical sites in Cyprus. The case study concerns the Roman Baths and the Doric order columns (Figure 1) located beyond the Roman market (Agora), with a total area of ~625 m² (~25 × 25 m). The small complex of Roman Baths is assumed to have been built in the second century AD. The structure contains the basic elements of a typical Roman bath and forms a square structure with cold rooms to the left and hot rooms to the right, heated by an underfloor hypocaust system (Aupert, 1998). The Baths consisted of the following spaces:
Apodyterium (A): The entrance to the Baths area. It also served as a dressing room and was furnished with benches.
Frigidarium (F): This area featured a well and two cold pools. Guests would enjoy these pools of cool water before proceeding to the warmer baths in the other rooms.
Sudatorium (T): According to archaeologists, this room may have served as a place for light sweating and as a transition from the cold to the hot bath.
Caldarium (C): The hottest bath, which featured a hot water immersion pool.
Service area (S): A service area occupied by the furnace, where the fire was lit and from which the superheated air was forced into the hypocaust (Aupert, 1998).

Motivation
Given the affordability and usability of RTK UAVs in CH projects, the question arises of how accurate they are and whether their photos can be used for Direct Georeferencing (DG). Accuracy requirements in CH projects are more demanding than in mapping projects, where the RTK/PPK UAV solution was proved to be equivalent to the traditional GCP approach (Tomastik, J., et al., 2019). At the same time, Stroner, M., et al. (2021) suggested that the use of oblique photos significantly improves the RTK/PPK solution; hence the oblique approach, which in any case favors 3D modelling in CH, could also improve DG. This is important when UAV data are to be combined with data from other sources, such as TLS data, which are georeferenced independently, i.e., with control points in the state Geodetic Reference System (GRS). The latter becomes more relevant when PCs from independent sources must be merged to fill in missing areas. The same holds if photos from an RTK UAV must be used to texture a mesh model generated from TLS. The co-registration of data from different sources must be of very high accuracy before performing any of the above tasks. In this paper the authors investigate: a) the accuracy of DG of PPK UAV photos, b) whether this accuracy is sufficient to co-register PCs from TLS with the aerial photos directly, and c) how the TLS mesh compares with the photogrammetric mesh.

RELATED WORK
Fusion of photogrammetry and LiDAR data is a technique that has been used frequently in recent years in CH applications. In this section, some relevant research articles from the recent past are presented. Luhmann, T., Chizhova, M., & Gorkovchuk, D. (2020) attempted to represent some churches in Tbilisi, Georgia as 3D models. Measurements were made with two TLSs and with aerial and terrestrial photogrammetry. Problems were observed at the edges of the church dome, and additional ground images were taken to improve the deliverable. In general, the results of the methods were similar, but the most complete and highest-quality 3D model was achieved by combining the TLS and photogrammetry data. Naanouh, Y., & Stanislava, V. (2021) generated a 3D model and topographic map of Beaufort castle, Lebanon, using TLS and UAV photogrammetry. The total deviation between the two technologies was small enough to generate convergent data. Aerial photogrammetry had the potential to improve the 3D model on the upper parts of buildings that were difficult to scan by laser, thereby increasing the accuracy of the overall topography as well as the shape of individual buildings. Finally, the best version of the model was obtained by fusing both techniques. Ramos, M. M., & Remondino, F. (2015) analyzed previous applications of data fusion (image-based and range-based) in CH, in order to gain insight into actual data fusion methods, and clarified some open research issues. They described the concept and levels of fusion as well as some fusion approaches that have been presented to the research community so far. They explained the pros and cons of each method and noted that no single method gives the best results on its own. They concluded that data fusion can provide more complete final deliverables. Girelli, V. A., et al. (2017) carried out a series of CH applications in San Leo, Italy. TLS and SfM techniques were applied individually and in combination. The purpose was to produce models for geological studies. The results showed that the obtained data were useful in all cases for subsequent analysis and processing. Jo, Y. H., & Kim, J. Y. (2017) focused on the 3D digital documentation of Magoksa Temple in the Republic of Korea. They applied UAV photogrammetry and TLS data fusion. The two data sets were merged using tie points, and GCPs were used for georeferencing in an absolute system. They explained that photogrammetry gives better results in the horizontal direction, while LiDAR gives better results in the vertical. Finally, they integrated the data from the two methods and produced a complete model with high accuracy. Del Pozo, S., et al. (2019) proposed a methodology for fusing TLS and close-range photogrammetry to reconstruct the Cueva Pintada archaeological site in Gran Canaria, Spain, for conservation, restoration, monitoring and protection purposes. The outcome derived from the fused data of the two techniques was an accurate and complete 3D model. Notably, they also produced a virtual tour with metric properties covering the entire archaeological site.

Equipment
The goal of the 3D reconstruction was to achieve a complete, accurate and properly textured 3D model of the Roman Baths, which could be used both for geometric documentation and for virtual reality tours. For this reason, high accuracy and high resolution technologies had to be used. The main data acquisition technologies were therefore TLS and RTK UAV. In addition, a reflectorless Total Station (TS) LEICA TCR1203+ (Figure 2c) and a LEICA Viva GS15 GNSS receiver (Figure 2d) were used for georeferencing to the state coordinate system and for measuring TLS control points and photogrammetry check points.

FARO Focus M70: The FARO M70 (Figure 2a) can perform fast, direct and highly accurate 3D measurements. It combines high-quality scanning technology with versatility. The basic FARO M70 technical specifications are as follows: range 0.60-70 m; angular step size 0.009° (hor/ver); measurement speed up to 488,000 pts/sec; minimum ranging error ±0.0015 m at 10 m; integrated GNSS (GPS and GLONASS); field of view 360° (hor) / 300° (ver). It also has HDR photography, a compass and an altimeter. The FARO SCENE software was used for the post-processing of the LiDAR data.

AUTEL Evo II Pro Enterprise: The EVO II Pro (Figure 2b) is a multi-purpose quadcopter. It has a 1-inch CMOS sensor with 0.01057 m real focal length (0.029 m equivalent), 82° field of view, adjustable aperture (f/2.8-f/11) and electronic rolling shutter. The sensor offers a resolution of 20 MP (5472 x 3648 pixels) with a 2.4 micron pixel size. It has automatic flight functions, 6-way obstacle avoidance stereo sensors and an RTK module that allows pictures to be positioned with high accuracy. The Agisoft Metashape software was used for the post-processing of the image data.

Acquisition Methodology
The basic workflow followed to produce the final deliverable is presented in Figure 3. The laser scans were merged using the Iterative Closest Point (ICP) algorithm and then georeferenced using pre-signalized targets, automatically recognised by the software. The targets were measured with a reflectorless TS from a closed traverse, whose points were measured with RTK GNSS. Given that the accuracy of RTK GNSS is suboptimal for CH projects, the rigid closed traverse was fitted to the RTK point measurements with a three-parameter (translation and rotation) Least Squares adjustment.
The projection centers of the aerial photos were processed with PPK and then aligned using the SfM process. The result is produced without any GCPs, and it is considered as 'DG' to the Local Transverse Mercator 1993 projection (LTM 93). Natural check points were measured in the field and observed in the photogrammetric software to provide an accuracy assessment. These check points were measured with the same methodology as the TLS targets.

Initial field planning:
Given the large data volumes that LiDAR and photogrammetry produce, an initial field plan was necessary. The goal was to deliver just enough data to meet the requirements of the application. Lighting conditions are a very important factor that affects the quality of photogrammetric data. The RTK flight was chosen to take place first, during early morning hours when there is no intense solar radiation, and under cloudy conditions, to ensure a homogeneous photorealistic result. Since fusion is a hybrid technique that links data referred to the same reference system, a reference network had to be established that would connect the two techniques to the Cyprus Geodetic Reference System 1993 (CGRS 93) on LTM 93. For this reason, two reference networks were established with GNSS and TS measurements, and specific LiDAR targets were subsequently measured with the TS. At the same time, physical targets were measured to check the photogrammetric data set. The last phase was the LiDAR scanning. Despite the small size of the study area, several scan stations were established to provide a detailed and complete product.

Photogrammetric data acquisition:
Since the photogrammetric data would have to be integrated with the LiDAR data at the post-processing stage, a flight plan with high resolution images and good block geometry had to be designed to ensure compliance with the specifications. Based on the concept of the flight plan and given the high level of detail of the archaeological site, a low flight with vertical (nadir direction) and oblique (45° angle with respect to the horizontal level) images, with large overlaps and a crossed pattern, was chosen to ensure a high number of matching points between images. The total number of photos defined in the initial field plan could not be acquired in a single flight because of the need to change batteries. In total, six flights took place to ensure usable data before leaving the site. Data from only two flights (one flight under shade and one under low sunlight) were processed further. The total number of obtained images and the basic parameters of the two flights are presented in Table 1.
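The relation between flying height and image resolution that drives such a flight plan can be sketched as follows. This is a minimal illustration, assuming a simple pinhole camera looking straight down over flat terrain; the function names are hypothetical, and the input values are the sensor specifications and the 20 m average flying height reported in this paper.

```python
def ground_sampling_distance(flying_height_m, pixel_size_m, focal_length_m):
    """Nadir GSD of a pinhole camera over flat terrain: H * p / f."""
    return flying_height_m * pixel_size_m / focal_length_m


def nadir_footprint(gsd_m, width_px, height_px):
    """Ground footprint (width, height) in metres of one nadir image."""
    return gsd_m * width_px, gsd_m * height_px


# Values reported in the paper: 20 m flying height, 2.4 micron pixels,
# 0.01057 m real focal length, 5472 x 3648 pixel sensor.
gsd = ground_sampling_distance(20.0, 2.4e-6, 0.01057)
footprint = nadir_footprint(gsd, 5472, 3648)

print(f"GSD: {gsd:.4f} m")
print(f"Nadir footprint: {footprint[0]:.1f} m x {footprint[1]:.1f} m")
```

Under these assumptions the computed GSD rounds to the 0.005 m listed in Table 1. Oblique images have a variable GSD across the frame, so the formula applies only to the nadir strips.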

Reference network establishment and target acquisition:
Before the laser scans began, two separate reference networks based on the same three stations were established (triangle, Figure 4a). The first network was based on RTK observations on LTM 93 using a GNSS receiver, and the second (local, independent of LTM 93) on raw measurements of angles and distances between the three stations using the TS. All angles and distances were measured at least three times each. Subsequently, raw TS measurements were acquired on specific checkerboard LiDAR targets (Figure 4b), as well as on physical targets (Figure 4c) with a size 2-3 times larger than the average image GSD. The checkerboard targets were measured without a prism, while the physical targets were measured with or without a prism. The targets were placed at both high and low spots with good geometric distribution. In total, raw measurements were acquired on 11 checkerboard and 26 physical targets.

TLS data acquisition:
Laser scans were acquired at short distances between stations (~4-7 m), aiming for sufficient overlap and good visibility of the checkerboard targets. In some hidden spots, the TLS stations were placed closer to each other to obtain the desired detail. The scanner was placed on a special tripod and, depending on the area to be scanned, adjusted to the appropriate height. At the end of each scan, the scanner's built-in camera took RGB images to colorize the data. The scans were acquired based on the following parameters (Table 2).

TLS data acquisition
Total scans                                    39 (×3 passes)
Average point distance                         0.0061 m at 10 m
Total points per second                        244,000
Total points per scan                          43,700,000
Time per scan (images acquisition included)    4:30 mins
Table 2. TLS data acquisition.
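The figures in Table 2 can be turned into a rough field-time budget, as in the short sketch below. It is only an arithmetic illustration using the values listed in the table; the variable names are hypothetical, and the "×3 passes" note is not modelled because the paper does not explain how it affects the totals.

```python
# Values taken from Table 2.
points_per_scan = 43_700_000
points_per_second = 244_000
time_per_scan_s = 4 * 60 + 30      # 4:30 mins, image acquisition included
total_scans = 39

# Time spent on range measurement alone, and what remains for imagery etc.
measuring_time_s = points_per_scan / points_per_second
overhead_s = time_per_scan_s - measuring_time_s

# Totals over all stations (excluding set-up and moving between stations).
total_scan_time_min = total_scans * time_per_scan_s / 60
total_points = total_scans * points_per_scan

print(f"Measuring time per scan: {measuring_time_s:.0f} s")
print(f"Remaining time per scan: {overhead_s:.0f} s")
print(f"Total scanning time: {total_scan_time_min:.1f} min")
print(f"Total raw points: {total_points:,}")
```

About three of the 4:30 minutes per scan are spent on range measurement, with the remainder presumably taken up mostly by the RGB image acquisition.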

Pre-processing
Reference network coordinates calculation:
As mentioned in the reference network establishment section, two separate networks were established: the first based on RTK GNSS observations on LTM 93 and the second (local) based on raw measurements (all possible angles and distances) obtained with a TS. The TS measurements were adjusted using Least Squares, with an RMSE of 0.002 m, which expresses the precision of the internal geometry of the triangle. Given the suboptimal accuracy of RTK GNSS for such applications, it was necessary to perform a three-parameter transformation (two translations and one rotation) between the two reference networks in order to fit the TS triangle to LTM 93. The RMSE of this horizontal Least Squares adjustment was 0.037 m, which is a measure of how accurately the triangle was fitted to LTM 93. The final adjusted LTM 93 coordinates of the three-station reference network, combined with the raw TS measurements, were used to compute the coordinates of the 11 checkerboard and 26 physical targets.
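A three-parameter fit of this kind can be sketched with the closed-form least-squares solution for a 2D rigid transformation. This is a minimal illustration and not the adjustment software used in the study: the function names and the station coordinates are hypothetical, and the RMSE here is computed per point from the horizontal residuals, which may differ from the definition behind the 0.037 m figure.

```python
import math


def fit_rigid_2d(local_pts, target_pts):
    """Least-squares rotation angle and two translations (no scale)
    mapping local_pts onto target_pts."""
    n = len(local_pts)
    cx = sum(p[0] for p in local_pts) / n
    cy = sum(p[1] for p in local_pts) / n
    ux = sum(q[0] for q in target_pts) / n
    uy = sum(q[1] for q in target_pts) / n
    s_cos = s_sin = 0.0
    for (x, y), (u, v) in zip(local_pts, target_pts):
        dx, dy, du, dv = x - cx, y - cy, u - ux, v - uy
        s_cos += dx * du + dy * dv
        s_sin += dx * dv - dy * du
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated local centroid onto the target centroid.
    return theta, ux - (c * cx - s * cy), uy - (s * cx + c * cy)


def apply_rigid_2d(pt, theta, tx, ty):
    c, s = math.cos(theta), math.sin(theta)
    return c * pt[0] - s * pt[1] + tx, s * pt[0] + c * pt[1] + ty


def horizontal_rmse(local_pts, target_pts, theta, tx, ty):
    """RMSE of the horizontal residual vectors after the fit."""
    sq = 0.0
    for p, q in zip(local_pts, target_pts):
        e, n = apply_rigid_2d(p, theta, tx, ty)
        sq += (e - q[0]) ** 2 + (n - q[1]) ** 2
    return math.sqrt(sq / len(local_pts))


# Hypothetical TS triangle (local system) and noisy RTK positions of the
# same three stations; these coordinates are NOT the ones from the survey.
ts_local = [(0.0, 0.0), (30.0, 0.0), (10.0, 25.0)]
exact = [apply_rigid_2d(p, 0.3, 1000.0, 2000.0) for p in ts_local]
rtk = [(exact[0][0] + 0.02, exact[0][1] - 0.01),
       (exact[1][0] - 0.03, exact[1][1] + 0.02),
       (exact[2][0] + 0.01, exact[2][1] - 0.02)]

theta, tx, ty = fit_rigid_2d(ts_local, rtk)
rmse = horizontal_rmse(ts_local, rtk, theta, tx, ty)
print(f"rotation {math.degrees(theta):.4f} deg, shift ({tx:.3f}, {ty:.3f}) m")
print(f"fit RMSE {rmse:.3f} m")
```

Because no scale parameter is estimated, the internal geometry measured with the TS is preserved, and the RTK observations only determine where the triangle sits and how it is oriented in the projected system.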

Additional corrections to the photogrammetric data set:
The projection centers and their coordinates were derived after PPK using a Virtual Reference Station (VRS) close to the archaeological site, generated from the Electricity Authority of Cyprus GNSS Network. The PPK corrections were needed to minimize the original error of the RTK positions of the images, which is necessary in CH applications. The average 3D position accuracy of the 708 projection centers after PPK adjustment was ~0.010 m.

Processing
Photogrammetric workflow:
Data from both flights were processed together during bundle adjustment and 3D reconstruction. For the final texturing, photos from the flight with low sunlight were deactivated to produce an aesthetically pleasing, virtual reality ready 3D model. High resolution was selected during all processing phases. For the initial alignment (relative orientation, Figure 5) of the 708-image block, 5,000 and 15,000 were selected as thresholds for tie points and key points respectively. After SfM and filtering of the original sparse PC, 630K points with 2.3M projections (on average about 4 projections for each 3D point) were used for bundle adjustment. The image reprojection error was below 0.5 pixels, ensuring a rigid relative orientation of the block. Camera self-calibration was performed with rolling shutter compensation on, and the remaining residuals were below 1/10 of a pixel, apart from the extreme corners of the frame (Figure 6). The 26 check points were observed in 6 photos each, but did not take part in the bundle adjustment. After bundle adjustment, the dense PC was created. The check point RMSE of ~0.040 m is similar to the reference network coordinate transformation error, also ~0.040 m. The camera locations were moved on average 0.0028 m horizontally and 0.0075 m vertically to align to each other during the DG. This also serves as an internal quality indicator of the PPK solution.
The residual bundle adjustment errors for the 708 image centers and the 26 check points are presented in Table 3. The 3D reconstruction continued by filtering out the lowest confidence points (values 0-2) from the dense PC. Afterwards the 3D mesh (Figure 7) was created. After the meshing of the PC, hidden points in holes or under column capitals, which were difficult to capture with the camera, had a low confidence level (Figure 8).

TLS workflow:
Initially, all scans were aligned to each other using ICP. The PC georeferencing was performed in LTM 93 using the checkerboard targets. The remaining mean georeferencing errors, shown in Table 4, are well below 0.010 m, indicative of a proper alignment when considering the internal accuracy of the scanner itself. The georeferenced PC was cleaned of unnecessary information and noise. It was evident that the necessary point density was not obtained on high spots such as the column capitals or the upper parts of walls, which did not have good visual contact with the scan stations (see arrows, Figure 9). The same happened in hidden spots behind walls or stones that also did not have good visual contact with the scan stations. Then, for the generation of the mesh model, the PC was divided into 4 sub-areas (~15 × 15 m) with little overlap, since a model with a sufficient number of triangles could not be generated for the entire area (Del Pozo et al., 2019) due to software restrictions. For the entire area (~30 × 30 m) the software was limited to generating only ~2.5M triangles, while each sub-model consisted of ~2.5-3M triangles. It is important to mention that the edges of the sub-models appeared to have deformations (Figure 10). After merging the 4 sub-models, a sufficient 3D mesh model was generated. In Figure 12, the final mesh model and the scan positions are presented. Deformations (swelling) were observed on the top of the column capitals (Figure 13), as it was difficult for the laser to get accurate measurements there. The mesh creation options allowed holes on the column capitals to be left open, but for the aesthetic purposes of the final photorealistic model it was preferred to close them.

Mesh-to-mesh comparison:
After cropping the two 3D models to the same area, a mesh-to-mesh distance calculation was applied for comparison. The photogrammetric and LiDAR models consisted of 2.3M and 5.7M triangles respectively, so the denser one was set as the reference model. The distribution of differences ranged from -0.070 to 0.040 m (Figure 14). The mean value of the differences was 0.022 m and the standard deviation was 0.030 m (RMSE 0.037 m). Greater differences (~0.050 to 0.100 m, Figure 14) were observed in areas where one or the other method (or both) took unreliable measurements, such as column capitals, holes or stones hidden behind others. Additionally, slight differences (~0.010 to 0.040 m, Figure 14) were noticed along the seamlines (point X, Figure 14) due to the edge deformations (Figure 10) of the TLS sub-models.
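The three statistics quoted above are linked: for signed distances, the squared RMSE equals the squared mean plus the squared (population) standard deviation, so a non-zero mean signals a systematic offset between the two models. The sketch below illustrates this; the function names are hypothetical and the sample distances are made up for the demonstration, not taken from the comparison.

```python
import math


def distance_stats(signed_distances):
    """Mean, population standard deviation and RMSE of signed distances."""
    n = len(signed_distances)
    mean = sum(signed_distances) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in signed_distances) / n)
    rmse = math.sqrt(sum(d * d for d in signed_distances) / n)
    return mean, std, rmse


def rmse_from_mean_std(mean, std):
    """RMSE implied by a mean (bias) and a standard deviation (spread)."""
    return math.hypot(mean, std)


# Consistency check with the values reported in the text.
reported = rmse_from_mean_std(0.022, 0.030)
print(f"RMSE implied by mean 0.022 m and std 0.030 m: {reported:.3f} m")

# Hypothetical signed distances (metres), for illustration only.
sample = [0.031, 0.018, -0.012, 0.045, 0.027, 0.009, 0.036]
mean, std, rmse = distance_stats(sample)
print(f"mean {mean:.3f} m, std {std:.3f} m, rmse {rmse:.3f} m")
```

The reported mean of 0.022 m and standard deviation of 0.030 m reproduce the stated RMSE of 0.037 m, which indicates that the mean offset contributes noticeably to the overall error at this stage.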

ICP registration and comparison:
The final step was data fusion, i.e., texturing the TLS mesh with the aerial photos.
To do so, data from both sources should be perfectly aligned, otherwise artifacts would be apparent at the edges. To ensure alignment, an ICP transformation was applied. The LiDAR model moved along the X, Y, Z axes by -0.024, 0.066 and -0.022 m respectively, closer to the photogrammetric model. Afterwards, mesh-to-mesh distances were recomputed with the following results.
The systematic error disappeared after applying the ICP transformation, as the differences follow a normal distribution with a range from -0.030 to 0.030 m (Figure 15). The mean value of the differences was 0.003 m and the standard deviation was 0.013 m (RMSE 0.013 m). The latter represents the residual error of photogrammetry relative to the TLS data and is equivalent to about 3 pixels. A correlation was noticed between Figures 8, 14 and 15.
In the narrow gap (point A), the accurate measurements obtained by photogrammetry (Figure 8) were confirmed, since the differences (Figures 14 and 15) at the bottom of the gap are below 0.005 m.
Figure 15. Mesh-to-mesh distance recomputation after ICP.
In the narrow gap (point A), where three TLS stations were needed to capture ground and wall points, photogrammetry succeeded in doing the same, with similar precision. The differences due to the TLS deformations on the edges of the sub-models appeared to be absorbed (differences below 0.010 m, Figure 15), since after the ICP transformation the two models achieved high alignment.
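The principle behind the ICP step in this section can be illustrated with a deliberately simplified, translation-only variant: pair every source point with its nearest target point, shift the source by the mean offset, and repeat until the shift stops changing. This is a sketch under stated assumptions and not the routine used in the study, which presumably solves for a full rigid transformation. The function names and the small grid of points are hypothetical; only the applied offset mirrors the shift reported above.

```python
import math


def nearest(point, cloud):
    """Brute-force nearest neighbour of point in cloud."""
    return min(cloud, key=lambda q: math.dist(point, q))


def icp_translation(source, target, max_iter=20, tol=1e-9):
    """Translation-only ICP: returns the accumulated (dx, dy, dz) shift
    that moves source onto target."""
    shift = [0.0, 0.0, 0.0]
    for _ in range(max_iter):
        moved = [tuple(p[i] + shift[i] for i in range(3)) for p in source]
        pairs = [(m, nearest(m, target)) for m in moved]
        step = [sum(q[i] - m[i] for m, q in pairs) / len(pairs)
                for i in range(3)]
        shift = [shift[i] + step[i] for i in range(3)]
        if math.sqrt(sum(s * s for s in step)) < tol:
            break
    return tuple(shift)


# Hypothetical target cloud: a coarse 0.5 m grid standing in for the
# photogrammetric model.
target = [(0.5 * i, 0.5 * j, 0.5 * k)
          for i in range(4) for j in range(4) for k in range(2)]

# Hypothetical source cloud: the same points displaced so that the
# recovered shift matches the one reported in the text.
offset = (0.024, -0.066, 0.022)
source = [(x + offset[0], y + offset[1], z + offset[2]) for x, y, z in target]

shift = icp_translation(source, target)
magnitude = math.sqrt(sum(s * s for s in shift))
print("recovered shift (m):", tuple(round(s, 3) for s in shift))
print(f"shift magnitude: {magnitude:.3f} m")
```

The example converges quickly because the offset is far smaller than the point spacing, so every nearest-neighbour pairing is correct from the first iteration. With real meshes, a good initial georeferencing plays the same role, which is why ICP could serve here as a refinement of two data sets that were already within a few centimetres of each other.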

Fusion
In the last step, the two data sets were fused. To generate the final deliverable, the LiDAR 3D model and the texture from the UAV images acquired under shade were integrated. In the final model, the texture of the images was draped over the 3D mesh model, without artifacts appearing on the edges. The final photorealistic product is displayed in Figure 16.
Figure 16. The final photorealistic model and details of the Amathus Roman Baths. A 3D tour is available at: https://www.youtube.com/watch?v=Fjfq-uiOYRQ.

CONCLUSION
The geomatic techniques of aerial photogrammetry and TLS were integrated for the 3D reconstruction of the archaeological site of the Roman Baths of Amathus, Cyprus. The main goal of this research paper was to assess RTK/PPK imagery and how successfully it can be aligned with LiDAR data to create an accurate 3D photorealistic model. The final results (after applying the ICP) showed a high level of alignment between the two methods. The combination of the two techniques can deliver high resolution, high quality 3D photorealistic deliverables for small archaeological sites like the Amathus Baths. PPK (and RTK) GNSS data are good enough for DG and 3D reconstruction without GCPs, provided aerial photogrammetry is the only data collection methodology. The use of additional PPK corrections over RTK image positions increases the quality of the final deliverable and is recommended in geometric documentation applications where the requirements are higher than in simple mapping. Given that the PPK solution will transfer systematic errors to the georeferencing, just as any other GNSS-dependent method does, including TLS target measuring, an alignment of the data sets must take place to ensure that they can be fused without further issues.
The stations of the traverse should not be estimated using RTK GNSS alone. RTK GNSS may be used for georeferencing, provided some adjustment using TS measurements is employed for the final estimation of coordinates. In our case the discrepancy of the RTK data on the three stations was 0.037 m.
Given a TLS point spacing approximately equal to the image pixel size, photogrammetry describes shape less well, since it produced a mesh with ~2.5 times fewer triangles (2.3M versus 5.7M). In terms of overall quality, one must consider the peculiarities of the object before selecting a method. The oblique images were able to properly describe even the most difficult narrow areas and provided better overall coverage of the site, although this is highly related to the site's geometry. Generally, each acquisition method has its advantages and disadvantages, and no method can produce the perfect deliverable independently. The two methods achieved similar accuracy; the final accuracy depends mainly on the viewing geometry of each scan or camera position with respect to the object, i.e., on distance and angle. The UAV is superior in the 3D reconstruction of tall objects that TLS cannot see, but TLS can collect more detail in difficult or hidden spots, as it can be positioned closer to the objects. With traditional surveying methods it would be impossible to measure an archaeological site in so much detail. Fusion methods need several assumptions and an organized methodology to provide the final product. Finally, it is shown that fusion techniques are very useful for the control, maintenance and restoration of CH sites, and it is expected that their use will grow and evolve in the coming years.
A question that remains unanswered is whether a limited and unevenly distributed set of TLS control points for georeferencing, suffering from several degradations (combination of RTK GNSS and TS measurements, multiple sightings, interpretation on the PC, etc.), is better than a larger and evenly distributed set of photo projection centers, directly measured with a GNSS antenna that is located much higher than ground level and has much wider horizon visibility.
Photogrammetric data acquisition
Average flying height                  20 m
Camera focal length                    0.01057 m
Average GSD                            0.005 m
Time per flight                        ~20-25 mins
Image overlap                          Front    Side
  Vertical images (nadir)              70%      60%
  Oblique images (45° angle)           80%      60%
Obtained images
  Flight under shade                   401
  Flight under low sunlight            307
  Total images                         708
Table 1. Photogrammetric data acquisition. Only photos under shade were used to texture the 3D model. Both shade and low sunlight photos were used for 3D reconstruction and comparison to TLS.

Figure 5. The relative orientation of the 708 images and the dense PC.

Figure 6. Residuals of the camera self-calibration.

Figure 8. Points with low confidence level. Notice the very narrow gap (0.40 m) between the walls in the top left image (point A), where photogrammetry successfully returned points all the way to the bottom. At the same spot (point A, Figure 12), it was necessary to place the scanner at three stations to properly cover it.

Figure 10. Deformations on the edges of the sub-models.

Figure 11. The 4 sub-areas for generating a quality 3D mesh.

Figure 12. The TLS mesh and the scan positions.

Figure 14. Mesh-to-mesh distance computation. Note the visible seamlines due to the TLS deformations.

Table 4. TLS georeferencing errors. The remaining mean georeferencing errors are well below 0.010 m, indicative of a proper alignment when considering the internal accuracy of the scanner itself.