GEOMETRIC EVALUATION OF GAOFEN-7 STEREO DATA

China's first sub-metre stereo satellite, Gaofen-7, was launched on 7 November 2019. One of the main criteria for a stereo mapping satellite is the geometric accuracy of its images. In this paper, we present a systematic evaluation of the geometric accuracy of Gaofen-7 on two scenes over the centre of Munich, Germany. The geometric accuracy is evaluated in a three-step workflow: 1) direct georeferencing accuracy; 2) image orientation using bundle adjustment with ground control points; 3) height accuracy of the generated digital surface model (DSM). Dense LiDAR point clouds and ground control points measured in the field were used as references. The results show that RPC bundle adjustment with zero-order bias correction is sufficient to achieve sub-metre absolute accuracy. The height accuracy of the generated digital surface models varies with land cover type, ranging from 0.9 m (NMAD) in open areas to 4.5 m in urban areas.


INTRODUCTION
Recently, an increasing number of Earth-observation platforms have become able to capture Very High-Resolution (VHR) stereo optical imagery. IKONOS, launched on 24 September 1999, was the first civilian very high resolution along-track stereo satellite (Dial et al., 2003). It provides stereo imagery with a resolution of 0.82 m to 1 m. Limited by the image processing and stereo matching techniques of the time, the initial research using IKONOS data focused on image radiometry and geometric evaluation. Stereo imagery mainly served visualization and manual extraction purposes (Dial et al., 2003, Baltsavias et al., 2001, Tao et al., 2004). Automatic Digital Surface Model (DSM) generation approaches with different matching techniques were proposed years later (Baltsavias et al., 2006, Zhang and Gruen, 2006). Advanced dense matching techniques and the availability of 0.5 meter resolution WorldView-2 data have opened a new era for automatic 3D building reconstruction (Tian et al., 2017). The former DigitalGlobe, now Maxar, further improved the image resolution to 31 cm with its WorldView-3 and WorldView-4 satellites, which were launched in 2014 and 2016, respectively. With the launch of GeoEye in 2009 and Pléiades in 2011, more VHR stereo satellite data became available. However, these VHR satellites capture stereo data only for specific regions on demand, and the data are rather expensive. Related research is therefore still limited to small regions for which (multi-)stereo data were ordered. The specially designed satellites Cartosat-1 and ALOS/PRISM provide stereo imagery with around 2.5 meter resolution. Cartosat-1 only provides panchromatic data, and ALOS/PRISM is no longer operational. Therefore, more stereo data at affordable prices are still in demand for global 3D monitoring.
China began its stereo imaging space missions in 2012 with the launch of its first civil stereo satellite, Ziyuan-3 (ZY-3). ZY-3 provides 2.1 meter resolution nadir view images. The forward and backward cameras, which are inclined at ±22°, provide stereo images with a resolution of 3.5 meters (Tang et al., 2020a). As an advanced successor, China launched its first civilian sub-meter resolution stereo satellite, Gaofen-7, on 7 November 2019. In contrast to the Ziyuan-3 series, the Gaofen-7 system is composed of two cameras with forward and backward views, respectively (Tang et al., 2020b).
Several applications and studies based on Gaofen-7 data are available. However, most of them concentrate on the Laser Altimeter System (Tang et al., 2020b, Xie et al., 2020, Chen et al., 2022b, Chen et al., 2022a) or on positioning accuracy (Liu et al., 2021). Although Gaofen-7 stereo imagery has been available for two years, the use of these data for 3D reconstruction is restricted, partly due to the large stereo view angle, which makes stereo matching over urban regions more difficult.
Previous work (Tian et al., 2022, Luo et al., 2021) has evaluated the quality of the Gaofen-7 data from application-oriented aspects, including 3D modeling, building extraction, and road extraction, but did not perform a detailed evaluation of the underlying elevation models. Detection of tectonic faults based on GF-7 digital elevation models was investigated in (Zhu et al., 2023), and the geolocation accuracy and quality of the elevation model were evaluated in this context. These studies provide a good overview of the GF-7 capabilities and their application to specific problems.
In this paper, we focus on the geometrical properties of the stereo imagery and evaluate DSM generation over the Munich metropolitan area with precise reference data and for different land cover types. We evaluate different image orientation approaches from an end-user perspective. Our main contribution is a detailed, land-cover-specific evaluation of the elevation models over urban and semi-urban areas.

GAOFEN-7
Description of satellite
Gaofen-7 is equipped with two stereo cameras: a backward view camera with a view direction near nadir and a forward view camera with an inclination of +26°. The backward (nadir) and forward view cameras provide panchromatic images with a resolution of 0.65 and 0.8 meters, respectively. The backward view camera also provides four-band multispectral imagery at 2.6 meter resolution. Table 1 lists the detailed parameters of GF-7. In addition, a laser altimeter system with a plot size of 1.6 km × 1.6 km is installed on Gaofen-7 (Tang et al., 2020b, GF-7 Satellite, n.d.). The satellite has a revisit time of 5 days.

Revisit time 5 days
Table 1. Description of GF-7 two-line stereo imagery

Image orientation
The RPC sensor model is used for all processing. Image orientation is performed using RPC bundle adjustment with ground control points, tie points and a reference height model (d'Angelo, 2013). The block adjustment procedure yields image space RPC corrections; both zero-order row and column shifts and an affine correction were estimated. Well-distributed multi-ray tie points are automatically generated between the forward and backward panchromatic images using pyramidal local least squares matching, leading to highly accurate points with a precision of approximately 1/10 of a pixel. For this evaluation, GCPs tie the imagery to the world coordinate system. The SRTM DSM is only used as support for nearly collinear points, which might occur in the overlapping areas of images cut from a larger pushbroom strip. It is used with a very low a priori weight and thus has only negligible influence on the absolute orientation of the block.
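To illustrate what a zero-order image space correction means, the following minimal Python sketch estimates a constant row and column shift from GCP residuals. It is a simplification and not the bundle adjustment used in this work: tie points, the reference height model and a priori weights are ignored, and the function names and sample coordinates are hypothetical.

```python
from statistics import mean


def estimate_zero_order_shift(predicted, measured):
    """Estimate a constant (row, col) shift between RPC-projected and measured points.

    For a pure translation, the least-squares solution is the mean residual
    (measured - predicted) in each image coordinate.
    """
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need two non-empty point lists of equal length")
    d_row = mean(m[0] - p[0] for p, m in zip(predicted, measured))
    d_col = mean(m[1] - p[1] for p, m in zip(predicted, measured))
    return d_row, d_col


def apply_shift(points, shift):
    """Add the estimated shift to RPC-projected image coordinates."""
    return [(row + shift[0], col + shift[1]) for row, col in points]


# Hypothetical GCP image coordinates in pixels (row, col), for illustration only.
predicted = [(100.0, 200.0), (1500.0, 820.0), (3000.0, 4100.0)]
measured = [(112.0, 195.5), (1512.5, 815.0), (3011.5, 4096.0)]

shift = estimate_zero_order_shift(predicted, measured)
corrected = apply_shift(predicted, shift)
```

An affine correction would add scale and shear terms per image coordinate and therefore needs more, well-distributed GCPs, which is why a sufficient zero-order model is attractive in practice.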

DSM generation
Using the refined orientation, we generate the DSMs with Semi-Global Matching (SGM), which is still the most robust dense matching approach when both efficiency and accuracy are considered (Xia et al., 2020), and which performs reliably over large areas. In the SGM matching procedure, the Census cost function is used as the similarity measure. The matched point cloud is resampled to a gridded digital surface model with 1 meter sampling. Remaining holes are filled with hierarchical B-spline interpolation. The whole DSM generation procedure is fully automatic and involves no manual processing. Finally, orthophotos of the panchromatic and multispectral images are calculated using the filled DSM and the refined RPC parameters. We select the backward view images of Gaofen-7 to generate the orthophotos, due to their higher resolution and low off-nadir angle.
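The Census cost mentioned above can be sketched in a few lines of pure Python. The sketch assumes a 3 × 3 window and images given as nested lists; it shows only the matching cost, not the SGM path aggregation, and the function names are illustrative rather than taken from the software used in this work.

```python
def census_transform(image, radius=1):
    """Encode each interior pixel as a bit string over its neighbourhood.

    A bit is 1 where the neighbour is darker than the centre pixel.
    Border pixels without a full window are left at 0.
    """
    rows, cols = len(image), len(image[0])
    codes = [[0] * cols for _ in range(rows)]
    for r in range(radius, rows - radius):
        for c in range(radius, cols - radius):
            bits = 0
            for dr in range(-radius, radius + 1):
                for dc in range(-radius, radius + 1):
                    if dr == 0 and dc == 0:
                        continue  # the centre pixel is not compared with itself
                    bits = (bits << 1) | int(image[r + dr][c + dc] < image[r][c])
            codes[r][c] = bits
    return codes


def census_cost(code_a, code_b):
    """Matching cost: Hamming distance between two census codes."""
    return bin(code_a ^ code_b).count("1")


# Toy patches: 'brighter' is 'left' with a constant radiometric offset.
left = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
brighter = [[v + 10 for v in row] for row in left]
changed = [[1, 2, 3], [4, 5, 6], [7, 8, 0]]

code_left = census_transform(left)[1][1]
code_brighter = census_transform(brighter)[1][1]
code_changed = census_transform(changed)[1][1]
```

Because only the ordering of grey values enters the code, a constant brightness offset between the forward and backward image leaves the cost unchanged, which is one reason Census is a common choice for satellite stereo pairs.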

Test site
The test region covers the center of Munich, Germany. The orthophotos are overlaid on Open Street Map (OSM) for better visualization (Figure 1). The two scenes used in the experiments were captured on 17 April 2022, each covering 23 km × 26 km. We chose Munich as our test region for two reasons. Firstly, it is one of our most studied test regions, with many remote sensing images from other sensors, including other stereo satellite data and airborne data. Secondly, we have field-measured ground control points (GCPs) that can be used for absolute geometry evaluation.

Table 2. Survey point reprojection errors (predicted-measured) before image orientation. This is a measure of the direct georeferencing accuracy for these scenes. The pixel values have been multiplied by the GSD to show the deviation in meters.

LiDAR Data
The airborne LiDAR data from the LDBV (Bavarian Surveying and Mapping Authority, 2021) are used as reference data for the evaluation of the GF-7 DSM. The data have a point density of more than 4 points per square meter, a vertical accuracy of 0.1 m in open terrain and a planimetric accuracy of 0.5 m. The heights have been converted to ellipsoidal heights using the German Combined Quasigeoid (GCG2016) (Federal Agency for Cartography and Geodesy, 2016).
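Assuming the LiDAR heights are delivered as physical (normal) heights $H$, which is not stated explicitly above, the conversion amounts to adding the quasigeoid height $\zeta$ interpolated from the GCG2016 grid at each point position $(\varphi, \lambda)$. The symbols are introduced here for illustration only:

```latex
h(\varphi, \lambda) = H(\varphi, \lambda) + \zeta_{\mathrm{GCG2016}}(\varphi, \lambda)
```

Here $h$ denotes the ellipsoidal height that is compared with the GF-7 DSM.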

Direct georeferencing accuracy
The direct georeferencing accuracy is an important criterion for a mapping satellite, as it determines the number of ground control points required for processing the satellite imagery. In this section, we use the RPCs without any adjustment, which indicates the accuracy achievable without ground control. For the direct georeferencing evaluation, all survey points are used as check points (CPs).
We have performed two experiments to estimate the direct georeferencing accuracy of the sensors.First, all survey points were reprojected into the images and the difference to the measured image positions was calculated.The results are given in Table 2.
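The first experiment reduces to simple residual statistics. The following sketch assumes that the predicted and measured image coordinates are already available as lists and scales the pixel residuals by the GSD, as done for Table 2. The function names and sample values are hypothetical and do not reproduce the values of the tables.

```python
from math import sqrt
from statistics import mean, stdev


def residual_stats_m(predicted_px, measured_px, gsd_m):
    """Mean and standard deviation of (predicted - measured), scaled to metres."""
    residuals = [(p - m) * gsd_m for p, m in zip(predicted_px, measured_px)]
    return mean(residuals), stdev(residuals)


def rmse(residuals):
    """Root mean square error of a list of residuals."""
    return sqrt(mean(r * r for r in residuals))


# Hypothetical column coordinates (pixels) of three check points in a backward image.
predicted_cols = [1020.0, 5230.5, 9874.0]
measured_cols = [1001.0, 5210.0, 9856.0]
col_mean_m, col_std_m = residual_stats_m(predicted_cols, measured_cols, gsd_m=0.65)
```

The same function can be applied to the row coordinates and, with the 0.8 m GSD, to the forward image.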
To estimate the 3D position accuracy, the 3D position of each check point was computed through forward intersection of the measured image coordinates and compared to the 3D position of the survey point. See Table 3 for results. The mean X, Y and Z residuals are an estimate of the absolute accuracy of DSMs and ortho images created using direct georeferencing (without GCPs). Two scenes are not enough to fully characterize the direct georeferencing performance of a satellite, but the GF-7 scenes evaluated in this paper show a relatively large deviation, with an overall RMSE of 42.3 m.

Image Orientation
First, 1055 high quality multi-ray tie points were automatically measured between all 4 GF-7 images using local least squares matching. Seven survey points were used as ground control points (GCPs) and the remaining points were used as checkpoints. A bundle block adjustment based on tie points and GCPs is then performed, as described in Section 3.1. We estimate a zero-order image space correction, i.e. a row and column shift in image space. The locations of the used GCPs and CPs are shown in Fig. 4. Residual statistics for the CPs after the adjustment are presented in Table 4 and are much improved over the values in Table 3.

CPs
7 / 16 2.1 -0.1 / 1.2 0.8 / 0.9 0.2 / 1.2

The tie point RMSE is between 0.15 and 0.21 pixels. When checking the residual plots, no systematic patterns could be detected, so a simple zero-order shift is sufficient for RPC correction. This indicates a stable and well calibrated interior orientation, and is a good sign, as processing can be performed with a single high quality GCP per scene. Other dedicated stereo satellites such as Cartosat-1 or ZY-3 often require affine RPC correction to reach their full accuracy potential. In comparison, the scenes evaluated in this paper suggest that GF-7 is well suited for mapping large areas.

Digital Elevation Models
After orientation with the GCPs, DSMs (digital surface models) were generated as described in Section 3.2. A grid spacing of 1 m, roughly 2×GSD, was chosen. All DSMs were generated automatically with the same parameter values and are used without any manual processing (such as seed point measurement or DSM editing).

Figure 3. Shaded DSM generated from GF-7 stereo scenes.
An overview of the generated digital surface model is shown in Fig. 3. The general topography is depicted well, but some mismatches over water areas, such as the Starnberger See, are visible. This is expected, as no valid heights can be matched over water, and no further editing was performed on the generated DSM. The GF-7 DSM is then compared to LiDAR first pulse data, as shown in Fig. 4 for the whole DSM and in Fig. 5 for two smaller regions. Most larger buildings are visible in the GF-7 DSM, whereas smaller residential buildings and individual or sparse trees are often not present. This can be observed in the lower left part of AOI 1 and the upper part of AOI 2 in Fig. 5. In addition, buildings with complex roof structures are often not completely reconstructed, possibly due to differences in brightness and texture appearance between the FWD and BWD images. These differences stem from the relatively large stereo convergence angle between the two cameras and the large off-nadir angle of the FWD camera, which also increase the occluded areas in dense city structures. Additionally, the height of sparse or low trees is underestimated, as seen in the lower part of AOI 2. Heights over dense forest or open terrain seem to be estimated well. Thus, the visual inspection indicates that the accuracy of the GF-7 DSM depends on the land cover type.
Fig. 4 shows that the height accuracy of open areas does not change over the whole DSM. This indicates that a zero-order bias correction is indeed sufficient to obtain good results, even at large distances from the used GCPs, and points to a stable satellite platform with good characterisation and calibration. Statistics for the height differences are shown in Table 5. The ESA WorldCover 2020 land cover dataset (Zanaga et al., 2021), with a resolution of 10 m and 11 classes, was used to evaluate the DSM performance over different land cover types. The "All except waterbodies" statistics in Table 5 provide the overall accuracy over land. The normalized median absolute deviation (NMAD) and standard deviation (STD) values differ, indicating a non-normal distribution of the height differences. The different nature of, and the time shift between, the LiDAR first pulse data and the optical data result in different heights, especially in forest and agricultural areas, as seen in Fig. 5. These differences affect the STD, so the NMAD is a better indicator of the precision of the GF-7 DSMs. Over Grassland, Cropland and Bare soil, the GF-7 images perform well, with a mean difference of 0.1 m and an NMAD of 0.9 m. The negative mean values show that the heights are underestimated for the other land cover types, particularly for Tree cover, Shrubland and Built up. These differences are mostly caused by unmatched areas with small buildings and sparse trees. These areas are filled by interpolation from neighbouring matched areas, which are often ground pixels, leading to a height underestimation, cf. Fig. 5.
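The following sketch shows how the NMAD and STD respond differently to outliers, and how per-class statistics of the kind reported in Table 5 could be computed. It assumes the height differences and land cover labels have already been sampled at common locations. The sample values are invented for illustration, and the constant 1.4826 is the usual factor that makes the NMAD comparable to the standard deviation for normally distributed data.

```python
from statistics import mean, median, stdev


def nmad(diffs):
    """Normalized median absolute deviation of a list of height differences."""
    med = median(diffs)
    return 1.4826 * median([abs(d - med) for d in diffs])


def stats_by_class(diffs, classes):
    """Group height differences by land cover label and compute mean, STD and NMAD."""
    groups = {}
    for diff, label in zip(diffs, classes):
        groups.setdefault(label, []).append(diff)
    return {
        label: {
            "mean": mean(values),
            "std": stdev(values) if len(values) > 1 else 0.0,
            "nmad": nmad(values),
        }
        for label, values in groups.items()
    }


# Invented height differences (DSM - LiDAR, metres); the last one mimics a missed building.
diffs = [0.0, 0.5, -0.5, 1.0, -1.0, -30.0]
classes = ["Grassland", "Grassland", "Grassland", "Built up", "Built up", "Built up"]

overall_nmad = nmad(diffs)
overall_std = stdev(diffs)
per_class = stats_by_class(diffs, classes)
```

In this toy sample, the single large outlier inflates the STD to more than ten times the NMAD, which mirrors why the NMAD is preferred above as a precision measure.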

CONCLUSION
This paper has investigated the GF-7 stereo imagery products on two scenes covering Munich, Germany. Geometric orientation and DSM generation were assessed and compared to checkpoints derived from GPS measurements and LiDAR point clouds. For the two scenes, it was found that bundle adjustment using GCPs or other absolute reference data is required, as the direct georeferencing accuracy (RMSE) is 42.3 m. Other studies with larger datasets over China (Zhou and Tang, 2022) showed a roughly 10 times higher direct georeferencing accuracy after bundle block adjustment. Further investigation is required to determine why the two scenes evaluated in this study show a much higher deviation. It was found that a zero-order RPC shift correction was sufficient to obtain an RMSE in the order of the pixel size. DSM generation showed that this correction results in a stable orientation, even far from the used GCPs. This is a good result, as in theory one very good GCP per scene is sufficient to obtain digital elevation models without height bias. Height variations due to satellite jitter or CCD stitching effects are barely visible when compared to the high quality LiDAR reference data. Thus, image orientation of GF-7 data is straightforward and easier than for other stereo mapping satellites such as Cartosat-1, which requires affine RPC correction (d'Angelo, 2013), or ZY-3, which can suffer from sub-pixel attitude jitter and CCD stitching effects visible in the generated DSMs (Wang et al., 2016). As this work only evaluated two GF-7 stereo pairs from the same datatake, it is not clear whether these conclusions apply to all GF-7 datasets. Evaluation on multiple, well distributed test areas could strengthen the conclusions.
DSMs generated using the robust SGM algorithm showed good height accuracy and a precision in the order of the image GSD for areas that could be matched well. Evaluation using land cover data showed decreased height accuracy for complex structures, for example urban areas or sparse trees, where dense matching fails due to occlusions and differences in scene appearance. Occlusions occur frequently due to the relatively large stereo convergence angle and the large off-nadir angle of the forward looking camera. Previous studies and experience with VHR stereo triplets (Carl et al., 2013) show that smaller convergence angles lead to denser and higher quality DSMs without losing vertical accuracy, as the accuracy of image matching increases with smaller angles and compensates for the worse intersection geometry. As the GF-7 convergence angle is fixed, future work could look into training deep learning stereo methods to perform context-dependent interpolation of occluded areas. When using traditional, robust dense matching algorithms that can be applied reliably on a worldwide scale, such as SGM, GF-7 is well suited for DSM generation over open terrain, but not ideal for dense urban areas.

Figure 1. Test region overlaid on Open Street Map.

Reference data

GCPs
Ground Control Points (GCPs) were previously collected in different projects. The general workflow involved a manual selection of clearly visible points from the 3K aerial images. The points were generally located on the ground, mainly on roads or pedestrian ways. Fig. 2 shows an example of the collected GCPs. For each GCP, the XY coordinates and elevation value were recorded using differential GPS. Identification in the GF-7 images was not easy, as the road markings, clearly visible in the aerial imagery, were not easily recognizable in the lower resolution GF-7 images, which limited the quality of the GCP measurement. In the experiments, the measured points are divided into GCPs used during image orientation and checkpoints (CPs) used during evaluation.

Figure 2. Examples of GCP collection procedure.

Table 2 column headers: Scene, Sensor, CPs, Column (mean, STD), Row (mean, STD)

Figure 4. Height difference to the first pulse LiDAR points. No systematic deviation is visible. The GCPs used during image orientation are shown as red dots, the checkpoints as yellow dots.

Table 4. Object space residuals (mean / standard deviation) for independent check points (CPs) after bundle block adjustment.

Table 5. Statistics on height differences between LiDAR first pulse data and filled DSM. The analysis has been performed for different land cover classes, as provided by the ESA WorldCover product. Note that the Tree cover class also includes many residential areas.