LIDAR-INERTIAL LOCALIZATION WITH GROUND CONSTRAINT IN A POINT CLOUD MAP

: Real-time localization is a crucial task in various applications, such as automatic vehicles (AV), robotics, and smart city. This study proposes a framework for map-aided LiDAR-inertial localization, with the objective of accurately estimating the trajectory in a point clouds map. The proposed framework addresses the localization problem through a factor graph optimization (FGO), enabling the fusion of homogenous measurements for sensor fusion and designed absolute and relative constraints. Specifically, the framework estimates the light detection and ranging (LiDAR) odometry by leveraging inertial measurement unit (IMU) and registering corresponding featured points. To eliminate the accumulative error, this paper employs a ground plane distance and a map matching error to constraint the positioning error along the trajectory. Finally, local odometry and constraints are integrated using a FGO, including LiDAR odometry, IMU pre-integration, and ground constraints, map matching constraints, and loop closure. Experimental results were evaluated on an open-source dataset, UrbanNav, with an overall localization accuracy of 2.29 m (root mean square error, RMSE).


INTRODUCTION
Localization is a fundamental task for AVs as it provides realtime position to higher-level functions and automatics, such as vehicle control, path planning, and environment perceiving.The solution of LiDAR-based navigation coupled with inertial sensors, has become a widely used observation multi-sensors system for navigation tasks, as the observed point clouds are invariant to changing illumination and viewpoints (Ai et al., 2022).Although map-less LiDAR-based simultaneous localization and mapping (SLAM), such as Cartographer and LOAM-variants (Nüchter et al., n.d.;Ji Zhang and Singh, 2014), can generate accurate global mapping with real-time estimated trajectory, they are prone to misalignments and accumulated errors when there are limited stable points with associated features (Dube et al., 2017;Rozenberszki and Majdik, 2020;Schaefer et al., 2019).Therefore, involving a prior map which encodes the environmental features has become a popular solution to enhance the performance of long-term localization in urban canyon environments.
Researchers have successfully implemented map-aided navigation in urban environments; however, the performance is still limited in complicated scenarios.The challenging issues of map-aided LiDAR-based navigation can be discussed in two aspects.First, the accuracy of the odometry estimation is reduced when moving objects exist in the observed environments, as misalignments cannot be corrected by global constraints, such as loop closure, especially in cases where there are no loops along the trajectory.Although deep-learning-based semantic features have been widely used to enhance the alignment by involving object-level semantic patches, the positioning is still drifting overtime as the error continues to accumulate when the error source is not related to semantics.Secondly, the localization performance highly depends on the fusion scheme and designed constraints.The estimated trajectory is prone to falling into local minima resulting in unexpected errors.These challenging issues, if not addressed appropriately, can decrease the accuracy of LiDAR-based navigation, particularly in dense urban environments.
To address the aforementioned challenges, we propose a mapaided tightly coupled LiDAR-Inertial localization framework that incorporates ground constraints and map matching constraints to obtain a global trajectory.Specifically, relative odometry is estimated using LiDAR odometry, IMU preintegration.Along the trajectory, this paper proposes to detect and apply the local absolute constraints including map matching constraint and ground constraint, as well as loop closure constraint to enhance the positioning.Finally, a FGO is designed to fuse all the constraints and estimated relative odometry and output the real-time high accuracy trajectory.The main contributions of this work are summarized as follows:

•
We propose a map-aided navigation solution that integrates a LiDAR and an IMU for global localization and formulates the localization problem as FGO framework.

•
To enhance the performance of localization, we introduce absolute constraints based on the segmented ground planes and the map matching, which are jointly optimized in the FGO.

•
We present experimental evaluations that demonstrate the stable localization performance of our proposed framework in various urban environments.

RELATED WORK
In this section, we provide a brief review of map-aided LiDARbased localization.We categorize related works into two aspects: the mathematical formulation of navigation and the algorithms of constraints.
Some researches propose place recognition algorithms that match online point clouds with the prior map for real-time positioning.For example, the Autoware implements NDT matching for LiDAR-based navigation and introduces a scan matching framework, that the performance highly depends on the map resolution (Kato et al., 2018).In (H.Liu et al., 2019), a multigroup-LM (Levenberg-Marquardt) optimization algorithm is proposed to calculate the Jacobian matrix and avoid local minima.
Other researchers propose error models such as Monte Carlo (MC) algorithm to fuse homogenous variables and establish state function.For example, in the case of (Dong et al., 2021), the localization framework is implemented based on MC localization, and online trajectory is estimated by matching extracted pole-like objects with the prior pole map and fusing the variables into particle filters.Similarly, in (Liu et al., 2022), an Extended Kalman Filter (EKF) and an adaptive MC algorithm are designed to fuse multi-source data to estimate relative motion.Inspired by FGO-based SLAM algorithms, we involve the FGO framework to estimate the trajectory, which is one of the most stable frameworks to fuse heterogenous measurements and variables.
To address misalignment issues resulting in large drift along the trajectory, some researchers focus on feature encoding and extraction of the prior map to provide geometric description for the registration with onboarding sensors.For example, featurebased points, such as scale-invariant feature transform (SIFT) and infrared remittance, are extracted to represent the geometric distribution of surrounding environments (Imanullah et al., 2019;Levinson and Thrun, 2010).In (Bârsan et al., 2020), the work introduces the intensity-based map and involves the joint embedded features for association.High-definition (HD) maps, such as OpenstreetMap, provide the distribution and positions of traffic signs, which are considered as geo-referenced objects and matched with perceived results from LiDAR (Cho et al., 2022).
Other researchers involve training-based semantics such as DeepICP and Segmatch for the semantics-based alignment (Dube et al., 2017;Lu et al., 2019).However, the training models need to be updated with extra manual works when new environments are expected to be used as test datasets.In this work, we introduce a feature map that employs stable geometric points, including corner points, planar points and segmented ground points (Ji Zhang and Singh, 2014).
To enhance the robustness of long-term navigation, some researchers focus on designing constraints and optimization conditions for fault detection engines and boundary minimization.For example, overlapNet is proposed to estimate the yaw angle and recognize the loop closure for global mapping by aligning the semantics between LiDAR and the semantics map (Chen et al., 2022).Similarly, (Dube et al., 2017) introduces a semanticsaided matching algorithm for the loop closure, while (Zhu et al., 2020) introduces a graph-of-semantics network that involves semantic segments and generates graph descriptor for loop detection.In (X.Liu et al., 2019), ground points are segmented to provide consistent normal vector to eliminate the error from non-stationary objects.These works can address the problem by designing global constraints for accurate loop closure, however, the instability of point-level alignment is ignored during the local odometry estimation.Different from the previous works that focus on loop closure, this work proposes the local constraints including the map matching and consistent ground plane to eliminate the drift during the movements.

METHODOLOGY
In this section, the detail of the proposed map-aided localization framework is introduced as follows.

System overview
Considering a multi-sensor system comprising an onboard LiDAR and IMU, the system receives synchronized raw data from these sensors, and the measurement of Global Navigation Satellite System (GNSS) is optional.A prior feature map is georeferenced and is composed of geometric feature points.In addition, it is assumed that all the sensors are pre-calibrated, and the IMU frame coincides with the body frame.To estimate the trajectory, the motion state  of the system can be formulated as follows: The objective is to estimate the transformation matrix T from body coordinate B to world coordinate W, which is represented as Figure 1 provides an overview of the proposed localization framework, which aims to estimate the state node of the navigation in each timestamp.The state estimation is formulated as a maximum a posteriori (MAP) problem.In this study, a factor graph is used to model the MAP problem, which is one of the stable solution to fuse heterogenous measurements with different frequency (Shan et al., 2020).By defining a Gaussian noise model, the framework is to find the minimized value by solving the nonlinear least-squares problem, as Equation 2.
where e & ,-./0 , e -,3 loop , e & ground , e 4 567 , and e 8 9:8;<-4= denote the LiDAR odometry, loop closure, ground constraints, IMU preintegration, and map matching constraint, respectively. denotes the number of nodes in the factor graph.Each factor contributes to the pose estimation when a new state node is added to the graph.Then the factor graph is subsequently optimized based on the Bayes tree (iSAM2) using incremental smoothing and mapping framework (Kaess et al., 2011).The process for generating the aforementioned factors and constraints is described in the following sections.

Tightly coupling LiDAR-IMU odometry
The tight coupling of LiDAR and IMU enables the estimation of the relative motion between each relative state node incorporating IMU pre-integration and LiDAR odometry.Details are described in the following sections.

IMU pre-integration
Following the algorithm and solution from (Forster et al., 2017), the IMU pre-integration based on raw angular velocity and acceleration is defined, as shown in Equation 3and Equation 4.
The IMU pre-integration provides the relative transformation between two received IMU raw data and provides a state factor while mitigating the motion drift of point clouds.
where  E 8 = raw angular velocity a F 8 = raw acceleration R 8 ?$ = transformation matrix from  to where Δv -3 , Δp -3 , ΔR -3 represent the velocity, position, and rotation of relative motion in the body frame.
The IMU mechanization provides highly frequent relative pose information through its measurements.Meanwhile, the IMU bias is jointly optimized using FGO during motion to mitigate the drift.

LiDAR odometry estimation
When a new frame of point clouds is available, several steps are implemented to estimate the local LiDAR odometry, including motion drift calibration, feature extraction, scan matching and relative transformation estimation (J Zhang and Singh, 2014).
Here, we provide a brief overview of the LiDAR odometry estimation process.
By integrating IMU data, the first step is to perform motion drift calibration to synchronize the timestamp of each frame.Next, edge and planar points are then extracted as featured point } by calculating and sorting the roughness of points.The distance between relative frames is minimized and the relative transformation is calculated.The process can be formulated as Equation 5, where  denotes the distance minimization operation.m &EB,# &,# demonstrates the coarse relative pose information between the feature point set and feature point set Two steps are involved in estimating the relative transformation matrix.First, the corresponding edge and planar points are registered by comparing the feature patches.Second, with an initial value is estimated from the IMU pre-integration, a Gauss-Newton method is used to solve the non-linear function for minimizing the distance.The relative transformation matrix can be represented using Equation 6, ΔT &EB,& = T &EB A T & (6) where ΔT &EB,& is the relative LiDAR odometry between two states  & and  &EB .

Map matching
In this section, we provide a comprehensive explanation of incorporating map matching between LiDAR perception and the prior feature map for the localization task.
Prior to online localization, a prior feature map is generated to represent the geometric features of the surrounding structures.In general, we utilize the ground truth poses to align featured points extracted from LiDAR to construct the feature map along the trajectory.Firstly, pre-processing is performed, which includes coordinate transformation and time synchronization, between GNSS and LiDAR.This ensures that both data sources are properly aligned for further analysis.Secondly, a feature extraction module is designed to extract stable points from LiDAR.Specifically, ground points are segmented to provide consistent ground planes, while corner points and planar points are extracted to encode the geometric information of the environment.Additionally, a denoising algorithm is implemented to eliminate the effect of noise points and nonstationary objects.Finally, the offline map is generated by aligning continuous feature points using the ground truth trajectory.An example of the feature map is shown in Figure 2. The aim of the proposed map matching process is to establish relative transformation relationship between the onboard LiDAR and the prior feature map.In general, the distance error of the map matching is expected to be zero, which can be an absolute constraint for the relative transformation estimation.To generate the map matching factor, the Iterative Closet Point (ICP) is implemented to align the corresponding points between onboarding LiDAR and the feature map.Here, to reduce the computation cost, sparse keyframes are selected with predefined distance and angle thresholds and are utilized to generate the constraint.An example of map matching is shown in Figure 3.

Ground constraint
This section outlines the approach employed to detect ground points and constraints from point clouds captured by the onboard LiDAR.
Various techniques have been proposed for extracting the ground plane from point clouds, and one widely used and efficient algorithm is Random Sample Consensus (RANSAC) (Ni et al., 2016).Prior to ground detection, points clustering is implemented via a pre-defined Euclidean distance and ground points segmentation is achieved by exploiting the angle of the received laser signals, which enables estimation of the relative height of the vehicles in relation to the ground.In this work, the ground plane model is defined using the normal vector of the detected surface and the Euclidean distance between the points to the detected ground surface, as follows, ) represents the normal vector and  & is the Euclidean distance.Therefore, the process of ground detection via RANSAC can be defined as Equation 8, * denotes the optimal parameters of the ground plane.The  = = { F,= ,  G,= ,  H,= } denotes several points contributed to ground surface.Figure 4 illustrates an example of ground detection using RANSAC in urban environments.The red points indicate the ground points, while the white points represent the non-ground points.It is worth noting that the algorithm is capable of accurately detecting and labelling the ground points, even in the presence of dynamic objects such as vehicles on the road.By assuming the ground model is consistent as plane primitive along with the trajectory, the difference of detected ground planes can be considered as ground constraints and optimized in the FGO.

Loop detection
To perform global constraints, loop closure detection is a useful technique to eliminate the accumulative errors.It involves comparing the current frame with historical frames and applying the global constraint.To efficiently identify potential loop closures, this paper employs the scan context (SC) algorithm for loop candidate selection.When the difference of SC is less than a threshold , the loop closure is accepted when the distance is less than distance threshold  K; .A brief description of the SC descriptor and loop closure is introduced as follows: The (SC) algorithm serves as a global descriptor for place recognition and facilitates the descriptor of loop closures (Kim and Kim, 2018).The SC descriptor divides a frame point clouds into azimuthal bins and radial bins, with the number of sectors  L and rings  M .The unorganized point clouds can be transformed using Equation 9.
(9) where  = point clouds [i , j] = i th ring and j th sector Then the value of each bin of matrix is defined as the maximum value of the height value, as the following Equation 10, () demonstrates the operation that return the matrix value of  direction.
Given SC pairs, the similarity between two SC descriptor is calculated as Equation 11, where ( T ,  ; " ) = Euclidean distance between the two SC pairs. = predefined threshold  * = index of loop closure After comparing the SC distances, the loop closure is selected by searching the closest distance of feature points in Euclidean space.The relative transformation matrix between the accepted loop is considered as loop closure factor, following Equation 12, where -,3 KUUC denotes the transformation matrix between the detected frames. -,3 KUUC and  -,3 KUUC represent the translation and quaternion form, respectively.

GNSS localization (optional)
Despite the ability of map matching can provide reliable georeference and constraints, accumulative errors may affect the performance of localization.To address this issue, a GNSS system is considered as an optional solution to provide absolute measurements.
When GPS signal is received, it is transformed into the Cartesian coordinate using the Monarto projection.Subsequently, an initialization process is carried out to evaluate the accuracy of the current GPS signal by comparing the position between the state of the last node and the current GPS position.A new GPS factor is then added as a new node into the factor graph.In addition, to synchronize with other factors, linear interpolation is employed among the GPS measurements based on timestamps.

Factor graph optimization
This section provides an overview of the construction and optimization process of the factor graph optimization, incorporating the designed constraints and the motion state.The factor graph efficiently integrates the constraints, including map matching constraints and ground constraints, with the global constraints of loop closure, as well as the LiDAR odometry and IMU pre-integration.The structure of the factor graph is shown in Figure 5, which includes four types of constraints and the state node of the localization in the .The state node is represented using  which is defined in Equation 1, which encodes the pose information along the trajectory.
The loop closure constraint is defined by the Euclidean distance between the detected loop as following, Given that the ground constrain is based on the detected ground plane, the  & =MUY4Z can be represented using the normal vector, as demonstrated in Equation 16 (Wen and Hsu, 2022).

EXPERIMENTS
To evaluate the proposed framework, we conducted experiments using a opensource dataset from UrbanNav (Hongkong) with medium density is selected (Hsu et al., 2021).The detailed description of the dataset is shown in Table 1.Meanwhile, some key parameters used in experiments are listed in Table 2.During the evaluation, we compared to the state-of-art Lego-LOAM and NDT-based solution algorithms (Kato et al., 2018;Shan and Englot, 2018).The accuracy evaluation for translation and rotation is defined as following.Meanwhile, we implement the evaluating toolkit to calculate the relative error (RPE) for the translation and rotation in movement direction, including x axis and y-axis (Grupp, 2017).As shown in Figure 6, the proposed solution successfully estimates a highly accurate trajectory compared to the groundtruth pose information.Specifically, the trajectory obtained by Lego-LOAM is highly drifted as the performance of LiDAR odometry is easily affected in dynamic environments with many moving objects.Additionally, the global optimization of Lego-LOAM is only based on loop closure in the graph optimization algorithm, and large drift has been accumulated before the loop is obtained along the trajectory.As for the NDT-based algorithm, the performance of the trajectory depends only on the accuracy of NDT matching, which is easily affected by moving objects.
Compared to these two algorithms, our proposed algorithm achieves more stable results in urban canyon environments.Figure 7 shows the detailed and zoomed-in results of the estimated trajectories using different algorithms.We selected turning (location A), after-turning (location B), and long-term navigation (location C) for accuracy comparison.When the vehicle enters the turning, the result of Lego-LOAM is highly affected as the overlapping between relative frames decreases (as shown in Figure 7(a)).Additionally, the performance of the NDT-based algorithm continuously decreases as errors accumulate via point cloud registration.Therefore, our proposed algorithm achieves stable performance by enhancing error elimination and local constraints.
The error evaluation of rotation and translation along the trajectory are shown in Table 3 and Table 4.In general, our proposed algorithm achieves more stable results than Lego-LOAM and NDT-based algorithm.Regarding the mean rotation error, the proposed algorithm achieves around meters error in the long-term navigation task.With the help of IMU preintegration, the trajectory is stable even where there is a turning (as shown in Figure 7).

CONCLUSION AND FUTURE WORK
This paper presents a localization framework based on LiDAR and inertial measurements, aiming to enhance robustness in urban canyon environments.The proposed approach formulates localization within a FGO framework, utilizing state nodes and constraints.Specifically, LiDAR odometry and IMU preintegration estimate the relative transformation to link the relative state node, while three separate constraints in the form of factors are introduced through segmented ground constraint, map matching, and loop closure.The proposed algorithm offers two key advantages: Firstly, by employing the FGO framework, heterogenous measurements can be fused, effectively mitigating cumulative errors.Secondly, the framework can efficiently enhance positioning performance at two stages of odometry estimation and global optimization by fusing different constraints.
Experimental results demonstrate that the navigation solution can achieve stable and high accuracy positioning in urban canyon environments.
Future work will concentrate on the feature representation and association to improve the accuracy of registration and place recognition.The limitation of our work is that we utilize all the point clouds obtained from LiDAR.For the process of point registration, moving objects (e.g., moving cars) will bring in large error feature association and the non-rigid transformation matrix between relative state nodes.Feature work will focus on removing moving objects based on semantics and enhancing the positioning in dense environments.

Figure 1 .
Figure 1.The pipeline of the proposed framework.

Figure 2 .
Figure 2. One example of prior feature map, where (a) represents the global map and (b) shows the zoomed-in version of the global map.The zoomed-in map clearly illustrates the boundary and consistent ground points.

Figure 3 .
Figure 3.An example of map matching between received featured point clouds (coloured points) and the down sampled feature maps (white points) in the Linux system.

Figure 4 .
Figure 4. Examples of ground points segmentation (raw data: white points, segmented ground: red points) in the Urbannav dataset.

Figure 5 .
Figure 5.The structure of factor graph optimization.

Figure 6 .
Figure 6.Estimated trajectory using our proposed method and ground-truth trajectory.

Figure 7 .
Figure 7. Evaluation results compared to ground truth trajectory.
• RMSE: standard deviation of the trajectory • MEAN: Mean error of the trajectory

Table 2 .
Parameters in the experimental evaluation

Table 3 .
Accuracy evaluation in translation.

Table 4 .
Accuracy evaluation in rotation.