Image-to-Point Cloud Registration using Camera Motion Generation and Monocular Depth Estimation
Keywords: Image-to-point cloud registration, Monocular depth estimation, Camera motion generation
Abstract. Image-to-point cloud (I2P) registration plays a vital role in applications that require accurate spatial alignment between 2D images and their corresponding 3D point clouds. Traditional I2P methods often require extensive training to generalise across diverse environments and rely on intrinsic camera parameters for accurate metric depth estimation, which limits their effectiveness in complex or unseen scenarios. To address these challenges, this study introduces a novel approach that leverages Camera Motion Generation (CMG) and Monocular Depth Estimation (MDE) for the I2P registration task. CMG simulates camera movements in the up, down, left, and right directions to generate novel viewpoints of the scene. MDE is then applied to each generated frame to produce a point cloud, and the per-frame point clouds are combined using multi-way registration. Finally, the registered point cloud is aligned with the scene point cloud using the Iterative Closest Point (ICP) algorithm. The proposed method requires neither training nor intrinsic camera parameters, which makes it suitable for diverse and unseen environments. We evaluate the approach through extensive experiments, using the Root Mean Square Error (RMSE) between the generated and ground-truth point clouds as the measure of registration accuracy. The results indicate that our method achieves competitive RMSE values across various scenarios, supporting its accuracy and adaptability for I2P registration.
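The abstract does not state how the RMSE between the generated and ground-truth point clouds is computed. As a minimal sketch, assuming nearest-neighbour correspondences (the same pairing rule ICP uses), the metric could be evaluated as follows. The function name `nearest_neighbor_rmse` and the toy coordinates are hypothetical and serve only as illustration; they are not taken from the experiments.

```python
import math


def nearest_neighbor_rmse(source, target):
    """RMSE of the distances from each source point to its nearest target point.

    Assumption: correspondences are nearest neighbours, as in ICP.
    Points are (x, y, z) tuples.
    """
    if not source or not target:
        raise ValueError("both point clouds must be non-empty")
    squared_distances = []
    for p in source:
        # Brute-force search; a k-d tree would be preferable for large clouds.
        nearest = min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in target)
        squared_distances.append(nearest)
    return math.sqrt(sum(squared_distances) / len(squared_distances))


# Toy example: the generated cloud sits 0.1 units below the ground truth in z.
generated = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
ground_truth = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1), (0.0, 1.0, 0.1)]
rmse = nearest_neighbor_rmse(generated, ground_truth)
print(f"RMSE: {rmse:.3f}")
```

Under this convention the value is asymmetric: it measures how far the generated points lie from the ground truth, not the reverse, so the direction of the comparison should be reported alongside the number.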