Image-to-Point Cloud Registration using Camera Motion Generation and Monocular Depth Estimation

Kamali, Masoud; Atazadeh, Behnam; Rajabifard, Abbas; Chen, Yiqun

doi:https://doi.org/10.5194/isprs-annals-X-G-2025-453-2025

Articles | Volume X-G-2025

© Author(s) 2025. This work is distributed under
the Creative Commons Attribution 4.0 License.

10 Jul 2025

Keywords: Image-to-point cloud registration, Monocular depth estimation, Camera motion generation

Abstract. Image-to-point cloud (I2P) registration plays a vital role in applications requiring accurate spatial alignment between 2D images and their corresponding 3D point clouds. Traditional I2P methods often require extensive training to generalise across diverse environments and rely on intrinsic camera parameters for accurate metric depth estimation, limiting their effectiveness in complex or unseen scenarios. To address these challenges, this study introduces a novel approach that leverages Camera Motion Generation (CMG) and Monocular Depth Estimation (MDE) for the I2P registration task. CMG simulates camera movements in the up, down, left, and right directions, enabling the generation of novel viewpoints of the scene. MDE is applied to each frame to generate point clouds, which are subsequently registered using multi-way registration. The final registered point cloud is then aligned with the scene point cloud through the Iterative Closest Point (ICP) algorithm, ensuring precise spatial alignment. The proposed method eliminates the need for training or reliance on intrinsic camera parameters, making it robust for diverse and unseen environments. We evaluated the proposed approach through extensive experiments using the Root Mean Square Error (RMSE) to measure registration accuracy between the generated and ground truth point clouds. The results indicate that our method achieves competitive RMSE values across various scenarios, validating its effectiveness in enhancing I2P registration accuracy and adaptability.
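The final alignment and evaluation steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a plain NumPy point-to-point ICP with brute-force nearest-neighbour matching, and it assumes that RMSE is computed over nearest-neighbour distances, which the abstract does not specify. The CMG, MDE, and multi-way registration stages are not shown, and the function names (`best_rigid_transform`, `nearest_neighbours`, `icp`, `nn_rmse`) are hypothetical.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares R, t such that R @ src_i + t ~ dst_i (Kabsch method)."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def nearest_neighbours(src, dst):
    """Brute-force nearest neighbour in dst for every point in src."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=2)
    idx = d2.argmin(axis=1)
    return idx, np.sqrt(d2[np.arange(len(src)), idx])


def nn_rmse(src, dst):
    """RMSE of nearest-neighbour distances (assumed evaluation metric)."""
    _, dist = nearest_neighbours(src, dst)
    return float(np.sqrt(np.mean(dist ** 2)))


def icp(src, dst, iterations=50, tol=1e-10):
    """Point-to-point ICP; returns accumulated R, t and the aligned points."""
    R_total, t_total = np.eye(3), np.zeros(3)
    current = src.copy()
    prev_err = np.inf
    for _ in range(iterations):
        idx, dist = nearest_neighbours(current, dst)
        R, t = best_rigid_transform(current, dst[idx])
        current = current @ R.T + t              # apply the update to row vectors
        R_total, t_total = R @ R_total, R @ t_total + t
        err = np.sqrt(np.mean(dist ** 2))        # error before this update
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, current


if __name__ == "__main__":
    # Synthetic stand-in for the scene cloud and a slightly misaligned
    # reconstructed cloud (toy data, not from the paper).
    rng = np.random.default_rng(0)
    scene = rng.random((50, 3))
    a = np.deg2rad(1.0)
    R_true = np.array([[np.cos(a), -np.sin(a), 0.0],
                       [np.sin(a),  np.cos(a), 0.0],
                       [0.0,        0.0,       1.0]])
    generated = scene @ R_true.T + np.array([0.005, -0.005, 0.005])

    print("RMSE before ICP:", nn_rmse(generated, scene))
    _, _, aligned = icp(generated, scene)
    print("RMSE after ICP: ", nn_rmse(aligned, scene))
```

ICP only refines an alignment that is already roughly correct, which is why the toy example uses a small initial misalignment. The brute-force matching is quadratic in the number of points, so a real pipeline would more plausibly use a spatial index or a point cloud library.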
