ISPRS-Annals

ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences

ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.

2194-9050

Copernicus Publications

Göttingen, Germany

10.5194/isprs-annals-XI-2-2026-145-2026

Extraction of Pole-like Road Objects from MMS Point Clouds Using Deep Learning and Geometric-Topological Feature Fusion

Shu

¹ Shirai

Masataka

¹ Yokota

Hiroyuki

AERO TOYOTA CORPORATION, Japan

03 07 2026

Volume XI-2-2026, pages 145–154

2026

This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/

This article is available from https://isprs-annals.copernicus.org/articles/XI-2-2026/145/2026/isprs-annals-XI-2-2026-145-2026.html

The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/XI-2-2026/145/2026/isprs-annals-XI-2-2026-145-2026.pdf

This paper presents a fusion framework for the automatic extraction of pole-like road objects, including traffic lights, road signs, streetlights, and utility poles, from Mobile Mapping System (MMS) point clouds. The proposed method combines KPConv-based semantic segmentation with geometric-topological reasoning, enabling structural completion and heuristic filtering of nearby clutter without retraining or additional annotated data. The framework was trained on 8 km of manually annotated MMS data collected in the Kinki region of Japan and evaluated on two large-scale datasets: (i) a 26 km MMS dataset from Hokkaido (≈2.53 billion points) acquired using the same LiDAR sensor, and (ii) the Paris-Lille-3D benchmark (France) captured with a different LiDAR sensor. Quantitative evaluation shows that the proposed fusion framework outperforms the KPConv baseline on both datasets, particularly in recall and F₁-score. On the Hokkaido dataset, recall improved from 0.7952 to 0.8924 (+0.0972) and the F₁-score increased from 0.8263 to 0.8689 (+0.0426), reflecting successful reconstruction of lamp tops, signal arms, and previously unseen snow delineator posts (snow poles). On the Paris-Lille-3D benchmark, which represents a cross-sensor and cross-domain scenario, recall improved from 0.5109 to 0.6656 (+0.1547) and the F₁-score increased from 0.6230 to 0.7032 (+0.0802). In terms of computational efficiency, the 26 km Hokkaido dataset was processed in under 13 hours on a single NVIDIA Quadro RTX 8000. Overall, these results indicate that the proposed deep-learning-geometry-topology fusion framework achieves high accuracy, robust generalization, and practical scalability for large-scale road-asset mapping and digital-twin generation.
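The abstract reports recall and F₁-score but not precision. The sketch below is an illustration rather than part of the authors' pipeline: it recovers the precision implied by each reported pair, assuming that F₁ is the harmonic mean of a single precision and recall value (this would not hold if the scores were averaged per class), and derives lower bounds on throughput from the "under 13 hours" figure. All function and variable names are hypothetical.

```python
# Assumption: F1 = 2PR / (P + R) for one precision/recall pair per experiment.
# The abstract does not state how the scores were aggregated.

def implied_precision(recall: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, giving P = F1 * R / (2R - F1)."""
    denom = 2.0 * recall - f1
    if denom <= 0.0:
        raise ValueError("no valid precision exists for this recall/F1 pair")
    return f1 * recall / denom


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def min_throughput_km_per_h(length_km: float, max_hours: float) -> float:
    """Lower bound on distance processed per hour, given an upper bound on runtime."""
    return length_km / max_hours


def min_throughput_points_per_s(n_points: float, max_hours: float) -> float:
    """Lower bound on points processed per second, given an upper bound on runtime."""
    return n_points / (max_hours * 3600.0)


# (recall, F1) pairs quoted in the abstract.
REPORTED = {
    ("Hokkaido", "KPConv baseline"): (0.7952, 0.8263),
    ("Hokkaido", "fusion"): (0.8924, 0.8689),
    ("Paris-Lille-3D", "KPConv baseline"): (0.5109, 0.6230),
    ("Paris-Lille-3D", "fusion"): (0.6656, 0.7032),
}

if __name__ == "__main__":
    for (dataset, method), (recall, f1) in REPORTED.items():
        precision = implied_precision(recall, f1)
        print(f"{dataset:15s} {method:16s} implied precision = {precision:.4f}")

    # 26 km and about 2.53 billion points in under 13 hours (Hokkaido dataset).
    print(f"throughput >= {min_throughput_km_per_h(26.0, 13.0):.2f} km/h")
    print(f"throughput >= {min_throughput_points_per_s(2.53e9, 13.0):.0f} points/s")
```

Under that assumption, the implied precision is slightly lower for the fusion framework than for the baseline on both datasets, which would mean the F₁ gains are driven by recall. This is a derived estimate, not a figure reported by the authors.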