Occlusion-Robust SfM in Construction Sites via Geometry-Guided Foreground Segmentation
Keywords: Dynamic Occluders, Outlier Detection, Structure-from-Motion, Construction Sites, Prompt-based Segmentation
Abstract. Accurate 3D reconstruction is a key enabler for construction progress monitoring and digital-twin maintenance. However, in tower-crane imagery, persistent dynamic occluders such as hooks and slings violate the static-scene assumption of conventional Structure-from-Motion (SfM), leading to feature mismatches and degraded reconstruction consistency. In this paper, we present a geometry-guided occlusion-handling pipeline for crane-mounted construction-site SfM. Our approach leverages geometric cues from reprojection errors and depth inconsistencies to identify outlier observations, clusters them into spatially coherent prompts, and uses these to guide a foundation segmentation model (SAM2). The resulting per-frame masks are integrated into mask-constrained SfM optimization, ensuring that only static background contributes to reconstruction. Experiments on three real-world crane-mounted sequences (30 m, 45 m, and 120 m) show consistent reductions in mean reprojection error relative to the unmasked baseline. In the most challenging case, the error decreases from 0.962 to 0.872 pixels (9.4%). Compared with a fixed rectangular masking strategy, the proposed masks yield similar reprojection errors while better preserving valid observations and sparse-point completeness. These results indicate that the proposed framework provides a practical geometry-guided strategy for improving internal reconstruction consistency in crane-mounted construction environments.
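
As an illustration of the geometry-guided prompting step summarized above, the following minimal sketch flags observations with high reprojection error and groups their pixel locations into point prompts. It is a sketch under stated assumptions, not the paper's implementation: the function names, the 2-pixel threshold, the 20-pixel linking radius, and the minimum cluster size are all hypothetical, the camera is an undistorted pinhole without skew, and the depth-inconsistency cue is omitted. The resulting centroids are the kind of point prompt that could be passed to a promptable segmenter such as SAM2.

```python
import math
from collections import deque


def reprojection_error(K, R, t, X, uv):
    """Pixel distance between an observed keypoint uv and the pinhole
    projection of the 3D point X (no distortion, no skew: an assumption)."""
    # Transform the world point into the camera frame: Xc = R * X + t.
    xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    u = K[0][0] * xc[0] / xc[2] + K[0][2]
    v = K[1][1] * xc[1] / xc[2] + K[1][2]
    return math.hypot(u - uv[0], v - uv[1])


def flag_outliers(errors, thresh_px=2.0):
    """Indices of observations whose error exceeds a hypothetical threshold."""
    return [i for i, e in enumerate(errors) if e > thresh_px]


def cluster_prompts(points, radius=20.0, min_size=3):
    """Single-linkage grouping of 2D points: two points share a cluster if a
    chain of neighbours closer than `radius` connects them. Returns one
    centroid (x, y) per cluster with at least `min_size` members."""
    visited = set()
    prompts = []
    for seed in range(len(points)):
        if seed in visited:
            continue
        visited.add(seed)
        members, queue = [seed], deque([seed])
        while queue:
            i = queue.popleft()
            for j in range(len(points)):
                if j not in visited and math.dist(points[i], points[j]) <= radius:
                    visited.add(j)
                    members.append(j)
                    queue.append(j)
        if len(members) >= min_size:  # drop isolated outliers
            cx = sum(points[m][0] for m in members) / len(members)
            cy = sum(points[m][1] for m in members) / len(members)
            prompts.append((cx, cy))
    return prompts


if __name__ == "__main__":
    # Toy camera: focal length 1000 px, principal point (640, 360).
    K = [[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]]
    R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    t = [0.0, 0.0, 0.0]
    err = reprojection_error(K, R, t, [0.0, 0.0, 10.0], (643.0, 364.0))
    print("reprojection error (px):", err)
    pts = [(100.0, 100.0), (105.0, 102.0), (110.0, 98.0), (500.0, 500.0)]
    print("prompts:", cluster_prompts(pts))
```

In this toy setting the three nearby outliers collapse into a single prompt while the isolated one is discarded, reflecting the idea that spatially coherent outliers are more likely to belong to an occluder than to scattered mismatches.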
