Evaluation of Input Sampling Methods for Deep-Learning-Based Semantic Segmentation of Large-Scale 3D Point Clouds
Keywords: 3D Point Clouds, Deep Learning, Semantic Segmentation, Input Sampling
Abstract. 3D point clouds used in geospatial applications typically contain billions of points. Processing 3D point clouds of this size as a whole with deep learning models requires computational resources (e.g., GPU memory) that are usually not available. To obtain 3D point clouds that can be processed by deep learning models, sampling methods that produce local subsets of large-scale 3D point clouds with a smaller extent or lower density are essential. Nonetheless, the impact of different input sampling methods on the semantic segmentation performance of deep learning models has received little attention so far. In this paper, we compare three widely used input sampling techniques (random sampling, farthest point sampling, and grid sampling) with respect to the semantic segmentation performance of different deep learning architectures, using inputs of different spatial extents. We consider both indoor and outdoor scenarios, using the Stanford Large-Scale 3D Indoor Spaces and Paris-CARLA-3D datasets as reference datasets. We find that random and grid sampling outperform farthest point sampling in terms of segmentation performance, with mean intersection-over-union scores of approximately 0.6, while random sampling has the fastest execution time. For indoor scenarios, using input 3D point clouds with a small spatial extent (i.e., 1 m) yields the best results. For outdoor scenarios, similar performance is obtained for all tested input extents. In an additional experiment, we evaluate a curvature-weighted sampling approach to test whether geometric features derived from 3D point clouds can guide the selection of more informative input points for deep learning models. However, we find that using curvature as a sampling criterion decreases the segmentation performance, indicating a mismatch between the expected relevance of high-curvature points (e.g., points representing object borders) and the internally learned features of the deep learning models.
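To make the three compared sampling strategies concrete, the following is a minimal NumPy sketch of how each one can select a subset of a point cloud. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the function names, the synthetic point cloud, the sample count of 256, and the cell size of 0.1 are all hypothetical choices, and grid sampling is shown in one common variant that keeps the first point found in each occupied cell (other variants keep the cell centroid instead).

```python
import numpy as np


def random_sampling(points, n_samples, rng):
    """Draw n_samples points uniformly at random, without replacement."""
    idx = rng.choice(len(points), size=n_samples, replace=False)
    return points[idx]


def farthest_point_sampling(points, n_samples, rng):
    """Greedily add the point farthest from all points selected so far."""
    n = len(points)
    selected = np.empty(n_samples, dtype=np.int64)
    selected[0] = rng.integers(n)  # arbitrary starting point
    # Squared distance from every point to its nearest selected point.
    min_dist = np.full(n, np.inf)
    for i in range(1, n_samples):
        diff = points - points[selected[i - 1]]
        min_dist = np.minimum(min_dist, np.einsum("ij,ij->i", diff, diff))
        selected[i] = np.argmax(min_dist)
    return points[selected]


def grid_sampling(points, cell_size):
    """Keep one point per occupied grid cell (the first one encountered)."""
    # Integer cell coordinates relative to the bounding-box minimum.
    cells = np.floor((points - points.min(axis=0)) / cell_size).astype(np.int64)
    _, first_idx = np.unique(cells, axis=0, return_index=True)
    return points[np.sort(first_idx)]


# Hypothetical input: 2000 points drawn uniformly in a unit cube.
rng = np.random.default_rng(0)
cloud = rng.uniform(0.0, 1.0, size=(2000, 3))

sub_random = random_sampling(cloud, 256, rng)
sub_fps = farthest_point_sampling(cloud, 256, rng)
sub_grid = grid_sampling(cloud, 0.1)
```

Random and farthest point sampling return a fixed number of points, whereas the output size of grid sampling depends on the cell size and on how many cells are occupied. The greedy loop also shows why farthest point sampling is the slowest of the three: each new point requires a distance computation against the entire cloud.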