ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Download
Publications Copernicus
Download
Citation
Articles | Volume X-1/W1-2023
https://doi.org/10.5194/isprs-annals-X-1-W1-2023-469-2023
https://doi.org/10.5194/isprs-annals-X-1-W1-2023-469-2023
05 Dec 2023
 | 05 Dec 2023

CONSTRUCTION OF A DUAL-TASK MODEL FOR INDOOR SCENE RECOGNITION AND SEMANTIC SEGMENTATION BASED ON POINT CLOUDS

J. Jiang, Z. Kang, and J. Li

Keywords: Scene recognition, Semantic segmentation, Indoor cognition, Point cloud, Multi-tasking

Abstract. Indoor scene recognition remains a challenging problem in the fields of artificial intelligence and computer vision due to the complexity, similarity, and spatial variability of indoor scenes. The existing research is mainly based on 2D data, which lacks 3D information about the scene and cannot accurately identify scenes with a high frequency of changes in lighting, shading, layout, etc. Moreover, the existing research usually focuses on the global features of the scene, which cannot represent indoor scenes with cluttered objects and complex spatial layouts. To solve the above problems, this paper proposes a dual-task model for indoor scene recognition and semantic segmentation based on point cloud data. The model expands the data loading method by giving the dataset loader the ability to return multi-dimensional labels and then realizes the dual-task model of scene recognition and semantic segmentation by fine-tuning PointNet++, setting task state control parameters, and adding a common feature layer. Finally, in order to solve the problem that the similarity of indoor scenes leads to the wrong scene recognition results, the rules of scenes and elements are constructed to correct the scene recognition results. The experimental results showed that with the assistance of scene-element rules, the overall accuracy of scene recognition with the proposed method in this paper is 82.4%, and the overall accuracy of semantic segmentation is 98.9%, which is better than the comparison model in this paper and provides a new method for cognition of indoor scenes based on 3D point clouds.