ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Articles | Volume V-3-2022
https://doi.org/10.5194/isprs-annals-V-3-2022-77-2022
17 May 2022

IMPROVING SEMANTIC SEGMENTATION PERFORMANCE BY JOINTLY USING HIGH RESOLUTION REMOTE SENSING IMAGE AND NDSM

R. Yang, Q. Dai, H. Cheng, Y. Zhang, N. Chen, and L. Wang

Keywords: Semantic Segmentation, Deep Learning, nDSM, ResNet, Resolution Remote Sensing, Augmentation

Abstract. Semantic segmentation algorithms based on fully convolutional neural networks have greatly improved the segmentation accuracy of high-resolution remote sensing (RS) images. However, interpreting RS images from a single sensor remains challenging because of the variety and complexity of land objects and the extreme imbalance in their sizes and numbers. In contrast, multiple sensors can provide complementary information on the land classes and thus benefit interpretation. In this context, this research explores the joint use of RGB optical bands and a normalized digital surface model (nDSM) to analyze an urban scene. The method first concatenates the three-channel RGB image and the one-channel nDSM band into a four-channel image. A ResNet-101 network with minor adjustments is then used as the backbone to retain multiple kinds of feature information through its residual blocks, and the augmented RGB and nDSM images are used to train the network. The established model was evaluated on the Potsdam test set. Results show that the proposed method achieves 86.85% Overall Accuracy (OA) and 77.42% Mean Intersection over Union (MIOU), which are 6.88% and 11.39% higher, respectively, than the results obtained with RGB images alone. The improvement is especially notable for small targets such as cars and trees. The experimental results show that a simple structural adjustment of the ResNet-101 network can achieve good segmentation performance on RS images (especially for small targets) after combining the RGB channels and the nDSM channel, each augmented twice. In addition, with the addition of nDSM, the accuracy for classes that carry height information, such as buildings and trees, is improved.
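As an illustration of two steps named in the abstract, the following is a minimal NumPy sketch of (a) concatenating an RGB tile with an nDSM band into a four-channel input and (b) computing OA and MIOU from a confusion matrix. It is not the authors' implementation: the function names, the min-max scaling of the nDSM band, and the choice to skip classes absent from both prediction and ground truth when averaging IoU are all assumptions made for this example. The network itself and the augmentation scheme are not shown.

```python
import numpy as np


def stack_rgb_ndsm(rgb, ndsm):
    """Concatenate an (H, W, 3) RGB tile and an (H, W) nDSM tile into (H, W, 4).

    Assumption: RGB is 8-bit and the nDSM is min-max scaled per tile;
    the abstract does not state which normalization was used.
    """
    if rgb.shape[:2] != ndsm.shape:
        raise ValueError("RGB and nDSM tiles must share height and width")
    rgb01 = rgb.astype(np.float32) / 255.0
    ndsm = ndsm.astype(np.float32)
    span = float(ndsm.max() - ndsm.min())
    if span > 0:
        ndsm01 = (ndsm - ndsm.min()) / span
    else:
        # A flat height tile carries no contrast; map it to zeros.
        ndsm01 = np.zeros_like(ndsm)
    # Append the height band as a fourth channel.
    return np.concatenate([rgb01, ndsm01[..., None]], axis=-1)


def confusion_matrix(y_true, y_pred, num_classes):
    """Rows index the reference class, columns the predicted class."""
    idx = y_true.ravel().astype(np.int64) * num_classes + y_pred.ravel().astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


def overall_accuracy(cm):
    """Fraction of pixels whose predicted class matches the reference."""
    return float(np.trace(cm) / cm.sum())


def mean_iou(cm):
    """Mean of per-class IoU = TP / (TP + FP + FN), over classes that occur."""
    tp = np.diag(cm).astype(np.float64)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0
    return float((tp[present] / union[present]).mean())


if __name__ == "__main__":
    # Toy data, purely illustrative.
    rgb = np.zeros((2, 2, 3), dtype=np.uint8)
    ndsm = np.array([[0.0, 1.0], [2.0, 4.0]], dtype=np.float32)
    x = stack_rgb_ndsm(rgb, ndsm)
    print(x.shape)

    y_true = np.array([[0, 0], [1, 1]])
    y_pred = np.array([[0, 1], [1, 1]])
    cm = confusion_matrix(y_true, y_pred, num_classes=2)
    print(overall_accuracy(cm), mean_iou(cm))
```

With a four-channel input like this, the first convolution of a standard three-channel backbone would need to accept four input channels; how the authors made that adjustment is not described in the abstract.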