SEMANTIC SEGMENTATION OF REMOTE SENSING IMAGERY USING AN ENHANCED ENCODER-DECODER ARCHITECTURE
Keywords: Deep Learning, Semantic Segmentation, RUNet, UNet, Remote Sensing, Squeeze-and-Excitation
Abstract. Semantic segmentation is one of the most important computer vision tasks for the analysis of aerial imagery in many remote sensing applications, such as resource surveys, disaster detection, and urban planning. This area of research still faces unsolved challenges, especially in cluttered environments and complex scenes. This study repurposes the Robust UNet (RUNet) architecture for semantic segmentation and augments it with an attention mechanism to enhance feature extraction and the construction of segmentation maps. The attention mechanism is implemented with a Squeeze-and-Excitation (SE) block, and the resulting network is referred to as SE-RUNet. SE is also combined with the classical UNet, termed SE-UNet, to verify the effectiveness of introducing SE. The proposed approach is trained and tested on the “Semantic Segmentation of Aerial Imagery” dataset. The results are evaluated using Accuracy, Precision, Recall, F-score, and mean Intersection over Union (mIoU). Comparative evaluation and experimental results show that using SE to embed an attention mechanism into UNet and RUNet significantly improves the overall performance.
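For readers unfamiliar with the SE block, the following is a minimal sketch of the general squeeze, excitation, and scale steps in plain Python. It is an illustration under stated assumptions, not the implementation used in this study: the function name `se_block`, the weight names `w_reduce` and `w_expand`, the omission of bias terms, and the toy reduction ratio are all hypothetical choices made here.

```python
import math

def se_block(x, w_reduce, w_expand):
    """Apply a bias-free Squeeze-and-Excitation gate to a C x H x W feature map.

    x        : list of C channels, each an H x W list of floats
    w_reduce : (C // r) x C weights of the first fully connected layer (assumed name)
    w_expand : C x (C // r) weights of the second fully connected layer (assumed name)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation, step 1: reduce to C // r units, followed by ReLU.
    h = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w_reduce]
    # Excitation, step 2: expand back to C units, followed by a sigmoid gate in (0, 1).
    s = [1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, h))))
         for row in w_expand]
    # Scale: reweight every value in a channel by that channel's gate.
    return [[[g * v for v in row] for row in ch] for g, ch in zip(s, x)]

# Toy input: C = 2 channels of size 2 x 2, reduction ratio r = 2 (illustrative values).
x = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [2.0, 2.0]]]
w_reduce = [[0.0, 0.0]]    # shape (C // r) x C = 1 x 2
w_expand = [[0.0], [0.0]]  # shape C x (C // r) = 2 x 1
y = se_block(x, w_reduce, w_expand)
```

With all-zero weights every gate equals 0.5, so each channel is simply halved. In a trained network the learned weights would produce a different gate per channel, which is how the block emphasizes informative feature maps and suppresses less useful ones.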