SEMANTIC SEGMENTATION OF UAV IMAGES BASED ON U-NET IN URBAN AREA
Keywords: Semantic Segmentation, UAV, Deep Learning, Convolutional Neural Network, Encoder-Decoder Architecture, U-Net
Abstract. Semantic segmentation of aerial data has been a leading research topic in photogrammetry, remote sensing, and computer vision in recent years. Many applications, including airborne mapping of urban scenes, object positioning in aerial images, and automatic extraction of buildings from remote sensing or high-resolution aerial images, require accurate and efficient segmentation algorithms. Given the high potential of deep learning algorithms for the classification of complex scenes, this paper aims to train a deep learning model to evaluate the semantic segmentation accuracy of UAV-based images in urban areas. The proposed method implements a deep learning framework based on the U-Net encoder-decoder architecture, which extracts and classifies features through layers of convolution, max pooling, activation, and concatenation in an end-to-end process. The obtained results are compared with two traditional machine learning models, Random Forest (RF) and Multi-Layer Perceptron (MLP), which rely on two separate steps: feature extraction followed by image classification. In this study, the experiments are performed on the UAVid2020 semantic segmentation dataset from the ISPRS database. Results show the effectiveness of the proposed deep learning framework, with the U-Net architecture achieving the best result at 75.15% overall accuracy, compared to 52.51% and 54.65% overall accuracy for the RF and MLP algorithms, respectively.
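The abstract names the building blocks of the U-Net encoder-decoder: convolution, max pooling, activation, upsampling, and skip-connection concatenation. The following is a minimal NumPy sketch of these operations on a single-channel feature map; all function names are illustrative and are not the paper's implementation, which would use a deep learning framework with learned weights.

```python
import numpy as np

def conv2d(x, kernel):
    """Valid 2-D convolution of a single-channel map (encoder/decoder conv layer)."""
    kh, kw = kernel.shape
    h, w = x.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Activation: elementwise rectified linear unit."""
    return np.maximum(x, 0.0)

def max_pool2(x):
    """2x2 max pooling with stride 2 (encoder downsampling)."""
    h, w = x.shape
    return x[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def upsample2(x):
    """Nearest-neighbour upsampling by a factor of 2 (decoder upsampling)."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def concat(encoder_map, decoder_map):
    """Skip connection: stack encoder and decoder feature maps along a channel axis."""
    return np.stack([encoder_map, decoder_map], axis=0)

# One encoder-decoder round trip on a toy 8x8 "image".
x = np.arange(64, dtype=float).reshape(8, 8)
f = relu(conv2d(x, np.ones((3, 3)) / 9.0))   # conv + activation -> 6x6 map
p = max_pool2(f)                             # downsample       -> 3x3 map
u = upsample2(p)                             # upsample         -> 6x6 map
skip = concat(f, u)                          # concatenation    -> (2, 6, 6)
```

In the full architecture these operations are repeated at several resolutions, with the concatenated maps passed through further convolutions before the final per-pixel classification layer.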