Multi-Source Fusion-Enhanced Feature Segmentation in Remote Sensing Imagery
Keywords: Remote Sensing Images, Multi-source Data Fusion, Pixel-level Fusion, Semantic Segmentation
Abstract. With the deepening application of deep learning in remote sensing, several challenges persist in the semantic segmentation of optical remote sensing images. These challenges include: (1) a shortage of remote sensing semantic segmentation datasets suitable for deep learning; (2) inadequate utilization of multi-source remote sensing data for semantic segmentation; and (3) limited sample sizes for effective model training, together with the need to improve both training speed and segmentation accuracy. To address these challenges, this study introduces a multi-source remote sensing dataset of 15,000 data pairs, each comprising multispectral imagery, synthetic aperture radar (SAR) imagery, land use and land cover (LULC) data, digital elevation model (DEM) data, and derived terrain layers including slope, aspect, and hillshade. Using an end-to-end network based on Pix2pix, effective fusion and feature enhancement of the multi-source remote sensing data were achieved. The structural similarity index (SSIM), peak signal-to-noise ratio (PSNR), and spectral angle mapper (SAM) reached 0.84, 23.14, and 0.19, respectively, a marked improvement over the baseline Pix2pix model's values of 0.62, 18.84, and 0.23. In the downstream task, the enhanced dataset was used to train semantic segmentation models for remote sensing image analysis, improving both training speed and segmentation accuracy: the mean intersection over union (mIoU) increased from 0.467 to 0.481 and overall accuracy rose from 0.734 to 0.746. Moreover, the visual quality of the segmentation results improved noticeably.
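As an illustrative sketch (not the authors' implementation), the fused outputs could be scored against reference imagery with the three quality metrics reported above. The snippet below assumes H x W x C arrays scaled to [0, 1]; it uses scikit-image for SSIM and PSNR and computes SAM (mean per-pixel spectral angle, in radians) directly with NumPy.

```python
# Hedged sketch: evaluating a fused image against a reference with SSIM, PSNR, and SAM.
# Input shapes (H, W, C) and the [0, 1] value range are assumptions, not from the paper.
import numpy as np
from skimage.metrics import structural_similarity, peak_signal_noise_ratio


def spectral_angle_mapper(reference: np.ndarray, fused: np.ndarray) -> float:
    """Mean spectral angle (radians) between per-pixel spectra of two H x W x C images."""
    ref = reference.reshape(-1, reference.shape[-1]).astype(np.float64)
    fus = fused.reshape(-1, fused.shape[-1]).astype(np.float64)
    dot = np.sum(ref * fus, axis=1)
    norms = np.linalg.norm(ref, axis=1) * np.linalg.norm(fus, axis=1)
    cos = np.clip(dot / np.maximum(norms, 1e-12), -1.0, 1.0)
    return float(np.mean(np.arccos(cos)))


def evaluate_fusion(reference: np.ndarray, fused: np.ndarray) -> dict:
    """Return SSIM, PSNR, and SAM for two images of identical shape and range."""
    ssim = structural_similarity(reference, fused, channel_axis=-1, data_range=1.0)
    psnr = peak_signal_noise_ratio(reference, fused, data_range=1.0)
    sam = spectral_angle_mapper(reference, fused)
    return {"SSIM": ssim, "PSNR": psnr, "SAM": sam}


if __name__ == "__main__":
    # Toy example with random multispectral patches; real use would load the
    # reference imagery and the Pix2pix-fused output instead.
    rng = np.random.default_rng(0)
    reference = rng.random((256, 256, 4))
    fused = np.clip(reference + 0.05 * rng.standard_normal(reference.shape), 0.0, 1.0)
    print(evaluate_fusion(reference, fused))
```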