ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Publications Copernicus
Articles | Volume X-3-2024
https://doi.org/10.5194/isprs-annals-X-3-2024-85-2024
04 Nov 2024

Optimum Combination of Spectral Variables for Crop Mapping in Heterogeneous Landscapes based on Sentinel-2 Time Series and Machine Learning

José Galdino de Oliveira Júnior, Júlio César Dalla Mora Esquerdo, and Rubens Augusto Camargo Lamparelli

Keywords: Remote sensing, Random Forest, SITS, Red Edge, agricultural monitoring

Abstract. This article aimed to determine a workflow for more efficient large-scale crop mapping using a time series of Sentinel-2 satellite images, statistical attribute-selection methods, and machine learning. The proposed methodology explores the best possible combination of vegetation-related spectral variables (16 vegetation indices in the RGB, NIR, SWIR, and Red Edge regions) to characterize the different spectro-temporal profiles of Land Use and Land Cover (LULC) in spatially heterogeneous landscapes. First, we applied a dimensionality reduction analysis using Principal Component Analysis (PCA). Subsequently, the variables that showed the highest statistical correlation with one another were used in the spectro-temporal classification process with the Random Forest, TempCNN, and LightTAE algorithms, following three strategies: C1 (ALL), C2 (BE + IV (Red Edge)), and C3 (BE + IV (without Red Edge)), where ALL denotes all variables, BE the spectral bands, and IV the vegetation indices. The C2 classification scenario (bands B3, B4, B5, B6, B7, B8, and B8A together with the NDRE1, RESI, and MSR indices) achieved the best LULC classification accuracy at the crop pattern level compared with the other scenarios, with average values of 0.91, 0.88, 0.91, 0.89, and 0.89 for Global Accuracy, Producer Accuracy, User Accuracy, Kappa index, and F1-Score, respectively, for the TempCNN model. These results emphasize the importance of both the qualitative and quantitative variability of the sampling data and of variables based on the Red Edge region for improving LULC classification in large-scale heterogeneous landscapes.
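For readers unfamiliar with the five accuracy measures reported above, the following is a minimal sketch of how they are commonly derived from a confusion matrix. It is not the authors' code: the function name `classification_metrics`, the two-class matrix `example_cm`, and the choice to average the per-class values over classes are assumptions made for illustration, and the numbers in the matrix are invented rather than taken from the study.

```python
def classification_metrics(cm):
    """Accuracy measures from a square confusion matrix.

    Assumed convention: cm[i][j] is the number of samples whose
    reference class is i and whose predicted class is j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(n)]                        # reference totals
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]   # predicted totals
    diag = [cm[i][i] for i in range(n)]                              # correct samples

    def ratio(num, den):
        # Guard against empty classes instead of dividing by zero.
        return num / den if den else 0.0

    overall = ratio(sum(diag), total)
    producer = [ratio(diag[i], row_sums[i]) for i in range(n)]  # recall per class
    user = [ratio(diag[i], col_sums[i]) for i in range(n)]      # precision per class
    f1 = [ratio(2 * diag[i], row_sums[i] + col_sums[i]) for i in range(n)]

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ratio(sum(r * c for r, c in zip(row_sums, col_sums)), total * total)
    kappa = ratio(overall - expected, 1.0 - expected)

    return {
        "overall_accuracy": overall,
        "producer_accuracy": sum(producer) / n,  # assumption: mean over classes
        "user_accuracy": sum(user) / n,          # assumption: mean over classes
        "kappa": kappa,
        "f1_score": sum(f1) / n,                 # assumption: macro average
    }


# Hypothetical two-class matrix, for illustration only.
example_cm = [[50, 5],
              [10, 35]]
metrics = classification_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Producer accuracy corresponds to per-class recall and user accuracy to per-class precision, so a scenario can score well on one and poorly on the other when classes are confused asymmetrically.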