ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Articles | Volume X-4/W8-2025
https://doi.org/10.5194/isprs-annals-X-4-W8-2025-751-2026
29 May 2026

Assessing the Feasibility of Landsat-Driven NO2 Prediction: A Spatial Cross-Validation Framework

Amir Tahooni, Ata Kakroodi, Majid Kiavarz, and Hossein Mansourian

Keywords: Nitrogen Dioxide (NO2), Landsat, Machine Learning, Spatial Cross-Validation, Air Pollution Mapping, Tehran

Abstract. Accurate high-resolution mapping of nitrogen dioxide (NO₂) is critical for environmental health studies. While many models rely on direct satellite NO₂ column data, this study investigates an alternative approach: predicting ground-level NO₂ from Landsat imagery combined with topographic and proximity variables. We developed and evaluated Random Forest and XGBoost models on a dataset from Tehran, Iran, using 21 predictors derived from Landsat 8/9, ASTER DEM, and OpenStreetMap. To rigorously assess spatial generalization, we employed three cross-validation strategies: traditional 10-fold CV, leave-one-station-out (LOSO) CV, and cluster-based CV. The results show that performance metrics depend critically on the validation method. Traditional 10-fold CV yielded optimistic results (R² = 0.21–0.26), whereas the spatial methods, LOSO and cluster-based CV, exposed significant generalization challenges. Under LOSO CV, the models achieved a pooled RMSE of ≈45.5 μg/m³, but their mean R² was negative, indicating that predictions at unseen locations were often worse than simply predicting the global mean. Both algorithms showed comparable accuracy, though XGBoost exhibited greater robustness to overfitting. We conclude that Landsat-derived proxies offer a viable but limited pathway for NO₂ estimation: the models captured broad patterns but failed to resolve fine-scale, local variations. This work underscores that rigorous spatial cross-validation is non-negotiable for obtaining a realistic assessment of model performance in air pollution mapping, especially in complex urban environments.
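To illustrate how a pooled RMSE can coexist with a negative mean R² under leave-one-station-out validation, the following is a minimal sketch, not the authors' pipeline. The station labels, NO₂ values, and the `fit_predict` placeholder (which simply predicts the training-set mean) are hypothetical, and R² is assumed to be computed per held-out station against that station's own mean, which is one common convention and may differ from the one used in the paper.

```python
from math import sqrt
from statistics import mean


def r2(y_true, y_pred):
    """Coefficient of determination; negative when the predictions are
    worse than predicting the mean of y_true."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def loso_splits(stations):
    """Yield (held_out_station, train_idx, test_idx), one station at a time."""
    for s in sorted(set(stations)):
        train = [i for i, g in enumerate(stations) if g != s]
        test = [i for i, g in enumerate(stations) if g == s]
        yield s, train, test


def fit_predict(train_y, n_test):
    # Hypothetical placeholder model: predicts the training mean everywhere.
    # A real study would fit Random Forest or XGBoost on the predictors here.
    return [mean(train_y)] * n_test


# Hypothetical toy data: three stations with different NO2 levels (ug/m3).
stations = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
no2 = [40.0, 50.0, 60.0, 80.0, 90.0, 100.0, 20.0, 30.0, 40.0]

per_station_r2 = {}
squared_errors = []
for s, train, test in loso_splits(stations):
    y_test = [no2[i] for i in test]
    y_pred = fit_predict([no2[i] for i in train], len(test))
    per_station_r2[s] = r2(y_test, y_pred)
    squared_errors.extend((t - p) ** 2 for t, p in zip(y_test, y_pred))

# Pooled RMSE uses all held-out predictions together;
# mean R2 averages the per-station scores.
pooled_rmse = sqrt(mean(squared_errors))
mean_r2 = mean(per_station_r2.values())

print(f"pooled RMSE: {pooled_rmse:.2f}")
print(f"mean per-station R2: {mean_r2:.2f}")
```

In this toy setup every station's R² is negative because the placeholder model cannot account for between-station differences in level, while the pooled RMSE remains a finite, seemingly moderate number. The two metrics therefore answer different questions, and reporting only one of them can hide poor spatial generalization.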
