Performance analysis of Bayesian optimised gradient-boosted decision trees for digital elevation model (DEM) error correction: interim results
Keywords: Digital Elevation Model, Copernicus, Bayesian optimisation, Gradient boosted decision trees, Machine learning, Hyperparameter tuning
Abstract. Gradient-boosted decision trees (GBDTs), particularly when tuned with Bayesian optimisation, are powerful machine learning techniques known for their effectiveness in handling complex, non-linear data. However, the performance of these models can be significantly influenced by the characteristics of the terrain being analysed. In this study, we assess the performance of three Bayesian-optimised GBDTs (XGBoost, LightGBM and CatBoost) using digital elevation model (DEM) error correction as a case study. Model performance is investigated across five landscapes in Cape Town, South Africa: urban/industrial, agricultural, mountain, peninsula and grassland/shrubland. The models were trained on a selection of datasets (elevation, terrain parameters and land cover). The comparison covers model execution times, regression error metrics, and the level of improvement in the corrected DEMs. Overall, the optimised models performed well and showed strong predictive capability. CatBoost achieved the greatest improvement in the corrected DEMs, while LightGBM was the fastest model in both Bayesian optimisation and model training. These findings offer valuable insights for applying machine learning and hyperparameter tuning in remote sensing.
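As an illustration of how the level of improvement in a corrected DEM might be quantified, the following minimal sketch compares regression error metrics before and after correction. It rests on several assumptions that are not stated in the abstract: the elevation error is taken as DEM minus reference elevation, the correction subtracts the model-predicted error from the DEM, and improvement is expressed as the percentage reduction in RMSE. All function names and sample values are hypothetical and are not results from the study.

```python
import math


def rmse(errors):
    """Root-mean-square error of a sequence of elevation errors (metres)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(errors):
    """Mean absolute error of a sequence of elevation errors (metres)."""
    return sum(abs(e) for e in errors) / len(errors)


def percent_improvement(before, after):
    """Percentage reduction of an error metric after correction."""
    return 100.0 * (before - after) / before


# Hypothetical sample values (metres), not data from the study.
reference = [10.0, 20.0, 30.0, 40.0]       # assumed reference elevations
dem = [13.0, 16.0, 33.0, 44.0]             # assumed uncorrected DEM elevations
predicted_error = [2.0, -3.0, 2.0, 3.0]    # assumed output of a trained GBDT

# Assumption: error is defined as DEM minus reference elevation.
original_error = [d - r for d, r in zip(dem, reference)]

# Assumption: correction subtracts the predicted error from the DEM.
corrected = [d - p for d, p in zip(dem, predicted_error)]
corrected_error = [c - r for c, r in zip(corrected, reference)]

rmse_before = rmse(original_error)
rmse_after = rmse(corrected_error)
improvement = percent_improvement(rmse_before, rmse_after)

print(f"RMSE before: {rmse_before:.3f} m, after: {rmse_after:.3f} m")
print(f"MAE before: {mae(original_error):.3f} m, after: {mae(corrected_error):.3f} m")
print(f"RMSE improvement: {improvement:.1f} %")
```

In a workflow like the one described, the predicted error would come from a tuned XGBoost, LightGBM or CatBoost regressor rather than a fixed list, and the same metrics could be computed separately for each landscape.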