Satellite-derived precipitation data calibration using ground-based rain gauge observations by means of machine learning methods
Keywords: Precipitation, Bias Correction, Machine Learning, Regression, Remote Sensing
Abstract. Accurate precipitation estimation is vital for water resource management, climate monitoring, and natural hazard assessment. However, satellite precipitation products (SPPs) such as CHIRPS and GPM often exhibit significant biases, particularly over mountainous regions with complex terrain. This study aims to improve precipitation estimates across Mazandaran Province, northern Iran, by integrating satellite data with rain gauge observations through machine learning (ML) approaches. Monthly precipitation records from 190 stations (2000–2024) were combined with SPPs and environmental predictors, including elevation, soil moisture, temperature, and land cover. Two ML models, Extreme Gradient Boosting (XGBoost) and a Multi-Layer Perceptron (MLP), were trained to correct satellite biases and improve the spatial accuracy of precipitation estimates. Both models substantially reduced estimation errors relative to the raw satellite data, with XGBoost performing better than MLP. The mean RMSE decreased by approximately 20–30 mm, and correlation coefficients increased from about 0.4–0.5 to 0.6. Feature importance analysis indicated that GPM, CHIRPS, and elevation were the most influential predictors. Evaluation stratified by elevation, rainfall intensity, and forest cover showed that XGBoost maintained robust performance under diverse environmental conditions, while MLP was more sensitive to topographic variability. Overall, integrating multi-source data with ML-based bias correction shows strong potential for improving precipitation accuracy in regions with complex topography and sparse gauge coverage, supporting more reliable hydrological and climate applications.
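For readers unfamiliar with the two evaluation metrics quoted above, the following is a minimal Python sketch of how RMSE and the Pearson correlation coefficient could be computed for satellite estimates against gauge observations. It is not the study's code: the function names `rmse` and `pearson_r` are hypothetical, and the monthly totals are toy values invented purely for illustration, not data from the study.

```python
import math


def rmse(estimates, observations):
    """Root-mean-square error between paired estimates and gauge observations."""
    if len(estimates) != len(observations) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(e - o) ** 2 for e, o in zip(estimates, observations)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_r(estimates, observations):
    """Pearson correlation coefficient between two paired series."""
    n = len(estimates)
    if n != len(observations) or n < 2:
        raise ValueError("inputs must be of equal length with at least two values")
    mean_e = sum(estimates) / n
    mean_o = sum(observations) / n
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(estimates, observations))
    var_e = sum((e - mean_e) ** 2 for e in estimates)
    var_o = sum((o - mean_o) ** 2 for o in observations)
    if var_e == 0 or var_o == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_e * var_o)


# Toy monthly totals in mm, invented for illustration only (not study data).
gauge = [120.0, 80.0, 45.0, 200.0, 150.0]          # rain gauge observations
satellite_raw = [90.0, 100.0, 70.0, 140.0, 110.0]  # uncorrected satellite product
corrected = [115.0, 85.0, 50.0, 185.0, 145.0]      # hypothetical bias-corrected output

for label, series in (("raw", satellite_raw), ("corrected", corrected)):
    print(f"{label}: RMSE = {rmse(series, gauge):.1f} mm, r = {pearson_r(series, gauge):.2f}")
```

In this sketch a lower RMSE and a higher correlation for the corrected series relative to the raw series would indicate an improvement of the kind the abstract reports.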
