Comparison and evaluation of machine-learning-based spatial downscaling approaches on satellite-derived precipitation data

Abstract: Precipitation estimation with high accuracy and resolution is crucial for hydrological and meteorological applications, particularly in ungauged river basins and regions with scarce water resources. Many machine learning (ML) algorithms have been employed to downscale precipitation; however, it remains unclear which algorithm performs best. To address this issue, this study evaluates the performance of four ML-based downscaling methods for generating high-resolution precipitation estimates at an annual scale. Satellite-derived precipitation data, environmental variables (latitude, longitude, normalized difference vegetation index (NDVI), digital elevation model (DEM), and land surface temperature (LST)), and observations from rain gauges were used to construct the regression models. The four ML algorithms, namely Support Vector Regression (SVR), Random Forest (RF), Spatial Random Forest (SRF), and Extreme Gradient Boosting (XGBoost), were compared with three conventional methods: Multiple Linear Regression (MLR), Geographically Weighted Regression (GWR), and Kriging interpolation. Results showed that the ML-based methods generally outperformed the conventional methods in precipitation downscaling, as they had higher accuracy and were better at reproducing the spatial distribution of rainfall. Among the ML approaches, XGBoost achieved the best performance, followed by SRF, RF, and SVR, indicating its robustness in capturing nonlinear relationships. The better performance of SRF relative to RF and SVR might be because SRF introduces spatial autocorrelation into the RF model, which illustrates the importance of capturing spatial variation in ML algorithms. These findings regarding the comparison and assessment provided a novel downscaling method for generating high-resolution precipitation data, which


INTRODUCTION
Precipitation is an important component of the global water cycle and energy balance (Chen et al., 2021). The amount and distribution of precipitation have a significant impact on water resource management, climate research, and environmental monitoring (Karbalaye Ghorbanpour et al., 2021). Accurate precipitation data are crucial for irrigation planning, reservoir operations, and flood control measures. However, precipitation is also one of the most difficult meteorological factors to measure (Li et al., 2021). Precipitation can be measured in several ways. Rain gauge stations provide high-quality observations with high temporal resolution, but their spatial coverage is very limited (Sinha et al., 2018). Alternatively, satellite-based precipitation data provide wider spatial coverage. However, satellite-derived precipitation estimates are generated at the global scale, and their coarse spatial resolution limits their utility for regional applications such as hydrological modelling and flood forecasting. Thus, downscaling of satellite precipitation is vital for providing precipitation estimates at finer spatial resolutions.
Generally speaking, there are two distinct downscaling techniques: dynamic and statistical downscaling. Both have their own advantages and disadvantages. Dynamic downscaling uses a regional climate model based on strict physical assumptions and is computationally expensive (Shashikanth et al., 2014). Statistical downscaling, on the other hand, is achieved by developing statistical relationships between environmental variables (such as temperature, pressure, and moisture) and precipitation at a coarse spatial resolution. These regression relationships are then used to generate downscaled precipitation data. Statistical downscaling is much easier to use and has been widely applied in many studies (Zhang et al., 2018).
Among statistical downscaling methods, machine learning techniques have become increasingly popular for downscaling precipitation data. For example, Jing et al. (2016) used Support Vector Machine (SVM) to downscale precipitation based on NDVI, DEM, and Land Surface Temperature (LST) over the Tibetan Plateau. Devak et al. (2015) proposed a dynamic framework for downscaling climatic variables by integrating K-Nearest Neighbour and SVM and generating an ensemble of outputs, which performed better than individual models in simulating extreme precipitation events. He et al. (2016) developed an adaptable random forest (RF) model for the downscaling of precipitation, in which single and double RF models were applied for the mean and extreme precipitation events. Yan et al. (2021) constructed a downscaling-merging scheme based on RF and cokriging to acquire high-resolution precipitation data, which greatly improved accuracy and spatial detail. Chen et al. (2021) introduced spatial autocorrelation into the RF model and proposed a spatial random forest (SRF) for downscaling. They found that SRF outperformed conventional algorithms and illustrated the importance of incorporating spatial autocorrelation into ML approaches. Since the Extreme Gradient Boosting (XGBoost) method was proposed, it has been gradually applied in downscaling. For example, Ali et al. (2023) explored XGBoost and Artificial Neural Network (ANN) models for downscaling Gravity Recovery and Climate Experiment (GRACE) satellite terrestrial water storage anomaly (TWSA) estimates for monitoring hydrological droughts, and found that the XGBoost model outperformed the ANN.
Many machine learning algorithms have been used to downscale precipitation data. However, not all ML methods are equally effective, and each method has its own strengths and weaknesses. Therefore, the objectives of this study were (1) to evaluate and compare four machine-learning-based downscaling algorithms for precipitation estimation, and (2) to investigate the benefits and drawbacks of using machine learning methods to improve the spatial resolution of precipitation data.

Study area
Guangdong Province is located in southern China and covers an area of approximately 180,000 km² (Yan et al., 2020). It has a subtropical monsoon climate, characterized by hot and humid summers and mild winters. Guangdong experiences abundant rainfall during the rainy season, which lasts from April to September, and relatively dry weather during the rest of the year (Xin et al., 2021). The terrain of Guangdong Province is mountainous and hilly, with an average elevation of about 200 m. It has a complex topography, with many valleys, basins, and plains. The precipitation patterns in Guangdong Province are influenced by the monsoon climate and the topography. Rainfall is unevenly distributed both spatially and temporally, with the middle areas experiencing heavy rainfall and flooding and other regions receiving less rain.

Dataset and pre-processing
(1) Rain gauge observations
The study region includes 86 rain gauge stations, with a high density in the east and a low density in the west, resulting in an uneven distribution (Fig. 1). Daily precipitation data for the period 2006-2010 were collected from the China Meteorological Data Service Centre (CMDSC, 2022) and had undergone strict quality control (Jiang et al., 2021).
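Because the gauge records are daily while the evaluation is carried out at an annual scale, the daily values have to be aggregated before they can be compared with yearly satellite data. The snippet below is a minimal sketch of that step, assuming that annual precipitation is the plain sum of daily totals and that missing days are skipped; the function name, record layout, and station identifiers are hypothetical and are not taken from the study.

```python
from collections import defaultdict
from datetime import date


def annual_totals(daily_records):
    """Sum daily precipitation (mm) into annual totals per station.

    daily_records: iterable of (station_id, date, precipitation_mm) tuples.
    Records whose precipitation is None (missing) are skipped. This is an
    assumption; a real workflow might reject years with too many gaps.
    """
    totals = defaultdict(float)
    for station_id, day, precip_mm in daily_records:
        if precip_mm is None:
            continue
        totals[(station_id, day.year)] += precip_mm
    return dict(totals)


# Hypothetical records for two made-up stations.
records = [
    ("S001", date(2006, 5, 1), 12.5),
    ("S001", date(2006, 5, 2), 0.0),
    ("S001", date(2007, 6, 10), 30.0),
    ("S002", date(2006, 5, 1), None),  # missing observation
    ("S002", date(2006, 5, 2), 8.0),
]
totals = annual_totals(records)
```

The result is keyed by (station, year), so each gauge contributes one value per validation year.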
(2) PERSIANN-CDR
PERSIANN-CDR provides a long-term and high-resolution precipitation dataset that spans from 1983 to the present with a spatial resolution of 0.25° (Ashouri et al., 2015). PERSIANN-CDR has been validated against a wide range of rain gauge networks and other satellite-based precipitation datasets in different regions and has been shown to have good accuracy and reliability (Miao et al., 2015). In this study, the yearly PERSIANN-CDR data from 2006 to 2010 were obtained from the National Oceanic and Atmospheric Administration (NOAA) National Centres.
(3) Environmental factors
In this study, the Normalized Difference Vegetation Index (NDVI), Digital Elevation Model (DEM), and Land Surface Temperature (LST), which are commonly used as predictors in downscaling models, were adopted. They can influence precipitation through the evapotranspiration process, the orographic effect, and the land surface energy balance (Duan and Bastiaanssen, 2013; Shah et al., 2019; Zhan et al., 2018). Therefore, incorporating these variables into the downscaling model can improve the accuracy of precipitation estimates. The Global Inventory Monitoring and Modelling System (GIMMS) NDVI dataset was adopted; it has a spatial resolution of 8 km and a temporal resolution of 15 days (Tucker et al., 2005). The Shuttle Radar Topography Mission (SRTM) DEM data were used, with a spatial resolution of 90 m (CGIAR, 2022). The LST data were obtained from the Moderate Resolution Imaging Spectroradiometer (MODIS) at a spatial resolution of 1 km and a temporal resolution of 8 days (Krishnan et al., 2022).
In this study, we compared four machine learning algorithms for downscaling precipitation: SVM (applied here in its regression form, SVR), RF, SRF, and XGBoost. These four models were selected because of their wide application in precipitation downscaling and their ability to capture nonlinear relationships between predictor and response variables (Cheng et al., 2022; Sachindra et al., 2018). SVM is a popular algorithm for classification and regression tasks and has been used successfully in precipitation downscaling. RF is an ensemble learning method that can handle a large number of input variables and capture complex interactions between them. SRF is an extension of RF that integrates spatial autocorrelation into the modelling. XGBoost is a gradient boosting method that has shown excellent performance in various prediction tasks. By comparing the performance of these four models, we aimed to provide insights into their strengths and weaknesses for precipitation downscaling applications.
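The general idea behind regression-based downscaling can be illustrated with a small sketch: a relationship between a predictor and satellite precipitation is fitted on the coarse grid and then applied to the same predictor at its finer native resolution. The example below is illustrative only. It substitutes a one-predictor least-squares line for the ML regressors compared in this study, and all values, names, and the choice of NDVI as the single predictor are assumptions.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def downscale(coarse_predictor, coarse_precip, fine_predictor):
    """Fit at the coarse resolution, then predict at the fine resolution."""
    intercept, slope = fit_line(coarse_predictor, coarse_precip)
    return [intercept + slope * x for x in fine_predictor]


# Hypothetical values: predictor aggregated to the coarse satellite grid,
# paired with annual satellite precipitation (mm) on the same grid.
coarse_ndvi = [0.40, 0.50, 0.60, 0.70]
coarse_precip = [1400.0, 1600.0, 1800.0, 2000.0]

# The same predictor at its finer native resolution (more pixels).
fine_ndvi = [0.42, 0.48, 0.55, 0.61, 0.66, 0.69]

fine_precip = downscale(coarse_ndvi, coarse_precip, fine_ndvi)
```

An ML regressor would replace `fit_line` with a nonlinear model trained on several predictors at once, but the fit-coarse, predict-fine structure stays the same.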

Accuracy measures
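To assess the accuracy and reliability of the downscaling results, four commonly used indicators were adopted: correlation coefficient (CC), root mean square error (RMSE), mean absolute error (MAE), and Kling-Gupta efficiency (KGE). KGE is a comprehensive evaluation index that considers three components: correlation, variability, and bias (Liu, 2020). Their equations are as follows:
$$\mathrm{CC} = \frac{\sum_{i=1}^{n}\left(E_i-\bar{E}\right)\left(O_i-\bar{O}\right)}{\sqrt{\sum_{i=1}^{n}\left(E_i-\bar{E}\right)^2}\,\sqrt{\sum_{i=1}^{n}\left(O_i-\bar{O}\right)^2}}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(E_i-O_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|E_i-O_i\right|$$
$$\mathrm{KGE} = 1-\sqrt{\left(\mathrm{CC}-1\right)^2+\left(\frac{\sigma_E}{\sigma_O}-1\right)^2+\left(\frac{\bar{E}}{\bar{O}}-1\right)^2}$$
where $E_i$ is the estimated precipitation at station $i$, $O_i$ is the observed precipitation at station $i$, and $n$ is the number of rain gauge stations; $\bar{E}$ and $\bar{O}$ are the means, and $\sigma_E$ and $\sigma_O$ the standard deviations, of the estimated and observed precipitation, respectively.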
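The four indicators can be computed with a few lines of code. The sketch below assumes their standard definitions: Pearson correlation for CC, and KGE as one minus the Euclidean distance from the ideal point in correlation, variability ratio, and bias ratio. The function names and the sample values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _std(values):
    """Population standard deviation."""
    m = _mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def correlation(est, obs):
    """Pearson correlation coefficient (CC)."""
    mean_e, mean_o = _mean(est), _mean(obs)
    cov = sum((e - mean_e) * (o - mean_o) for e, o in zip(est, obs)) / len(obs)
    return cov / (_std(est) * _std(obs))


def rmse(est, obs):
    """Root mean square error."""
    return math.sqrt(sum((e - o) ** 2 for e, o in zip(est, obs)) / len(obs))


def mae(est, obs):
    """Mean absolute error."""
    return sum(abs(e - o) for e, o in zip(est, obs)) / len(obs)


def kge(est, obs):
    """Kling-Gupta efficiency from correlation, variability, and bias."""
    r = correlation(est, obs)
    alpha = _std(est) / _std(obs)    # variability ratio
    beta = _mean(est) / _mean(obs)   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical annual precipitation (mm) at four stations.
observed = [1500.0, 1800.0, 2100.0, 2400.0]
estimated = [1600.0, 1800.0, 2000.0, 2200.0]

scores = {
    "CC": correlation(estimated, observed),
    "RMSE": rmse(estimated, observed),
    "MAE": mae(estimated, observed),
    "KGE": kge(estimated, observed),
}
```

In this made-up case the estimates are perfectly correlated with the observations, yet KGE stays well below 1 because the estimates vary less than the observations. This shows why KGE complements CC.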

Accuracy analysis of downscaled results based on different models
Table 2 and Table 3 present the performance of the different models in predicting precipitation from 2006 to 2010. As shown, the ML methods generally performed better than the others, with higher CC and KGE. This might be because ML algorithms are more capable of representing the complex, nonlinear relationships between environmental predictors and precipitation, whereas traditional methods assume linear relationships. Meanwhile, ML models can handle outliers in the data and model overfitting more effectively than traditional methods. It is worth noting that GWR models reported good performance in other studies (Chen et al., 2018; Wang et al., 2022; Xu et al., 2015), but GWR performed slightly worse than the ML models in this study. This might be because ML algorithms include feature selection techniques that help identify the most important variables for prediction, whereas GWR does not have built-in feature selection capabilities. Additionally, ML models have more tunable parameters in model training, such as regularization strength and learning rate, which can be adjusted to achieve optimal performance. The superior performance of ML downscaling models can help to improve the accuracy and resolution of precipitation data.
Of the four ML algorithms, XGBoost outperformed the other models, showing the highest correlation coefficient (0.78), the highest KGE (0.57), and the lowest MAE and RMSE (244.43 mm and 305.13 mm, respectively). The better performance of XGBoost might be due to its regularization techniques, which help to prevent overfitting and improve the generalization performance of the model. The SRF model also performed better than SVR and RF, because spatial autocorrelation was introduced into the RF model and SRF was therefore better at handling spatially correlated data.
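One simple way to expose spatial autocorrelation to a tree-based regressor is to add a spatially lagged covariate, that is, a weighted average of the values at neighbouring locations. The sketch below uses inverse-distance weights for this purpose. It is an illustrative assumption and not necessarily the formulation used in the SRF of Chen et al. (2021); the function name, the `power` parameter, and the coordinates are hypothetical.

```python
import math


def spatial_lag(stations, power=2.0):
    """Inverse-distance-weighted mean of the other stations' values.

    stations: list of (x, y, value) tuples in a projected coordinate system.
    Assumes all stations have distinct coordinates (no zero distances).
    Returns one lagged value per station, usable as an extra predictor.
    """
    lags = []
    for i, (xi, yi, _) in enumerate(stations):
        weighted_sum = 0.0
        weight_total = 0.0
        for j, (xj, yj, value_j) in enumerate(stations):
            if i == j:
                continue  # a station does not contribute to its own lag
            distance = math.hypot(xi - xj, yi - yj)
            weight = 1.0 / distance ** power
            weighted_sum += weight * value_j
            weight_total += weight
        lags.append(weighted_sum / weight_total)
    return lags


# Hypothetical stations on a line: (x km, y km, annual precipitation mm).
stations = [(0.0, 0.0, 1000.0), (1.0, 0.0, 2000.0), (3.0, 0.0, 3000.0)]
lagged = spatial_lag(stations)
```

Each lagged value would be appended to the predictor set (NDVI, DEM, LST, coordinates) so that the regressor can learn from nearby precipitation as well as from local conditions.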
The scatter plots in Figure 3 compare the estimated precipitation from the different downscaling models with the observed values. As shown, the result of PERSIANN-CDR was the worst, suggesting that the original satellite data contained large bias and uncertainty. The results presented in the scatter plots are consistent with those in the tables above. The ML algorithms performed better, as their scatter plots displayed a more tightly concentrated distribution around the 1:1 line, indicating closer agreement between the estimated and observed values. Across the scatter plots of all downscaling models, a trend consistent with the original satellite data can be found: the estimates tended to underestimate when observed values were larger and, conversely, to overestimate when observed values were smaller. This was not surprising because all the downscaling models were constructed based on the satellite precipitation, so its over- and under-estimations would be inherited by the models.
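The tendency to overestimate low values and underestimate high values can be quantified by comparing the mean bias in the lower and upper halves of the observations. The following is a minimal sketch of such a check; the split at the median and the sample values are assumptions made for illustration and are not results from this study.

```python
def bias_by_magnitude(est, obs):
    """Mean bias (estimate minus observation) for low and high observations.

    Pairs are sorted by observed value and split into a lower and an upper
    half; this split rule is an illustrative assumption.
    """
    pairs = sorted(zip(obs, est))
    half = len(pairs) // 2
    low, high = pairs[:half], pairs[half:]

    def mean_bias(group):
        return sum(e - o for o, e in group) / len(group)

    return mean_bias(low), mean_bias(high)


# Hypothetical annual precipitation (mm) at four stations.
obs = [1200.0, 1500.0, 1900.0, 2400.0]
est = [1400.0, 1600.0, 1800.0, 2100.0]
low_bias, high_bias = bias_by_magnitude(est, obs)
```

A positive bias in the lower half together with a negative bias in the upper half corresponds to points lying above the 1:1 line at small observed values and below it at large ones.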
Figure 3. Scatter plots between the observed and estimated precipitation based on different models

Spatial distribution of downscaled results
In Figure 4, the spatial distribution patterns of the original PERSIANN-CDR and its downscaled results in 2010 are compared. The downscaled precipitation maps shared similar distribution patterns with the original satellite map, with higher precipitation in the middle and lower precipitation in other areas. This was not surprising, given that all the regression models were trained on satellite precipitation and would therefore exhibit distribution characteristics similar to those of the PERSIANN-CDR map. While the original PERSIANN-CDR annual precipitation map contained mosaic-like pixels due to its coarse resolution, the downscaled maps generated by ML-based algorithms provided more spatial information and replicated the basic spatial features.

Future work
The results of this study suggest that ML-based approaches have significant potential for improving the accuracy of precipitation downscaling, with XGBoost being the most effective in generating high-resolution precipitation data. However, there are opportunities for further improvement in future studies. Firstly, using a larger and more diverse training dataset would help to reduce overfitting and improve the generalization of ML models. The size and quality of training data have a significant impact on the performance of machine learning models, and a larger and more diverse dataset can help enhance their accuracy, generalization ability, and robustness (Liu et al., 2021). Secondly, it is important to involve more rain gauge observations when testing the ML models. In this study, validation results were based on in-situ data from a limited number of rain gauges, which may not be sufficient to accurately capture the spatial variability of precipitation across the entire region (Sun et al., 2022). This could lead to incomplete and potentially biased results. Furthermore, incorporating multiple satellite-derived precipitation data sources could enhance the accuracy and reliability of downscaling models. Although PERSIANN-CDR was used in this study due to its longer temporal coverage and good consistency with measurements, other remote sensing precipitation products have their own advantages and limitations (Miao et al., 2015). Combining multiple satellite-derived precipitation data sources could reduce the uncertainty contained in individual products and provide more reliable precipitation estimates (Arshad et al., 2021).

CONCLUSION
This study evaluated the performance of four ML-based downscaling methods, namely XGBoost, SRF, RF, and SVR, for generating high-resolution satellite precipitation data. Results showed that the ML-based algorithms outperformed conventional methods in terms of CC and KGE, indicating their superior capability in fitting nonlinear relationships between satellite precipitation and environmental variables. Among the four ML-based algorithms, XGBoost and SRF tended to produce the best results, with higher CC and KGE and lower MAE and RMSE in most validation years. The downscaled precipitation maps showed distribution patterns comparable to the original PERSIANN-CDR map, reproducing the basic spatial features and, more importantly, providing enriched spatial information. Overestimation was found at most rain gauges, especially in the middle area and on the eastern side. Overall, this study provides valuable insights into the performance of different downscaling methods for satellite precipitation data, which can help improve the accuracy of precipitation estimates in various applications.

Figure 1. The distribution of rain gauges in Guangdong Province, China.

Figure 2. The flowchart of ML-based downscaling models.

Figure 4. Spatial distribution of downscaled results based on different models.
In particular, GWR showed good accuracy and successfully captured the spatial features of the PERSIANN-CDR distribution. MLR generally had higher CC and KGE values than Kriging, but it struggled to reproduce the spatial distribution and underestimated precipitation in the middle area. Kriging and the original PERSIANN-CDR both had poor CC and KGE results and produced almost the same spatial pattern, possibly because Kriging interpolation only generated smoothed values of the original satellite data. SVR and XGBoost, on the other hand, provided more detail with large spatial variations, as seen in the regions highlighted by the black circles in the downscaled maps (Figure 4g-h), where they reproduced low precipitation levels.

Table 1. Datasets used in this study

Table 3. The performance of different downscaling models using MAE and RMSE