A COMPARISON OF METHODS FOR THE APPROXIMATION AND ANALYSIS OF RAINFALL FIELDS IN ENVIRONMENTAL APPLICATIONS

Digital environmental data are becoming commonplace, and the amount of information they provide is huge yet complex to process, due to the size, variety, and dynamic nature of the data captured by the available sensing devices. Making use of the data largely relies on the availability of efficient methods to extract meaningful information, and requires processing environmental events at the speed at which data are acquired. This paper focuses on the evaluation of methods to approximate observed rain data under real conditions of sparse observations. The novelty lies in the selection of a particularly complex area, the Liguria region in the north-west of Italy, where the orography and the closeness to the sea cause complex hydro-meteorological events. Approximation results are compared at a fine granularity in terms of the cumulated rain interval used, with data gathered from two different rain gauge networks with different characteristics and spatial distributions. Moreover, besides the traditional cross-validation comparison, we provide a qualitative comparison based on the analysis of the number and location of maxima of the approximation. Rain maxima are indeed crucial features of rain fields needed for storm tracking, to support effective monitoring of meteorological events.


INTRODUCTION
The large amount of digital data now available provides an extremely rich, yet difficult to process, body of information about our environment and about geographic and meteorological phenomena. The geographical area selected for presenting our results, the Liguria region in Italy, is an exemplary case study: the territory is characterized by an articulated orography close to the sea, with many small catchment basins that are highly influenced by local maxima of precipitation. Moreover, the proximity to the sea causes additional problems during storms, contributing to the creation of secondary low pressure areas, also known as the Genoa Low, which increase both the amount of precipitation and the risk of critical flash floods. The continuous observation of rain data during critical events, as well as the analysis of historical time series of precipitation, is crucial to support a better understanding and monitoring of hydro-geological risks, such as floods and landslides (Keefer et al., 1987, Hong et al., 2007, Wake, 2013, Hou et al., 2014).
In this context, the paper discusses the evaluation of three different approximation techniques in relation to their suitability to capture the behavior of precipitation events: LR (Locally Refinable) B-Splines, and meshless approximation with kriging and with Radial Basis Functions (RBFs). The comparison of methods for rainfall approximation has been addressed in the literature both at the theoretical level (Scheuerer et al., 2013) and for domain-specific analysis (Skok and Vrhovec, 2006). Our study contributes to this topic by extending the analysis to other approximation techniques, LR B-Splines in particular, and by using a new setting for the comparison, inspired by the theory of topological persistence (Edelsbrunner et al., 2002). The basic idea is that, in order to characterize precipitation events, it is important to focus on the main features of the rainfall fields and their configuration, discarding irrelevant details that do not contribute to understanding the overall event structure. With this motivation in mind, the prominence of precipitation maxima is measured through the notion of persistence, which allows us to organize maxima hierarchically by importance, and possibly to filter out irrelevant (i.e., non-prominent) ones. Based on this, we developed a criterion to compare different approximation methods by analyzing the number and location of the most prominent maxima they produce. Finally, we remark that the interest here is to evaluate the performance of the approximation methods in real conditions of sparsity: the number of measuring gauges is quite low with respect to the area covered, and their distribution is quite uneven. This fact makes the experimental evaluation more interesting.
The comparative study was conducted selecting Liguria as the area of interest, and the precipitation event recorded on September 29, 2013: the precipitation was characterized by light rain with two different thunderstorms, which caused local flooding and landslides. The observed rain data are heterogeneous both in terms of spatial distribution of the rain gauges and of acquisition frequency, therefore adding a further variability that deserves analysis. Moreover, results are shown for the integration of another source of rain data, namely, data extracted from radar acquisitions on the day of the selected event.
For the approximation of the sparse rainfall data, we considered LR B-Splines and two meshless approximations, kriging and radial basis functions. LR B-Splines are particularly useful as a compact representation of functions over large domains: they use a (locally) regular domain parameterization and can be locally refined according to the required approximation error. Ordinary kriging and RBFs use a variogram or a kernel, which are adapted to the spatial distribution of the data (e.g., through the selection of the kernel width). The three approximation methods define slightly different functions, whose behavior is studied both at the numerical level (accuracy, sensitivity to sparseness, computational issues) and at a qualitative level, by measuring the difference among the configurations of precipitation maxima induced by the three approximation techniques.
To better contextualize the comparison discussed, we start with a short overview of related work on rain observation methods, approximation, and comparison techniques (Sect. 2). We present the setting adopted for the evaluation, with details on the rain event and the metrics used for the comparison (Sect. 3). Then, we give the formal definition of the three approximation methods discussed (Sect. 4) and discuss the performance of the approximation schemes with respect to approximation accuracy, sparsity, and computational aspects (Sect. 5). Then, the approximation schemes are compared by analyzing the difference in the configuration and prominence of the detected maxima (Sect. 6). Finally (Sect. 7), we summarize our study.

RELATED WORK
We briefly review previous work on measuring, approximating, and analyzing rainfall data and precipitation fields.
Measuring rainfall data. Rainfall intensities are traditionally derived by measuring the rain rate through rain gauges, weather radar, or by measuring the variations in soil moisture with microwave satellite sensors (Brocca et al., 2014). Even though satellite precipitation analysis allows the estimation of rainfall data at a global scale and in areas where ground measures are sparse, the evaluation of light rainfalls is generally difficult, thus generating an underestimation of the cumulated rainfalls (Kucera et al., 2013). To bypass this issue, in (Brocca et al., 2014) the soil water balance equation is applied to extrapolate the daily rainfall from soil moisture data. The integration of rainfall data at regional and local levels is also intended to provide a more precise approximation of the underlying phenomenon on urban areas, which are sensitive to spatial variations in rainfalls (Segond, 2007). Furthermore, the spatial and temporal variations (e.g., speed, direction) of rainfalls are important to characterize their variability and peaks, together with their effects on catchments.
Approximating rainfall data. Different approaches have been developed for the approximation of rainfall data. In (Thiessen, 1911), the rainfall recorded at the closest gauge is associated with each un-sampled location, by identifying a Voronoi cell around each weather station and assigning the measured rainfall to the respective cell. In 1972, the U.S. National Weather Service proposed to estimate the unknown rainfall values as a weighted average of the neighboring values; the weights are the inverse of the squares of the distances between the un-sampled location and each rainfall sample. The underlying assumption is that the samples are autocorrelated and their estimates depend on the neighboring values. This method has been extended in (Teegavarapu and Chandramouli, 2005) through the modified inverse distance and the correlation weighting method, the inverse exponential and nearest neighbor distance weighting method, and the artificial neural network estimation. In (McRobie et al., 2013), storms are modeled as clusters of Gaussian rainfall cells, where each cell is represented as an ellipse whose axis is in the direction of the movement and the rainfall intensity is a Gaussian function along each axis (Willems, 2001).
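As an illustration of the two classical schemes recalled above, the following minimal sketch assigns the value of the closest gauge (Thiessen) and computes the inverse squared-distance average. The function names, the gauge positions, and the rain values are hypothetical and serve only as an example; this is not the implementation used in the cited works.

```python
import math

def thiessen(stations, query):
    """Return the rainfall of the gauge closest to `query` (Voronoi cell assignment)."""
    _, value = min(stations, key=lambda s: math.dist(s[0], query))
    return value

def inverse_distance(stations, query, power=2.0):
    """Weighted average of the gauge values with weights 1 / distance**power."""
    num = den = 0.0
    for pos, value in stations:
        d = math.dist(pos, query)
        if d == 0.0:
            return value          # the query coincides with a gauge
        w = 1.0 / d ** power      # power=2 gives the inverse squared distance
        num += w * value
        den += w
    return num / den

# Hypothetical gauges: ((x, y) position in km, cumulated rain in mm)
gauges = [((0.0, 0.0), 10.0), ((2.0, 0.0), 0.0), ((0.0, 4.0), 4.0)]
nearest_estimate = thiessen(gauges, (0.5, 0.0))
idw_estimate = inverse_distance(gauges, (1.0, 0.0))
```

The Thiessen estimate is piecewise constant, whereas the inverse-distance estimate varies continuously and reproduces the measured value at each gauge.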
McCuen (McCuen, 1989) proposed the isohyetal method, which allows hydrologists to take into account the effects of different factors (e.g., elevation) on the rainfall field by drawing lines of equal rainfall depth among the rain gauges. Then, the rainfalls at new locations are approximated by interpolation starting from the isohyets. Geo-statistical approaches allow us to take into account the spatial correlation between neighboring samples and to predict the values at new locations (Journel and Huijbregts, 1978, Goovaerts, 1997, Goovaerts, 2000). Furthermore, the geo-statistical estimator can include additional information, such as weather-radar data (Creutin et al., 1988, Azimi-Zonooz et al., 1989) or elevation from a digital model (Goovaerts, 2000, Di Piazza et al., 2011).
Comparing rainfall data approximations. For the comparison of the precipitation fields originated from different approximation schemes, we have adopted a number of standard metrics to assess the differences and performance of the schemes. Moreover, we extend the evaluation approach by comparing the differences in the configurations of meaningful features, namely prominent maxima, of the approximated fields. The motivation for this evaluation is that precipitation maxima convey important information for storm tracking, a crucial analysis of dynamic measures of rain data.
In storm tracking, different sets of meaningful features, associated with distinct time frames, are matched to track their evolution over time. Previous work in this area focuses on the identification of regions of interest on radar images, usually characterized by high reflectivity and sufficiently large area. Various characteristics of these regions, such as centroids, area, major/minor radii, and orientation, are computed. Finally, regions are matched across two consecutive time frames, according to the idea that the best candidate for matching minimizes some distance between the considered characteristics (Lakshmanan and Smith, 2009). For example, the TITAN algorithm (Dixon and Wiener, 1993) combines both the centre of mass and the area of regions for the final tracking decision. The SCIT algorithm (Johnson et al., 1998, Han et al., 2009) forecasts the centroid locations of cells at a given time: regions at the next time step are then assigned to the closest centroid location within a certain radius.
In this paper, we take inspiration from a storm tracking strategy recently proposed in (Biasotti et al., 2015), and apply it to the comparison of the different fields. The approach is based on a topological analysis of rainfall data, which focuses on the most prominent precipitation maxima instead of regions. Indeed, the granularity of the analysis is more appropriate for the characteristics of the geographic area selected; at the same time, the introduction of an ad-hoc bottleneck distance allows for matching prominent maxima of two consecutive time frames, and hence tracking their evolution over time. The same strategy for matching maxima is used to compare the configurations of maxima of different approximation results, treating them as if they were snapshots at different times.

CASE STUDY AND EVALUATION METRICS
The area selected for the evaluation is the Liguria region, in the north-west of Italy. Liguria can be described as a long and narrow strip of land, squeezed between the sea, the Alps, and the Apennines, with the watershed line running at an average altitude of about 1000 m. The orography and the closeness to the sea make this area particularly interesting for hydro-meteorological events, frequently characterized by heavy rain due to Atlantic low pressure areas, augmented by a secondary low pressure area originating over the Ligurian Sea (Genoa Low). Moreover, the many small catchments cause fast flooding events, and even small rivers exhibit high hydraulic energy due to the quick variation of altitude. This is the main motivation behind our analysis, which aims to identify the approximation method best suited to capture important and potentially dangerous precipitation events.
In Liguria, observed rainfall data are captured by two different rain gauge networks. The first rain gauge network is owned by the ARPAL team of Regione Liguria, and consists of 143 professional measuring stations distributed over the whole region; the measures are acquired every 5-20 minutes, and the stations are connected by GPRS and radio link, producing about 2 MB of data per day. The second rain gauge network is owned by the Genova municipality and consists of 25 semi-professional measuring stations spread within the city boundary; the acquisitions are done every 3 minutes, and the stations are linked by GPRS or LAN connections, with an average production of 1 Mb of data per day. The configuration of the rain gauge networks is shown in Fig. 1.
The two rain gauge networks act as sampling devices of the true precipitation field, working at two different scales, that is, with two different spatial distributions. Since the temporal interval is different for each network, we have cumulated the station rainfalls to a step of 30 minutes. This selection is also motivated by the desire to produce a fine-grained evaluation of the approximation methods in the perspective of real-time precipitation monitoring. Note that the cumulated interval is much smaller than the one used in (Skok and Vrhovec, 2006), where an interval of 24 hours was used. Concerning the precipitation events, we selected a rainy day, September 29, 2013, which was characterized by light rain over Liguria with two different thunderstorms that caused local flooding and landslides, without damages. The maximum rain rate over all time steps is 60 mm/30 min and the average rain rate is 1.12 mm/30 min.
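The cumulation to a common 30-minute step can be sketched as follows. This is a minimal illustration under stated assumptions: each reading is taken to be the rain fallen since the previous reading, time stamps are minutes from midnight, and the function name and sample values are hypothetical. With a 30-minute step, one day yields 1440 / 30 = 48 intervals.

```python
from collections import defaultdict

def cumulate(readings, step_min=30):
    """Sum (minute_of_day, rain_mm) readings into consecutive bins of `step_min` minutes."""
    bins = defaultdict(float)
    for minute, rain_mm in readings:
        bins[minute // step_min] += rain_mm   # index of the interval containing the reading
    return dict(bins)

# Hypothetical readings: (minute of the day, rain in mm since the previous reading)
regional_station = [(5, 0.2), (20, 0.4), (35, 1.0), (55, 0.0)]   # irregular 5-20 min sampling
city_station = [(m, 0.1) for m in range(0, 60, 3)]               # one reading every 3 minutes

regional_30 = cumulate(regional_station)
city_30 = cumulate(city_station)
steps_per_day = 24 * 60 // 30
```

After this step, stations with different acquisition frequencies provide values on the same time grid and can be used together in one approximation.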
To establish a formal evaluation setting, let us formulate the problem of rainfall approximation as follows. Given a set of points $P := \{p_i\}_{i=1}^{n}$, which represent the positions of the measurement instruments, let $f : P \to \mathbb{R}$ be the precipitation field, measured at the $n$ sample points. An approximation of $f$ is defined as $F : \mathbb{R}^2 \to \mathbb{R}$ such that $d(F(p), f(p)) \le \epsilon$ for some required distance $d(\cdot, \cdot)$ and threshold $\epsilon$. When $d(F(p), f(p)) = 0$, the approximation is an interpolation of $f$. The map $F$ can be used to evaluate the value of the precipitation at any point other than those in $P$, with results differing according to the approach used to define $F$. In our case, we will consider three different approximation functions $F$. The color coding used for the illustration of the computed approximation ranges from blue (i.e., smallest rain rate) to brown (i.e., highest rain rate), passing through green, yellow, and orange (i.e., intermediate values).
To compare the approximations, we adopt a cross-validation strategy, implemented in two ways. First, every rainfall station at $p_i$ is iteratively turned off, that is, it is not used in the computation of $F$; the approximation function $F$ obtained is sampled at that position $p_i$ and compared with the rain value measured at $p_i$, which acts as a ground truth (leave-one-out strategy). Second, the rainfall data measured by the municipality stations are used as ground truth to validate the values approximated from the ARPAL data set: in this setting, the cross-validation aims at evaluating the capability of the different methods to estimate the local features of rain fields interpolated over a sparse data set, with different spatial distribution.
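The leave-one-out strategy can be sketched as below. The nearest-gauge estimator is only a stand-in for the three schemes compared in the paper, and the function names, the sample gauges, and the particular error statistics (maximum absolute error, mean squared error, standard deviation) are assumptions chosen for illustration.

```python
import math

def nearest_gauge(stations, query):
    """Stand-in approximation F: value of the closest remaining gauge."""
    return min(stations, key=lambda s: math.dist(s[0], query))[1]

def leave_one_out(stations, approximate):
    """Return the errors F(p_i) - f(p_i), where F is built without station i."""
    errors = []
    for i, (pos, measured) in enumerate(stations):
        others = stations[:i] + stations[i + 1:]      # turn off station i
        errors.append(approximate(others, pos) - measured)
    return errors

def summary(errors):
    """Maximum absolute error, mean squared error, and standard deviation of the errors."""
    n = len(errors)
    mean = sum(errors) / n
    return {
        "max_abs": max(abs(e) for e in errors),
        "mse": sum(e * e for e in errors) / n,
        "std": math.sqrt(sum((e - mean) ** 2 for e in errors) / n),
    }

# Hypothetical gauges: ((x, y) position in km, cumulated rain in mm)
gauges = [((0.0, 0.0), 10.0), ((2.0, 0.0), 0.0), ((0.0, 4.0), 4.0)]
errors = leave_one_out(gauges, nearest_gauge)
stats = summary(errors)
```

Any other approximation with the same call signature can be passed in place of `nearest_gauge`, so the same loop serves all the schemes under comparison.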
The comparative study also includes the analysis of the spatial configuration of local maxima extracted from the rainfall fields produced by each approximation scheme. In this case, local maxima are endowed with a notion of prominence borrowed from topological persistence, which is used to quantify the importance that a maximum has in characterizing the associated rainfall field. This comparative analysis is motivated by the fact that, in order to understand the evolution of precipitation and track its changes over time, local maxima and their configuration provide a useful description for capturing the important elements of the underlying rainfall field. Indeed, they have a relevant semantic content and, at the same time, are formally well-defined. For this set of experiments, the approximated rainfalls were sampled at the vertices of a digital terrain model (DTM), producing a discretization of the precipitation field whose maxima were compared. The DTM used comes from the SRTM (Shuttle Radar Topography Mission (Farr et al., 2007)), available in the public domain at the URL http://www2.jpl.nasa.gov/srtm/, with a spatial resolution of 100 m.

THEORETICAL BACKGROUND
In the following, we give an overview of the three approximation methods compared and of the persistence analysis framework used to analyze the evolution of precipitations (Fig. 2).

Approximation schemes
LR B-Splines. The rainfall values are parameterized on the xy-values of the corresponding geographic location and the rainfall is approximated by a 2.5D LR B-spline surface (Dokken et al., 2013). The approximation of the rainfall data is performed by an iterative procedure starting from a lean tensor-product B-spline surface that is constantly equal to zero. At each iteration, the distance between the current surface and the rainfall data is computed, the surface is refined locally where a given tolerance is not met, and the surface coefficients are updated using Multilevel B-spline Approximation (MBA) (Lee et al., 1997) adapted for LR B-splines. The MBA method is a local and explicit approximation method, where the surface coefficients are updated based on the data points situated in the support of the corresponding B-spline. The performance depends on three components, which are executed at each iteration step: the refinement of the LR B-spline, the distance computations, and the update of the surface coefficients. The latter two are the most time consuming. For each iteration, the coefficients are updated twice and one additional distance computation is performed, that is, three passes over the data. Let the number of data points be N. The number of non-zero B-splines for each data point varies, but is of the order of (d1 + 1) × (d2 + 1), where d1 and d2 are the polynomial degrees in the two parameter directions of the surface. The surface is bi-quadratic, so d1 = d2 = 2. In our tests, the algorithm is run with 20 iterations, giving a total of 3 × 20 × N × 9 bi-variate B-spline evaluations.
Implicit approximation with radial basis functions. The approximation is defined as a linear combination of radial basis functions centered at the sample points, $F(p) := \sum_{i=1}^{n} \alpha_i \varphi(\|p - p_i\|)$, where ϕ is the kernel function (Aronszajn, 1950, Dyn et al., 1986, Micchelli, 1986, Patanè et al., 2009). Depending on the properties of ϕ, we distinguish globally-supported (Carr et al., 2001, Turk and O'Brien, 2002) and compactly-supported (Wendland, 1995, Morse et al., 2001) radial basis functions. Then, the coefficients $(\alpha_i)_{i=1}^{n}$ solve an n × n linear system, which is obtained by imposing the interpolating constraints $F(p_i) = f(p_i)$, i = 1, ..., n. Since an n × n linear system is solved once, the computational cost of the approximation with globally- and locally-supported RBFs is $O(n^3)$ and $O(n \log n)$, respectively. In our experiments, we have chosen the Gaussian kernel ϕ(st) := exp(−st), which has a global support; in fact, its fast decay makes it suitable to approximate rainfalls with a sparse spatial distribution that change quickly in time. To this end, the width of each basis function is automatically adapted to the local sampling density, by selecting it according to the local spatial distribution of the rainfall stations (Dey and Sun, 2005, Mitra and Nguyen, 2003).
Kriging. The previous two approximation methods do not explicitly take into account the correlation among observations, which may have unwanted effects, especially in the case of unevenly distributed observations. Furthermore, there is no natural mechanism for propagating the individual quality of the observations into a quality description of the estimation. A class of methods that takes care of these issues is kriging (Wackernagel, 2003), which is a common technique in environmental sciences and a special case of maximum likelihood estimation. The underlying assumptions are that the quality of the observations is given as variance values, and that the covariance between observations only depends on their mutual spatial or temporal distance, and not on their location. Formally, kriging is expressed as $F(p) := \sum_{i=1}^{n} \omega_i f(p_i)$, where the weights $\omega_i$ are defined as $\Omega = C^{-1} D$, with $C$ the covariance matrix of the measured values and $D$ the matrix defined by the covariance between the known values and the points to be estimated. The covariance is expressed by the variogram model, which reflects the priors on the spatial variability of the values. The main problem with kriging is the low computational efficiency, as the solution of the linear systems scales quadratically with the number of observations. In the implementation used, the problem is addressed by combining kriging with deterministic spatial division techniques, which efficiently restrict the number of observations to the closest ones. More specifically, a Kd-tree is used to select only the 20 closest neighbors for the matrix inversion.
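The kriging estimate restricted to the closest observations can be sketched as follows. This is a minimal illustration under stated assumptions and not the implementation used in the experiments: the exponential covariance model and its parameters are hypothetical, the neighbors are selected by a plain sort instead of a Kd-tree, and the weights follow the formula Ω = C⁻¹D as written above, without any additional unbiasedness constraint.

```python
import math

def covariance(h, sill=1.0, corr_range=10.0):
    """Hypothetical exponential covariance model as a function of the distance h."""
    return sill * math.exp(-h / corr_range)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]          # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):                             # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def local_kriging(stations, query, k=20):
    """Estimate F(query) = sum_i w_i f(p_i), with w = C^{-1} D on the k nearest gauges."""
    near = sorted(stations, key=lambda s: math.dist(s[0], query))[:k]
    c = [[covariance(math.dist(p, q)) for q, _ in near] for p, _ in near]
    d = [covariance(math.dist(p, query)) for p, _ in near]
    weights = solve(c, d)
    return sum(w * value for w, (_, value) in zip(weights, near))

# Hypothetical gauges: ((x, y) position in km, cumulated rain in mm)
gauges = [((0.0, 0.0), 10.0), ((2.0, 0.0), 0.0), ((0.0, 4.0), 4.0)]
estimate = local_kriging(gauges, (1.0, 1.0))
```

Limiting the system to k neighbors keeps each solve at a fixed small size, so the cost per estimated point does not grow with the total number of gauges.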

Prominent rainfall maxima via persistence analysis
The importance of precipitation maxima is evaluated by means of persistence analysis. Given a scalar field $F : M \to \mathbb{R}$ (e.g., the interpolated rainfall field) and sweeping $t$ in $\mathbb{R}$ from high to low values, connected components of the superlevel sets $M_t = \{p \in M : F(p) \ge t\}$ are either born, or previously existing ones are merged together. A connected component is associated with the local maximum $p$ of $F$ where the component is first born. When two components corresponding to local maxima $p_1$, $p_2$, with $F(p_1) < F(p_2)$, merge together, the component corresponding to $p_1$ dies, that is, the component associated with the smaller local maximum is merged into the one associated with the larger maximum. Each local maximum $p$ of $F$ is associated with a persistence value $\mathrm{pers}_F(p)$, which is defined as the difference between the birth and the death level of the corresponding connected component. Maxima associated with a higher persistence value identify relevant features and structures of the underlying phenomena.
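The sweep described above can be sketched with a union-find structure over the sampled vertices. This is a minimal illustration, not the algorithm used in the experiments: the function name and the sample field are hypothetical, the field is assumed to be sampled on a connected graph, and the global maximum, which never dies, is assigned the persistence max F − min F by convention.

```python
def maxima_persistence(values, neighbors):
    """Return {vertex of a local maximum: persistence} for a field sampled on a graph.

    values[v] is the field at vertex v; neighbors[v] lists the vertices adjacent to v.
    """
    order = sorted(range(len(values)), key=lambda v: values[v], reverse=True)
    parent = {}        # union-find forest over the vertices swept so far
    peak = {}          # root -> vertex of the highest maximum of its component
    persistence = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]    # path halving
            v = parent[v]
        return v

    for v in order:                          # sweep the level t from max F down to min F
        parent[v] = v
        peak[v] = v                          # tentatively, v starts a new component
        for u in neighbors[v]:
            if u not in parent:
                continue                     # u lies below the current level
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            if values[peak[ru]] < values[peak[rv]]:
                low, high = ru, rv
            else:
                low, high = rv, ru
            dying = peak[low]
            if dying != v:                   # a real maximum dies at level values[v]
                persistence[dying] = values[dying] - values[v]
            parent[low] = high               # merge into the component with the higher peak
    top = peak[find(order[0])]
    persistence[top] = values[top] - values[order[-1]]
    return persistence

# Hypothetical field sampled along a line of seven vertices
field = [1, 5, 2, 4, 0, 3, 1]
chain = [[j for j in (i - 1, i + 1) if 0 <= j < len(field)] for i in range(len(field))]
result = maxima_persistence(field, chain)
```

For a field sampled at the vertices of a terrain model, `neighbors` would list the vertices sharing an edge with each vertex; the sweep itself is unchanged.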
To compute the local maxima and the associated persistence val-

APPROXIMATION BEHAVIOR
The first set of results we discuss is related to the comparison of the behavior of the three methods according to approximation performance and computational complexity. Concerning the leave-one-out cross-validation strategy, we have checked the results by computing the three approximation fields turning off, iteratively, each rainfall station at $p_i$, for each cumulated interval. The value of the approximation function $F$ obtained was then compared at $p_i$ with the rain value measured by the corresponding rain gauge at $p_i$, acting as a ground truth. The statistics of the evaluation are shown in Table 1; in this case, ordinary kriging and LR B-Splines have the smallest maximum error, while the RBFs have a smaller mean squared error (MSE) and standard deviation. Fig. 4 plots the MSE distribution of the three methods for each time interval.
The second set of results concerns the cross-validation done using the rainfall data measured by the municipality stations as ground truth to validate the values approximated from the ARPAL data set only. This validation aims at gathering indicators on the behavior, in terms of accuracy, on different spatial distributions of the sample points. This approach is meaningful as the two observation networks cover an overlapping region of the study area. The network of the Genova municipality is located within the boundary of the city and is denser than the ARPAL one, which covers the whole study area, and some of the ARPAL stations are located in the Genova municipality. Comparing the approximation results at these two scales, we have evaluated the sensitivity of the approximation to local distributions of the samples and the capability to estimate the local features of rain fields interpolated over a sparse data set.
Finally, concerning computational complexity, the different algorithms have been tested on a 64-bit workstation with 8 cores at 1.6 GHz and 16 GB of RAM. The system runs Ubuntu 14.04 LTS with the 3.13.0 kernel. The run of LR B-Splines takes 19.33 seconds to compute the approximation over the whole region (20K points) for the 48 time intervals. For the same task, ordinary kriging takes 1.746 seconds and the RBF approximation takes 6.23 seconds. One important point to make here is that, for all the methods, the timings collected are well below the time interval analyzed (30 min). This important characteristic tells us that any of them could be used for real-time monitoring of rain events. The analysis carried out so far does not tell us much about the scalability of the methods for a larger set of observation points, where the computational complexity could become an issue. Preliminary results for this situation are presented later on, in Fig. 5, where results of kriging obtained by integrating radar rain data are shown.

PERSISTENT RAINFALL MAXIMA
Tables 3 and 4 report the comparative results about the extraction of persistent maxima when considering the rainfall fields produced by the three approximation schemes. For our tests, rainfall fields have been computed based on the data set described in Sect. 3. Hence, for each approximation scheme, we considered the 48 approximated fields, one for each cumulative step.
For each field $F$, the associated persistent maxima have been extracted according to four different values for the persistence threshold ε, namely $\epsilon = \tau (\max F - \min F)$ with τ = 0.05, 0.15, 0.25, 0.35. In practice, a maximum is preserved only if its persistence is larger than ε, while the others are filtered away. Table 3 reports the total number of extracted persistent maxima, averaged over the number of considered cumulative steps; Table 4 shows the maximum number of local maxima that have been extracted, method by method, from the 48 rainfall fields. Despite some slight differences in the results, the general trend is to have a decreasing number of persistent maxima as the threshold τ increases. This is not surprising, since a higher persistence threshold implies that a larger portion of local maxima is pruned out. Also, for low values of the persistence threshold, we can relate the number of detected maxima to the smoothness of the considered approximation: in this view, the RBF scheme appears to have a higher smoothing effect, as indicated by the smaller number of maxima characterized by a low persistence value.
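The filtering rule can be illustrated with a short sketch. The function name, the maxima labels, and the persistence values are hypothetical and do not come from the experiments; they only show how the count of retained maxima changes with τ.

```python
def persistent_maxima(persistence, field_min, field_max, tau):
    """Keep the maxima whose persistence exceeds eps = tau * (max F - min F)."""
    eps = tau * (field_max - field_min)
    return {v: p for v, p in persistence.items() if p > eps}

# Hypothetical persistence values (mm) of the maxima of one field with range [0, 20] mm
pers = {"A": 20.0, "B": 6.5, "C": 2.5, "D": 0.4}
counts = {tau: len(persistent_maxima(pers, 0.0, 20.0, tau))
          for tau in (0.05, 0.15, 0.25, 0.35)}
```

Since the retained set for a larger τ is always contained in the set for a smaller τ, the count can only stay the same or decrease as τ grows.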

Comparing sets of persistent maxima
In order to refine the above comparative analysis, we make use of the tracking procedure introduced in (Biasotti et al., 2015) to quantitatively assess a (dis)similarity measure between two sets of local maxima, originated from the three approximation schemes when considering the same cumulative step. Before presenting results, we briefly recall the main ideas of (Biasotti et al., 2015).
For two sets $\mathcal{F}$, $\mathcal{G}$ of local maxima of two rainfall fields $F, G : M \to \mathbb{R}$, it is possible to compare them by measuring the cost of moving the points associated with one function to those of the other one, with the requirement that the longest of the transportations should be as short as possible. Interpreting the local maxima in $\mathcal{F}$ and $\mathcal{G}$ as points in $\mathbb{R}^3$ (i.e., geographical position and persistence value), the collections of local maxima are compared through the bottleneck distance between $\mathcal{F}$ and $\mathcal{G}$, which is defined as $d_B(\mathcal{F}, \mathcal{G}) := \min_{\gamma} \max_{p \in \mathcal{F}} d(p, \gamma(p))$, where $p \in \mathcal{F}$, $\gamma$ ranges over all the bijections between $\mathcal{F}$ and $\mathcal{G}$, $d(\cdot, \cdot)$ is the pseudo-distance $d(p, q) := \min\{\|p - q\|, \max\{\mathrm{pers}_F(p), \mathrm{pers}_G(q)\}\}$, which measures the cost of moving $p$ to $q$, and $\|\cdot\|$ is a weighted modification of the Euclidean distance. In practice, the cost of taking $p$ to $q$ is measured as the minimum between the cost of moving one point onto the other and the cost of moving both points onto the plane $xy : z = 0$. Matching a point $p$ with a point of $xy$, which can be interpreted as the annihilation of $p$, is allowed by the fact that the number of points for $\mathcal{F}$ and $\mathcal{G}$ is usually different. The matching $\gamma$ between the points of $\mathcal{F}$ and those of $\mathcal{G}$ for which $d_B$ is attained is referred to as a bottleneck matching (Fig. 6). Through the bottleneck matching and the bottleneck distance, it is then possible to derive quantitative information about the differences in the spatial arrangement and the rain measurements for the points in $\mathcal{F}$ and $\mathcal{G}$.
The bottleneck distance can be evaluated by applying a pure graph-theoretic approach or by taking into account geometric information that characterizes the assignment problem. We opt for a graph-theoretic approach, which is independent of any geometric constraint, and our implementation is based on the push-relabel maximum flow algorithm (Cherkassky and Goldberg, 1997). For each iteration, the algorithm runs in $O(k^{2.5})$, where $k$ is the number of local maxima involved in the comparison. We note that the computational complexity is not an issue, because the number of points to be considered is very limited in general. For example, in tracking applications the number of persistent maxima to be monitored is usually no more than a dozen for each time sample.
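The graph-theoretic evaluation can be sketched as a search for the smallest cost threshold that still admits a perfect matching. The sketch below is an illustration under stated assumptions and not the implementation used in the experiments: it replaces the push-relabel maximum flow algorithm with a simpler augmenting-path matching test, uses a Euclidean norm with unit weights in place of the weighted norm, represents each maximum as a hypothetical tuple (x, y, persistence) with normalized coordinates, and pads the smaller set with `None` entries that stand for annihilation on the plane z = 0.

```python
import math

def cost(p, q, weights=(1.0, 1.0, 1.0)):
    """Pseudo-distance d(p, q); None stands for a point annihilated on the plane z = 0."""
    if p is None and q is None:
        return 0.0
    if p is None or q is None:
        real = p if p is not None else q
        return real[2]                               # persistence of the unmatched maximum
    norm = math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))
    return min(norm, max(p[2], q[2]))

def has_perfect_matching(c, limit):
    """Augmenting-path test: can every row be paired using only costs <= limit?"""
    n = len(c)
    match = [-1] * n                                 # match[j] = row paired with column j

    def augment(i, seen):
        for j in range(n):
            if c[i][j] <= limit and j not in seen:
                seen.add(j)
                if match[j] == -1 or augment(match[j], seen):
                    match[j] = i
                    return True
        return False

    return all(augment(i, set()) for i in range(n))

def bottleneck_distance(f_max, g_max):
    """Smallest possible value of the largest transport cost over all bijections."""
    n = max(len(f_max), len(g_max))
    f_pad = list(f_max) + [None] * (n - len(f_max))
    g_pad = list(g_max) + [None] * (n - len(g_max))
    c = [[cost(p, q) for q in g_pad] for p in f_pad]
    for limit in sorted({x for row in c for x in row}):
        if has_perfect_matching(c, limit):
            return limit
    return 0.0                                       # both sets are empty

# Hypothetical maxima as (x, y, persistence), all normalized to [0, 1]
maxima_f = [(0.2, 0.2, 0.5), (0.8, 0.8, 0.05)]
maxima_g = [(0.25, 0.2, 0.45)]
distance = bottleneck_distance(maxima_f, maxima_g)
```

In this example the prominent maxima of the two sets are paired with each other and the low-persistence maximum is annihilated, so the distance is driven by the small displacement between the two prominent maxima.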

Experimental results
For each cumulative step, we consider the rainfall fields interpolated by the three methods, and extract the sets of local maxima according to the four persistence thresholds discussed above. For each threshold, the three collections of persistent maxima are pairwise compared as follows. Since geographic coordinates and rainfall measurements come with different reference frames and at different scales, local maxima to be matched are first normalized so that their coordinates range in [0, 1]; then, they are processed by computing the associated bottleneck matching and the bottleneck distance, and afterwards projected back to the original reference frames. Finally, a measure of their distance in terms of both geographical coordinates and rainfall values is derived by combining the information contained in the bottleneck matching and the associated numerical (dis)similarity score. Precisely, we consider the geographical and rainfall distances, which are defined as the largest difference in geographical position and rainfall value, respectively, for two persistent maxima that have been paired by the bottleneck matching.
Tables 5 and 6 report the obtained results, in terms of geographical and rainfall distances, respectively, averaged over the total number of considered cumulative steps. To have a clearer picture of the comparative evaluation in terms of the two distances, the results included in Tables 5 and 6 should be jointly interpreted for each persistence threshold. For instance, when τ = 0.05 we have (relatively) high values for the geographical distance together with quite low rainfall distance values: this can be interpreted as slight numerical variations among the three approximations, possibly appearing spatially far from each other. Note, however, that in this view the RBF and the kriging techniques appear to have a more similar behavior, both producing higher values for the geographical and rainfall distances when compared with LR B-Splines. On the other hand, moving to higher persistence thresholds, the values of the geographical distance decrease, as an effect of filtering out non-relevant maxima. As a consequence, the corresponding rainfall distance values now reveal the differences occurring at prominent maxima, which appear to be quite small.
We conclude by proposing in Table 7 a similar analysis to compare the results obtained when rainfall fields are interpolated by considering either observed rainfall measurements alone or an integration of these data with radar acquisitions (Sect. 5). Indeed, integrated data can prove useful, e.g., for tracking applications: although rainfall measurements are more reliable, integrating them with radar data makes it possible to extend the rainfall field interpolation to larger areas and to obtain a clearer picture of the temporal evolution of the associated precipitation event.
As can be seen from the results in Table 7, which are characterized by high values of both the geographical and the rainfall distance, considering radar data can noticeably change the spatial location and the rainfall value of persistent maxima. This can be interpreted as the introduction of complementary information with respect to rainfall measurements, which hopefully can help in gaining a clearer understanding of precipitation events.

CONCLUSIONS AND FUTURE WORK
The aim of this study was to compare different spatial approximation methods for computing rainfall amounts in support of hydro-meteorological analysis and civil protection. As a final remark, we point out that all these approaches easily support the integration of further sources of rain measures, for instance those captured by radar. Fig. 5 shows the estimation results achieved by kriging (a) with the rain gauges only and (b) with the integration of these data with radar information. In (c,d), we show the contribution of each data set to the estimated map plotted in (b). The color scale varies from blue to red, corresponding respectively to a contribution varying from null to full. Since we have selected only two different data sets, the two maps in (c,d) are complementary.
Finally, we plan to proceed further with the presented comparison framework, including several more aspects and extending the evaluation to a more elaborate correlation analysis that takes into account other relevant data, such as terrain morphology, satellite imagery, and the meteorological situation. We will further investigate this possibility, and especially the effect of the approximation results on storm tracking.
Figure 1: (a) Input rainfall measures at 143 stations (regional level, white points) and 25 stations (municipality level, red circles). (b) Map of the maximum rain rate recorded at each weather station, which highlights that only the central west of the region was affected by heavy rain, while the remaining part experienced drizzle.

Figure 3: A function F : M → R, color-coded from blue (low) to red (high) values, and the associated local maxima having persistence greater than α(max F − min F), with α = 0.05 (left), 0.15 (middle) and 0.25 (right).

Figure 4: Leave-one-out cross-validation MSE: the y-axis reports the MSE [mm²] for each time step on the x-axis.

... values, F is interpolated on the vertices of a triangle mesh M. The points of M are first sorted by decreasing value, from max F to min F; then, the classical 0th-persistence algorithm (Edelsbrunner et al., 2002, Edelsbrunner and Harer, 2010) is used. The cost of sorting the n points of M is O(n log n); by using a union-find data structure, the persistence algorithm requires linear storage and running time at most proportional to mα(m), where m is the number of edges in the mesh and α(·) is the inverse of the Ackermann function. An example of the extraction of local maxima at three different persistence levels is given in Fig. 3.
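The sweep described above can be sketched in Python as follows. This is an illustrative sketch rather than the code used for the experiments: the mesh is assumed to be connected and given as a list of vertex values plus a list of edges (pairs of vertex indices), the function name `persistent_maxima` is hypothetical, and the global maximum is conventionally assigned persistence max F − min F. The union-find uses path halving without union by rank, so the near-linear bound quoted above is not strictly attained by this simplified version.

```python
def persistent_maxima(values, edges, alpha):
    """Local maxima of a scalar field on a connected graph whose 0th-persistence
    exceeds alpha * (max - min). Returns {vertex index: persistence}."""
    n = len(values)
    adjacency = [[] for _ in range(n)]
    for u, v in edges:
        adjacency[u].append(v)
        adjacency[v].append(u)

    parent = [None] * n   # None marks a vertex not yet swept
    peak = {}             # component root -> vertex of its highest maximum
    persistence = {}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Sweep the vertices by decreasing value, from max F to min F.
    order = sorted(range(n), key=lambda i: values[i], reverse=True)
    for v in order:
        parent[v] = v
        peak[v] = v
        for w in adjacency[v]:
            if parent[w] is None:
                continue
            root_v, root_w = find(v), find(w)
            if root_v == root_w:
                continue
            # Elder rule: the component with the lower peak dies at values[v].
            if values[peak[root_v]] >= values[peak[root_w]]:
                older, younger = root_v, root_w
            else:
                older, younger = root_w, root_v
            persistence[peak[younger]] = values[peak[younger]] - values[v]
            parent[younger] = older

    # The global maximum never dies; assign it the full value range.
    value_range = values[order[0]] - values[order[-1]]
    persistence[peak[find(order[0])]] = value_range

    threshold = alpha * value_range
    return {v: p for v, p in persistence.items() if p > threshold}


# Hypothetical 1D example: a path graph with three local maxima.
field = [1, 5, 2, 4, 0, 3]
path_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]
relevant = persistent_maxima(field, path_edges, 0.5)
```

Vertices that are not local maxima receive persistence zero when they join an existing component, so they are discarded by the strict threshold test for any α ≥ 0.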

Figure 5: Ordinary kriging approximation of rainfall computed with (a) rain gauges only and (b) rain gauges integrated with radar measurements. (c) Rain gauge weights and (d) radar data set weights in the map shown in (b).

... a sparser data set. The results are shown in Table 2: in this case, ordinary kriging and LR B-Splines have the smaller maximum error, but the RBFs have a smaller mean squared error.

Figure 6: Two fields F, G : M → R, color-coded from blue (low) to red (high) values, and the associated local maxima. On the right, bottleneck matching between local maxima.

Table 1: Summary statistics for the error distribution of the cross validation.

Table 2: Summary statistics for the error distribution of the accuracy evaluation at different scales.

Table 3: Statistics for the average number of extracted persistent maxima.

Table 4: Statistics for the maximum number of extracted persistent maxima.

Table 5: Averaged geographical distance (measured in km) between sets of local maxima (Liguria area size: 5,410 km²).

Table 6: Averaged rainfall distance (measured in mm) between sets of local maxima.

Table 7: Averaged geographical distance (measured in km) and rainfall distance (measured in mm) between sets of local maxima.