LANDSCAPE IMPACT ASSESSMENT OF SDG2 DEVELOPMENT PROJECTS USING REMOTE SENSING AND UNSUPERVISED CONTROL SITE SELECTION

As part of its objective to achieve Zero Hunger under SDG2, the United Nations World Food Programme, in partnership with Governments, NGOs and other UN agencies, supports food-insecure communities in increasing the availability of natural resources and improving their management. This is done mostly through the building and rehabilitation of soil and water conservation assets (e.g., small dams, weirs, landscape restoration) and structures that increase productivity (e.g., vegetable gardens, irrigation canals). Remote sensing was found to be an adequate tool for monitoring these activities simultaneously around the globe. This study introduces the use of high-resolution satellite imagery, and more specifically NDVI derived from the Landsat series, to verify and quantify the impact of such development projects. In total, 121 projects in 10 countries and six different climate zones were analyzed using a pre- and post-implementation comparison and a Before-After Control Impact (BACI) study considering randomly selected control sites. Both approaches were found to show robust results throughout the different countries, project types and climate zones. 67% of all projects showed significant improvements in vegetation conditions during the wet seasons only three years after the implementation. Using the proposed workflow based on Python scripting and cloud computing of satellite data, fast and robust analyses can be achieved while assuring constant data quality.


INTRODUCTION
The United Nations World Food Programme (WFP, https://www.wfp.org/who-we-are) is the world's largest humanitarian organization focusing on SDG 2: Zero Hunger. As part of its portfolio to improve vulnerable communities' food security and resilience to shocks, WFP, in close collaboration with Governments, UN Agencies and NGOs, builds and rehabilitates assets that improve natural resource availability and management in the most fragile contexts globally. In 2021 alone, WFP reached more than 8.7 million people across 49 countries with its Food Assistance for Assets (FFA) programmes, whereby communities receive food assistance while participating in the planning, design, and management of such structures. Through these efforts, WFP has rehabilitated over 190,000 ha of farmland, built 3,740 water ponds, planted more than 3,200 hectares of forests, and constructed or repaired 3,400 kilometres of roads (PRO-R WFP, 2022). Adequate monitoring of FFA activities is a crucial step for the organization to determine whether assets have been successfully created and maintained. Further, this information can support future project decisions and enhance the overall quality of the implementations. The Asset Impact Monitoring from Space (AIMS, https://aims-unwfp.hub.arcgis.com/) service is a WFP internal service that provides satellite-imagery-based evidence of FFA interventions and their positive impact on the environment. Robust analysis of environmental conditions is required to monitor a wide range of FFA assets constructed under diverse climatic conditions. The quantification of impacts is a crucial part of this analysis, since the success of projects should be quantifiable. The overall questions of this paper are: Which approach is suitable to monitor globally the environmental impacts of FFA projects on the landscape and quantify the success of the programmes? Which approach can fit diverse climates and difficult contexts in countries experiencing conflict or war?

Remote Sensing for monitoring
The use of satellite imagery for applications in environmental studies is well established, and there is an increasing trend in the monitoring of crop health (Zhang et al., 2019), drainage, soil moisture and the development of blended datasets (Khanal et al., 2020). Remote sensing is recognized for the vast possibilities of large-scale and dynamic observations (Li et al., 2020) and is indispensable for the creation of land-cover and land-use change data (Rogan and Chen, 2004; Brown et al., 2022) with increasingly better resolution. Furthermore, the use of remote sensing for drought monitoring and the assessment of possible impacts is very common, and the use of satellite-based drought indicators is established in research (West et al., 2019; Bachmair et al., 2016). The increasing number of datasets, the introduction of new platforms like Google Earth Engine (Gorelick et al., 2017) or the Microsoft Planetary Computer, and the use of machine-learning-based algorithms (Maxwell et al., 2018) for image classification or data processing (Camps-Valls, 2009) present important developments.
The 2030 Agenda for Sustainable Development declared 17 goals with 169 targets for a better future. Several indicators have been described to reliably monitor their progress at a global scale. According to Estoque (2020), 30 of these indicators can successfully be monitored using remote sensing. In fact, several stakeholders are further exploring how Earth Observation can support the monitoring of development projects. To monitor the successful implementation of FFA assets, the widespread availability and variety of satellite imagery should certainly be considered. Considering the right data sources is an important element in making correct decisions and implementing improved projects (Vorovencii, 2011). For example, time-series analysis (Saroglu et al., 2011; Vorovencii, 2011) is suggested as the biggest advantage of remote-sensing-based approaches, since historical data that can accurately describe the conditions at specific points in time are available over large areas of the planet. Since the implementation of FFA assets focuses on enhanced crop productivity, improved vegetation cover or regrowth of forested areas, the analysis should focus on capturing differences in vegetation productivity during different stages of the crop cycle (Smith, 2013).
For this study, the Normalized Difference Vegetation Index (NDVI) (Rouse et al., 1973) was chosen as a suitable indicator. Using the NDVI in environmental studies is well established, and numerous studies show that the indicator is resilient against changing sun angles, topography, shadows and atmospheric conditions (Bunkei, 2007). In past Before-After Control Impact (BACI) studies, indicators based on near-infrared (NIR) and red bands, like the NDVI or the Enhanced Vegetation Index (EVI), have shown reliable results (del Rio-Mena et al., 2021).
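As a minimal illustration of the chosen index, NDVI is the normalized difference of the NIR and red reflectances, (NIR - Red) / (NIR + Red). The sketch below assumes Landsat Collection 2 Level-2 surface reflectance stored as scaled integers (scale factor 0.0000275, offset -0.2, as documented by USGS). The function names and sample values are hypothetical and not taken from the scripts used in this study.

```python
# Illustrative sketch only; function names and sample values are hypothetical.
SR_SCALE = 0.0000275   # Landsat Collection 2 Level-2 surface reflectance scale factor
SR_OFFSET = -0.2       # additive offset for the same product


def dn_to_reflectance(dn):
    """Convert a scaled integer pixel value to surface reflectance."""
    return dn * SR_SCALE + SR_OFFSET


def ndvi(red, nir):
    """NDVI = (NIR - Red) / (NIR + Red); returns None when the sum is zero."""
    denominator = nir + red
    if denominator == 0:
        return None
    return (nir - red) / denominator


red = dn_to_reflectance(10000)   # about 0.075
nir = dn_to_reflectance(20000)   # about 0.35
print(round(ndvi(red, nir), 3))
```

Returning None for a zero denominator is a design choice of this sketch; an array-based implementation would more likely mask such pixels.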

Statistical models for impact assessment
In analysing the impact of environmental projects on the landscape, various analysis models have been dominant, including a simple before-after analysis (Green, 1979) that compares the performance over time within the project area (see Figure 1a). This approach has been criticized thoroughly because it lacks control measures (e.g., comparison with a site not undergoing any intervention) and because such studies did not account for climatic variability, although crop production is highly dependent on the weather conditions in specific years. Results can highlight an improvement in the target metrics, but this improvement cannot be causally attributed to the intervention. Results can therefore be influenced by other trends within the ecosystem, e.g., a general increase in rainfall causing an overall increase of vegetation.
Commonly, the response to this criticism was the introduction of a control site that would ideally have a similar land cover, be close enough in space to experience the same weather variability, not be subject to anthropogenic changes during the whole research period, and be randomly selected (Meroni et al., 2017). The Before-After Control Impact (BACI) design includes the selection of one control site for comparison of an indicator before and after the intervention (see Figure 1). This approach was suggested by Eberhardt in 1976, assuming that the start of the impact is known to the analyst. The general idea is to estimate the potential change in the magnitude of variations, in addition to measuring any potential or actual change in the mean of a target variable (Underwood, 1994). By applying this sampling, both annual fluctuations, e.g., crop cycles, and inter-annual climate variability are considered (Meroni et al., 2017). One measure of the BACI analysis is the BACI contrast (del Rio-Mena et al., 2021):
$$\text{BACI contrast} = (\mu_{CA} - \mu_{CB}) - (\mu_{IA} - \mu_{IB}), \quad (1)$$
with μ being the mean value of an indicator during a specific period, B and A standing for Before and After, and C and I for Control and Impact site (the FFA asset site in this study). The relative contrast
$$\text{relative contrast} = 100 \cdot \frac{\text{BACI contrast}}{\mu_{IB}} \quad (2)$$
expresses in percent the changes relative to previous conditions, where negative values are associated with an improvement of conditions (Meroni et al., 2017).
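The contrast can be sketched in a few lines of Python. The sign convention used here, control change minus impact change, is an assumption following Meroni et al. (2017), chosen so that negative values indicate an improvement at the impact site. The relative contrast is assumed to be normalised by the pre-intervention mean of the impact site. Function names and sample values are hypothetical.

```python
from statistics import mean


def baci_contrast(control_before, control_after, impact_before, impact_after):
    """(mu_CA - mu_CB) - (mu_IA - mu_IB); negative means the impact site improved more."""
    return (mean(control_after) - mean(control_before)) - (
        mean(impact_after) - mean(impact_before)
    )


def relative_baci_contrast(control_before, control_after, impact_before, impact_after):
    """BACI contrast as a percentage of the impact site's pre-intervention mean."""
    contrast = baci_contrast(control_before, control_after, impact_before, impact_after)
    return 100.0 * contrast / mean(impact_before)


# Made-up seasonal mean NDVI values (three years before, three years after)
control_before = [0.30, 0.32, 0.34]
control_after = [0.31, 0.33, 0.35]
impact_before = [0.30, 0.32, 0.34]
impact_after = [0.40, 0.42, 0.44]

contrast = baci_contrast(control_before, control_after, impact_before, impact_after)
relative = relative_baci_contrast(control_before, control_after, impact_before, impact_after)
print(f"BACI contrast: {contrast:.3f}, relative contrast: {relative:.1f}%")
```

In this made-up case the control site gains 0.01 NDVI and the impact site 0.10, so the contrast is negative, which under the assumed convention reads as an improvement.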
The BACI design itself is a robust tool to understand improvements or deteriorations in a fluctuating environment. Nevertheless, there is enough room for critique, including the lack of randomly selected control sites and a rather manual selection of control areas instead. Further, the fact that usually only one control site is selected for each impact site was strongly criticised by Underwood in 1991. A more asymmetrical model was suggested, and working with several control sites was proposed to overcome the issue that the statistical outcomes heavily depend on the few control sites considered. On the other hand, even very well designed studies fail to have more than a dozen impact sites, which results in reduced statistical power (Wood, 2021). Bootstrapping, suggested by Mishra et al. (2023) to increase the number of sample pairs, is a possible solution. In this paper, this challenge was addressed by selecting a control site by matching pre-intervention values at the pixel level, as detailed below.

METHODOLOGY
Since a homogeneous response to the intervention was expected within the project areas, all analyses in this study were polygon-based, not pixel-based (e.g., del Rio-Mena et al., 2017).

Project data
A broad dataset of the AIMS service was used as the basis of this research. In total, 121 FFA assets in 10 different countries under different climatic conditions were analysed. The projects were implemented between 2011 and 2021 and range in size from two to 200 ha. The geographical positions include areas with

NDVI data
The satellite imagery used for this study was taken from the USGS (United States Geological Survey) Landsat Collection 2 Level-2 atmospherically corrected surface reflectance series and accessed through the Planetary Computer STAC API. Several pre-processing steps were applied in Python to the Landsat time series to ensure data quality. Clouds, cloud shadows and surface reflectance outliers were masked, the former two using the Landsat Quality Assessment bitmask band. The last pre-processing step is a Whittaker smoother applied to the NDVI computed from the red and NIR bands. The smoother is based on V-curve optimization and expectile smoothing, as presented by Eilers et al. in 2017. The optimization is carried out pixel by pixel. This corrects a variety of signal interferences that are mostly due to cloudiness and haze, given that cloud cover distorts measurements with stronger negative deviations, and it also allows data gaps due to cloud coverage to be interpolated.
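The masking and smoothing steps can be sketched as follows. This is a simplified illustration, not the pipeline used in this study: it uses a plain second-order Whittaker smoother with a fixed smoothing parameter instead of the V-curve optimised expectile variant, and pure Python instead of an array library. Bits 3 and 4 of the Landsat Collection 2 QA_PIXEL band flag cloud and cloud shadow. Masked observations receive weight 0 so that the smoother interpolates across them. The function names and the value of `lam` are hypothetical.

```python
CLOUD_BIT = 3         # QA_PIXEL bit 3: cloud
CLOUD_SHADOW_BIT = 4  # QA_PIXEL bit 4: cloud shadow


def is_clear(qa_pixel):
    """True when neither the cloud nor the cloud-shadow bit is set."""
    mask = (1 << CLOUD_BIT) | (1 << CLOUD_SHADOW_BIT)
    return (qa_pixel & mask) == 0


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def whittaker_smooth(y, weights, lam=10.0):
    """Minimise sum w_i (y_i - z_i)^2 + lam * sum (second differences of z)^2."""
    n = len(y)
    # System matrix W + lam * D'D, with D the second-order difference operator
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = float(weights[i])
    for k in range(n - 2):
        stencil = ((k, 1.0), (k + 1, -2.0), (k + 2, 1.0))
        for i, di in stencil:
            for j, dj in stencil:
                a[i][j] += lam * di * dj
    b = [weights[i] * y[i] for i in range(n)]
    return solve_linear_system(a, b)


# Made-up NDVI series; the third observation is cloud-contaminated
ndvi_series = [0.20, 0.30, 0.05, 0.50, 0.60]
qa_series = [0, 0, 1 << CLOUD_BIT, 0, 0]
weights = [1.0 if is_clear(qa) else 0.0 for qa in qa_series]
smoothed = whittaker_smooth(ndvi_series, weights)
print([round(v, 3) for v in smoothed])
```

Because the four clear observations lie on a straight line, the smoother leaves them unchanged and replaces the cloud-contaminated value with the interpolated one. The dense solver is only suitable for short series; a production implementation would use a sparse or banded solver.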

Countries where projects were analysed for their environmental impact, WFP 2023.
Further, larger areas are included in the sample, not only ones similar in size to the impact site. Very similar weather conditions will apply due to the short distances to the impact site, while similar NDVI patterns reflect a similar landscape type. This can be supported using very-high-resolution imagery to compare the land cover at several points in time before the intervention started. The only constraint is the risk of including areas under human intervention, yet the size of the control site should make this influence negligible. Furthermore, human activity in some form is also likely in areas that were not labelled as being under any project. Overall, this control site selection procedure gives reliable outputs in different climatic and landscape contexts due to its data-based approach, in contrast with more traditional human-based selection methods that are more time-intensive and utilize less data in the process.
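A minimal sketch of how such a data-driven selection could look is given below. It assumes that candidate pixels around the asset are clustered by their pre-intervention NDVI profiles with k-means and that the cluster closest to the asset's mean profile is kept as control area. The features, number of clusters and distance measure are assumptions made for illustration, and all names and sample values are hypothetical.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two NDVI profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_profile(profiles):
    """Element-wise mean of a list of profiles."""
    n = len(profiles)
    return [sum(values) / n for values in zip(*profiles)]


def kmeans(profiles, k, iterations=50):
    """Plain Lloyd's algorithm; the first k profiles seed the centroids."""
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in profiles
        ]
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            new_centroids.append(mean_profile(members) if members else centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


def select_control_pixels(asset_profile, candidate_profiles, k=2):
    """Return indices of candidates in the cluster closest to the asset profile."""
    labels, centroids = kmeans(candidate_profiles, k)
    best = min(range(k), key=lambda c: squared_distance(asset_profile, centroids[c]))
    return [i for i, lab in enumerate(labels) if lab == best]


# Made-up pre-intervention NDVI profiles (e.g. seasonal means per pixel)
asset = [0.20, 0.45, 0.25]
candidates = [
    [0.21, 0.44, 0.26],  # cropland-like, similar to the asset
    [0.70, 0.80, 0.75],  # forest-like
    [0.19, 0.47, 0.24],  # cropland-like
    [0.72, 0.78, 0.74],  # forest-like
]
print(select_control_pixels(asset, candidates))
```

Seeding the centroids with the first k profiles keeps the sketch deterministic; a real workflow would more likely use a library implementation with multiple random initialisations.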

Impact assessments
After retrieving reliable data for both impact and control sites, long-term NDVI values were calculated for the wet and dry season in each specific asset area and control site on the basis of the 10 years before the intervention started. Further, the values during wet and dry season for impact and control site were calculated for the last three years before and the three years after the implementation.
Anomalies relative to the long-term average values were calculated for all sites (both impact and control) for six years. A 10% increase of the mean NDVI value during dry and wet season in comparison to long-term pre-intervention conditions is used as a standard reference in WFP's corporate results framework; hence it is useful to understand how this approach compares to the previously presented BACI calculations. The BACI contrast and relative contrast were calculated for both wet and dry season.
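The threshold criterion can be illustrated with a short sketch. It assumes that the anomaly is expressed as the percentage difference between a seasonal mean NDVI and the long-term pre-intervention average for the same season. The function names and sample values are hypothetical.

```python
from statistics import mean


def percent_anomaly(value, long_term_average):
    """Percentage difference of a seasonal NDVI value from its long-term average."""
    return 100.0 * (value - long_term_average) / long_term_average


def meets_threshold(post_values, pre_values, threshold_percent=10.0):
    """True when the post-intervention mean exceeds the long-term average by the threshold."""
    long_term_average = mean(pre_values)
    return percent_anomaly(mean(post_values), long_term_average) >= threshold_percent


# Made-up wet-season mean NDVI: 10 pre-intervention years, 3 post-intervention years
pre_wet = [0.40, 0.42, 0.38, 0.41, 0.39, 0.40, 0.43, 0.37, 0.40, 0.40]
post_wet = [0.46, 0.48, 0.50]
print(meets_threshold(post_wet, pre_wet))
```

Here the long-term average is 0.40 and the post-intervention mean is 0.48, a 20% anomaly, so the made-up project passes the 10% criterion.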
A two-way ANOVA with interaction was also applied in the BACI design, with intervention (asset / control sites) and period (before / after intervention) as independent factors. The statistical model is as follows:
$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}, \quad (3)$$
where y_ijk is the k-th observation of the indicator in period i at location j, μ is the overall mean, α_i is the effect of period (i = before or after), β_j is the effect of location (j = control or asset), (αβ)_ij is the interaction between period and location and ε_ijk represents the error. This test yields the statistical significance of changes occurring at the asset after the intervention by testing for an interaction effect. An interaction effect occurs when the change observed in the after period depends on the location. Hence, it indicates whether a change occurs after the intervention at only one location, i.e., the asset site. Hereafter, "p-value" therefore refers to the statistical p-value of the ANOVA interaction test. Finally, Pearson's r was calculated for the NDVI values between impact and control site for the pre- and post-implementation timeframes.
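For a balanced design (the same number of observations in each combination of period and location), the interaction test can be sketched without external libraries. The sketch below only computes the F statistic and its degrees of freedom; the p-value additionally requires the F distribution, for example via `scipy.stats.f.sf(f_value, df1, df2)`. The balanced-design restriction is an assumption of this illustration, and all names and sample values are hypothetical.

```python
from statistics import mean


def interaction_f(cells):
    """F statistic for the period x location interaction in a balanced two-way ANOVA.

    `cells` maps (period, location) to a list of observations; every cell must
    hold the same number of observations.
    """
    periods = sorted({period for period, _ in cells})
    locations = sorted({loc for _, loc in cells})
    n = len(next(iter(cells.values())))
    if any(len(values) != n for values in cells.values()):
        raise ValueError("balanced design required")

    cell_mean = {key: mean(values) for key, values in cells.items()}
    period_mean = {p: mean([cell_mean[(p, loc)] for loc in locations]) for p in periods}
    location_mean = {loc: mean([cell_mean[(p, loc)] for p in periods]) for loc in locations}
    grand_mean = mean(list(cell_mean.values()))

    ss_interaction = n * sum(
        (cell_mean[(p, loc)] - period_mean[p] - location_mean[loc] + grand_mean) ** 2
        for p in periods
        for loc in locations
    )
    ss_error = sum(
        (y - cell_mean[key]) ** 2 for key, values in cells.items() for y in values
    )
    df_interaction = (len(periods) - 1) * (len(locations) - 1)
    df_error = len(cells) * (n - 1)
    f_value = (ss_interaction / df_interaction) / (ss_error / df_error)
    return f_value, df_interaction, df_error


# Made-up seasonal mean NDVI values, three years per cell
cells = {
    ("before", "control"): [0.30, 0.32, 0.34],
    ("after", "control"): [0.31, 0.33, 0.35],
    ("before", "asset"): [0.30, 0.32, 0.34],
    ("after", "asset"): [0.40, 0.42, 0.44],
}
f_value, df1, df2 = interaction_f(cells)
print(f"F({df1}, {df2}) = {f_value:.4f}")
```

A large F value for the interaction term indicates that the before/after change differs between asset and control site, which is the effect of interest in the BACI design.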

RESULTS
After running the above-mentioned script, all outputs were compiled. Of the 121 projects analysed, the BACI relative contrast showed improvements during the wet season for 81, representing 67%. Under the 10% criterion currently used in the standard impact assessments, 78% of all projects showed a substantial increase in vegetation activity in the first three years after implementation. Improvements during the dry season were found in 41 projects under the BACI criterion and in 35 with regard to long-term NDVI values.
Of the 8 projects located in a zone with a Mediterranean hot-summer climate, only three showed significant vegetation increases in summer, while in winter 7 out of 8 showed vegetation increases of 27%. In Semi-arid Steppe climates, 52% of the analysed projects showed improvements during the dry season and 59% in the wet season. Improvements were generally 10% higher during the rainy season. Tropical Savanna climates, despite generally higher NDVI values, also showed significant improvements: 78% of 41 projects showed improvements within three years after implementation in the wet season. Projects implemented in Tropical Monsoon climates showed an increase in vegetation productivity in 20% of all projects in dry and 47% in wet seasons. NDVI values registered for certain projects decreased slightly due to clearing activities done beforehand, or because the analysis year was particularly dry, affecting the growth of young vegetation. Overall, the improvements registered in dry seasons showed NDVI increases of approx. 10%, while the significant improvements in wet seasons were on average 40% higher than the long-term average conditions.
Comparing the impact methodologies applied, the BACI and the 10% NDVI vs. long-term average threshold show similar results in the analysed selection of FFA projects. While the BACI methodology found a substantial increase in NDVI for more projects in the dry season than the simple pre-/post-comparison, the pre-/post-comparison found more projects to have a positive impact during the wet season. The results of the two methods therefore differ depending on the season. There are many possible reasons for this behaviour, e.g., the type of intervention analysed or the climatic context of the outliers. Generally, though, none of the groups showed any particularities in comparison to the other projects. Since the effort required to calculate both methods has become minor thanks to the data infrastructure provided, it is suggested to apply both methods simultaneously. In case of major differences in the findings, a more detailed look into the intervention activities and current climatic contexts is needed.
As a last step, the correlation between the impact and control site was calculated to understand whether any major changes can be detected. Since the selection of the control site is based on past NDVI profiles, the correlations should generally be lower after the implementation. In particular, irrigation canals should show a lower correlation between the impact and control site, since in this case it indicates lower seasonality, e.g., the start of second or third crop cycles in contrast to the surrounding rain-fed plots. In the case of the woodlot project in Afghanistan (see Figure 9), a reduction in correlation of 0.7 was calculated, which means that the project area stopped following the general landscape trends. The correlation can be a great addition to any findings, yet the change in correlation alone is not sufficient to make plausible statements on projects, since other human activities can be the reason for trend changes, and these are not necessarily positive as in the woodlot example.

DISCUSSION
Overall, the study succeeded in comparing different impact assessment methods and proposing a control site selection procedure, both in theory and in practice. Nevertheless, it is crucial to highlight possible shortcomings and opportunities for improvement. Considering the selection of control sites, the suggested unsupervised approach showed fast results at low computation cost, fulfilling all the criteria mentioned in the literature. For quality control purposes, considering a high-resolution land cover layer as an additional criterion could be a great improvement. By filtering both impact and control sites to be in the same land cover category and to show similar NDVI curves, a very strict additional criterion is applied; hence the quality of the control sites would be higher. Nevertheless, this additional processing step needs to be checked for computation effort and for the availability of reliable land cover data for past years.
From a data infrastructure perspective, the approach suggested in this study was computationally efficient, mainly because cloud computing accelerated the creation of outputs. For this reason, extending the analysis to more post-intervention years is possible and advised for further studies. Both impact assessment methods applied showed compelling results. While a simple threshold seems to be generous during wet seasons, the BACI approach found more positive impact during dry seasons. For this reason, it is important to always consider full crop cycles while monitoring these projects and to consider the post-implementation weather conditions. As visible in Figure 10, the NDVI improvements during dry and wet season reflect the start of a second crop cycle. The sample projects in Sierra Leone, for example, were affected by drought conditions, showing low responsiveness of crop-related vegetation, while the surrounding forested land was more resistant to abrupt changes in conditions. Reducing the results to simple numbers or labels is therefore not advised, since landscape monitoring deals with rather complex relationships.
Considering the need for a fixed criterion, though, the thresholds for improvement could be adapted to the climate zone where a project is located. In arid and semi-arid contexts, an NDVI increase of +0.2 can reflect a 300% increase, while in tropical areas it could reflect only a 25% gain. Finally, both calculations could be enhanced by considering other environmental variables depending on the project type. These include land surface temperature, biomass, or soil quality.
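The two percentages correspond to very different baselines, which a short calculation makes explicit. The baseline NDVI values below are not reported in this study; they are simply the values implied by the stated percentages.

```latex
\frac{\Delta\mathrm{NDVI}}{\mathrm{NDVI}_{\text{baseline}}}
  = \frac{0.2}{0.067} \approx 300\% \quad \text{(arid or semi-arid)},
\qquad
\frac{0.2}{0.8} = 25\% \quad \text{(tropical)}
```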

CONCLUSION
This study presented a landscape impact assessment of 121 WFP FFA projects in 10 countries, considering the vegetation development post-intervention. Both an NDVI threshold in comparison to long-term average conditions and a BACI assessment were applied to the dataset, showing compelling results and overall good achievement of the selected projects. A robust control site selection methodology was applied that fulfils different criteria and is computationally feasible. Due to the large sample size and the geographical spread of the projects, a general suitability of both methods could be confirmed. Quick calculation and robust findings confirm their applicability for global monitoring of such development projects.

Figure 1. a. Simple pre-post comparison; b. comparison between Impact and Control Site; c. multiple comparison between pre- and post-conditions between Impact and Control Site (Smith, 2013).

Figure 4. Stone bunds to control the water run-off from the hill in Phalombe, Malawi, WFP/Badre Bahaji 2023.

Figure 5. Overview of countries included in this study.
Since the aim is to identify a robust methodology that supports analysis on a global scale, the research area is distributed over several continents. The interventions range from irrigation projects, reforestation and soil and water conservation activities to newly created gardens. Soil and water conservation assets can include several interventions, e.g., half-moons, soil bunds, check dams or terracing. For each project, the different interventions undertaken were known, including the exact start and end date.

Figure 6. K-means clustering method to identify groups within the dataset according to their characteristics, Maxar 2023.
selected for the control site. This control site selection technique improves the similarity of asset and control areas in terms of confounding variables, thereby reducing selection bias,

Figure 7. NDVI over an asset and its control site in Syria: a. whole area; b. analysis-ready area after a k-means clustering selection to keep only pixels with land cover similar to the asset.

Figure 8. Example of an improved woodlot project in Afghanistan, in an arid climate context. The NDVI of the impact site (green) and NDVI of the control site (grey) in comparison throughout the years. The relative BACI contrast is -156 for the dry season (p-value = 0.015), which shows a clear improvement. The NDVI is up to 400% higher after the implementation than long-term average values.
Regarding the different intervention types, four out of five woodlots showed significant improvement in dry and wet season, while all other intervention types have a lower number of projects showing vegetation increase during the dry season than during the wet season. Soil and water conservation projects show great success during the wet seasons, with improvements in NDVI of over 40%. Gardens show a mean increase of 25% in wet seasons and even slight reductions during dry seasons. Forestry projects show good results during the dry season, since woody biomass is less dependent on seasonal cycles than crop plants. The only water pond showing an increase in vegetation productivity during the dry season reached improvements of +30% in comparison to long-term average conditions.

Figure 9. Before (2014, black and white) and after (2021, RGB) imagery over the woodlot project site in Afghanistan.
Since different climatic conditions influence vegetation growth and the increase or decrease of NDVI values, the results were grouped into six different types of climate zones. In arid climates, vegetation conditions improved in 60% of the projects during the dry season and in 64% during the wet season. Increases generally reach up to 70%, related to generally lower NDVI values in the area.

Figure 10. Example of an improved garden creation project in Zimbabwe in a semi-arid hot steppe climate. NDVI of the impact site (green) and NDVI of the control site (grey) in comparison throughout the years. The BACI contrast is -36 for the dry season (p-value = 0.005). The NDVI is up to 20% higher post-implementation than long-term average values. Two crop peaks are visible after implementation, meaning an increase in food production, a declared goal of the project.

Figure 11. Before and after implementation: grassland was transformed into a garden. VHR imagery supports the positive findings of the NDVI curves.

Table 2. Results per asset type.

Type | % of projects improved in dry season | Average increase in NDVI (in comparison to long-term average) | % of projects improved in wet season | Average increase in NDVI (in comparison to long-term average)
Image Source: © 2021 Maxar. Image Source: © 2014 Maxar.

Table 4. Discrepancies between the two proposed methods.