SPATIAL CORRELATIONS OF MALARIA INCIDENCE HOTSPOTS WITH ENVIRONMENTAL FACTORS IN ASSAM, NORTH EAST INDIA

Malaria is endemic and a major public health problem in the north east (NE) region of India, which contributes about 8-12% of India's malaria-positive cases. The historical morbidity pattern of malaria in the state of Assam, expressed as Annual Parasite Incidence (API), has been used to delineate malaria incidence hotspots at the health sub centre (HSC) level. Strong spatial autocorrelation (p<0.01) among the HSCs was observed in terms of API. Malaria incidence hotspots in the state were identified using General G statistics and tested for statistical significance. Spatial correlation of malaria incidence hotspots with physiographic and climatic parameters across the 6 agro-climatic zones of the state reveals the types of land cover and the range of elevation that contribute to malaria outbreaks. The analysis shows that villages under malaria hotspots have more agricultural land and evergreen/semi-evergreen forest, with abundant waterbodies. Statistical and spatial analyses showed a significant positive correlation between malaria incidence in the hotspots and elevation (p<0.05), with villages under malaria hotspots having average elevations ranging from 17 to 240 m above mean sea level (MSL). This conforms to the characteristics of the two dominant mosquito species in the state, Anopheles minimus and An. baimai, which prefer slow-flowing streams in the foothills and forest ecosystems, respectively.


INTRODUCTION
Globally, 300-500 million clinical cases of malaria and 1.5-2.7 million deaths are reported annually (Srivastava et al., 2001). Malaria is endemic and a major public health problem in the north eastern region (NER) of India. The disease is stable, with a preponderance (60-80%) of Plasmodium falciparum (Pf) in the entire region. The NE region accounts for only 3.7% of India's population but contributed 8-12% of malaria positives, 10-20% of Pf cases and 13-41% of malaria deaths in the whole nation during the last decade (Dev, 2009). Among the NE states, Assam is highly receptive to malaria transmission and accounts for more than 50% of the reported cases of malaria in the NER. Here malaria transmission is perennial and persistent, with a seasonal peak during April-September corresponding to the months of rainfall. P. falciparum and P. vivax both occur in abundance, but P. falciparum is the predominant parasite (>60%). Focal disease outbreaks recur, characterized by a sharp rise in cases and deaths attributed to P. falciparum malaria (Dutta, 1997).
Human malaria is a complex disease and its incidence is a function of the interaction between the Anopheles mosquito vector, the parasite, humans and the environment. Different mosquito species have different habitat preferences, such as streams, rice fields, plantations, forests, forest fringes and foothills. The physical environment also plays a significant role in the distribution of species in particular geographical areas. For optimizing vector control operations, the spatial distribution of malaria is an important consideration in designing situation-specific intervention strategies aimed at reducing transmission.
Geospatial techniques, comprising remote sensing, geographic information systems (GIS) and the global positioning system (GPS), have added new dimensions to spatial statistics analysis in epidemiological studies (Back et al., 1994, Hay et al., 1997). Many studies have applied these advanced tools to understand the host-vector relationship and its spatial distribution (Barnes & Cibula, 1979, Glass et al., 1995, Hendrickxy et al., 1999, Dhiman, 2000, Abelardo et al., 2000, Jeganathan et al., 2001, Handique et al., 2011). Spatial statistics analytical techniques help in analyzing the spatial order and association of a variable under study. In areas like ecology, epidemiology, geology and image processing, it is often not appropriate to randomize, block and replicate the data because of the spatial associations of the attribute features associated with the study variable (Lawson, 2001). On the other hand, areas under a particular administrative unit need to be stratified and prioritised for better planning and management of resources. Hence a technique with a sound statistical base has to be followed to prioritise these areas of importance, or hot spots. The use of predictive approaches has been demonstrated by different workers for the study of mosquito vector-borne diseases like malaria (Srivastava et al., 2001, Abeku et al., 2002).
In this study, we have employed spatial statistics analytical tools in the GIS domain to study the spatial distribution of malaria incidence and identify the disease hotspots at the Health Sub Centre (HSC) level. Malaria incidence rates were integrated with vegetation cover derived from Indian remote sensing data and elevation from the digital elevation model of ASTER data.
The outcome of the study is expected to help the district health authorities to mobilise manpower and materials for timely interventions.

Study Area
The study was carried out in the state of Assam, located in the north eastern part of India, considering the severity of the impact of malaria and its perennial occurrence. The state of Assam (24°44' - 27°45' N latitude; 89°41' - 96°02' E longitude) is the most populous state of the region (30.94 million population as per census 2012) and is the gateway to the northeast for economic activities. Chloroquine resistance was first detected in Assam and is spreading and intensifying, creating greater concern about the disease.

Measure of spatial autocorrelation
It is of interest to examine the spatial distribution pattern of malaria-reporting villages in the district. If the malaria-reporting villages follow a clustering pattern, the occurrence of the disease can be related to the underlying landscape and socio-economic features. In classifying spatial patterns as clustered, dispersed or random, we focussed on how the HSCs are arranged and examined the extent of spatial autocorrelation (Lee and Wong, 2001). Here, high autocorrelation would imply that HSCs with higher values of API occur together, and the correlation is attributable to the geographic ordering of the HSCs. The most commonly used spatial autocorrelation statistic, Moran's I coefficient (Chou, 1997), was employed to measure the autocorrelation (Eq. 1-4). Moran's I can be defined as

$$I = \frac{n \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\left( \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \right) \sum_{i=1}^{n} (x_i - \bar{x})^2} \tag{1}$$

Here, Euclidean distance is used to define the weights $w_{ij}$.
Corresponding to each pair of sample points i and j, let $d_{ij}$ represent the distance between them. The distance weight is applied in an inverse manner, since the intensity of the spatial relationship diminishes as the distance increases. Hence

$$w_{ij} = \frac{1}{d_{ij}} \tag{2}$$

When no spatial autocorrelation exists, the expected value of Moran's I is

$$E(I) = \frac{-1}{n - 1} \tag{3}$$

Here, n is the total number of geographic units (HSCs), $x_i$ denotes the API corresponding to the ith HSC and $\bar{x}$ is the mean API over all HSCs.
The value of the Moran's I coefficient ranges between -1 and 1. A larger positive value implies a clustered pattern, while a negative value significantly different from 0 is associated with a scattered pattern. When the Moran's I coefficient is not significantly different from 0, there is no spatial autocorrelation and the spatial pattern is considered to be random. The spatial statistics tool in ArcGIS software used to measure spatial autocorrelation is based not on the locations of the HSCs alone or on the number of malaria cases (API) alone, but on both the HSC locations and the corresponding API simultaneously. Given a set of HSCs and the associated malaria cases, it evaluates whether the pattern expressed is clustered, dispersed or random. A Z score is calculated to assess whether the observed clustering or dispersion is statistically significant. The Z score is calculated as

$$Z = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}} \tag{4}$$

where Var(I) is the variance of I under the null hypothesis of no spatial autocorrelation.
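As a minimal sketch of the computation described in this section, the Python code below evaluates Moran's I with inverse-distance weights (each weight is the reciprocal of the Euclidean distance between two units) on a small synthetic grid. The function names and the toy coordinates and values are illustrative assumptions, not the study data. The Z score here is estimated by random permutation of the values rather than by the analytical variance used by the ArcGIS tool, so it only approximates that test.

```python
import math
import random


def inverse_distance_weights(coords):
    """Build w[i][j] = 1 / d_ij from Euclidean distances (w[i][i] = 0).

    Assumes all locations are distinct, otherwise d_ij would be zero.
    """
    n = len(coords)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w[i][j] = 1.0 / math.dist(coords[i], coords[j])
    return w


def morans_i(values, w):
    """Global Moran's I for one value per unit and a weight matrix w."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    total_weight = sum(sum(row) for row in w)
    cross = sum(w[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    return (n / total_weight) * (cross / sum(d * d for d in dev))


def permutation_z(values, w, n_perm=999, seed=0):
    """Pseudo Z score: observed I compared with I under random relabelling."""
    rng = random.Random(seed)
    observed = morans_i(values, w)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        sims.append(morans_i(shuffled, w))
    mu = sum(sims) / n_perm
    sd = math.sqrt(sum((s - mu) ** 2 for s in sims) / (n_perm - 1))
    return observed, (observed - mu) / sd


# Hypothetical 4 x 4 grid of units: high API on the left half, low on the right.
grid_coords = [(x, y) for x in range(4) for y in range(4)]
grid_api = [10.0 if x < 2 else 1.0 for x, y in grid_coords]
grid_w = inverse_distance_weights(grid_coords)
print(morans_i(grid_api, grid_w), -1.0 / (len(grid_api) - 1))
```

The printed pair is the observed I and its expected value under no autocorrelation; a clustered layout such as this one gives an observed value above the expected one, while a checkerboard layout gives a negative value.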

Delineation of Malaria hotspots
Moran's I has well-established statistical properties for describing spatial autocorrelation globally. However, it is not effective in distinguishing different types of spatial clustering. These patterns are sometimes described as 'hot spots' and 'cold spots'. If high values are close to each other, Moran's I will indicate relatively high positive spatial autocorrelation, and such a cluster of high values may be labelled a hot spot. But high positive spatial autocorrelation indicated by Moran's I could equally be created by low values close to each other; this type of cluster can be described as a cold spot. Delineating these hot spots and cold spots helps in optimising the use of resources for timely interventions. The G statistics (Getis and Ord, 1992) have the advantage of detecting the presence of hot spots or cold spots over the entire study area (Eq. 5-8). The local G statistic is the local version of the General G statistic. It is derived for each areal unit to indicate how the value of the areal unit of concern is associated with the values of the surrounding areal units defined by a distance threshold d. The local G statistic is defined as:

$$G_i(d) = \frac{\sum_{j \neq i} w_{ij}(d)\, x_j}{\sum_{j \neq i} x_j} \tag{5}$$

This G statistic is defined by a distance d, within which areal units are regarded as neighbours of i. The weight $w_{ij}(d)$ is 1 if areal unit j is within d of i, and 0 otherwise. Thus the weight matrix is essentially a binary symmetric matrix in which the neighbouring relationship is defined by the distance d. The sum of the weights is:

$$W_i = \sum_{j \neq i} w_{ij}(d) \tag{6}$$

The numerator of (5), which determines the magnitude of $G_i(d)$, will be large if the neighbouring values (malaria incidence in terms of API) are large and small if the neighbouring values are small. A moderate level of $G_i(d)$ reflects spatial association of high and moderate values, and a low level of $G_i(d)$ indicates spatial association of low and below-average values. Before calculating the G statistics we need to define a distance d, within which areal units will be regarded as neighbours. In this exercise d was set to 10 kilometres, based on the extent of the study area and the spatial distribution of the Health Sub Centres. HSCs are therefore regarded as neighbours if they are within an aerial (straight-line) distance of 10 km of each other. For detailed interpretation of the G statistics we rely on the expected value and the standardised score (Z score).
To derive the Z score and to test the significance of the G statistics, we need the expected value of $G_i(d)$ and its variance. The expected value of $G_i(d)$ is

$$E[G_i(d)] = \frac{W_i}{n - 1} \tag{7}$$

The expected value indicates the value of $G_i(d)$ if there is no significant spatial association, that is, if the level of $G_i(d)$ is average. The Z score of the observed statistic is then derived as $Z_i = (G_i(d) - E[G_i(d)]) / \sqrt{\mathrm{Var}[G_i(d)]}$ and tested for statistical significance. The variance of $G_i(d)$ is calculated as follows (Getis and Ord, 1992):

$$\mathrm{Var}[G_i(d)] = \frac{W_i (n - 1 - W_i)}{(n - 1)^2 (n - 2)} \cdot \frac{Y_{i2}}{Y_{i1}^2} \tag{8}$$

where $Y_{i1} = \sum_{j \neq i} x_j / (n - 1)$, $Y_{i2} = \sum_{j \neq i} x_j^2 / (n - 1) - Y_{i1}^2$, and n denotes the number of areal units (villages) in the entire study area.
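The local G calculation can be sketched in a few lines of Python. The code below is an illustration under stated assumptions rather than the ArcGIS implementation used in the study: the function name `local_g`, the one-dimensional toy coordinates and the API values are hypothetical, the binary weights follow the distance-threshold rule described above, and the variance follows the form given by Getis and Ord (1992) for the statistic that excludes unit i itself. All values are assumed to be positive.

```python
import math


def local_g(coords, values, d):
    """Return a list of (G_i, expected G_i, Z_i) for every unit i.

    A unit j is a neighbour of i when it lies within distance d of i;
    unit i itself is excluded from its own neighbourhood.
    """
    n = len(values)
    results = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        neighbours = [j for j in others
                      if math.dist(coords[i], coords[j]) <= d]
        w_i = len(neighbours)                      # sum of binary weights
        total = sum(values[j] for j in others)     # must be non-zero
        g = sum(values[j] for j in neighbours) / total
        expected = w_i / (n - 1)
        y1 = total / (n - 1)
        y2 = sum(values[j] ** 2 for j in others) / (n - 1) - y1 ** 2
        variance = (w_i * (n - 1 - w_i)) / ((n - 1) ** 2 * (n - 2)) * (y2 / y1 ** 2)
        # Z is undefined when a unit has no neighbours or all others are neighbours.
        z = (g - expected) / math.sqrt(variance) if variance > 0 else float("nan")
        results.append((g, expected, z))
    return results


# Hypothetical transect of 10 units, 1 km apart: high API at one end.
line_coords = [(float(k), 0.0) for k in range(10)]
line_api = [50.0, 50.0, 50.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
for idx, (g, e, z) in enumerate(local_g(line_coords, line_api, d=1.5)):
    print(idx, round(g, 3), round(e, 3), round(z, 2))
```

Units surrounded by high values return a positive Z score (candidate hot spots) and units surrounded by low values return a negative one (candidate cold spots).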

Geospatial data
Land use/land cover maps prepared as part of the National Natural Resources Census (NR Census) using IRS LISS III satellite data with a spatial resolution of 23.5 metres were used in the study (NRSC 2011). Land cover types within a buffer of 3 kilometres (the maximum flying range of mosquitoes) from the centres of the villages under malaria incidence hotspots were delineated and compared across the 6 agro-climatic zones of the state to correlate them with disease incidence. The climatic conditions of the agro-climatic zones and the districts covered under each zone are given in Table 1. On the assumption that the most dominant mosquito species in the state, Anopheles minimus and An. baimai, prefer slow-flowing streams in the foothills and forest ecosystems, respectively, variation in the elevation of areas under malaria incidence hotspots was analysed with elevation data retrieved from the ASTER DEM at 30 m resolution.
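The buffer tabulation described above can be illustrated with a short Python sketch. It assumes, purely for illustration, that the land cover map is available as a small in-memory grid of class labels with square cells and that a cell belongs to the buffer when its centre lies within the buffer radius; the function name, the class labels and the toy grid are hypothetical, and a real workflow would read the classified raster with GIS software.

```python
import math
from collections import Counter


def landcover_share_within_buffer(grid, cell_size_m, centre_xy, radius_m):
    """Percentage of each land cover class among cells inside a circular buffer.

    grid[row][col] holds a class label; cell (row, col) has its centre at
    ((col + 0.5) * cell_size_m, (row + 0.5) * cell_size_m).
    """
    counts = Counter()
    for row, labels in enumerate(grid):
        for col, label in enumerate(labels):
            x = (col + 0.5) * cell_size_m
            y = (row + 0.5) * cell_size_m
            if math.dist((x, y), centre_xy) <= radius_m:
                counts[label] += 1
    inside = sum(counts.values())
    if inside == 0:
        return {}
    return {label: 100.0 * k / inside for label, k in counts.items()}


# Hypothetical 4 x 4 grid of 1 km cells around a village centre.
toy_grid = [
    ["crop", "crop", "crop", "crop"],
    ["crop", "forest", "forest", "crop"],
    ["crop", "crop", "water", "crop"],
    ["crop", "crop", "crop", "crop"],
]
print(landcover_share_within_buffer(toy_grid, 1000.0, (2000.0, 2000.0), 1500.0))
```

With the study's parameters the cell size would be 23.5 m and the radius 3000 m; the coarse toy values are chosen only to keep the example readable.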

Cachar, Hailakandi, Karimganj
The climate is characterized by high rainfall (more than 2000 mm), high temperature and high humidity. Maximum temperature rises up to 37 °C in July-August and minimum falls to 9 °C in January.

Karbi Anglong, Dima Hasao
Rainfall and temperature differ substantially among the different parts of the zone due to varying altitudes and the location of hills and valleys. The total rainfall is about 1,144 mm in North Cachar Hills and 600 mm in Karbi Anglong. The temperature ranges between 37 °C and 9 °C.

Spatial pattern of malaria distribution
Spatial autocorrelation among the HSCs was measured with the Global Moran's I index. The observed Global Moran's I, O(I), calculated with all the HSCs having API >2 for the study period, is 0.00744, with an expected value E(I) of 0.00143. The Z score is 3.24, which is significant at the 99% confidence level (p<0.01). These results confirm that the spatial distribution of malaria occurrence is non-random and hence calls for special attention in the areas of malaria occurrence.

Malaria incidence hotspots and high risk areas
Malaria incidence hotspots were identified with the G statistics, based on whether large numbers of cases, measured in terms of API, tend to cluster in an area. The highest value of Gi is 5.211 and the lowest is -1.430. Z scores were calculated to test statistical significance. HSCs with a Z score of more than 2.56 were considered significant at the 99% confidence level (p<0.01) and placed in the hotspot category. Udalguri district, located in the foothills of Bhutan, was identified as the district with the maximum number of HSCs in the hotspot category, followed by Baksa district with 70 HSCs in the hotspot category. On the other hand, the districts of Dibrugarh and Kamrup (Metro) each have only two HSCs identified as hotspots (Figure 2). The HSCs located in the foothills have relatively high malaria incidence in terms of API, and this is reflected in the larger number of malaria incidence hotspots there.

Relation of Malaria incidence hotspots with vector habitats
It is interesting to note that the majority of the malaria incidence hotspots are in the foothill areas of the state. The foothill areas from Kokrajhar to Dhemaji reported the maximum number of hotspots. This observation is consistent with the habitat preferences of the most dominant mosquito species in the state, Anopheles minimus and An. baimai, which prefer slow-flowing streams in the foothills and forest ecosystems, respectively. Vegetation cover within the 3 km buffer of villages under malaria incidence hotspots shows that kharif crop areas (grown during June/July to Nov/Dec) occupy the major share in all the agro-climatic zones, with coverage ranging from 16 to 50% (Table 2 & Figure 3). In LBVZ and CBVZ, deciduous forest covers about 12% of the total land cover area. In HZ, forest covers about 63% of the total land cover area, of which about 43% is dense forest (evergreen and semi-evergreen). Significant areas of evergreen and semi-evergreen forest are observed in the malaria incidence hotspots of UBVZ (11%). This corroborates the results of Srivastava et al. (2004), who reported that two types of forest, namely evergreen tropical wet and moist deciduous forests found along the Himalayan foothills, are favourable for the distribution of An. baimai.

Relation of Malaria incidence hotspots with elevation
In view of the observation that the majority of the malaria incidence hotspots are located along the foothill areas of the state, a detailed analysis was carried out by correlating the average elevation of the villages under malaria hotspots with the year-wise malaria incidence. A positive correlation was observed in all the agro-climatic zones except the hill zone, as would be expected. Among the agro-climatic zones, UBVZ showed a highly significant positive correlation during the years 2008-2012 (p<0.005). This may be due to the fact that An. minimus prefers to breed in clear, unpolluted, slow-moving water with grassy and partially shaded edges (Nagpal and Sharma, 1987).
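A correlation of this kind can be sketched as follows. The text does not state which correlation coefficient was used, so the Pearson coefficient and its t statistic (with n - 2 degrees of freedom) are assumed here for illustration; the function names and the elevation and API numbers are hypothetical and are not taken from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero; compare with t(n - 2) tables."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Hypothetical village-level data: mean elevation (m) and API.
elevation_m = [20.0, 60.0, 110.0, 170.0, 240.0]
api = [2.5, 4.0, 5.5, 5.0, 7.5]
r = pearson_r(elevation_m, api)
print(round(r, 3), round(t_statistic(r, len(api)), 3))
```

The t statistic is left to be compared with tabulated critical values because the Python standard library has no Student's t distribution; a statistics package would return the p-value directly.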

CONCLUSION
The study shows the potential of spatial statistics analysis for delineating disease incidence hot spots at the health sub centre level. Maximum attention should be given to these malaria incidence hotspots by the health department authorities to minimise fatalities. The information generated in this exercise will serve as a baseline and will help in future monitoring of the disease in the state. Studying the hot spots in terms of physiographic and climatic factors has helped in understanding the thresholds of the different parameters responsible for disease transmission and outbreaks.

Figure 1. Location of study area

Collection of Malaria case data
Data pertaining to malaria cases during the period 2008-2013 were collected from the offices of the Joint Director of Health Services located at the different district headquarters and from the office of the Directorate of Health Services, Guwahati. Annual Parasite Incidence (API), which is the number of confirmed cases during one year per 1000 population, was calculated based on the PHC-wise census records collected from the respective PHCs and compiled under the National Vector Borne Disease Control Programme (NVBDCP). We considered only the HSCs with an API of more than 2.
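The API definition and the selection rule above amount to a one-line calculation, sketched here in Python. The function names, the HSC labels and the case and population counts are hypothetical and serve only to show the arithmetic.

```python
def annual_parasite_incidence(confirmed_cases, population):
    """API = confirmed cases in one year per 1000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000.0 * confirmed_cases / population


def select_hscs(records, threshold=2.0):
    """Keep only the HSCs whose API exceeds the threshold.

    records maps an HSC name to (confirmed_cases, population).
    """
    selected = {}
    for name, (cases, population) in records.items():
        api = annual_parasite_incidence(cases, population)
        if api > threshold:
            selected[name] = api
    return selected


# Hypothetical records for two health sub centres.
example = {"HSC-A": (15, 5000), "HSC-B": (4, 4000)}
print(select_hscs(example))
```

Here HSC-A has an API of 3 and is retained, while HSC-B has an API of 1 and is dropped.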

Figure 2. Number of HSCs identified as hotspots in different districts of Assam

Figure 3. Malaria incidence hotspots with different land use/land covers

Figure 4. Correlation of malaria incidence with elevation in different agro-climatic zones of Assam

Table 1. Climatic conditions of agro-climatic zones and districts covered

Table 3. Correlation of malaria incidence hotspots with elevation

Table 2. Land use/land cover classes within the three-kilometre buffer of villages under malaria hotspots