EFFECTIVENESS OF SPECTRAL SIMILARITY MEASURES TO DEVELOP PRECISE CROP SPECTRA FOR HYPERSPECTRAL DATA ANALYSIS

The present study was undertaken to assess the effectiveness of spectral similarity measures for developing precise crop spectra from collected hyperspectral field spectra. In multispectral and hyperspectral remote sensing, pixels are classified by statistical comparison (by means of spectral similarity) of known field or library spectra with unknown image spectra. Though these algorithms are readily used, little emphasis has been placed on using spectral similarity measures to select precise crop spectra from a set of field spectra. Conventionally, crop spectra are developed after rejecting outliers based only on broad-spectrum analysis. Here, precise crop spectra were developed based on spectral similarity. Because the use of unevaluated data leads to uncertainty in image classification, evaluating the data is crucial. Hence, in contrast to the conventional method, the field spectra were refined for precision before use. The effectiveness of the precise field spectra was evaluated by spectral discrimination measures, which gave higher discrimination values than spectra developed conventionally. Overall classification accuracy is 51.89% for the image classified using field spectra selected conventionally and 75.47% for the image classified using field spectra selected precisely based on spectral similarity. KHAT values are 0.37 and 0.62, and Z values are 2.77 and 9.59, for the images classified using conventional and precise field spectra respectively. The reasonably higher classification accuracy, KHAT and Z values show the possibility of a new approach to field spectra selection based on spectral similarity measures.


INTRODUCTION
Spectral similarity measures are effectively used to distinguish between vegetation and background soil (Chang, 2000; Du et al., 2004), to distinguish among mineral spectra (Van der Meer 2005) and to discriminate crop varieties (Kong et al. 2010). They can equally be used to assess the similarity among collected field spectra. To serve the purpose of the study, three spectral similarity measures are used. The measures are selected so that together they reveal the complete information on spectral similarity. The study uses two deterministic measures, i.e. i) spectral distance based and ii) spectral angle based, and one stochastic (self-information based) measure. Spectral similarity values among spectra of the same crop are computed by the three measures, and the spectra having larger values are considered outliers and hence rejected. The effectiveness of the selected spectra in classifying the remotely sensed hyperspectral data into the required classes is evaluated by spectral discriminatory measures, which makes the comparison more reliable and precise. Spectral discriminatory values decide which measure best discriminates among the crop classes into which the image is to be classified. Conventionally, crop spectra are selected by averaging collected field spectra after rejecting outliers based on spectral shape. However, the effectiveness of the selected spectra in classifying the image into the required classes is never evaluated in the conventional method. Thus the outcome of the present research creates a fresh opportunity for evaluating the spectra used to carry out classification.

Novel Contribution from the Study
Outlier rejection addressed through 'training set refinement' conventionally uses divergence, transformed divergence, Jeffries-Matusita distance, etc., which use second-order and third-order statistics (Pearlman et al., 2003).

Atmospheric correction needs to be applied before further processing and analysis. Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH), an efficient code for atmospheric correction, was applied using the ENVI commercial software package (Kruse F.A., 2008).

Field observations were carried out concurrent to the satellite pass. The field observations include ground-based hyperspectral reflectance measured using a GER 1500 spectroradiometer, with GPS locations, covering all major crop types. The field observations were taken over 106 sites. The spectroradiometer has 512 channels covering a range of 325 to 1075 nm. Gathering spectra at a given location involves optimizing the integration time, providing fore-optic information, recording dark current and collecting white reference reflectance. The target reflectance is the ratio of energy reflected off the target (crop) to energy incident on the target (measured using a BaSO4 white reference). The reflectance measurements were made from one meter above the crop canopy with the sensor facing the crop and oriented normal to the plant. The readings were taken on cloud-free days before noon (local solar time). While taking the observations, explicit care was taken not to cast shadows over the area being scanned by the instrument.

SPECTRAL SIMILARITY AND DISCRIMINATION MEASURES
The effectiveness of three spectral similarity measures was evaluated by two spectral discrimination measures in this study.
3.1.1 City Block Distance Measure (CBD): CBD (Chang 2003) computes the difference vector between two pixel vectors to determine spectral similarity. For a given hyperspectral pixel vector $x = (x_1, \ldots, x_L)^T$, each component $x_l$ represents a pixel in band image $B_l$, which is acquired at a certain wavelength $\lambda_l$ in a specific spectral range. Let $s = (s_1, \ldots, s_L)^T$ be the corresponding spectral signature (i.e., spectrum) of $x$, where $s_l$ represents the spectral signature of $x_l$ in the form of either radiance or reflectance values. $\{\lambda_l\}_{l=1}^{L}$ is a set of $L$ wavelengths, each of which corresponds to a spectral band channel. The difference vector between the spectral signatures $s_i = (s_{i1}, \ldots, s_{iL})^T$ and $s_j = (s_{j1}, \ldots, s_{jL})^T$ of two pixel vectors can be derived from the $l_1, l_2, \ldots, l_m$ norms in real analysis. CBD corresponds to the $l_1$ norm:
$$\mathrm{CBD}(s_i, s_j) = \sum_{l=1}^{L} |s_{il} - s_{jl}| \quad (1)$$
3.1.2 Spectral Angle Measure (SAM): SAM is a widely used spectral similarity metric in remote sensing. It measures spectral similarity by finding the angle between the spectral signatures of two pixel vectors $s_i$ and $s_j$:
$$\mathrm{SAM}(s_i, s_j) = \cos^{-1}\left(\frac{\sum_{l=1}^{L} s_{il}\, s_{jl}}{\sqrt{\sum_{l=1}^{L} s_{il}^{2}}\,\sqrt{\sum_{l=1}^{L} s_{jl}^{2}}}\right) \quad (2)$$
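The two deterministic measures can be sketched in a few lines of Python. This is a minimal illustration, assuming each spectrum is a plain list of reflectance values sampled at the same bands; the function names and the sample spectra are hypothetical and are not taken from the study's software or data.

```python
import math

def city_block_distance(s_i, s_j):
    """CBD: sum of absolute band-wise differences (l1 norm of the difference vector)."""
    if len(s_i) != len(s_j):
        raise ValueError("spectra must have the same number of bands")
    return sum(abs(a - b) for a, b in zip(s_i, s_j))

def spectral_angle(s_i, s_j):
    """SAM: angle in radians between two spectra treated as vectors."""
    dot = sum(a * b for a, b in zip(s_i, s_j))
    norm_i = math.sqrt(sum(a * a for a in s_i))
    norm_j = math.sqrt(sum(b * b for b in s_j))
    cos_theta = dot / (norm_i * norm_j)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    return math.acos(max(-1.0, min(1.0, cos_theta)))

# Hypothetical three-band spectra, for illustration only.
a = [0.10, 0.20, 0.40]
b = [0.20, 0.40, 0.80]  # same shape as a, twice as bright
c = [0.40, 0.20, 0.10]  # different shape

cbd_ab = city_block_distance(a, b)
sam_ab = spectral_angle(a, b)
sam_ac = spectral_angle(a, c)
```

In this example CBD reports a clear difference between `a` and `b`, while SAM is close to zero because the angle ignores a uniform brightness scaling. This is one reason a distance-based and an angle-based measure carry complementary information.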

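3.1.3 Spectral Information Divergence Measure (SID):
The spectral information divergence measure (Chang 2000) calculates the distance between the probability distributions produced by the spectral signatures of two pixels, defined as
$$\mathrm{SID}(s_i, s_j) = D(r_i \| r_j) + D(r_j \| r_i) \quad (3)$$
and derived from two probability vectors $p = (p_1, p_2, \ldots, p_L)^T$ and $q = (q_1, q_2, \ldots, q_L)^T$ for the spectral signatures $s_i$ and $s_j$ of two pixel vectors $r_i$ and $r_j$, where $p_l = s_{il} / \sum_{k=1}^{L} s_{ik}$, $q_l = s_{jl} / \sum_{k=1}^{L} s_{jk}$,
$$D(r_i \| r_j) = \sum_{l=1}^{L} p_l \left( I_l(r_j) - I_l(r_i) \right) = \sum_{l=1}^{L} p_l \log\frac{p_l}{q_l} \quad (4)$$
$$D(r_j \| r_i) = \sum_{l=1}^{L} q_l \left( I_l(r_i) - I_l(r_j) \right) = \sum_{l=1}^{L} q_l \log\frac{q_l}{p_l} \quad (5)$$
and $I_l(r_j) = -\log q_l$ and similarly $I_l(r_i) = -\log p_l$. The measures $I_l(r_j)$ and $I_l(r_i)$ are referred to as the self-information of $r_j$ and $r_i$ for band $l$ (Kullback 1959; Cover and Thomas 1991). Note that Eqs. (4) and (5) represent the relative entropy of $r_j$ with respect to $r_i$ and of $r_i$ with respect to $r_j$, respectively.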
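A minimal Python sketch of SID follows, assuming strictly positive reflectance values so that every band probability is non-zero and the logarithms are defined. The helper names and the sample spectra are hypothetical.

```python
import math

def to_probability(s):
    """Normalise a spectrum so its band values sum to 1."""
    total = sum(s)
    return [v / total for v in s]

def relative_entropy(p, q):
    """Sum over bands of p_l * log(p_l / q_l)."""
    return sum(pl * math.log(pl / ql) for pl, ql in zip(p, q))

def sid(s_i, s_j):
    """SID: symmetric sum of the two relative entropies."""
    p = to_probability(s_i)
    q = to_probability(s_j)
    return relative_entropy(p, q) + relative_entropy(q, p)

# Hypothetical three-band spectra, for illustration only.
a = [0.10, 0.20, 0.40]
b = [0.20, 0.40, 0.80]  # same shape as a, twice as bright
c = [0.40, 0.20, 0.10]  # different shape

sid_ab = sid(a, b)
sid_ac = sid(a, c)
```

Because the spectra are normalised to probability vectors first, SID is insensitive to a uniform brightness scaling (as for `a` and `b`) and responds only to differences in spectral shape.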

Spectral Discrimination Measure
Spectral discrimination measures were used as objective statistical criteria to evaluate the performance of spectral similarity measures. PSD calculates the discriminatory probabilities of all the spectral signatures in a spectral library or database. Let $\{s_k\}_{k=1}^{K}$ be the $K$ spectral signatures in the set $\Delta$, which can be considered as a database, and let $t$ be any specific target spectral signature to be identified using $\Delta$. The probabilities of spectral discrimination of all $s_k$'s in $\Delta$ relative to $t$ are defined as follows:
$$P_{t,\Delta}(k) = \frac{m(t, s_k)}{\sum_{j=1}^{K} m(t, s_j)}, \quad k = 1, 2, \ldots, K \quad (6)$$
where $\sum_{j=1}^{K} m(t, s_j)$ is a normalization constant determined by $t$ and $\Delta$. The resulting probability vector $P_{t,\Delta} = (P_{t,\Delta}(1), P_{t,\Delta}(2), \ldots, P_{t,\Delta}(K))^T$ is called the probability of spectral discrimination (PSD) of $\Delta$ with respect to $t$, or the spectral discriminatory probability vector of $\Delta$ relative to $t$. The $\mathrm{PWSD}_m(s_i, s_j; d)$ defined by Eq. (7) provides a quantitative index of the spectral discrimination capability of a specific hyperspectral measure $m(\cdot,\cdot)$ between two spectral signatures $s_i$ and $s_j$ relative to $d$. The higher the $\mathrm{PWSD}_m(s_i, s_j; d)$ is, the better the discriminatory power of $m(\cdot,\cdot)$.
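The PSD vector can be sketched as below. This is an illustrative outline that assumes the similarity measure is passed in as a function; city block distance is used here only as a stand-in, and the target and library spectra are hypothetical values rather than data from the study.

```python
def city_block_distance(s_i, s_j):
    """Stand-in similarity measure m(.,.): l1 distance between two spectra."""
    return sum(abs(a - b) for a, b in zip(s_i, s_j))

def psd(target, library, measure):
    """Normalise m(t, s_k) over the library so the values sum to 1."""
    distances = [measure(target, s_k) for s_k in library]
    total = sum(distances)  # normalization constant determined by t and the library
    return [d / total for d in distances]

# Hypothetical target and three-signature library, for illustration only.
t = [0.10, 0.20, 0.40]
library = [
    [0.10, 0.20, 0.50],
    [0.30, 0.20, 0.40],
    [0.10, 0.50, 0.40],
]

psd_vector = psd(t, library, city_block_distance)
```

Any measure with the same call signature (for example a SAM or SID function) could be substituted for `city_block_distance` to compare the measures on the same library.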

Analysis of Spectral Similarity to select Precise Spectra
The spectra collected during the field study were processed using GER 1500 data acquisition software. Conventionally, collected field spectra were averaged after rejecting outliers based on spectral shape for the three major crops, chickpea, sorghum and wheat (Rao et al. 2007). The results are referred to as field spectra developed conventionally and are presented in figure 1(a). To develop precise spectra based on spectral similarity, CBD, SAM and SID similarity values were computed among spectra of the same crop. The spectra having larger values with respect to the other spectra of a crop class were considered outliers and rejected. Tables 1, 2, 3, 4, 5 and 6 show that spectrum 4 for wheat, spectrum 6 for sorghum and spectra 11 and 12 for chickpea have the largest CBD, SAM and SID values with respect to all other spectra; hence they are considered outliers and rejected. The average of the remaining 6 spectra of wheat, 5 spectra of sorghum and 10 spectra of chickpea is identified as the precise field spectrum based on spectral similarity, as presented in figure 1(b).
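The selection step described above can be sketched as follows. The text does not state a numeric rejection rule, so this sketch assumes one: each spectrum is scored by its mean distance to the other spectra of the same crop, and a fixed number of highest-scoring spectra is rejected. The function names, the `n_reject` parameter and the sample spectra are all hypothetical.

```python
def city_block_distance(s_i, s_j):
    """Stand-in similarity measure: l1 distance between two spectra."""
    return sum(abs(a - b) for a, b in zip(s_i, s_j))

def mean_distance_to_others(spectra, measure):
    """Score each spectrum by its mean distance to every other spectrum."""
    n = len(spectra)
    return [
        sum(measure(spectra[i], spectra[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]

def reject_outliers(spectra, measure, n_reject=1):
    """Drop the n_reject spectra with the largest scores (assumed rule)."""
    scores = mean_distance_to_others(spectra, measure)
    ranked = sorted(range(len(spectra)), key=lambda i: scores[i], reverse=True)
    rejected = sorted(ranked[:n_reject])
    kept = [s for i, s in enumerate(spectra) if i not in rejected]
    return kept, rejected

def average_spectrum(spectra):
    """Band-wise mean of the retained spectra."""
    return [sum(band) / len(spectra) for band in zip(*spectra)]

# Hypothetical field spectra of one crop; the last one has a different shape.
field = [
    [0.10, 0.20, 0.40],
    [0.11, 0.21, 0.41],
    [0.09, 0.19, 0.39],
    [0.40, 0.20, 0.10],
]

kept, rejected = reject_outliers(field, city_block_distance, n_reject=1)
precise = average_spectrum(kept)
```

In practice the same scoring would be repeated with SAM and SID, and a spectrum would be rejected when all three measures agree that it is the most dissimilar, which is the situation the tables report for the rejected wheat, sorghum and chickpea spectra.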

PSD and PWSD for Conventional and Precise Spectra
The PSD gives, for all spectra or a selected set of spectral signatures from a spectral library, the probability of classifying a pixel spectrum to the target class. The higher the probability, the better the capability of the set of spectra to predict the pixel spectrum. PWSD evaluates the effectiveness of spectral similarity measures in predicting a specific target spectrum. It is designed based on the power of discriminating one pixel vector from another relative to a reference pixel vector (Van der Meer 2005). A higher PWSD indicates better discrimination among the spectra. PSD and PWSD were computed separately for the spectra of figure 1(a) and 1(b), and a comparative analysis was carried out. Tables 7 and 8 show that the spectra developed based on spectral similarity increase the PSD of chickpea and sorghum for CBD and SID. Hence chickpea and sorghum now have a higher probability for classification of the target pixel. The PWSD of chickpea and wheat also increases for all measures, which indicates that chickpea and wheat can be better discriminated than sorghum. For wheat, a marginal decrease in the PSD but an increase in the PWSD has been observed. Thus, wheat is better discriminated from the other classes. Even though SAM is a very widely used algorithm for hyperspectral image classification, the PSD and PWSD results do not favour selecting SAM as the classification algorithm. Among all measures, SID discriminates best among the classes, as it has the highest PSD and PWSD. Thus, by selecting SID as the classification method, higher classification accuracy can be expected.

Crop Classification and Accuracy Assessment
Crop classification was carried out using the Spectral Information Divergence (SID) algorithm. SID (Du et al. 2004) is a spectral classification method that uses a divergence measure to match pixels to reference spectra. The smaller the divergence, the more likely the pixels are similar. Pixels with a measurement greater than the specified maximum divergence threshold are not classified. Hyperion's atmospherically corrected image is used as input data for classification. The ultimate aim of the image analysis was crop classification. To restrict classification to agricultural areas, pixels having an NDVI value less than 0.4 (threshold decided from field knowledge) were masked. Classification was carried out with two sets of input spectra, i.e. field spectra developed conventionally and precise field spectra developed based on spectral similarity. The image was classified into three major crop classes, chickpea, sorghum and wheat, and pixels not belonging to these classes remain unclassified. After classification, georeferencing was carried out for accuracy assessment. The image was georeferenced using 9 GCPs to an accuracy of 0.1 pixel. Accuracy assessment of the classified images was carried out for both cases by cross-validation of classified pixels against testing pixels collected during the field study, and an error matrix was generated to test the accuracy (..., 2010; Jensen, 2004; Richards and Jia, 2006). The Kappa analysis technique is used to measure the agreement between two observers on the same data; for remote sensing, it is used to measure the agreement between the classification approaches. Since it takes into account the whole error matrix instead of only the diagonal elements, as the overall accuracy does, it has been recommended (Fung and Ledrew, 1988) as a suitable measure of classification accuracy. The Kappa analysis was applied by means of the formula given below,
$$\hat{K} = \frac{N \sum_{i=1}^{r} x_{ii} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})}{N^{2} - \sum_{i=1}^{r} (x_{i+} \cdot x_{+i})} \quad (8)$$
where
$r$ = number of rows in the error matrix
$x_{ii}$ = number of observations in row $i$ and column $i$ (on the major diagonal)
$x_{i+}$ = total number of observations for row $i$
$x_{+i}$ = total number of observations for column $i$
$N$ = total number of observations in the error matrix
Kappa is a dimensionless real number between -1 and 1; a value close to 1 indicates maximum agreement, while a value of -1 can be interpreted as total disagreement. Landis and Koch (1977) proposed a classification of agreement based on the value of Kappa (Table 11). The test statistic Z for comparing two Kappa coefficients (Skidmore, 1999) is obtained using formula (9), derived by Fleiss et al. (1969):
$$Z = \frac{|K_1 - K_2|}{\sqrt{\mathrm{Var}_1 + \mathrm{Var}_2}} \quad (9)$$
where $K_1$ and $K_2$ are the Kappa coefficients of the images classified by the conventional and precise crop spectral libraries respectively, and $\mathrm{Var}_1$ and $\mathrm{Var}_2$ are the variances of the respective Kappa statistics. The Z statistic follows a normal distribution. For the Z test, with the null hypothesis $H_0: K_1 = K_2$ and the alternative $H_1: K_1 \neq K_2$, $H_0$ is rejected if the Z value obtained is greater than 1.96; the classification results (error matrices) are then significantly different at the 95% confidence level. If the Z value obtained is less than 1.96, $H_0$ is not rejected, i.e. the classification results (error matrices) are not significantly different at the 95% confidence level. The Z test was carried out for remote sensing data by Dwivedi et al. (2003). Table 12 also presents the variance of the KHAT statistics and the Z statistics used for determining whether the classification is significantly better than a random result. At the 95% confidence level, the critical value is 1.96. Therefore, if the absolute value of the Z statistic is greater than 1.96, the result is significant and it can be concluded that the classification is better than random. The Z statistic values for the two error matrices in Table 12 are both higher than 1.96, so both classifications are significantly better than random. The pairwise comparison of the two classification results also gives a higher Z statistic value, which shows that the two classification results are significantly different.
According to the Kappa values, the image classified using spectra selected conventionally shows fair agreement (KHAT = 0.37), while the image classified using precise spectra shows substantial agreement owing to its higher Kappa value (KHAT = 0.62). Hence the image classified using precise spectra shows a substantial improvement in image classification, as validated by the Kappa statistics.
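The accuracy measures discussed above can be sketched in Python as follows. This is an illustrative outline only: the error matrix and the Kappa variances are hypothetical numbers, not the values of Tables 9, 10 or 12; the pairwise test assumes the usual form with the square root of the summed variances in the denominator; and the variance of Kappa itself is taken as an input rather than computed.

```python
import math

def overall_accuracy(matrix):
    """Fraction of observations on the major diagonal of the error matrix."""
    n = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / n

def khat(matrix):
    """KHAT from a square error matrix (rows and columns in the same class order)."""
    r = len(matrix)
    n = sum(sum(row) for row in matrix)
    diagonal = sum(matrix[i][i] for i in range(r))
    row_totals = [sum(matrix[i]) for i in range(r)]
    col_totals = [sum(matrix[i][j] for i in range(r)) for j in range(r)]
    chance = sum(row_totals[i] * col_totals[i] for i in range(r))
    return (n * diagonal - chance) / (n * n - chance)

def z_pairwise(k1, var1, k2, var2):
    """Z statistic for the difference between two independent Kappa values."""
    return abs(k1 - k2) / math.sqrt(var1 + var2)

# Hypothetical 3-class error matrix, for illustration only.
error_matrix = [
    [20, 5, 5],
    [4, 22, 4],
    [6, 3, 21],
]

accuracy = overall_accuracy(error_matrix)
kappa = khat(error_matrix)

# Hypothetical Kappa values and variances for two classifications.
z = z_pairwise(0.40, 0.004, 0.60, 0.005)
significantly_different = z > 1.96  # 95% confidence level
```

Kappa is lower than the overall accuracy for the same matrix because the chance-agreement term, built from the row and column totals, is subtracted from both the numerator and the denominator.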

CONCLUSIONS
The results of the PSD, PWSD, accuracy assessment, Kappa statistics and Z values show that developing spectra based on spectral similarity is an effective way to develop spectra for crops. The reasonably higher overall accuracy, Kappa statistics and Z values show an improvement in the classification results, indicating that spectra developed based on spectral similarity were more effective in distinguishing among the classes. As the field has three major classes, the approach was applied to three classes only, but it is applicable to any number of classes and also to other land use land cover classes, since it is not class specific. The approach can be effectively applied to the visually inseparable Class II category of land use land cover classification.
Probability of Spectral Discrimination (PSD; Chang 2003) and Power of Spectral Discrimination (PWSD; Chang 2003) are the spectral discrimination measures used to check the effectiveness of a spectral measure in classifying a set of spectral classes on the basis of a spectral library. 3.2.1 Probability of Spectral Discrimination (PSD): PSD relates to the selected spectral signatures in a spectral library used to train a classifier (Van der Meer 2005).
This contribution has been peer-reviewed. The double-blind peer-review was conducted on the basis of the full paper. doi:10.5194/isprsannals-II-8-83-2014

Figure 2: Image data: a) Hyperion data, b) Classified image using field spectra developed conventionally, c) Classified image using field spectra developed precisely based on spectral similarity, d) Zoomed portion of encircled area for image classified using spectra developed conventionally, e) Zoomed portion of encircled area for image classified using spectra developed precisely.
Atmospheric correction removes the effects of atmospheric scattering and absorption features; the corrected data contain 168 bands. The acquired Hyperion image covers the region with central coordinates 20º N latitude and 76.5º E longitude, comprising Lonar and Mehekar in Buldhana District of Maharashtra, India. The study area has a tropical semi-arid climate with alluvial soil types. The area is fertile, with irrigated crops in the Rabi season, which lasts from October through February. The major crops during the Rabi season are chickpea, sorghum and wheat. The minor crops are seasonal vegetables and fruits.
3.2.2 Power of Spectral Discrimination (PWSD): The $\mathrm{PWSD}_m(s_i, s_j; d)$ is defined as
$$\mathrm{PWSD}_m(s_i, s_j; d) = \max\left\{ \frac{m(s_i, d)}{m(s_j, d)}, \frac{m(s_j, d)}{m(s_i, d)} \right\} \quad (7)$$
More precisely, $\mathrm{PWSD}_m(s_i, s_j; d)$ selects as the discriminatory power of $m(\cdot,\cdot)$ the maximum of two ratios, the ratio of $m(s_i, d)$ to $m(s_j, d)$ and the ratio of $m(s_j, d)$ to $m(s_i, d)$. $\mathrm{PWSD}_m(s_i, s_j; d)$ is 1 if $s_i = s_j$ and in all other cases larger than 1, except for the case in which two spectra are different and equidistant from a third spectrum.
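Equation (7) translates directly into a short function. The sketch below assumes the measure is passed in as a function and that neither spectrum coincides with the reference `d`, since a zero distance would make one of the ratios undefined. City block distance is used only as a stand-in, and all spectra are hypothetical.

```python
def city_block_distance(s_i, s_j):
    """Stand-in similarity measure m(.,.): l1 distance between two spectra."""
    return sum(abs(a - b) for a, b in zip(s_i, s_j))

def pwsd(s_i, s_j, d, measure):
    """Larger of the two ratios m(s_i, d)/m(s_j, d) and m(s_j, d)/m(s_i, d)."""
    m_i = measure(s_i, d)
    m_j = measure(s_j, d)
    if m_i == 0 or m_j == 0:
        raise ValueError("PWSD is undefined when a spectrum equals the reference")
    return max(m_i / m_j, m_j / m_i)

# Hypothetical reference and candidate spectra, for illustration only.
d = [0.10, 0.20, 0.40]
s_1 = [0.10, 0.20, 0.50]  # city block distance 0.1 from d
s_2 = [0.30, 0.20, 0.40]  # city block distance 0.2 from d

power = pwsd(s_1, s_2, d, city_block_distance)
same = pwsd(s_1, s_1, d, city_block_distance)
```

Taking the maximum of the two ratios makes the result symmetric in `s_1` and `s_2` and never smaller than 1, so values from different measures can be compared on the same scale.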

Table 6 :
SAM and SID among chickpea field spectra. For Tables 2, 4 and 6, the upper triangle with respect to the diagonal shows SAM results while the lower triangle shows SID results.

Table 7 :
Spectral similarity and discriminatory values: Spectra developed conventionally

Table 8 :
Spectral similarity and discriminatory values: Spectra developed precisely based on spectral similarity

Table 9 :
Error matrix for the image classified using field

Table 10 :
Error matrix for the image classified using field

Table 11 :
The Kappa statistics obtained from the error matrices of the two classifications were compared to determine whether they are significantly different. The normally distributed Z statistic was obtained as the ratio of the difference between the two Kappa coefficients to the square root of the sum of their respective variances. Table 12 presents the results of the Kappa analysis on the individual error matrices. The KHAT values are a measure of agreement or accuracy. The values can range from +1 to -1.