INCORPORATING UNCERTANITY INTO MARKOV RANDOM FIELD CLASSIFICATION WITH THE COMBINE USE OF OPTICAL AND SAR IMAGES AND ADUPTIVE FUZZY MEAN VECTOR

A Markov Random Field (MRF) model accounting for the classification uncertainty using multisource satellite images and an adaptive fuzzy class mean vector is proposed in this study. The work also highlights the initialization of the class values for an MRF based classification for synthetic aperture radar (SAR) images using optical data. The model uses the contextual information from the optical image pixels and the SAR pixel intensity with corresponding fuzzy grade of memberships respectively, in the classification mechanism. Sub pixel class fractions estimated using Singular Value Decomposition (SVD) from the optical image initializes the class arrangement for the MRF process. Pair-site interactions of the pixels are used to model the prior energy from the initial class arrangement. Fuzzy class mean vector from the SAR intensity pixels is calculated using Fuzzy C-means (FCM) partitioning. Conditional probability for each class was determined by a Gamma distribution for the SAR image. Simulated annealing (SA) to minimize the global energy was executed using a logarithmic and power-law combined annealing schedule. Proposed technique was tested using an Advanced Land Observation Satellite (ALOS) phased array type L-band SAR (PALSAR) and Advanced Visible and Near-Infrared Radiometer-2 (AVNIR-2) data set over a disaster effected urban region in Japan. Proposed method and the conventional MRF results were evaluated with neural network (NN) and support vector machine (SVM) based classifications. The results suggest the possible integration of an adaptive fuzzy class mean vector and multisource data is promising for imprecise class discrimination using a MRF based classification.


INTRODUCTION
Urban areas occupy a very small region of the earth surface.However the rapid change of the urban land-cover categories within a small distance resulting similar entities at different locations, majority of land-cover types being internally heterogeneous and intermediate conditions of the class boundaries make the urban landscape a complex and vague entity (Forster, 1983;Wang, 1990;Wood and Foody, 1993;Bastin et al., 2002;Fisher et al., 2006;Tang et al., 2007;Foody and Matur, 2006).Satellite missions with their global view over the earth surface provide useful and rapid information for urban mapping, map updating and urban change detection.Most well-known classification approaches were developed using multispectral images for urban land cover features.The diversity of urban land cover components with similar spectral signatures and the effect of point spread function (PSF, i.e., the response of an imaging system to a point source) make it difficult to classify them even in multispectral images (Zhang et al., 2014).As a result the use of multi-source satellite images is currently considered as one promising approach to improve the vague urban land cover classification accuracy (Weng , 2012).
The characteristics of the optical and SAR sensors are different to each other.Multispectral satellites such as AVNIR-2 provide information on the energy scattered and radiated by the Earth's surface in different wave lengths, providing the ability to discriminate between different land cover classes such as vegetation, water and impervious surface.Optical sensors are not all-weather imaging systems providing atmospheric limitations such as clouds.On the other hand SAR sensors such as PALSAR provide measurements in amplitude and phase related to the interaction of the land cover classes with microwaves.Backscattering information over urban areas recorded by the SAR sensors typically reaches high values due to single, double-bounce and trihedral reflections from the manmade structures with low values for vegetated areas and water sources.The use of SAR data in land cover classification of urban areas is relatively limited due to the peculiar imaging geometry, complexity of the microwave interaction with urban feature and the presence of coherent fading (speckle).These issues in combination make it difficult to determine proper class distributions and parameter estimation for SAR data using the Bayesian classification mechanisms.Speckle makes it difficult to use a single source SAR image to initialize the class arrangement in MRF for the prior probability estimation.As a result a combination of SAR and optical sensor data will be useful to provide a complete set of information for the classification.
Bayesian statistics have been used widely as a theoretically robust foundation for image classification (Tso and Mather, 1999).In order to maximize the posterior probability with maximum a posterior estimation (MAP) it is important to determine both class-conditional and prior probabilities as accurate as possible.Modelling them with MRF using optical and SAR images respectively is a convenient way of developing a contextual classification procedure for an individual SAR data.Several research works in the direction considered this framework for feature classification (Tso and Mather, 1999;Solberg et al., 1996;Zhang et al., 2014;Deng and Clausi, 2004;Yang and Clausi, 2012).Most of these works used the amplitude information of the SAR data to detect more homogeneous land cover elements (ex-Sea Ice, Lithological types).In this study we also consider the fact that the classification results lack additional information related to the degree of certainty or the complexity associated with the heterogeneous urban land cover classes.Fuzzy set theory provides the mathematical tool to model imprecise, incomplete and unreliable information classes (Zadeh et al., 1965).The author has also reported the advantage of using fuzzy class parameters in MRF models to generate super resolution land cover maps with optical images in his previous work (Welikanna et al., 2012).The work we present in this letter can be considered as an extension of ongoing work (Ardila et al., 2011;Kasetkasem et al., 2002;Kasetkasem et al., 2005;Welikanna et al., 2008).
Several key implementations are brought to focus in this study (a) fuzzification in source universe: which determines the memberships values of each class for the SAR intensity pixels, prior to the determination of its conditional probability density function (p.d.f) using a Gamma distribution (b) initial class arrangement for MRF using optical images with the highest sub pixel class abundance in a pixel and (c) logarithmic and powerlaw combined annealing schedule to decrease the temperature for the MAP algorithm.

FUZZY MRF MODEL AND POSTERIOR ENERGY ESTIMATION
MRF based contextual classification models rely on the Hammersley-Clifford theorem system (Tso and Mather, 2009), which describes the equivalence between the Gibbs random fields (GRF) and MRF.This also means that a unique GRF exists for every MRF, as long as the GRF defines in terms of cliques on a neighborhood system (Tso and Mather, 2009).A clique is a set of neighbors defined with respect to the pixel to be classified.The resulting model is based on the energy functions, leading to a classification task of energy minimization.The prior and the likelihood energy terms are modelled using the optical and SAR images respectively.
Let and denotes the observed optical ( ) and SAR ( ) image pixel gray level values at a pixel location respectively.The intended label for each image pixel at location is represented by .Depending on the number of classes in the classification problem, varies.Within a considered pixel neighborhood N in the contextual model a neigboring pixel label with the pixel in concern is represented by .According to the Bayesian theory the posterior distribution can be determined by combining both the prior and conditional probabilities: Conferring to the equivalence of the MRF and GRF, probabilities in Eq. ( 1) can be defined by means of energy functions (Geman and Geman, 1984).Hence the posterior energies can be given as below: In the case of multisource data, the measurements from different sensors are assumed to be conditionally independent (Solberg et al., 1996).This simplifies the mathematical analysis.The validity of this assumption in the case of optical and SAR image integration is considerably higher due to different operational wavelengths and the acquisition times of the images.As a result the statistically independent conditional probability distribution can be determined as below: Therefore the MRF-MAP model for the combined SAR and optical sensors with corresponding reliability or discrimination criteria can be given as: According to Eq. ( 4) it is possible to integrate many different image indices such as SAR texture as well as GIS information in the proposed framework with an appropriate reliability criterion.In the experiments presented in the paper we essentially model the prior probability using the optical data while the conditional probability is modelled using the SAR intensity information.Combining Eq. ( 2) and Eq. ( 4) we can define the class conditional energy for multisource data as below: The conditonal distribution for a pixel given a true label of the multilook SAR intensity images follows a Gamma distribution (Deng and Clausi, 2005;Yang and Clausi, 2012).Thus the conditonal probaility for SAR intensity can be defined as follows: where denotes the number of looks and is the mean value for class .According to the Gamma distribution (Eq.( 6)), the relative likelihood of a SAR intensity pixel to be classified to a class is mainly controlled by the class mean values.Notably this distribution function doesnot account for any spatial relationship among different urban land cover classes.Therefore the conditonal energy can be derived according to the MRF and Gibbs equivalnce as shown in Eq. ( 7).
As a major implementation in this paper, we propopse an aduptive fuzzy mean vector for the classes in Eq. ( 7) with the use of fuzzy set theory.Fuzzy set theory provides the base to construct a more meaningful relationship between a pixel and a class using the grades of memberships.Intermidiate grades of memberships improve the pixel belongingness to a particular class providing much better estimation for the class mean values.This pixel weighting schemce is advantageous particularly for the distribution functions such as Gamma distribution, where the estimation is strongly controlled by class mean values.Let denote the universal set.A fuzzy set for class , ώ over is defined as the set of ordered pairs: where ( )( ) is termed the grade of membership or simply the membership function, of the element to the fuzzy set .Importantly all the elements in belong to with different grades of memberships.In the case satellite images, the pattern is described as a vector in an M-dimensional space, , where denotes the k th sensor data observation space (i.e. the source universe), and the number of sensors.In the case of multisensory observation space the universal set is the Cartesian product .If is a set of predefined classes then each class is defined as fuzzy set over Thus ( ) conveys information on the degree to which the pattern may be treated as belonging to class .
In the case of SAR average polarization (HH+VV/2, see section 3) used as the input data, observation space is only 1dimensional.To determine these membership values we use the fuzzy c-means partition matrix.The image (when number of bands, ) is portioned into a matrix ( ) with each element of the matrix ( ( )) showing the pixel membership value for class ω.Here is the number of classes and ( ) will be the number of pixels.The J function measures the difference between each pixel values and the cluster centres μ to estimate the F matrix (Tso and Mather, 2009).The clustering is performed according to the Eq.( 9): Where V= ( ) is the vector of the cluster centres, and being the membership weighting exponent .Parameter is also known as the fuzzy exponent, controlling the level of cluster fuzziness.For a particular class of interest, different values will result in different membership grades.Many optimal values for (ranging from 1.1 to 1.2) have been suggested but it's less explained why one value of is better than the other (Fisher 2010).The parameter was set to a usually accepted value of 2.0 (Fisher et al. 2006) in aid to add more fuzziness to the membership assignment.A local minimum for can be achieved under the condition shown in Eq.( 10).Here and : Let the training pixels for a particular class take values with ( ) being the number of training pixels for C number of classes.The discrete fuzzy mean ̅̅̅̅ can then be defined by using grade of membership estimated for a single image pixel in the sample: where is the number of pixels for a particular class sample and ( ) is the pixel membership grade, which satifies the conditon.From this point onwards we abbreviate this new implementation as fuzzy MRF (FMRF) and the conventional approach as MRF.The prior class information is modelled by using the optical image.For a particular pixel of the observed optical image, propotion of the land cover class of interest is determined using Singular Value Decomposition (SVD).Let ( ) ( ) ( ) ( ) represent the propotion of land cover in the pixel for class in the observed multispectral image.And let a corresponding pixel of the intial class image ( ) be .Then a class is assigned to a pixel based on the maximum class propotion recorded in any of the coarser resolution multspectral image pixel as follows: Under the assumption of pair-site interaction, in which the pixel of interest and one of its eight neighbors (2 nd order neighborhood) is considered, the prior energy is defined as: where ( ) is the local contribution to the prior energy from a pixel in the optical image.denotes the weigting parameter for each pixel and ( ) takes a value 0 if and 1 otherwise.Combining Eq.( 7) and Eq.( 13) we can define the posteriro energy for the multisource data sets as follows: The realiability factor controls the contributions from the optical and SAR imagery in the form of prior and likelihood energy in the posteriro energy determination.You can regulate this factor mainly considering, the level of details that the end user needs in the calssification process and also in terms of the highest classification accauracy.

Simulated Annealing with a combined annealing schedule
The MAP estimation of the posterior energy in Eq.( 14) is determined by using the stochastic Simulated Annealing technique (SA).This implements a Metropolis-Hasting sampling technique to reduce the energy and retrieve MAP solution.SA algorithm allows the randomness ( -temperature), to decrease in an iterative way so that the best solution for Eq. ( 14) can be made.The temperature will be decreased according to the criterion called annealing schedule.The process is repeated until the system becomes frozen ( ), which means pixels stop updating.The annealing schedule is very important considering the MAP algorithm.Different annealing schedules were tested in many studies.Logarithmic and power-law annealing schedules are two of the most preferred annealing schedules.Logarithmic schedules are considered to be the most promising in the case of better solutions for the global energy minimization (Geman and Geman, 1984).But it will lead to a slower annealing process.Hence in this study we combine two cooling schedules within the SA.Eq.( 15) and Eq.( 16) shows the logarithmic and the power-law annealing schedules used respectively.
where is the initial temperature, the cooling schedule parameter and the iteration number.For better results in complex situations, a large value of both these parameters which slows down the annealing process is recommended (Tolpekin and Stein 2009).We start with a logarithmic annealing schedule and made it to change for the power-law schedule after the temperature ( ( )) drops to a certain value.This temperature value was determined by several experimental runs.Detailed explanation of the SA technique can be found in literature for further understanding (Li, 2009, Tso and Mather, 2009, Geman and Geman, 1984).

STUDY AREA AND DATA
The region selected for the study covers the heavily damaged Ishinomaki and Onagawa areas in Japan.Ishinomaki city and the areas north to the city are located in a flat basin.Two main rivers flow through the area, one which runs to the south through Ishinomaki city (Old-Kitakami river) and the other which runs eastward through Ogatsu area (Kitakami river).Many of the primary land covers of the area belong to cultivated 300m m 300m m N farmlands while the impervious, soil, water and vegetation dominates the rest.As a significant change observed after the tsunami, the farmlands with exposed soil and grass were transformed into inundated farmlands.A post disaster Advance Land Observation Satellite (ALOS) phased array type L-band synthetic aperture radar (PALSAR) image and Advanced Visible and Near-Infrared Radiometer-2 (AVNIR-2) image was used in this study.These SAR datasets are the result from an urgent data acquisition after the earthquake on 11 th March 2011.Full polarimetric observations conducted on 8 th April 2011 was taken as the post-disaster input.Observation mode was an off-nadir angle of 21.5° in the ascending orbit.A single look PALSAR image carries a resolution of 4.45m in azimuth and 23.14m in ground range direction.The revisit cycle for ALOS is 46 days.The temporal base line is 138 days and the perpendicular base line is about 2Km.Such large temporal and spatial baseline can induce significant decorrelation effects and produce poor interferometric coherence.The PALSAR images were geocoded using UTM projection (zone 54N) and WGS84 Datum.Multilooking (5-look) processing in azimuth direction was performed to adjust the azimuth and range pixel size to be comparable, with a resulting spatial resolution of 25m.No speckle filters were applied initially on the data.In the case of AVNIR-2 data set, the image on 10 th April 2011 was taken as the post disaster image.This image was resampled using nearest neighbors from 10m to 25m spatial resolution to have the same pixel resolution of the SAR image.The time interval between the optical and SAR images in the case of post disaster was 2 days.AVNIR-2 data sets were used to model the contextual information for the post disaster event.The study area map and the 5 look average co-polarized intensity PALSAR and AVNIR-2 images (225×225 pixel subset) are shown in Fig. 1.

EXPERIMENTAL SETUP
Several key parameters have to be estimated prior to the performance of the multisource image classification proposed in this study.They are the fuzzy membership grade parameter ( ) for each intesity pixel, fuzzy class mean ̅̅̅̅ and the source realiability factor .To detemine the membership grades we employed the fuzzy C-means clustering criteria.Fuzzy membership function for residential area SAR intensity pixels with the mean value in the range of 340000.00 and the FCM clustering results are shown in Fig. 2. We first determine the FCM based membership values for each radar intensity pixel.Additionally with the use of pixel membership weights the number of pixels in a training sample necessary to model the class distribution can be reduced substantially.The scattering mechanism recorded by fully polarimetric SAR data after the earthquake and tsunami changes depending on the damages to the manmade structures and the disturbances to the land scape.The effect of the scattering mechanism in the SAR intensity is an important factor to consider in the case of urban area damages and inundated farmlands.As a result of the disaster, post disaster SAR image has a significant decrease in the double bounce scattering due to the urban structural damages.This is in contrast to the increase in volume scattering power caused by the scattering from large amount of debris.This significant change in the scattering power can be seen with the rapid fluctuation of the membership grades in the SAR image Fig. 2(b) mainly due to ruble, after the disaster.Four main land cover clases were identified for the classification, namely water, industrial, residential and the farmlands.An additional class due to the post disaster situation as exposed soil and grass were identified specifically due to the tsunami effect to the area.Therefore, finally a five class classification problem is solved.The NN and SVM based refrence images were generated by using HH and VV polarizations inputs.We compare the MRF and FMRF results with an NN and SVM based classification results.The conventional and the fuzzy class mean values calculated are different to each other as shown in Table 1.As mentioned earlier this requires accurate membership grade estimation for each training pixel for each class.Fig. 2 the membership function determines the pixel contribution for the class mean value estimation.
SVD based pixel unmixing was carried out on AVNIR-2 images (Boardman, 1989;Canty, 2010;Krus et al., 1993).Then the pixels were labelled to a class depending on the highest class fraction estimated by SVD.Using the posterior energy a b minimization in MRF and FMRF models we perform the classification using the generated initial class image.It is important to mention that the is a key parameter which controls the contribution of the prior and the likelihood energy in the posterior energy determination.For example when the likelihood term is completely ignored in Eq. ( 14) for a minimal posterior energy, forcing all the pixels to be classified to a single class.
Table 1.Conventional and fuzzy mean values calculated for each land cover training class

EXPERIMENTAL RESULTS
After the initial setup we ran the experiment for a range of values.To determine the best value, class seperability, classification accuracy and genetic algorithms can be used (Tso and Mather, 2009).In this study we used the classification accuracy to determine the best .The optimum results were generated with the in the range from 0.2 to 0.5.In this range the reliability of the SAR data in the classification mechanism is much higher than the optical image.In fact, with the main motivation of classifying the SAR image this range provides higher reliability to the SAR image.Posterior energy (Eq.( 14)) was minimized using SA with Metropolis-Hasting sampler (Geman andGeman 1984, Tso andMather 2009).
We have implemented a new cooling schedule within the SA in this study.Initially the parameter and was set to values 3.0 and 0.9 respectively .We tested the energy minimization with ordinary lognormal annealing scheme.It was identified that once the T reaches a value between 1.0 and 0.7 the drop in the energy becomes significantly slow.Hence we implemented the power-law annealing schedule after and 0.7.From this experiments, due to the considerable reduction of number of iterations and the slight difference in local energy we stick to .The optimum MRF and FMRF results and the NN and SVM based classification results are shown in Fig. 3.The two schedules and there test runs are shown in Fig. 4.

RESULTS VALIDATION
The validation of the SRM at both pixel level as well as the sub pixel level is a complex process.This becomes even tricky when proper ground truth information is not available.In this study we implement two approaches to validate MRF result.In the pixel based validation, we used the Overall accuracy (OA) and the Cohen`s kappa statistics (Congalton 1991), with the contingency tables.We also report the producer accuracy for each class.
Visual interpretation of the classified images interestingly suggests the improved ability of FMRF to classify the local water sources and the inundated farmland better than the MRF.SVM and NN based classification outputs shows salt and pepper effect with noise-like pixel labels for land cover classes whereas both MRF being contextual approaches shows less noisy effect.Clarity in the edge information (ex.Internal water ways, live fences) resulting from MRF approaches is higher than the SVM and NN.Table 3 and Table 4 show the overall comparison results of the FMRF with NN and SVM respectively.
Table 3. Overall accuracy and the kappa statistics resulted from the comparison of the MRF results with the NN based reference images Table 4. Overall accuracy and the kappa statistics resulted from the comparison of the MRF results with the SVM based reference images The evaluation with the NN based reference image clearly suggested the different energy estimations and the improvements in the FMRF over MRF.At the optimal classification results were reached for the post disaster image tested using MRF and FMRF methods.The FMRF OA was improved from 73% to 81% with resulting kappa values of 0.63 and 0.73 respectively.The SVM based comparison also follows the same pattern but with marginal improvements for the FMRF.The results show better accuracy for the MRF at , with OA and kappa being 87% and 0.83 over FMRF OA and kappa of 83% and 0.77 respectively.
To get a better understanding of the classification performance at individual class level, we examine the producer accuracy for each class.Producer accuracies for each class show the significant improvement in classification accuracy with the use of the grade of memberships.Table 5 and 6 report the producer accuracies for each class.Producer accuracy shows the reduction in omission errors with the use of fuzzy memberships in the classification for each class.With the NN based comparison, all the five classes improved the classification accuracy in the FMRF with average producer accuracy reaching 79% from 86%.In the case of SVM based comparison, only the soil class shows a significantly less accuracy of 49% in FMRF with the 77% of MRF.This large misclassification coming from the soil class considerably affect the OA and the kappa values of the overall classification accuracy.It is difficult to say from the NN and SVM based classification methods which is the most suitable reference to assess the accuracy of the proposed FMRF model.Hence we report both methods in the analysis.The use of new SA scheme provided significantly rapid optimization by cutting down the number of iteration from 155 to 43 (Fig. 4).

CONCLUSION
The MRF model proposed in this study use multisource data and adaptive fuzzy class mean vector for the SAR intensity pixels to model the conditional energy using a Gamma distribution.We propose this development as one of the possible solution to account for the classification uncertainties in an MRF framework.The initial class labelling was performed using an optical image with the highest sub-pixel class fractional estimation.Method was tested over an urban area in the north east coast of Japan, heavily affected by an earthquake and tsunami.The FMRF model shows its improvements over the MRF model for the multisource data with better contextual smoothening and class discrimination ability.Several important conclusions can be made from the findings of this work.Both the MRF and FMRF approaches show very high level of classification accuracy for the multisource data.It is important to have fuzzy memberships for all the pixels in the image for better class discrimination.For a particular class which is not possible to be discriminated in the MRF (as an example water class in this study), a better option is to use its fuzzy mean.This also means that we can replace specific class mean values with fuzzy means without replacing the whole mean vector to model its conditional energy much more sensitively.The combined annealing schedule shows the promise for practical applications with faster annealing.The variation of the reliability factor λ is not clear with the experiments conducted; it will be interesting to see its variation with further experiments.
We conclude that the use of membership weighted pixels to model the energy functions can bring better classification accuracy especially in the case of SAR intensity images, where the pixel distribution follows a Gamma function.

Figure 1
Figure 1 Study area map and the images (a) false colour composite RGB (4,3,2) of the AVNIR-2 image (b) average HH, VV intensity ALOS PALSAR image

Figure 2 .
Figure 2. Fuzzy membership function for an impervious training sample (a) ALOS PALSAR intensity image pixels, over a residential area (b) Fuzzy C-means clustering with pixel membership values for SAR intensity image

Table 5 .
Producer's accuracy (%) for the post disaster land cover classes, compared with NN reference images

Table 6 .
Producer's accuracy (%) for the post disaster land cover classes, compared with SVM reference images The multisource classification model developed under a fuzzy Bayesian environment is simple and useful to integrate different image indices for classification as well as image information fusion.