ROTATION MATRIX SAMPLING SCHEME FOR MULTIDIMENSIONAL PROBABILITY DISTRIBUTION TRANSFER

This paper address the problem of rotation matrix sampling used for multidimensional probability distribution transfer. The distribution transfer has many applications in remote sensing and image processing such as color adjustment for image mosaicing, image classification, and change detection. The sampling begins with generating a set of random orthogonal matrix samples by Householder transformation technique. The advantage of using the Householder transformation for generating the set of orthogonal matrices is the uniform distribution of the orthogonal matrix samples. The obtained orthogonal matrices are then converted to proper rotation matrices. The performance of using the proposed rotation matrix sampling scheme was tested against the uniform rotation angle sampling. The applications of the proposed method were also demonstrated using two applications i.e., image to image probability distribution transfer and data Gaussianization.


INTRODUCTION
In remote sensing, the analysis of multi-temporal data is widely used in many applications such as urban expansion monitoring, deforestation, change detection, and agriculture monitoring.One of the difficulties when dealing with multi-temporal images is the difference in image statistics because the images were acquired at different times.
In image classification context, it is possible that the image used for training the system can have statistics different from the image we want to classify.Moreover, images from different areas or times are combined to develop efficient classification system.To obtain a reliable classification result, the probability distributions of the training data and the data we want to classify should be similar.
In (Inamdar et al., 2008), it was pointed out that their are three approaches for relieving the problem of different statistics between images.The first approach is to use the atmospheric model to reduce the interference from atmosphere.The second is to use ground data to calibrate the images.The last one is to transform the image data such that its probability distribution of spectral information is similar to that of the reference or target image.The first two approaches are not possible, especially when the physical information of the atmosphere or ground data is not available.
In the last approach, digital numbers of pixels in a mutlispectral image are treated as multidimensional random variable.Its goal is to transfer statistical properties by reshaping the probability distribution of the source image such that its shape is matched with that of a reference one.In literature, The probability distribution transfer is also known as the histogram specification or histogram matching.Pitié and Kokaram (Pitié and Kokaram, 2007) presented a linear approach for probability distribution transfer.The method uses Monge-Kantorovich theory of mass transportation to transform the shape of the probability distribution.Namely, the principle * Corresponding author. is based on the Monge's transport problem which is to find minimal displacement for mass transportation.The solution from this linear approach can be used as initial solution for non-linear probability distribution transfer methods.
Similarly, in (Rabin et al., 2014), Rabin et al. also used the optimal transport technique to address the problem of statistical property transfer.Especially, the relaxed and regularized discrete optimal transport method was utilized.Moreover, the spatial distribution is also taken into account in the convex minimization process in order to deal with the spatial distortion introduced by the mapping.
In (Reinhard et al., 2001), a method for transfer color between image was proposed.By rotating the color from RGB color space to Lab color space in order to decorrelate the information between color channels, the statistical properties transfer between each channel of two images can be performed independently.Particularly, the color in the rotated space is translated and scaled to transfer the statistical properties.The major drawback of this approach is that it is limited to only RGB color space.
In stead of rotating the original data to a specific color space as presented in (Reinhard et al., 2001), Xiao and Ma (Xiao and Ma, 2006) computed the rotation using the axes defined by the principal components of each image.The transfer function is then formulated as a series of affine matrix transformation.
In (Pitié et al., 2007), an iterative method for transforming the probability distribution of an RGB image was proposed.The concept of this approach is to rotate data space using a random rotation matrix and perform 1D distribution transfer on each axe of the new coordinate system.The process iterates with different random rotations and stops when there is no change in the probability distribution of the modified image.Inamdar et al. (Inamdar et al., 2008) applied the method proposed in (Pitié et al., 2007) to use with multidimensional data i.e., multispectral satellite images.Particularly, the N-D rotation matrix was used for transfer the multidimensional density function.The applicability of the proposed method was tested in three scenarios including 1.) supervised classification 2.) semi-supervised classification and 3.) change detection.
The probability distribution transfer is also used transform data such that the shape of its probability distribution is matched with a parametric probablity distribution such as uniform, or normal distribution.For example, Laparra et al. (Laparra et al., 2011) proposed a method for data Gaussianization of which the aim is to make the transformed data distributed with a unit-covariance normal distribution.
In this paper, we adopted the iterative approach presented in (Pitié et al., 2007).Its disadvantage is that it was originally proposed for 3D probability distribution transfer using 3D rotation matrix.In order to deal with multidimensional rotation matrix, we proposed to use the sampling technique proposed by Stewart (Stewart, 1980) to generate a set of random orthogonal matrices.This sampling scheme uses Householder transformation converting normally distributed random variables to a random orthogonal matrix.This result in the uniform sampling of the orthogonal matrix.The obtained orthogonal matrices are then converted to proper rotation matrices.The iterative approach presented in (Pitié et al., 2007) can then be generalized to multidimensional case.
The rest of this paper is organized as follow.The concept of probability distribution transfer is briefly presented in Section 2..In Section 3., the proposed random rotation matrix sampling scheme will be presented.The performance of the proposed method will be demonstrated in Section 4..The conclusion will be discussed in Section 5..

The one-dimensional case
In order to explain the distribution transfer which is also known as histogram specification, lets use the one dimensional data as an example because its formulation is very simple.Let cs and ct are the probability distribution of the source and target data, respectively.The goal of the distribution transfer is to transform the source data such that the transformed data has probability distribution similar to the target's one.
According to the change of variable formula, one obtains the following constraint: where us and ut are respectively 1D random variables i.e., the digital number of a gray value image.Let the source data is transformed by a transformation T i.e., T (us) = ut.By integrating both side of (1), we come up with: The transformation T can then be expressed in terms of the cumulative distributions of the source and target data: where Cs and Ct are the cumulative distributions of the source and target data, respectively.Since the 1-D cumulative distribution function is non-decreasing function, the mapping can then be easily implemented using look-up table.Transform the adjusted source data back to the original space:

N-dimensional case
To transform the multidimensional probability distribution, this paper adopted the method proposed in (Pitié et al., 2007).The concept is similar to the Radon transformation such that the data is projected onto the principal axes of rotated space.The onedimensional distribution transform is then performed independently on each principal axis.Therefore, the probability distribution is iteratively transformed to the target one using rotation matrix sequences.
The work flow of the multidimensional probability distribution transfer is illustrated in Algorithm 1.The proposed rotation matrix sampling scheme is presented in the next Section.

RANDOM SAMPLING OF ROTATION MATRIX
The proposed random rotation matrix sampling begins with the random sampling of orthogonal matrix.The obtained orthogonal matrices are then converted to proper rotation matrices.A matrix Q is orthogonal if and only if det(Q) = ±1 and QQ = I.The set of N × N orthogonal matrices is denoted by O(N ) which is so called orthogonal group.The rotation matrix is a special case of the orthogonal group because the determinant of the rotation matrix is 1.Therefore, the set of rotation matrices is so called special orthogonal group which is denoted by SO(N ) where N is the dimension of the rotation matrix.

Uniformity of orthogonal matrix distribution
Unlike uniform sampling of real number, the sampling of orthogonal, or rotation matrices needs special care in order to avoid unintentionally bias.For example, uniform Euler angle sampling does not yield uniformly distributed rotation matrices.Particularly, it results in a set of rotation matrix that is heavily distributed in polar region (Kuffner, 2004).
In (Mitchell, 2008), it is pointed out that a scheme can uniformly sample orthogonal matrix Q ∈ O(N ), if these conditions satisfy: where µ is the Haar measure assigning orthogonal transformation invariant volume to subset1 Ω1, Ω2 ∈ O(N ).This means that, when the set Ω1 and Ω2 have the same "volume", the sampling is uniform, if the chances of choosing a orthogonal matrix Q from Ω1, or Ω2 are equal.This constraint is also used for the uniform sampling of rotation matrices.

Uniform orthogonal matrix sampling
To sample orthogonal matrices, the method proposed by Stewart (Stewart, 1980) is adopted because of uniform sampling.The obtained orthogonal matrices are then converted to proper rotation matrices.The orthogonal matrix sampling method is based on the use of Householder transformation to decompose a square matrix into orthogonal and upper triangular matrices i.e., QR decomposition.
Given a vector a, there exists a Householder transformation H such that: where e = [1, 0, . . ., 0] and r = ± a2 2 .The Householder transformation is defined as: where u is a non-zero vector.It is trivial to prove that the Householder transformation H is symmetric orthogonal.The geometrical meaning of the Householder transformation is the reflection of a vector a across the hyperplane orthogonal to u which is so called reflector.The reflector u can be computed by: where a1 is the first element of the vector a.
A series of Householder transformation can then be utilized to perform orthogonal triangularization on an N × N matrix A.
That is, there exists a sequence of Householder transformation H2, H3, . . ., HN satisfying: where U is upper triangular.By setting we may write which is known as QR decomposition 2 .
To sampling orthogonal matrices, the method proposed in (Stewart, 1980) is based on the decomposition of random N ×N matrix A whose elements are independently sampled from normal distribution with zero mean and a common variance.It was proved that the QR decomposition of the random matrix such that the diagonal elements of the upper triangular matrix are positive results in the uniform distribution of orthogonal matrix Q according to Haar measure.

Optimal rotation matrix
The obtained orthogonal matrix can be an improper rotation matrix i.e., |Q| = 1.In this paper, the method proposed by Kanatani (Kanatani, 1994) is adopted to project the obtained orthogonal matrix onto the SO(N ) manifold.Therefore, the projected orthogonal matrix is a rotation matrix.
Given an orthogonal matrix Q, the goal is then to find the rotation matrix maximizing trace(R Q).Let the Singular Value Decomposition (SVD) of the matrix Q be Q = WDV .The optimal rotation matrix can then be computed as follow: It can be easily proved that this projection provides proper rota- By combining the orthogonal matrix sampling with projection onto SO(N ) manifold, we obtain the scheme for sampling the rotation matrix which is illustrated in Algorithm 2. The multidimensional rotation matrix can then be randomly sampled and used in the probability distribution transfer.

Image to image probability distribution transfer
In this experiment, we demonstrate the effectiveness of the multidimensional probability distribution transfer.The performance of the proposed method was evaluated in terms of quality and quantity.
In order to show the evolution of the distribution transfer, we used the two benchmark images from stereo dataset i.e., cones and teddy images3 because of their color variations.In Figure 1, the evolution of the bivariate distribution of the red and green bands is illustrated.The Figure shows the contour plots of color distribution functions at some iterations, which were estimated by kernel density function technique, particularly Epanechnikov kernel.It can be observed that the distribution transfer can transform the shape of the source distribution function to match with that of target one.
The proposed method was compared with the method using uniform angle sampling (Inamdar et al., 2008) e.g., uniform Euler angle distribution in 3D case.The performance indicator is the similarity between the transformed probability distribution and the target one.A popular similarity metric called Kullback-Leibler (KL) divergence is utilized which is defined as follows: It can be noticed that the KL divergence is not symmetric i.e., DKL(cs, ct) = DKL(ct, cs).In order to make the metric symmetric, the average of the KL divergence is used: The images used in this comparison are Landsat 8 imageries, see Figure 2. The simulation was repeated 100 times and the averaged log Kullback-Leibler divergence at each iteration is shown in Figure 3.It can be observed that the proposed rotation sampling scheme yields better convergence than that of rotation angle sampling.The reason is that the rotation angle sampling can have unintentionally bias in the rotation matrix samples.
The proposed method was also compared with the band-by-band distribution transfer.In this experiment, the performance was evaluated using Kolmogorov-Smirnov test which is a non-parametric test for determining whether two samples are drawn from the same distribution.If the Kolmogorov-Smirnov statistics is small, we therefore cannot reject the null-hypothesis that those two samples were drawn from the same distribution.The Kolmogorov-Smirnov statistics from the experiment are tabulated in Table 1.It can be observed that the multidimensional distribution transfer yields lowest Kolmogorov-Smirnov statistics.The performance is also analyzed visually using Q-Q plot, where Q stands for quantile.If two sampled are drawn from the same distribution, they are expected to line up on the line of identity.In Figure 4, the Q-Q plots of the red band with and without distribution transfer are illustrated.It can be concluded that the source  and target data are not drawn from the same probability distribution function.After distribution transfer, the distribution of the transformed source data is similar to that of target data.

Data Gaussianization
In some applications, the data distribution has to be transformed into a parametric distribution function, particularly, the Gaussian or normal distribution.A popular approach is to use the power transformation e.g., Box-Cox transformation (Osborne, 2010), which is a parametric transformation.In this experiment, the distribution transfer was applied to transform the shape of the distribution function to that of normal distribution.
To transform the source images probability distribution function, the mean (µ) and the covariance matrix (Σ) of the source data were computed.The target data was then drawn from the normal distribution N (µ, Σ).The source data was then iteratively transformed to be distributed with the normal distribution N (µ, Σ).
The source data used in this experiment was the true-color MODIS 4 image of the semi-arid Tibetan Plateau, illustrated in Figure 5a.The image after distribution function transformation is shown in Figure 5b.
To quantitatively test the normality of the transformed source data, the kurtosis and skew were used (Ghasemi and Zahediasl, 2012).The kurtosis was used for measure the peakedness of the probability distribution function.The asymmetry of the distribution function was measured by skew.The shape of a distribution function is similar to that of normal distribution function, if the kurtosis is close to 3 and skew is negative.To pass the normality test with 95% confidence, the p-value must be greater than 0.05.Namely, by not reject the hypothesis of normality, no significant departure from normality is found.The statistics for normality test are tabulated in Table 2.It can be concluded that the multidimensional distribution transfer can transform the source data to be normally distributed.
The normality of the data can also be visually inspected using pairwise bivariate distribution plot shown in Figure 6.Particularly, the univariate distributions on each variable (color band) are shown on diagonal axis; the bivariate distributions between two variables are shown in off-diagonal elements.It can be observed that distribution transfer can change the shape of the non-normal distribution function to that of normal one.
It is also possible to check that the desired parameters of normal distribution function are obtained using the Kullback-Leibler divergence for multivariate normal distributions.Let the desired (16) By using the average of DKL( N N ) and DKL(N N ), the similarity score is then 7.32 × 10 −6 .

CONCLUSIONS
In this paper, a rotation matrix sampling scheme for distribution function transfer was presented.We proposed to use the Householder transformation to uniformly sampling the orthogonal matrices first and then transform the orthogonal matrices to rotation matrices.Using he advantage of Householder transformation for sampling orthogonal matrix, the rotation matrix can be sampling more efficient than using the rotation angle sampling.This efficiency was demonstrated in the experimental evaluation.The performance of the distribution transfer was tested with two applications i.e., image to image distribution transfer and data Gaussianization.It was demonstrated that the use multidimensional distribution transfer yield better results that that of 1D distribution transfer.

Algorithm 1 :
Multi-dimensional probability distribution transfer Input : Xs : source data and Xt : target data Output: Transformed source data 1 while Not converge do 2 Choose an N × N rotation matrix R; 3 At each iteration k, rotate the data using the rotation matrix R i.e., X r s (k) ← RXs(k − 1) and X r t ← RXt; 4 Perform 1-D histogram matching on each of the axis in the rotated space; 5

Figure 1 :
Figure 1: The evolution of source probability distribution function.The bivariate probability distribution of the red and green bands of the source data is shown in Figure 1a.The target probability distribution function is presented in Figure 1f.

Figure 2 :
Figure 2: Landsat 8 imageries used for comparing the performance of the proposed rotation sampling method and uniform rotation angle sampling.
Figure 4: The Q-Q plot of the red band of images illustrated in Figure 2. Original 1D ND source data transformation transformation

Figure 5 :
Figure 5: The MODIS imagery before and after data Gaussianization.

Figure 6 :
Figure 6: Visualization of the pairwise relationships between two variables.