AUTOMATIC REFINEMENT OF TRAINING DATA FOR CLASSIFICATION OF SATELLITE IMAGERY
Keywords: Classification, Imagery, Land Cover, Learning, Satellite, Training
Abstract. In this paper, we present a method for the automatic refinement of training data. Many machine learning classifiers used in remote sensing applications rely on previously labelled training data. This labelling is often done by human operators under time constraints. Hence, the selection of training data must be kept practical, which implies a certain inaccuracy and results in erroneously labelled regions enclosed within competing classes. To address this, we propose a method that removes outliers from the training data using an iterative training-classification scheme. Outliers are detected by their newly determined class membership as well as through an analysis of the uncertainty of the classified samples. A sample selection method that incorporates the quality of neighbouring samples is presented and compared to alternative strategies. Because iterative approaches tend to propagate errors, which can lead to degenerating classes, a robust stopping criterion based on training data characteristics is described. Our experiments using a support vector machine (SVM) show that outliers are reliably removed, allowing a more convenient sample selection. The classification result for unknown scenes of the corresponding validation set improves from 70.36% to 79.12% on average. Additionally, the average complexity of the SVM model decreases by 82.75%, resulting in a similar reduction in processing time.
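The iterative training-classification scheme outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. To stay dependency-free, a nearest-centroid classifier stands in for the SVM, and the confidence measure, the thresholds `min_conf` and `min_per_class`, the function names, and the simple stopping rule (stop when nothing is removed or when a class would shrink too far) are all assumptions made for illustration. The neighbourhood-based sample selection is not modelled here.

```python
from math import sqrt


def distance(p, q):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def train(samples, labels):
    """Stand-in for SVM training (assumption): one mean vector per class."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def classify(model, x):
    """Return (predicted label, confidence in [0, 1]).

    Confidence is a hypothetical uncertainty proxy: it approaches 0 when
    the two nearest class centroids are almost equally far away.
    """
    ranked = sorted(model, key=lambda y: distance(x, model[y]))
    if len(ranked) == 1:
        return ranked[0], 1.0
    d1 = distance(x, model[ranked[0]])
    d2 = distance(x, model[ranked[1]])
    return ranked[0], (1.0 - d1 / d2) if d2 > 0 else 0.0


def refine(samples, labels, min_conf=0.2, min_per_class=2, max_iter=10):
    """Iteratively drop training samples that look like outliers.

    Returns the indices of the samples that are kept.
    """
    keep = list(range(len(samples)))
    classes = set(labels)
    for _ in range(max_iter):
        model = train([samples[i] for i in keep], [labels[i] for i in keep])
        survivors = []
        for i in keep:
            predicted, conf = classify(model, samples[i])
            # Outlier if the class membership changed or the result is uncertain.
            if predicted == labels[i] and conf >= min_conf:
                survivors.append(i)
        if len(survivors) == len(keep):
            break  # nothing removed: the training set is stable
        counts = {}
        for i in survivors:
            counts[labels[i]] = counts.get(labels[i], 0) + 1
        if any(counts.get(c, 0) < min_per_class for c in classes):
            break  # assumed guard against degenerating classes
        keep = survivors
    return keep
```

In this toy setting, a sample labelled as one class but lying inside the cluster of another is reclassified in the first iteration and removed, after which the class centroids are recomputed from the cleaned set. Swapping `train` and `classify` for an SVM with a margin-based or probabilistic uncertainty measure would leave the loop itself unchanged.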