DATA MINING FOR CLASSIFICATION OF HIGH VOLUME DENSE LiDAR DATA IN AN URBAN AREA
Keywords: LiDAR, Urban Area, Classification, Data Mining, Random Forest, Machine Learning, Point Cloud
Abstract. A 3D LiDAR point cloud obtained from a laser scanner is very dense and contains millions of points, each carrying information. Data mining offers considerable scope for sorting, identifying, validating and making predictions from such a large volume of data, and is used here for that purpose. Certain unique attributes were selected as inputs for building models through machine learning. Supervised models were then built with the random forest algorithm to predict classes from the available LiDAR data. The algorithm was chosen for its efficiency and accuracy relative to other data mining algorithms. The models created using random forest were then tested on unclassified point cloud data of an urban area. The method shows promising results in terms of classification accuracy: an overall accuracy of 91.71 % was achieved for pixel-based classification. The method is also more efficient than common classification algorithms, as the time taken to make predictions is reduced considerably for a set of dense LiDAR data. These results indicate that data mining and machine learning are well suited to handling large volumes of LiDAR data and can contribute substantially to its efficient processing.
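For readers unfamiliar with the metric, the following minimal sketch shows how an overall accuracy figure of the kind reported above is conventionally computed from a confusion matrix: correctly classified samples (the diagonal) divided by all samples. The function name, the three-class layout and the counts are hypothetical and chosen only for illustration; they are not taken from this study.

```python
def overall_accuracy(confusion):
    """Return the share of samples on the diagonal of a square confusion matrix.

    confusion[i][j] is the number of samples whose reference class is i
    and whose predicted class is j.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical counts for three illustrative classes (NOT results from the paper).
toy_confusion = [
    [50, 3, 2],
    [4, 40, 1],
    [1, 2, 47],
]
toy_oa = overall_accuracy(toy_confusion)
print(f"Overall accuracy: {toy_oa * 100:.2f} %")
```

Overall accuracy summarises all classes in a single number, so it is usually read alongside per-class measures when class sizes differ strongly.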