A FAST AND SIMPLE METHOD OF BUILDING DETECTION FROM LIDAR DATA BASED ON SCAN LINE ANALYSIS

One of the major problems in processing LiDAR (Light Detection And Ranging) data is its huge volume, which causes a very high computational load when dealing with large areas at high point density. A fast and simple algorithm based on scan line analysis is proposed for the automatic detection of building points from LiDAR data. First, ground/non-ground classification is performed to filter out the ground points. The Douglas-Peucker algorithm is then used to segment each scan line into segment objects based on height variation. These objects are preliminarily classified into buildings and vegetation based on local analysis using simple rules. Finally, a region growing method is used to improve the quality of the extraction. The algorithm is tested on the data provided by the ISPRS test project on urban object extraction, which contain many buildings with complex roof structures, various sizes, and different heights. The experimental results show that the proposed algorithm can extract building regions effectively.


INTRODUCTION
LiDAR (Light Detection And Ranging) data provide dense, discrete, and accurate points, which are fundamentally different from traditional remote sensing data. Although most of the technical difficulties in hardware and system integration have been solved, the interpretation and modeling of LiDAR data remains a challenging task (Axelsson, 1999). New data processing methods, different from the ones used in traditional photogrammetry and remote sensing, are urgently needed. One of the most important tasks in using LiDAR data is the automatic extraction of buildings from the LiDAR point cloud. The processing of airborne LiDAR data for automatic extraction of building regions has been a hot research topic in photogrammetry for the last two decades (Mayer, 2008). Many building extraction algorithms have been reported, with the focus shifting to detailed representations of objects, to using data from sensors, or to advanced processing techniques (Rottensteiner et al., 2012).
Most building extraction algorithms can be categorized into two groups. The first group, which is the approach followed in this paper, first separates the ground points from the non-ground points and then classifies the non-ground points into vegetation points and building points. The ground point elimination procedure is known as ground filtering. Tóvári and Pfeifer (2005) categorized the ground filtering algorithms into morphological (Vosselman, 2000), progressive (Axelsson, 2000), surface based (Kraus, 1998), and segment based (Sithole, 2005) filters. Extensive studies show that all filters perform well in smooth rural landscapes, but all produce errors in complex urban areas and rough terrain with vegetation (Sithole and Vosselman, 2004). After the ground points are filtered out, the remaining non-ground points are classified into vegetation and buildings using features such as height differences from the Digital Terrain Model (DTM) and local statistical analysis. Haithcoat et al. (2001) extract size, height, and shape information from the point cloud, and use thresholds to discriminate small objects such as cars and trees. The building footprints are simplified by orthogonality. Gross et al. (2005) start from the normalized DSM (nDSM = DSM - DTM) and use the first-last echo differences and a roughness measurement to discriminate vegetation and building points. Finally, the building footprints are approximated and generalized by rectangles aligned with the boundary edges. Frédéricque et al. (2008) focus on the ROI and extract the skeletons of the buildings. A set of rectangle hypotheses is then generated with the principal directions at given points of the skeleton. An iterative algorithm then yields a simplified graph of rectangles, which provides a representation of building blocks by a set of rectangles.
The other group extracts ground, buildings, and vegetation simultaneously. Moussa and El-Sheimy (2012) use a rule-based segmentation method to classify the LiDAR data into building, tree, and ground segments, and use the spectral information obtained from the ortho-rectified CIR image to refine the classification. Zhang and Lin (2012) use a supervised classification of the airborne LiDAR data based on a Support Vector Machine (SVM). Dorninger and Pfeifer (2008) first detect planar surface patches in the point cloud, followed by model-based classification and the combination of patches to refine the detection result. Finally, the borders of the regions are delineated. Zhou and Neumann (2009) propose a streaming framework for building reconstruction, in which the buildings are extracted by an SVM on local geometric features.
From the point of view of a practical system, besides the problems in object modeling and recognition, one of the major problems in processing LiDAR data is its huge volume, which causes a very high computational load when dealing with large areas at high point density. This paper focuses on the fast detection of buildings using scan line analysis. Compared to the many algorithms that directly use two-dimensional or three-dimensional analyses of local features (planar, facade, and others), the proposed method uses segmented scan lines to find elements of building roofs by local geometric regularity. Although the scan line analysis method is limited by direction (Meng et al., 2009), it operates on a 1D data structure that lends itself to GPU-based parallel computing. This is unlike most other approaches, which process data within a 2D/3D neighborhood of a point or a subset of the points. Finding and storing such neighborhoods requires a large amount of computation and memory because of the large volume of LiDAR data and their irregular sampling pattern. With this simple algorithm, building regions can be extracted efficiently.

THE PROPOSED METHOD
The proposed algorithm for building extraction from airborne LiDAR data can be divided into three key steps: ground filtering, scan line segmentation, and object-based classification.

Ground Filtering
In raw LiDAR data, both ground and non-ground objects, such as low vegetation, high vegetation, buildings, and vehicles, generate backscatter (Meng et al., 2009). Non-ground points need to be identified and filtered out from LiDAR data before DEM interpolation. Likewise, ground points need to be eliminated before extracting non-ground objects, such as vegetation and buildings.
Since all ground filtering algorithms perform well in smooth, simple urban areas, this paper uses the progressive TIN (Triangulated Irregular Network) based algorithm (Axelsson, 2000) to classify ground and non-ground points.

Scan Line Segmentation
Most object extraction algorithms process point clouds within a 2D/3D neighborhood. Searching and storing such neighborhoods requires a large amount of memory and computation because of the large volume of LiDAR data and their irregular sampling pattern. In this paper, the Douglas-Peucker algorithm (Douglas and Peucker, 1973), which is widely regarded as one of the most effective line simplification algorithms, is used to segment the scan line into objects based on the height variation.
The Douglas-Peucker algorithm is based on the closeness of a vertex to an edge segment (the distance to the edge segment). It is a recursive procedure that starts with a line segment whose end vertices coincide with the first and last vertices of the polyline to be simplified. Each segment is split at the vertex farthest from it, until the distances between every sequence of intermediate vertices and its approximating segment are less than a fixed tolerance (... and Márquez, 2003). Given a scan line (a sequence of points) of the LiDAR data as depicted in Figure 1(a), the segmentation starts with a crude initial guess, namely the single edge joining the first and the last points of the scan line (Figure 1(b)). Then the remaining points are tested for their distance to that edge. If there are points farther than a specified tolerance ε from the edge, the point farthest from the edge is added to the previously simplified polyline. This creates a new approximation of the original polyline (Figure 1(c)). This iterative process continues for each edge (Figure 1

Classification of Segment Objects
Most building roofs are composed of planes, while vegetation points show a rather rough pattern. After the scan line segmentation procedure, the scan lines are divided into line segments, among which the building roof segments are usually longer (segment size = number of points) than the vegetation segments. For our test data, most of the vegetation segments have no more than 2 points, as shown in Figure 2(b).
For convenience, a "long segment" in the following text is a line segment whose size (number of points) is equal to or greater than a given threshold, while a "short segment" is a line segment whose size is less than this threshold. In this paper, points are classified as building points if they satisfy three criteria.
Though building roofs can be modelled by a composition of planar faces (Dorninger and Pfeifer, 2008), not all roof components, such as skylights and rough edges, can be segmented into long segments in our segmentation procedure (as shown in Figure 3(a) and Figure 3(b)). These roof points are rejected because they do not satisfy the first criterion. These misclassifications are corrected by a region growing based method. The region growing takes all the building points identified by the classification procedure as seeds. In the growing step, the neighbouring points are searched using a grid index data structure. Although this simple and fast data structure may not be accurate in searching a neighbourhood, it is sufficient for our algorithm. If the height difference between the seed point and the neighbouring point is within a given threshold, the neighbouring point is accepted as a building point. Figure 3(c

Data Sets
The International Society for Photogrammetry and Remote Sensing (ISPRS) Commission III/WG3 provides LiDAR data for urban object classification and 3D building reconstruction. This is a subset of the data used for the test of digital aerial cameras carried out by the German Association of Photogrammetry, Remote Sensing, and Geoinformation (DGPF) (Cramer, 2010). Two test sites are selected. Test data 1 is a residential area with detached houses and many surrounding trees (about 1.2 million points). Test data 2 is a highly developed area consisting of complex buildings, roads, and big trees (about 1.1 million points). Figure 4 shows the true ortho-photo of the test areas.
Figure 4: The true ortho-photo of the test areas.

Parameters
The proposed algorithm has five parameters. The density of the data can be accommodated by adjusting the parameter settings. Since the main goal of the algorithm is to find roof planes, the proportion of points belonging to long segments should be as large as possible, and buildings should be higher than an adult person; the corresponding thresholds are chosen accordingly. In fact, except for the "long segment" threshold, all the other parameters can be set to fixed values across different experiments or data sets, as they have specific physical meanings.
The detection results are compared to reference data acquired using photogrammetric plotting. To quantitatively evaluate the proposed algorithm, the method of Rutzinger et al. (2009), which provides the completeness, correctness, and quality of the results on both a per-area level and a per-object level, is used. Figure 7 shows the evaluation of the building detection results on a per-pixel level. Table 1 gives the evaluation results of building detection for the two test areas. Both the completeness and the correctness of the buildings are higher than 90%. However, for vegetation with very flat canopies and buildings with very rough or irregular roof surfaces, our method produces wrong results because the assumptions made in Section 2.3 do not hold.
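The three evaluation measures can be illustrated with a short sketch. It assumes the commonly used definitions in terms of true positives (TP), false positives (FP), and false negatives (FN); the function name `detection_scores` and the counts in the usage lines are hypothetical and are not results from this paper.

```python
def detection_scores(tp, fp, fn):
    """Completeness, correctness and quality from TP / FP / FN counts.

    Assumes the usual definitions; counts may be pixels, areas or objects.
    """
    completeness = tp / (tp + fn)   # share of reference buildings that were detected
    correctness = tp / (tp + fp)    # share of detections that are real buildings
    quality = tp / (tp + fp + fn)   # penalises both kinds of error at once
    return completeness, correctness, quality


# Made-up counts for illustration only (not taken from Table 1).
comp, corr, qual = detection_scores(tp=900, fp=50, fn=100)
print(f"completeness={comp:.3f} correctness={corr:.3f} quality={qual:.3f}")
```

Quality is never larger than either completeness or correctness, so it is a conservative single-number summary of a detection result.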

Evaluation results

CONCLUSION
This paper proposes a simple and fast algorithm to separate building roof points from vegetation points after filtering of the point cloud. Based on scan line segmentation and a simple rule-based classification of the segments, it can detect roof points effectively and efficiently. The proposed method can be used for the fast detection of buildings. Regarding the speed of the algorithm, there is much room for acceleration using GPU-based parallel computing for the scan line processing, as we did in the filtering of the point cloud (Hu et al., 2013).
However, in cases of vegetation with very flat canopies (Figure 8(a)) and buildings with very rough or irregular roof surfaces (Figure 8(b)), the proposed method produces wrong results because its assumption about the roughness of building roof and vegetation surfaces does not hold. A 2D or 3D neighborhood is still needed to model the fine details of the objects in order to obtain a high-quality reconstruction of buildings. How to combine the building roofs detected by the fast algorithm with more features, in order to achieve correct building extraction and 3D modeling with fine details, is our major future work.
(d)) until the distances from all points to the simplified polyline are within the tolerance ε (Figure 1(e)). Figure 2 depicts the result for a scan line; different colors in the figure label different segments.
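The scan line segmentation described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes each scan line point is reduced to a 2D profile point (distance along the scan line, height), it measures the perpendicular distance to the line through the two end points of an edge, and the names `douglas_peucker` and `segment_sizes` are hypothetical.

```python
import math


def point_to_edge_distance(p, a, b):
    """Perpendicular distance from p to the line through a and b (2D profile points)."""
    (px, pz), (ax, az), (bx, bz) = p, a, b
    dx, dz = bx - ax, bz - az
    length = math.hypot(dx, dz)
    if length == 0.0:
        return math.hypot(px - ax, pz - az)
    return abs(dx * (pz - az) - dz * (px - ax)) / length


def douglas_peucker(points, tolerance):
    """Return the sorted indices of the vertices kept by Douglas-Peucker."""
    if len(points) < 3:
        return list(range(len(points)))
    keep = {0, len(points) - 1}            # initial guess: one edge, first to last point
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        worst_dist, worst_idx = 0.0, None
        for i in range(first + 1, last):
            d = point_to_edge_distance(points[i], points[first], points[last])
            if d > worst_dist:
                worst_dist, worst_idx = d, i
        if worst_idx is not None and worst_dist > tolerance:
            keep.add(worst_idx)            # split the edge at the farthest point
            stack.append((first, worst_idx))
            stack.append((worst_idx, last))
    return sorted(keep)


def segment_sizes(kept_indices):
    """Number of points on each segment (both end vertices counted)."""
    return [b - a + 1 for a, b in zip(kept_indices, kept_indices[1:])]


# A flat stretch followed by a ramp: (distance along scan line, height).
profile = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 1), (5, 2), (6, 3)]
breaks = douglas_peucker(profile, tolerance=0.1)
print(breaks, segment_sizes(breaks))
```

The explicit stack replaces recursion so that very long scan lines cannot exceed the interpreter's recursion limit. Because each scan line is processed independently, the same routine could be applied to many scan lines in parallel.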

Figure 2(a) shows the segmentation result of a scan line from real LiDAR data. Figure 2(b) shows the segmentation result of an area from the test LiDAR data set; different colors in the figure label different segments.
The three criteria for building points are: 1) the point belongs to a long segment; 2) the proportion of points belonging to long segments in a certain local neighbourhood is more than a given threshold; 3) the height of the point above the DEM is higher than a given threshold. The first criterion (belonging to a long segment) means that the segment is most likely part of a plane. The second criterion serves the same purpose as the computation of roughness, which is often used to separate building points from tree points. However, roughness, which is usually computed from the normal vectors of the nearest points, is very time consuming to compute. The third criterion means that a building roof must have a certain height above the ground.
Figure 1: The basic Douglas-Peucker algorithm used in scan line segmentation.
Figure 2: Scan line segmentation results, (a) segmentation result of a scan line; (b) segmentation result of a test area (different colors label different segments, 3D view).
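The three criteria can be sketched as a simple per-point rule. This is an illustration under stated assumptions rather than the authors' implementation: the local neighbourhood is taken to be a 1D window of `window` points on either side along the same scan line (the paper's neighbourhood may differ), and the function and parameter names are hypothetical.

```python
def classify_scan_line(segment_sizes, heights_above_dem, min_long_size,
                       min_long_ratio, min_height, window):
    """Return one boolean per point: True if the point passes all three criteria.

    segment_sizes[i]     : size (number of points) of the segment containing point i
    heights_above_dem[i] : height of point i above the DEM
    """
    n = len(segment_sizes)
    is_long = [size >= min_long_size for size in segment_sizes]
    labels = []
    for i in range(n):
        # Assumed neighbourhood: a window along the scan line, clipped at the ends.
        lo, hi = max(0, i - window), min(n, i + window + 1)
        long_ratio = sum(is_long[lo:hi]) / (hi - lo)
        labels.append(
            is_long[i]                              # criterion 1: on a long segment
            and long_ratio > min_long_ratio         # criterion 2: neighbourhood mostly long
            and heights_above_dem[i] > min_height   # criterion 3: high enough above the DEM
        )
    return labels


# Five roof points on one long segment, followed by three tree points.
sizes = [5, 5, 5, 5, 5, 2, 2, 1]
heights = [6.0, 6.0, 6.0, 6.0, 6.0, 4.0, 5.0, 3.0]
print(classify_scan_line(sizes, heights, min_long_size=3,
                         min_long_ratio=0.5, min_height=2.0, window=2))
```

Only counting and comparison are needed per point, which is why this test is much cheaper than estimating roughness from normal vectors.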
) and Figure 3(d) show the building extraction results before and after the region growing based quality improvement.
Figure 3: (a) and (b) segmentation results (red: long segments, yellow: short segments); (c) and (d) building extraction results before and after the region growing based quality improvement (red: buildings, green: vegetation).
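The region growing step with a grid index can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the input contains only non-ground points, that neighbours are the points in the 3 x 3 block of grid cells around the seed, and that newly accepted points become seeds in turn. The names `grow_buildings`, `cell_size`, and `max_height_diff` are hypothetical.

```python
from collections import defaultdict, deque


def grow_buildings(points, is_building, cell_size, max_height_diff):
    """Grow building labels from seed points using a 2D grid index.

    points      : list of (x, y, z) non-ground points
    is_building : list of bool, True for points already classified as building
    """
    # Grid index: map each cell to the indices of the points that fall into it.
    grid = defaultdict(list)
    for idx, (x, y, _z) in enumerate(points):
        grid[(int(x // cell_size), int(y // cell_size))].append(idx)

    labels = list(is_building)
    queue = deque(i for i, flag in enumerate(labels) if flag)
    while queue:
        seed = queue.popleft()
        sx, sy, sz = points[seed]
        cx, cy = int(sx // cell_size), int(sy // cell_size)
        # Approximate neighbourhood: the seed's cell and its eight adjacent cells.
        for nx in (cx - 1, cx, cx + 1):
            for ny in (cy - 1, cy, cy + 1):
                for j in grid.get((nx, ny), ()):
                    if not labels[j] and abs(points[j][2] - sz) <= max_height_diff:
                        labels[j] = True
                        queue.append(j)   # assumption: accepted points act as new seeds
    return labels


pts = [(0.0, 0.0, 10.0), (0.5, 0.0, 10.1), (1.0, 0.0, 10.15),
       (1.5, 0.0, 14.0), (10.0, 10.0, 10.0)]
seeds = [True, False, False, False, False]
print(grow_buildings(pts, seeds, cell_size=1.0, max_height_diff=0.2))
```

The grid lookup returns every point in the adjacent cells rather than the exact nearest neighbours, which matches the remark that the index is fast but only approximate.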
The proposed building extraction algorithm has been implemented in C++ using Microsoft Visual Studio. A desktop computer (2.5 GHz CPU, 4 GB memory) is used to process the test data sets. Neither of the two test data sets takes more than 5 seconds to process (including ground filtering, scan line segmentation, object-based classification, and region growing based quality improvement).

Figure 5 and Figure 6 show the extraction results of the two test areas, which contain more than 200 roof planes. Comparison with the true ortho-photos (Figure 5(c) and Figure 6(c)) makes clear that, for most buildings, the majority of the planes have been detected. Some of the building roofs have irregular shapes (see Figure 5(b)), which should be regularized in the outline extraction procedure. As shown in Figure 5(b), roof planes occluded by dense trees have been detected correctly.
Figure 5: Extraction results of test area 1, (a) point cloud coloured by classification results (red: buildings, green: vegetation, brown: ground); (b) local enlarged view of (a) (3D view); (c) true ortho-photo of test area 1; (d) extracted building points (green) shown on the true ortho-photo.
Figure 6: Extraction results of test area 2, (a) point cloud coloured by classification results (red: buildings, green: vegetation, brown: ground); (b) local enlarged view of (a) (3D view); (c) true ortho-photo of test area 2; (d) extracted building points (green) shown on the true ortho-photo.

Figure 7: Evaluation of the building detection results on a per-pixel level (yellow: true positive, red: false positive, blue: false negative; left: part of test area 1, right: part of test area 2).

Table 1: Evaluation of the building detection results.