ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Publications Copernicus
Articles | Volume X-1/W1-2023
https://doi.org/10.5194/isprs-annals-X-1-W1-2023-371-2023
05 Dec 2023

VEHICLE CLASSIFICATION IN URBAN REGIONS OF THE GLOBAL SOUTH FROM AERIAL IMAGERY

M. Mühlhaus, F. Kurz, A. R. Guridi Tartas, R. Bahmanyar, S. M. Azimi, and J. Hellekes

Keywords: Aerial Images, Dataset Annotation, Deep Neural Networks, Global South, Object Detection, Vehicle Classification

Abstract. Land transport is a major contributor to human-caused climate change, and knowing the total number and composition of the vehicle fleet is key to estimating its emissions. For countries of the Global South in particular, emission inventories carry high uncertainties because fleet data are often unknown or outdated; classifying vehicles in remote sensing imagery has the potential to change this. We present the XWHEEL dataset, which consists of annotated vehicles in aerial images, assigned to six classes according to number of wheels, size and motorization. The dataset comprises 73 annotated aerial images of the city of Dar es Salaam (Tanzania) containing 15,973 vehicles. To assess how well models trained on the dataset perform, a convolutional neural network, ReDet, and a transformer-based neural network, DINOOBB, are trained with different configurations and evaluated on the validation and test splits as well as on aerial images from other regions. The transformer-based DINO architecture was adapted to the remote sensing domain and modified to predict Oriented Bounding Boxes. Results show good performance on the Dar es Salaam test split when the two-wheeled classes are merged and the non-motorized three-wheeled vehicles are excluded because they occur rarely. The best-performing configurations with four classes were then tested on aerial images of Kathmandu (Nepal) and Kampala (Uganda). There, performance drops for cycles and three-wheeled vehicles, as their appearance varies between countries. A main finding is that the different vehicle classes can be reliably detected in Dar es Salaam. When models trained on XWHEEL are applied to other regions of the Global South, performance decreases for the more difficult classes (bicycles and tricycles). To obtain results that are comparable across regions, we therefore recommend expanding the dataset with additional annotations from other regions of the Global South.
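
The reduction from six to four classes described in the abstract (merging the two-wheeled classes and excluding non-motorized three-wheeled vehicles) can be illustrated with a minimal Python sketch. The class names, the annotation layout (a list of dictionaries with a "label" key) and the function name below are assumptions made for illustration only; the abstract does not give the actual XWHEEL label names or file format.

```python
from typing import Dict, List, Optional

# Hypothetical label names: the abstract only states that the six classes
# depend on number of wheels, size and motorization.
# A target of None means the class is dropped from the reduced label set.
MERGE_MAP: Dict[str, Optional[str]] = {
    "two_wheeled_non_motorized": "two_wheeled",    # merged
    "two_wheeled_motorized": "two_wheeled",        # merged
    "three_wheeled_non_motorized": None,           # excluded (rare)
    "three_wheeled_motorized": "three_wheeled_motorized",
    "four_wheeled_small": "four_wheeled_small",
    "four_wheeled_large": "four_wheeled_large",
}


def remap_annotations(annotations: List[dict]) -> List[dict]:
    """Return a copy of the annotations with labels reduced to four classes."""
    remapped = []
    for ann in annotations:
        target = MERGE_MAP[ann["label"]]
        if target is None:
            continue  # skip vehicles of an excluded class
        remapped.append({**ann, "label": target})
    return remapped
```

Applying such a mapping before training leaves the oriented bounding box geometry untouched and changes only the class label, so the same annotations can serve both the six-class and the four-class configurations.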