ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences
Articles | Volume X-5/W2-2025
https://doi.org/10.5194/isprs-annals-X-5-W2-2025-459-2025
19 Dec 2025

Patch-Based Self-Supervised Learning for Road Damage Classification: A Case Study on the RDD2022 dataset of Indian test site

Poonam Jayhind Pardeshi, Shailja, Anushka Chaudhary, Arya, and Manohar Yadav

Keywords: Road, Image, Self-supervised learning, MobileNetV2, Classification, Road infrastructure monitoring

Abstract. Timely detection of road surface damage is essential for maintaining safe and efficient transportation infrastructure. In developing countries such as India, damage such as cracks, potholes, and surface wear is worsened by climatic conditions, heavy traffic, and inconsistent maintenance. Manual inspection is resource-intensive and does not scale, which underscores the need for automated, learning-based approaches. This study proposes a lightweight, patch-based self-supervised learning (SSL) framework using MobileNetV2 for classifying five road damage types in the India-specific subset of the Road Damage Detection 2022 (RDD2022) dataset. Although RDD2022 supports deep learning for road damage detection, patch-wise modeling on it remains largely unexplored. The methodology comprises four stages: image patching, SSL-based pretraining with augmentation, supervised fine-tuning on labeled patches, and evaluation. SSL enables representation learning from unlabeled data, which is crucial in domains with limited annotations. Combined with patch-based sampling, it captures localized damage features, improving performance under intra-class imbalance. MobileNetV2 is selected for its fast convergence and edge-device compatibility, making it suitable for deployment in low-resource settings. The proposed model achieves an overall accuracy of 78% and a weighted F1-score of 78% on the test set. Training accuracy improves steadily over 25 epochs, exceeding 91%, while validation accuracy stabilizes at approximately 78%. Compared to standard CNN architectures, the framework achieves competitive performance without large pretrained models or high-end computational resources. The approach supports real-time inference, geospatial integration, and potential applications in infrastructure monitoring and urban planning. Validation against state-of-the-art models confirms the framework’s effectiveness and relevance for scalable, region-specific road damage classification.
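
For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how overall accuracy and a support-weighted F1-score are conventionally computed from patch-level labels. It is not the authors' evaluation code. The function name `accuracy_and_weighted_f1` and the class labels in the toy data are hypothetical, and the toy labels are illustrative only, not results from the study.

```python
from collections import Counter


def accuracy_and_weighted_f1(y_true, y_pred):
    """Return (overall accuracy, support-weighted F1) for two label lists."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))

    support = Counter(y_true)    # number of true samples per class
    predicted = Counter(y_pred)  # number of predictions per class
    true_pos = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    weighted_f1 = 0.0
    for label, n_true in support.items():
        tp = true_pos[label]
        # F1 = 2TP / (2TP + FP + FN), and 2TP + FP + FN = predicted + support
        f1 = 2 * tp / (predicted[label] + n_true)
        # Each class contributes in proportion to its share of true samples
        weighted_f1 += f1 * n_true / n

    return correct / n, weighted_f1


# Toy patch labels (hypothetical class names, not data from the paper)
toy_true = ["crack", "crack", "pothole", "pothole", "pothole", "wear"]
toy_pred = ["crack", "pothole", "pothole", "pothole", "wear", "wear"]
toy_accuracy, toy_f1 = accuracy_and_weighted_f1(toy_true, toy_pred)
```

Because each class's F1 is weighted by its share of true samples, frequent classes dominate the weighted score. That is one way overall accuracy and weighted F1 can land on similar values, as the matching 78% figures above do, although the abstract does not report the per-class breakdown.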
