Mapping of urban tree canopy in high-resolution aerial imagery using deep neural networks
Keywords: Urban forestry, Semantic segmentation, DeepLabV3, High-resolution aerial imagery, Deep learning
Abstract. Deep learning has proven effective for urban tree mapping, but validated benchmarks and comparative methodological studies are still lacking for the diverse urban landscapes of Brazil. To address this gap, this work presents a deep-learning workflow that produces urban tree canopy maps from 25 cm RGB orthophotos. Orthophotos covering ten São Paulo cities were compiled; seven cities were used for training and validation and three for independent testing. The DeepLabV3 architecture with a ResNet-152 backbone was assessed under three configurations: (i) a Balanced Cross-Entropy (BCE) baseline, (ii) BCE plus PointRend boundary refinement, and (iii) BCE combined with a 0.5-weighted Dice term. The BCE baseline delivered the highest mean IoU (0.83) and F1-score (0.91). PointRend increased recall but introduced systematic false positives on heterogeneous roofs and in shaded riparian zones. The BCE+Dice variant recovered recall without raising commission error, achieving the highest balanced accuracy (0.96). The workflow delineates canopy with fine spatial detail and processes 2.8 × 10⁶ m² in under 30 minutes on a single RTX 4000 Ada workstation, demonstrating a practical, scalable solution for statewide tree-inventory production.
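
For orientation, the third configuration and the reported metrics can be sketched in a few lines of plain Python. This is an illustrative sketch under stated assumptions, not the training code used in this work: the class-balancing scheme (weighting each class by the pixel fraction of the other), the additive form BCE + 0.5 × Dice, and the function names `balanced_bce`, `dice_loss`, `combined_loss`, and `canopy_metrics` are all hypothetical choices made for the example. Inputs are flattened per-pixel canopy probabilities and binary labels.

```python
import math


def balanced_bce(probs, labels, eps=1e-7):
    """Class-balanced binary cross-entropy over flattened pixels.

    Assumption: each class is weighted by the pixel fraction of the
    other class; the weighting used in the paper may differ.
    """
    n = len(labels)
    pos_frac = sum(labels) / n
    w_pos, w_neg = 1.0 - pos_frac, pos_frac
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1.0 - p)
    return total / n


def dice_loss(probs, labels, eps=1e-7):
    """Soft Dice loss: 1 - 2|P∩Y| / (|P| + |Y|)."""
    inter = sum(p * y for p, y in zip(probs, labels))
    return 1.0 - (2.0 * inter + eps) / (sum(probs) + sum(labels) + eps)


def combined_loss(probs, labels, dice_weight=0.5):
    """Assumed form of configuration (iii): BCE + 0.5 * Dice."""
    return balanced_bce(probs, labels) + dice_weight * dice_loss(probs, labels)


def canopy_metrics(pred, truth):
    """IoU, F1, and balanced accuracy for binary canopy masks (0/1)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    tpr = tp / (tp + fn) if (tp + fn) else 0.0  # recall on canopy pixels
    tnr = tn / (tn + fp) if (tn + fp) else 0.0  # recall on background pixels
    return {"iou": iou, "f1": f1, "balanced_accuracy": (tpr + tnr) / 2}


if __name__ == "__main__":
    labels = [1, 0, 1, 0]
    print(combined_loss([0.9, 0.1, 0.8, 0.2], labels))
    print(canopy_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```

Balanced accuracy averages the recall of the canopy and background classes, which is why a variant can lead on this metric without leading on IoU or F1: it rewards recovered canopy recall as long as background recall does not degrade.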
