Height Estimation from Single Optical Images Using KANU-Net Architecture
Keywords: Deep Learning, KANU-Net, Digital Elevation Model, Height Estimation, Google Imagery, Urban Analysis
Abstract. Monocular height estimation from single optical images is important for urban mapping and remote sensing, but remains challenging in heterogeneous urban scenes. We introduce KANU-Net, a U-Net variant that integrates Kolmogorov–Arnold Network (KAN) layers, which use functional basis expansions to enrich feature representation. KANU-Net is designed to better capture complex spatial patterns and multi-scale structures in aerial imagery. The method was evaluated on high-resolution (1 m) optical imagery from two urban areas: Utrecht (Google imagery) and Potsdam (ISPRS benchmark). The input imagery was tiled into 256×256-pixel patches, augmented, and prepared for training and testing. Qualitative assessment shows that the model produces detailed and spatially consistent height maps across different urban morphologies. Quantitative evaluation supports this, with RMSE values of 3.43 m and 3.29 m for Utrecht and Potsdam, respectively, and threshold accuracies (δ₁) above 0.43 and 0.50, respectively. The results illustrate the feasibility of incorporating KAN layers into encoder–decoder architectures for monocular height estimation and point to KANU-Net as a promising direction for further research in single-image 3D urban reconstruction.
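For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how RMSE and δ₁ are commonly computed for a predicted height map against a reference. It assumes the definition of δ₁ that is conventional in monocular depth estimation (the fraction of pixels whose ratio max(pred/gt, gt/pred) is below 1.25); the abstract does not state the exact definition used, nor how zero-height ground pixels are treated. The function names, the 1.25 threshold, and the eps masking are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def rmse(pred, gt):
    """Root-mean-square error between predicted and reference heights (metres)."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.sqrt(np.mean((pred - gt) ** 2)))


def delta1(pred, gt, threshold=1.25, eps=1e-6):
    """Fraction of pixels with max(pred/gt, gt/pred) < threshold.

    Assumption: pixels where either height is <= eps are excluded,
    because the ratio is undefined there. The paper may handle
    ground-level pixels differently.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    valid = (pred > eps) & (gt > eps)
    if not valid.any():
        return float("nan")
    ratio = np.maximum(pred[valid] / gt[valid], gt[valid] / pred[valid])
    return float(np.mean(ratio < threshold))


# Toy 2x2 height maps in metres (made-up values for illustration only).
gt = np.array([[10.0, 20.0], [5.0, 8.0]])
pred = np.array([[11.0, 26.0], [5.0, 4.0]])

print(f"RMSE:    {rmse(pred, gt):.2f} m")
print(f"delta_1: {delta1(pred, gt):.2f}")
```

Under this definition a δ₁ of 0.50 would mean that half of the valid pixels are predicted to within a factor of 1.25 of the reference height, which is a relative measure, whereas RMSE reports an absolute error in metres.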
