SELF-SUPERVISED LEARNING FOR MONOCULAR DEPTH ESTIMATION FROM AERIAL IMAGERY

Hermann, M.; Ruf, B.; Weinmann, M.; Hinz, S.

doi:https://doi.org/10.5194/isprs-annals-V-2-2020-357-2020

Articles | Volume V-2-2020

© Author(s) 2020. This work is distributed under
the Creative Commons Attribution 4.0 License.

03 Aug 2020

Keywords: Monocular Depth Estimation, Self-Supervised Learning, Deep Learning, Convolutional Neural Networks, Self-Improving, Online Processing, Oblique Aerial Imagery

Abstract. Supervised-learning-based methods for monocular depth estimation usually require large amounts of extensively annotated training data. In the case of aerial imagery, this ground truth is particularly difficult to acquire. Therefore, in this paper, we present a method for self-supervised learning for monocular depth estimation from aerial imagery that does not require annotated training data. For this, we use only an image sequence from a single moving camera and learn to simultaneously estimate depth and pose information. By sharing the weights between pose and depth estimation, we achieve a relatively small model, which favors real-time application. We evaluate our approach on three diverse datasets and compare the results to conventional methods that estimate depth maps based on multi-view geometry. We achieve an accuracy δ_1.25 of up to 93.5 %. In addition, we have paid particular attention to the generalization of a trained model to unknown data and the self-improving capabilities of our approach. We conclude that, even though the results of monocular depth estimation are inferior to those achieved by conventional methods, they are well suited to provide a good initialization for methods that rely on image matching or to provide estimates in regions where image matching fails, e.g. occluded or texture-less regions.
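
As an illustration of the accuracy measure named in the abstract, the following is a minimal sketch of a threshold accuracy with threshold 1.25, assuming the definition commonly used in the monocular depth estimation literature: the fraction of valid pixels for which max(predicted/reference, reference/predicted) is below the threshold. The abstract does not spell out the definition, so this is an assumption, and the function name `threshold_accuracy` and the sample values are hypothetical.

```python
def threshold_accuracy(predicted, reference, threshold=1.25):
    """Fraction of valid depth pairs whose ratio stays below `threshold`.

    Assumed definition (not taken from the abstract): a pair counts as
    correct when max(p / r, r / p) < threshold. Pairs with a non-positive
    depth are treated as invalid and skipped.
    """
    hits = 0
    valid = 0
    for p, r in zip(predicted, reference):
        if p <= 0 or r <= 0:
            continue  # no usable depth for this pixel
        valid += 1
        if max(p / r, r / p) < threshold:
            hits += 1
    if valid == 0:
        raise ValueError("no valid depth pairs")
    return hits / valid


# Hypothetical depths in metres; the last pair has no reference depth.
predicted = [10.0, 20.0, 33.0, 50.0]
reference = [11.0, 30.0, 30.0, 0.0]
print(threshold_accuracy(predicted, reference))
```

Under this definition, a reported value of 93.5 % would mean that 93.5 % of the evaluated pixels have a predicted depth within a factor of 1.25 of the reference depth.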
