<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Annals</journal-id>
<journal-title-group>
<journal-title>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Annals</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9050</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-annals-X-4-W8-2025-793-2026</article-id>
<title-group>
<article-title>High Resolution Multi-View Image-based Building Type Classification Using Deep Learning</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Tavakoligargari</surname>
<given-names>Mohammad Hassan</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Ghasemzadeh</surname>
<given-names>Maryam</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Hazrati</surname>
<given-names>Nima</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Arefi</surname>
<given-names>Hossein</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>i3mainz, Mainz University of Applied Sciences, Germany</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Iran</addr-line>
</aff>
<pub-date pub-type="epub">
<day>29</day>
<month>05</month>
<year>2026</year>
</pub-date>
<volume>X-4/W8-2025</volume>
<fpage>793</fpage>
<lpage>799</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Mohammad Hassan Tavakoligargari et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/X-4-W8-2025/793/2026/isprs-annals-X-4-W8-2025-793-2026.html">This article is available from https://isprs-annals.copernicus.org/articles/X-4-W8-2025/793/2026/isprs-annals-X-4-W8-2025-793-2026.html</self-uri>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/X-4-W8-2025/793/2026/isprs-annals-X-4-W8-2025-793-2026.pdf">The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/X-4-W8-2025/793/2026/isprs-annals-X-4-W8-2025-793-2026.pdf</self-uri>
<abstract>
<p>Classifying building types is an important step toward optimizing urban planning, enhancing disaster management strategies, and advancing sustainable development objectives. This study presents a multi-view deep learning approach that achieves an overall classification accuracy of 75.8% in distinguishing building types. Using OpenStreetMap (OSM) building tags as ground-truth labels, a multi-view image dataset of 10,360 buildings from the German states of Baden-Württemberg and Rhineland-Palatinate was generated. For each building, the dataset includes aerial images at multiple zoom levels as well as street view images, and each building is assigned to one of four categories: commercial, industrial, public, or residential. The approach employs two convolutional neural network (CNN) architectures, VGG16 and Inception3, and trains a separate model of each architecture on each view. All CNN models were pretrained on ImageNet before being fine-tuned on the building images. The predictions of the separately trained models were first fused using model blending to identify the best combination, and a stacking ensemble framework with a Random Forest meta-model then produced the final classification. Experimental results show that this model fusion yields a 16% relative improvement in classification accuracy over the individually trained models. The paper highlights the importance of integrating different types of views with state-of-the-art CNN architectures and of employing model fusion methods for improved urban building classification. Future research will focus on enhancing model fusion techniques and possibly enriching the classification by incorporating statistical data on population, income distribution, and infrastructure.</p>
</abstract>
<counts><page-count count="7"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>