<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Annals</journal-id>
<journal-title-group>
<journal-title>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Annals</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9050</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-annals-XI-1-2026-303-2026</article-id>
<title-group>
<article-title>Attention-guided Multi-Scale Deep Learning Approach for Tree Health Detection Using Very High-Resolution Aerial Imagery</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Meshkini</surname>
<given-names>Khatereh</given-names>
<ext-link>https://orcid.org/0000-0003-2454-3379</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Beloiu</surname>
<given-names>Mirela</given-names>
<ext-link>https://orcid.org/0000-0002-3592-8170</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Xia</surname>
<given-names>Zhongyu</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Griess</surname>
<given-names>Verena C.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Department of Environmental Systems Science, Institute of Terrestrial Ecosystems, ETH Zurich, 8092 Zurich, Switzerland</addr-line>
</aff>
<pub-date pub-type="epub">
<day>03</day>
<month>07</month>
<year>2026</year>
</pub-date>
<volume>XI-1-2026</volume>
<fpage>303</fpage>
<lpage>311</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Khatereh Meshkini et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link>.</license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-1-2026/303/2026/isprs-annals-XI-1-2026-303-2026.html">This article is available from https://isprs-annals.copernicus.org/articles/XI-1-2026/303/2026/isprs-annals-XI-1-2026-303-2026.html</self-uri>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-1-2026/303/2026/isprs-annals-XI-1-2026-303-2026.pdf">The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/XI-1-2026/303/2026/isprs-annals-XI-1-2026-303-2026.pdf</self-uri>
<abstract>
<p>Monitoring tree health is essential for detecting early signs of stress, defoliation, and potential mortality, supporting effective forest management, ecosystem conservation, and early warning systems. Advances in deep learning have enabled automated analysis of trees in remote sensing imagery through object detection methods that leverage both spectral and spatial information. However, assessing tree defoliation remains challenging because subtle differences between defoliation levels make accurate classification difficult. To address this challenge, we propose the hybrid ResNet-Swin Transformer, an object detection architecture built on the Faster R-CNN framework that incorporates a fused ResNet and Swin Transformer backbone with attention-based feature fusion. This design captures rich, multiscale representations by combining convolutional and transformer-based features, and it progressively refines them through channel-wise attention blocks for robust detection and classification. The architecture was evaluated on a very high-resolution aerial dataset from Switzerland, partially annotated with five classes: Conifer (healthy), Conifer (defoliated), Broadleaf (healthy), Broadleaf (defoliated), and Dead. Comparative experiments against state-of-the-art object detection and classification methods show that the proposed approach achieves higher accuracy and robustness, highlighting its potential for precise and reliable automated tree health monitoring.</p>
</abstract>
<counts><page-count count="9"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>