<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Annals</journal-id>
<journal-title-group>
<journal-title>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Annals</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9050</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-annals-XI-1-2026-313-2026</article-id>
<title-group>
<article-title>Fine-grained Vegetation Segmentation in Complex Urban Park Environments Using a Deeply Supervised Parallel SegFormer</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zhang</surname>
<given-names>Haixin</given-names>
<ext-link>https://orcid.org/0009-0002-6163-1757</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zhang</surname>
<given-names>Qinying</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Department of Landscape Architecture, Tianjin University, 300072 Tianjin, China</addr-line>
</aff>
<pub-date pub-type="epub">
<day>03</day>
<month>07</month>
<year>2026</year>
</pub-date>
<volume>XI-1-2026</volume>
<fpage>313</fpage>
<lpage>320</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Haixin Zhang</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri"  xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-1-2026/313/2026/isprs-annals-XI-1-2026-313-2026.html">This article is available from https://isprs-annals.copernicus.org/articles/XI-1-2026/313/2026/isprs-annals-XI-1-2026-313-2026.html</self-uri>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-1-2026/313/2026/isprs-annals-XI-1-2026-313-2026.pdf">The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/XI-1-2026/313/2026/isprs-annals-XI-1-2026-313-2026.pdf</self-uri>
<abstract>
<p>Accurate vegetation mapping in complex urban environments is essential for ecological monitoring, biodiversity assessment, and sustainable park management. However, fine-grained vegetation segmentation remains challenging because of the high diversity of plant species, overlapping canopies, and the interference of artificial objects. To address these challenges, a deeply supervised parallel architecture based on the SegFormer backbone was proposed in this paper. The model incorporated a SegFormer-ASPP-low-level (SAL) head, which fused high-level semantic representations, multi-scale contextual information, and low-level spatial details through a parallel decoding mechanism. Two auxiliary heads, a pyramid pooling module (PSP) and a fully convolutional network (FCN), were added to provide deep supervision and improve the recognition of blurred boundaries and rare categories. High-resolution UAV imagery was used to perform fine-grained semantic segmentation of 17 vegetation categories. The dataset included multiple tree species as well as non-tree classes such as &lt;em&gt;Nelumbo&lt;/em&gt; sp. (lotus) and dead trees. Experimental results showed that our model achieved a mean intersection over union (mIoU) of 73.57%, outperforming architectures such as SegFormer-b1, DeepLab v3+, ConvNeXt and SCTNet. Visual analysis further demonstrated the model&amp;rsquo;s robustness in complex urban park scenes, showing superior boundary delineation, improved recognition of small and spectrally similar species, and resilience to interference from artificial objects like plastic lawns and landscape lighting. The proposed approach offers valuable insights for precision forestry, ecological monitoring, and intelligent UAV-based remote sensing applications.</p>
</abstract>
<counts><page-count count="8"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>