<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Annals</journal-id>
<journal-title-group>
<journal-title>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Annals</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9050</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-annals-XI-2-2026-225-2026</article-id>
<title-group>
<article-title>MambaPanoptic: A Vision Mamba-based Structured State Space Framework for Panoptic Segmentation</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Cheng</surname>
<given-names>Qing</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Bertolini</surname>
<given-names>Damiano</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zhang</surname>
<given-names>Wei</given-names>
</name>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Wang</surname>
<given-names>Dong</given-names>
</name>
<xref ref-type="aff" rid="aff5">
<sup>5</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Zeller</surname>
<given-names>Niclas</given-names>
</name>
<xref ref-type="aff" rid="aff6">
<sup>6</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Cremers</surname>
<given-names>Daniel</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Department of Computer Science, Technical University of Munich, Munich, Germany</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Munich Center for Machine Learning (MCML), Munich, Germany</addr-line>
</aff>
<aff id="aff3">
<label>3</label>
<addr-line>Department of Electronics, Information and Bioengineering, Polytechnic University of Milan, Milan, Italy</addr-line>
</aff>
<aff id="aff4">
<label>4</label>
<addr-line>Institute for Photogrammetry and Geoinformatics, University of Stuttgart, Stuttgart, Germany</addr-line>
</aff>
<aff id="aff5">
<label>5</label>
<addr-line>Department of Photogrammetry and Remote Sensing, Wuhan University, Wuhan, China</addr-line>
</aff>
<aff id="aff6">
<label>6</label>
<addr-line>Faculty of Electrical and Information Engineering, Karlsruhe University of Applied Sciences, Karlsruhe, Germany</addr-line>
</aff>
<pub-date pub-type="epub">
<day>03</day>
<month>07</month>
<year>2026</year>
</pub-date>
<volume>XI-2-2026</volume>
<fpage>225</fpage>
<lpage>233</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Qing Cheng et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-2-2026/225/2026/isprs-annals-XI-2-2026-225-2026.html">This article is available from https://isprs-annals.copernicus.org/articles/XI-2-2026/225/2026/isprs-annals-XI-2-2026-225-2026.html</self-uri>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-2-2026/225/2026/isprs-annals-XI-2-2026-225-2026.pdf">The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/XI-2-2026/225/2026/isprs-annals-XI-2-2026-225-2026.pdf</self-uri>
<abstract>
<p>Panoptic segmentation requires the simultaneous recognition of countable <italic>thing</italic> instances and amorphous <italic>stuff</italic> regions, placing joint demands on long-range context modelling, multi-scale feature representation, and efficient dense prediction. Existing convolutional and transformer-based methods struggle to satisfy all three requirements concurrently: convolutional architectures are limited in their capacity to model long-range dependencies, while transformer-based methods incur quadratic computational cost that is prohibitive at high resolutions. In this paper, we propose MambaPanoptic, a fully Mamba-based panoptic segmentation framework that addresses these limitations through two principal contributions. First, we introduce MambaFPN, a top-down feature pyramid that leverages Mamba blocks to generate globally coherent, multi-scale feature representations with linear computational complexity. Second, we adopt a PanopticFCN-style kernel generator that produces unified <italic>thing</italic> and <italic>stuff</italic> kernels for proposal-free panoptic prediction, enhanced by a QuadMamba-based feature refinement module applied at multiple network stages. Experiments on the Cityscapes and COCO panoptic segmentation benchmarks demonstrate that MambaPanoptic consistently outperforms PanopticDeepLab and PanopticFCN under comparable model sizes, and matches or surpasses Mask2Former on Cityscapes in PQ and AP while requiring fewer parameters.</p>
</abstract>
<counts><page-count count="9"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>