<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Annals</journal-id>
<journal-title-group>
<journal-title>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Annals</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9050</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-annals-XI-1-2026-255-2026</article-id>
<title-group>
<article-title>A Category-Specific Prompt Strategy for Semantic 3D Indoor Mapping Using RGB-D Camera</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Hou</surname>
<given-names>Jiwei</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Volland</surname>
<given-names>Vivien</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Karam</surname>
<given-names>Samer</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Iwaszczuk</surname>
<given-names>Dorota</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Remote Sensing and Image Analysis, Department of Civil and Environmental Engineering, Technical University of Darmstadt, 64287 Darmstadt, Germany</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Geodetic Measurement Systems and Sensor Technology, Department of Civil and Environmental Engineering, Technical University of Darmstadt, 64287 Darmstadt, Germany</addr-line>
</aff>
<pub-date pub-type="epub">
<day>03</day>
<month>07</month>
<year>2026</year>
</pub-date>
<volume>XI-1-2026</volume>
<fpage>255</fpage>
<lpage>262</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Jiwei Hou et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this license, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-1-2026/255/2026/isprs-annals-XI-1-2026-255-2026.html">This article is available from https://isprs-annals.copernicus.org/articles/XI-1-2026/255/2026/isprs-annals-XI-1-2026-255-2026.html</self-uri>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-1-2026/255/2026/isprs-annals-XI-1-2026-255-2026.pdf">The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/XI-1-2026/255/2026/isprs-annals-XI-1-2026-255-2026.pdf</self-uri>
<abstract>
<p>Semantic 3D indoor mapping often depends on supervised learning and large annotated datasets, limiting scalability across diverse environments. This work introduces a category-specific prompt strategy for semantic 3D mapping using RGB-D cameras, integrating RGB-D SLAM with the Segment Anything Model 2 (SAM2) to enable annotation-efficient reconstruction. Keyframes and trajectories extracted from SLAM provide spatial references, while SAM2 performs zero-shot segmentation guided by a Category-Wise Prompt Segmentation Strategy (CPSS), which segments structural and functional elements (e.g., floors, doors, staircases) by category to reduce prompt interference and manual effort. The segmented keyframes are then fused with depth and pose data to produce instance-level semantic point clouds. Experiments on custom RGB-D sequences and selected ScanNet scenes demonstrate centimeter-scale geometric consistency and strong semantic consistency, with mIoU values up to 0.89 on the custom dataset and 0.98 on ScanNet. The resulting semantic point clouds are clean, structured, and require minimal post-processing, showing that the proposed strategy provides an efficient and scalable solution for semantic 3D indoor mapping without retraining or environment-specific supervision.</p>
</abstract>
<counts><page-count count="8"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>