<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Annals</journal-id>
<journal-title-group>
<journal-title>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Annals</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9050</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-annals-XI-2-2026-503-2026</article-id>
<title-group>
<article-title>Target Vessel Identification in Aerial Search Imagery via MLLM-Based Attribute Extraction and Geolocation Fusion</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Oh</surname>
<given-names>Jeonghyo</given-names>
<ext-link>https://orcid.org/0009-0003-4083-101X</ext-link>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Oh</surname>
<given-names>Youngon</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Lee</surname>
<given-names>Impyeong</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Dept. of Geoinformatics, University of Seoul, Seoul, Republic of Korea</addr-line>
</aff>
<pub-date pub-type="epub">
<day>03</day>
<month>07</month>
<year>2026</year>
</pub-date>
<volume>XI-2-2026</volume>
<fpage>503</fpage>
<lpage>509</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2026 Jeonghyo Oh et al.</copyright-statement>
<copyright-year>2026</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 4.0 International License. To view a copy of this licence, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/4.0/">https://creativecommons.org/licenses/by/4.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-2-2026/503/2026/isprs-annals-XI-2-2026-503-2026.html">This article is available from https://isprs-annals.copernicus.org/articles/XI-2-2026/503/2026/isprs-annals-XI-2-2026-503-2026.html</self-uri>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/XI-2-2026/503/2026/isprs-annals-XI-2-2026-503-2026.pdf">The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/XI-2-2026/503/2026/isprs-annals-XI-2-2026-503-2026.pdf</self-uri>
<abstract>
<p>Identifying a distressed vessel among many ships detected in wide-area aerial imagery is a critical challenge in maritime Search and Rescue (SAR) operations. Conventional methods cannot determine which vessel matches the incident description, especially when Automatic Identification System (AIS) reports are uncertain. This study proposes an integrated framework that combines semantic attribute extraction by a Multi-modal Large Language Model (MLLM) with geolocation fusion to prioritize candidate vessels according to their consistency with scenarios derived from a Situation Report (SITREP). The method detects vessels using YOLOv8, tracks them with Deep Simple Online and Real-time Tracking (DeepSORT), and georeferences the imagery using onboard metadata. The MLLM extracts appearance and status attributes from representative vessel images, and the scenario descriptions are converted to the same kind of attributes. Both attribute sets are encoded using MiniLM embeddings. Finally, semantic similarity is fused with geolocation proximity in a Support Vector Machine (SVM) classifier to produce a probability-ranked list of candidates. Experiments using real aerial search footage demonstrate robust identification performance across a range of scenario quality levels. The correct vessel appears within the top three candidates in more than 73% of cases and within the top five in more than 91%, even when attribute extraction is affected by low resolution, illumination effects, or missing scenario information. These results show that coarse semantic cues, when combined with approximate geolocation, provide a resilient basis for identifying target vessels under high uncertainty. The proposed framework offers a practical foundation for automated SAR decision support, enabling faster and more reliable prioritization during wide-area maritime search operations.</p>
</abstract>
<counts><page-count count="7"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>