<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v3.0 20080202//EN" "https://jats.nlm.nih.gov/nlm-dtd/publishing/3.0/journalpublishing3.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article" dtd-version="3.0" xml:lang="en">
<front>
<journal-meta>
<journal-id journal-id-type="publisher">ISPRS-Annals</journal-id>
<journal-title-group>
<journal-title>ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences</journal-title>
<abbrev-journal-title abbrev-type="publisher">ISPRS-Annals</abbrev-journal-title>
<abbrev-journal-title abbrev-type="nlm-ta">ISPRS Ann. Photogramm. Remote Sens. Spatial Inf. Sci.</abbrev-journal-title>
</journal-title-group>
<issn pub-type="epub">2194-9050</issn>
<publisher><publisher-name>Copernicus Publications</publisher-name>
<publisher-loc>Göttingen, Germany</publisher-loc>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.5194/isprs-annals-IV-1-W1-125-2017</article-id>
<title-group>
<article-title>VISUAL TRACKING UTILIZING OBJECT CONCEPT FROM DEEP LEARNING NETWORK</article-title>
</title-group>
<contrib-group><contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Xiao</surname>
<given-names>C.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Yilmaz</surname>
<given-names>A.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author" xlink:type="simple"><name name-style="western"><surname>Li</surname>
<given-names>S.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
</contrib>
</contrib-group><aff id="aff1">
<label>1</label>
<addr-line>Photogrammetric Computer Vision Laboratory, The Ohio State University, USA</addr-line>
</aff>
<aff id="aff2">
<label>2</label>
<addr-line>Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China</addr-line>
</aff>
<pub-date pub-type="epub">
<day>30</day>
<month>05</month>
<year>2017</year>
</pub-date>
<volume>IV-1/W1</volume>
<fpage>125</fpage>
<lpage>132</lpage>
<permissions>
<copyright-statement>Copyright: &#x000a9; 2017 C. Xiao et al.</copyright-statement>
<copyright-year>2017</copyright-year>
<license license-type="open-access">
<license-p>This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit <ext-link ext-link-type="uri" xlink:href="https://creativecommons.org/licenses/by/3.0/">https://creativecommons.org/licenses/by/3.0/</ext-link></license-p>
</license>
</permissions>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/IV-1-W1/125/2017/isprs-annals-IV-1-W1-125-2017.html">This article is available from https://isprs-annals.copernicus.org/articles/IV-1-W1/125/2017/isprs-annals-IV-1-W1-125-2017.html</self-uri>
<self-uri xlink:href="https://isprs-annals.copernicus.org/articles/IV-1-W1/125/2017/isprs-annals-IV-1-W1-125-2017.pdf">The full text article is available as a PDF file from https://isprs-annals.copernicus.org/articles/IV-1-W1/125/2017/isprs-annals-IV-1-W1-125-2017.pdf</self-uri>
<abstract>
<p>Although visual tracking has achieved good performance, it remains an open area of research, especially when the target undergoes severe
appearance changes that are not captured by the model. In this paper, we therefore replace the appearance model with a concept model
that is learned from large-scale datasets using a deep learning network. The concept model is a combination of high-level semantic
information learned from myriad objects with various appearances. In our tracking method, we generate the target’s concept
by combining the object concepts learned from a classification task. We also demonstrate that the last convolutional feature map can
be used to generate a heat map that highlights the possible location of the given target in new frames. Finally, in the proposed tracking
framework, we feed the target image, the search image cropped from the new frame, and their heat maps into a localization
network to find the final target position. Compared with other state-of-the-art trackers, the proposed method achieves comparable,
and at times better, performance while running in real time.</p>
</abstract>
<counts><page-count count="8"/></counts>
</article-meta>
</front>
<body/>
<back>
</back>
</article>