Faster than Light: An Embedded-Efficient Matching Model with ReLU Linear Attention
Affiliation: North Automatic Control Technology Institute, Taiyuan, China
Abstract. Deep learning-based image matching faces a critical challenge when deployed on computationally constrained embedded aerial devices. Transformer-based architectures, particularly the scaled dot-product attention mechanism, incur high computational costs that limit inference speed for real-time applications. To address this bottleneck, we propose FastGlue, a sparse feature matching algorithm that adapts the LightGlue architecture through two targeted modifications: replacing the scaled dot-product attention with a ReLU-based linear attention module, and reducing the depth of the graph neural network. These changes reduce computational complexity while maintaining competitive matching performance. Evaluations on the HPatches and MegaDepth-1500 benchmarks show that FastGlue achieves accuracy comparable to LightGlue while reducing inference time from 20.05 ms to 17.05 ms on GPU and from 840.45 ms to 665.85 ms on an RK3588 embedded CPU. Our work demonstrates that targeted architectural simplifications can yield meaningful efficiency gains for deep learning-based feature matching on resource-constrained platforms.
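To make the first modification concrete, the following is a minimal NumPy sketch of ReLU-based linear attention. The abstract does not give the exact formulation used in FastGlue, so this assumes the commonly used form in which the softmax similarity is replaced by the product ReLU(q)·ReLU(k), normalized by its sum over keys. The function names, the single-head layout, and the `eps` stabilizer are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def relu_linear_attention(q, k, v, eps=1e-6):
    """Hypothetical single-head ReLU linear attention (assumed form).

    q: (n, d) queries, k: (m, d) keys, v: (m, d_v) values.
    Keys and values are summarized first, so no (n, m) score
    matrix is ever built; cost grows linearly in n and m.
    """
    q = np.maximum(q, 0.0)      # ReLU feature map on queries
    k = np.maximum(k, 0.0)      # ReLU feature map on keys
    kv = k.T @ v                # (d, d_v) key-value summary
    k_sum = k.sum(axis=0)       # (d,) normalizer summary
    num = q @ kv                # (n, d_v) unnormalized output
    den = q @ k_sum + eps       # (n,) per-query normalizer
    return num / den[:, None]


def relu_quadratic_attention(q, k, v, eps=1e-6):
    """Reference with the same weights, built via the full (n, m) matrix."""
    scores = np.maximum(q, 0.0) @ np.maximum(k, 0.0).T
    weights = scores / (scores.sum(axis=1, keepdims=True) + eps)
    return weights @ v


# Small random example: 5 query keypoints, 7 key keypoints.
rng = np.random.default_rng(0)
q = rng.standard_normal((5, 4))
k = rng.standard_normal((7, 4))
v = rng.standard_normal((7, 3))
out = relu_linear_attention(q, k, v)
```

Both functions compute the same result up to floating-point rounding; only the order of the matrix products differs. Under this assumed form, that reordering is where the saving comes from: the quadratic version costs on the order of n·m·d operations, while the linear version costs on the order of (n + m)·d·d_v.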
