PosiFusion: A Vehicle-to-Everything Cooperative Perception Framework with Positional Prior Fusion
Keywords: V2X, Cooperative perception, Autonomous driving
Abstract. Collaborative perception improves perception performance by letting agents share and fuse complementary multi-viewpoint observations. However, existing collaborative perception methods in V2X scenarios face two main challenges. First, they rely heavily on raw sensor observations for global perception, so communication bandwidth demands escalate rapidly as scene complexity grows. Second, they treat the trajectories and positions of traffic participants separately from sensor data, overlooking that vehicles, as intelligent agents, are simultaneously sources and targets of perception; this limits further gains in collaborative perception performance. To address these issues, this paper proposes a novel position-prior-enhanced collaborative perception network, PosiFusion. For communication, PosiFusion introduces a position-prior-based communication selection mechanism that uses prior location information to generate a confidence map over the global perception space; by transmitting only critical perceptual regions, it substantially reduces bandwidth requirements. For perception performance, PosiFusion incorporates a critical-area perception guidance module that builds a guidance map from the same priors, steering the network's attention toward observations from critical regions and thereby improving overall accuracy. We evaluate PosiFusion on two large-scale vehicular collaborative perception datasets, OPV2V and V2XSet. Experimental results show that PosiFusion outperforms existing state-of-the-art collaborative perception methods while incurring minimal communication cost.
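The two mechanisms named in the abstract can be illustrated with a minimal sketch. The code below is an illustration only, not the paper's implementation: the BEV grid size, cell resolution, Gaussian kernel width, and threshold tau are all assumed values. It renders shared prior vehicle positions as Gaussian bumps on a bird's-eye-view confidence map, thresholds that map into a binary mask that selects which regions to transmit (the communication selection idea), and reuses the map as a multiplicative guidance weight over BEV features (the perception guidance idea).

```python
import numpy as np

def confidence_map(positions, grid=(128, 128), cell=0.5, sigma=2.0):
    """Render prior vehicle positions (metres, ego-centred frame) as
    Gaussian bumps on a BEV grid; values near 1 mark likely-occupied
    regions. Grid size, cell size, and sigma are illustrative choices."""
    h, w = grid
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    conf = np.zeros(grid, dtype=np.float32)
    for px, py in positions:
        cx, cy = px / cell + w / 2, py / cell + h / 2  # metres -> cells
        bump = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
        conf = np.maximum(conf, bump)
    return conf

def select_regions(conf, tau=0.3):
    """Binary mask of critical regions; only these BEV cells would be
    shared between agents, reducing communication bandwidth."""
    return conf >= tau

# Toy usage: two prior positions received over V2X.
prior_positions = [(5.0, 10.0), (-8.0, -3.0)]
conf = confidence_map(prior_positions)
mask = select_regions(conf)
print(f"transmitted cells: {mask.sum()} / {mask.size} "
      f"({100 * mask.mean():.1f}% of the BEV grid)")

# Guidance: emphasize features in high-confidence areas.
feats = np.random.rand(64, *conf.shape).astype(np.float32)  # C x H x W
guided = feats * (1.0 + conf)[None]
```

In the full framework, the mask would gate which feature regions each agent broadcasts, and the guidance weighting would operate inside the fusion network rather than as a post-hoc multiply; both details go beyond what the abstract specifies.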
