FEG-VON: Frontier Embedding Graph for Efficient Visual Object Navigation

Published in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2025

Recommended citation: Yingru Dai, Pengwei Xie, Yikai Liu, Siang Chen, Wenming Yang, Guijin Wang. (2025). FEG-VON: Frontier Embedding Graph for Efficient Visual Object Navigation.

Abstract

Visual object navigation, which requires agents to locate target objects in novel environments from egocentric visual observations, remains a critical challenge in Embodied AI. We propose FEG-VON, a training-free framework that constructs and maintains a Frontier Embedding Graph for efficient Visual Object Navigation. The graph initializes frontier embeddings using Vision Language Models (VLMs), where visual observations are encoded into spatially anchored semantic embeddings through cross-modal alignment with target text descriptors. We then update the graph by aggregating spatio-temporal semantic relations across frontiers, enabling online adaptation to new targets via similarity scoring without remapping. Evaluation results on public benchmarks demonstrate the superior performance of FEG-VON in both single- and multi-object navigation tasks compared with state-of-the-art methods. Crucially, FEG-VON eliminates the dependency on task-specific training for exploration and advances the feasibility of zero-shot navigation in open-world environments.
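
The sketch below illustrates the general idea described in the abstract: maintaining a graph of spatially anchored frontier embeddings, aggregating semantic information across neighboring frontiers, and ranking frontiers against a target text embedding by cosine similarity. It is a minimal illustration only, not the paper's implementation; the class name `FrontierEmbeddingGraph`, the aggregation weight `alpha`, and the assumption that VLM image/text embeddings (e.g., from a CLIP-style encoder) are precomputed are all placeholders introduced here for clarity.

```python
import numpy as np

class FrontierEmbeddingGraph:
    """Illustrative sketch of a frontier embedding graph (not the official FEG-VON code).

    Each frontier node stores a 2D position and a semantic embedding produced by a
    vision-language encoder from the observation anchored at that frontier.
    """

    def __init__(self, alpha: float = 0.5):
        self.alpha = alpha          # assumed blending weight between own and neighbor embeddings
        self.positions = {}         # node id -> (x, y)
        self.embeddings = {}        # node id -> unit-norm embedding vector
        self.edges = {}             # node id -> set of neighbor ids

    def add_frontier(self, node_id, position, embedding, neighbors=()):
        emb = np.asarray(embedding, dtype=np.float32)
        self.positions[node_id] = position
        self.embeddings[node_id] = emb / (np.linalg.norm(emb) + 1e-8)
        self.edges[node_id] = set(neighbors)
        for n in neighbors:
            self.edges.setdefault(n, set()).add(node_id)

    def propagate(self):
        """Aggregate semantic information across connected frontiers (one smoothing step)."""
        updated = {}
        for node_id, emb in self.embeddings.items():
            nbrs = [self.embeddings[n] for n in self.edges.get(node_id, ()) if n in self.embeddings]
            if nbrs:
                mixed = self.alpha * emb + (1.0 - self.alpha) * np.mean(nbrs, axis=0)
            else:
                mixed = emb
            updated[node_id] = mixed / (np.linalg.norm(mixed) + 1e-8)
        self.embeddings = updated

    def best_frontier(self, target_text_embedding):
        """Rank frontiers by cosine similarity to the target text embedding."""
        t = np.asarray(target_text_embedding, dtype=np.float32)
        t = t / (np.linalg.norm(t) + 1e-8)
        scores = {nid: float(emb @ t) for nid, emb in self.embeddings.items()}
        best = max(scores, key=scores.get)
        return best, scores


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    graph = FrontierEmbeddingGraph(alpha=0.6)
    # Toy 8-dim embeddings stand in for real VLM features.
    graph.add_frontier("f0", (1.0, 2.0), rng.normal(size=8))
    graph.add_frontier("f1", (3.0, 1.0), rng.normal(size=8), neighbors=["f0"])
    graph.add_frontier("f2", (0.5, 4.0), rng.normal(size=8), neighbors=["f0"])
    graph.propagate()
    target = rng.normal(size=8)  # would be a text embedding of e.g. "a chair" in practice
    best, scores = graph.best_frontier(target)
    print("best frontier:", best, "scores:", scores)
```

Because scoring only compares stored frontier embeddings against a new target text embedding, switching to a different target object would require recomputing one text embedding rather than rebuilding the graph, which is the intuition behind the "online adaptation without remapping" claim in the abstract.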