A zero-shot object navigation framework for unknown multi-floor environments, achieving state-of-the-art results on open-world indoor navigation.
A simulation framework for embodied social interaction and scalable data generation that integrates AI agent interfaces such as OpenClaw, Codex, and Claude Code.
An open-ended vision-language navigation framework, including a benchmark, a method, and real-world validation on a Unitree Go2 robot.
A camera-only 4D occupancy forecasting project for autonomous driving, designed to improve efficiency while preserving strong forecasting accuracy.
An egocentric AI assistant for smart glasses that delivers step-by-step support for daily procedural tasks.