Abstract

State-of-the-art navigation methods leverage a spatial memory to generalize to new environments, but their occupancy maps are limited to capturing the geometric structures directly observed by the agent. We propose occupancy anticipation, where the agent uses its egocentric RGB-D observations to infer the occupancy state beyond the visible regions. In doing so, the agent builds its spatial awareness more rapidly, which facilitates efficient exploration and navigation in 3D environments. By exploiting context in both the egocentric views and top-down maps, our model successfully anticipates a broader map of the environment, with performance significantly better than strong baselines. Furthermore, when deploying our model for the sequential decision-making tasks of exploration and navigation, we outperform state-of-the-art methods on the Gibson and Matterport3D datasets.

[arXiv] [PDF] [Code] [Slides] [Blog]


Occupancy Anticipation

The state-of-the-art approaches to visual exploration and navigation are limited to encoding what the agent actually sees in front of it. Our key idea is to anticipate occupancy: rather than wait to directly observe a more distant or occluded region of the 3D environment, the agent infers the occupancy of unseen regions from the visual context in its egocentric view of the scene. For example, in the scene below, the agent can infer that the wall extends to its right, a corridor is present on its left, and the region immediately in front of it is free space.

Introduction figure
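To make the idea concrete, below is a minimal sketch of what an occupancy anticipation module could look like, assuming PyTorch. The OccupancyAnticipator name, the layer widths, and the late-fusion scheme are illustrative placeholders, not the exact architecture from the paper; see the code release for the real model.

import torch
import torch.nn as nn

class OccupancyAnticipator(nn.Module):
    """Predicts a local top-down occupancy map (occupied + explored channels)
    from an egocentric RGB image and the occupancy visible from depth."""

    def __init__(self, map_size=101):
        super().__init__()
        self.map_size = map_size
        # Encode the egocentric RGB view (hypothetical channel widths).
        self.rgb_encoder = nn.Sequential(
            nn.Conv2d(3, 32, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((map_size // 4, map_size // 4)),
        )
        # Encode the partial map projected from depth
        # (2 channels: occupied, explored).
        self.map_encoder = nn.Sequential(
            nn.Conv2d(2, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Decode the fused features into the anticipated occupancy map.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),
        )

    def forward(self, rgb, visible_map):
        f_rgb = self.rgb_encoder(rgb)
        f_map = self.map_encoder(visible_map)
        # Resize so the two feature maps align before fusing.
        f_map = nn.functional.interpolate(f_map, size=f_rgb.shape[-2:])
        fused = torch.cat([f_rgb, f_map], dim=1)
        logits = self.decoder(fused)
        logits = nn.functional.interpolate(
            logits, size=(self.map_size, self.map_size))
        return torch.sigmoid(logits)  # per-cell P(occupied), P(explored)

For instance, model = OccupancyAnticipator() maps a (1, 3, 128, 128) RGB tensor and a (1, 2, 101, 101) visible-occupancy tensor to a (1, 2, 101, 101) anticipated map, where the second channel marks cells the model considers explored even if they were never directly observed.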

Habitat challenge talk

Our approach was the winning entry in the Habitat 2020 PointNav Challenge.



ECCV spotlight talk

Our approach was accepted for a spotlight presentation in ECCV 2020.


Demos: exploration with occupancy anticipation

We show examples of an exploration agent that uses occupancy anticipation to efficiently map unseen Gibson validation environments. Each demo shows the agent's first-person view, the map built purely from directly observed regions, and the map built with occupancy anticipation.
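For reference, here is a minimal sketch of how a local anticipated map might be registered into the global map as the agent moves, assuming the agent's pose (position and heading) is known. The register_local_map function, the grid resolution, and the max-based fusion rule are hypothetical simplifications, not the exact aggregation used in the code release.

import numpy as np

def register_local_map(global_map, local_map, pose, cell_size=0.05):
    """Paste a local egocentric map into the global allocentric map.

    global_map: (2, H, W) running estimate, channels = (occupied, explored)
    local_map:  (2, h, w) anticipated occupancy centered on the agent
    pose:       (x_m, y_m, heading_rad) agent pose in world coordinates
    """
    x, y, heading = pose
    _, h, w = local_map.shape
    _, H, W = global_map.shape
    # Offsets (in cells) of each local cell relative to the agent's center.
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    c, s = np.cos(heading), np.sin(heading)
    # Rotate the offsets into the world frame, then translate by the pose.
    dx = c * (xs - cx) - s * (ys - cy)
    dy = s * (xs - cx) + c * (ys - cy)
    gx = np.round(x / cell_size + dx).astype(int)
    gy = np.round(y / cell_size + dy).astype(int)
    valid = (gx >= 0) & (gx < W) & (gy >= 0) & (gy < H)
    # Fuse with an element-wise max so confident predictions persist.
    np.maximum.at(global_map[0], (gy[valid], gx[valid]), local_map[0][valid])
    np.maximum.at(global_map[1], (gy[valid], gx[valid]), local_map[1][valid])
    return global_map

The two global channels correspond directly to the two maps shown in the demos: fusing only the depth-projected visible occupancy yields the observed map, while fusing the model's predictions yields the anticipated map.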

Source code and pretrained models

Code to reproduce the results from our Habitat Challenge entry and ECCV paper, along with pretrained models, is available on GitHub.


Citation

@inproceedings{ramakrishnan2020occant,
    title = {Occupancy Anticipation for Efficient Exploration and Navigation},
    author = {Santhosh K. Ramakrishnan and Ziad Al-Halah and Kristen Grauman},
    booktitle = {ECCV},
    year = {2020}
}

Acknowledgements

UT Austin is supported in part by the DARPA Lifelong Learning Machines program and the GCP Research Credits Program. We thank Devendra Singh Chaplot for clarifying the implementation details of ANS.


Media coverage

MIT Technology Review · VentureBeat · ZDNet · UTexas news · Inside AI