Tushar Nagarajan1,2 | Santhosh K. Ramakrishnan1,2 | Ruta Desai2 | James Hillis3 | Kristen Grauman1,2 |
1UT Austin | 2FAIR, Meta | 3Reality Labs, Meta |
Room annotations for Ego4D and HouseTours: Each entry corresponds to a "visit" with the following information:
video_uid: Ego4D/HT video uid
start_time: timestamp when the camera-wearer enters the room
end_time: timestamp when the camera-wearer leaves the room
label: room category (e.g., kitchen, bedroom, garage)
instance: id for rooms of the same type (e.g., bedroom0, bedroom1 if there are two bedrooms)
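As a rough illustration of how these visit entries might be consumed, the sketch below assumes the annotations have already been loaded into a list of dictionaries with the fields listed above, and that timestamps are numbers in seconds. The on-disk format, the time unit, the sample values, and the helper names `time_per_room` and `room_at` are all assumptions made for this example, not part of the released annotations.

```python
from collections import defaultdict

# Hypothetical sample entries that follow the field list above.
# The uids and timestamps are made up for illustration.
sample_visits = [
    {"video_uid": "vid_a", "start_time": 0.0, "end_time": 12.5,
     "label": "kitchen", "instance": "kitchen0"},
    {"video_uid": "vid_a", "start_time": 12.5, "end_time": 30.0,
     "label": "bedroom", "instance": "bedroom0"},
    {"video_uid": "vid_a", "start_time": 30.0, "end_time": 41.0,
     "label": "kitchen", "instance": "kitchen0"},
]


def time_per_room(visits):
    """Sum the visit durations for each (video_uid, instance) pair."""
    totals = defaultdict(float)
    for visit in visits:
        key = (visit["video_uid"], visit["instance"])
        totals[key] += visit["end_time"] - visit["start_time"]
    return dict(totals)


def room_at(visits, video_uid, t):
    """Return the room instance occupied at time t, or None if no visit covers t."""
    for visit in visits:
        if visit["video_uid"] == video_uid and visit["start_time"] <= t < visit["end_time"]:
            return visit["instance"]
    return None


totals = time_per_room(sample_visits)
current_room = room_at(sample_visits, "vid_a", 15.0)
```

Keying on `instance` rather than `label` keeps two rooms of the same category (e.g., bedroom0 and bedroom1) separate, while repeated visits to the same room are merged.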
NLQ annotations for HouseTours: Each entry corresponds to a natural language question asked about a video with the following information:
video_uid: Ego4D/HT video uid
query: natural language question to be grounded
response_start: timestamp for the response start
response_end: timestamp for the response end
category: question type (e.g., visit_x, see_x_then_y)
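One way to work with these entries is to compare a predicted time window against the annotated response window. The sketch below uses temporal intersection-over-union, a common measure for temporal grounding; it is shown only as an illustration and is not claimed to be the evaluation protocol used for these annotations. The sample query, the predicted window, and the function name `temporal_iou` are hypothetical, and timestamps are assumed to be numbers in seconds.

```python
from collections import Counter

# Hypothetical sample entries that follow the field list above.
sample_queries = [
    {"video_uid": "vid_a", "query": "When did I visit the kitchen?",
     "response_start": 10.0, "response_end": 20.0, "category": "visit_x"},
    {"video_uid": "vid_a", "query": "When did I see the sofa and then the sink?",
     "response_start": 25.0, "response_end": 40.0, "category": "see_x_then_y"},
]


def temporal_iou(pred_start, pred_end, gt_start, gt_end):
    """Intersection-over-union of two time intervals (0.0 if they do not overlap)."""
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


# Score a made-up prediction of [15.0, 25.0] against the first query.
first = sample_queries[0]
iou = temporal_iou(15.0, 25.0, first["response_start"], first["response_end"])

# Count how many questions fall under each category.
category_counts = Counter(q["category"] for q in sample_queries)
```

Grouping by `category` before scoring would make it possible to report results per question type.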
@inproceedings{nagarajan2023egoenv,
  title={EgoEnv: Human-centric environment representations from egocentric video},
  author={Nagarajan, Tushar and Ramakrishnan, Santhosh Kumar and Desai, Ruta and Hillis, James and Grauman, Kristen},
  booktitle={NeurIPS},
  year={2023}
}