Human-Centric Object Interactions - A Fine-Grained Perspective from Egocentric Videos

Wednesday 21 April 2021, 1.35pm to 2.25pm

Speaker(s): Dima Damen, Associate Professor (Reader) at the Department of Computer Science, Visual Information Laboratory, University of Bristol

This talk argues for a fine(r)-grained perspective on human-object interactions in video sequences captured from an egocentric (i.e. first-person) viewpoint.

Using multi-modal input (appearance, motion, audio, language), I will present approaches for determining skill or expertise from video sequences [CVPR 2019], assessing action ‘completion’, i.e. whether an interaction is attempted but not completed [BMVC 2018], few-shot learning [CVPR 2021], dual-domain learning [CVPR 2020], as well as multi-modal fusion using vision, audio and language [CVPR 2021, CVPR 2020, ICCV 2019].

I will also introduce EPIC-KITCHENS-100, the largest egocentric dataset recorded in people’s homes. The dataset comprises 100 hours of fully annotated recordings (20M frames, 90K action segments). The annotations are based on unique narrations by the participants describing their own videos, and thus reflect true intention.

