Gaze-Guided Graph Neural Network for Action Anticipation Conditioned on Intention

Süleyman Özdel, Yao Rong, Berat Mert Albaba, Yen Ling Kuo, Xi Wang, Enkelejda Kasneci

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

Humans use their gaze to focus on essential information while perceiving and interpreting intentions in videos. Incorporating human gaze into computational algorithms can significantly enhance model performance in video understanding tasks. In this work, we address a challenging and innovative task in video understanding: predicting the actions of an agent from only a partial video. We introduce the Gaze-guided Action Anticipation algorithm, which builds a visual-semantic graph from the video input. Our method uses a Graph Neural Network to recognize the agent's intention and to predict the action sequence that fulfills this intention. To assess the effectiveness of our approach, we collect a dataset of household activities generated in the VirtualHome environment, accompanied by gaze data from humans viewing the videos. Our method outperforms state-of-the-art techniques, achieving a 7% improvement in accuracy for 18-class intention recognition. This highlights the effectiveness of our method in learning important features from human gaze data.

Original language: English
Title of host publication: Proceedings - ETRA 2024, ACM Symposium on Eye Tracking Research and Applications
Editors: Stephen N. Spencer
Publisher: Association for Computing Machinery
ISBN (Electronic): 9798400706073
DOIs
State: Published - 4 Jun 2024
Externally published: Yes
Event: 16th Annual ACM Symposium on Eye Tracking Research and Applications, ETRA 2024 - Hybrid, Glasgow, United Kingdom
Duration: 4 Jun 2024 → 7 Jun 2024

Publication series

Name: Eye Tracking Research and Applications Symposium (ETRA)

Conference

Conference: 16th Annual ACM Symposium on Eye Tracking Research and Applications, ETRA 2024
Country/Territory: United Kingdom
City: Hybrid, Glasgow
Period: 04/06/24 → 07/06/24

Bibliographical note

Publisher Copyright:
© 2024 ACM.

Keywords

  • Action prediction
  • Action recognition
  • Eye-tracking
  • Human-computer interaction
