A Transformer-Based Model for the Prediction of Human Gaze Behavior on Videos

  • Süleyman Özdel
  • Yao Rong
  • Berat Mert Albaba
  • Yen Ling Kuo
  • Xi Wang
  • Enkelejda Kasneci

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

4 Scopus citations

Abstract

Eye-tracking applications that utilize the human gaze in video understanding tasks have become increasingly important. To effectively automate the process of video analysis based on eye-tracking data, it is important to accurately replicate human gaze behavior. However, this task presents significant challenges due to the inherent complexity and ambiguity of human gaze patterns. In this work, we introduce a novel method for simulating human gaze behavior. Our approach uses a transformer-based reinforcement learning algorithm to train an agent that acts as a human observer, with the primary role of watching videos and simulating human gaze behavior. We employed an eye-tracking dataset gathered from videos generated by the VirtualHome simulator, with a primary focus on activity recognition. Our experimental results demonstrate the effectiveness of our gaze prediction method by highlighting its capability to replicate human gaze behavior and its applicability for downstream tasks where real human gaze is used as input.

Original language: English
Title of host publication: Proceedings - ETRA 2024, ACM Symposium on Eye Tracking Research and Applications
Editors: Stephen N. Spencer
Publisher: Association for Computing Machinery
ISBN (Electronic): 9798400706073
State: Published - 4 Jun 2024
Externally published: Yes
Event: 16th Annual ACM Symposium on Eye Tracking Research and Applications, ETRA 2024 - Hybrid, Glasgow, United Kingdom
Duration: 4 Jun 2024 to 7 Jun 2024

Publication series

Name: Eye Tracking Research and Applications Symposium (ETRA)

Conference

Conference: 16th Annual ACM Symposium on Eye Tracking Research and Applications, ETRA 2024
Country/Territory: United Kingdom
City: Hybrid, Glasgow
Period: 04/06/24 to 07/06/24

Bibliographical note

Publisher Copyright:
© 2024 ACM.

Keywords

  • Action recognition
  • Eye-tracking
  • Human attention
  • Human gaze prediction
