TY - CONF
T1 - Jack of many Faces: A Step Towards Facial Expression and Physiological State Analysis with a Single Network
AU - Tariq, Abdullah
AU - Mesak, Martin
AU - Azad, R. Muhammad Atif
AU - Gilani, Zulqarnain
PY - 2025/8/1
Y1 - 2025/8/1
N2 - Facial feature analysis, particularly dynamic facial expression recognition, is essential in computer vision for understanding human emotions, behaviors, and physiological states. However, existing approaches often exhibit limited performance, stemming from inadequate modelling of facial dynamics, noise sensitivity, ambiguous expression semantics, and are generally specific to single-task scenarios. To address these issues, we propose a compact 3D spatio-temporal network capable of handling both expression recognition and physiological state analysis. Our network includes two custom modules: (1) Contrastive Adversarial Efficient Local Channel Attention (ConAdv-ELCA), which extracts and disentangles fine-grained local facial features, and (2) Efficient Global Channel Attention (EGCA), to capture local-global interactions. Unlike prior work, which predominantly evaluates models on similar datasets within single-task domains, our work has demonstrated the ability to generalize across different tasks that are based on facial analysis. Experimental results demonstrate that our model consistently achieves state-of-the-art or near-state-of-the-art performance on blood alcohol concentration estimation, dynamic facial expression recognition, and driver fatigue detection.
AB - Facial feature analysis, particularly dynamic facial expression recognition, is essential in computer vision for understanding human emotions, behaviors, and physiological states. However, existing approaches often exhibit limited performance, stemming from inadequate modelling of facial dynamics, noise sensitivity, ambiguous expression semantics, and are generally specific to single-task scenarios. To address these issues, we propose a compact 3D spatio-temporal network capable of handling both expression recognition and physiological state analysis. Our network includes two custom modules: (1) Contrastive Adversarial Efficient Local Channel Attention (ConAdv-ELCA), which extracts and disentangles fine-grained local facial features, and (2) Efficient Global Channel Attention (EGCA), to capture local-global interactions. Unlike prior work, which predominantly evaluates models on similar datasets within single-task domains, our work has demonstrated the ability to generalize across different tasks that are based on facial analysis. Experimental results demonstrate that our model consistently achieves state-of-the-art or near-state-of-the-art performance on blood alcohol concentration estimation, dynamic facial expression recognition, and driver fatigue detection.
UR - https://bmvc2025.bmva.org/programme/accepted_papers/
UR - https://www.open-access.bcu.ac.uk/16654/
M3 - Paper
ER -