TY - JOUR
T1 - Self-supervised pain intensity estimation from facial videos via statistical spatiotemporal distillation
AU - Tavakolian, Mohammad
AU - Bordallo Lopez, Miguel
AU - Liu, Li
N1 - Funding Information:
We would like to acknowledge the financial support of the Academy of Finland (No. 331883), Infotech Oulu, Tauno Tönning, Nokia, and KAUTE foundations.
Publisher Copyright:
© 2020 Elsevier B.V.
PY - 2020/12
Y1 - 2020/12
AB - Recently, automatic pain assessment technology, in particular the automatic detection of pain from facial expressions, has been developed to improve the quality of pain management and has attracted increasing attention. In this paper, we propose self-supervised learning for automatic yet efficient pain assessment, in order to reduce the cost of collecting large amounts of labeled data. To achieve this, we introduce a novel similarity function to learn generalized representations using a Siamese network in the pretext task. The learned representations are fine-tuned in the downstream task of pain intensity estimation. To make the method computationally efficient, we propose Statistical Spatiotemporal Distillation (SSD) to encode the spatiotemporal variations underlying the facial video into a single RGB image, enabling the use of less complex 2D deep models for video representation. Experiments on two publicly available pain datasets and cross-dataset evaluation demonstrate promising results, showing the good generalization ability of the learned representations.
KW - Pain assessment
KW - Representation learning
KW - Self-supervised learning
KW - Statistical spatiotemporal distillation
UR - http://www.scopus.com/inward/record.url?scp=85091734724&partnerID=8YFLogxK
U2 - 10.1016/j.patrec.2020.09.012
DO - 10.1016/j.patrec.2020.09.012
M3 - Article
AN - SCOPUS:85091734724
SN - 0167-8655
VL - 140
SP - 26
EP - 33
JO - Pattern Recognition Letters
JF - Pattern Recognition Letters
ER -