TY - JOUR
T1 - Explainable automated pain recognition in cats
AU - Feighelstein, Marcelo
AU - Henze, Lea
AU - Meller, Sebastian
AU - Shimshoni, Ilan
AU - Hermoni, Ben
AU - Berko, Michael
AU - Twele, Friederike
AU - Schütter, Alexandra
AU - Dorn, Nora
AU - Kästner, Sabine
AU - Finka, Lauren
AU - Luna, Stelio P.L.
AU - Mills, Daniel S.
AU - Volk, Holger A.
AU - Zamansky, Anna
N1 - Publisher Copyright:
© 2023, The Author(s).
PY - 2023/6/2
Y1 - 2023/6/2
N2 - Manual tools for pain assessment from facial expressions have been suggested and validated for several animal species. However, facial expression analysis performed by humans is prone to subjectivity and bias, and in many cases also requires special expertise and training. This has led to an increasing body of work on automated pain recognition, which has been addressed for several species, including cats. Even for experts, cats are a notoriously challenging species for pain assessment. A previous study compared two approaches to automated ‘pain’/‘no pain’ classification from cat facial images: a deep learning approach, and an approach based on manually annotated geometric landmarks, reaching comparable accuracy results. However, the study included a very homogeneous dataset of cats and thus further research to study generalizability of pain recognition to more realistic settings is required. This study addresses the question of whether AI models can classify ‘pain’/‘no pain’ in cats in a more realistic (multi-breed, multi-sex) setting using a more heterogeneous and thus potentially ‘noisy’ dataset of 84 client-owned cats. Cats were a convenience sample presented to the Department of Small Animal Medicine and Surgery of the University of Veterinary Medicine Hannover and included individuals of different breeds, ages, sex, and with varying medical conditions/medical histories. Cats were scored by veterinary experts using the Glasgow composite measure pain scale in combination with the well-documented and comprehensive clinical history of those patients; the scoring was then used for training AI models using two different approaches. We show that in this context the landmark-based approach performs better, reaching accuracy above 77% in pain detection as opposed to only above 65% reached by the deep learning approach. Furthermore, we investigated the explainability of such machine recognition in terms of identifying facial features that are important for the machine, revealing that the region of nose and mouth seems more important for machine pain classification, while the region of ears is less important, with these findings being consistent across the models and techniques studied here.
AB - Manual tools for pain assessment from facial expressions have been suggested and validated for several animal species. However, facial expression analysis performed by humans is prone to subjectivity and bias, and in many cases also requires special expertise and training. This has led to an increasing body of work on automated pain recognition, which has been addressed for several species, including cats. Even for experts, cats are a notoriously challenging species for pain assessment. A previous study compared two approaches to automated ‘pain’/‘no pain’ classification from cat facial images: a deep learning approach, and an approach based on manually annotated geometric landmarks, reaching comparable accuracy results. However, the study included a very homogeneous dataset of cats and thus further research to study generalizability of pain recognition to more realistic settings is required. This study addresses the question of whether AI models can classify ‘pain’/‘no pain’ in cats in a more realistic (multi-breed, multi-sex) setting using a more heterogeneous and thus potentially ‘noisy’ dataset of 84 client-owned cats. Cats were a convenience sample presented to the Department of Small Animal Medicine and Surgery of the University of Veterinary Medicine Hannover and included individuals of different breeds, ages, sex, and with varying medical conditions/medical histories. Cats were scored by veterinary experts using the Glasgow composite measure pain scale in combination with the well-documented and comprehensive clinical history of those patients; the scoring was then used for training AI models using two different approaches. We show that in this context the landmark-based approach performs better, reaching accuracy above 77% in pain detection as opposed to only above 65% reached by the deep learning approach. Furthermore, we investigated the explainability of such machine recognition in terms of identifying facial features that are important for the machine, revealing that the region of nose and mouth seems more important for machine pain classification, while the region of ears is less important, with these findings being consistent across the models and techniques studied here.
UR - http://www.scopus.com/inward/record.url?scp=85160893355&partnerID=8YFLogxK
U2 - 10.1038/s41598-023-35846-6
DO - 10.1038/s41598-023-35846-6
M3 - Article
C2 - 37268666
AN - SCOPUS:85160893355
SN - 2045-2322
VL - 13
JO - Scientific Reports
JF - Scientific Reports
IS - 1
M1 - 8973
ER -