TY - JOUR
T1 - Robust inter-subject audiovisual decoding in functional magnetic resonance imaging using high-dimensional regression
AU - Raz, Gal
AU - Svanera, Michele
AU - Singer, Neomi
AU - Gilam, Gadi
AU - Bleich-Cohen, Maya
AU - Lin, Tamar
AU - Admon, Roee
AU - Gonen, Tal
AU - Thaler, Avner
AU - Granot, Roni Y.
AU - Goebel, Rainer
AU - Benini, Sergio
AU - Valente, Giancarlo
N1 - Publisher Copyright:
© 2017 Elsevier Inc.
PY - 2017/12
Y1 - 2017/12
AB - Major methodological advancements have recently been made in the field of neural decoding, which is concerned with the reconstruction of mental content from neuroimaging measures. However, in the absence of a large-scale examination of the validity of decoding models across subjects and content, the extent to which these models can be generalized is not clear. This study addresses the challenge of producing generalizable decoding models, which allow the reconstruction of perceived audiovisual features from human functional magnetic resonance imaging (fMRI) data without prior training of the algorithm on the decoded content. We applied an adapted version of kernel ridge regression combined with temporal optimization to data acquired during film viewing (234 runs) to generate standardized brain models for sound loudness, speech presence, perceived motion, face-to-frame ratio, lightness, and color brightness. The prediction accuracies were tested on data collected from different subjects watching other movies, mainly in another scanner. Substantial and significant (Q_FDR < 0.05) correlations between the reconstructed and the original descriptors were found for the first three features (loudness, speech, and motion) in all 9 test movies (R̄ = 0.62, R̄ = 0.60, and R̄ = 0.60, respectively), with high reproducibility of the predictors across subjects. The face-ratio model produced significant correlations in 7 out of 8 movies (R̄ = 0.56). The lightness and brightness models did not show robustness (R̄ = 0.23 and R̄ = 0, respectively). Further analysis of additional data (95 runs) indicated that loudness-reconstruction veridicality can consistently reveal relevant group differences in musical experience. The findings point to the validity and generalizability of our loudness, speech, motion, and face-ratio models for complex cinematic stimuli (and, in the case of loudness, for music as well). While future research should further validate these models using controlled stimuli and explore the feasibility of extracting more complex models via this method, the reliability of our results indicates the potential usefulness of the approach and the resulting models in basic scientific and diagnostic contexts.
KW - Audiovisual decoding
KW - Face
KW - Kernel ridge regression
KW - Motion pictures
KW - Optical flow
KW - Sound loudness
KW - fMRI
UR - http://www.scopus.com/inward/record.url?scp=85030150296&partnerID=8YFLogxK
DO - 10.1016/j.neuroimage.2017.09.032
M3 - Article
C2 - 28939433
AN - SCOPUS:85030150296
SN - 1053-8119
VL - 163
SP - 244
EP - 263
JO - NeuroImage
JF - NeuroImage
ER -
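
For readers who want a concrete picture of the pipeline the abstract describes, the following is a minimal, hypothetical Python sketch. It is not the authors' implementation: it only illustrates kernel ridge regression mapping fMRI voxel time courses to a continuous audiovisual descriptor (e.g., sound loudness), trained on one set of runs and evaluated by correlating the reconstructed descriptor with the original on held-out data, as the abstract describes. The array names, the fixed 2-TR hemodynamic lag (a crude stand-in for the paper's temporal optimization), and the use of scikit-learn/SciPy are all assumptions made for illustration.

    # Illustrative sketch only -- NOT the method of Raz et al. (2017).
    # Assumed: synthetic stand-ins for fMRI data and a stimulus feature.
    import numpy as np
    from sklearn.kernel_ridge import KernelRidge
    from scipy.stats import pearsonr

    rng = np.random.default_rng(0)

    # (time points x voxels) fMRI matrices and a 1-D audiovisual descriptor.
    n_train, n_test, n_vox = 400, 200, 5000
    X_train = rng.standard_normal((n_train, n_vox))
    X_test = rng.standard_normal((n_test, n_vox))
    w = rng.standard_normal(n_vox)
    y_train = X_train @ w + rng.standard_normal(n_train)
    y_test = X_test @ w + rng.standard_normal(n_test)

    # Crude stand-in for temporal optimization: shift the descriptor by a
    # fixed number of TRs to allow for hemodynamic delay (assumed lag).
    lag = 2
    X_tr, y_tr = X_train[lag:], y_train[:-lag]
    X_te, y_te = X_test[lag:], y_test[:-lag]

    # Linear-kernel ridge regression; in practice alpha would be tuned by
    # cross-validation within the training runs.
    model = KernelRidge(alpha=1.0, kernel="linear")
    model.fit(X_tr, y_tr)
    y_hat = model.predict(X_te)

    # Evaluate as in the abstract: correlate the reconstructed descriptor
    # with the original on data the model never saw.
    r, p = pearsonr(y_hat, y_te)
    print(f"reconstruction r = {r:.2f} (p = {p:.3g})")

The kernel formulation is the natural choice here because fMRI data have far more voxels than time points; a linear kernel keeps the model equivalent to ordinary ridge regression while working in the much smaller sample-by-sample space.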