Demographic Biases in Naturalistic Language Recordings in the CHILDES Database

Camila Scaff, Georgia Loukatou, Alejandrina Cristia, Naomi Havron

Research output: Contribution to journal (Article, peer-reviewed)

Abstract

In recent years, the importance of estimating demographic biases in research has become apparent. Here, we provide a systematic review of the CHILDES database, the major source of naturalistic recordings of children's linguistic environment. We analyzed the database along four dimensions considered central to language learning: socioeconomic status (SES), urbanization, family structure, and language. We present descriptive statistics for each dimension to assess whether naturalistic recordings were biased regarding the demographics of the countries and the families recorded within them. We find that CHILDES's recordings overrepresented wealthier countries, higher parental education levels, urban settings, and smaller households. Middle- and higher-class participants were likewise overrepresented. The corpora were not representative of their countries in terms of urbanization either, with a larger percentage of families residing in urban settings than is true overall for their respective countries. In terms of family structure, nuclear families were more prevalent than in the countries where the data were collected. Last, we found that the corpora were linguistically diverse, but we estimate that these recordings underrepresented bilingual and multilingual households. We conclude that researchers should be mindful when generalizing from naturalistic recordings of children's input and output obtained from CHILDES, and we make recommendations for the future use of CHILDES.

Original language: English
Article number: e70011
Journal: Developmental Science
Volume: 28
Issue number: 3
State: Published - May 2025

Bibliographical note

Publisher Copyright:
© 2025 John Wiley & Sons Ltd.

Keywords

  • CHILDES
  • demographic biases
  • home recordings
  • Naturalistic recordings
  • Spontaneous speech
  • WEIRD

ASJC Scopus subject areas

  • Developmental and Educational Psychology
  • Cognitive Neuroscience
