A Prediction Model of Autism Spectrum Diagnosis from Well-Baby Electronic Data Using Machine Learning

Ayelet Ben-Sasson, Joshua Guedalia, Liat Nativ, Keren Ilan, Meirav Shaham, Lidia V. Gabis

Research output: Contribution to journal > Article > peer-review


Early detection of autism spectrum disorder (ASD) is crucial for timely intervention, yet diagnosis typically occurs after age three. This study aimed to develop a machine learning model that predicts ASD diagnosis from infants’ electronic health records obtained through a national screening program, and to evaluate its accuracy. A retrospective cohort study analyzed health records of 780,610 children, including 1,163 with ASD diagnoses. Data encompassed birth parameters, growth metrics, developmental milestones, and familial and post-natal variables from routine wellness visits within the first two years. Using a gradient boosting model with 3-fold cross-validation, 100 parameters predicted ASD diagnosis with an average area under the ROC curve of 0.86 (SD < 0.002). Feature importance was quantified using the SHapley Additive exPlanations (SHAP) tool. The model identified a high-risk group with a 4.3-fold higher ASD incidence (0.006) compared to the cohort (0.001). Key predictors included failing six milestones in language, social, and fine motor domains during the second year, male gender, parental developmental concerns, non-nursing, older maternal age, lower gestational age, and atypical growth percentiles. Machine learning algorithms capitalizing on preventative care electronic health records can facilitate ASD screening by accounting for complex relations between familial and birth factors, post-natal growth, developmental parameters, and parent concern.
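The abstract reports two summary metrics: the area under the ROC curve and the fold-higher incidence in a high-risk group. As a minimal sketch of how such metrics can be computed from predicted risk scores, the Python below uses only the standard library. It is not the authors' pipeline: the function names, the toy labels and scores, and the 20% high-risk cutoff are all hypothetical, and the gradient boosting model, cross-validation, and SHAP analysis are not reproduced here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Tied scores share the average of the ranks they span.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def fold_enrichment(labels: Sequence[int], scores: Sequence[float],
                    top_fraction: float) -> float:
    """Incidence in the top-scoring fraction divided by the overall incidence."""
    n_top = max(1, round(len(labels) * top_fraction))
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:n_top]
    top_incidence = sum(labels[i] for i in top) / n_top
    overall_incidence = sum(labels) / len(labels)
    return top_incidence / overall_incidence


# Hypothetical toy data: 1 = later diagnosis, 0 = no diagnosis.
labels = [0, 0, 0, 0, 1, 0, 0, 1, 0, 1]
scores = [0.10, 0.20, 0.15, 0.30, 0.35, 0.40, 0.05, 0.80, 0.25, 0.90]

print(f"AUC: {roc_auc(labels, scores):.3f}")
print(f"Fold enrichment in top 20%: {fold_enrichment(labels, scores, 0.2):.2f}")
```

In a cross-validated setting, the same AUC function would be applied to each held-out fold and the results averaged, which is how a mean and standard deviation like those quoted above would typically arise.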

Original language: English
Article number: 429
Issue number: 4
State: Published - 3 Apr 2024

Bibliographical note

Publisher Copyright:
© 2024 by the authors.


Keywords

  • autism spectrum disorders
  • development
  • electronic health records
  • machine learning
  • screening

ASJC Scopus subject areas

  • Pediatrics, Perinatology, and Child Health

