Background: Parents use online communities to verify developmental and health concerns about their child. The increasing public awareness of autism spectrum disorders (ASD) leads more parents to suspect ASD in their child. Early identification of ASD is important for early intervention.

Objective: To characterize the symptoms mentioned in online queries posed by parents who suspect that their child might have ASD and determine whether they are age-specific, and to test the efficacy of machine learning tools in classifying the child's risk of ASD based on the parent's narrative.

Methods: To this end, we analyzed online queries posed by parents who were concerned that their child might have ASD and categorized the warning signs they mentioned according to ASD-specific and non-ASD-specific domains. We then used the data to test the efficacy with which a trained machine learning tool classified the degree of ASD risk. Yahoo Answers, a social site for posting queries and finding answers, was mined for queries of parents asking the community whether their child has ASD. A total of 195 queries were sampled for this study (mean child age=38.0 months; 84.7% [160/189] boys). Content text analysis of the queries aimed to categorize the types of symptoms described and obtain clinical judgment of the child's ASD-risk level.

Results: Concerns related to repetitive and restricted behaviors and interests (RRBI) were the most prevalent (75.4%, 147/195), followed by concerns related to language (61.5%, 120/195) and emotional markers (50.3%, 98/195). Of the 195 queries, 18.5% (36/195) were rated by clinical experts as low-risk, 30.8% (60/195) as medium-risk, and 50.8% (99/195) as high-risk. Risk groups differed significantly (P<.001) in the rate of concerns in the language, social, communication, and RRBI domains. When testing whether an automatic classifier (decision tree) could predict if a query was medium- or high-risk based on the text of the query and the coded symptoms, performance reached an area under the receiver operating characteristic (ROC) curve of 0.67 (95% CI 0.50-0.78), whereas predicting from the text and the coded signs resulted in an area under the curve of 0.82 (0.80-0.86).

Conclusions: Findings call for health care providers to listen closely to parental ASD-related concerns, as recommended by screening guidelines. They also demonstrate the need for Internet-based screening systems that utilize parents' narratives using a decision tree questioning method.
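The area under the ROC curve reported in the Results can be read as the probability that a randomly chosen medium- or high-risk query receives a higher classifier score than a randomly chosen low-risk query. The following is a minimal sketch of that pairwise definition only. The function name `roc_auc`, the toy labels, and the toy scores are illustrative assumptions, not the study's data, classifier, or evaluation code.

```python
def roc_auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs in which the
    positive example has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study):
# label 1 = medium- or high-risk query, label 0 = low-risk query.
toy_labels = [1, 1, 1, 0, 1, 0]
# Hypothetical classifier scores for the same six queries.
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.7, 0.2]

toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC on toy data: {toy_auc:.3f}")
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.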
Bibliographical note
Publisher Copyright:
© Ayelet Ben-Sasson, Elad Yom-Tov.
Keywords
- Autistic disorders
- Early detection
- Machine learning
- Online queries
ASJC Scopus subject areas
- Health Informatics