Ben-Sasson Ayelet, Yom-Tov Elad
University of Haifa, Haifa, Israel.
Microsoft Research & Development, Herzelia, Israel.
J Med Internet Res. 2016 Nov 22;18(11):e300. doi: 10.2196/jmir.5439.
Parents use online communities as platforms to verify developmental and health concerns about their child. Growing public awareness of autism spectrum disorders (ASD) leads more parents to suspect ASD in their child. Early identification of ASD is important for early intervention.
This study aimed to characterize the symptoms mentioned in online queries posed by parents who suspect that their child might have ASD and to determine whether these symptoms are age-specific. It also aimed to test the efficacy of machine learning tools in classifying the child's risk of ASD based on the parent's narrative.
To this end, we analyzed online queries posed by parents who were concerned that their child might have ASD and categorized the warning signs they mentioned according to ASD-specific and non-ASD-specific domains. We then used the data to test the efficacy with which a trained machine learning tool classified the degree of ASD risk. Yahoo Answers, a social site for posting queries and finding answers, was mined for queries of parents asking the community whether their child has ASD. A total of 195 queries were sampled for this study (mean child age=38.0 months; 84.7% [160/189] boys). Content text analysis of the queries aimed to categorize the types of symptoms described and obtain clinical judgment of the child's ASD-risk level.
Concerns related to repetitive and restricted behaviors and interests (RRBI) were the most prevalent (75.4%, 147/195), followed by concerns related to language (61.5%, 120/195) and emotional markers (50.3%, 98/195). Of the 195 queries, 18.5% (36/195) were rated by clinical experts as low-risk, 30.8% (60/195) as medium-risk, and 50.8% (99/195) as high-risk. Risk groups differed significantly (P<.001) in the rate of concerns in the language, social, communication, and RRBI domains. When testing whether an automatic classifier (decision tree) could predict if a query was medium- or high-risk based on the text of the query and the coded symptoms, performance reached an area under the receiver operating characteristic (ROC) curve of 0.67 (95% CI 0.50-0.78), whereas predicting from the text and the coded signs resulted in an area under the curve of 0.82 (0.80-0.86).
The findings call for health care providers to listen closely to parental ASD-related concerns, as recommended by screening guidelines. They also demonstrate the need for Internet-based screening systems that draw on parents' narratives through a decision tree questioning method.
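
For readers unfamiliar with the performance measure reported above, the following is a minimal sketch of how an area under the ROC curve and an accompanying 95% interval can be computed. It is not the authors' pipeline: the labels and scores are hypothetical toy values, the function names `auc` and `bootstrap_ci` are illustrative, and the percentile bootstrap is an assumed interval method, since the abstract does not state how its confidence intervals were obtained.

```python
import random


def auc(labels, scores):
    """Area under the ROC curve via the rank (Mann-Whitney) formulation:
    the fraction of positive/negative pairs in which the positive case
    receives the higher score, counting ties as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so AUC is undefined
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy data: 1 = query rated medium- or high-risk, 0 = low-risk.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.60, 0.70, 0.90, 0.55, 0.65]

point = auc(toy_labels, toy_scores)
low, high = bootstrap_ci(toy_labels, toy_scores)
print(f"AUC = {point:.2f} (95% CI {low:.2f}-{high:.2f})")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the risk groups, which is the scale on which the reported values of 0.67 and 0.82 should be read.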