Mobile detection of autism through machine learning on home video: A development and prospective validation study.

Affiliations

Department of Pediatrics, Division of Systems Medicine, Stanford University, California, United States of America.

Department of Biomedical Data Science, Stanford University, California, United States of America.

Publication information

PLoS Med. 2018 Nov 27;15(11):e1002705. doi: 10.1371/journal.pmed.1002705. eCollection 2018 Nov.

DOI:10.1371/journal.pmed.1002705

PMID:30481180

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6258501/

Abstract

BACKGROUND

The standard approaches to diagnosing autism spectrum disorder (ASD) evaluate between 20 and 100 behaviors and take several hours to complete. This has in part contributed to long wait times for a diagnosis and subsequent delays in access to therapy. We hypothesize that the use of machine learning analysis on home video can speed the diagnosis without compromising accuracy. We have analyzed item-level records from 2 standard diagnostic instruments to construct machine learning classifiers optimized for sparsity, interpretability, and accuracy. In the present study, we prospectively test whether the features from these optimized models can be extracted by blinded nonexpert raters from 3-minute home videos of children with and without ASD to arrive at a rapid and accurate machine learning autism classification.
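To make the idea of a sparse classifier over rater-tagged features concrete, the following is a minimal sketch of how a 5-feature logistic regression might turn one rater's answers into a probability. Only the feature names `eye_contact` and `social_smile` come from the abstract; the remaining feature names, the 0-3 rating scale, the weights, and the intercept are hypothetical placeholders, not the coefficients of the study's models.

```python
import math

# Hypothetical weights for illustration only; NOT the study's fitted model.
# Assumption: each feature is rated 0-3, with higher values meaning more
# atypical behavior.
WEIGHTS = {
    "eye_contact": 0.9,
    "social_smile": 0.7,
    "feature_c": 0.5,   # placeholder name
    "feature_d": 0.6,   # placeholder name
    "feature_e": 0.8,   # placeholder name
}
INTERCEPT = -3.0  # hypothetical


def asd_probability(ratings, weights=WEIGHTS, intercept=INTERCEPT):
    """Logistic-regression score for one video rated by one rater."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    # Linear combination of the rated features, then the logistic function.
    z = intercept + sum(weights[name] * ratings[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    example = {name: 2 for name in WEIGHTS}  # made-up ratings
    print(round(asd_probability(example), 3))
```

Because the model uses only a handful of features, each weight can be read directly as the contribution of one rated behavior, which is the interpretability benefit that sparsity is meant to provide.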

METHODS AND FINDINGS

We created a mobile web portal for video raters to assess 30 behavioral features (e.g., eye contact, social smile) that are used by 8 independent machine learning models for identifying ASD, each with >94% accuracy in cross-validation testing and subsequent independent validation from previous work. We then collected 116 short home videos of children with autism (mean age = 4 years 10 months, SD = 2 years 3 months) and 46 videos of typically developing children (mean age = 2 years 11 months, SD = 1 year 2 months). Three raters blind to the diagnosis independently measured each of the 30 features from the 8 models, with a median time to completion of 4 minutes. Although several models (consisting of alternating decision trees, support vector machine [SVM], logistic regression [LR], radial kernel, and linear SVM) performed well, a sparse 5-feature LR classifier (LR5) yielded the highest accuracy (area under the curve [AUC]: 92% [95% CI 88%-97%]) across all ages tested. We used a prospectively collected independent validation set of 66 videos (33 ASD and 33 non-ASD) and 3 independent rater measurements to validate the outcome, achieving lower but comparable accuracy (AUC: 89% [95% CI 81%-95%]). Finally, we applied LR to the 162-video-feature matrix to construct an 8-feature model, which achieved an AUC of 0.93 (95% CI 0.90-0.97) on the held-out test set and an AUC of 0.86 on the validation set of 66 videos. Validation on children with an existing diagnosis limited the ability to generalize the performance to undiagnosed populations.
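The accuracy figures above are reported as AUC with a 95% confidence interval. As an illustration of how such a figure can be computed from classifier scores, here is a minimal sketch using the rank-based definition of AUC and a percentile bootstrap. The abstract does not state how the study's intervals were derived, so the bootstrap is an assumption, and the scores and labels in the usage example are made-up values, not study data.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the AUC (assumed method)."""
    rng = random.Random(seed)
    n = len(scores)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            stats.append(auc([scores[i] for i in idx], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


if __name__ == "__main__":
    # Made-up scores (1 = ASD, 0 = non-ASD), for illustration only.
    scores = [0.91, 0.85, 0.78, 0.66, 0.52, 0.48, 0.35, 0.30, 0.22, 0.10]
    labels = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
    print(auc(scores, labels), bootstrap_ci(scores, labels))
```

Resampling whole videos keeps each score paired with its label. With a small validation set such as the 66 videos described above, intervals computed this way tend to be wide.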

CONCLUSIONS

These results support the hypothesis that feature tagging of home videos for machine learning classification of autism can yield accurate outcomes in short time frames, using mobile devices. Further work will be needed to confirm that this approach can accelerate autism diagnosis at scale.
