A comparison of machine learning algorithms for the surveillance of autism spectrum disorder.

Author affiliation

Centers for Disease Control and Prevention, Atlanta, GA, United States of America.

Publication information

PLoS One. 2019 Sep 25;14(9):e0222907. doi: 10.1371/journal.pone.0222907. eCollection 2019.

DOI: 10.1371/journal.pone.0222907
PMID: 31553774
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6760799/
Abstract

OBJECTIVE

The Centers for Disease Control and Prevention (CDC) coordinates a labor-intensive process to measure the prevalence of autism spectrum disorder (ASD) among children in the United States. Random forest methods have shown promise in speeding up this process, but they lag behind human classification accuracy by about 5%. We explore whether more recently available document classification algorithms can close this gap.

MATERIALS AND METHODS

Using data gathered from a single surveillance site, we applied 8 supervised learning algorithms to predict whether children meet the case definition for ASD based solely on the words in their evaluations. We compared the algorithms' performance across 10 random train-test splits of the data, using classification accuracy, F1 score, and number of positive calls to evaluate their potential use for surveillance.
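
The evaluation protocol described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the names `evaluate` and `fit_predict`, the 20% test fraction, and the seed are hypothetical choices made here. The abstract specifies only the 10 random train-test splits and the three measures (classification accuracy, F1 score, and number of positive calls).

```python
import random
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of children whose predicted case status matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive (ASD) class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def positive_calls(y_pred):
    """Number of children the model labels as meeting the case definition."""
    return sum(y_pred)


def evaluate(docs, labels, fit_predict, n_splits=10, test_frac=0.2, seed=0):
    """Average the three measures over repeated random train-test splits.

    fit_predict(train_docs, train_labels, test_docs) is a hypothetical
    callable that trains any classifier and returns 0/1 predictions.
    """
    rng = random.Random(seed)
    idx = list(range(len(docs)))
    n_test = max(1, int(len(docs) * test_frac))  # assumed split size
    runs = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        y_true = [labels[i] for i in test]
        y_pred = fit_predict(
            [docs[i] for i in train],
            [labels[i] for i in train],
            [docs[i] for i in test],
        )
        runs.append({
            "accuracy": accuracy(y_true, y_pred),
            "f1": f1_score(y_true, y_pred),
            "positive_calls": positive_calls(y_pred),
            "true_cases": sum(y_true),
        })
    return {key: mean(run[key] for run in runs) for key in runs[0]}
```

Comparing `positive_calls` with `true_cases` shows why the third measure matters for surveillance: a model can be accurate on individual children and still over- or under-count cases in aggregate.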

RESULTS

Across the 10 train-test cycles, the random forest and support vector machine with Naive Bayes features (NB-SVM) each achieved slightly more than 87% mean accuracy. The NB-SVM produced significantly more false negatives than false positives (P = 0.027), but the random forest did not, making its prevalence estimates very close to the true prevalence in the data. The best-performing neural network performed similarly to the random forest on both measures.
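
The link between error balance and prevalence can be made concrete. A model's estimated prevalence is (TP + FP) / N and the true prevalence is (TP + FN) / N, so the two agree exactly when false positives equal false negatives. The sketch below computes both quantities and tests whether the two error types are balanced. The abstract does not say which test produced P = 0.027; the exact McNemar-style binomial test used here is an assumption chosen for illustration, and the function names are hypothetical.

```python
from math import comb


def prevalence_estimates(tp, fp, fn, tn):
    """Return (estimated, true) prevalence from one confusion matrix."""
    total = tp + fp + fn + tn
    estimated = (tp + fp) / total  # share of children the model calls positive
    true = (tp + fn) / total       # share of children who are actual cases
    return estimated, true


def exact_mcnemar_p(fn, fp):
    """Two-sided exact binomial P value for H0: FN and FP are equally likely.

    Assumed test, not necessarily the one used in the paper.
    """
    n = fn + fp
    if n == 0:
        return 1.0
    k = min(fn, fp)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Made-up counts for illustration only; not data from the study.
estimated, true = prevalence_estimates(tp=40, fp=2, fn=10, tn=48)
p_value = exact_mcnemar_p(fn=10, fp=2)
```

With these made-up counts, the excess of false negatives pulls the estimated prevalence below the true prevalence, which is the pattern the abstract reports for the NB-SVM.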

DISCUSSION

The random forest performed as well as more recently available models like the NB-SVM and the neural network, and it also produced good prevalence estimates. NB-SVM may not be a good candidate for use in a fully automated surveillance workflow because of its increased false negatives. More sophisticated algorithms, like hierarchical convolutional neural networks, may not be feasible to train due to characteristics of the data. Current algorithms might perform better if the data are abstracted and processed differently and if they take into account information about the children in addition to their evaluations.

CONCLUSION

Deep learning models performed similarly to traditional machine learning methods at predicting the clinician-assigned case status for CDC's autism surveillance system. While deep learning methods had limited benefit in this task, they may have applications in other surveillance systems.
