Shrivastava Trapti, Singh Vrijendra, Agrawal Anupam
Department of Information Technology, Indian Institute of Information Technology, Allahabad, Prayagraj, Uttar Pradesh 211015 India.
Health Inf Sci Syst. 2024 Mar 6;12(1):18. doi: 10.1007/s13755-024-00277-8. eCollection 2024 Dec.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder. ASD cannot be fully cured, but early diagnosis followed by therapy and rehabilitation helps an autistic person live a quality life. Clinical diagnosis of ASD symptoms via questionnaires and screening tests such as the Autism Spectrum Quotient-10 (AQ-10) and the Quantitative Checklist for Autism in Toddlers (Q-CHAT) is expensive, inaccessible, and time-consuming. Machine learning (ML) techniques can help predict ASD at the initial stage of diagnosis. The main aim of this work is to classify ASD and typically developed (TD) cases using ML classifiers. We used ASD data sets covering all age groups (toddlers, children, adolescents, and adults). During preprocessing, we applied one-hot encoding to convert categorical data into numerical data, then used kNN Imputer to handle missing values and MinMaxScaler feature transformation to normalize the data. ASD and TD cases were classified using support vector machine (SVM), k-nearest-neighbor (KNN), random forest (RF), and artificial neural network (ANN) classifiers. RF gave the best performance, with an accuracy of 100% across different training and testing splits for all four data sets and no over-fitting issue. We also compared our results with already published work, including recent methods such as deep neural networks (DNN) and convolutional neural networks (CNN). Compared with complex architectures such as DNN and CNN, our proposed methods provide the best results with low-complexity models, whereas existing methods have shown accuracy up to 98% with log-loss up to 15%. Our proposed methodology demonstrates improved generalization for real-time ASD detection during clinical trials.
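The preprocessing steps named in the abstract (one-hot encoding, kNN imputation, min-max scaling) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the toy records, the function names, the choice of k = 2, and the order "scale first, then impute" are all assumptions made for illustration. The abstract does not state the order of the two steps, and the class names it uses suggest scikit-learn's `OneHotEncoder`, `KNNImputer`, and `MinMaxScaler`, which would normally be used in practice.

```python
import math

def one_hot(values):
    """Map category labels to one-hot rows (columns follow sorted label order)."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def min_max_scale(rows):
    """Rescale each column to [0, 1]; missing entries (None) are left untouched."""
    scaled = [list(r) for r in rows]
    for j in range(len(rows[0])):
        present = [r[j] for r in rows if r[j] is not None]
        lo, hi = min(present), max(present)
        span = (hi - lo) or 1.0  # avoid division by zero for constant columns
        for r in scaled:
            if r[j] is not None:
                r[j] = (r[j] - lo) / span
    return scaled

def knn_impute(rows, k=2):
    """Fill each missing entry with the column mean over the k nearest rows.

    Distance is the root mean squared difference over columns observed in both rows.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that have column j observed.
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            if nearest:
                filled[i][j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical toy records (not from the paper's data sets):
# screening score, age in years, sex; None marks a missing value.
scores = [8, 2, 9, 1]
ages = [4, 5, None, 6]
sexes = ["f", "m", "m", "f"]

encoded_sex = one_hot(sexes)
features = [[s, a] + e for s, a, e in zip(scores, ages, encoded_sex)]
prepared = knn_impute(min_max_scale(features), k=2)
```

After these steps every feature is numeric, lies in [0, 1], and has no missing entries, which is the form that distance-based classifiers such as KNN and SVM generally expect.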