Centre for Academic Mental Health, Population Health Sciences, University of Bristol, Bristol, UK.
MRC Integrative Epidemiology Unit, Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK.
Mol Autism. 2023 May 23;14(1):19. doi: 10.1186/s13229-023-00549-2.
Genomic conditions can be associated with developmental delay, intellectual disability, autism spectrum disorder, and physical and mental health symptoms. They are individually rare and highly variable in presentation, which limits the use of standard clinical guidelines for diagnosis and treatment. A simple screening tool to identify young people with genomic conditions associated with neurodevelopmental disorders (ND-GCs) who could benefit from further support would be of considerable value. We used machine learning approaches to address this question.
A total of 493 individuals were included: 389 with a ND-GC (mean age = 9.01 years, 66% male) and 104 siblings without known genomic conditions (controls, mean age = 10.23 years, 53% male). Primary carers completed assessments of behavioural, neurodevelopmental and psychiatric symptoms and of physical health and development. Machine learning techniques (penalised logistic regression, random forests, support vector machines and artificial neural networks) were used to develop classifiers of ND-GC status and to identify limited sets of variables that gave the best classification performance. Exploratory graph analysis was used to understand associations within the final variable set.
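As an illustration of how a penalised classifier can both predict group status and reduce the variable set, the following is a minimal sketch of L1-penalised logistic regression fitted by proximal gradient descent. It is not the authors' pipeline: the software, penalty type and tuning used in the study are not stated here. The data are synthetic, and all names (`fit_l1_logistic`, `make_synthetic`, the `penalty` and `step` values) are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def soft_threshold(value, threshold):
    """Shrink value towards zero; this is the proximal step for an L1 penalty."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_logistic(X, y, penalty=0.05, step=0.2, n_iter=300):
    """Fit logistic regression with an L1 penalty on the weights (not the intercept)."""
    n, p = len(X), len(X[0])
    weights = [0.0] * p
    intercept = 0.0
    for _ in range(n_iter):
        grad_w = [0.0] * p
        grad_b = 0.0
        for row, label in zip(X, y):
            linear = intercept + sum(w * x for w, x in zip(weights, row))
            err = sigmoid(linear) - label
            grad_b += err
            for j in range(p):
                grad_w[j] += err * row[j]
        # Gradient step on the mean log-loss, then soft-thresholding of the weights.
        intercept -= step * grad_b / n
        weights = [
            soft_threshold(w - step * g / n, step * penalty)
            for w, g in zip(weights, grad_w)
        ]
    return weights, intercept


def make_synthetic(n=300, p=8, seed=0):
    """Synthetic data (assumption): only variables 0 and 1 carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [rng.gauss(0.0, 1.0) for _ in range(p)]
        prob = sigmoid(2.0 * row[0] - 2.0 * row[1])
        X.append(row)
        y.append(1 if rng.random() < prob else 0)
    return X, y


X, y = make_synthetic()
weights, intercept = fit_l1_logistic(X, y)
# Variables with a non-zero weight are the ones the penalised model retains.
selected = [j for j, w in enumerate(weights) if w != 0.0]
print("retained variables:", selected)
```

Larger values of `penalty` shrink more weights to exactly zero, which is how a penalised model can yield a compact variable set. In practice the penalty strength would be chosen by cross-validation.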
All machine learning methods identified variable sets giving high classification performance (AUROC between 0.883 and 0.915). We identified a subset of 30 variables that best discriminated between individuals with ND-GCs and controls; these variables formed 5 dimensions: conduct, separation anxiety, situational anxiety, communication and motor development.
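For readers unfamiliar with the metric, AUROC can be read as the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control, with ties counted as one half. The sketch below computes it directly from that definition. The function name `auroc` and the toy labels and scores are made up for illustration and are not data from the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of cases and controls.

    labels: 1 for a case, 0 for a control.
    scores: classifier outputs, where higher means more case-like.
    """
    cases = [s for l, s in zip(labels, scores) if l == 1]
    controls = [s for l, s in zip(labels, scores) if l == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


# Toy example (made-up values): 3 cases and 2 controls.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.2]
print(auroc(toy_labels, toy_scores))  # 5 of the 6 case-control pairs are ordered correctly
```

Because AUROC depends only on how cases rank against controls, it is not inflated by predicting the majority class, which is relevant when the groups are of unequal size, as they are here (389 versus 104).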
This study used cross-sectional data from a cohort study which was imbalanced with respect to ND-GC status. Our model requires validation in independent datasets and with longitudinal follow-up data before clinical application.
In this study, we developed models that identified a compact set of psychiatric and physical health measures that differentiate individuals with a ND-GC from controls, and that highlighted higher-order structure within these measures. This work is a step towards developing a screening instrument to identify young people with ND-GCs who might benefit from further specialist assessment.