Autism Center of Excellence, Department of Neuroscience, University of California San Diego, La Jolla, CA, USA.
Department of Pediatrics, University of California San Diego, La Jolla, CA, USA.
Mol Psychiatry. 2023 Feb;28(2):822-833. doi: 10.1038/s41380-022-01826-x. Epub 2022 Oct 20.
Autism Spectrum Disorder (ASD) diagnosis remains behavior-based and the median age of diagnosis is ~52 months, nearly 5 years after its first-trimester origin. Accurate and clinically translatable early-age diagnostics do not exist due to ASD genetic and clinical heterogeneity. Here we collected clinical, diagnostic, and leukocyte RNA data from 240 ASD and typically developing (TD) toddlers (175 toddlers for training and 65 for test). To identify gene expression ASD diagnostic classifiers, we developed 42,840 models composed of 3,570 gene expression feature selection sets and 12 classification methods. We found that 742 models had AUC-ROC ≥ 0.8 on both Training and Test sets. Weighted Bayesian model averaging of these 742 models yielded an ensemble classifier model with accurate performance in Training and Test gene expression datasets, with ASD diagnostic classification AUC-ROC scores of 85-89% and AUC-PR scores of 84-92%. ASD toddlers with ensemble scores above and below the overall ASD ensemble mean of 0.723 (on a scale of 0 to 1) had similar diagnostic and psychometric scores, but those below this ASD ensemble mean had more prenatal risk events than TD toddlers. Ensemble model feature genes were involved in cell cycle, inflammation/immune response, transcriptional gene regulation, cytokine response, and PI3K-AKT, RAS and Wnt signaling pathways. We additionally collected targeted DNA sequencing smMIPs data on a subset of ASD risk genes from 217 of the 240 ASD and TD toddlers. This DNA sequencing found about the same percentage of SFARI Level 1 and 2 ASD risk gene mutations in TD (12 of 105) as in ASD (13 of 112) toddlers, and classification based only on the presence of mutation in these risk genes performed at a chance level of 49%. By contrast, the leukocyte ensemble gene expression classifier correctly classified 88% of TD and ASD toddlers with ASD risk gene mutations.
Our ensemble ASD gene expression classifier is diagnostically predictive and replicable across different toddler ages, races, and ethnicities; outperforms a risk gene mutation classifier; and has potential for clinical translation.
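The model-building steps described in the abstract (a grid of feature sets by classification methods, a filter at AUC-ROC ≥ 0.8, then a weighted average of the surviving models) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The data are synthetic, the helper names `auc_roc`, `weighted_ensemble`, and `make_model` are hypothetical, and each model's AUC is used as its weight as a simple stand-in for the weighted Bayesian model averaging named in the abstract. The abstract also requires the threshold to hold on both Training and Test sets, whereas this sketch filters on a single synthetic set.

```python
import random

# Grid size stated in the abstract: feature selection sets x classification methods.
N_FEATURE_SETS = 3570
N_CLASSIFIERS = 12
N_MODELS = N_FEATURE_SETS * N_CLASSIFIERS  # 42,840 candidate models


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def weighted_ensemble(model_scores, weights):
    """Per-subject weighted mean of the scores from several models."""
    total = sum(weights)
    n_subjects = len(model_scores[0])
    return [
        sum(w * m[i] for w, m in zip(weights, model_scores)) / total
        for i in range(n_subjects)
    ]


# Synthetic demo (ASSUMPTION: no real study data are used here).
rng = random.Random(0)
labels = [1] * 20 + [0] * 20  # 1 = ASD, 0 = TD, purely illustrative


def make_model(noise):
    """Hypothetical model: true signal plus Gaussian noise, clipped to [0, 1]."""
    return [min(1.0, max(0.0, 0.2 + 0.6 * y + rng.gauss(0, noise))) for y in labels]


candidates = [make_model(noise) for noise in (0.1, 0.2, 0.3, 0.6, 1.0)]
aucs = [auc_roc(labels, m) for m in candidates]

# Keep only models that reach the AUC-ROC threshold from the abstract.
kept = [(m, a) for m, a in zip(candidates, aucs) if a >= 0.8]

# ASSUMPTION: AUC is used as the weight; the paper's Bayesian weights are not given here.
ensemble = weighted_ensemble([m for m, _ in kept], [a for _, a in kept])
ensemble_auc = auc_roc(labels, ensemble)

print(f"{N_MODELS} grid models; kept {len(kept)} of {len(candidates)} demo models")
print(f"ensemble AUC-ROC on synthetic data: {ensemble_auc:.3f}")
```

Each ensemble score stays on the same 0 to 1 scale as the individual model scores, which is what allows a single cut point such as the reported ASD ensemble mean of 0.723 to be compared across toddlers.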