Discrimination between healthy participants and people with panic disorder based on polygenic scores for psychiatric disorders and for intermediate phenotypes using machine learning.

Affiliations

Department of Psychiatry, Gifu University Graduate School of Medicine, Gifu, Japan.

Department of General Internal Medicine, Kanazawa Medical University, Ishikawa, Japan.

Publication information

Aust N Z J Psychiatry. 2024 Jul;58(7):603-614. doi: 10.1177/00048674241242936. Epub 2024 Apr 6.

Abstract

OBJECTIVE

Panic disorder is a modestly heritable condition. Currently, diagnosis is based only on clinical symptoms; identifying objective biomarkers and a more reliable diagnostic procedure is desirable. We investigated whether people with panic disorder can be reliably diagnosed utilizing combinations of multiple polygenic scores for psychiatric disorders and their intermediate phenotypes, compared with single polygenic score approaches, by applying specific machine learning techniques.

METHODS

Polygenic scores for 48 psychiatric disorders and intermediate phenotypes based on large-scale genome-wide association studies (n = 7556-1,131,881) were calculated for people with panic disorder (n = 718) and healthy controls (n = 1717). Discrimination between people with panic disorder and healthy controls was based on the 48 polygenic scores using five classification methods: logistic regression, neural networks, quadratic discriminant analysis, random forests and a support vector machine. Differences in discrimination accuracy (area under the curve) due to an increased number of polygenic score combinations and differences in accuracy across the five classifiers were investigated.
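As an illustration of the accuracy measure named above, the area under the curve can be read as the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control. The following is a minimal sketch under that reading, not the authors' pipeline: the function name `auc` and the toy scores are hypothetical and are not study data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    scores: classifier outputs, higher meaning "more likely a case"
    labels: 1 for cases, 0 for controls
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs ranked correctly; ties count as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up scores for illustration only (not study data).
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.6, 0.1]
toy_labels = [1, 1, 1, 0, 0, 0]
print(auc(toy_scores, toy_labels))
```

On this scale, 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.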

RESULTS

All five classifiers performed relatively well at distinguishing people with panic disorder from healthy controls when the number of polygenic scores was increased. Of the 48 polygenic scores, the polygenic score for anxiety (UK Biobank) was the most useful for discrimination by the classifiers. In combinations of two or three polygenic scores, the polygenic score for anxiety (UK Biobank) was included as one of the polygenic scores in all classifiers. When all 48 polygenic scores were used in combination, the greatest areas under the curve differed significantly among the five classifiers. Support vector machine and logistic regression had higher accuracy than quadratic discriminant analysis and random forests. For each classifier, the greatest area under the curve was 0.600 ± 0.030 for logistic regression (polygenic score combinations = 14), 0.591 ± 0.039 for neural networks (polygenic score combinations = 9), 0.603 ± 0.033 for quadratic discriminant analysis (polygenic score combinations = 10), 0.572 ± 0.039 for random forests (polygenic score combinations = 25) and 0.617 ± 0.041 for support vector machine (polygenic score combinations = 11). The greatest areas under the curve at the best polygenic score combination differed significantly among the five classifiers. Random forests had the lowest accuracy among the classifiers. Support vector machine had higher accuracy than neural networks.

CONCLUSIONS

These findings suggest that increasing the number of polygenic score combinations up to approximately 10 effectively improved the discrimination accuracy and that the support vector machine achieved the highest accuracy among the classifiers. However, the discrimination accuracy for panic disorder, when based solely on polygenic score combinations, was modest.
