Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, 02115, USA.
Department of Mathematics, University of Hamburg, 21109, Hamburg, Germany.
Respir Res. 2023 Feb 26;24(1):63. doi: 10.1186/s12931-023-02368-8.
Asthma is a heterogeneous disease with high morbidity. Advances in high-throughput multi-omics approaches have enabled the collection of molecular assessments at different layers, providing complementary perspectives on complex diseases. Numerous computational methods have been developed for omics-based patient classification or disease outcome prediction. Yet a systematic benchmarking of those methods, using various combinations of omics data to predict asthma development, is still lacking.
We aimed to evaluate computational methods for disease status prediction using multi-omics data.
We systematically benchmarked 18 computational methods using all 63 combinations of six omics data types (GWAS, miRNA, mRNA, microbiome, metabolome, DNA methylation) collected in the Vitamin D Antenatal Asthma Reduction Trial (VDAART) cohort. We evaluated each method using standard performance metrics for each of the 63 omics combinations.
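The figure of 63 is the number of non-empty subsets of six data types (2^6 - 1). The following is a minimal sketch of how such combinations could be enumerated. The names `OMICS_LAYERS` and `omics_combinations` are illustrative assumptions, not taken from the study's code.

```python
from itertools import combinations

# Labels for the six omics data types named in the abstract.
# The variable and function names are hypothetical, chosen for illustration.
OMICS_LAYERS = ["GWAS", "miRNA", "mRNA", "microbiome", "metabolome", "DNA methylation"]


def omics_combinations(layers):
    """Return every non-empty subset of `layers`, smallest subsets first."""
    subsets = []
    for size in range(1, len(layers) + 1):
        # combinations() yields tuples that preserve the input order.
        subsets.extend(combinations(layers, size))
    return subsets


ALL_COMBINATIONS = omics_combinations(OMICS_LAYERS)
print(len(ALL_COMBINATIONS))  # 2**6 - 1 = 63
```

Each tuple in `ALL_COMBINATIONS` could then serve as the feature-set selector for one benchmarking run of a given method.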
Our results indicate that, overall, Logistic Regression, Multi-Layer Perceptron, and MOGONET display superior performance, and that the combination of transcriptional, genomic, and microbiome data achieves the best prediction. Moreover, we find that including clinical data can further improve prediction performance for some, but not all, omics combinations.
Specific omics combinations achieve the best prediction of asthma development in children, and certain computational methods outperform others.