IEEE J Biomed Health Inform. 2024 Nov;28(11):6983-6996. doi: 10.1109/JBHI.2024.3439713. Epub 2024 Nov 6.
Multi-omics integration has demonstrated promising performance in complex disease prediction. However, existing research typically focuses on maximizing prediction accuracy and often neglects the essential task of discovering meaningful biomarkers. This gap is particularly important in biomedicine, because molecules often interact rather than act individually to influence disease outcomes. To this end, we propose a two-phase framework named GREMI to assist multi-omics classification and explanation. In the prediction phase, we apply a graph attention architecture to sample-wise co-functional networks, incorporating biomolecular interaction information to enhance feature representation. We then integrate a joint-late mixed strategy with a true-class-probability block to adaptively evaluate classification confidence at both the feature and omics levels. In the interpretation phase, we propose a multi-view approach that explains disease outcomes from the perspective of interaction modules, providing a more intuitive understanding and biomedical rationale. We incorporate Monte Carlo tree search (MCTS) to explore local-view subgraphs and pinpoint modules that contribute strongly to disease characterization from the global view. Extensive experiments demonstrate that the proposed framework outperforms state-of-the-art methods in seven different classification tasks, and that our model effectively addresses mutual data interference as the number of omics types increases. We further illustrate the functional and disease relevance of the identified modules, and validate the classification performance of the discovered modules using an independent cohort.
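As a rough illustration of the true-class-probability idea mentioned in the abstract, the sketch below computes the softmax probability assigned to the correct class and uses per-sample confidences to weight the outputs of several omics-specific classifiers. It is not GREMI's implementation: the function names, the use of NumPy, and the simple weighted sum are assumptions made for illustration. Because the true labels are used directly, the confidence computed here is only available at training time; at inference a learned estimator would presumably have to stand in for it.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def true_class_probability(logits, labels):
    """Probability the classifier assigns to each sample's true class.

    logits: array of shape (n_samples, n_classes)
    labels: integer array of shape (n_samples,)
    """
    probs = softmax(logits)
    return probs[np.arange(len(labels)), labels]


def confidence_weighted_fusion(logits_per_omics, confidences):
    """Hypothetical late fusion: scale each omics' logits by its per-sample
    confidence, then sum across omics types.

    logits_per_omics: list of arrays, each (n_samples, n_classes)
    confidences:      list of arrays, each (n_samples,)
    """
    weighted = [c[:, None] * z for z, c in zip(logits_per_omics, confidences)]
    return np.sum(weighted, axis=0)


# Toy example with two made-up omics types, two samples, two classes.
labels = np.array([0, 1])
omics_a_logits = np.array([[2.0, 0.0], [0.0, 0.0]])
omics_b_logits = np.array([[0.0, 1.0], [0.0, 3.0]])

conf_a = true_class_probability(omics_a_logits, labels)
conf_b = true_class_probability(omics_b_logits, labels)
fused = confidence_weighted_fusion([omics_a_logits, omics_b_logits], [conf_a, conf_b])
predictions = fused.argmax(axis=1)
```

In this toy setup, an omics type that is confidently wrong on a sample receives a low weight for that sample, so the more reliable omics type dominates the fused prediction.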