Pathformer: a biological pathway informed transformer for disease diagnosis and prognosis using multi-omics data.

Affiliations

MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systems Biology, School of Life Sciences, Tsinghua University, Beijing 100084, China.

Institute for Precision Medicine, Tsinghua University, Beijing 100084, China.

Publication information

Bioinformatics. 2024 May 2;40(5). doi: 10.1093/bioinformatics/btae316.

Abstract

MOTIVATION

Multi-omics data provide a comprehensive view of gene regulation at multiple levels, which is helpful in achieving accurate diagnosis of complex diseases like cancer. However, conventional integration methods rarely utilize prior biological knowledge and lack interpretability.

RESULTS

To integrate multi-omics data from tissue and liquid biopsies for disease diagnosis and prognosis, we developed Pathformer, a biological pathway informed Transformer. It embeds multi-omics input with a compacted multi-modal vector and a pathway-based sparse neural network. Pathformer also leverages a criss-cross attention mechanism to capture the crosstalk between different pathways and modalities. We first benchmarked Pathformer against 18 comparable methods on multiple cancer datasets, where Pathformer outperformed all the other methods, with an average improvement of 6.3%-14.7% in F1 score for cancer survival prediction, 5.1%-12% for cancer stage prediction, and 8.1%-13.6% for cancer drug response prediction. Subsequently, for cancer prognosis prediction based on tissue multi-omics data, we used a case study to demonstrate the biological interpretability of Pathformer by identifying key pathways and their biological crosstalk. Then, for early cancer diagnosis based on liquid biopsy data, we used plasma and platelet datasets to demonstrate Pathformer's potential for clinical application in cancer screening. Moreover, we revealed deregulation of interesting pathways (e.g. the scavenger receptor pathway) and their crosstalk in cancer patients' blood, providing potential candidate targets for cancer microenvironment study.
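The idea behind a pathway-based sparse neural network can be illustrated with a minimal sketch: a layer whose weights are masked so that each pathway unit receives input only from its member genes. This is not the Pathformer implementation (see the repository for that). The function names, the toy gene-to-pathway memberships, the fixed weights, and the `tanh` activation are all assumptions chosen for illustration.

```python
import math


def build_pathway_mask(gene_names, pathways):
    """Return a genes x pathways 0/1 matrix (nested lists).

    Entry [i][j] is 1.0 when gene i is listed as a member of pathway j.
    """
    gene_index = {gene: i for i, gene in enumerate(gene_names)}
    mask = [[0.0] * len(pathways) for _ in gene_names]
    for j, members in enumerate(pathways.values()):
        for gene in members:
            if gene in gene_index:
                mask[gene_index[gene]][j] = 1.0
    return mask


def pathway_sparse_forward(sample, weights, mask, bias):
    """Compute one activation per pathway for a single sample.

    Weights are multiplied by the mask, so genes outside a pathway
    contribute nothing to that pathway's activation.
    """
    outputs = []
    for j in range(len(bias)):
        total = bias[j]
        for i, value in enumerate(sample):
            total += value * weights[i][j] * mask[i][j]
        outputs.append(math.tanh(total))
    return outputs


# Hypothetical toy inputs: four genes, two pathways (illustrative only).
genes = ["TP53", "MDM2", "EGFR", "KRAS"]
pathways = {
    "pathway_A": ["TP53", "MDM2"],
    "pathway_B": ["EGFR", "KRAS"],
}
mask = build_pathway_mask(genes, pathways)

# Fixed weights keep the example deterministic; a real model would learn them.
weights = [[0.5, 0.5] for _ in genes]
bias = [0.0, 0.0]
sample = [1.0, 0.5, 0.5, 0.25]  # one value per gene, e.g. a normalized expression level

pathway_embedding = pathway_sparse_forward(sample, weights, mask, bias)
print(pathway_embedding)
```

Because the mask zeroes every gene-to-pathway connection that is not backed by prior knowledge, each output unit can be read as a pathway-level score, which is what makes this kind of layer easier to interpret than a dense one.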

AVAILABILITY AND IMPLEMENTATION

Pathformer is implemented and freely available at https://github.com/lulab/Pathformer.
