Tan Jie, Xie Jiancong, Huang Jiarong, Deng Weizhen, Chai Hua, Yang Yuedong
School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.
Guangzhou KingMed Center for Clinical Laboratory Co. Ltd., Guangzhou, China.
Comput Struct Biotechnol J. 2024 Jul 24;24:523-532. doi: 10.1016/j.csbj.2024.07.019. eCollection 2024 Dec.
Diffuse large B-cell lymphoma (DLBCL) is the most common subtype of non-Hodgkin lymphoma (NHL) and is characterized by high heterogeneity. Assessing its prognosis and genetic subtypes has significant clinical implications. However, existing DLBCL prognostic models are mainly based on transcriptomic profiles, whereas genetic variation detection is more commonly used in clinical practice. In addition, current clustering-based subtyping methods mostly focus on genes with high mutation frequencies and therefore explain the heterogeneity of DLBCL insufficiently. Here, we propose VNNSurv (https://bio-web1.nscc-gz.cn/app/VNNSurv), a survival model for DLBCL patients based on a biologically informed visible neural network (VNN). VNNSurv achieved an average C-index of 0.72 on the cross-validation set (HMRN cohort, n = 928), outperforming the baseline methods. The interpretability of VNNSurv facilitated the identification of the most impactful genes and of the underlying pathways through which they act on patient outcomes. When only the 30 highest-impact genes were used as genetic input, the overall performance of VNNSurv improved, and a C-index of 0.70 was achieved on the external TCGA cohort (n = 48). Leveraging these high-impact genes, including 16 genes with low (<5%) alteration frequencies, we devised a genetic-based prognostic index (GPI) for risk stratification and a subtype identification method. We stratified the patient group according to the International Prognostic Index (IPI) into three risk grades with significant prognostic differences. Furthermore, the defined subtypes exhibited greater prognostic consistency than those from clustering-based methods. Broadly, VNNSurv is a valuable DLBCL survival model. Its high interpretability has significant value for precision medicine, and its framework is scalable to other diseases.
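The C-index values reported above (0.72 in cross-validation, 0.70 on the external cohort) measure how often a model ranks patients' risks in the same order as their observed survival. The following is a minimal sketch of Harrell's concordance index, not the authors' implementation; the function name `concordance_index` and the five-patient toy data are hypothetical and are not drawn from the HMRN or TCGA cohorts. For simplicity, pairs with tied survival times are skipped.

```python
from itertools import combinations


def concordance_index(times, events, risks):
    """Harrell's C-index (illustrative sketch).

    times  : follow-up time for each patient
    events : 1 if death was observed, 0 if the patient was censored
    risks  : predicted risk score (higher means worse expected outcome)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is comparable only if the shorter time is an observed
        # event; tied times are skipped in this simplified version.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0   # higher risk died earlier: concordant
        elif risks[a] == risks[b]:
            concordant += 0.5   # tied risk scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, two of them censored.
times = [5, 10, 12, 20, 24]
events = [1, 1, 0, 1, 0]
risks = [0.9, 0.4, 0.7, 0.3, 0.1]
print(concordance_index(times, events, risks))  # 7 of 8 comparable pairs -> 0.875
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.72 and 0.70 should be read.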