Jia Liye, Jiang Liancheng, Yue Junhong, Hao Fang, Wu Yongfei, Liu Xilin
IEEE/ACM Trans Comput Biol Bioinform. 2024 Nov-Dec;21(6):2568-2579. doi: 10.1109/TCBB.2024.3486742. Epub 2024 Dec 10.
The stage prediction of kidney renal clear cell carcinoma (KIRC) is important for the diagnosis, personalized treatment, and prognosis of patients. Many prediction methods have been proposed, but most of them are based on unimodal gene data, and their accuracy is difficult to improve further. Therefore, we propose a novel multi-weighted dynamic cascade forest based on bilinear feature extraction (MLW-BFECF) for stage prediction of KIRC using multimodal gene data (RNA-seq, CNA, and methylation). The proposed model uses a dynamic cascade framework with shuffle layers to prevent early degradation of the model. In each cascade layer, a voting technique based on three gene selection algorithms is first employed to retain the gene features more relevant to KIRC and eliminate redundant information. Then, two new bilinear models based on the gated attention mechanism are proposed to better extract new intra-modal and inter-modal gene features. Finally, based on the idea of bagging, a multi-weighted ensemble forest classifier module is proposed to extract and fuse probabilistic features of the three-modal gene data. A series of experiments demonstrates that, on the three-modal KIRC dataset, the MLW-BFECF model achieves the highest prediction performance, with an accuracy of 88.9%.
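Two steps named in the abstract, voting across three gene selection algorithms and weighted fusion of probabilistic features, can be illustrated with a minimal Python sketch. The abstract does not specify the voting rule or the weighting scheme, so the majority threshold (`min_votes=2`), the weighted-average fusion, the function names `vote_select` and `fuse_probabilities`, and the gene labels are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter
from typing import List, Sequence


def vote_select(selections: Sequence[Sequence[str]], min_votes: int = 2) -> List[str]:
    """Keep genes chosen by at least `min_votes` selectors (assumed voting rule)."""
    # Count each gene at most once per selector, even if a selector lists it twice.
    votes = Counter(gene for chosen in selections for gene in set(chosen))
    return sorted(gene for gene, n in votes.items() if n >= min_votes)


def fuse_probabilities(
    probs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Weighted average of class-probability vectors (assumed fusion rule)."""
    if len(probs) != len(weights):
        raise ValueError("need exactly one weight per probability vector")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_classes = len(probs[0])
    return [
        sum(w * p[c] for p, w in zip(probs, weights)) / total
        for c in range(n_classes)
    ]


# Hypothetical outputs of three gene selection algorithms (placeholder labels).
selected = vote_select(
    [
        ["gene_a", "gene_b", "gene_c"],
        ["gene_a", "gene_c", "gene_d"],
        ["gene_a", "gene_b", "gene_e"],
    ]
)

# Hypothetical class probabilities from one forest per modality
# (RNA-seq, CNA, methylation), with made-up weights.
fused = fuse_probabilities([[0.8, 0.2], [0.6, 0.4], [0.5, 0.5]], [2.0, 1.0, 1.0])
```

Here `selected` keeps only the genes that at least two of the three selectors agree on, and `fused` is a single probability vector in which the first modality counts twice as much as each of the others. In a cascade, a vector like `fused` would typically be passed to the next layer alongside the selected features, although the abstract does not describe that wiring in detail.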