College of Computer Science and Technology (College of Data Science), Taiyuan University of Technology, Taiyuan, 030600, China.
College of Computer Science and Technology, Taiyuan Normal University, Taiyuan, 030619, China.
Interdiscip Sci. 2024 Dec;16(4):1019-1037. doi: 10.1007/s12539-024-00635-w. Epub 2024 May 17.
Copy number variation (CNV) is an essential genetic driver of cancer formation and progression, which makes intelligent classification based on CNV feasible. However, current machine learning and deep learning methods face several challenges, such as designing base classifier combination schemes in ensemble methods and choosing the number of layers in neural networks, and these often result in low accuracy. Therefore, an adaptive bilinear dynamic cascade model (Adap-BDCM) is developed to improve the accuracy and applicability of such methods for intelligent classification on CNV datasets. In this model, a feature selection module is introduced to mitigate the interference of redundant information, and a bilinear model based on a gated attention mechanism is proposed to extract more informative deep fusion features. Furthermore, an adaptive base classifier selection scheme is designed to remove the need to design base classifier combinations manually and to broaden the applicability of the model. Lastly, a novel feature fusion scheme with an attribute recall submodule is constructed, which helps the model avoid getting stuck in local optima and losing valuable information. Extensive experiments demonstrate that Adap-BDCM achieves the best performance in cancer classification, stage prediction, and recurrence prediction on CNV datasets. This study can help physicians make faster and more accurate diagnoses.
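The abstract does not say how the adaptive base classifier selection scheme chooses its classifiers. As an illustration only, the general idea of replacing a hand-designed combination with a data-driven one can be sketched by ranking candidate classifiers on cross-validated accuracy and keeping the best few. Everything in this sketch is an assumption and not taken from Adap-BDCM: the three toy classifiers, the synthetic two-class data standing in for CNV features, the contiguous 5-fold split, and the function names `cv_accuracy` and `select_base_classifiers`.

```python
import random
from collections import Counter

def majority_fit(X, y):
    """Baseline: always predict the most common training label."""
    label = Counter(y).most_common(1)[0][0]
    return lambda x: label

def centroid_fit(X, y):
    """Nearest-centroid classifier (squared Euclidean distance)."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    cents = {c: [s / counts[c] for s in sums[c]] for c in sums}
    def predict(x):
        return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])))
    return predict

def one_nn_fit(X, y):
    """1-nearest-neighbour classifier (squared Euclidean distance)."""
    def predict(x):
        i = min(range(len(X)), key=lambda i: sum((a - b) ** 2 for a, b in zip(x, X[i])))
        return y[i]
    return predict

def cv_accuracy(fit, X, y, k=5):
    """Mean accuracy over k contiguous folds (data assumed pre-shuffled)."""
    n = len(X)
    scores = []
    for f in range(k):
        test_idx = set(range(f * n // k, (f + 1) * n // k))
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        predict = fit(X_train, y_train)
        hits = sum(predict(X[i]) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / k

def select_base_classifiers(candidates, X, y, top_k=2):
    """Hypothetical selection rule: keep the top_k candidates by CV accuracy."""
    ranked = sorted(candidates, key=lambda name: cv_accuracy(candidates[name], X, y),
                    reverse=True)
    return ranked[:top_k]

# Synthetic stand-in for a CNV feature matrix: 60 samples, 4 features, 2 classes.
rng = random.Random(0)
data = [([rng.gauss(-1.0 if label == 0 else 1.0, 1.0) for _ in range(4)], label)
        for label in [0, 1] * 30]
rng.shuffle(data)
X = [features for features, _ in data]
y = [label for _, label in data]

candidates = {"majority": majority_fit, "centroid": centroid_fit, "one_nn": one_nn_fit}
selected = select_base_classifiers(candidates, X, y, top_k=2)
print(selected)
```

On this balanced toy data the majority baseline cannot exceed 50% accuracy in any fold, so the selection keeps the two distance-based classifiers. A real cascade would presumably repeat such a selection on actual CNV features and with stronger candidates; that detail is not specified in the abstract.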