Yao Dengju, Zhang Binbin, Zhan Xiaojuan, Zhang Bo, Li Xiang Kui
School of Computer Science and Technology, Harbin University of Science and Technology, Harbin 150080, China.
College of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin 150050, China.
ACS Omega. 2024 Jul 30;9(32):35100-35112. doi: 10.1021/acsomega.4c05365. eCollection 2024 Aug 13.
Identifying associations between long noncoding RNAs (lncRNAs) and diseases is critical for disease prevention, diagnosis, and treatment. However, discovering these associations through wet-lab experiments is time-consuming and costly, so computational prediction of lncRNA-disease associations (LDAs) has become an important alternative. To improve the accuracy of LDA prediction and to alleviate the oversmoothing of node features that arises when graph neural networks are used to explore latent node features, we introduce DPFELDA, a dual-path feature extraction network that integrates information from multiple sources to predict LDAs. First, we establish a dual-view structure of lncRNAs and diseases and a heterogeneous network of lncRNA-disease-microRNA (miRNA) interactions. Features are then extracted by the dual-path feature extraction network. In the first path, a combination of a graph convolutional network, a convolutional block attention module, and a node aggregation layer performs multilayer topology feature extraction on the dual-view structure of lncRNAs and diseases. In the second path, a Transformer model is used to construct a node topology feature residual network that obtains node-specific features in the heterogeneous network. Finally, XGBoost is employed for LDA prediction. The experimental results demonstrate that DPFELDA outperforms the benchmark model on various benchmark data sets. Further analysis of the model shows that DPFELDA alleviates the node feature oversmoothing induced by graph-based learning. Ablation experiments confirm the effectiveness of the innovative module, and a case study substantiates the accuracy of the DPFELDA model in predicting novel LDAs for characteristic diseases.
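The oversmoothing problem mentioned in the abstract can be illustrated with a small, dependency-free sketch. It is not the DPFELDA implementation: the propagation step below is a weight-free neighbour average rather than a trained graph convolutional layer, and concatenating the per-layer outputs is only an assumed stand-in for the paper's node aggregation layer, whose exact form the abstract does not specify. The toy graph and all function names (`mean_propagate`, `spread`, `layer_outputs`, `concat_layers`) are hypothetical.

```python
def mean_propagate(adj, feats):
    """One weight-free propagation step: each node averages itself and its neighbours."""
    n = len(adj)
    out = []
    for i in range(n):
        # Neighbourhood of node i, with a self-loop added.
        nbrs = [j for j in range(n) if adj[i][j] or j == i]
        out.append([sum(feats[j][c] for j in nbrs) / len(nbrs)
                    for c in range(len(feats[0]))])
    return out


def spread(feats):
    """Largest per-column gap between any two nodes; 0 means all nodes look identical."""
    return max(max(col) - min(col) for col in zip(*feats))


def layer_outputs(adj, feats, num_layers):
    """Return [H_0, H_1, ..., H_num_layers] obtained by repeated propagation."""
    outputs = [feats]
    for _ in range(num_layers):
        outputs.append(mean_propagate(adj, outputs[-1]))
    return outputs


def concat_layers(outputs):
    """Per node, concatenate its representation from every layer (assumed aggregation)."""
    return [sum((layer[i] for layer in outputs), []) for i in range(len(outputs[0]))]


# Hypothetical toy graph: a 4-node path 0-1-2-3 with one-hot input features.
ADJ = [[0, 1, 0, 0],
       [1, 0, 1, 0],
       [0, 1, 0, 1],
       [0, 0, 1, 0]]
X = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]

outs = layer_outputs(ADJ, X, 20)
print("input spread:       ", spread(outs[0]))
print("spread after 20:    ", spread(outs[-1]))
print("spread, all layers: ", spread(concat_layers(outs)))
```

After many propagation steps the node representations become nearly indistinguishable, whereas a representation that also retains the shallow layers keeps the nodes separable. This is the general motivation behind multilayer aggregation and residual connections in graph models.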