EnANNDeep: An Ensemble-based lncRNA-protein Interaction Prediction Framework with Adaptive k-Nearest Neighbor Classifier and Deep Models.

Affiliations

School of Computer Science, Hunan University of Technology, Zhuzhou, China.

College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, China.

Publication information

Interdiscip Sci. 2022 Mar;14(1):209-232. doi: 10.1007/s12539-021-00483-y. Epub 2022 Jan 10.

DOI:10.1007/s12539-021-00483-y

PMID:35006529

Abstract

Predicting lncRNA-protein interactions (LPIs) can deepen the understanding of many important biological processes, and artificial intelligence methods have already reported many possible LPIs. However, most computational techniques were evaluated mainly on a single dataset, which may produce prediction bias. More importantly, they were validated only under cross validation on lncRNA-protein pairs and did not consider performance under cross validation on lncRNAs or on proteins, and thus fail to identify related proteins (or lncRNAs) for a new lncRNA (or protein). This study presents an ensemble learning framework, EnANNDeep, composed of an adaptive k-nearest neighbor classifier and deep models, to systematically find underlying linkages between lncRNAs and proteins. First, five LPI-related datasets are assembled. Second, features from multiple sources are integrated to represent each lncRNA-protein pair. Third, an adaptive k-nearest neighbor classifier, a deep neural network, and a deep forest are each designed to score unknown lncRNA-protein pairs. Finally, the interaction probabilities from the three predictors are integrated by soft voting. Compared with five classical LPI identification models (SFPEL, PMDKN, CatBoost, PLIPCOM, and LPI-SKF) under fivefold cross validation on lncRNAs, proteins, and LPIs, EnANNDeep achieves the best average AUCs of 0.8660, 0.8775, and 0.9166, respectively, and the best average AUPRs of 0.8545, 0.8595, and 0.9054, respectively, indicating superior LPI prediction ability. Case study analyses indicate that SNHG10 may have dense linkage with Q15717. Within the ensemble framework, the adaptive k-nearest neighbor classifier picks the most appropriate k separately for each query lncRNA-protein pair. More importantly, the deep models (deep neural network and deep forest) can effectively learn representative features of lncRNAs and proteins.
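The two evaluation ideas in the abstract, soft voting over three predictors and fivefold cross validation on lncRNAs, proteins, or pairs, can be illustrated with a minimal sketch. This is not the authors' code. The function names (`soft_vote`, `entity_folds`, `pair_folds`), the equal default weights, the toy identifiers, and the probability values are all assumptions made for illustration. The sketch also assumes that "cross validation on lncRNAs" means every pair involving a held-out lncRNA is withheld together (and likewise for proteins), which is the usual reading of this setting.

```python
import random


def soft_vote(prob_lists, weights=None):
    """Fuse per-pair probabilities from several predictors by weighted averaging.

    Equal weights are an assumption; the abstract does not state the weights.
    """
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0] * n_models
    if len(weights) != n_models:
        raise ValueError("need one weight per predictor")
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, probs)) / total
        for probs in zip(*prob_lists)
    ]


def entity_folds(pairs, key_index, n_splits=5, seed=0):
    """Assign pairs to folds so all pairs sharing one entity land in one fold.

    key_index=0 groups by lncRNA, key_index=1 groups by protein.
    Returns a list of folds, each a list of indices into `pairs`.
    """
    entities = sorted({pair[key_index] for pair in pairs})
    random.Random(seed).shuffle(entities)
    fold_of = {entity: i % n_splits for i, entity in enumerate(entities)}
    folds = [[] for _ in range(n_splits)]
    for idx, pair in enumerate(pairs):
        folds[fold_of[pair[key_index]]].append(idx)
    return folds


def pair_folds(pairs, n_splits=5, seed=0):
    """Assign individual pairs to folds at random (cross validation on LPIs)."""
    indices = list(range(len(pairs)))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_splits] for i in range(n_splits)]


# Hypothetical toy data: 10 lncRNAs x 5 proteins = 50 candidate pairs.
pairs = [(f"lnc{i}", f"prot{j}") for i in range(10) for j in range(5)]
lncrna_folds = entity_folds(pairs, key_index=0)   # unseen-lncRNA setting
protein_folds = entity_folds(pairs, key_index=1)  # unseen-protein setting
lpi_folds = pair_folds(pairs)                     # unseen-pair setting

# Made-up probabilities for two pairs from three predictors
# (standing in for adaptive kNN, deep neural network, deep forest).
fused = soft_vote([[0.9, 0.2], [0.8, 0.4], [0.7, 0.3]])
```

In the entity-based splits, no lncRNA (or protein) in a test fold appears in the training folds, which is why these settings are harder than splitting individual pairs and why the abstract reports lower AUCs for them.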

Similar articles

LPI-HyADBS: a hybrid framework for lncRNA-protein interaction prediction integrating feature selection and classification.

BMC Bioinformatics. 2021 Nov 26;22(1):568. doi: 10.1186/s12859-021-04485-x.

LPI-deepGBDT: a multiple-layer deep framework based on gradient boosting decision trees for lncRNA-protein interaction identification.

BMC Bioinformatics. 2021 Oct 4;22(1):479. doi: 10.1186/s12859-021-04399-8.

LPI-EnEDT: an ensemble framework with extra tree and decision tree classifiers for imbalanced lncRNA-protein interaction data classification.

BioData Min. 2021 Dec 3;14(1):50. doi: 10.1186/s13040-021-00277-4.

A novel lncRNA-protein interaction prediction method based on deep forest with cascade forest structure.

Sci Rep. 2021 Sep 23;11(1):18881. doi: 10.1038/s41598-021-98277-1.

Finding lncRNA-Protein Interactions Based on Deep Learning With Dual-Net Neural Architecture.

IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3456-3468. doi: 10.1109/TCBB.2021.3116232. Epub 2022 Dec 8.

SFPEL-LPI: Sequence-based feature projection ensemble learning for predicting LncRNA-protein interactions.

PLoS Comput Biol. 2018 Dec 11;14(12):e1006616. doi: 10.1371/journal.pcbi.1006616. eCollection 2018 Dec.

LPI-SKMSC: Predicting LncRNA-Protein Interactions with Segmented k-mer Frequencies and Multi-space Clustering.

Interdiscip Sci. 2024 Jun;16(2):378-391. doi: 10.1007/s12539-023-00598-4. Epub 2024 Jan 11.

RLF-LPI: An ensemble learning framework using sequence information for predicting lncRNA-protein interaction based on AE-ResLSTM and fuzzy decision.

Math Biosci Eng. 2022 Mar 11;19(5):4749-4764. doi: 10.3934/mbe.2022222.

Predicting potential lncRNA biomarkers for lung cancer and neuroblastoma based on an ensemble of a deep neural network and LightGBM.

Front Genet. 2023 Aug 16;14:1238095. doi: 10.3389/fgene.2023.1238095. eCollection 2023.

Cited by

LncPTPred: predicting lncRNA-protein interaction based on crosslinking and immunoprecipitation (CLIP-Seq) data.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf432.

MFH-LPI: based on multi-view similarity networks fusion and hypergraph learning for long non-coding RNA-protein interactions prediction.

BMC Genomics. 2025 Jul 1;26(1):597. doi: 10.1186/s12864-025-11774-9.

MLWNNR: LncRNA-Disease Association Prediction with Multi-Kernel Learning-Driven Weighted Nuclear Norm Regularization.

Interdiscip Sci. 2025 Jun 23. doi: 10.1007/s12539-025-00717-3.

Negative sampling strategies impact the prediction of scale-free biomolecular network interactions with machine learning.

BMC Biol. 2025 May 9;23(1):123. doi: 10.1186/s12915-025-02231-w.

Inter-view contrastive learning and miRNA fusion for lncRNA-protein interaction prediction in heterogeneous graphs.

Brief Bioinform. 2025 Mar 4;26(2). doi: 10.1093/bib/bbaf148.

NPI-HGNN: A Heterogeneous Graph Neural Network-Based Approach for Predicting ncRNA-Protein Interactions.

Interdiscip Sci. 2025 Feb 21. doi: 10.1007/s12539-025-00689-4.

BEROLECMI: a novel prediction method to infer circRNA-miRNA interaction from the role definition of molecular attributes and biological networks.

BMC Bioinformatics. 2024 Aug 10;25(1):264. doi: 10.1186/s12859-024-05891-7.

BioPrediction-RPI: Democratizing the prediction of interaction between non-coding RNA and protein with end-to-end machine learning.

Comput Struct Biotechnol J. 2024 May 22;23:2267-2276. doi: 10.1016/j.csbj.2024.05.031. eCollection 2024 Dec.

Adap-BDCM: Adaptive Bilinear Dynamic Cascade Model for Classification Tasks on CNV Datasets.

Interdiscip Sci. 2024 Dec;16(4):1019-1037. doi: 10.1007/s12539-024-00635-w. Epub 2024 May 17.

GEnDDn: An lncRNA-Disease Association Identification Framework Based on Dual-Net Neural Architecture and Deep Neural Network.

Interdiscip Sci. 2024 Jun;16(2):418-438. doi: 10.1007/s12539-024-00619-w. Epub 2024 May 11.
