RPIPLM: Prediction of ncRNA-protein interaction by post-training a dual-tower pretrained biological model with supervised contrastive learning.

Authors

Liu Yiwei, Bao Ting, Yin Peng, Wang Shumin, Wang Yanbin

Affiliations

Defence Industry Secrecy Examination and Certification Center, Beijing, China.

National Key Laboratory of Science and Technology on Information System Security, Beijing, China.

Publication information

PLoS One. 2025 Aug 14;20(8):e0329174. doi: 10.1371/journal.pone.0329174. eCollection 2025.

DOI: 10.1371/journal.pone.0329174
PMID: 40811705
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12352837/

Abstract

The field of biological research has been profoundly impacted by the emergence of biological pre-trained models, which have resulted in remarkable advancements in life sciences and medicine. However, the current landscape of biological pre-trained language models suffers from a shortcoming, i.e., their inability to grasp the intricacies of molecular interactions, such as ncRNA-protein interactions. It is in this context that our paper introduces a two-tower computational framework, termed RPIPLM, which brings forth a new paradigm for the prediction of ncRNA-protein interactions. The core of RPIPLM lies in its harnessing of a pre-trained RNA language model and a pre-trained protein language model to process ncRNA and protein sequences, thereby enabling the transfer of the general knowledge gained from self-supervised learning on vast data to ncRNA-protein interaction tasks. Additionally, to learn the intricate interaction patterns between RNA and protein embeddings across diverse scales, we employ a fusion of a scaled dot-product self-attention mechanism and multi-scale convolution operations on the output of the dual-tower architecture, effectively capturing both global and local information. Furthermore, we introduce supervised contrastive learning into the training of RPIPLM, enabling the model to effectively capture discriminative information by distinguishing between interacting and non-interacting samples in the learned representations. Through extensive experiments and an interpretability study, we demonstrate the effectiveness of RPIPLM and its superiority over other methods, establishing new state-of-the-art performance. RPIPLM is a powerful and scalable computational framework that holds the potential to unlock enormous insights from vast biological data, thereby accelerating the discovery of molecular interactions.
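To make the supervised contrastive learning step more concrete, the following is a minimal sketch of a generic supervised contrastive loss in plain Python. It is an illustration only and not necessarily the objective used in RPIPLM: the function names, the temperature value, and the toy embeddings are all assumptions, and the abstract does not specify the exact formulation. Each embedding stands in for a fused ncRNA-protein pair representation, and each label marks the pair as interacting (1) or non-interacting (0).

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (hypothetical helper, not the paper's code).

    For each anchor, samples sharing its label are positives and all other
    samples act as the contrast set. Anchors without any positive are skipped.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    per_anchor = []
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other sample.
        sims = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        # Average negative log-probability of the positives for this anchor.
        per_anchor.append(-sum(sims[p] - log_denom for p in positives) / len(positives))
    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0


# Toy pair embeddings (made up for illustration): two clusters in 2-D.
pairs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(pairs, [1, 1, 0, 0])   # labels match clusters
shuffled = supervised_contrastive_loss(pairs, [1, 0, 1, 0])  # labels cut across clusters
print(aligned < shuffled)
```

The loss is small when samples with the same label sit close together and far from samples with the other label, which is the sense in which such an objective encourages representations that separate interacting from non-interacting pairs.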

