Computationally predicting protein-RNA interactions using only positive and unlabeled examples.

Authors

Cheng Zhanzhan, Zhou Shuigeng, Guan Jihong

Affiliation

Shanghai Key Lab of Intelligent Information Processing and School of Computer Science, Fudan University, 220 Handan Road, Shanghai 200433, China.

Publication information

J Bioinform Comput Biol. 2015 Jun;13(3):1541005. doi: 10.1142/S021972001541005X. Epub 2015 Feb 8.

DOI: 10.1142/S021972001541005X
PMID: 25790785

Abstract

Protein-RNA interactions (PRIs) are important in a wide variety of cellular processes, ranging from transcriptional and post-transcriptional regulation of gene expression to the active defense of the host against viruses. With the development of high-throughput technology, large amounts of PRI information are now available for computationally predicting unknown PRIs. In recent years, a number of computational methods for predicting PRIs have been developed. These methods usually construct artificial negative samples from verified nonredundant PRI datasets in order to train classifiers. However, such negative samples are not real negatives; some may even be unknown positives. Consequently, classifiers trained on such datasets cannot achieve satisfactory prediction performance. In this paper, we propose a novel method, PRIPU, that employs a biased support vector machine (SVM) to predict Protein-RNA Interactions using only Positive and Unlabeled examples. To the best of our knowledge, this is the first work that predicts PRIs using only positive and unlabeled samples. We first collect known PRIs as our benchmark datasets and extract sequence-based features to represent each PRI. To reduce the dimension of the feature vectors and thereby lower the computational cost, we select a subset of features with a filter-based feature selection method. Biased SVM is then used to train prediction models on the different PRI datasets. To evaluate the new method, we also propose a new performance measure called explicit positive recall (EPR), which is specifically suited to the task of learning from positive and unlabeled data. Experimental results over three datasets show that our method not only outperforms four existing methods but is also able to predict unknown PRIs. Source code, datasets and related documents of PRIPU are available at: http://admis.fudan.edu.cn/projects/pripu.htm.
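
The abstract does not spell out the biased-SVM formulation. A common form in the positive-unlabeled learning literature treats unlabeled examples as noisy negatives and penalizes errors on known positives more heavily than errors on unlabeled examples. The following is a minimal pure-Python sketch of that idea, not PRIPU's implementation. It assumes a linear kernel, batch subgradient descent, and toy 2-D data; the function names, the parameters `c_pos` and `c_unl`, and the data points are all hypothetical. It does not implement the paper's feature extraction, feature selection, or EPR measure, none of which are defined in the abstract.

```python
def score(w, b, x):
    """Signed decision value w.x + b of a linear classifier."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_biased_svm(X, y, c_pos=10.0, c_unl=1.0, epochs=200, lr=0.01):
    """Train a linear SVM with class-dependent hinge-loss penalties.

    y[i] is +1 for a known positive and -1 for an unlabeled example
    (treated as a noisy negative). Assumed objective:
        0.5*||w||^2 + (c_pos/n) * sum of hinge losses over positives
                    + (c_unl/n) * sum of hinge losses over unlabeled
    Setting c_pos > c_unl makes errors on known positives cost more.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of the 0.5*||w||^2 term
        for xi, yi in zip(X, y):
            if yi * score(w, b, xi) < 1:  # margin violated: hinge is active
                c = c_pos if yi > 0 else c_unl
                for j in range(d):
                    grad_w[j] -= c * yi * xi[j] / n
                grad_b -= c * yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w, b, x):
    """Return +1 (predicted interaction) or -1 (predicted non-interaction)."""
    return 1 if score(w, b, x) > 0 else -1


# Toy data (hypothetical): three known positives, and four unlabeled points,
# the last of which is a hidden positive lying among the known positives.
positives = [(2.0, 2.0), (2.0, 3.0), (3.0, 2.0)]
unlabeled = [(-2.0, -2.0), (-2.0, -3.0), (-3.0, -2.0), (2.5, 2.5)]
X = positives + unlabeled
y = [1] * len(positives) + [-1] * len(unlabeled)

w, b = train_biased_svm(X, y)
for x in unlabeled:
    print(x, "->", predict(w, b, x))
```

Because the penalty on known positives is ten times the penalty on unlabeled examples in this sketch, the hidden positive at `(2.5, 2.5)` cannot pull the decision boundary across the known positives, so it ends up on the positive side despite its training label. In practice the two penalties would be tuned on held-out data rather than fixed by hand.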


Similar articles

1. Computationally predicting protein-RNA interactions using only positive and unlabeled examples.
   J Bioinform Comput Biol. 2015 Jun;13(3):1541005. doi: 10.1142/S021972001541005X. Epub 2015 Feb 8.
2. Effectively Identifying Compound-Protein Interactions by Learning from Positive and Unlabeled Examples.
   IEEE/ACM Trans Comput Biol Bioinform. 2018 Nov-Dec;15(6):1832-1843. doi: 10.1109/TCBB.2016.2570211. Epub 2016 May 18.
3. Selecting high-quality negative samples for effectively predicting protein-RNA interactions.
   BMC Syst Biol. 2017 Mar 14;11(Suppl 2):9. doi: 10.1186/s12918-017-0390-8.
4. Predicting RNA-protein interactions using only sequence information.
   BMC Bioinformatics. 2011 Dec 22;12:489. doi: 10.1186/1471-2105-12-489.
5. RVMAB: Using the Relevance Vector Machine Model Combined with Average Blocks to Predict the Interactions of Proteins from Protein Sequences.
   Int J Mol Sci. 2016 May 18;17(5):757. doi: 10.3390/ijms17050757.
6. Improving protein-protein interactions prediction accuracy using protein evolutionary information and relevance vector machine model.
   Protein Sci. 2016 Oct;25(10):1825-33. doi: 10.1002/pro.2991. Epub 2016 Aug 9.
7. Predicting protein-protein interactions using high-quality non-interacting pairs.
   BMC Bioinformatics. 2018 Dec 31;19(Suppl 19):525. doi: 10.1186/s12859-018-2525-3.
8. Prediction of protein-protein interactions from amino acid sequences using a novel multi-scale continuous and discontinuous feature set.
   BMC Bioinformatics. 2014;15 Suppl 15(Suppl 15):S9. doi: 10.1186/1471-2105-15-S15-S9. Epub 2014 Dec 3.
9. Effect of Protein Repetitiveness on Protein-Protein Interaction Prediction Results Using Support Vector Machines.
   J Comput Biol. 2017 Feb;24(2):183-192. doi: 10.1089/cmb.2015.0233. Epub 2016 Aug 16.
10. A New Feature Vector Based on Gene Ontology Terms for Protein-Protein Interaction Prediction.
    IEEE/ACM Trans Comput Biol Bioinform. 2017 Jul-Aug;14(4):762-770. doi: 10.1109/TCBB.2016.2555304. Epub 2016 Apr 20.

Cited by

1. Plasmodium vivax antigen candidate prediction improves with the addition of Plasmodium falciparum data.
   NPJ Syst Biol Appl. 2024 Nov 13;10(1):133. doi: 10.1038/s41540-024-00465-y.
2. Leveraging permutation testing to assess confidence in positive-unlabeled learning applied to high-dimensional biological datasets.
   BMC Bioinformatics. 2024 Jun 19;25(1):218. doi: 10.1186/s12859-024-05834-2.
3. Learning peptide properties with positive examples only.
   Digit Discov. 2024 Apr 19;3(5):977-986. doi: 10.1039/d3dd00218g. eCollection 2024 May 15.
4. Positive-unlabeled learning identifies vaccine candidate antigens in the malaria parasite Plasmodium falciparum.
   NPJ Syst Biol Appl. 2024 Apr 27;10(1):44. doi: 10.1038/s41540-024-00365-1.
5. Artificial intelligence methods enhance the discovery of RNA interactions.
   Front Mol Biosci. 2022 Oct 7;9:1000205. doi: 10.3389/fmolb.2022.1000205. eCollection 2022.
6. Roles of Emerging RNA-Binding Activity of cGAS in Innate Antiviral Response.
   Front Immunol. 2021 Nov 26;12:741599. doi: 10.3389/fimmu.2021.741599. eCollection 2021.
7. Construction of Complex Features for Computational Predicting ncRNA-Protein Interaction.
   Front Genet. 2019 Feb 1;10:18. doi: 10.3389/fgene.2019.00018. eCollection 2019.
8. Chaperones, Membrane Trafficking and Signal Transduction Proteins Regulate Zaire Ebola Virus trVLPs and Interact With trVLP Elements.
   Front Microbiol. 2018 Nov 12;9:2724. doi: 10.3389/fmicb.2018.02724. eCollection 2018.
9. The Ever-Evolving Concept of the Gene: The Use of RNA/Protein Experimental Techniques to Understand Genome Functions.
   Front Mol Biosci. 2018 Mar 6;5:20. doi: 10.3389/fmolb.2018.00020. eCollection 2018.
10. Fusing multiple protein-protein similarity networks to effectively predict lncRNA-protein interactions.
    BMC Bioinformatics. 2017 Oct 16;18(Suppl 12):420. doi: 10.1186/s12859-017-1819-1.