Choosing negative examples for the prediction of protein-protein interactions.

Authors

Ben-Hur Asa, Noble William Stafford

Affiliation

Department of Computer Science, Colorado State University, Fort Collins CO, USA.

Publication information

BMC Bioinformatics. 2006 Mar 20;7 Suppl 1(Suppl 1):S2. doi: 10.1186/1471-2105-7-S1-S2.

DOI:10.1186/1471-2105-7-S1-S2
PMID:16723005
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1810313/
Abstract

The protein-protein interaction networks of even well-studied model organisms are sketchy at best, highlighting the continued need for computational methods to help direct experimentalists in the search for novel interactions. This need has prompted the development of a number of methods for predicting protein-protein interactions based on various sources of data and methodologies. The common method for choosing negative examples for training a predictor of protein-protein interactions is based on annotations of cellular localization, and the observation that pairs of proteins that have different localization patterns are unlikely to interact. While this method leads to high quality sets of non-interacting proteins, we find that this choice can lead to biased estimates of prediction accuracy, because the constraints placed on the distribution of the negative examples make the task easier. The effects of this bias are demonstrated in the context of both sequence-based and non-sequence based features used for predicting protein-protein interactions.
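
The two ways of choosing negative examples that the abstract contrasts can be illustrated with a minimal Python sketch. This is not the authors' code. The protein names, compartment annotations, interaction pairs, and function names (`candidate_pairs`, `localization_negatives`, `random_negatives`) are all hypothetical and made up for illustration. The sketch only shows how a localization filter narrows the pool of negatives compared with sampling uniformly from all pairs not known to interact.

```python
import random
from itertools import combinations

# Toy, made-up annotations (assumption): protein -> set of cellular compartments.
localization = {
    "P1": {"nucleus"},
    "P2": {"nucleus"},
    "P3": {"cytoplasm"},
    "P4": {"mitochondrion"},
    "P5": {"nucleus", "cytoplasm"},
}

# Toy, made-up known interactions (assumption), stored as sorted pairs.
positives = {("P1", "P2"), ("P3", "P5")}


def candidate_pairs(proteins, positives):
    """All unordered protein pairs that are not known to interact."""
    return [p for p in combinations(sorted(proteins), 2) if p not in positives]


def localization_negatives(localization, positives, n, rng):
    """Sample negatives only from pairs that share no annotated compartment."""
    pool = [
        (a, b)
        for a, b in candidate_pairs(localization, positives)
        if not (localization[a] & localization[b])
    ]
    return rng.sample(pool, min(n, len(pool)))


def random_negatives(localization, positives, n, rng):
    """Sample negatives uniformly from all pairs not known to interact."""
    pool = candidate_pairs(localization, positives)
    return rng.sample(pool, min(n, len(pool)))


rng = random.Random(0)
loc_neg = localization_negatives(localization, positives, 6, rng)
rand_neg = random_negatives(localization, positives, 8, rng)

# Pairs that the localization filter can never produce as negatives.
shared = [(a, b) for a, b in rand_neg if localization[a] & localization[b]]
print("localization-based negatives:", sorted(loc_neg))
print("random negatives sharing a compartment:", sorted(shared))
```

In this toy setting the localization-based pool excludes every co-localized pair, so a predictor evaluated against it never sees the harder case of two co-localized proteins that do not interact. That is the kind of constraint on the negative distribution the abstract identifies as a source of optimistic accuracy estimates.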


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b9aa/1810313/788dcfeb67af/1471-2105-7-S1-S2-1.jpg
