
Imbalance Data Processing Strategy for Protein Interaction Sites Prediction.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2021 May-Jun;18(3):985-994. doi: 10.1109/TCBB.2019.2953908. Epub 2021 Jun 3.

DOI:10.1109/TCBB.2019.2953908
PMID:31751283
Abstract

Protein-protein interactions play essential roles in various biological processes. Identifying protein interaction sites helps researchers understand life activities and can therefore aid drug design. However, the number of experimentally determined protein interaction sites is far smaller than the number of protein sites in protein-protein interactions or protein complexes. As a result, the negative and positive samples are usually imbalanced, which is common but biases the results of computational approaches to predicting protein interaction sites. In this work, we present three imbalance data processing strategies to reconstruct the original dataset, and then extract protein features from the evolutionary conservation of amino acids to build a predictor for identifying protein interaction sites. On a dataset with 10,430 surface residues but only 2,299 interface residues, the imbalance data processing strategies clearly reduce the prediction bias and therefore improve the prediction performance for protein interaction sites. The experimental results show that our prediction models achieve better prediction performance, such as a prediction accuracy of 0.758 or an F-measure of 0.737, which demonstrates the effectiveness of our method.
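The abstract does not name the three strategies, so the following is only a minimal sketch of the underlying problem, not the authors' method. It assumes the 2,299 interface residues are a subset of the 10,430 surface residues (leaving 8,131 non-interface residues), and it swaps in plain random undersampling as one common way to rebalance a dataset. The helper names `f_measure` and `undersample` are hypothetical.

```python
import random


def f_measure(tp, fp, fn):
    """F1 score of the positive class; returns 0.0 when it is undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def undersample(labels, seed=0):
    """Indices of a balanced subset: every positive plus an equal-sized
    random sample of negatives (random undersampling, an assumed strategy)."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep_neg = rng.sample(neg, min(len(pos), len(neg)))
    return sorted(pos + keep_neg)


# Counts from the abstract; the subset relationship is an assumption.
N_SURFACE = 10_430
N_INTERFACE = 2_299
N_NON_INTERFACE = N_SURFACE - N_INTERFACE

labels = [1] * N_INTERFACE + [0] * N_NON_INTERFACE

# A trivial model that labels every residue "non-interface".
baseline_accuracy = N_NON_INTERFACE / N_SURFACE
baseline_f = f_measure(tp=0, fp=0, fn=N_INTERFACE)

# Rebalance to a 1:1 class ratio before training.
balanced = undersample(labels)
balanced_positive_share = sum(labels[i] for i in balanced) / len(balanced)

print(f"majority-class accuracy: {baseline_accuracy:.3f}")
print(f"majority-class F-measure: {baseline_f:.3f}")
print(f"balanced set size: {len(balanced)}, positive share: {balanced_positive_share:.2f}")
```

Under the stated assumption, a model that never predicts an interface residue still scores a high accuracy while its F-measure is zero, which is the kind of bias the abstract refers to and a reason to read accuracy and F-measure together.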


