Improve hot region prediction by analyzing different machine learning algorithms.

Affiliations

School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, Hubei, China.

Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, 430065, Hubei, China.

Publication information

BMC Bioinformatics. 2021 Oct 25;22(Suppl 3):522. doi: 10.1186/s12859-021-04420-0.

DOI:10.1186/s12859-021-04420-0
PMID:34696728
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8543831/
Abstract

BACKGROUND

In drug and protein design, it is crucial to recognize hot regions in protein-protein interactions. Each hot region of a protein-protein interaction is composed of at least three hot spots, which play an important role in binding. However, identifying hot spots through biological experiments is time-consuming and labor-intensive. If predictive models based on machine learning methods can be trained, the drug design process can be effectively accelerated.

RESULTS

The results show that different machine learning algorithms perform similarly when evaluated by F-measure. The main differences between these methods lie in recall and precision. Because the key attribute of hot regions is that they are tightly packed, we used a clustering algorithm to predict hot regions. By combining Gaussian Naïve Bayes and DBSCAN, the F-measure of hot region prediction reaches 0.809.
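
The observation that methods score similarly on F-measure while differing in recall and precision can be illustrated with a short sketch. It assumes the F-measure here is the balanced F1 score (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the two classifiers and their numbers below are hypothetical, not results from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical classifiers: one favors recall, the other favors precision.
high_recall_f = f_measure(precision=0.6, recall=0.9)
high_precision_f = f_measure(precision=0.9, recall=0.6)

# Both land at roughly 0.72 despite opposite precision/recall trade-offs.
print(round(high_recall_f, 3), round(high_precision_f, 3))
```

Because F1 is symmetric in its two arguments, a recall-heavy classifier and a precision-heavy one can be indistinguishable on this metric alone, which is why the comparison in the abstract looks at recall and precision separately.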

CONCLUSIONS

In this paper, different machine learning models such as Gaussian Naïve Bayes, SVM, XGBoost, Random Forest, and Artificial Neural Network are used to predict hot spots. The experimental results show that combining a hot spot classification algorithm with higher recall and a clustering algorithm with higher precision can effectively improve the accuracy of hot region prediction.
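
The clustering half of this combination can be sketched as follows. This is a simplified pure-Python DBSCAN written for illustration, not the authors' implementation: the residue labels, 3D coordinates, and the `eps` and `min_samples` values are all hypothetical, and the classifier step is assumed to have already produced the set of predicted hot spots. The only detail taken from the abstract is that a hot region contains at least three hot spots.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Return one label per point: a cluster id (0, 1, ...) or NOISE."""
    n = len(points)
    labels = [None] * n

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j in range(n) if dist(points[i], points[j]) <= eps]

    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        neighbors = region(i)
        if len(neighbors) < min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        labels[i] = cluster
        queue = [j for j in neighbors if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region(j)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # core point: keep growing the cluster
        cluster += 1
    return labels


def hot_regions(predicted_hot_spots, eps=6.5, min_samples=3):
    """Group predicted hot spots into regions of at least three members."""
    names = list(predicted_hot_spots)
    coords = [predicted_hot_spots[name] for name in names]
    labels = dbscan(coords, eps, min_samples)
    regions = {}
    for name, label in zip(names, labels):
        if label != NOISE:
            regions.setdefault(label, []).append(name)
    return [sorted(members) for members in regions.values() if len(members) >= 3]


# Hypothetical classifier output: residue label -> representative coordinate.
spots = {
    "A:R45": (0.0, 0.0, 0.0),
    "A:W47": (3.0, 0.0, 0.0),
    "A:Y50": (0.0, 4.0, 0.0),
    "A:D88": (40.0, 40.0, 40.0),  # far from the first group
    "A:K90": (43.0, 40.0, 40.0),  # only one neighbor, so too sparse
}
print(hot_regions(spots))
```

In this sketch the three tightly packed residues form one region, while the isolated pair is discarded as noise. That filtering is the intuition behind pairing a high-recall classifier with a density-based clustering step: the classifier can afford to over-predict because scattered false positives rarely form a dense cluster.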

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb37/8543831/08e59d911628/12859_2021_4420_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb37/8543831/c1b5a2685370/12859_2021_4420_Fig2_HTML.jpg
