Learning protein binding affinity using privileged information.

Author information

Biomedical Informatics Research Laboratory (BIRL), Department of Computer and Information Sciences (DCIS), Pakistan Institute of Engineering and Applied Sciences (PIEAS), Nilore, ISL, 45650, Pakistan.

Information Technology Center (ITC), University of Azad Jammu & Kashmir, Muzaffarabad, Azad Kashmir, 13100, Pakistan.

Publication information

BMC Bioinformatics. 2018 Nov 15;19(1):425. doi: 10.1186/s12859-018-2448-z.

DOI:10.1186/s12859-018-2448-z

PMID:30442086

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6238365/

Abstract

BACKGROUND

Determining protein-protein interactions and their binding affinity is important for understanding cellular biological processes, for the discovery and design of novel therapeutics, and for protein engineering and mutagenesis studies. Because wet-lab experiments demand considerable time and effort, computational prediction of binding affinity from sequence or structure is an important area of research. Structure-based methods, though more accurate than sequence-based techniques, are limited in their applicability by the scarcity of protein structure data.

RESULTS

In this study, we propose a novel machine learning method for predicting binding affinity that uses protein 3D structure as privileged information at training time while requiring only protein sequence information at test time. The method, which is based on the learning using privileged information (LUPI) framework, achieves improved performance over corresponding sequence-based binding affinity prediction methods that do not have access to privileged information during training. Our experiments show that the proposed framework, which uses structure only during training, can achieve classification performance comparable to that obtained using structure-based features. Evaluation on an independent test set also shows improved performance over the PPA-Pred2 method.
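The train-time/test-time asymmetry described above can be illustrated with a small sketch. This is not the authors' implementation: it substitutes generalized distillation, a simpler LUPI-style technique, in which a teacher model is fit on the privileged features and a student model, which sees only the regular features, is fit on a blend of the true labels and the teacher's soft predictions. The data are synthetic, and every name here (fit_lupi_by_distillation, make_toy_data, the imitation parameter) is hypothetical and chosen for illustration only.

```python
import math
import random


def sigmoid(t):
    # Numerically safe logistic function.
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def fit_logistic(X, targets, epochs=300, lr=0.5):
    """Logistic regression by batch gradient descent; targets may be soft values in [0, 1]."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, t in zip(X, targets):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - t
            for j in range(d):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(model, X):
    w, b = model
    return [sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in X]


def fit_lupi_by_distillation(X_seq, X_struct, y, imitation=0.5):
    """Train a sequence-only student guided by a structure-based teacher."""
    # Teacher: sees the privileged (structure-like) features.
    teacher = fit_logistic(X_struct, y)
    soft = predict_proba(teacher, X_struct)
    # Student targets: blend of hard labels and teacher soft predictions.
    blended = [(1 - imitation) * yi + imitation * si for yi, si in zip(y, soft)]
    # Student: sees only the regular (sequence-like) features.
    return fit_logistic(X_seq, blended)


def make_toy_data(n, rng):
    """Synthetic stand-in: a clean 'structure' feature and noisy 'sequence' features."""
    X_seq, X_struct, y = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        X_struct.append([z])                                    # privileged view
        X_seq.append([z + rng.gauss(0, 0.5), rng.gauss(0, 1)])  # noisy view + distractor
        y.append(1.0 if z > 0 else 0.0)
    return X_seq, X_struct, y


rng = random.Random(0)
X_seq, X_struct, y = make_toy_data(200, rng)
student = fit_lupi_by_distillation(X_seq, X_struct, y)

# At test time only the sequence-like features are supplied.
X_test, _, y_test = make_toy_data(100, rng)
probs = predict_proba(student, X_test)
accuracy = sum((p > 0.5) == (t == 1.0) for p, t in zip(probs, y_test)) / len(y_test)
print(f"sequence-only test accuracy: {accuracy:.2f}")
```

The point of the sketch is the interface rather than the numbers: the privileged features appear only inside the training function, and the returned model has weights for the sequence-like features alone, so prediction never requires structure.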

CONCLUSIONS

The proposed method outperforms several baseline learners and a state-of-the-art binding affinity predictor not only in cross-validation, but also on an additional validation dataset, demonstrating the utility of the LUPI framework for problems that would benefit from classification using structure-based features. The implementation of LUPI developed for this work is expected to be useful in other areas of bioinformatics as well.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e7f6/6238365/27491722028a/12859_2018_2448_Fig1_HTML.jpg
