Prediction of protein mononucleotide binding sites using AlphaFold2 and machine learning.

Affiliations

Department of Biotechnology, The University of Tokyo, Japan.

Department of Biotechnology, The University of Tokyo, Japan; Collaborative Research Institute for Innovative Microbiology, The University of Tokyo, Japan.

Publication information

Comput Biol Chem. 2022 Oct;100:107744. doi: 10.1016/j.compbiolchem.2022.107744. Epub 2022 Jul 23.

Abstract

We developed a system that predicts protein binding sites for five mononucleotides (AMP, ADP, ATP, GDP, and GTP). The system comprises two machine learning (ML)-based predictors using a convolutional neural network and a gradient boosting machine, two template-based predictors based on sequence alignment and structure alignment, and a predictor that performs ensemble learning over these four predictors. We augmented the binding-site data with the binding sites of ligands that have similar structures. For example, when predicting ADP-binding sites with the ML methods, the binding sites of AMP and ATP, which are structurally similar to ADP, were also considered. In addition, we constructed structure models using AlphaFold2, a highly accurate protein structure prediction method, and used the secondary structure and dihedral angle information obtained from these models as features for the ML predictors. In the template-based predictor, the structures of known binding sites served as templates that were searched by structure alignment to identify the binding site of the target. The template-based predictor based on structure alignment performed best among the four individual predictors, and the ensemble predictor performed best overall, with an area under the curve of 0.958 for all mononucleotides.
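
The abstract does not state how the four predictors are combined or how the area under the curve (AUC) is computed, so the following is only a minimal sketch under stated assumptions: per-residue scores from four predictors are merged by a weighted average (a simple stand-in for the ensemble learner, not the authors' method), and the AUC is computed as the probability that a binding residue scores higher than a non-binding one. The function names `ensemble_scores` and `roc_auc` and all numbers are hypothetical toy values, not data from the study.

```python
from typing import List, Sequence


def ensemble_scores(per_predictor: Sequence[Sequence[float]],
                    weights: Sequence[float]) -> List[float]:
    """Weighted average of per-residue scores (one inner sequence per predictor).

    Assumption: a weighted mean stands in for the ensemble learner,
    whose actual form is not given in the abstract.
    """
    total = sum(weights)
    n_residues = len(per_predictor[0])
    return [
        sum(w * scores[i] for w, scores in zip(weights, per_predictor)) / total
        for i in range(n_residues)
    ]


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (binding, non-binding) residue pairs ranked
    correctly; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one residue of each class")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: six residues, 1 = binding residue, 0 = non-binding residue.
labels = [1, 0, 1, 0, 0, 1]
cnn = [0.9, 0.2, 0.4, 0.5, 0.1, 0.7]         # hypothetical CNN scores
gbm = [0.8, 0.3, 0.6, 0.4, 0.2, 0.3]         # hypothetical GBM scores
seq_tmpl = [1.0, 0.0, 0.0, 0.0, 0.0, 1.0]    # hypothetical sequence-template hits
str_tmpl = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]    # hypothetical structure-template hits

combined = ensemble_scores([cnn, gbm, seq_tmpl, str_tmpl], [1.0, 1.0, 1.0, 1.0])
print("CNN AUC:", roc_auc(labels, cnn))
print("Ensemble AUC:", roc_auc(labels, combined))
```

Equal weights are used here only for illustration; a trained ensemble would typically fit the combination on held-out predictions rather than fix it by hand.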
