Predictions from Deep Learning Propose Substantial Protein-Carbohydrate Interplay.

Author information

Canner Samuel W, Schnaar Ronald L, Gray Jeffrey J

Affiliations

Program in Molecular Biophysics, Johns Hopkins University, Baltimore, MD, United States.

Department of Pharmacology and Molecular Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States.

Publication information

bioRxiv. 2025 Mar 15:2025.03.07.641884. doi: 10.1101/2025.03.07.641884.

Abstract

It is a grand challenge to identify all the protein-carbohydrate interactions in an organism. Direct experiments would require extensive glycan libraries to definitively distinguish binding from non-binding proteins. Computational screening of proteins for carbohydrate binding provides an attractive and ultimately testable alternative. Recent computational techniques have focused primarily on which protein residues interact with carbohydrates or which carbohydrate species a protein binds. Current estimates label 1.5 to 5% of proteins as carbohydrate-binding proteins; however, 50-70% of proteins are known to be glycosylated, suggesting a potential wealth of proteins that bind to carbohydrates. We therefore developed a novel dataset and neural network architecture, named Protein interaction of Carbohydrates Predictor (PiCAP), to predict whether a protein non-covalently binds to a carbohydrate. We trained PiCAP on a dataset of known carbohydrate binders together with proteins that we identified as likely not to bind carbohydrates, including DNA-binding transcription factors, cytoskeletal components, selected antibodies, and selected small-molecule-binding proteins. PiCAP achieves a balanced accuracy of 90% on protein-level predictions of carbohydrate binding versus non-binding. Using the same dataset, we developed a model named Carbohydrate Protein Site Identifier 2 (CAPSIF2) to predict protein residues that interact non-covalently with carbohydrates. CAPSIF2 achieves a Dice coefficient of 0.57 on residue-level predictions on our independent test dataset, outperforming all previous models for this task. To demonstrate the biological applicability of PiCAP and CAPSIF2, we investigated cell surface proteins of human neural cells and further predicted the likelihood that the proteins of three proteomes bind carbohydrates [species names missing in the source]. PiCAP predicts that approximately 35-40% of proteins in these proteomes bind carbohydrates, indicating substantial protein-carbohydrate interplay in cellular function.
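The two metrics quoted in the abstract have standard definitions, and a minimal sketch may help in reading the reported numbers. Balanced accuracy is the mean of sensitivity and specificity, which is why it is a natural choice when binders and non-binders are unevenly represented. The Dice coefficient is 2·TP / (2·TP + FP + FN), computed here over per-residue binary labels. This is not the authors' evaluation code: the function names and the toy labels are assumptions made purely for illustration.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = binder)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)  # fraction of true binders recovered
    specificity = tn / (tn + fp)  # fraction of true non-binders recovered
    return (sensitivity + specificity) / 2


def dice_coefficient(true_mask, pred_mask):
    """Dice overlap between two binary per-residue masks (1 = contact)."""
    overlap = sum(1 for t, p in zip(true_mask, pred_mask) if t == 1 and p == 1)
    total = sum(true_mask) + sum(pred_mask)
    # Convention assumed here: two empty masks count as a perfect match.
    return 2 * overlap / total if total else 1.0


# Toy protein-level labels (hypothetical): 4 binders, 2 non-binders.
protein_true = [1, 1, 1, 1, 0, 0]
protein_pred = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(protein_true, protein_pred))

# Toy residue-level masks (hypothetical) for a 6-residue protein.
residue_true = [1, 1, 0, 0, 1, 0]
residue_pred = [1, 0, 0, 1, 1, 0]
print(dice_coefficient(residue_true, residue_pred))
```

In the toy protein-level example, sensitivity is 3/4 and specificity is 1/2, so the balanced accuracy is 0.625 even though plain accuracy would be 4/6. The residue-level example has two shared contacts out of three labeled and three predicted, giving a Dice coefficient of 2/3.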


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7876/11952328/3d33184b3633/nihpp-2025.03.07.641884v2-f0001.jpg
