TUnA: an uncertainty-aware transformer model for sequence-based protein-protein interaction prediction.

Author information

Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA 92093-0359, United States.

Department of Cellular and Molecular Medicine, University of California, San Diego, La Jolla, CA 92093-0359, United States.

Publication information

Brief Bioinform. 2024 Jul 25;25(5). doi: 10.1093/bib/bbae359.

Abstract

Protein-protein interactions (PPIs) are important for many biological processes, but predicting them from sequence data remains challenging. Existing deep learning models often cannot generalize to proteins not present in the training set and do not provide uncertainty estimates for their predictions. To address these limitations, we present TUnA, a Transformer-based uncertainty-aware model for PPI prediction. TUnA uses ESM-2 embeddings with Transformer encoders and incorporates a Spectral-normalized Neural Gaussian Process. TUnA achieves state-of-the-art performance and, importantly, evaluates uncertainty for unseen sequences. We demonstrate that TUnA's uncertainty estimates can effectively identify the most reliable predictions, significantly reducing false positives. This capability is crucial in bridging the gap between computational predictions and experimental validation.
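The idea of using uncertainty estimates to keep only the most reliable predictions can be illustrated with a minimal sketch. This is not the TUnA implementation: the function names, the cutoff of 0.2, and all numeric values below are hypothetical and invented purely for illustration. It assumes each predicted protein pair comes with an interaction probability and a scalar uncertainty score.

```python
from typing import List, Optional


def filter_by_uncertainty(uncertainties: List[float], max_uncertainty: float) -> List[int]:
    """Return indices of predictions whose uncertainty is at or below the cutoff."""
    return [i for i, u in enumerate(uncertainties) if u <= max_uncertainty]


def precision(probs: List[float], labels: List[int], indices: List[int],
              threshold: float = 0.5) -> Optional[float]:
    """Precision of positive calls (prob >= threshold) among the given indices.

    Returns None when no pair in `indices` is called positive.
    """
    true_pos = 0
    false_pos = 0
    for i in indices:
        if probs[i] >= threshold:
            if labels[i] == 1:
                true_pos += 1
            else:
                false_pos += 1
    if true_pos + false_pos == 0:
        return None
    return true_pos / (true_pos + false_pos)


# Toy values, made up for illustration only (not results from the paper).
probs = [0.92, 0.81, 0.77, 0.64, 0.58, 0.30]          # predicted interaction probability
uncertainties = [0.05, 0.10, 0.45, 0.40, 0.08, 0.12]  # hypothetical uncertainty scores
labels = [1, 1, 0, 0, 1, 0]                           # 1 = true interaction

all_idx = list(range(len(probs)))
confident_idx = filter_by_uncertainty(uncertainties, max_uncertainty=0.2)

precision_all = precision(probs, labels, all_idx)
precision_confident = precision(probs, labels, confident_idx)
```

In this toy data the two false positives are also the two most uncertain predictions, so discarding them raises precision. Whether a real model's uncertainty tracks its errors this well has to be checked on held-out data.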

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c7f2/11269822/f0a2351ff8e2/bbae359f1.jpg
