DiffPaSS-high-performance differentiable pairing of protein sequences using soft scores.

Authors

Lupo Umberto, Sgarbossa Damiano, Milighetti Martina, Bitbol Anne-Florence

Affiliations

Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland.

SIB Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland.

Publication information

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae738.

Abstract

MOTIVATION

Identifying interacting partners from two sets of protein sequences has important applications in computational biology. Interacting partners share similarities across species due to their common evolutionary history, and feature correlations in amino acid usage due to the need to maintain complementary interaction interfaces. Thus, the problem of finding interacting pairs can be formulated as searching for a pairing of sequences that maximizes a sequence similarity or a coevolution score. Several methods have been developed to address this problem, applying different approximate optimization methods to different scores.
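To make the formulation above concrete, here is a minimal sketch of pairing as score maximization. It assumes a toy setting that is not taken from the paper: two tiny aligned families, a plug-in mutual information score summed over all inter-family column pairs, and an exhaustive search over permutations. The function names (`mutual_information`, `pairing_score`, `best_pairing`) and the toy sequences are hypothetical, and exhaustive search is only feasible for a handful of sequences, which is why approximate optimization methods are needed in practice.

```python
from collections import Counter
from itertools import permutations
from math import log


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two equal-length symbol lists."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # sum over observed pairs of p(a,b) * log(p(a,b) / (p(a) * p(b)))
    return sum((c / n) * log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items())


def pairing_score(seqs_a, seqs_b, perm):
    """Sum of inter-family column MI when seqs_a[i] is paired with seqs_b[perm[i]]."""
    paired_b = [seqs_b[j] for j in perm]
    cols_a = [[s[i] for s in seqs_a] for i in range(len(seqs_a[0]))]
    cols_b = [[t[j] for t in paired_b] for j in range(len(seqs_b[0]))]
    return sum(mutual_information(ca, cb) for ca in cols_a for cb in cols_b)


def best_pairing(seqs_a, seqs_b):
    """Exhaustive search: only feasible for very small sets (n! candidates)."""
    candidates = permutations(range(len(seqs_b)))
    return max(candidates, key=lambda p: pairing_score(seqs_a, seqs_b, p))


# Toy data (hypothetical): the given order of seqs_b carries no signal.
seqs_a = ["AK", "AK", "GE", "GE"]
seqs_b = ["LD", "VR", "LD", "VR"]

perm = best_pairing(seqs_a, seqs_b)
print("pairing:", perm)
print("score of identity pairing:", pairing_score(seqs_a, seqs_b, (0, 1, 2, 3)))
print("score of best pairing:", pairing_score(seqs_a, seqs_b, perm))
```

In realistic settings the pairing is usually restricted to sequences from the same species, which this sketch omits for brevity.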

RESULTS

We introduce Differentiable Pairing using Soft Scores (DiffPaSS), a differentiable framework for flexible, fast, and hyperparameter-free optimization for pairing interacting biological sequences, which can be applied to a wide variety of scores. We apply it to a benchmark prokaryotic dataset, using mutual information and neighbor graph alignment scores. DiffPaSS outperforms existing algorithms for optimizing the same scores. We demonstrate the usefulness of our paired alignments for the prediction of protein complex structure. DiffPaSS does not require sequences to be aligned, and we also apply it to nonaligned sequences from T-cell receptors.
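The abstract does not say how pairings are made differentiable. One common way to relax a permutation into a "soft" one is the Sinkhorn operator, which alternately normalizes the rows and columns of a positive matrix until it is approximately doubly stochastic. The sketch below illustrates that general idea in plain Python and should not be read as the DiffPaSS implementation: the function names, the `temperature` parameter, the toy `logits`, and the linear `soft_score` are all assumptions made for illustration.

```python
from math import exp, log


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + log(sum(exp(v - m) for v in values))


def sinkhorn(logits, temperature=1.0, n_iter=50):
    """Return an approximately doubly stochastic matrix built from square `logits`."""
    n = len(logits)
    log_p = [[x / temperature for x in row] for row in logits]
    for _ in range(n_iter):
        # Row normalization in log space.
        normalized = []
        for row in log_p:
            lse = logsumexp(row)
            normalized.append([x - lse for x in row])
        # Column normalization in log space.
        col_lse = [logsumexp([normalized[i][j] for i in range(n)]) for j in range(n)]
        log_p = [[normalized[i][j] - col_lse[j] for j in range(n)] for i in range(n)]
    return [[exp(x) for x in row] for row in log_p]


def soft_score(soft_perm, pair_scores):
    """Hypothetical linear score: expected pairwise score under the soft pairing."""
    return sum(p * s for p_row, s_row in zip(soft_perm, pair_scores)
               for p, s in zip(p_row, s_row))


# Toy parameters (hypothetical): entry [i][j] favors pairing item i with item j.
logits = [[2.0, 0.0, 0.0],
          [0.0, 0.0, 3.0],
          [0.0, 1.0, 0.0]]

soft = sinkhorn(logits, temperature=1.0)    # spread-out soft pairing
sharp = sinkhorn(logits, temperature=0.1)   # close to a hard permutation
hard = [max(range(3), key=lambda j: row[j]) for row in sharp]
print("hard pairing read off the sharp matrix:", hard)
print("soft score at low temperature:", soft_score(sharp, logits))
```

Because every step is a smooth function of `logits`, a score evaluated on the soft matrix can in principle be optimized by gradient ascent with an automatic differentiation library such as PyTorch, with a hard pairing read off at the end.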

AVAILABILITY AND IMPLEMENTATION

A PyTorch implementation and installable Python package are available at https://github.com/Bitbol-Lab/DiffPaSS.
