DiffPaSS: high-performance differentiable pairing of protein sequences using soft scores

Authors

Lupo Umberto, Sgarbossa Damiano, Milighetti Martina, Bitbol Anne-Florence

Affiliations

Institute of Bioengineering, School of Life Sciences, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne CH-1015, Switzerland.

SIB Swiss Institute of Bioinformatics, Lausanne CH-1015, Switzerland.

Publication information

Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae738.

Abstract

MOTIVATION

Identifying interacting partners from two sets of protein sequences has important applications in computational biology. Interacting partners share similarities across species due to their common evolutionary history, and feature correlations in amino acid usage due to the need to maintain complementary interaction interfaces. Thus, the problem of finding interacting pairs can be formulated as searching for a pairing of sequences that maximizes a sequence similarity or a coevolution score. Several methods have been developed to address this problem, applying different approximate optimization methods to different scores.
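To make the formulation above concrete, here is a minimal sketch in Python that searches exhaustively for the pairing maximizing a mutual information score on toy data. The function names (`mutual_information`, `pairing_score`, `best_pairing`) and the toy sequences are illustrative assumptions, not part of any published method. The sketch also ignores the restriction of pairings to within-species candidates, and uses brute force, which is only feasible for a handful of sequences.

```python
from collections import Counter
from itertools import permutations
from math import log


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two equally long symbol lists."""
    n = len(col_a)
    joint = Counter(zip(col_a, col_b))
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    # sum over observed pairs of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def pairing_score(seqs_a, seqs_b, perm):
    """Sum of MI over all (column of A, column of B) pairs.

    perm[k] is the index of the sequence in seqs_b paired with seqs_a[k].
    """
    paired_b = [seqs_b[k] for k in perm]
    total = 0.0
    for i in range(len(seqs_a[0])):
        col_a = [s[i] for s in seqs_a]
        for j in range(len(seqs_b[0])):
            col_b = [s[j] for s in paired_b]
            total += mutual_information(col_a, col_b)
    return total


def best_pairing(seqs_a, seqs_b):
    """Exhaustive search over all permutations (factorial cost, toy sizes only)."""
    n = len(seqs_a)
    return max(
        permutations(range(n)),
        key=lambda p: pairing_score(seqs_a, seqs_b, p),
    )


# Hypothetical toy families: one aligned column each, with B shuffled.
toy_a = ["A", "A", "G", "G"]
toy_b = ["E", "K", "K", "E"]
toy_best = best_pairing(toy_a, toy_b)
```

Because the number of candidate pairings grows factorially with the number of sequences, exhaustive search of this kind does not scale, which is why approximate optimization methods are used in practice. Note also that several pairings can tie on such a score, as they do in this toy example.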

RESULTS

We introduce Differentiable Pairing using Soft Scores (DiffPaSS), a differentiable framework for flexible, fast, and hyperparameter-free optimization for pairing interacting biological sequences, which can be applied to a wide variety of scores. We apply it to a benchmark prokaryotic dataset, using mutual information and neighbor graph alignment scores. DiffPaSS outperforms existing algorithms for optimizing the same scores. We demonstrate the usefulness of our paired alignments for the prediction of protein complex structure. DiffPaSS does not require sequences to be aligned, and we also apply it to nonaligned sequences from T-cell receptors.
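As an illustration of what a differentiable relaxation of pairing can look like, the sketch below turns a matrix of real-valued parameters into a "soft permutation" by Sinkhorn normalization, then evaluates a soft score as a weighted sum. This is an assumption made for illustration: the abstract does not specify the relaxation used, the linear score is a simplification (mutual information is not linear in the pairing), and none of the names below (`sinkhorn`, `soft_score`, `harden`) come from the DiffPaSS package.

```python
from math import exp


def sinkhorn(logits, n_iters=100, temperature=1.0):
    """Map a square matrix of logits to an approximately doubly stochastic matrix.

    Rows and columns are normalized alternately; lower temperatures give
    matrices closer to a hard permutation.
    """
    n = len(logits)
    m = [[exp(x / temperature) for x in row] for row in logits]
    for _ in range(n_iters):
        # Row normalization: each row sums to 1.
        m = [[x / sum(row) for x in row] for row in m]
        # Column normalization: each column sums to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        m = [[m[i][j] / col_sums[j] for j in range(n)] for i in range(n)]
    return m


def soft_score(soft_perm, pair_scores):
    """Weighted sum of pair scores under a soft pairing (illustrative linear score)."""
    n = len(soft_perm)
    return sum(
        soft_perm[i][j] * pair_scores[i][j] for i in range(n) for j in range(n)
    )


def harden(soft_perm):
    """Pick the highest-weight column in each row (may not be a valid permutation in general)."""
    return [max(range(len(row)), key=lambda j: row[j]) for row in soft_perm]


# Hypothetical parameters for three sequences in each family.
example_logits = [
    [3.0, 0.5, 0.1],
    [0.2, 0.3, 2.5],
    [0.4, 2.0, 0.6],
]
example_soft = sinkhorn(example_logits)
example_hard = harden(example_soft)
```

Every operation here is a smooth function of the logits, so in an automatic differentiation framework such as PyTorch, gradients of a soft score could flow back to the logits and be used for gradient-based optimization. The row-wise `harden` step is a shortcut; a general-purpose implementation would typically solve a linear assignment problem to guarantee a valid permutation.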

AVAILABILITY AND IMPLEMENTATION

A PyTorch implementation and installable Python package are available at https://github.com/Bitbol-Lab/DiffPaSS.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d901/11676329/bf9613e48a0d/btae738f1.jpg
