
A unified approach to protein domain parsing with inter-residue distance matrix.

Affiliations

School of Mathematical Sciences, Nankai University, Tianjin 300071, China.

Ministry of Education Frontiers Science Center for Nonlinear Expectations, Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao 266237, China.

Publication information

Bioinformatics. 2023 Feb 3;39(2). doi: 10.1093/bioinformatics/btad070.

Abstract

MOTIVATION

It is fundamental to cut multi-domain proteins into individual domains, for precise domain-based structural and functional studies. In the past, sequence-based and structure-based domain parsing was carried out independently with different methodologies. The recent progress in deep learning-based protein structure prediction provides the opportunity to unify sequence-based and structure-based domain parsing.

RESULTS

Based on the inter-residue distance matrix, which can be either derived from the input structure or predicted by trRosettaX, we can decode the domain boundaries under a unified framework. We name the proposed method UniDoc. The principle of UniDoc is based on the well-accepted physical concept of maximizing intra-domain interaction while minimizing inter-domain interaction. Comprehensive tests on five benchmark datasets indicate that UniDoc outperforms other state-of-the-art methods in terms of both accuracy and speed, for both sequence-based and structure-based domain parsing. The major contribution of UniDoc is providing a unified framework for structure-based and sequence-based domain parsing. We hope that UniDoc would be a convenient tool for protein domain analysis.
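The idea of scoring a domain boundary from a distance matrix can be illustrated with a minimal sketch. This is not UniDoc's actual scoring function or search procedure, which the abstract does not specify. It assumes C-alpha coordinates as input, an 8 Å contact cutoff, a single cut into two contiguous domains, and a simple ratio score. All function names and parameters here are hypothetical.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between residues (coords: N x 3 array)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def contact_map(dist, cutoff=8.0, min_sep=3):
    """Boolean contacts: closer than `cutoff` and at least `min_sep` apart in sequence.
    Both thresholds are illustrative assumptions."""
    idx = np.arange(dist.shape[0])
    sep = np.abs(idx[:, None] - idx[None, :])
    return (dist < cutoff) & (sep >= min_sep)

def best_single_cut(contacts, min_len=2):
    """Scan every cut position k (domain A = residues [0, k), domain B = [k, N))
    and return the k that minimizes inter-domain contacts relative to
    intra-domain contacts. The score is an assumed stand-in, not the paper's."""
    c = contacts.astype(float)
    n = c.shape[0]
    best_k, best_score = None, np.inf
    for k in range(min_len, n - min_len + 1):
        intra_a = c[:k, :k].sum() / 2.0   # symmetric block, count each pair once
        intra_b = c[k:, k:].sum() / 2.0
        inter = c[:k, k:].sum()           # contacts crossing the boundary
        score = inter / np.sqrt((intra_a + 1.0) * (intra_b + 1.0))
        if score < best_score:
            best_k, best_score = k, score
    return best_k, best_score

# Toy input: two compact 6-residue clusters placed 50 Å apart.
rng = np.random.default_rng(0)
cluster_a = rng.uniform(-2.0, 2.0, size=(6, 3))
cluster_b = rng.uniform(-2.0, 2.0, size=(6, 3)) + np.array([50.0, 0.0, 0.0])
toy_coords = np.vstack([cluster_a, cluster_b])

toy_contacts = contact_map(distance_matrix(toy_coords))
cut, score = best_single_cut(toy_contacts)
print(cut, score)
```

Because `best_single_cut` only needs a contact map, the same scan would apply to a distance matrix predicted from sequence instead of one computed from coordinates, which is the sense in which a single framework can serve both inputs.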

AVAILABILITY AND IMPLEMENTATION

https://yanglab.nankai.edu.cn/UniDoc/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


Figure (image from the article): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e5ba/9919455/5e152e4a2510/btad070f1.jpg
