Integrating Sequence- and Structure-Based Similarity Metrics for the Demarcation of Multiple Viral Taxonomic Levels.

Author information

Dos Santos Igor C, de Souza Rebecca di Stephano, Tolstoy Igor, Oliveira Liliane S, Gruber Arthur

Affiliations

Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, São Paulo 03828-000, Brazil.

Instituto de Biociências, Universidade de São Paulo, São Paulo 03828-000, Brazil.

Publication information

Viruses. 2025 Apr 29;17(5):642. doi: 10.3390/v17050642.

Abstract

Viruses exhibit significantly greater diversity than cellular organisms, posing a complex challenge to their taxonomic classification. While primary sequences may diverge considerably, protein functional domains can maintain conserved 3D structures throughout evolution. Consequently, structural homology of viral proteins can reveal deep taxonomic relationships, overcoming limitations inherent in sequence-based methods. In this work, we introduce MPACT (Multimetric Pairwise Comparison Tool), an integrated tool that utilizes both sequence- and structure-based metrics. The program incorporates five metrics: sequence identity, similarity, maximum likelihood distance, TM-score, and 3Di-character similarity. MPACT generates heatmaps and distance trees to visualize viral relationships across multiple levels, enabling users to substantiate viral taxa demarcation. Taxa delineation can be achieved by specifying appropriate score cutoffs for each metric, facilitating the definition of viral groups, and storing their corresponding sequence data. By analyzing diverse viral datasets spanning various levels of divergence, we demonstrate MPACT's capability to reveal viral relationships, even among distantly related taxa. This tool provides a comprehensive approach to assist viral classification, exceeding the current methods by integrating multiple metrics and uncovering deeper evolutionary connections.
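The cutoff-based delineation described in the abstract can be illustrated with a minimal sketch. This is not MPACT's code or interface: the function names `pairwise_identity` and `demarcate`, the toy sequences, the 0.75 cutoff, and the single-linkage grouping rule are all assumptions made for illustration, and only one of the five metrics (sequence identity on pre-aligned sequences) is shown.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return sum(x == y for x, y in compared) / len(compared)


def demarcate(aligned: dict[str, str], cutoff: float) -> list[set[str]]:
    """Group sequences by single linkage: any pair scoring >= cutoff is merged.

    Single linkage is an assumed grouping rule, chosen here for brevity.
    """
    parent = {name: name for name in aligned}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for p, q in combinations(aligned, 2):
        if pairwise_identity(aligned[p], aligned[q]) >= cutoff:
            parent[find(p)] = find(q)

    groups: dict[str, set[str]] = {}
    for name in aligned:
        groups.setdefault(find(name), set()).add(name)
    return list(groups.values())


# Hypothetical, pre-aligned toy sequences (not real viral proteins).
toy = {
    "virusA": "MKTAYIAKQR",
    "virusB": "MKTAYIAKQK",
    "virusC": "MKSAYLAKQK",
    "virusD": "GHW-PLLNDE",
}
groups = demarcate(toy, cutoff=0.75)
```

With these toy inputs, virusA and virusC score below the cutoff against each other but are still placed in one group because each is linked to virusB, while virusD remains alone. A stricter rule, such as requiring every pair in a group to pass the cutoff, would split the groups differently, which is why the choice of cutoff and linkage rule matters for taxa demarcation.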
