Powerful sequence similarity search methods and in-depth manual analyses can identify remote homologs in many apparently "orphan" viral proteins.

Author information

Kuchibhatla Durga B, Sherman Westley A, Chung Betty Y W, Cook Shelley, Schneider Georg, Eisenhaber Birgit, Karlin David G

Affiliations

Bioinformatics Institute (BII), A*STAR (Agency for Science, Technology and Research), Matrix, Singapore.

Publication information

J Virol. 2014 Jan;88(1):10-20. doi: 10.1128/JVI.02595-13. Epub 2013 Oct 23.

Abstract

The genome sequences of new viruses often contain many "orphan" or "taxon-specific" proteins apparently lacking homologs. However, because viral proteins evolve very fast, commonly used sequence similarity detection methods such as BLAST may overlook homologs. We analyzed a data set of proteins from RNA viruses characterized as "genus specific" by BLAST. More powerful methods developed recently, such as HHblits or HHpred (available through web-based, user-friendly interfaces), could detect distant homologs of a quarter of these proteins, suggesting that these methods should be used to annotate viral genomes. In-depth manual analyses of a subset of the remaining sequences, guided by contextual information such as taxonomy, gene order, or domain cooccurrence, identified distant homologs of another third. Thus, a combination of powerful automated methods and manual analyses can uncover distant homologs of many proteins thought to be orphans. We expect these methodological results to be also applicable to cellular organisms, since they generally evolve much more slowly than RNA viruses. As an application, we reanalyzed the genome of a bee pathogen, Chronic bee paralysis virus (CBPV). We could identify homologs of most of its proteins thought to be orphans; in each case, identifying homologs provided functional clues. We discovered that CBPV encodes a domain homologous to the Alphavirus methyltransferase-guanylyltransferase; a putative membrane protein, SP24, with homologs in unrelated insect viruses and insect-transmitted plant viruses having different morphologies (cileviruses, higreviruses, blunerviruses, negeviruses); and a putative virion glycoprotein, ORF2, also found in negeviruses. SP24 and ORF2 are probably major structural components of the virions.
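The first step described above, flagging proteins as "genus specific" from BLAST results, can be illustrated with a minimal sketch. This is not the authors' pipeline: the E-value cutoff of 1e-3, the `(query, subject, evalue)` tuple layout, the function name `genus_specific`, and the protein and genus identifiers are all assumptions made for illustration.

```python
from collections import defaultdict


def genus_specific(queries, hits, genus_of, evalue_cutoff=1e-3):
    """Return the queries that have no significant hit outside their own genus.

    hits: iterable of (query_id, subject_id, evalue) tuples, e.g. parsed
          from tabular BLAST output (layout assumed for this sketch).
    genus_of: dict mapping each protein id to its genus name.
    evalue_cutoff: hypothetical significance threshold, not from the paper.
    """
    foreign = defaultdict(set)  # query id -> other genera with significant hits
    for query, subject, evalue in hits:
        if evalue > evalue_cutoff:
            continue  # ignore non-significant hits
        genus = genus_of.get(subject)
        if genus is not None and genus != genus_of[query]:
            foreign[query].add(genus)
    return {q for q in queries if not foreign[q]}


# Hypothetical toy data: four proteins from two genera.
genus_of = {"p1": "GenusA", "p2": "GenusA", "p3": "GenusB", "p4": "GenusB"}
hits = [
    ("p1", "p2", 1e-30),  # significant, same genus
    ("p1", "p3", 5.0),    # not significant, ignored
    ("p3", "p1", 1e-8),   # significant, different genus
    ("p3", "p4", 1e-50),  # significant, same genus
]
orphans = genus_specific(["p1", "p3"], hits, genus_of)
print(orphans)
```

Proteins flagged this way are the candidates that, according to the abstract, merit a second look with more sensitive methods such as HHblits or HHpred.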

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8c79/3911697/09dab330388a/zjv9990984680001.jpg
