Jiang Jing-Zhe, Yuan Wen-Guang, Shang Jiayu, Shi Ying-Hui, Yang Li-Ling, Liu Min, Zhu Peng, Jin Tao, Sun Yanni, Yuan Li-Hong
Key Laboratory of South China Sea Fishery Resources Exploitation & Utilization, Ministry of Agriculture and Rural Affairs, South China Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Guangzhou 510300, Guangdong, China.
Guangdong Province Key Laboratory for Biotechnology Drug Candidates, School of Biosciences and Biopharmaceutics, Guangdong Pharmaceutical University, Guangzhou 510006, Guangdong, China.
Brief Bioinform. 2023 Jan 19;24(1). doi: 10.1093/bib/bbac505.
Viruses are the most ubiquitous and diverse entities in the biome. Because the number of newly identified viruses is growing rapidly, there is an urgent need for accurate and comprehensive virus classification, particularly for novel viruses. Here, we present PhaGCN2, which rapidly classifies viral sequences at the family level and supports visualization of the associations among all families. We evaluate the performance of PhaGCN2 and compare it with state-of-the-art virus classification tools, such as vConTACT2, CAT and VPF-Class, using widely accepted metrics. The results show that PhaGCN2 substantially improves the precision and recall of virus classification, increases the number of classifiable virus sequences in the Global Ocean Virome dataset (v2.0) fourfold and classifies more than 90% of the Gut Phage Database. PhaGCN2 makes high-throughput, automatic expansion of the International Committee on Taxonomy of Viruses database possible. The source code is freely available at https://github.com/KennthShang/PhaGCN2.0.
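As an illustration of the precision and recall metrics named above, the following is a minimal sketch of how family-level predictions might be scored when a tool is allowed to leave some sequences unclassified. The abstract does not give the exact definitions used in the evaluation, so the convention here (precision over classified sequences, recall over all sequences) is an assumption, and the function name `family_precision_recall` and the toy family labels are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def family_precision_recall(
    true_families: Sequence[str],
    predicted_families: Sequence[Optional[str]],
) -> Tuple[float, float]:
    """Score family-level predictions; None marks an unclassified sequence.

    Assumed convention (not taken from the paper):
      precision = correct predictions / sequences that received a prediction
      recall    = correct predictions / all sequences
    """
    if len(true_families) != len(predicted_families):
        raise ValueError("label lists must have the same length")

    # Count sequences that were assigned any family at all.
    classified = sum(1 for p in predicted_families if p is not None)
    # Count sequences whose assigned family matches the reference family.
    correct = sum(
        1
        for t, p in zip(true_families, predicted_families)
        if p is not None and p == t
    )
    total = len(true_families)

    precision = correct / classified if classified else 0.0
    recall = correct / total if total else 0.0
    return precision, recall


# Toy labels, purely illustrative: four sequences, one left unclassified.
truth = ["FamA", "FamA", "FamB", "FamC"]
preds = ["FamA", "FamB", None, "FamC"]
precision, recall = family_precision_recall(truth, preds)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Under this convention, a tool that classifies more sequences correctly raises recall, while precision reflects only the sequences it chose to label, which is why the two are reported together.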