Ma Zhanshan Sam
Computational Biology and Medical Ecology Lab, State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China.
Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Beijing, China.
J Med Virol. 2023 Apr;95(4):e28682. doi: 10.1002/jmv.28682.
The human virome, the viral communities distributed on or in our body, is estimated to contain about 380 trillion viruses (individuals) and has far-reaching influences on our health and diseases. The sheer number of viruses alone makes comparing two or more viromes extremely challenging. In fact, the theory of computation indicates that such comparisons, as so-termed NP-hard problems, become intractable when the virome is sufficiently large, even on the fastest supercomputers. In practice, one has to develop heuristic and approximate algorithms to obtain satisfactory solutions to NP-hard problems. Here, we extend the species-specificity and specificity-diversity framework to develop a method for virome comparison (VC). The VC method consists of a pair of metrics, virus species specificity (VS) and virome specificity diversity (VSD), and a corresponding pair of random search algorithms. Specifically, VS and the VS permutation (VSP) test can detect unique virus species (US) or enriched virus species (ES) in each virome (treatment), and VSD and the VSD permutation (VSDP) test can further determine holistic differences between two viromes or their subsets (assemblages of viruses). Tests with four virome data sets demonstrated that the VC method is effective, efficient, and robust.
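To make the idea of a specificity permutation test concrete, the sketch below scores one virus species against two groups of samples and estimates a p-value by random relabelling. The abstract does not give the VS formula, so the code assumes an indicator-value style definition (relative mean abundance multiplied by prevalence). The function names `specificity` and `permutation_p_value`, the one-sided test, and the sample data are illustrative assumptions, not the published VSP algorithm.

```python
import random


def specificity(group_a, group_b):
    """Assumed specificity of one species to group A versus group B.

    group_a, group_b: abundances of a single species, one value per sample.
    Returns (relative mean abundance in A) * (prevalence in A), in [0, 1].
    This definition is an assumption; the abstract does not state the formula.
    """
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    total = mean_a + mean_b
    if total == 0:
        return 0.0  # species absent from every sample
    prevalence_a = sum(1 for x in group_a if x > 0) / len(group_a)
    return (mean_a / total) * prevalence_a


def permutation_p_value(group_a, group_b, n_perm=1000, seed=0):
    """One-sided p-value for the specificity difference (A minus B)."""
    rng = random.Random(seed)
    observed = specificity(group_a, group_b) - specificity(group_b, group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Randomly reassign samples to the two groups.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = specificity(perm_a, perm_b) - specificity(perm_b, perm_a)
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical abundances of one virus species in two treatments.
healthy = [12, 9, 15, 11, 8]
disease = [0, 1, 0, 0, 2]
p_value = permutation_p_value(healthy, disease)
```

Under these assumptions, a small `p_value` would flag the species as enriched in the first group, and a species with zero abundance in the other group would be a candidate unique species. Random relabelling samples only a subset of all possible group assignments, which is what keeps the test affordable when exhaustive enumeration is not.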