Yan Qiulong, Huang Liansha, Li Shenghui, Zhang Yue, Guo Ruochun, Zhang Pan, Lei Zhixin, Lv Qingbo, Chen Fang, Li Zhiming, Meng Jinxin, Li Jing, Wang Guangyang, Chen Changming, Ullah Hayan, Cheng Lin, Fan Shao, You Wei, Zhang Yan, Ma Jie, Sha Shanshan, Sun Wen
The Fifth Affiliated Hospital of Southern Medical University, Guangzhou, 510900, China.
Department of Microbiology, Department of Biochemistry and Molecular Biology, College of Basic Medical Sciences, Dalian Medical University, Dalian, 116044, China.
Genome Med. 2025 Mar 26;17(1):30. doi: 10.1186/s13073-025-01460-6.
The gut viral community has been increasingly recognized for its role in human physiology and health; however, our understanding of its genetic makeup, functional potential, and disease associations remains incomplete.
In this study, we collected 11,286 bulk or viral metagenomes from fecal samples across large-scale Chinese populations to establish a Chinese Gut Virus Catalogue (cnGVC) using a de novo virus identification approach. We then examined the diversity and compositional patterns of the gut virome in relation to common diseases by analyzing 6311 bulk metagenomes representing 28 disease or unhealthy states.
The cnGVC contains 93,462 nonredundant viral genomes, with over 70% of these being novel viruses not included in existing gut viral databases. This resource enabled us to characterize the functional diversity and specificity of the gut virome. Using cnGVC, we profiled the gut virome in large-scale populations, assessed sex- and age-related variations, and identified 4238 universal viral signatures of diseases. A random forest classifier based on these signatures distinguished diseased individuals from controls (AUC = 0.698) and high-risk patients from controls (AUC = 0.761), and its predictive ability was also validated in external cohorts.
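To make the reported AUC values easier to interpret, the following minimal sketch computes AUC from its rank-based definition: the probability that a randomly chosen case receives a higher classifier score than a randomly chosen control, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations, not the study's data or code.

```python
def roc_auc(labels, scores):
    """Return the AUC as the fraction of (case, control) pairs in which
    the case has the higher score; tied pairs count as 0.5."""
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data (1 = diseased, 0 = control); not from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")  # 7 of 9 pairs ranked correctly -> AUC = 0.778
```

Under this reading, an AUC of 0.698 means that in roughly 70% of case-control pairs the classifier scores the diseased individual higher, whereas 0.5 corresponds to chance and 1.0 to perfect separation.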
Our resources and findings significantly expand the current understanding of the human gut virome and provide a comprehensive view of the associations between gut viruses and common diseases. This will pave the way for novel strategies in the treatment and prevention of these diseases.