Jiao Du, Dong Xiaorui, Fan Shiyu, Liu Xinyi, Yu Yingyan, Wei Chaochun
Department of Bioinformatics and Biostatistics, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China.
Department of General Surgery of Ruijin Hospital, Shanghai Institute of Digestive Surgery, and Shanghai Key Laboratory for Gastric Neoplasms, Shanghai Jiao Tong University School of Medicine, Shanghai, China
Life Sci Alliance. 2025 Jan 27;8(4). doi: 10.26508/lsa.202402977. Print 2025 Apr.
A pangenome is the sum of the genetic information of all individuals in a species or a population. Genomics research has gradually shifted to a paradigm that uses a pangenome as the reference. In disease genomics, however, pangenome-based analysis is still in its infancy. In this study, we introduce GGCPan, a graph-based pangenome built from 185 patients with gastric cancer. We then systematically compared cancer genomics results obtained with three references: GGCPan, the linear pangenome GCPan, and the human reference genome. For small variant detection and microsatellite instability status identification, the three references gave little difference in results. For structural variant identification, using GGCPan as the reference had a significant advantage. A total of 24 candidate gastric cancer driver genes were detected across the three references, of which eight were common to all three and five were detected only with the pangenomes. Our results show that a disease-specific pangenome is promising as a reference, and that a whole set of tools still needs to be developed or improved for disease genomics in the pangenome era.