Toward a Kinh Vietnamese Reference Genome: Constructing a De Novo Genome Assembly Using Long-Read Sequencing and Optical Mapping.

Author information

Dung Le Thi, Lam Le Tung, Trang Nguyen Hong, Anh Nguyen Vu Hung, Nam Nguyen Ngoc, Nhung Doan Thi, Linh Tran Huyen, Giang Le Ngoc, Ha Hoang, Huy Nguyen Quang, Hai Truong Nam

Affiliations

Institute of Biology, Vietnam Academy of Science and Technology (VAST), Hanoi 10072, Vietnam.

Department of Life Sciences, University of Science and Technology of Hanoi (USTH), Vietnam Academy of Science and Technology (VAST), Hanoi 10072, Vietnam.

Publication information

Genes (Basel). 2025 Apr 29;16(5):536. doi: 10.3390/genes16050536.

Abstract

Population-specific reference genomes are essential for improving the accuracy and reliability of genomic analyses across diverse human populations. Although Vietnam is the 16th most populous country in the world, and more than 86% of its population identifies as Kinh, studies focusing specifically on a Kinh Vietnamese reference genome remain scarce. Constructing such a reference is therefore valuable for genetic research on the Vietnamese population. In this study, we combined PacBio long-read sequencing and Bionano optical mapping data to generate a de novo assembly of a Kinh Vietnamese genome (VHG), which was subsequently polished using multiple Kinh Vietnamese short-read whole-genome sequences (WGSs). The final assembly, named VHG1.2, comprised 3.22 gigabase pairs of high-quality sequence data and demonstrated high accuracy (QV: 48), completeness (BUSCO: 92%), and continuity (295 super scaffolds, super scaffold N50: 50 Mbp). Using multiple bioinformatic tools for variant calling, we observed significant variants when the population-specific reference VHG1.2 was used compared to the standard reference genome hg38. Overall, our genome assembly demonstrates the advantages of a long-read hybrid sequencing approach for de novo assembly and highlights the benefit of using population-specific reference genomes in population genomic analysis.
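
For readers unfamiliar with the quality metrics quoted above, the following is a minimal sketch of how an N50 value and a Phred-scaled QV are conventionally interpreted. It is not the authors' pipeline; the abstract does not say which tools produced these numbers. The function names `n50` and `qv_to_error_rate` and the toy scaffold lengths are illustrative assumptions, and QV is assumed here to follow the usual Phred convention.

```python
def n50(lengths):
    """Return the N50 of a list of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability.

    Assumes the standard Phred definition: QV = -10 * log10(error rate).
    """
    return 10 ** (-qv / 10)


# Toy scaffold lengths (hypothetical, NOT taken from VHG1.2).
toy_lengths = [10, 20, 30, 40]
print(n50(toy_lengths))       # half of the total (100) is reached at the 30-long scaffold
print(qv_to_error_rate(48))   # roughly 1.6e-5, i.e. about one error per 63,000 bases
```

Under this convention, a QV of 48 corresponds to roughly one expected base error per 63,000 bases. The N50 definition also implies a lower bound: for any assembly, N50 is at least the mean scaffold length, since scaffolds shorter than N50 cover less than half of the total.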
