Department of Genetics, Washington University School of Medicine, St. Louis, MO, USA.
Edison Family Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO, USA.
Nature. 2022 Apr;604(7906):437-446. doi: 10.1038/s41586-022-04601-8. Epub 2022 Apr 20.
The human reference genome is the most widely used resource in human genetics and is due for a major update. Its current structure is a linear composite of merged haplotypes from more than 20 people, with a single individual comprising most of the sequence. It contains biases and errors within a framework that does not represent global human genomic variation. A high-quality reference with global representation of common variants, including single-nucleotide variants, structural variants and functional elements, is needed. The Human Pangenome Reference Consortium aims to create a more sophisticated and complete human reference genome with a graph-based, telomere-to-telomere representation of global genomic diversity. Here we leverage innovations in technology, study design and global partnerships with the goal of constructing the highest-possible quality human pangenome reference. Our goal is to improve data representation and streamline analyses to enable routine assembly of complete diploid genomes. With attention to ethical frameworks, the human pangenome reference will contain a more accurate and diverse representation of global genomic variation, improve gene-disease association studies across populations, expand the scope of genomics research to the most repetitive and polymorphic regions of the genome, and serve as the ultimate genetic resource for future biomedical research and precision medicine.