Yoo DongAhn, Rhie Arang, Hebbar Prajna, Antonacci Francesca, Logsdon Glennis A, Solar Steven J, Antipov Dmitry, Pickett Brandon D, Safonova Yana, Montinaro Francesco, Luo Yanting, Malukiewicz Joanna, Storer Jessica M, Lin Jiadong, Sequeira Abigail N, Mangan Riley J, Hickey Glenn, Anez Graciela Monfort, Balachandran Parithi, Bankevich Anton, Beck Christine R, Biddanda Arjun, Borchers Matthew, Bouffard Gerard G, Brannan Emry, Brooks Shelise Y, Carbone Lucia, Carrel Laura, Chan Agnes P, Crawford Juyun, Diekhans Mark, Engelbrecht Eric, Feschotte Cedric, Formenti Giulio, Garcia Gage H, de Gennaro Luciana, Gilbert David, Green Richard E, Guarracino Andrea, Gupta Ishaan, Haddad Diana, Han Junmin, Harris Robert S, Hartley Gabrielle A, Harvey William T, Hiller Michael, Hoekzema Kendra, Houck Marlys L, Jeong Hyeonsoo, Kamali Kaivan, Kellis Manolis, Kille Bryce, Lee Chul, Lee Youngho, Lees William, Lewis Alexandra P, Li Qiuhui, Loftus Mark, Loh Yong Hwee Eddie, Loucks Hailey, Ma Jian, Mao Yafei, Martinez Juan F I, Masterson Patrick, McCoy Rajiv C, McGrath Barbara, McKinney Sean, Meyer Britta S, Miga Karen H, Mohanty Saswat K, Munson Katherine M, Pal Karol, Pennell Matt, Pevzner Pavel A, Porubsky David, Potapova Tamara, Ringeling Francisca R, Roha Joana L, Ryder Oliver A, Sacco Samuel, Saha Swati, Sasaki Takayo, Schatz Michael C, Schork Nicholas J, Shanks Cole, Smeds Linnéa, Son Dongmin R, Steiner Cynthia, Sweeten Alexander P, Tassia Michael G, Thibaud-Nissen Françoise, Torres-González Edmundo, Trivedi Mihir, Wei Wenjie, Wertz Julie, Yang Muyu, Zhang Panpan, Zhang Shilong, Zhang Yang, Zhang Zhenmiao, Zhao Sarah A, Zhu Yixin, Jarvis Erich D, Gerton Jennifer L, Rivas-González Iker, Paten Benedict, Szpiech Zachary A, Huber Christian D, Lenz Tobias L, Konkel Miriam K, Yi Soojin V, Canzar Stefan, Watson Corey T, Sudmant Peter H, Molloy Erin, Garrison Erik, Lowe Craig B, Ventura Mario, O'Neill Rachel J, Koren Sergey, Makova Kateryna D, Phillippy Adam M, Eichler Evan E
Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA.
Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA.
bioRxiv. 2024 Oct 5:2024.07.31.605654. doi: 10.1101/2024.07.31.605654.
We present haplotype-resolved reference genomes and comparative analyses of six ape species: chimpanzee, bonobo, gorilla, Bornean orangutan, Sumatran orangutan, and siamang. We achieve chromosome-level contiguity with unparalleled sequence accuracy (<1 error in 500,000 base pairs), completely sequencing 215 gapless chromosomes telomere-to-telomere. We resolve challenging regions, such as the major histocompatibility complex and immunoglobulin loci, providing more in-depth evolutionary insights. Comparative analyses that include human allow us to investigate the evolution and diversity of regions previously uncharacterized or incompletely studied, without bias from mapping to the human reference. These include newly minted gene families within lineage-specific segmental duplications, centromeric DNA, acrocentric chromosomes, and subterminal heterochromatin. This resource should serve as a definitive baseline for all future evolutionary studies of humans and our closest living ape relatives.