Chromosome-scale, haplotype-resolved assembly of human genomes.

Affiliations

Department of Genetics, Harvard Medical School, Boston, MA, USA.

Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.

Publication information

Nat Biotechnol. 2021 Mar;39(3):309-312. doi: 10.1038/s41587-020-0711-0. Epub 2020 Dec 7.

Abstract

Haplotype-resolved or phased genome assembly provides a complete picture of genomes and their complex genetic variations. However, current algorithms for phased assembly either do not generate chromosome-scale phasing or require pedigree information, which limits their application. We present a method named diploid assembly (DipAsm) that uses long, accurate reads and long-range conformation data for single individuals to generate a chromosome-scale phased assembly within 1 day. Applied to four public human genomes, PGP1, HG002, NA12878 and HG00733, DipAsm produced haplotype-resolved assemblies with minimum contig length needed to cover 50% of the known genome (NG50) up to 25 Mb and phased ~99.5% of heterozygous sites at 98-99% accuracy, outperforming other approaches in terms of both contiguity and phasing completeness. We demonstrate the importance of chromosome-scale phased assemblies for the discovery of structural variants (SVs), including thousands of new transposon insertions, and of highly polymorphic and medically important regions such as the human leukocyte antigen (HLA) and killer cell immunoglobulin-like receptor (KIR) regions. DipAsm will facilitate high-quality precision medicine and studies of individual haplotype variation and population diversity.
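
The NG50 statistic quoted above can be illustrated with a short calculation. The following is a minimal sketch, not code from DipAsm: the function name `ng50` and the toy contig lengths are assumptions chosen for illustration. It sorts contigs from longest to shortest and reports the length of the contig at which the running total first reaches half of the known genome size.

```python
def ng50(contig_lengths, genome_size):
    """Return the NG50 of an assembly.

    NG50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least 50% of the known
    genome size. Returns 0 if the assembly never reaches that target.
    """
    target = genome_size / 2
    covered = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        covered += length
        if covered >= target:
            return length
    return 0


if __name__ == "__main__":
    # Hypothetical toy assembly (arbitrary units), not data from the paper.
    toy_contigs = [40, 30, 20, 10, 5]
    print(ng50(toy_contigs, genome_size=120))
```

Unlike N50, which uses the total assembly length as the denominator, NG50 uses the known genome size, so assemblies of the same genome can be compared even when their total lengths differ.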

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1fe2/7954703/e23da24c53fc/41587_2020_711_Fig1_HTML.jpg
