Sequencing and de novo assembly of 150 genomes from Denmark as a population reference.

Affiliations

Bioinformatics Centre, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark.

Bioinformatics Research Centre, Aarhus University, 8000 Aarhus, Denmark.

Publication information

Nature. 2017 Aug 3;548(7665):87-91. doi: 10.1038/nature23264. Epub 2017 Jul 26.

Abstract

Hundreds of thousands of human genomes are now being sequenced to characterize genetic variation and use this information to augment association mapping studies of complex disorders and other phenotypic traits. Genetic variation is identified mainly by mapping short reads to the reference genome or by performing local assembly. However, these approaches are biased against discovery of structural variants and variation in the more complex parts of the genome. Hence, large-scale de novo assembly is needed. Here we show that it is possible to construct excellent de novo assemblies from high-coverage sequencing with mate-pair libraries extending up to 20 kilobases. We report de novo assemblies of 150 individuals (50 trios) from the GenomeDenmark project. The quality of these assemblies is similar to those obtained using the more expensive long-read technology. We use the assemblies to identify a rich set of structural variants including many novel insertions and demonstrate how this variant catalogue enables further deciphering of known association mapping signals. We leverage the assemblies to provide 100 completely resolved major histocompatibility complex haplotypes and to resolve major parts of the Y chromosome. Our study provides a regional reference genome that we expect will improve the power of future association mapping studies and hence pave the way for precision medicine initiatives, which now are being launched in many countries including Denmark.
