Assembly of a pan-genome from deep sequencing of 910 humans of African descent.

Author information

Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD, USA.

Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA.

Publication information

Nat Genet. 2019 Jan;51(1):30-35. doi: 10.1038/s41588-018-0273-y. Epub 2018 Nov 19.

Abstract

We used a deeply sequenced dataset of 910 individuals, all of African descent, to construct a set of DNA sequences that is present in these individuals but missing from the reference human genome. We aligned 1.19 trillion reads from the 910 individuals to the reference genome (GRCh38), collected all reads that failed to align, and assembled these reads into contiguous sequences (contigs). We then compared all contigs to one another to identify a set of unique sequences representing regions of the African pan-genome missing from the reference genome. Our analysis revealed 296,485,284 bp in 125,715 distinct contigs present in the populations of African descent, demonstrating that the African pan-genome contains ~10% more DNA than the current human reference genome. Although the functional significance of nearly all of this sequence is unknown, 387 of the novel contigs fall within 315 distinct protein-coding genes, and the rest appear to be intergenic.
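The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the reference genome length of roughly 3.1 billion bp is an assumption not stated in the abstract, and the names `ASSUMED_REFERENCE_BP` and `summarize` are hypothetical.

```python
# Figures quoted in the abstract.
NOVEL_BP = 296_485_284      # total novel sequence, in base pairs
NOVEL_CONTIGS = 125_715     # number of distinct contigs
TOTAL_READS = 1.19e12       # reads aligned to GRCh38
INDIVIDUALS = 910           # individuals sequenced

# Assumption (not from the abstract): approximate length of the
# human reference genome, in base pairs.
ASSUMED_REFERENCE_BP = 3.1e9


def summarize(novel_bp, contigs, reads, individuals, reference_bp):
    """Derive simple summary statistics from the reported totals."""
    return {
        # Average length of a novel contig.
        "mean_contig_bp": novel_bp / contigs,
        # Novel sequence as a fraction of the assumed reference length.
        "novel_fraction": novel_bp / reference_bp,
        # Average number of reads per sequenced individual.
        "reads_per_individual": reads / individuals,
    }


stats = summarize(
    NOVEL_BP, NOVEL_CONTIGS, TOTAL_READS, INDIVIDUALS, ASSUMED_REFERENCE_BP
)

if __name__ == "__main__":
    print(f"Mean contig length:   {stats['mean_contig_bp']:.0f} bp")
    print(f"Novel / reference:    {stats['novel_fraction']:.1%}")
    print(f"Reads per individual: {stats['reads_per_individual']:.2e}")
```

Under that assumed reference length, the novel sequence comes to just under 10% of the reference, in line with the "~10% more DNA" statement, and the contigs average a little over 2 kb each.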
