deCODE Genetics/Amgen, Inc., Reykjavik, Iceland.
School of Engineering and Natural Sciences, University of Iceland, Reykjavik, Iceland.
Nat Genet. 2017 Nov;49(11):1654-1660. doi: 10.1038/ng.3964. Epub 2017 Sep 25.
A fundamental requirement for genetic studies is an accurate determination of sequence variation. While human genome sequence diversity is increasingly well characterized, there is a need for efficient ways to use this knowledge in sequence analysis. Here we present Graphtyper, a publicly available novel algorithm and software for discovering and genotyping sequence variants. Graphtyper realigns short-read sequence data to a pangenome, a variation-aware graph structure that encodes sequence variation within a population by representing possible haplotypes as graph paths. Our results show that Graphtyper is fast, highly scalable, and provides sensitive and accurate genotype calls. Graphtyper genotyped 89.4 million sequence variants in the whole genomes of 28,075 Icelanders using less than 100,000 CPU days, including detailed genotyping of six human leukocyte antigen (HLA) genes. We show that Graphtyper is a valuable tool in characterizing sequence variation in both small and population-scale sequencing studies.
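The idea of a pangenome that represents possible haplotypes as graph paths can be illustrated with a toy example. The sketch below is not Graphtyper's data structure or algorithm; it only assumes non-overlapping variants on a short reference, and the names build_graph, haplotypes, and paths_supporting are hypothetical helpers introduced for illustration. It splits the reference into alternating shared and variant layers, enumerates every path through them, and reports which haplotype paths are consistent with a short read.

```python
from itertools import product


def build_graph(reference, variants):
    """Split a reference into layers of alternative sequences.

    variants maps a 0-based position to (ref_allele, [alt_alleles]).
    Each returned layer is a list of sequences; a path through the graph
    picks exactly one sequence per layer. Overlapping variants are not
    supported in this toy version.
    """
    layers = []
    pos = 0
    for start in sorted(variants):
        ref_allele, alts = variants[start]
        if start < pos:
            raise ValueError("overlapping variants are not handled in this sketch")
        if reference[start:start + len(ref_allele)] != ref_allele:
            raise ValueError(f"reference allele mismatch at position {start}")
        if start > pos:
            # Sequence shared by all haplotypes between two variant sites.
            layers.append([reference[pos:start]])
        # Variant site: the reference allele first, then the alternatives.
        layers.append([ref_allele] + list(alts))
        pos = start + len(ref_allele)
    if pos < len(reference):
        layers.append([reference[pos:]])
    return layers


def haplotypes(layers):
    """Yield every haplotype sequence, i.e. every path through the layers."""
    for choice in product(*layers):
        yield "".join(choice)


def paths_supporting(read, layers):
    """Return the haplotype paths that contain the read as an exact substring."""
    return [h for h in haplotypes(layers) if read in h]


if __name__ == "__main__":
    reference = "ACGTACGT"
    # One SNP (G>T at position 2) and one insertion (C>CA at position 5).
    variants = {2: ("G", ["T"]), 5: ("C", ["CA"])}
    graph = build_graph(reference, variants)
    print(list(haplotypes(graph)))
    print(paths_supporting("CTTA", graph))
```

Enumerating all paths grows exponentially with the number of variant sites, so this is useful only to make the concept concrete; a practical genotyper would need indexing and inexact alignment rather than exact substring matching.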