Illumina Inc, 5200 Illumina Way, San Diego, CA, USA.
Illumina Cambridge Ltd, Chesterford Research Park, Little Chesterford, UK.
Genome Biol. 2019 Dec 19;20(1):291. doi: 10.1186/s13059-019-1909-7.
Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations. We demonstrate the accuracy of Paragraph on whole-genome sequence data from three samples using long-read SV calls as the truth set, and then apply Paragraph at scale to a cohort of 100 short-read sequenced samples of diverse ancestry. Our analysis shows that Paragraph has better accuracy than other existing genotypers and can be applied to population-scale studies.