Loh Po-Ru, Palamara Pier Francesco, Price Alkes L
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, Massachusetts, USA.
Nat Genet. 2016 Jul;48(7):811-6. doi: 10.1038/ng.3571. Epub 2016 Jun 6.
Recent work has leveraged the extensive genotyping of the Icelandic population to perform long-range phasing (LRP), enabling accurate imputation and association analysis of rare variants in target samples typed on genotyping arrays. Here we develop a fast and accurate LRP method, Eagle, that extends this paradigm to populations with much smaller proportions of genotyped samples by harnessing long (>4-cM) identical-by-descent (IBD) tracts shared among distantly related individuals. We applied Eagle to N ≈ 150,000 samples (0.2% of the British population) from the UK Biobank, and we determined that it is 1-2 orders of magnitude faster than existing methods while achieving similar or better phasing accuracy (switch error rate ≈ 0.3%, corresponding to perfect phase in a majority of 10-Mb segments). We also observed that, when used within an imputation pipeline, Eagle prephasing improved downstream imputation accuracy in comparison to prephasing in batches using existing methods, as necessary to achieve comparable computational cost.
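The abstract reports phasing accuracy as a switch error rate. As a rough illustration of that metric only (not of Eagle's algorithm or its evaluation code), the sketch below counts how often the relative phase between consecutive heterozygous sites flips between a reference haplotype and an inferred one. The function name `switch_error_rate` and the input layout (one 0/1 allele per heterozygous site, taken from the first haplotype of each phasing) are assumptions made for this example.

```python
from typing import Sequence


def switch_error_rate(true_hap: Sequence[int], inferred_hap: Sequence[int]) -> float:
    """Fraction of consecutive heterozygous-site pairs whose relative phase differs.

    Both inputs hold the 0/1 allele carried by haplotype 1 at each heterozygous
    site, in genomic order (hypothetical layout chosen for this sketch).
    """
    if len(true_hap) != len(inferred_hap):
        raise ValueError("haplotypes must cover the same heterozygous sites")
    if len(true_hap) < 2:
        return 0.0  # no consecutive pairs, so no switch can occur

    # True where the inferred haplotype 1 carries the same allele as the true one.
    agree = [t == i for t, i in zip(true_hap, inferred_hap)]

    # A switch occurs wherever agreement flips between neighbouring sites.
    switches = sum(a != b for a, b in zip(agree, agree[1:]))
    return switches / (len(agree) - 1)


if __name__ == "__main__":
    truth = [0, 1, 0, 1, 1, 0]
    inferred = [0, 1, 1, 0, 0, 1]  # phase flips once, after the second site
    print(switch_error_rate(truth, inferred))
```

Under this definition, an inferred haplotype that is the complete complement of the reference scores zero, because only changes in relative phase between neighbouring sites are counted.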