Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA 98195, USA.
Genetics. 2013 Jun;194(2):459-71. doi: 10.1534/genetics.113.150029. Epub 2013 Mar 27.
Segments of identity-by-descent (IBD) detected from high-density genetic data are useful for many applications, including long-range phase determination, phasing family data, imputation, IBD mapping, and heritability analysis in founder populations. We present Refined IBD, a new method for IBD segment detection. Refined IBD achieves both computational efficiency and highly accurate IBD segment reporting by searching for IBD in two steps. The first step (identification) uses the GERMLINE algorithm to find shared haplotypes exceeding a length threshold. The second step (refinement) evaluates candidate segments with a probabilistic approach to assess the evidence for IBD. Like GERMLINE, Refined IBD allows for IBD reporting on a haplotype level, which facilitates determination of multi-individual IBD and allows for haplotype-based downstream analyses. To investigate the properties of Refined IBD, we simulate SNP data from a model with recent superexponential population growth that is designed to match United Kingdom data. The simulation results show that Refined IBD achieves a better power/accuracy profile than fastIBD or GERMLINE. We find that a single run of Refined IBD achieves greater power than 10 runs of fastIBD. We also apply Refined IBD to SNP data for samples from the United Kingdom and from Northern Finland and describe the IBD sharing in these data sets. Refined IBD is powerful, highly accurate, and easy to use and is implemented in Beagle version 4.
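The two-step search described in the abstract (identification of long shared haplotypes, then probabilistic refinement) can be illustrated with a minimal Python sketch. It is not the Beagle implementation. Everything in it is an assumption made for illustration: the function names, the thresholds `min_cm` and `min_lod`, the error rate `err`, the use of a plain pairwise scan in place of the GERMLINE algorithm, and the independent-marker likelihood ratio that stands in for the probabilistic model, which the abstract does not specify.

```python
import math
from itertools import combinations


def shared_runs(h1, h2, cm, min_cm):
    """Step 1 (identification), simplified: return inclusive (start, end)
    marker index ranges where two haplotypes match exactly and the run spans
    at least min_cm centimorgans. A plain pairwise scan is used here as a
    stand-in for GERMLINE (assumption)."""
    runs = []
    start = None
    n = len(h1)
    for i in range(n):
        if h1[i] == h2[i]:
            if start is None:
                start = i
        else:
            if start is not None and cm[i - 1] - cm[start] >= min_cm:
                runs.append((start, i - 1))
            start = None
    if start is not None and cm[n - 1] - cm[start] >= min_cm:
        runs.append((start, n - 1))
    return runs


def lod_score(start, end, freqs, err=0.001):
    """Step 2 (refinement), toy model: log10 likelihood ratio of IBD versus
    non-IBD for a fully matching segment, treating markers as independent.
    Under IBD a match has probability 1 - err; otherwise two random
    haplotypes match with probability p^2 + (1 - p)^2 (assumption)."""
    lod = 0.0
    for p in freqs[start:end + 1]:
        p_match_by_chance = p * p + (1 - p) * (1 - p)
        lod += math.log10((1 - err) / p_match_by_chance)
    return lod


def refined_segments(haps, cm, freqs, min_cm=1.0, min_lod=3.0):
    """Run both steps over every pair of haplotypes and keep only the
    candidate segments whose LOD score reaches min_lod."""
    kept = []
    for (name1, h1), (name2, h2) in combinations(haps.items(), 2):
        for start, end in shared_runs(h1, h2, cm, min_cm):
            lod = lod_score(start, end, freqs)
            if lod >= min_lod:
                kept.append((name1, name2, start, end, lod))
    return kept
```

Calling `refined_segments` with a dictionary that maps haplotype names to lists of 0/1 alleles, a list of genetic positions in centimorgans, and a list of allele frequencies returns tuples of the form `(name1, name2, start, end, lod)`. Splitting the work this way mirrors the design stated in the abstract: a cheap length filter limits how many candidate segments the more expensive probabilistic step has to evaluate.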