Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot, UK.
Department of Animal and Plant Sciences, University of Sheffield, Sheffield, UK.
Bioinformatics. 2019 Oct 1;35(19):3855-3856. doi: 10.1093/bioinformatics/btz200.
Linkage disequilibrium (LD) measures the correlation between genetic loci and is highly informative for association mapping and population genetics. As many studies rely on called genotypes for estimating LD, their results can be affected by data uncertainty, especially when employing a low read depth sequencing strategy. Furthermore, there is a manifest lack of tools for the analysis of large-scale, low-depth and short-read sequencing data from non-model organisms with limited sample sizes.
ngsLD addresses these issues by estimating LD directly from genotype likelihoods in a fast, reliable and user-friendly implementation. This method makes use of the full information available from sequencing data and provides accurate estimates of linkage disequilibrium patterns compared with approaches based on genotype calling. We conducted a case study to investigate how LD decays over physical distance in two avian species.
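To illustrate the general idea of working from genotype likelihoods rather than called genotypes, the following is a minimal sketch and not the ngsLD implementation. It assumes three likelihoods per individual per site (for 0, 1 and 2 copies of one allele), a flat prior over genotypes, and uses the squared Pearson correlation between expected allele dosages as the LD measure. The function names `expected_dosage` and `r2_from_likelihoods` are hypothetical and chosen only for this example.

```python
def expected_dosage(gl):
    """Expected allele count (0..2) from three genotype likelihoods.

    Assumes a flat prior over genotypes (an assumption of this sketch),
    so the posterior is just the normalised likelihood.
    """
    total = sum(gl)
    return (gl[1] + 2.0 * gl[2]) / total


def r2_from_likelihoods(site_a, site_b):
    """Squared Pearson correlation between expected dosages at two sites.

    Each site is a list of (L(g=0), L(g=1), L(g=2)) tuples, one per
    individual, in the same individual order for both sites.
    """
    x = [expected_dosage(gl) for gl in site_a]
    y = [expected_dosage(gl) for gl in site_b]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # No variation in expected dosage at one site: r^2 is undefined.
        return float("nan")
    return (sxy * sxy) / (sxx * syy)


if __name__ == "__main__":
    # Hypothetical data: three individuals with confident genotypes 0, 1, 2.
    confident = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    # The same individuals at a second site with more uncertain likelihoods.
    uncertain = [(0.6, 0.3, 0.1), (0.3, 0.4, 0.3), (0.1, 0.3, 0.6)]
    print(r2_from_likelihoods(confident, uncertain))
```

Because every individual contributes a weighted average over the three possible genotypes, a poorly covered individual pulls the estimate toward the mean instead of contributing a possibly wrong hard call. This is the intuition behind likelihood-based estimation at low depth.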
The methods presented in this work were implemented in C/C++ and are freely available for non-commercial use from https://github.com/fgvieira/ngsLD.
Supplementary data are available at Bioinformatics online.