Electrical and Information Engineering College, Jilin Agricultural Science and Technology University, Jilin, China.
Information Technology Academy, Jilin Agricultural University, Changchun, China.
BMC Bioinformatics. 2020 Oct 16;21(1):461. doi: 10.1186/s12859-020-03754-5.
Linkage disequilibrium (LD) analysis is widely used in genetics to study evolutionary and demographic history, and it helps geneticists identify genes associated with inherited traits of interest, such as diseases. Several tools for LD analysis exist, either as local programs or as online services; however, none of them supports both a graphical user interface (GUI) and parallel computing.
We developed LDkit, a GUI software package for LD analysis that supports parallel computing. LDkit accepts both variant call format (VCF) and PLINK 'ped + map' input. Users can also restrict the analysis to a subset of individuals from the whole population. LDkit reads the data block by block and parallelizes the computation by monitoring the usage of processes. An assessment on human 1000 Genomes data showed that, with 32 threads, the running time on the chromosome 22 dataset (1,103,547 SNPs and 2504 individuals) dropped from ~77 minutes to less than 6 minutes.
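The block-wise, multi-threaded design described above can be illustrated with a minimal sketch. This is not LDkit's actual code: the class and method names (`BlockParallelLd`, `r2`, `meanR2InBlock`, `perBlockMeanR2`) are hypothetical, the input is assumed to be phased biallelic haplotypes coded 0/1, r² is assumed as the LD measure, and a standard fixed-size thread pool stands in for LDkit's own process monitoring, which the abstract does not detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: split SNPs into blocks and process blocks in parallel.
class BlockParallelLd {

    /** r^2 between two biallelic sites, given phased haplotypes coded 0/1. */
    static double r2(int[] a, int[] b) {
        int n = a.length;
        double pA = 0, pB = 0, pAB = 0;
        for (int i = 0; i < n; i++) {
            pA += a[i];
            pB += b[i];
            pAB += a[i] * b[i];
        }
        pA /= n;
        pB /= n;
        pAB /= n;
        double d = pAB - pA * pB;                     // LD coefficient D
        double denom = pA * (1 - pA) * pB * (1 - pB);
        return denom == 0 ? Double.NaN : d * d / denom; // NaN for monomorphic sites
    }

    /** Mean r^2 over all SNP pairs within rows [start, end) of the matrix. */
    static double meanR2InBlock(int[][] hap, int start, int end) {
        double sum = 0;
        int pairs = 0;
        for (int i = start; i < end; i++) {
            for (int j = i + 1; j < end; j++) {
                double v = r2(hap[i], hap[j]);
                if (!Double.isNaN(v)) {
                    sum += v;
                    pairs++;
                }
            }
        }
        return pairs == 0 ? Double.NaN : sum / pairs;
    }

    /** Submits one task per block of SNPs to a fixed-size thread pool. */
    static double[] perBlockMeanR2(int[][] hap, int blockSize, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (int s = 0; s < hap.length; s += blockSize) {
                final int start = s;
                final int end = Math.min(s + blockSize, hap.length);
                Callable<Double> task = () -> meanR2InBlock(hap, start, end);
                futures.add(pool.submit(task));
            }
            double[] out = new double[futures.size()];
            for (int k = 0; k < out.length; k++) {
                out[k] = futures.get(k).get();        // results kept in block order
            }
            return out;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Rows are SNPs, columns are haplotypes (toy data).
        int[][] hap = {
            {0, 0, 1, 1},
            {0, 0, 1, 1},
            {0, 1, 0, 1},
            {0, 1, 0, 1}
        };
        for (double m : perBlockMeanR2(hap, 2, 2)) {
            System.out.println(m);
        }
    }
}
```

Because each block is an independent task, the thread count can be tuned to the machine without changing the per-block logic. A real implementation would also need to handle SNP pairs that span block boundaries, which this sketch ignores.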
LDkit can be used effectively to calculate and plot LD decay and LD blocks, and to analyze LD between a single site and a given region. Most importantly, both a GUI version and a stand-alone package are available for users' convenience. LDkit is written in Java and is cross-platform.
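As an illustration of what an LD decay calculation involves, the sketch below averages pairwise LD values within bins of physical distance, which is the usual form of an LD decay curve. The names (`LdDecaySketch`, `decayCurve`) and the choices of r² as the measure and fixed-width bins are assumptions made for this example; the abstract does not specify how LDkit computes or bins its curves.

```java
// Hypothetical sketch: summarize LD decay as mean r^2 per distance bin.
class LdDecaySketch {

    /**
     * Returns the mean r^2 for each distance bin of width binWidthBp,
     * covering distances in [0, maxDistBp). Empty bins are reported as NaN.
     * distBp[i] is the distance in base pairs between the two SNPs of pair i,
     * and r2[i] is that pair's precomputed r^2.
     */
    static double[] decayCurve(long[] distBp, double[] r2,
                               long binWidthBp, long maxDistBp) {
        int nBins = (int) ((maxDistBp + binWidthBp - 1) / binWidthBp);
        double[] sum = new double[nBins];
        int[] count = new int[nBins];
        for (int i = 0; i < distBp.length; i++) {
            // Skip out-of-range pairs and undefined r^2 values.
            if (distBp[i] < 0 || distBp[i] >= maxDistBp || Double.isNaN(r2[i])) {
                continue;
            }
            int bin = (int) (distBp[i] / binWidthBp);
            sum[bin] += r2[i];
            count[bin]++;
        }
        double[] mean = new double[nBins];
        for (int b = 0; b < nBins; b++) {
            mean[b] = count[b] == 0 ? Double.NaN : sum[b] / count[b];
        }
        return mean;
    }

    public static void main(String[] args) {
        // Toy pairwise results: distance (bp) and r^2 for five SNP pairs.
        long[] dist = {100, 900, 1500, 2500, 2900};
        double[] r2 = {0.9, 0.7, 0.5, 0.2, 0.4};
        double[] curve = decayCurve(dist, r2, 1000, 3000);
        for (int b = 0; b < curve.length; b++) {
            System.out.println("bin " + b + ": mean r^2 = " + curve[b]);
        }
    }
}
```

Plotting the bin means against bin midpoints gives the decay curve. Narrower bins show more detail but are noisier when few SNP pairs fall into each one.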