The Research Institute of Basic Sciences, Seoul National University, Seoul, South Korea.
Prosserman Centre for Health Research, The Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, ON, Canada.
Bioinformatics. 2019 Nov 1;35(21):4419-4421. doi: 10.1093/bioinformatics/btz308.
Analyzing high-throughput genomic data produced by next-generation sequencing (NGS) technologies requires identifying the linkage disequilibrium (LD) structure of the genome. In this work, we developed an R package, gpart, that provides clustering algorithms for defining LD blocks or analysis units consisting of SNPs. The visualization tool in gpart can display the LD structure and gene positions for up to 20 000 SNPs in one image. The gpart functions facilitate the construction of LD blocks and SNP partitions for large volumes of genome sequencing data within reasonable time and memory limits in personal computing environments.
The R package is available at https://bioconductor.org/packages/gpart.
Supplementary data are available at Bioinformatics online.
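To make the notion of an LD block concrete, the following is a minimal Python sketch that is not taken from gpart and does not reproduce its clustering algorithms. It assumes genotypes are coded as allele dosages (0, 1, 2), measures LD between two SNPs as the squared Pearson correlation of their dosages, and uses a naive greedy rule: a SNP joins the current block only if its r² with every SNP already in the block meets a threshold. The function names `r_squared` and `greedy_ld_blocks`, the threshold of 0.5, and the toy data are all hypothetical choices made for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype dosage vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A monomorphic SNP has no defined correlation; treat it as no LD.
        return 0.0
    return cov * cov / (var_x * var_y)


def greedy_ld_blocks(genotypes, threshold=0.5):
    """Split SNPs (listed in genomic order) into consecutive blocks.

    Returns a list of (start, end) index pairs, both inclusive.
    This is an illustrative heuristic, not gpart's method.
    """
    blocks = []
    start = 0
    for j in range(1, len(genotypes)):
        # SNP j extends the block only if it is in LD with every member.
        in_ld = all(
            r_squared(genotypes[i], genotypes[j]) >= threshold
            for i in range(start, j)
        )
        if not in_ld:
            blocks.append((start, j - 1))
            start = j
    if genotypes:
        blocks.append((start, len(genotypes) - 1))
    return blocks


# Toy data: 5 SNPs (rows) genotyped in 6 individuals (columns).
genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 2, 2],
    [2, 0, 1, 1, 0, 2],
    [2, 0, 1, 1, 0, 2],
]
blocks = greedy_ld_blocks(genotypes)
print(blocks)  # [(0, 2), (3, 4)]
```

The all-pairs check costs time quadratic in block size, so a sketch like this would not scale to the data volumes the abstract describes. It only illustrates what partitioning SNPs by LD means.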