Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, 1350 Copenhagen K, Denmark.
Université de Toulouse, University Paul Sabatier (UPS), Laboratoire AMIS, CNRS UMR 5288, Toulouse, France.
Gigascience. 2019 May 1;8(5). doi: 10.1093/gigascience/giz034.
The estimation of relatedness between pairs of possibly inbred individuals from high-throughput sequencing (HTS) data has previously not been possible for samples where we cannot obtain reliable genotype calls, as in the case of low-coverage data.
We introduce ngsRelateV2, a major revision of ngsRelateV1, a program that originally estimated relatedness from HTS data among non-inbred individuals only. The revised version accounts for the possibility that individuals are inbred by estimating the 9 condensed Jacquard coefficients along with various other relatedness statistics. The program is threaded and scales linearly with the number of cores allocated to the process.
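To illustrate how the 9 condensed Jacquard coefficients relate to more familiar summary statistics, the following minimal Python sketch derives the kinship coefficient and the two individual inbreeding coefficients from them using the standard textbook identities. The function name `summarize_jacquard` and the input ordering (J1 to J9 in Jacquard's conventional numbering) are assumptions made for illustration; this is not ngsRelate's code, and it does not assume anything about the program's output format.

```python
from typing import Dict, Sequence


def summarize_jacquard(j: Sequence[float], tol: float = 1e-6) -> Dict[str, float]:
    """Derive summary statistics from the 9 condensed Jacquard coefficients.

    Hypothetical helper: `j` is assumed to hold J1..J9 in Jacquard's
    conventional order for an ordered pair of individuals (a, b).
    """
    if len(j) != 9:
        raise ValueError("expected exactly 9 coefficients")
    # The coefficients are probabilities of mutually exclusive identity
    # states, so they must be non-negative and sum to 1.
    if any(x < -tol for x in j) or abs(sum(j) - 1.0) > tol:
        raise ValueError("coefficients must be non-negative and sum to 1")

    j1, j2, j3, j4, j5, j6, j7, j8, j9 = j
    return {
        # Individual a carries two IBD alleles in states 1-4.
        "inbreeding_a": j1 + j2 + j3 + j4,
        # Individual b carries two IBD alleles in states 1, 2, 5 and 6.
        "inbreeding_b": j1 + j2 + j5 + j6,
        # Probability that one allele drawn at random from each individual is IBD.
        "kinship": j1 + 0.5 * (j3 + j5 + j7) + 0.25 * j8,
    }


# Non-inbred full siblings: only J7, J8 and J9 are non-zero.
full_sibs = summarize_jacquard([0, 0, 0, 0, 0, 0, 0.25, 0.5, 0.25])
# Non-inbred parent and offspring: the pair always shares exactly one allele IBD.
parent_offspring = summarize_jacquard([0, 0, 0, 0, 0, 0, 0, 1.0, 0])
```

For non-inbred pairs only J7, J8 and J9 can be non-zero, which is the 3-coefficient setting that ngsRelateV1 was restricted to; the remaining six coefficients are what allow inbreeding to be modelled.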
The program is available as an open-source C/C++ program under the GPL and is hosted at https://github.com/ANGSD/ngsRelate. To facilitate analysis, the program works directly on the most commonly used container formats for raw sequence data (BAM/CRAM) and summary data (VCF/BCF).