Markowski Julia, Kempfer Rieke, Kukalev Alexander, Irastorza-Azcarate Ibai, Loof Gesa, Kehr Birte, Pombo Ana, Rahmann Sven, Schwarz Roland F
Berlin Institute for Medical Systems Biology, Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 10115 Berlin, Germany.
Humboldt-Universität zu Berlin, Department of Biology, 10099 Berlin, Germany.
Bioinformatics. 2021 Oct 11;37(19):3128-3135. doi: 10.1093/bioinformatics/btab238.
Genome Architecture Mapping (GAM) was recently introduced as a digestion- and ligation-free method to detect chromatin conformation. Orthogonal to existing approaches based on chromatin conformation capture (3C), GAM's ability to capture both inter- and intra-chromosomal contacts from low amounts of input data makes it particularly well suited for allele-specific analyses in a clinical setting. Allele-specific analyses are powerful tools to investigate the effects of genetic variants on many cellular phenotypes including chromatin conformation, but require the haplotypes of the individuals under study to be known a priori. So far, however, no algorithm exists for haplotype reconstruction and phasing of genetic variants from GAM data, hindering the allele-specific analysis of chromatin contact points in non-model organisms or individuals with unknown haplotypes.
We present GAMIBHEAR, a tool for accurate haplotype reconstruction from GAM data. GAMIBHEAR aggregates allelic co-observation frequencies from GAM data and employs a GAM-specific probabilistic model of haplotype capture to optimize phasing accuracy. Using a hybrid mouse embryonic stem cell line with known haplotype structure as a benchmark dataset, we assess correctness and completeness of the reconstructed haplotypes, and demonstrate the power of GAMIBHEAR to infer accurate genome-wide haplotypes from GAM data.
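The idea of aggregating allelic co-observations can be illustrated with a toy sketch in Python. This is not GAMIBHEAR's implementation and it does not include the GAM-specific probabilistic model mentioned above; the function names (`co_observation_scores`, `greedy_phase`), the input layout (one dict per sample mapping a variant index to the observed allele, 0 or 1), and the greedy voting scheme are all assumptions made for illustration only.

```python
from collections import defaultdict
from itertools import combinations


def co_observation_scores(samples):
    """Sum pairwise evidence over all samples (hypothetical helper).

    Each sample is a dict {variant_index: allele}, allele in {0, 1}.
    A pair of variants seen with the same allele in one sample counts +1,
    a pair seen with different alleles counts -1.
    """
    scores = defaultdict(int)
    for sample in samples:
        # Sorting by variant index guarantees keys of the form (i, j) with i < j.
        for (i, a), (j, b) in combinations(sorted(sample.items()), 2):
            scores[(i, j)] += 1 if a == b else -1
    return scores


def greedy_phase(n_variants, scores):
    """Assign each variant a phase (0 or 1) relative to variant 0.

    Phase 0 means allele 0 of this variant lies on the same haplotype as
    allele 0 of variant 0. Variants without any evidence stay None.
    """
    phase = [None] * n_variants
    phase[0] = 0
    for j in range(1, n_variants):
        vote = 0
        for i in range(j):
            if phase[i] is None:
                continue
            s = scores.get((i, j), 0)
            # Evidence from a variant in phase 1 is flipped before voting.
            vote += s if phase[i] == 0 else -s
        if vote != 0:
            phase[j] = 0 if vote > 0 else 1
    return phase


# Toy data: true haplotypes are A = [0, 1, 0, 1] and B = [1, 0, 1, 0].
samples = [
    {0: 0, 1: 1},  # both alleles from haplotype A
    {1: 0, 2: 1},  # haplotype B
    {2: 0, 3: 1},  # haplotype A
    {0: 1, 3: 0},  # haplotype B
    {0: 0, 2: 0},  # haplotype A
]
phases = greedy_phase(4, co_observation_scores(samples))
```

In this toy example every sample captures alleles from a single haplotype, so simple vote counting is enough. Real GAM samples can capture both haplotypes at once, which is presumably why a probabilistic model of haplotype capture is needed in practice.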
GAMIBHEAR is available as an R package under the open-source GPL-2 license at https://bitbucket.org/schwarzlab/gamibhear.
Supplementary data are available at Bioinformatics online.