Wang Siyu, Xu Jinbo, Zeng Jianyang
Department of Automation, Tsinghua University, Beijing 100084, P.R. China.
Toyota Technological Institute at Chicago, 6045 S Kenwood, IL 60637, USA.
Nucleic Acids Res. 2015 Apr 30;43(8):e54. doi: 10.1093/nar/gkv100. Epub 2015 Feb 17.
In eukaryotic cells, biological processes involving regulatory DNA elements play an important role in the cell cycle. Understanding the 3D spatial arrangement of chromosomes and revealing long-range chromatin interactions are critical to deciphering these processes. In recent years, chromosome conformation capture (3C) related techniques have been developed to measure interaction frequencies between long-range genomic loci, providing a great opportunity to decode the 3D organization of the genome. In this paper, we develop a new Bayesian framework to derive the 3D architecture of a chromosome from 3C-based data. Modeling each chromosome as a polymer chain, we define a conformational energy based on current knowledge of polymer physics and use it as prior information in the Bayesian framework. We also propose an expectation-maximization (EM) based algorithm to estimate the unknown parameters of the Bayesian model and to infer an ensemble of chromatin structures from interaction frequency data. We validated our Bayesian inference approach through cross-validation and verified the computed chromatin conformations against geometric constraints derived from fluorescence in situ hybridization (FISH) experiments. We further confirmed the inferred chromatin structures using known genetic interactions reported in other studies. Our test results indicate that the framework can compute an accurate ensemble of 3D chromatin conformations that best interpret the distance constraints derived from 3C-based data and also agree with geometric constraints derived from experimental evidence in previous studies. The source code of our approach is available at https://github.com/wangsy11/InfMod3DGen.
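To make the structure of such a model concrete, the following is a minimal sketch of an unnormalized log-posterior that combines a polymer-style prior with a likelihood on interaction counts. It is not taken from the InfMod3DGen code. The harmonic bond-length prior, the Poisson likelihood, the power-law relation between expected count and distance, and all names and parameter values (`bond_length`, `stiffness`, `beta`, `alpha`) are assumptions chosen for illustration. The EM parameter estimation and ensemble sampling described in the abstract are not implemented here.

```python
import numpy as np


def pairwise_distances(coords):
    """Euclidean distance between every pair of beads (n x n matrix)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def log_prior(coords, bond_length=1.0, stiffness=10.0):
    """Hypothetical polymer prior: harmonic penalty on consecutive bond lengths."""
    bonds = np.linalg.norm(np.diff(coords, axis=0), axis=1)
    return -stiffness * np.sum((bonds - bond_length) ** 2)


def log_likelihood(coords, counts, beta=10.0, alpha=1.0):
    """Poisson log-likelihood, assuming expected count = beta * distance**(-alpha).

    The constant log(counts!) term is dropped because it does not depend
    on the structure.
    """
    i, j = np.triu_indices(len(coords), k=1)  # each unordered pair once
    d = pairwise_distances(coords)[i, j]
    lam = beta * d ** (-alpha)
    return np.sum(counts[i, j] * np.log(lam) - lam)


def log_posterior(coords, counts, beta=10.0, alpha=1.0):
    """Unnormalized log-posterior of a structure given interaction counts."""
    return log_prior(coords) + log_likelihood(coords, counts, beta, alpha)


# Toy data: a straight 5-bead chain with unit spacing, and noise-free
# "counts" generated from the same assumed power law.
chain = np.column_stack([np.arange(5.0), np.zeros(5), np.zeros(5)])
d = pairwise_distances(chain)
counts = 10.0 / np.where(d > 0, d, 1.0)  # diagonal is ignored by the likelihood

score_true = log_posterior(chain, counts)
score_stretched = log_posterior(2.0 * chain, counts)
```

In this toy setting the generating structure scores higher than a uniformly stretched copy, since stretching both violates the bond-length prior and moves the expected counts away from the observed ones. A full method would additionally estimate parameters such as `beta` and `alpha` from the data and sample an ensemble of structures rather than score a single one.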