School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, USA.
Department of Computer Science, University of Miami, Coral Gables, FL, USA.
Bioinformatics. 2019 Oct 15;35(20):3981-3988. doi: 10.1093/bioinformatics/btz181.
In contrast to population-based Hi-C data, single-cell Hi-C data are zero-inflated and do not indicate how frequently DNA segments are in proximity. Few computational tools can model the 3D structures of chromosomes based on single-cell Hi-C data.
We developed single-cell lattice (SCL), a computational method to reconstruct 3D structures of chromosomes based on single-cell Hi-C data. We designed a loss function and a 2D Gaussian function specifically for the characteristics of single-cell Hi-C data. A chromosome is represented as beads-on-a-string and stored in a 3D cubic lattice. Metropolis-Hastings simulation and simulated annealing are used to simulate the structure and minimize the loss function. We evaluated the SCL-inferred 3D structures (at both 500 and 50 kb resolutions) using multiple criteria and compared them with the ones generated by another modeling software program. The results indicate that the 3D structures generated by SCL closely fit single-cell Hi-C data. We also found similar patterns of trans-chromosomal contact beads, Lamin-B1 enriched topologically associating domains (TADs), and H3K4me3 enriched TADs by mapping data from previous studies onto the SCL-inferred 3D structures.
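The general idea of annealing a beads-on-a-string chain on a cubic lattice can be illustrated with a minimal sketch. This is not the SCL implementation: the loss function, the bond-length limit, the cooling schedule, and all names below (`loss`, `is_valid`, `anneal`, `target_sq`, `max_bond_sq`) are assumptions chosen for illustration. The stand-in loss penalizes only observed contacts that are too far apart and ignores zero entries, and because the proposal (a unit move of one bead) is symmetric, the Metropolis-Hastings acceptance rule reduces to the plain Metropolis rule.

```python
import math
import random

# The six unit moves available to a bead on a 3D cubic lattice.
MOVES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def sq_dist(a, b):
    """Squared Euclidean distance between two lattice points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loss(chain, contacts, target_sq=2):
    """Hypothetical loss (an assumption, not the SCL loss): for every observed
    contact (i, j), penalize squared distance beyond target_sq. Pairs without
    an observed contact contribute nothing."""
    return sum(max(0, sq_dist(chain[i], chain[j]) - target_sq) for i, j in contacts)


def is_valid(chain, k, pos, max_bond_sq=3):
    """Bead k may move to pos if the site is free and both chain bonds stay short."""
    if any(pos == p for idx, p in enumerate(chain) if idx != k):
        return False
    for n in (k - 1, k + 1):
        if 0 <= n < len(chain) and sq_dist(pos, chain[n]) > max_bond_sq:
            return False
    return True


def anneal(n_beads, contacts, steps=20000, t_start=5.0, t_end=0.01, seed=0):
    """Simulated annealing with Metropolis acceptance; returns (best chain, best loss)."""
    rng = random.Random(seed)
    chain = [(i, 0, 0) for i in range(n_beads)]  # start from a straight chain
    current = loss(chain, contacts)
    best, best_loss = list(chain), current
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        k = rng.randrange(n_beads)
        move = rng.choice(MOVES)
        pos = tuple(c + d for c, d in zip(chain[k], move))
        if not is_valid(chain, k, pos):
            continue  # reject proposals that break the chain or overlap a bead
        old = chain[k]
        chain[k] = pos
        new = loss(chain, contacts)
        delta = new - current
        # Always accept improvements; accept worse states with probability exp(-delta / t).
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current = new
            if current < best_loss:
                best, best_loss = list(chain), current
        else:
            chain[k] = old  # undo the rejected move
    return best, best_loss
```

Calling `anneal(10, [(0, 9), (2, 7)])` returns a self-avoiding, connected chain of ten beads together with its loss under the assumed contact list. A real reconstruction would use the loss function and parameters described in the paper rather than these placeholders.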
The C++ source code of SCL is freely available at http://dna.cs.miami.edu/SCL/.
Supplementary data are available at Bioinformatics online.