Pinchuk Dmitry, Chowdhury H M A Mohit, Pandeya Abhishek, Oluwadare Oluwatosin
Department of Computer Science, University of Wisconsin-Madison, Madison, WI 53706, United States.
Department of Computer Science, University of Colorado, Colorado Springs, CO 80918, United States.
Bioinformatics. 2025 Feb 4;41(2). doi: 10.1093/bioinformatics/btaf030.
The exploration of the 3D organization of DNA within the nucleus in relation to various stages of cellular development has led to experiments that generate spatiotemporal Hi-C data. However, spatiotemporal Hi-C data remain limited for many organisms, which impedes the study of 3D genome dynamics. To overcome this limitation and advance our understanding of genome organization, it is crucial to develop methods for forecasting Hi-C data at future time points from existing time-series Hi-C data.
In this work, we designed HiCForecast, a novel framework that adopts a dynamic voxel flow algorithm to forecast future spatiotemporal Hi-C data. We evaluated how well the method generalizes when forecasting data across different species and systems, assessing its performance in homogeneous, heterogeneous, and general contexts. On both computational and biological evaluation metrics, our results show that HiCForecast outperforms the current state-of-the-art algorithm, making it an efficient and powerful tool for forecasting future spatiotemporal Hi-C datasets.
HiCForecast is publicly available at https://github.com/OluwadareLab/HiCForecast.