Zheng Suchen, Thakkar Nitya, Harris Hannah L, Liu Susanna, Zhang Megan, Gerstein Mark, Aiden Erez Lieberman, Rowley M Jordan, Noble William Stafford, Gürsoy Gamze, Singh Ritambhara
Department of Computer Science, Brown University, Providence, RI, USA.
Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, USA.
iScience. 2024 Mar 27;27(5):109570. doi: 10.1016/j.isci.2024.109570. eCollection 2024 May 17.
The three-dimensional organization of genomes plays a crucial role in essential biological processes. The segregation of chromatin into A and B compartments distinguishes active from inactive regions, providing a window into the genomic activity specific to each cell type. Yet the steep cost of acquiring the Hi-C data needed to study this compartmentalization across cell types poses a significant barrier to studying cell-type-specific genome organization. To address this, we present a prediction tool called compartment prediction using recurrent neural networks (CoRNN), which predicts compartmentalization of the 3D genome from histone modification enrichment. CoRNN demonstrates robust cross-cell-type prediction of A/B compartments, with an average AuROC of 90.9%. Cell-type-specific predictions align well with known functional elements, with H3K27ac and H3K36me3 identified as highly predictive histone marks. We further investigated our mispredictions and found that they are located in regions with ambiguous compartmental status. Finally, we validated the model's generalizability by predicting compartments in independent tissue samples, which underscores its broad applicability.
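The average AuROC quoted above summarizes how well predicted scores separate A bins from B bins, averaged over cell types. The following is a minimal sketch of that metric only, not the authors' evaluation code. It assumes labels are encoded as 1 for compartment A and 0 for compartment B, that each genomic bin has one predicted score, and that the per-cell-type dictionary and the function names `auroc` and `mean_auroc` are hypothetical. The toy values are illustrative and are not results from the paper.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for compartment A, 0 for compartment B (assumed encoding).
    scores: predicted score for compartment A, one per genomic bin.
    """
    pairs = sorted(zip(scores, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one bin from each compartment")

    rank_sum_pos = 0.0
    i = 0
    while i < len(pairs):
        # Find the run of tied scores starting at position i.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            j += 1
        # Tied bins share the mean of the 1-based ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_pos += avg_rank * sum(lab for _, lab in pairs[i:j])
        i = j

    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


def mean_auroc(per_cell_type):
    """Average AuROC over a dict of {cell_type: (labels, scores)}."""
    values = [auroc(labels, scores) for labels, scores in per_cell_type.values()]
    return sum(values) / len(values)


# Toy, made-up inputs for two hypothetical held-out cell types.
toy = {
    "cell_type_1": ([1, 1, 0, 0], [0.9, 0.8, 0.3, 0.1]),
    "cell_type_2": ([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]),
}
average = mean_auroc(toy)
```

The rank-based form is used here because it handles tied scores without building an explicit ROC curve; a library routine for ROC AUC would serve equally well.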