Informational Biology at University of Electronic Science and Technology of China.
Laboratory of Theoretical Biophysics at Inner Mongolia University.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab031.
The three-dimensional (3D) architecture of chromosomes is of crucial importance for transcription regulation and DNA replication. Various high-throughput methods based on chromosome conformation capture have revealed that CTCF-mediated chromatin loops are a major component of this 3D architecture. However, CTCF-mediated chromatin loops are cell type specific, and most chromatin interaction capture techniques are time-consuming and labor-intensive, which restricts their use across large numbers of cell types. Computational models based on genomic sequence are sophisticated enough to capture important features of chromatin architecture and can help to identify chromatin loops. In this work, we develop Deep-loop, a convolutional neural network model that integrates k-tuple nucleotide frequency components, nucleotide pair spectrum encoding, position conservation, position scoring function and natural vector features for the prediction of chromatin loops. In a series of cross-validation experiments, Deep-loop shows excellent performance in identifying chromatin loops from different cell types. The source code of Deep-loop is freely available at https://github.com/linDing-group/Deep-loop.
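As an illustration of the simplest of the sequence features listed above, the k-tuple nucleotide frequency component, the following is a minimal Python sketch. It is not taken from the Deep-loop repository: the function name `ktuple_frequencies`, the choice to skip windows containing non-ACGT characters, and the normalization by the number of valid windows are all assumptions made for this example.

```python
from itertools import product


def ktuple_frequencies(seq, k=2, alphabet="ACGT"):
    """Return the relative frequency of every k-tuple over `alphabet`.

    Hypothetical helper for illustration only; the feature definition
    used by Deep-loop may differ in normalization or in how ambiguous
    bases are handled.
    """
    seq = seq.upper()
    # Enumerate all 4**k tuples in a fixed lexicographic order so the
    # output vector has a stable layout across sequences.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k along the sequence, one base at a time.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are skipped.
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]
```

For k = 2 this yields a 16-dimensional vector per sequence. Under the assumptions above, such a vector could be concatenated with the other encodings before being passed to a classifier.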