School of Computer Science, Chengdu University of Information Technology, Chengdu, 610225, China.
West China Second University Hospital, Sichuan University, Chengdu, 610041, China.
Comput Biol Med. 2022 Oct;149:105993. doi: 10.1016/j.compbiomed.2022.105993. Epub 2022 Aug 17.
Transcription factors (TFs) regulate gene expression by recognizing specific cis-regulatory elements in DNA sequences. Predicting TF-DNA binding has become a fundamental step in understanding the underlying cis-regulatory mechanisms. Because whether a given genomic region is bound depends on multiple features, such as nucleotide arrangement, DNA shape, and epigenetic mechanisms, many researchers have sought to develop computational methods that predict TF binding sites (TFBSs) from various genomic features. This survey provides a comprehensive compendium for better understanding TF-DNA binding from genomic features. We first summarize the commonly used datasets and data-processing approaches. We then classify current deep learning methods for TFBS prediction according to the genomic features they use, and analyze the merits and weaknesses of each technique. Furthermore, we illustrate how the functional consequences of TF-DNA binding can be characterized by prioritizing noncoding variants in identified motif instances. Finally, we discuss the challenges and opportunities of deep learning in TF-DNA binding prediction. This survey may offer valuable insights to researchers modeling TF-DNA binding.