Electrical and Computer Engineering Department, University of Texas at San Antonio, San Antonio, TX, USA.
Department of Epidemiology and Biostatistics, University of Texas Health Science Center, San Antonio, TX, USA.
Bioinformatics. 2018 Oct 15;34(20):3446-3453. doi: 10.1093/bioinformatics/bty383.
Transcription factors (TFs) bind to the promoter regions of genes to control gene expression. Identifying precise TF binding sites (TFBSs) is essential for understanding the detailed mechanisms of TF-mediated gene regulation. However, few computational approaches can predict TFBSs at single-base-pair resolution.
In this paper, we propose DeepSNR, a Deep Learning algorithm for predicting TF binding locations at Single Nucleotide Resolution de novo from DNA sequence. DeepSNR adopts a novel deconvolutional network (deconvNet) model, motivated by the similarity between this task and image segmentation with deconvNet. The proposed deconvNet architecture is built on top of 'DeepBind', and the entire model is trained using TF-specific data from ChIP-exonuclease (ChIP-exo) experiments. DeepSNR outperforms motif search-based methods on several evaluation metrics. We also demonstrate the usefulness of DeepSNR in the regulatory analysis of TFBSs and in improving TFBS prediction specificity using ChIP-seq data.
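The convolution-deconvolution idea described above can be illustrated with a minimal, dependency-free sketch. This is not the DeepSNR implementation: the function names, the single hand-set 'TGA' filter, the pooling size, and the deconvolution weights are all assumptions chosen for illustration, whereas the actual model learns its parameters from ChIP-exo data. The sketch only shows how a convolution and pooling stage followed by unpooling and a transposed convolution can return one score per nucleotide.

```python
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values in A, C, G, T order."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]

def conv1d(x, kernel):
    """'Valid' 1D convolution of an L x 4 input with a k x 4 kernel, then ReLU."""
    k = len(kernel)
    out = []
    for i in range(len(x) - k + 1):
        s = sum(x[i + j][c] * kernel[j][c] for j in range(k) for c in range(4))
        out.append(max(0.0, s))
    return out

def max_pool(x, size):
    """Non-overlapping max pooling that records the argmax positions (switches)."""
    pooled, switches = [], []
    for start in range(0, len(x) - size + 1, size):
        window = x[start:start + size]
        best = max(range(size), key=lambda j: window[j])
        pooled.append(window[best])
        switches.append(start + best)
    return pooled, switches

def unpool(pooled, switches, length):
    """Place each pooled value back at its recorded position; zeros elsewhere."""
    out = [0.0] * length
    for value, pos in zip(pooled, switches):
        out[pos] = value
    return out

def deconv1d(x, weights):
    """Transposed 1D convolution: each activation is spread over k positions."""
    out = [0.0] * (len(x) + len(weights) - 1)
    for i, value in enumerate(x):
        for j, w in enumerate(weights):
            out[i + j] += value * w
    return out

def per_base_scores(seq, kernel, deconv_weights, pool_size=2):
    """Map a DNA sequence to one score per nucleotide (same length as seq)."""
    features = conv1d(one_hot(seq), kernel)
    pooled, switches = max_pool(features, pool_size)
    restored = unpool(pooled, switches, len(features))
    return deconv1d(restored, deconv_weights)

# Hypothetical filter that responds to the 3-mer 'TGA' (rows are A, C, G, T).
tga_kernel = [
    [0.0, 0.0, 0.0, 1.0],  # position 0: T
    [0.0, 0.0, 1.0, 0.0],  # position 1: G
    [1.0, 0.0, 0.0, 0.0],  # position 2: A
]
scores = per_base_scores("CCTGACC", tga_kernel, [1.0, 1.0, 1.0])
print(scores)  # [0.0, 0.0, 3.0, 3.0, 3.0, 0.0, 0.0]
```

In this toy run the nonzero scores fall exactly on the three bases of the 'TGA' occurrence, which is the kind of per-nucleotide output that segmentation-style networks produce. Because the deconvolution kernel has the same width as the convolution kernel, the output has the same length as the input sequence.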
DeepSNR is available as open source on GitHub (https://github.com/sirajulsalekin/DeepSNR).
Supplementary data are available at Bioinformatics online.