Department of Biotechnology, Beijing Institute of Radiation Medicine, Beijing 100850.
Department of Information, The 188th Hospital of Chaozhou, Chaozhou 521000.
Bioinformatics. 2017 Jul 1;33(13):1930-1936. doi: 10.1093/bioinformatics/btx105.
Enhancer elements are noncoding stretches of DNA that play key roles in controlling gene expression programmes. Despite major efforts to develop accurate enhancer prediction methods, identifying enhancer sequences remains a challenge in the annotation of mammalian genomes. One major obstacle is the lack of a large, sufficiently comprehensive set of experimentally validated enhancers for humans or other species. There is therefore an urgent need for computational methods that can learn from the limited set of experimentally validated enhancers and help decipher the transcriptional regulatory code encoded in enhancer sequences.
We present BiRen, a deep-learning-based hybrid architecture that predicts enhancers using the DNA sequence alone. Our results demonstrate that BiRen can learn common enhancer patterns directly from the DNA sequence and that it exhibits superior accuracy, robustness and generalizability in enhancer prediction relative to other state-of-the-art enhancer predictors based on sequence characteristics. BiRen will enable researchers to acquire a deeper understanding of the regulatory code of enhancer sequences.
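As a rough illustration of what predicting "using the DNA sequence alone" implies for model input, the sketch below one-hot encodes a nucleotide string, a common first step for sequence-based deep-learning models. It is not taken from the BiRen code base: the function name `one_hot_encode`, the A/C/G/T column order, and the all-zero treatment of ambiguous bases are assumptions made here for illustration only.

```python
# Illustrative sketch only; this is NOT BiRen's actual preprocessing.
# Column order A, C, G, T is an assumption of this example.
ALPHABET = "ACGT"


def one_hot_encode(sequence):
    """Return one 4-element row per base, in A, C, G, T column order."""
    encoded = []
    for base in sequence.upper():
        row = [0.0] * len(ALPHABET)
        index = ALPHABET.find(base)
        if index >= 0:
            row[index] = 1.0
        # Ambiguous bases such as N stay all-zero (an assumption of this sketch).
        encoded.append(row)
    return encoded
```

Calling `one_hot_encode("ACGN")` gives a 4 x 4 matrix whose last row is all zeros, since `N` carries no base identity. A matrix of this shape is the kind of input a sequence-only model could consume.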
BiRen is freely available at https://github.com/wenjiegroup/BiRen.
Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn.
Supplementary data are available at Bioinformatics online.