School of Life Science, Beijing Institute of Technology, South Zhongguancun Street, Beijing, 100081, China.
Key Laboratory of Convergence Medical Engineering System and Healthcare Technology, the Ministry of Industry and Information Technology, Beijing Institute of Technology, Beijing, China.
BMC Genomics. 2018 Dec 31;19(Suppl 10):905. doi: 10.1186/s12864-018-5283-8.
DNase I hypersensitive sites (DHSs) are associated with cis-regulatory DNA elements. An efficient method for identifying DHSs can improve our understanding of chromatin accessibility. Despite the many resources available online, including experimental datasets and computational tools, the complex language of DHSs remains incompletely understood.
Here, we address this challenge with an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) that combines Inception-like networks with a gating mechanism, so that it responds to multiple patterns and long-term associations in DNA sequences, to predict multi-scale DHSs in Arabidopsis, rice and Homo sapiens.
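To make the idea of parallel multi-scale filters with gating concrete, the following is a minimal pure-Python sketch and not the authors' network. It assumes one-hot encoded DNA, three parallel convolution branches with hypothetical kernel widths 3, 5 and 7 (the Inception-like part), and a GLU-style sigmoid gate on each branch. The layer sizes, the gating form, the random untrained weights and all function names are illustrative assumptions; the text above does not specify them.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]


def conv1d(x, kernel):
    """Zero-padded 'same' 1D convolution of x (L rows, 4 channels) with one odd-width kernel."""
    k = len(kernel)
    pad = k // 2
    out = []
    for i in range(len(x)):
        total = 0.0
        for j in range(k):
            pos = i + j - pad
            if 0 <= pos < len(x):
                total += sum(w * v for w, v in zip(kernel[j], x[pos]))
        out.append(total)
    return out


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def random_kernel(width, rng):
    """Hypothetical untrained weights in [-1, 1]; a real model would learn these."""
    return [[rng.uniform(-1.0, 1.0) for _ in BASES] for _ in range(width)]


def gated_multiscale_block(x, widths=(3, 5, 7), seed=0):
    """Run one gated convolution per kernel width and concatenate the branches.

    Each branch computes feature * sigmoid(gate), where feature and gate are
    separate convolutions of the same input (an assumed GLU-style gate).
    Returns L rows with one value per branch.
    """
    rng = random.Random(seed)
    branches = []
    for w in widths:
        feature = conv1d(x, random_kernel(w, rng))
        gate = conv1d(x, random_kernel(w, rng))
        branches.append([f * sigmoid(g) for f, g in zip(feature, gate)])
    return [list(row) for row in zip(*branches)]


features = gated_multiscale_block(one_hot("ACGTNACGTA"))
print(len(features), len(features[0]))
```

Each output row holds one gated response per kernel width, so short and long sequence patterns are represented side by side at every position.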
Our method achieves an area under the curve (AUC) of 0.961 on Arabidopsis, 0.969 on rice and 0.918 on Homo sapiens.
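For readers unfamiliar with the metric, the sketch below shows one way an AUC value can be computed. It assumes that AUC here means the area under the ROC curve, which the text does not state explicitly. The function name `roc_auc` and the toy labels and scores are illustrative only and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    AUC equals the probability that a randomly chosen positive example
    scores higher than a randomly chosen negative one, counting ties as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 1 = DHS, 0 = non-DHS, with made-up classifier scores.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(toy_labels, toy_scores))
```

In the toy example, eight of the nine positive-negative pairs are ranked correctly, so the result is 8/9. The pairwise loop is quadratic in the number of examples; for large datasets a rank-based implementation would be preferable.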
Our method provides an efficient and accurate way to identify multi-scale DHS sequences by deep learning.