Department of Biomedical Engineering.
Key Laboratory of Convergence Medical Engineering System and Healthcare Technology of the Ministry of Industry and Information Technology, School of Life Science, Beijing Institute of Technology, Beijing 100081, China.
Bioinformatics. 2018 May 15;34(10):1705-1712. doi: 10.1093/bioinformatics/bty003.
Nucleosome positioning plays a significant role in proper genome packing and in the accessibility of the genome for transcriptional regulation. Despite the many nucleosome positioning resources available online, including experimental datasets of genome-wide nucleosome occupancy profiles and computational tools for analyzing these data, the complex language of eukaryotic nucleosome positioning remains incompletely understood.
Here, we address this challenge using an approach based on a state-of-the-art machine learning method. We present a novel convolutional neural network (CNN) to understand nucleosome positioning. We combined Inception-like networks with a gating mechanism to respond to multiple patterns and long-term associations in DNA sequences. We developed the open-source package LeNup, based on this CNN, to predict nucleosome positioning in the Homo sapiens, Caenorhabditis elegans, Drosophila melanogaster and Saccharomyces cerevisiae genomes. We trained LeNup on four benchmark datasets. LeNup achieved greater predictive accuracy than previously published methods.
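The idea of combining Inception-like parallel convolutions with a gating mechanism can be illustrated with a minimal pure-Python sketch. This is not the LeNup implementation: the function names (`one_hot`, `conv1d`, `gated_inception_block`), the multiplicative sigmoid gate, and the hand-set kernels are all assumptions chosen for illustration, and a real model would learn its kernels during training.

```python
import math

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d(x, kernel):
    """Zero-padded ('same') 1D convolution of x (L x 4) with an odd-width kernel (k x 4)."""
    k = len(kernel)
    pad = k // 2
    out = []
    for i in range(len(x)):
        total = 0.0
        for j in range(k):
            pos = i + j - pad
            if 0 <= pos < len(x):
                total += sum(a * w for a, w in zip(x[pos], kernel[j]))
        out.append(total)
    return out

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def gated_inception_block(x, branches):
    """Apply parallel branches with different kernel widths (the Inception-like part).

    Each branch is a hypothetical (feature_kernel, gate_kernel) pair. Its output is
    conv(x, feature_kernel) * sigmoid(conv(x, gate_kernel)), so the gate scales how
    much of each detected pattern passes through (assumed gating form).
    """
    channels = []
    for feature_kernel, gate_kernel in branches:
        features = conv1d(x, feature_kernel)
        gates = conv1d(x, gate_kernel)
        channels.append([f * sigmoid(g) for f, g in zip(features, gates)])
    return channels

# Illustrative, hand-set kernels: a width-1 detector for "A" and a width-3 detector for "ACG".
# The all-zero gate kernels give a constant gate of sigmoid(0) = 0.5.
detect_a = ([[1, 0, 0, 0]], [[0, 0, 0, 0]])
detect_acg = (
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],
)

channels = gated_inception_block(one_hot("ACGTAC"), [detect_a, detect_acg])
for channel in channels:
    print(channel)
```

Each branch yields one output channel with the same length as the input sequence, so branches with different kernel widths can be stacked side by side and passed to a later layer.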
LeNup is freely available as Python and Lua script source code under a BSD-style license from https://github.com/biomedBit/LeNup.
Supplementary data are available at Bioinformatics online.