Dipartimento di Matematica e Informatica, Università degli studi di Palermo, Via Archirafi, 34, Palermo, 90123, Italy.
Dipartimento di Scienze per l'Innovazione tecnologica, Istituto Euro-Mediterraneo di Scienza e Tecnologia, Via Michele Miraglia, 20, Palermo, 9039, Italy.
BMC Bioinformatics. 2020 Sep 16;21(Suppl 8):326. doi: 10.1186/s12859-020-03627-x.
Nucleosomes package DNA within the nucleus of eukaryotic cells and regulate its transcription. Several studies indicate that nucleosome positions are determined by the combined effects of several factors, including DNA sequence organization. Interestingly, nucleosomes have been successfully identified on a genomic scale by computational methods that take the DNA sequence as input.
In this work, we propose CORENup, a deep learning model for nucleosome identification. CORENup takes a one-hot encoded DNA sequence as input and combines, in parallel, a fully convolutional neural network and a recurrent layer. These two parallel branches are designed to capture both non-periodic and periodic features of the DNA string. A dense layer then combines their outputs to produce the final classification.
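To make the input representation concrete, the following is a minimal sketch of one-hot encoding for a DNA string. It is an illustration only, not the authors' code: the function name `one_hot_encode`, the channel order A, C, G, T, and the choice to map unknown bases (such as N) to an all-zero row are assumptions that the abstract does not specify.

```python
# Illustrative one-hot encoding of a DNA string (a sketch, not the CORENup source).
# Assumption: channels are ordered A, C, G, T.
ALPHABET = "ACGT"


def one_hot_encode(sequence):
    """Return one row of four 0/1 values per base in `sequence`.

    Assumption: any symbol outside ACGT (for example N) becomes an all-zero row.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in sequence.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    # Print each base next to its encoded row.
    for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, row)
```

A matrix of this shape (sequence length by four) is the kind of input that both a convolutional branch and a recurrent branch can consume in parallel before a dense layer merges their outputs.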
Results computed on public datasets from different organisms show that CORENup is a state-of-the-art method for nucleosome positioning identification based on a deep neural network architecture. Comparisons were carried out on two groups of datasets currently adopted by the best-performing methods, and CORENup showed top performance in terms of both classification metrics and elapsed computation time.