Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture.

Author information

Wang Siguo, Zhang Qinhu, Shen Zhen, He Ying, Chen Zhen-Heng, Li Jianqiang, Huang De-Shuang

Affiliations

The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China.

Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Tongji University, Siping Road 1239, Shanghai 200092, China.

Publication information

Mol Ther Nucleic Acids. 2021 Feb 18;24:154-163. doi: 10.1016/j.omtn.2021.02.014. eCollection 2021 Jun 4.

Abstract

The study of transcriptional regulation remains difficult yet fundamental in molecular biology. Recent research has shown that the double-helix structure of DNA plays an important role in improving the accuracy and interpretability of transcription factor binding site (TFBS) prediction. Although several computational methods have been designed to consider DNA sequence and DNA shape features simultaneously, designing an efficient model remains an open problem. In this paper, we propose CRPTS, a hybrid convolutional and recurrent neural network (CNN/RNN) architecture that predicts TFBSs by combining DNA sequence and DNA shape features. The novelty of the method rests on three aspects: (1) a shared hybrid CNN and RNN efficiently extracts features from large-scale genomic sequences obtained by high-throughput technology; (2) common patterns are found in DNA sequences and their corresponding DNA shape features; (3) CRPTS can capture local structural information of DNA sequences without relying completely on DNA shape data. A series of comprehensive experiments on 66 datasets derived from universal protein binding microarrays (uPBMs) shows that CRPTS clearly outperforms state-of-the-art methods.
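
To make the idea of a shared feature extractor concrete, here is a minimal pure-Python sketch. It is not the CRPTS implementation. It only illustrates, under stated assumptions, how one convolution kernel could be applied to both a one-hot-encoded DNA sequence and a shape-feature matrix of the same width. The one-hot encoding, the four-channel shape matrix, the function names (`one_hot`, `conv1d_relu`, `shared_score`), and the use of global max pooling are all assumptions made for illustration. The abstract does not specify them, and the recurrent part of the architecture is omitted entirely.

```python
from typing import List, Tuple

Matrix = List[List[float]]

BASES = "ACGT"  # assumed channel order for the one-hot encoding


def one_hot(seq: str) -> Matrix:
    """Encode a DNA string as a len(seq) x 4 matrix.

    Characters outside A/C/G/T (for example N) become all-zero rows.
    """
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def conv1d_relu(x: Matrix, kernel: Matrix) -> List[float]:
    """Valid 1D cross-correlation of an L x C input with a k x C kernel, then ReLU."""
    k = len(kernel)
    channels = len(kernel[0])
    if len(x) < k:
        raise ValueError("input is shorter than the kernel")
    out = []
    for start in range(len(x) - k + 1):
        total = sum(
            x[start + i][c] * kernel[i][c]
            for i in range(k)
            for c in range(channels)
        )
        out.append(max(0.0, total))
    return out


def shared_score(seq_matrix: Matrix, shape_matrix: Matrix, kernel: Matrix) -> Tuple[float, float]:
    """Apply the SAME kernel to both inputs and global-max-pool each response.

    Sharing one kernel is the illustrative assumption here: both inputs must
    have the same number of channels as the kernel.
    """
    return (
        max(conv1d_relu(seq_matrix, kernel)),
        max(conv1d_relu(shape_matrix, kernel)),
    )
```

For example, a hypothetical kernel `[[1, 0, 0, 0], [0, 1, 0, 0]]` responds most strongly wherever the sequence contains the dinucleotide `AC`, and the same weights are reused unchanged on the shape matrix. In a trained model the kernel values would be learned rather than written by hand.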

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db63/7972936/d8405ecb8bb9/fx1.jpg
