Predicting transcription factor binding sites using DNA shape features based on shared hybrid deep learning architecture.

Authors

Wang Siguo, Zhang Qinhu, Shen Zhen, He Ying, Chen Zhen-Heng, Li Jianqiang, Huang De-Shuang

Affiliations

The Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, No. 4800 Caoan Road, Shanghai 201804, China.

Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Tongji University, Siping Road 1239, Shanghai 200092, China.

Publication information

Mol Ther Nucleic Acids. 2021 Feb 18;24:154-163. doi: 10.1016/j.omtn.2021.02.014. eCollection 2021 Jun 4.

DOI:10.1016/j.omtn.2021.02.014
PMID:33767912
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7972936/
Abstract

The study of transcriptional regulation is still difficult yet fundamental in molecular biology research. Recent research has shown that the double helix structure of nucleotides plays an important role in improving the accuracy and interpretability of transcription factor binding sites (TFBSs). Although several computational methods have been designed to take both DNA sequence and DNA shape features into consideration simultaneously, how to design an efficient model is still an intractable topic. In this paper, we proposed a hybrid convolutional recurrent neural network (CNN/RNN) architecture, CRPTS, to predict TFBSs by combining DNA sequence and DNA shape features. The novelty of our proposed method relies on three critical aspects: (1) the application of a shared hybrid CNN and RNN has the ability to efficiently extract features from large-scale genomic sequences obtained by high-throughput technology; (2) the common patterns were found from DNA sequences and their corresponding DNA shape features; (3) our proposed CRPTS can capture local structural information of DNA sequences without completely relying on DNA shape data. A series of comprehensive experiments on 66 datasets derived from universal protein binding microarrays (uPBMs) shows that our proposed method CRPTS obviously outperforms the state-of-the-art methods.
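
The shared-weight idea in aspect (1) can be illustrated with a minimal sketch. It is not the authors' CRPTS implementation: the abstract gives no layer sizes, so the kernel width, the filter count, the random weights, and the choice of four shape features per position (so that both inputs have the same channel count) are assumptions made purely for illustration. The recurrent layer is omitted. The sketch only shows one set of convolution kernels being applied to both a one-hot DNA sequence and a matching shape matrix.

```python
import random

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as rows of 4 values (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]

def conv1d_relu(rows, kernels):
    """Valid 1-D convolution along positions, followed by ReLU.

    rows: L x C matrix; kernels: list of K x C filters.
    Returns an (L - K + 1) x len(kernels) feature map.
    """
    k = len(kernels[0])
    out = []
    for start in range(len(rows) - k + 1):
        window = rows[start:start + k]
        out.append([
            max(0.0, sum(w * x
                         for wrow, xrow in zip(kern, window)
                         for w, x in zip(wrow, xrow)))
            for kern in kernels
        ])
    return out

def global_max_pool(feature_map):
    """Keep the strongest response of each filter across all positions."""
    return [max(column) for column in zip(*feature_map)]

def shared_branch_features(seq, shape_rows, kernels):
    """Apply the SAME kernels to the sequence and shape inputs, then concatenate."""
    seq_feats = global_max_pool(conv1d_relu(one_hot(seq), kernels))
    shape_feats = global_max_pool(conv1d_relu(shape_rows, kernels))
    return seq_feats + shape_feats

# Hypothetical setup: 2 filters of width 3 over 4 channels, random weights.
random.seed(0)
kernels = [[[random.uniform(-1, 1) for _ in range(4)] for _ in range(3)]
           for _ in range(2)]
seq = "ACGTAC"
# Placeholder shape values; real DNA shape features would come from a shape-prediction tool.
shape = [[random.random() for _ in range(4)] for _ in seq]
features = shared_branch_features(seq, shape, kernels)
```

Here `features` holds two pooled values from the sequence branch followed by two from the shape branch. In a full model these would feed a recurrent layer and a classifier, neither of which is shown.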

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db63/7972936/0b5413e74dae/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db63/7972936/d8405ecb8bb9/fx1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db63/7972936/532e5f0b823e/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db63/7972936/816548209c94/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db63/7972936/c68fcb4af20e/gr3.jpg
