
A KAN-based hybrid deep neural networks for accurate identification of transcription factor binding sites.

Authors

He Guodong, Ye Jiahao, Hao Huijun, Chen Wei

Affiliation

School of Information Engineering, Wenzhou Business College, Wenzhou, Zhejiang, PR China.

Publication details

PLoS One. 2025 May 7;20(5):e0322978. doi: 10.1371/journal.pone.0322978. eCollection 2025.

DOI: 10.1371/journal.pone.0322978
PMID: 40334196
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12058130/
Abstract

BACKGROUND

Predicting protein-DNA binding sites in vivo is a challenging but urgent task in many fields, such as drug design and development. Most promoters contain many transcription factor (TF) binding sites, yet only a few have been identified through time-consuming biochemical experiments. To address this challenge, numerous computational approaches have been proposed to predict TF binding sites from DNA sequences. However, current deep learning methods often suffer from problems such as vanishing gradients as model depth increases, leading to suboptimal feature extraction.
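
The depth-related gradient problem mentioned above can be illustrated with a toy calculation. The sketch below is not taken from the paper: it assumes a chain of scalar tanh layers with a hypothetical fixed weight of 0.5, and compares the input gradient of a plain stack with that of a stack that adds a skip connection at every layer.

```python
import math

def input_gradient(depth, weight=0.5, x0=0.3, residual=False):
    """Return d(output)/d(input) for a chain of scalar tanh layers.

    Plain layer:    x_next = tanh(weight * x)
    Residual layer: x_next = x + tanh(weight * x)
    The weight, depth, and starting value are illustrative assumptions.
    """
    x, grad = x0, 1.0
    for _ in range(depth):
        h = math.tanh(weight * x)
        local = weight * (1.0 - h * h)   # derivative of tanh(weight * x) w.r.t. x
        if residual:
            grad *= 1.0 + local          # the skip path contributes the constant 1
            x = x + h
        else:
            grad *= local                # each factor is at most `weight` < 1
            x = h
    return grad

plain = input_gradient(depth=30)
skip = input_gradient(depth=30, residual=True)
print(f"plain stack:    {plain:.3e}")
print(f"residual stack: {skip:.3e}")
```

In the plain stack every factor is below 1, so the product shrinks geometrically with depth. With the skip connection every factor is at least 1, so the gradient cannot vanish in this toy setting.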

RESULTS

We propose a model called CBR-KAN (where C stands for Convolutional Neural Network (CNN), B for Bidirectional Long Short-Term Memory (BiLSTM), and R for the residual mechanism) to predict transcription factor binding sites. Specifically, we designed a multi-scale convolution module (ConvBlock1, 2, 3) combined with a BiLSTM network, introduced a Kolmogorov-Arnold Network (KAN) to replace the traditional multilayer perceptron, and promoted model optimization through residual connections. Testing on 50 common ChIP-seq benchmark datasets shows that CBR-KAN outperforms other state-of-the-art methods such as DeepBind, DanQ, DeepD2V, and DeepSEA in predicting TF binding sites.
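
To make the step of using a KAN in place of a multilayer perceptron more concrete, here is a minimal sketch of a KAN-style layer in plain Python. It is not the authors' implementation. In a KAN, each input-to-output edge carries its own learnable univariate function and a node simply sums its incoming edges, whereas an MLP applies a fixed activation after a learned linear map. The original KAN formulation parameterizes the edge functions with B-splines plus a SiLU base term; this sketch swaps in Gaussian radial basis functions for brevity. The class name `ToyKANLayer`, the basis count, the grid range, the random initialization, and the sigmoid read-out are all assumptions made for illustration.

```python
import math
import random

def silu(x):
    """SiLU activation, used as the fixed base term of each edge function."""
    return x / (1.0 + math.exp(-x))

class ToyKANLayer:
    """y_j = sum_i phi_ji(x_i), where every phi_ji is a learnable 1-D function.

    Here phi_ji(x) = base_ji * silu(x) + sum_k coef_jik * exp(-((x - c_k) / width)^2).
    Gaussian bumps stand in for the B-splines of the original KAN (an assumption).
    """

    def __init__(self, n_in, n_out, n_basis=5, x_min=-2.0, x_max=2.0, seed=0):
        rng = random.Random(seed)
        step = (x_max - x_min) / (n_basis - 1)
        self.n_in, self.n_out = n_in, n_out
        self.centers = [x_min + k * step for k in range(n_basis)]
        self.width = step
        # One coefficient vector and one base weight per (output, input) edge.
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(n_basis)]
                      for _ in range(n_in)] for _ in range(n_out)]
        self.base = [[rng.gauss(0.0, 0.1) for _ in range(n_in)]
                     for _ in range(n_out)]

    def edge(self, j, i, x):
        """Evaluate the univariate function on the edge from input i to output j."""
        bumps = sum(c * math.exp(-((x - mu) / self.width) ** 2)
                    for c, mu in zip(self.coef[j][i], self.centers))
        return self.base[j][i] * silu(x) + bumps

    def forward(self, x):
        """Sum the edge functions arriving at each output node."""
        if len(x) != self.n_in:
            raise ValueError("expected %d inputs, got %d" % (self.n_in, len(x)))
        return [sum(self.edge(j, i, xi) for i, xi in enumerate(x))
                for j in range(self.n_out)]

layer = ToyKANLayer(n_in=4, n_out=1)
features = [0.2, -1.0, 0.7, 1.5]   # stand-in for features from the CNN/BiLSTM stages
score = layer.forward(features)[0]
probability = 1.0 / (1.0 + math.exp(-score))  # assumed binary binding score
print(round(probability, 3))
```

Training such a layer would mean fitting `coef` and `base` by gradient descent, which is omitted here. The point of the sketch is only that the nonlinearity lives on the edges rather than on the nodes.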

CONCLUSIONS

The CBR-KAN model significantly improves prediction accuracy for transcription factor binding sites by effectively integrating multiple neural network architectures and mechanisms. This approach not only enhances feature extraction but also stabilizes training and boosts generalization capabilities. The promising results on multiple key performance indicators demonstrate the potential of CBR-KAN in bioinformatics applications.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2128/12058130/a8f074120fdd/pone.0322978.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2128/12058130/94e217bcf919/pone.0322978.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2128/12058130/73b6cc07af35/pone.0322978.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2128/12058130/8bd93a6b9cef/pone.0322978.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2128/12058130/10d854528976/pone.0322978.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2128/12058130/795556a1c82c/pone.0322978.g006.jpg

Similar articles

1
A KAN-based hybrid deep neural networks for accurate identification of transcription factor binding sites.
PLoS One. 2025 May 7;20(5):e0322978. doi: 10.1371/journal.pone.0322978. eCollection 2025.
2
DeepD2V: A Novel Deep Learning-Based Framework for Predicting Transcription Factor Binding Sites from Combined DNA Sequence.
Int J Mol Sci. 2021 May 24;22(11):5521. doi: 10.3390/ijms22115521.
3
BERT-TFBS: a novel BERT-based model for predicting transcription factor binding sites by transfer learning.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae195.
4
CacPred: a cascaded convolutional neural network for TF-DNA binding prediction.
BMC Genomics. 2025 Mar 18;26(Suppl 2):264. doi: 10.1186/s12864-025-11399-y.
5
Multi-Scale Capsule Network for Predicting DNA-Protein Binding Sites.
IEEE/ACM Trans Comput Biol Bioinform. 2021 Sep-Oct;18(5):1793-1800. doi: 10.1109/TCBB.2020.3025579. Epub 2021 Oct 7.
6
Enhancing the interpretability of transcription factor binding site prediction using attention mechanism.
Sci Rep. 2020 Aug 7;10(1):13413. doi: 10.1038/s41598-020-70218-4.
7
DeepReg: a deep learning hybrid model for predicting transcription factors in eukaryotic and prokaryotic genomes.
Sci Rep. 2024 Apr 21;14(1):9155. doi: 10.1038/s41598-024-59487-5.
8
A survey on protein-DNA-binding sites in computational biology.
Brief Funct Genomics. 2022 Sep 16;21(5):357-375. doi: 10.1093/bfgp/elac009.
9
BCDB: A dual-branch network based on transformer for predicting transcription factor binding sites.
Methods. 2025 Feb;234:141-151. doi: 10.1016/j.ymeth.2024.12.006. Epub 2024 Dec 17.
10
High-Order Convolutional Neural Network Architecture for Predicting DNA-Protein Binding Sites.
IEEE/ACM Trans Comput Biol Bioinform. 2019 Jul-Aug;16(4):1184-1192. doi: 10.1109/TCBB.2018.2819660. Epub 2018 Mar 26.

References cited in this article

1
PSSM-Sumo: deep learning based intelligent model for prediction of sumoylation sites using discriminative features.
BMC Bioinformatics. 2024 Aug 30;25(1):284. doi: 10.1186/s12859-024-05917-0.
2
DeepD2V: A Novel Deep Learning-Based Framework for Predicting Transcription Factor Binding Sites from Combined DNA Sequence.
Int J Mol Sci. 2021 May 24;22(11):5521. doi: 10.3390/ijms22115521.
3
AIKYATAN: mapping distal regulatory elements using convolutional learning on GPU.
BMC Bioinformatics. 2019 Oct 7;20(1):488. doi: 10.1186/s12859-019-3049-1.
4
Predicting the impact of single nucleotide variants on splicing via sequence-based deep neural networks and genomic features.
Hum Mutat. 2019 Sep;40(9):1261-1269. doi: 10.1002/humu.23794. Epub 2019 Jun 23.
5
Weakly-Supervised Convolutional Neural Network Architecture for Predicting Protein-DNA Binding.
IEEE/ACM Trans Comput Biol Bioinform. 2020 Mar-Apr;17(2):679-689. doi: 10.1109/TCBB.2018.2864203. Epub 2018 Aug 7.
6
gkmSVM: an R package for gapped-kmer SVM.
Bioinformatics. 2016 Jul 15;32(14):2205-7. doi: 10.1093/bioinformatics/btw203. Epub 2016 Apr 19.
7
DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences.
Nucleic Acids Res. 2016 Jun 20;44(11):e107. doi: 10.1093/nar/gkw226. Epub 2016 Apr 15.
8
Gene expression inference with deep learning.
Bioinformatics. 2016 Jun 15;32(12):1832-9. doi: 10.1093/bioinformatics/btw074. Epub 2016 Feb 11.
9
Predicting effects of noncoding variants with deep learning-based sequence model.
Nat Methods. 2015 Oct;12(10):931-4. doi: 10.1038/nmeth.3547. Epub 2015 Aug 24.
10
Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning.
Nat Biotechnol. 2015 Aug;33(8):831-8. doi: 10.1038/nbt.3300. Epub 2015 Jul 27.