TFProtBert: Detection of Transcription Factors Binding to Methylated DNA Using ProtBert Latent Space Representation.

Authors

Gaffar Saima, Chong Kil To, Tayara Hilal

Affiliations

Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea.

Advances Electronics and Information Research Centre, Jeonbuk National University, Jeonju 54896, Republic of Korea.

Publication information

Int J Mol Sci. 2025 Apr 29;26(9):4234. doi: 10.3390/ijms26094234.

DOI:10.3390/ijms26094234
PMID:40362469
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12071566/
Abstract

Transcription factors (TFs) are fundamental regulators of gene expression and perform diverse functions in cellular processes. The regulation of three-dimensional (3D) genome conformation and gene expression relies primarily on TFs, which recruit the transcriptional machinery to the enhancers or promoters of specific genes, thereby activating or inhibiting transcription. Identifying TFs is therefore a significant step towards understanding cellular gene expression mechanisms. Because experimental methods are time-consuming and labor-intensive, computational models are essential. In this work, we introduce a two-layer prediction framework based on a support vector machine (SVM) that uses the latent space representation of a protein language model, ProtBert. The first layer identifies TFs, and the second layer identifies TFs that prefer binding to methylated deoxyribonucleic acid (TFPMs). We also tested the proposed method on an imbalanced dataset. In detecting TFs and TFPMs, the proposed model consistently outperformed state-of-the-art approaches in both cross-validation and independent tests.
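
The two-layer idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the second layer is only consulted for sequences the first layer accepts as TFs, and it replaces the trained SVMs with toy linear decision rules over two-dimensional vectors standing in for ProtBert embeddings. The names `TwoLayerPredictor`, `linear_rule`, and the output labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Embedding = Sequence[float]
# Stand-in for a trained classifier's yes/no decision (an SVM in the paper).
Classifier = Callable[[Embedding], bool]


def linear_rule(weights: Sequence[float], bias: float) -> Classifier:
    """Toy linear decision rule: True when w . x + b > 0 (hypothetical weights)."""
    def decide(x: Embedding) -> bool:
        return sum(w * v for w, v in zip(weights, x)) + bias > 0
    return decide


@dataclass
class TwoLayerPredictor:
    is_tf: Classifier               # layer 1: TF vs. non-TF
    prefers_methylated: Classifier  # layer 2: TFPM vs. other TFs

    def predict(self, embedding: Embedding) -> str:
        # Assumption: layer 2 runs only on sequences that layer 1 labels as TFs.
        if not self.is_tf(embedding):
            return "non-TF"
        if self.prefers_methylated(embedding):
            return "TFPM"
        return "TF (no methylation preference)"


# Toy 2-D vectors in place of real ProtBert embeddings.
predictor = TwoLayerPredictor(
    is_tf=linear_rule([1.0, 0.0], -0.5),
    prefers_methylated=linear_rule([0.0, 1.0], -0.5),
)
labels = [predictor.predict(x) for x in [(0.1, 0.9), (0.9, 0.1), (0.9, 0.9)]]
print(labels)
```

In a real pipeline, each `Classifier` would wrap a trained SVM's prediction and each embedding would be a fixed-length vector extracted from ProtBert; how the authors pool or select that representation is not stated in the abstract.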

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac4/12071566/4f2b1c67aaa1/ijms-26-04234-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac4/12071566/5d96ea6ad913/ijms-26-04234-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac4/12071566/2126a1516bf9/ijms-26-04234-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2ac4/12071566/92133251ce19/ijms-26-04234-g004.jpg

Similar articles

1
TFProtBert: Detection of Transcription Factors Binding to Methylated DNA Using ProtBert Latent Space Representation.
Int J Mol Sci. 2025 Apr 29;26(9):4234. doi: 10.3390/ijms26094234.
2
Identifying the DNA methylation preference of transcription factors using ProtBERT and SVM.
PLoS Comput Biol. 2025 May 13;21(5):e1012513. doi: 10.1371/journal.pcbi.1012513. eCollection 2025 May.
3
Identifying Transcription Factors That Prefer Binding to Methylated DNA Using Reduced g-Gap Dipeptide Composition.
ACS Omega. 2022 Aug 30;7(36):32322-32330. doi: 10.1021/acsomega.2c03696. eCollection 2022 Sep 13.
4
Detection of transcription factors binding to methylated DNA by deep recurrent neural network.
Brief Bioinform. 2022 Jan 17;23(1). doi: 10.1093/bib/bbab533.
5
TFpredict and SABINE: sequence-based prediction of structural and functional characteristics of transcription factors.
PLoS One. 2013 Dec 12;8(12):e82238. doi: 10.1371/journal.pone.0082238. eCollection 2013.
6
Base-resolution methylation patterns accurately predict transcription factor bindings in vivo.
Nucleic Acids Res. 2015 Mar 11;43(5):2757-66. doi: 10.1093/nar/gkv151. Epub 2015 Feb 26.
7
Quantitative modeling of transcription factor binding specificities using DNA shape.
Proc Natl Acad Sci U S A. 2015 Apr 14;112(15):4654-9. doi: 10.1073/pnas.1422023112. Epub 2015 Mar 9.
8
BindSpace decodes transcription factor binding signals by large-scale sequence embedding.
Nat Methods. 2019 Sep;16(9):858-861. doi: 10.1038/s41592-019-0511-y. Epub 2019 Aug 12.
9
Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data.
BMC Bioinformatics. 2008 Apr 21;9:203. doi: 10.1186/1471-2105-9-203.
10
DeepTFactor: A deep learning-based tool for the prediction of transcription factors.
Proc Natl Acad Sci U S A. 2021 Jan 12;118(2). doi: 10.1073/pnas.2021171118.
