Identifying Transcription Factors That Prefer Binding to Methylated DNA Using Reduced g-Gap Dipeptide Composition.

Authors

Nguyen Quang H, Tran Hoang V, Nguyen Binh P, Do Trang T T

Affiliations

School of Information and Communication Technology, Hanoi University of Science and Technology, 1 Dai Co Viet, Hanoi 100000, Vietnam.

School of Mathematics and Statistics, Victoria University of Wellington, Kelburn Parade, Wellington 6140, New Zealand.

Publication information

ACS Omega. 2022 Aug 30;7(36):32322-32330. doi: 10.1021/acsomega.2c03696. eCollection 2022 Sep 13.

DOI: 10.1021/acsomega.2c03696
PMID: 36119976
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9475634/
Abstract

Transcription factors (TFs) play an important role in gene expression and in the regulation of 3D genome conformation. TFs can bind to specific DNA fragments called enhancers and promoters. Some TFs bind to promoter DNA fragments near the transcription initiation site and form complexes that allow polymerase enzymes to bind and initiate transcription. Previous studies showed that methylated DNA can inhibit or prevent TFs from binding to DNA fragments. However, recent studies have found that some TFs can bind to methylated DNA fragments. Identifying these TFs is an important stepping stone toward a better understanding of cellular gene expression mechanisms. Because experimental methods are often time-consuming and labor-intensive, developing computational methods is essential. In this study, we propose two machine learning methods for two problems: (1) identifying TFs and (2) identifying TFs that prefer binding to methylated DNA targets (TFPMs). For the TF identification problem, the proposed method uses the position-specific scoring matrix for data representation and a deep convolutional neural network for modeling. This method achieved 90.56% sensitivity, 83.96% specificity, and an area under the receiver operating characteristic curve (AUC) of 0.9596 on an independent test set. For the TFPM identification problem, we propose using the reduced g-gap dipeptide composition for data representation and the support vector machine algorithm for modeling. This method achieved 82.61% sensitivity, 64.86% specificity, and an AUC of 0.8486 on another independent test set. These results are higher than those reported by other studies on the same problems.
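
The reduced g-gap dipeptide composition named in the abstract can be illustrated with a minimal sketch. The idea, as generally understood, is to map the 20 amino acids onto a smaller alphabet of residue groups and then count pairs of group labels separated by g positions, normalized by the number of pairs. The six-group clustering below, the function names, and the choice to skip non-standard residues are assumptions made for illustration; the abstract does not state which reduced alphabet or gap value the authors used.

```python
from itertools import product

# Hypothetical reduced alphabet: six physicochemical groups. This clustering is
# an assumption for illustration, not the one used in the paper.
CLUSTERS = {
    "a": "AVLIMC",  # aliphatic / sulfur-containing
    "r": "FWYH",    # aromatic
    "p": "STNQ",    # polar, uncharged
    "k": "KR",      # positively charged
    "d": "DE",      # negatively charged
    "g": "GP",      # glycine / proline
}
RESIDUE_TO_GROUP = {aa: grp for grp, members in CLUSTERS.items() for aa in members}
GROUPS = sorted(CLUSTERS)
# Fixed feature order: every ordered pair of group labels (6 x 6 = 36 features).
PAIRS = ["".join(p) for p in product(GROUPS, repeat=2)]


def reduce_sequence(seq):
    """Map each standard residue to its group label; other characters are dropped."""
    return "".join(RESIDUE_TO_GROUP[aa] for aa in seq.upper() if aa in RESIDUE_TO_GROUP)


def reduced_g_gap_dpc(seq, g=0):
    """Return the normalized frequency of each group pair separated by g residues."""
    reduced = reduce_sequence(seq)
    counts = dict.fromkeys(PAIRS, 0)
    total = len(reduced) - g - 1  # number of (i, i + g + 1) position pairs
    if total <= 0:
        # Sequence too short for this gap: return an all-zero vector.
        return [0.0] * len(PAIRS)
    for i in range(total):
        counts[reduced[i] + reduced[i + g + 1]] += 1
    return [counts[p] / total for p in PAIRS]


if __name__ == "__main__":
    features = reduced_g_gap_dpc("MKRDEAGSTW", g=1)
    print(len(features), round(sum(features), 6))
```

Each protein then becomes a fixed-length vector regardless of its sequence length, which is what makes the representation usable as input to a classifier such as a support vector machine.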


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1be7/9475634/1da6799236de/ao2c03696_0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1be7/9475634/1d0dea450c12/ao2c03696_0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1be7/9475634/3b81dd412ee2/ao2c03696_0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1be7/9475634/f7d4027887e1/ao2c03696_0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1be7/9475634/6750dd65e874/ao2c03696_0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1be7/9475634/8fd81225ee79/ao2c03696_0007.jpg
