A machine learning technique for identifying DNA enhancer regions utilizing CIS-regulatory element patterns.

Affiliations

Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan.

Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass, Saudi Arabia.

Publication information

Sci Rep. 2022 Sep 7;12(1):15183. doi: 10.1038/s41598-022-19099-3.

DOI:10.1038/s41598-022-19099-3
PMID:36071071
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9452539/
Abstract

Enhancers regulate gene expression, by playing a crucial role in the synthesis of RNAs and proteins. They do not directly encode proteins or RNA molecules. In order to control gene expression, it is important to predict enhancers and their potency. Given their distance from the target gene, lack of common motifs, and tissue/cell specificity, enhancer regions are thought to be difficult to predict in DNA sequences. Recently, a number of bioinformatics tools were created to distinguish enhancers from other regulatory components and to pinpoint their advantages. However, because the quality of its prediction method needs to be improved, its practical application value must also be improved. Based on nucleotide composition and statistical moment-based features, the current study suggests a novel method for identifying enhancers and non-enhancers and evaluating their strength. The proposed study outperformed state-of-the-art techniques using fivefold and tenfold cross-validation in terms of accuracy. The accuracy from the current study results in 86.5% and 72.3% in enhancer site and its strength prediction respectively. The results of the suggested methodology point to the potential for more efficient and successful outcomes when statistical moment-based features are used. The current study's source code is available to the research community at https://github.com/csbioinfopk/enpred .
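
The abstract names nucleotide composition and statistical moment-based features as the inputs to the classifier. As a rough illustration only, the sketch below computes base frequencies plus raw and central moments of a numerically encoded DNA sequence. The encoding table, the function names, and the choice of moment orders are assumptions made for this example and are not taken from the paper; the authors' actual implementation is in their repository.

```python
from collections import Counter

# Hypothetical numeric encoding of nucleotides (an assumption for this sketch;
# the paper's own encoding may differ).
ENCODING = {"A": 1, "C": 2, "G": 3, "T": 4}


def nucleotide_composition(seq):
    """Return the relative frequency of A, C, G and T in seq."""
    seq = seq.upper()
    counts = Counter(seq)
    return [counts[base] / len(seq) for base in "ACGT"]


def raw_moment(values, order):
    """Raw moment of the given order: the mean of x**order."""
    return sum(x ** order for x in values) / len(values)


def central_moment(values, order):
    """Central moment of the given order: the mean of (x - mean)**order."""
    mean = raw_moment(values, 1)
    return sum((x - mean) ** order for x in values) / len(values)


def feature_vector(seq, max_order=3):
    """Concatenate composition with raw and central moments up to max_order."""
    values = [ENCODING[base] for base in seq.upper()]
    features = nucleotide_composition(seq)
    features += [raw_moment(values, k) for k in range(1, max_order + 1)]
    features += [central_moment(values, k) for k in range(2, max_order + 1)]
    return features
```

With the default `max_order=3`, `feature_vector` returns nine numbers per sequence: four composition values, three raw moments, and two central moments. A vector like this could then be passed to any standard classifier and evaluated with k-fold cross-validation.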


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/665835fca7e7/41598_2022_19099_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/3fd139b2a0f2/41598_2022_19099_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/03ecbb21d4b8/41598_2022_19099_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/5dfad157edfc/41598_2022_19099_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/928a3a96d791/41598_2022_19099_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/a501a9849571/41598_2022_19099_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/a0919217098a/41598_2022_19099_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/a425290374cc/41598_2022_19099_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/c96a17848e3b/41598_2022_19099_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/e7302b74e8cc/41598_2022_19099_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/d1039f86a86e/41598_2022_19099_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/9bcaa80be3a1/41598_2022_19099_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/c73454a38c37/41598_2022_19099_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/6fe07def2092/41598_2022_19099_Fig14_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cd4f/9452539/e66bb553f12f/41598_2022_19099_Fig15_HTML.jpg

Similar articles

1
A machine learning technique for identifying DNA enhancer regions utilizing CIS-regulatory element patterns.
Sci Rep. 2022 Sep 7;12(1):15183. doi: 10.1038/s41598-022-19099-3.
2
iEnhancer-SKNN: a stacking ensemble learning-based method for enhancer identification and classification using sequence information.
Brief Funct Genomics. 2023 May 18;22(3):302-311. doi: 10.1093/bfgp/elac057.
3
Enhancer-FRL: Improved and Robust Identification of Enhancers and Their Activities Using Feature Representation Learning.
IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):967-975. doi: 10.1109/TCBB.2022.3204365. Epub 2023 Apr 3.
4
Integrative machine learning framework for the identification of cell-specific enhancers from the human genome.
Brief Bioinform. 2021 Nov 5;22(6). doi: 10.1093/bib/bbab252.
5
A comprehensive revisit of the machine-learning tools developed for the identification of enhancers in the human genome.
Proteomics. 2023 Jul;23(13-14):e2200409. doi: 10.1002/pmic.202200409. Epub 2023 Jun 7.
6
Improving Enhancer Identification with a Multi-Classifier Stacked Ensemble Model.
J Mol Biol. 2023 Dec 1;435(23):168314. doi: 10.1016/j.jmb.2023.168314. Epub 2023 Oct 16.
7
Cross-species enhancer prediction using machine learning.
Genomics. 2022 Sep;114(5):110454. doi: 10.1016/j.ygeno.2022.110454. Epub 2022 Aug 25.
8
BiRen: predicting enhancers with a deep-learning-based model using the DNA sequence alone.
Bioinformatics. 2017 Jul 1;33(13):1930-1936. doi: 10.1093/bioinformatics/btx105.
9
EnhancerPred2.0: predicting enhancers and their strength based on position-specific trinucleotide propensity and electron-ion interaction potential feature selection.
Mol Biosyst. 2017 Mar 28;13(4):767-774. doi: 10.1039/c7mb00054e.
10
Opening up the blackbox: an interpretable deep neural network-based classifier for cell-type specific enhancer predictions.
BMC Syst Biol. 2016 Aug 1;10 Suppl 2(Suppl 2):54. doi: 10.1186/s12918-016-0302-3.

Cited by

1
An ensemble strategy for piRNA identification through hybrid moment-based feature modeling.
Sci Rep. 2025 Aug 18;15(1):30157. doi: 10.1038/s41598-025-14194-7.
2
m5c-iEnsem: 5-methylcytosine sites identification through ensemble models.
Bioinformatics. 2022 Jan 1;41(1). doi: 10.1093/bioinformatics/btae722.
3
Gene replacement therapies for inherited disorders of neurotransmission: Current progress in succinic semialdehyde dehydrogenase deficiency.
J Inherit Metab Dis. 2024 May;47(3):476-493. doi: 10.1002/jimd.12735. Epub 2024 Apr 6.
4
m1A-Ensem: accurate identification of 1-methyladenosine sites through ensemble models.
BioData Min. 2024 Feb 15;17(1):4. doi: 10.1186/s13040-023-00353-x.
5
Prediction accuracy of regulatory elements from sequence varies by functional sequencing technique.
Front Cell Infect Microbiol. 2023 Aug 2;13:1182567. doi: 10.3389/fcimb.2023.1182567. eCollection 2023.
6
iDHU-Ensem: Identification of dihydrouridine sites through ensemble learning models.
Digit Health. 2023 Mar 29;9:20552076231165963. doi: 10.1177/20552076231165963. eCollection 2023 Jan-Dec.

References

1
iEnhancer-MFGBDT: Identifying enhancers and their strength by fusing multiple features and gradient boosting decision tree.
Math Biosci Eng. 2021 Oct 14;18(6):8797-8814. doi: 10.3934/mbe.2021434.
2
4mC-RF: Improving the prediction of 4mC sites using composition and position relative features and statistical moment.
Anal Biochem. 2021 Nov 15;633:114385. doi: 10.1016/j.ab.2021.114385. Epub 2021 Sep 25.
3
iSUMOK-PseAAC: prediction of lysine sumoylation sites using statistical moments and Chou's PseAAC.
PeerJ. 2021 Aug 4;9:e11581. doi: 10.7717/peerj.11581. eCollection 2021.
4
iEnhancer-RD: Identification of enhancers and their strength using RKPK features and deep neural networks.
Anal Biochem. 2021 Oct 1;630:114318. doi: 10.1016/j.ab.2021.114318. Epub 2021 Aug 5.
5
iEnhancer-GAN: A Deep Learning Framework in Combination with Word Embedding and Sequence Generative Adversarial Net to Identify Enhancers and Their Strength.
Int J Mol Sci. 2021 Mar 30;22(7):3589. doi: 10.3390/ijms22073589.
6
A transformer architecture based on BERT and 2D convolutional neural network to identify DNA enhancers from sequence information.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab005.
7
ES-ARCNN: Predicting enhancer strength by using data augmentation and residual convolutional neural network.
Anal Biochem. 2021 Apr 1;618:114120. doi: 10.1016/j.ab.2021.114120. Epub 2021 Jan 31.
8
iEnhancer-KL: A Novel Two-Layer Predictor for Identifying Enhancers by Position Specific of Nucleotide Composition.
IEEE/ACM Trans Comput Biol Bioinform. 2021 Nov-Dec;18(6):2809-2815. doi: 10.1109/TCBB.2021.3053608. Epub 2021 Dec 8.
9
iHyd-LysSite (EPSV): Identifying Hydroxylysine Sites in Protein Using Statistical Formulation by Extracting Enhanced Position and Sequence Variant Feature Technique.
Curr Genomics. 2020 Nov;21(7):536-545. doi: 10.2174/1389202921999200831142629.
10
iEnhancer-XG: interpretable sequence-based enhancers and their strength predictor.
Bioinformatics. 2021 May 23;37(8):1060-1067. doi: 10.1093/bioinformatics/btaa914.