DHU-Pred: accurate prediction of dihydrouridine sites using position and composition variant features on diverse classifiers.

Affiliations

Department of Computer Science, School of Systems and Technology, University of Management & Technology, Lahore, Pakistan.

Department of Computer, College of Science and Arts in Ar Rass Qassim University, Ar Rass, Qassim, Saudi Arabia.

Publication information

PeerJ. 2022 Oct 27;10:e14104. doi: 10.7717/peerj.14104. eCollection 2022.

DOI:10.7717/peerj.14104

PMID:36320563

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC9618264/

Abstract

BACKGROUND

Dihydrouridine (D) is a post-transcriptional modification (PTM) of transfer RNA (tRNA) that occurs abundantly in bacteria, eukaryotes, and archaea. The D modification contributes to the stability and conformational flexibility of tRNA. It is also responsible for pulmonary carcinogenesis in humans.

OBJECTIVE

Mass spectrometry and site-directed mutagenesis have been developed for the detection of D sites, but both methods are labor-intensive and time-consuming. The availability of sequence data has provided the opportunity to build computational models that improve the identification of D sites. The DHU-Pred model, which works from sequence data, is proposed in this study to find possible D sites.

METHODOLOGY

The model was built using comprehensive machine learning and feature extraction approaches. It was then validated using widely used evaluation metrics and rigorous experimentation and testing.
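The abstract does not spell out the feature definitions, so the following is only a minimal sketch of what sequence-based feature extraction of this general kind might look like. The function names, the choice of k-mer frequencies as composition features, and the position-weighted sum as a position feature are illustrative assumptions, not the authors' published formulation.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGU"


def kmer_composition(seq: str, k: int = 2) -> list[float]:
    """Relative frequency of every possible k-mer (position-independent)."""
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    windows = max(len(seq) - k + 1, 0)
    if windows == 0:
        return [0.0] * len(kmers)
    counts = Counter(seq[i:i + k] for i in range(windows))
    return [counts[m] / windows for m in kmers]


def position_weighted(seq: str) -> list[float]:
    """Per nucleotide: sum of its 1-based positions, scaled by n(n+1)/2."""
    n = len(seq)
    if n == 0:
        return [0.0] * len(NUCLEOTIDES)
    norm = n * (n + 1) / 2
    return [
        sum(i + 1 for i, c in enumerate(seq) if c == nt) / norm
        for nt in NUCLEOTIDES
    ]


def extract_features(seq: str) -> list[float]:
    """Hypothetical feature vector: 1-mer and 2-mer composition plus position terms."""
    seq = seq.upper().replace("T", "U")  # accept DNA-style input as well
    return kmer_composition(seq, 1) + kmer_composition(seq, 2) + position_weighted(seq)
```

A vector produced this way has 4 + 16 + 4 = 24 entries and could be passed to any standard classifier; the real model may use different or additional descriptors.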

RESULTS

DHU-Pred achieved an accuracy of 96.9%, considerably higher than that of existing D site predictors.
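For readers unfamiliar with how such a score is derived, the sketch below computes accuracy from binary predictions, together with sensitivity, specificity, and the Matthews correlation coefficient. The abstract names only accuracy; the other three metrics are an assumption about what a site predictor would typically report, and the function name is hypothetical.

```python
import math


def binary_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Confusion-matrix metrics for labels coded as 1 (D site) and 0 (non-site)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is defined as 0 here when any marginal count is zero
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }
```

Accuracy alone can be misleading when positive and negative samples are imbalanced, which is why the additional metrics are included in this sketch.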

AVAILABILITY AND IMPLEMENTATION

A user-friendly web server for the proposed model was also developed and is freely available to researchers.
