iDHU-Ensem: Identification of dihydrouridine sites through ensemble learning models.

Authors

Suleman Muhammad Taseer, Alturise Fahad, Alkhalifah Tamim, Khan Yaser Daanial

Affiliations

Department of Computer Science, School of Systems and Technology, University of Management and Technology, Lahore, Pakistan.

Department of Computer, College of Science and Arts in Ar Rass, Qassim University, Ar Rass, Qassim, Saudi Arabia.

Publication information

Digit Health. 2023 Mar 29;9:20552076231165963. doi: 10.1177/20552076231165963. eCollection 2023 Jan-Dec.

DOI:10.1177/20552076231165963

PMID:37009307

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10064468/

Abstract

BACKGROUND

Dihydrouridine (D) is one of the most significant uridine modifications and occurs prominently in eukaryotes. This modification contributes to the folding and conformational flexibility of transfer RNA (tRNA).

OBJECTIVE

The modification also triggers lung cancer in humans. D sites have traditionally been identified through conventional laboratory methods, which are costly and time-consuming. The availability of RNA sequences makes it possible to identify D sites with computationally intelligent models. However, the most challenging part is turning these biological sequences into distinct feature vectors.
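To illustrate what turning a sequence into a vector can look like, here is a minimal sketch that encodes an RNA sequence as normalized k-mer frequencies. This is a generic encoding chosen for illustration, not the feature extraction mechanism proposed in the paper; the function name `kmer_frequency_vector`, the choice `k=2`, and the example sequence are all assumptions.

```python
from itertools import product


def kmer_frequency_vector(sequence, k=2, alphabet="ACGU"):
    """Return normalized k-mer frequencies in a fixed lexicographic order.

    Hypothetical helper for illustration only; it is not the paper's encoding.
    """
    # Accept DNA-style input by mapping T to U.
    sequence = sequence.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1
    # A sequence with no valid windows maps to the all-zero vector.
    return [counts[m] / total if total else 0.0 for m in kmers]


vector = kmer_frequency_vector("GCUUAGCU")  # made-up example sequence
print(len(vector))  # 16 features, one per dinucleotide
```

Every sequence maps to a vector of the same length (4^k), which is what a downstream classifier needs regardless of how long the input is.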

METHODS

This research proposes novel feature extraction mechanisms and identifies D sites in tRNA sequences using ensemble models. The ensemble models were then evaluated using k-fold cross-validation and independent testing.
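As a rough sketch of the evaluation protocol, the snippet below shows how k-fold cross-validation partitions a dataset so that every sample is used for testing exactly once. The abstract does not state the value of k or how folds were drawn, so `k=5`, the shuffling seed, and the helper name `k_fold_indices` are assumptions.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Hypothetical helper; k and seed are illustrative, not taken from the paper.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in k_fold_indices(10, k=5):
    # A model would be fitted on `train` and scored on `test` here.
    print(len(train), len(test))
```

Independent testing differs in that the held-out set is fixed in advance and never used for fitting or model selection.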

RESULTS

The results revealed that the stacking ensemble model outperformed the other ensemble models, achieving 0.98 accuracy, 0.98 specificity, 0.97 sensitivity, and a Matthews Correlation Coefficient of 0.92. The proposed model, iDHU-Ensem, was also compared with pre-existing predictors using an independent test. The accuracy scores showed that the proposed model performed better than the available predictors.
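For reference, the four reported metrics can all be derived from a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts in the example are made up for illustration and are not the paper's data.

```python
import math


def binary_metrics(tp, tn, fp, fn):
    """Compute accuracy, sensitivity, specificity, and MCC from counts."""

    def ratio(numerator, denominator):
        # Return 0.0 instead of raising when a class is absent.
        return numerator / denominator if denominator else 0.0

    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "mcc": ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts, not results from the paper.
print(binary_metrics(tp=45, tn=40, fp=10, fn=5))
```

Unlike accuracy, the Matthews Correlation Coefficient accounts for all four cells of the confusion matrix, which makes it more informative when positive and negative classes are imbalanced.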

CONCLUSION

This research improves the identification of D sites through computationally intelligent methods. A web server, iDHU-Ensem, is available to researchers at https://taseersuleman-idhu-ensem-idhu-ensem.streamlit.app/.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6176/10064468/22baab6805dc/10.1177_20552076231165963-fig1.jpg
