Department of Computer Science, School of Systems and Technology, University of Management & Technology, Lahore, Pakistan.
Department of Computer, College of Science and Arts in Ar Rass Qassim University, Ar Rass, Qassim, Saudi Arabia.
PeerJ. 2022 Oct 27;10:e14104. doi: 10.7717/peerj.14104. eCollection 2022.
Dihydrouridine (D) is a post-transcriptional modification (PTM) of transfer RNA (tRNA) that occurs abundantly in bacteria, eukaryotes, and archaea. The D modification contributes to the stability and conformational flexibility of tRNA. It has also been linked to pulmonary carcinogenesis in humans.
Mass spectrometry and site-directed mutagenesis have been developed to detect D sites, but both methods are labor-intensive and time-consuming. The availability of sequence data provides an opportunity to build computational models that improve the identification of D sites. In this study, the DHU-Pred model is proposed to predict possible D sites from sequence data.
The model was built using comprehensive machine learning and feature extraction approaches. It was then validated with widely used evaluation metrics and rigorous experimentation and testing.
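As an illustration of the kind of evaluation metrics typically reported for binary site predictors, the following minimal sketch computes accuracy, sensitivity, specificity, and the Matthews correlation coefficient from predicted and true labels. The abstract does not state which metrics were used, so this choice is an assumption, and the function name `binary_metrics` and the toy labels are hypothetical rather than taken from the study.

```python
from math import sqrt


def binary_metrics(y_true, y_pred):
    """Compute common binary-classification metrics from 0/1 labels.

    Label 1 stands for a D site and 0 for a non-D site (an assumed encoding).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0

    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Toy labels for illustration only; these are not data from the study.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity, specificity, and MCC alongside accuracy is useful when positive and negative sites are imbalanced, because accuracy alone can look high for a predictor that favors the majority class.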
DHU-Pred achieved an accuracy of 96.9%, considerably higher than that of existing D site predictors.
A user-friendly web server for the proposed model was also developed and is freely available to researchers.