Ali Farman, Almuhaimeed Abdullah, Alghamdi Wajdi, Aldossary Haya, Asiry Othman, Masmoudi Atef
Department of Computer Science, Bahria University Islamabad Campus, Islamabad, Pakistan.
King Abdulaziz City for Science and Technology, Digital Health Institute, 11442 Riyadh, Saudi Arabia.
Health Inf Sci Syst. 2025 Mar 11;13(1):28. doi: 10.1007/s13755-025-00347-5. eCollection 2025 Dec.
Epigenetic proteins (EPs) play a crucial role in influencing disease development, controlling gene expression, and shaping cell identity. They hold potential as targets for future therapies, and studying their mechanisms can lead to improved diagnosis and treatment strategies for various diseases. Identifying EPs is therefore important, yet conventional experimental approaches are time-intensive and expensive. This work constructed CNN-BiLSTM, a computational method for predicting EPs. Two datasets were constructed from primary sequences, and three encodings were used to extract numerical features: amphiphilic pseudo amino acid composition (AmpPseAAC), group dipeptide composition, and group amino acid composition. Several deep learning architectures were trained, including BiLSTM, GRU, and CNN. Notably, an ensemble model combining CNN and BiLSTM, trained on AmpPseAAC features, outperformed other predictors on both the training and testing datasets. This research contributes to ongoing efforts to improve therapeutic approaches by facilitating the identification of novel drug targets and improving disease treatment outcomes.
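As a rough illustration of the two group-based encodings named in the abstract, the sketch below computes group amino acid composition and group dipeptide composition for a primary sequence. The five-class physicochemical grouping and all function names are assumptions made for this example; the abstract does not state the grouping or the implementation the authors used.

```python
from collections import Counter
from itertools import product

# Assumed five-class physicochemical grouping (hypothetical for this sketch;
# the abstract does not specify which grouping was used).
GROUPS = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}
RESIDUE_TO_GROUP = {aa: name for name, residues in GROUPS.items() for aa in residues}
GROUP_NAMES = list(GROUPS)


def to_groups(sequence):
    """Map each standard residue to its group label, skipping unknown symbols."""
    return [RESIDUE_TO_GROUP[aa] for aa in sequence.upper() if aa in RESIDUE_TO_GROUP]


def group_aa_composition(sequence):
    """Return a 5-dimensional vector: fraction of residues falling in each group."""
    labels = to_groups(sequence)
    counts = Counter(labels)
    total = len(labels) or 1  # avoid division by zero for empty input
    return [counts[name] / total for name in GROUP_NAMES]


def group_dipeptide_composition(sequence):
    """Return a 25-dimensional vector: fraction of adjacent group pairs."""
    labels = to_groups(sequence)
    pairs = Counter(zip(labels, labels[1:]))
    total = max(len(labels) - 1, 1)  # number of adjacent pairs
    return [pairs[(a, b)] / total for a, b in product(GROUP_NAMES, repeat=2)]
```

Under these assumptions, each sequence yields a fixed-length vector (5 and 25 values respectively) regardless of sequence length, which is what allows such features to be fed to a neural network.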