Islam Md Khairul, Wagh Himanshu, Wei Hairong
Computational Science and Engineering, Michigan Technological University, Houghton, MI, USA.
College of Forest Resources and Environmental Science, Michigan Technological University, Houghton, MI, USA.
Bioinform Biol Insights. 2025 Mar 27;19:11779322251325390. doi: 10.1177/11779322251325390. eCollection 2025.
The DyGAF (Dynamic Gene Attention Focus) model is designed to address challenges in biomarker detection, progression reporting of pathogen infection, and disease diagnostics. DyGAF introduces a novel dual-model attention-based mechanism within neural networks, combined with machine learning algorithms, to improve biomarker identification from gene expression data. The model not only identifies genes but also ranks them by significance, producing a list of the top genes relevant to disease detection and prognosis. In addition, KEGG pathway, WikiPathways, and Gene Ontology-based analyses provide a multilevel evaluation of the genes' roles. In our analyses, we used COVID-19 gene expression profiles from nasopharyngeal swabs, which offer a more nuanced view of the interplay between the host and the virus. To further validate the model, the genes ranked by DyGAF were compared against those selected by differential expression analysis and by random forest feature selection. DyGAF identified important biomarkers that enrich gene ontologies and pathways crucial for elucidating the pathogenesis of COVID-19. DyGAF was also used to diagnose COVID-19 patients by classifying gene expression profiles, achieving an accuracy of 94.23%. Benchmarking against other conventional models showed that DyGAF performed better at identifying and categorizing COVID-19 cases. In summary, the DyGAF model provides a more comprehensive and precise tool for identifying key genetic markers and uncovering biological insights into a disease.
The DyGAF model is available as a software package at the following link: https://github.com/hiddenntreasure/DyGAF.
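As a rough illustration of the ranking and comparison steps described in the abstract, the sketch below averages softmax attention weights over samples to rank genes, then measures how much the top of that ranking overlaps with a ranking from another method. It is not the authors' implementation and does not use the DyGAF package: the function names (`rank_genes`, `top_k_overlap`), the gene labels, and the logit values are all hypothetical placeholders, and a real model would learn the logits during training.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_genes(gene_names, per_sample_logits):
    """Average per-sample softmax weights, then sort genes by mean weight."""
    n_samples = len(per_sample_logits)
    mean_weights = [0.0] * len(gene_names)
    for logits in per_sample_logits:
        for i, weight in enumerate(softmax(logits)):
            mean_weights[i] += weight / n_samples
    order = sorted(range(len(gene_names)),
                   key=lambda i: mean_weights[i], reverse=True)
    return [(gene_names[i], mean_weights[i]) for i in order]


def top_k_overlap(ranking_a, ranking_b, k):
    """Jaccard overlap between the top-k genes of two rankings (k >= 1)."""
    a, b = set(ranking_a[:k]), set(ranking_b[:k])
    return len(a & b) / len(a | b)


# Hypothetical toy data: placeholder gene labels and made-up attention logits
# for two samples. None of these values come from the paper.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
logits = [
    [2.0, 0.5, 1.0, -1.0],
    [1.5, 0.0, 1.2, -0.5],
]

ranked = rank_genes(genes, logits)
attention_order = [name for name, _ in ranked]

# Hypothetical ranking from a second method (for example, feature importances).
other_order = ["GENE_C", "GENE_A", "GENE_D", "GENE_B"]
overlap = top_k_overlap(attention_order, other_order, k=2)

for name, weight in ranked:
    print(f"{name}\t{weight:.3f}")
print(f"top-2 overlap: {overlap:.2f}")
```

Averaging the normalized weights across samples keeps the scores comparable between genes, and a set-based overlap is one simple way to quantify agreement between two selection methods; other choices, such as rank correlation, would also be reasonable.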