Rezapour Mostafa, Niazi Muhammad Khalid Khan, Lu Hao, Narayanan Aarthi, Gurcan Metin Nafi
Center for Artificial Intelligence Research, Wake Forest University School of Medicine, Winston-Salem, NC, United States.
Department of Biology, George Mason University, Fairfax, VA, United States.
Front Artif Intell. 2024 Aug 30;7:1405332. doi: 10.3389/frai.2024.1405332. eCollection 2024.
This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a novel machine learning-based approach for analyzing gene expression data from non-human primates (NHPs) infected with Ebola virus (EBOV). By focusing on host-pathogen interactions, this research aims to enhance the understanding and identification of critical biomarkers for Ebola infection.
We used a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs. SMAS selects genes by combining statistical significance with the magnitude of expression change. Using linear classifiers such as logistic regression, the method then distinguishes RT-qPCR-positive from RT-qPCR-negative NHP samples.
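The idea of ranking genes by both significance and effect size can be illustrated with a minimal sketch. The abstract does not give the exact SMAS formula, so the score below (absolute log2 fold change raised to a power `m`, multiplied by the negative log10 of the adjusted p-value raised to a power `a`) is an assumption made for illustration only. The function names, the exponents, the gene labels, and the numbers are all hypothetical and are not data from the study.

```python
import math


def magnitude_altitude_score(log2_fold_change, adjusted_p_value, m=1.0, a=1.0):
    """Hypothetical combined score (an assumption, not the published formula).

    magnitude = |log2 fold change|      (size of the expression change)
    altitude  = -log10(adjusted p)      (statistical significance)
    score     = magnitude**m * altitude**a
    """
    magnitude = abs(log2_fold_change)
    # Clamp the p-value away from zero so that log10 is always defined.
    altitude = -math.log10(max(adjusted_p_value, 1e-300))
    return (magnitude ** m) * (altitude ** a)


def rank_genes(stats, top_k=2):
    """Return the top_k gene names by score.

    stats maps a gene name to a (log2 fold change, adjusted p-value) pair.
    """
    scored = {
        gene: magnitude_altitude_score(fc, p) for gene, (fc, p) in stats.items()
    }
    return sorted(scored, key=scored.get, reverse=True)[:top_k]


# Made-up values for illustration only.
toy_stats = {
    "GENE_A": (3.0, 1e-6),   # large change, highly significant
    "GENE_B": (0.2, 1e-8),   # tiny change, highly significant
    "GENE_C": (4.0, 0.5),    # large change, not significant
    "GENE_D": (-2.5, 1e-4),  # downregulated, significant
}

top_genes = rank_genes(toy_stats)
print(top_genes)
```

In this toy example, a gene ranks highly only when it has both a large change and a small p-value. `GENE_B` and `GENE_C` each satisfy only one criterion and fall behind `GENE_A` and `GENE_D`. The selected genes would then be passed as features to a linear classifier such as logistic regression.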
Applying SMAS identified IFI6 and IFI27 as key biomarkers, which achieved 100% accuracy and optimal area under the curve (AUC) values in classifying various stages of Ebola infection. In addition, genes including MX1, OAS1, and ISG15 were significantly upregulated, underscoring their roles in the immune response to EBOV.
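For readers unfamiliar with the two metrics reported here, the following sketch shows how accuracy and AUC can be computed for a binary classifier. It uses the standard pairwise (Mann-Whitney) definition of AUC. The labels and scores are made-up toy values, not results from the study, and the function names are hypothetical.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(int(y == p) for y, p in zip(labels, predictions))
    return correct / len(labels)


def roc_auc(labels, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example: 1 = RT-qPCR positive, 0 = RT-qPCR negative (made-up values).
toy_labels = [0, 0, 0, 1, 1, 1]
toy_scores = [0.05, 0.20, 0.40, 0.60, 0.85, 0.95]  # classifier probabilities
toy_predictions = [int(s >= 0.5) for s in toy_scores]

print(accuracy(toy_labels, toy_predictions))
print(roc_auc(toy_labels, toy_scores))
```

An AUC of 1.0 means every positive sample received a higher score than every negative sample, so a threshold exists that separates the two classes without error.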
Gene Ontology (GO) analysis further elucidated the involvement of these genes in critical biological processes and immune response pathways, reinforcing their significance in Ebola pathogenesis. Our findings highlight the efficacy of the SMAS methodology in revealing complex genetic interactions and response mechanisms, which are essential for advancing the development of diagnostic tools and therapeutic strategies.
This study provides valuable insights into EBOV pathogenesis, demonstrating the potential of SMAS to enhance the precision of diagnostics and interventions for Ebola and other viral infections.