Samadishadlou Mehrdad, Rahbarghazi Reza, Kavousi Kaveh, Bani Farhad
Department of Medical Nanotechnology, Faculty of Advanced Medical Sciences, Tabriz University of Medical Sciences, Tabriz, Iran.
Stem Cell Research Center, Tabriz University of Medical Sciences, Tabriz, Iran.
Biol Direct. 2024 Dec 10;19(1):127. doi: 10.1186/s13062-024-00543-5.
MicroRNAs (miRNAs) have shown potential as diagnostic biomarkers for myocardial infarction (MI) because they are dysregulated early after MI and remain stable in circulation. They also play a crucial role in regulating adaptive and maladaptive responses in cardiovascular diseases, which makes them attractive biomarker candidates. However, their value as novel biomarkers for diagnosing cardiovascular diseases requires systematic evaluation.
This study aimed to identify a miRNA biomarker panel for early-stage MI detection using bioinformatics and machine learning (ML) methods. miRNA expression data for early-stage MI patients and healthy controls were obtained from the Gene Expression Omnibus. Separate datasets were allocated for training and independent testing. Differential expression analysis was performed to identify dysregulated miRNAs in the training set. The least absolute shrinkage and selection operator (LASSO) was then applied for feature selection to prioritize miRNAs associated with MI. The selected miRNAs were used to develop ML models, including support vector machine, gradient boosting, XGBoost, and a hard voting ensemble (HVE).
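The hard voting step can be illustrated with a minimal sketch. It assumes each base model has already produced one class label per sample; the function name `hard_vote`, the tie-breaking rule, and the example labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def hard_vote(predictions_per_model):
    """Return the majority class label per sample.

    predictions_per_model: list of equal-length label lists, one per base model.
    Ties are broken by choosing the smallest label (an assumption made here
    only so that the result is deterministic).
    """
    n_samples = len(predictions_per_model[0])
    if any(len(p) != n_samples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of samples")

    voted = []
    for sample_votes in zip(*predictions_per_model):
        counts = Counter(sample_votes)
        top = max(counts.values())
        voted.append(min(label for label, c in counts.items() if c == top))
    return voted


# Hypothetical labels (1 = MI, 0 = control) from three base classifiers.
svm_pred = [1, 0, 1, 1, 0]
gb_pred = [1, 1, 0, 1, 0]
xgb_pred = [0, 0, 1, 1, 1]

ensemble_pred = hard_vote([svm_pred, gb_pred, xgb_pred])
print(ensemble_pred)
```

With three base models and two classes, a tie cannot occur, so the tie-breaking rule only matters if the number of voters is even or there are more than two classes.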
Differential expression analysis identified 99 dysregulated miRNAs in the training set, of which LASSO feature selection prioritized 21. Ten miRNAs were common to the LASSO subset and the independent test set. The HVE model trained with the selected miRNAs achieved an accuracy of 0.86 and an area under the curve (AUC) of 0.83 on the independent test set.
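As a rough illustration of how such figures are obtained, the sketch below intersects a selected feature set with the features available in a test set, then computes accuracy and a rank-based AUC. All miRNA names, labels, and scores are hypothetical placeholders, and the abstract does not state how the authors computed the AUC for the ensemble.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical feature names: only miRNAs measured in both datasets can be used.
lasso_selected = {"miR-A", "miR-B", "miR-C"}
test_features = {"miR-B", "miR-C", "miR-D"}
shared = sorted(lasso_selected & test_features)

# Hypothetical test-set labels (1 = MI, 0 = control), predictions, and scores.
y_true = [1, 1, 0, 0, 1]
y_pred = [1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.3, 0.5, 0.8]

print(shared)
print(accuracy(y_true, y_pred))
print(round(roc_auc(y_true, scores), 3))
```

If only hard labels are available, as with a hard voting ensemble, the same `roc_auc` function can be applied to the 0/1 predictions; the result then reflects a single operating point rather than a full ranking.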
An integrated framework for robust miRNA selection from omics data shows promise for developing accurate diagnostic models for early-stage MI detection. The HVE model demonstrated good performance despite differences between training and test datasets.