Chaurasia Anushka, Kumar Deepak
Computer Science and Engineering, National Institute of Technology Meghalaya, Shillong, India.
Computer Engineering, National Institute of Technology Kurukshetra, Kurukshetra, India.
J Integr Bioinform. 2025 Jun 10. doi: 10.1515/jib-2024-0056.
Predicting drug-drug interaction (DDI)-induced adverse drug reactions (ADRs) with computational methods is challenging because data samples are limited, sparse, and high-dimensional. Class imbalance makes prediction harder still. Various computational techniques have been presented for predicting DDI-induced ADRs in the general population; however, DDI-induced pregnancy and neonatal ADRs remain underexplored. The present work proposes a sparse ensemble-based computational approach that uses SMILES strings as features, handles high-dimensional and sparse data with Sparse Principal Component Analysis (SPCA), mitigates class imbalance with the Multilabel Synthetic Minority Oversampling Technique (MLSMOTE), and predicts pregnancy and neonatal ADRs with a stacking ensemble model. When evaluated on sparse data, SPCA showed a 2.67 % to 5.45 % improvement over PCA. The proposed stacking ensemble model outperformed six state-of-the-art predictors by 1.16 % to 14.94 % on micro- and macro-averaged True Positive Rate (TPR), F1 Score, False Positive Rate (FPR), and Precision, as well as on Hamming Loss and ROC-AUC Score.
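The micro- and macro-averaged scores and the Hamming Loss named above can be illustrated with a minimal sketch in plain Python. This is not the authors' evaluation code: the label matrices are toy values, the helper names (`confusion_counts`, `rates`, `micro_macro`, `hamming_loss`) are hypothetical, and treating a zero denominator as a score of 0.0 is an assumed convention.

```python
# Toy multilabel example: rows are drug pairs, columns are ADR labels.
# All values here are illustrative, not data from the paper.

def confusion_counts(y_true, y_pred):
    """Return per-label (tp, fp, fn, tn) for binary indicator matrices."""
    counts = []
    for j in range(len(y_true[0])):
        tp = fp = fn = tn = 0
        for t_row, p_row in zip(y_true, y_pred):
            t, p = t_row[j], p_row[j]
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
        counts.append((tp, fp, fn, tn))
    return counts

def safe_div(num, den):
    # Assumed convention: an undefined ratio counts as 0.0.
    return num / den if den else 0.0

def rates(tp, fp, fn, tn):
    """TPR, FPR, precision and F1 from one set of confusion counts."""
    return {
        "tpr": safe_div(tp, tp + fn),
        "fpr": safe_div(fp, fp + tn),
        "precision": safe_div(tp, tp + fp),
        "f1": safe_div(2 * tp, 2 * tp + fp + fn),
    }

def micro_macro(y_true, y_pred):
    counts = confusion_counts(y_true, y_pred)
    # Micro: pool the counts over all labels, then compute each rate once.
    pooled = [sum(c[i] for c in counts) for i in range(4)]
    micro = rates(*pooled)
    # Macro: compute each rate per label, then take the unweighted mean.
    per_label = [rates(*c) for c in counts]
    macro = {k: sum(r[k] for r in per_label) / len(per_label) for k in micro}
    return micro, macro

def hamming_loss(y_true, y_pred):
    """Fraction of individual label assignments that are wrong."""
    total = sum(len(row) for row in y_true)
    wrong = sum(t != p for tr, pr in zip(y_true, y_pred) for t, p in zip(tr, pr))
    return wrong / total

# Label 0 is frequent; labels 1 and 2 are rare and never predicted.
y_true = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [0, 0, 0]]
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]

micro, macro = micro_macro(y_true, y_pred)
h_loss = hamming_loss(y_true, y_pred)
print("micro:", micro)
print("macro:", macro)
print("hamming loss:", h_loss)
```

In this toy case the micro-averaged TPR is 0.6 while the macro-averaged TPR is only 1/3, because the two rare labels that are never predicted weigh as much as the frequent one in the macro average. This is why both averages are typically reported when the label distribution is imbalanced.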