Department of Computer Science and Engineering, RCC Institute of Information Technology, Kolkata, West Bengal, India.
Department of Computer Science and Engineering, Aliah University, Kolkata, West Bengal, India.
PLoS One. 2021 Feb 19;16(2):e0246920. doi: 10.1371/journal.pone.0246920. eCollection 2021.
In-silico prediction of repurposable drugs is an effective drug discovery strategy that supplements de novo drug discovery. Reduced development time, lower cost and the absence of severe side effects are significant advantages of drug repositioning. Recent advanced artificial intelligence (AI) approaches have enormously boosted drug repurposing in terms of throughput and accuracy. However, the growing number of drugs and targets, together with their massive number of interactions, produces imbalanced data that may not be suitable as direct input to a classification model. Here, we propose DTI-SNNFRA, a framework for predicting drug-target interaction (DTI) based on shared nearest neighbour (SNN) and fuzzy-rough approximation (FRA). It uses sampling techniques to collectively reduce the vast search space covering the available drugs, targets and millions of interactions between them. DTI-SNNFRA operates in two stages: first, it uses SNN followed by a partitioning clustering to sample the search space. Next, it computes the degree of fuzzy-rough approximation and selects a proper degree threshold to undersample the negative samples from all possible interaction pairs between drugs and targets obtained in the first stage. Finally, classification is performed using the positive and selected negative samples. We evaluated the efficacy of DTI-SNNFRA using AUC (area under the ROC curve), geometric mean and F1 score. The model performs very well, reaching a ROC-AUC of 0.95. The predicted drug-target interactions are validated against an existing drug-target database, Connectivity Map (Cmap).
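The abstract does not state which features, distance measure or neighbourhood size DTI-SNNFRA uses, so the following is only a minimal sketch of the general shared-nearest-neighbour idea, not the authors' implementation. It assumes Euclidean distance and defines the SNN similarity of two points as the number of neighbours their k-nearest-neighbour lists have in common. The function names `k_nearest` and `snn_similarity`, the value `k=2` and the toy data are all hypothetical.

```python
import math


def k_nearest(points, k):
    """For each point index, return the set of indices of its k nearest neighbours."""
    neighbours = []
    for i, p in enumerate(points):
        # Rank every other point by Euclidean distance; break ties by index
        # so the result is deterministic.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbours.append(set(others[:k]))
    return neighbours


def snn_similarity(points, k):
    """Return a symmetric matrix whose (i, j) entry counts neighbours shared by i and j."""
    knn = k_nearest(points, k)
    n = len(points)
    sim = [[0] * n for _ in range(n)]  # diagonal is left at 0 in this sketch
    for i in range(n):
        for j in range(i + 1, n):
            sim[i][j] = sim[j][i] = len(knn[i] & knn[j])
    return sim


# Hypothetical 2-D feature vectors: two tight groups of three points each.
toy_points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
toy_sim = snn_similarity(toy_points, k=2)
```

In this toy case, points within the same group share a neighbour while points from different groups share none, which is the property a subsequent partitioning clustering could exploit when sampling the search space.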
A new framework for drug-target interaction (DTI) prediction based on shared nearest neighbour (SNN) and fuzzy-rough approximation (FRA)