Improving prediction of drug-target interactions based on fusing multiple features with data balancing and feature selection techniques.

Affiliations

Department of Computer Engineering, University of Zanjan, Zanjan, Iran.

School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran.

Publication information

PLoS One. 2023 Aug 3;18(8):e0288173. doi: 10.1371/journal.pone.0288173. eCollection 2023.

Abstract

Drug discovery relies on predicting drug-target interactions (DTIs), an important and challenging task. The goal of DTI prediction is to identify interactions between drug chemical compounds and protein targets. Because traditional wet-lab experiments are time-consuming and expensive, computational methods based on machine learning have attracted considerable attention in recent years. A dry-lab approach that focuses on computational interaction prediction can help narrow the search space for wet-lab experiments. In this paper, a novel multi-stage approach for DTI prediction, called SRX-DTI, is proposed. In the first stage, a combination of various descriptors from protein sequences and an FP2 fingerprint encoded from the drug are extracted as feature vectors. A major challenge in this application is data imbalance caused by the lack of known interactions; to address it, the second stage applies the proposed One-SVM-US technique. Next, FFS-RF, a forward feature selection algorithm coupled with a random forest (RF) classifier, is developed to maximize predictive performance. This feature selection algorithm removes irrelevant features to obtain an optimal feature subset. Finally, the balanced dataset with the optimal features is given to the XGBoost classifier to identify DTIs. The experimental results demonstrate that the proposed approach, SRX-DTI, achieves higher performance than other existing methods in predicting DTIs. The datasets and source code are available at: https://github.com/Khojasteh-hb/SRX-DTI.
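
The balance, select, classify stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (their code is in the repository linked above) and rests on several assumptions: the data are synthetic rather than real protein descriptors and FP2 fingerprints; the undersampling rule (fit a one-class SVM on the negatives and keep the most typical ones) is only a guess at what One-SVM-US might do; and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example needs only NumPy and scikit-learn. The function names `one_svm_undersample` and `forward_select_rf` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import OneClassSVM


def one_svm_undersample(X, y):
    """Assumed balancing rule: keep every positive, plus an equal number of
    negatives that a one-class SVM fitted on the negatives scores highest."""
    pos = np.flatnonzero(y == 1)
    neg = np.flatnonzero(y == 0)
    svm = OneClassSVM(gamma="scale", nu=0.1).fit(X[neg])
    scores = svm.decision_function(X[neg])
    kept_neg = neg[np.argsort(scores)[::-1][: len(pos)]]
    idx = np.concatenate([pos, kept_neg])
    return X[idx], y[idx]


def forward_select_rf(X, y, max_features=3, seed=0):
    """Greedy forward selection: repeatedly add the feature that most improves
    the cross-validated accuracy of a random forest; stop when nothing helps."""
    selected, best = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < max_features:
        rf = RandomForestClassifier(n_estimators=25, random_state=seed)
        scored = [
            (cross_val_score(rf, X[:, selected + [j]], y, cv=3).mean(), j)
            for j in remaining
        ]
        score, j = max(scored)
        if score <= best:
            break
        best = score
        selected.append(j)
        remaining.remove(j)
    return selected


# Synthetic, imbalanced data: 30 "interacting" pairs versus 300 others.
rng = np.random.default_rng(0)
X = rng.normal(size=(330, 8))
y = np.array([1] * 30 + [0] * 300)
X[y == 1, :2] += 2.0  # only the first two features carry signal

X_bal, y_bal = one_svm_undersample(X, y)        # stage 2: balance the classes
features = forward_select_rf(X_bal, y_bal)      # stage 3: select features
clf = GradientBoostingClassifier(random_state=0)  # stage 4: XGBoost stand-in
clf.fit(X_bal[:, features], y_bal)

print("balanced shape:", X_bal.shape, "selected features:", features)
```

Running selection on the balanced subset rather than on the full data keeps the random forest from scoring well simply by predicting the majority class, which is presumably why the abstract places balancing before feature selection.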

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b2ff/10399861/35dd1f678c74/pone.0288173.g001.jpg
