College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China.
Shanghai Key Laboratory of PMMP, East China Normal University, Shanghai 200241, China.
Comput Math Methods Med. 2020 May 9;2020:1573543. doi: 10.1155/2020/1573543. eCollection 2020.
Drugs are an important means of treating disease. However, they inevitably produce side effects, which pose substantial risks to patients and to pharmaceutical companies. Predicting the side effects of drugs has therefore become an essential problem in drug research, and designing efficient computational methods is one way to address it. Some studies treat each drug-side effect pair as a sample, thereby modeling the task as a binary classification problem. In that setting, the selection of negative samples is a key issue. In this study, a novel negative sample selection strategy was designed to obtain high-quality negative samples. The strategy applied the random walk with restart (RWR) algorithm to a chemical-chemical interaction network to select, as negative samples, drug-side effect pairs in which the drug was unlikely to have the corresponding side effect. In several tests with a fixed feature extraction scheme and different machine-learning algorithms, models trained with the selected negative samples performed well, and the best model yielded nearly perfect performance. These models performed much better than models built without the strategy or with another selection strategy. Furthermore, under this strategy it is not necessary to balance positive and negative samples.
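The negative sample selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the restart probability, the convergence threshold, or the cutoff used to decide which drugs count as unlikely to have a side effect, so the values below are assumptions. The function names (`random_walk_with_restart`, `select_negative_drugs`) and the six-node toy network are likewise hypothetical. For one side effect, the walk is seeded on the drugs known to cause it, and the non-seed drugs with the lowest steady-state probabilities are taken as negative candidates.

```python
import numpy as np

def random_walk_with_restart(adj, seeds, restart=0.8, tol=1e-6, max_iter=1000):
    """Return steady-state visiting probabilities of RWR on a network.

    adj     : symmetric adjacency matrix (may be weighted)
    seeds   : indices of the seed nodes
    restart : probability of jumping back to the seeds (assumed value)
    """
    adj = np.asarray(adj, dtype=float)
    col_sums = adj.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # isolated nodes: avoid division by zero
    W = adj / col_sums                     # column-normalized transition matrix
    p0 = np.zeros(adj.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)     # uniform mass on the seed nodes
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:  # L1 change below tolerance
            return p_next
        p = p_next
    return p

def select_negative_drugs(adj, positive_drugs, n_negative, restart=0.8):
    """Pick the n_negative non-seed drugs with the lowest RWR probability."""
    scores = random_walk_with_restart(adj, positive_drugs, restart)
    positives = set(positive_drugs)
    candidates = [i for i in range(len(scores)) if i not in positives]
    candidates.sort(key=lambda i: scores[i])   # least reachable first
    return candidates[:n_negative]

# Hypothetical toy network: six drugs connected in a chain 0-1-2-3-4-5.
adj = np.zeros((6, 6))
for a, b in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)]:
    adj[a, b] = adj[b, a] = 1.0

# Drugs 0 and 1 are assumed to be known to cause the side effect.
negatives = select_negative_drugs(adj, positive_drugs=[0, 1], n_negative=2)
print(negatives)
```

In this toy chain, the drugs farthest from the seeds receive the lowest probabilities and are the ones returned. Each returned drug, paired with the side effect in question, would form one negative sample for the binary classifier.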