Inverse similarity and reliable negative samples for drug side-effect prediction.

Affiliation

Advanced Analytics Institute, FEIT, University of Technology Sydney, 15 Broadway, Ultimo, NSW 2007, Australia.

Publication information

BMC Bioinformatics. 2019 Feb 4;19(Suppl 13):554. doi: 10.1186/s12859-018-2563-x.

Abstract

BACKGROUND

In silico prediction of potential drug side-effects is crucial for drug development, since identifying side-effects through wet-lab experiments is expensive and time-consuming. Existing computational methods mainly leverage validated drug side-effect relations for prediction, and their performance is severely impeded by the lack of reliable negative training data. A method for selecting reliable negative samples is therefore vital for improving performance.

METHODS

Most existing computational prediction methods rest on the assumption that similar drugs tend to share the same side-effects, an assumption that has yielded remarkable performance. It is equally reasonable to assume the inverse: dissimilar drugs are less likely to share the same side-effects. Based on this inverse similarity hypothesis, we propose a novel method for selecting highly reliable negative samples for side-effect prediction. The first step is to build a drug similarity integration framework that measures the similarity between drugs from different perspectives. This step integrates drug chemical structures, drug target proteins, drug substituents, and drug therapeutic information as features in a unified framework. Next, the framework is used to calculate a similarity score between each candidate negative drug and the validated positive drugs. Candidate negative drugs with lower similarity scores are preferentially selected as negative samples. Finally, both the validated positive drugs and the selected highly reliable negative samples are used for prediction.
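The selection step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how per-feature similarities are computed or combined, so the sketch assumes binary feature vectors, Jaccard similarity per feature type, equal-weight averaging across feature types, and the mean similarity to the validated positive drugs as the score. All function names and the toy data are hypothetical.

```python
import numpy as np

def jaccard_matrix(X):
    """Pairwise Jaccard similarity between rows of a binary feature matrix."""
    X = np.asarray(X, dtype=bool).astype(int)
    inter = X @ X.T                       # number of shared features per drug pair
    counts = X.sum(axis=1)
    union = counts[:, None] + counts[None, :] - inter
    # Assumption: similarity is 0 when neither drug has any feature set.
    return np.divide(inter, union,
                     out=np.zeros(inter.shape, dtype=float),
                     where=union > 0)

def integrated_similarity(feature_views):
    """Combine per-view similarities; equal weights are an assumption."""
    return np.mean([jaccard_matrix(X) for X in feature_views], axis=0)

def select_reliable_negatives(sim, positive_idx, candidate_idx, n_select):
    """Pick the n_select candidates least similar to the validated positives."""
    positive_idx = np.asarray(positive_idx)
    candidate_idx = np.asarray(candidate_idx)
    # Score = mean integrated similarity to the positive drugs (an assumption).
    scores = sim[np.ix_(candidate_idx, positive_idx)].mean(axis=1)
    order = np.argsort(scores, kind="stable")[:n_select]
    return candidate_idx[order], scores[order]

# Toy data: 5 drugs, two feature views (for example structure and targets).
structure = [[1, 1, 0, 0], [1, 1, 1, 0], [0, 0, 1, 1], [0, 0, 0, 1], [1, 0, 0, 0]]
targets = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]]

sim = integrated_similarity([structure, targets])
# Drugs 0 and 1 are validated positives for some side-effect; 2 to 4 are candidates.
selected, selected_scores = select_reliable_negatives(sim, [0, 1], [2, 3, 4], n_select=2)
print(selected, selected_scores)
```

In this toy example drug 4 shares structure and target features with both positive drugs, so it receives the highest score and is the candidate left out of the negative set.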

RESULTS

The proposed method was evaluated on simulated side-effect prediction for 917 DrugBank drugs with four machine-learning algorithms. Extensive experiments show that the drug similarity integration framework captures drug features well, achieving much better performance than approaches based on a single type of drug property. In addition, the four machine-learning algorithms achieved significant improvements in macro-averaged F1-score (e.g., SVM from 0.655 to 0.898), macro-averaged precision (e.g., RBF from 0.592 to 0.828) and macro-averaged recall (e.g., KNN from 0.651 to 0.772), attributable to the highly reliable negative samples selected by the proposed method.
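For readers unfamiliar with the reported metrics, the following sketch shows one common way to compute macro-averaged precision, recall, and F1-score. It assumes that each side-effect is treated as a separate binary task and that the macro average is the unweighted mean of the per-side-effect values; the abstract does not spell out the exact definition used, and the function name and toy labels are hypothetical.

```python
import numpy as np

def macro_metrics(y_true, y_pred):
    """Macro-averaged metrics for binary arrays of shape (n_drugs, n_side_effects)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    # Per-side-effect confusion counts (one value per column).
    tp = (y_true & y_pred).sum(axis=0).astype(float)
    fp = (~y_true & y_pred).sum(axis=0).astype(float)
    fn = (y_true & ~y_pred).sum(axis=0).astype(float)

    def safe_div(num, den):
        # Assumption: a metric with a zero denominator counts as 0.
        return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * tp, 2 * tp + fp + fn)
    # Macro average: unweighted mean over side-effects.
    return {"precision": precision.mean(),
            "recall": recall.mean(),
            "f1": f1.mean()}

# Toy labels: 4 drugs, 2 side-effects.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]
metrics = macro_metrics(y_true, y_pred)
print(metrics)
```

Because every side-effect contributes equally to the average, rare side-effects weigh as much as common ones, which is why the quality of the negative samples for each side-effect matters to these scores.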

CONCLUSIONS

The results suggest that the inverse similarity hypothesis and the integration of different drug properties are valuable for side-effect prediction. Selecting highly reliable negative samples also contributes significantly to performance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c71/7402513/507f1ebbc4a0/12859_2018_2563_Fig1_HTML.jpg
