A combined Fisher and Laplacian score for feature selection in QSAR based drug design using compounds with known and unknown activities.

Affiliations

Clinical Research Development Unit of Imam Khomeini Hospital, Urmia University of Medical Sciences, Urmia, Iran.

Department of Computer Engineering, Yazd University, Yazd, Iran.

Publication information

J Comput Aided Mol Des. 2018 Feb;32(2):375-384. doi: 10.1007/s10822-017-0094-6. Epub 2017 Dec 26.

Abstract

Quantitative structure-activity relationship (QSAR) is an effective computational technique for drug design that relates the chemical structures of compounds to their biological activities. Feature selection is an important step in QSAR-based drug design, used to select the most relevant descriptors. One of the most popular feature selection methods for classification problems is the Fisher score, which aims to minimize the within-class distance and maximize the between-class distance. In this study, the properties of the Fisher criterion were extended to QSAR models by defining new distance metrics based on the continuous activity values of compounds with known activities. A semi-supervised feature selection method was then proposed that combines the Fisher and Laplacian criteria and exploits compounds with both known and unknown activities to select the relevant descriptors. To demonstrate the efficiency of the proposed method, we applied it and other feature selection methods to three QSAR data sets: serine/threonine-protein kinase PLK3 inhibitors, ROCK inhibitors, and phenol compounds. The QSAR models built on the descriptors selected by the proposed semi-supervised method performed better than the other models, which indicates that the method is efficient at selecting relevant descriptors from compounds with known and unknown activities. The results of this study show that using compounds with both known and unknown activities can improve the performance of combined Fisher and Laplacian based feature selection methods.
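
To make the two criteria named in the abstract concrete, here is a minimal NumPy sketch of the classical Fisher score (supervised, needs class labels) and the classical Laplacian score (unsupervised, needs only the descriptors). It is not the authors' method: the abstract does not give the extended distance metrics for continuous activity values or the rule for combining the two criteria, so neither is reproduced here. The function names, the `n_neighbors` and `t` parameters, and the heat-kernel k-nearest-neighbor graph are assumptions chosen for illustration.

```python
import numpy as np


def fisher_score(X, y):
    """Classical Fisher score per feature (higher = more discriminative).

    X: (n_samples, n_features) descriptor matrix; y: discrete class labels.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xc = X[y == label]
        n_c = Xc.shape[0]
        # Between-class scatter: spread of class means around the overall mean.
        between += n_c * (Xc.mean(axis=0) - overall_mean) ** 2
        # Within-class scatter: spread of samples around their own class mean.
        within += n_c * Xc.var(axis=0)
    return between / (within + 1e-12)  # small constant avoids division by zero


def laplacian_score(X, n_neighbors=5, t=1.0):
    """Classical Laplacian score per feature (lower = better locality preservation).

    Uses no activity labels, so compounds with unknown activity can contribute.
    Assumes n_neighbors < n_samples; the graph and kernel are illustrative choices.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between samples.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    dist = sq.copy()
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :n_neighbors]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), n_neighbors), nearest.ravel()] = True
    mask |= mask.T  # symmetrize the k-nearest-neighbor graph
    S = np.where(mask, np.exp(-sq / t), 0.0)  # heat-kernel edge weights
    d = S.sum(axis=1)                          # node degrees
    L = np.diag(d) - S                         # graph Laplacian
    F = X - (d @ X) / d.sum()                  # remove degree-weighted mean per feature
    num = (F * (L @ F)).sum(axis=0)            # f^T L f for each feature
    den = (F ** 2 * d[:, None]).sum(axis=0)    # f^T D f for each feature
    return num / (den + 1e-12)
```

In a semi-supervised setting like the one described, `fisher_score` would be computed on the labeled compounds only, while `laplacian_score` could be computed on all compounds. Because a high Fisher score and a low Laplacian score both indicate a useful descriptor, any combination has to reconcile the two directions; how the paper does so is not stated in the abstract.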
