Virtual screening by a new Clustering-based Weighted Similarity Extreme Learning Machine approach.

Affiliation

Faculty of Information Technology, King Mongkut's Institute of Technology Ladkrabang, Bangkok 10520, Thailand.

Publication information

PLoS One. 2018 Apr 13;13(4):e0195478. doi: 10.1371/journal.pone.0195478. eCollection 2018.

Abstract

Machine learning techniques are becoming popular in virtual screening tasks. One powerful machine learning algorithm is the Extreme Learning Machine (ELM), which has been used in many applications and was recently applied to virtual screening. We propose the Weighted Similarity ELM (WS-ELM), which is based on a single-layer feed-forward neural network in conjunction with 16 different similarity coefficients used as activation functions in the hidden layer. The performance of conventional ELM is known not to be robust because the weights in the hidden layer are selected randomly. We therefore propose a Clustering-based WS-ELM (CWS-ELM) that assigns weights deterministically by utilising clustering algorithms, i.e., k-means clustering and support vector clustering. The experiments were conducted on one of the most challenging datasets, the Maximum Unbiased Validation Dataset, which contains 17 activity classes carefully selected from PubChem. The proposed algorithms were then compared with other machine learning techniques such as support vector machine, random forest, and similarity searching. The results show that CWS-ELM in conjunction with support vector clustering yields the best performance when utilised together with the Sokal/Sneath(1) coefficient. Furthermore, the ECFP_6 fingerprint gives the best results in our framework compared with the other fingerprint types, namely ECFP_4, FCFP_4, and FCFP_6.
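
The idea summarised in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several details are assumptions made for illustration: the Sokal/Sneath(1) coefficient is taken in its commonly quoted form a / (a + 2b + 2c), hidden-layer weights come from a plain Lloyd's k-means with a fixed initialisation (support vector clustering is not shown), output weights are solved with a pseudo-inverse as in standard ELM, and the function names and toy data are hypothetical.

```python
import numpy as np

def sokal_sneath1(X, W):
    """Pairwise a / (a + 2b + 2c) between rows of X and rows of W.

    a = shared on-bits, b = on-bits only in X, c = on-bits only in W.
    This form of Sokal/Sneath(1) is an assumption, not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    W = np.asarray(W, dtype=float)
    a = X @ W.T
    b = X.sum(axis=1, keepdims=True) - a
    c = W.sum(axis=1)[None, :] - a
    denom = a + 2.0 * (b + c)
    # Define the similarity as 0 where both vectors are empty.
    return np.divide(a, denom, out=np.zeros_like(a), where=denom > 0)

def kmeans_centroids(X, k, n_iter=20):
    """Plain Lloyd's k-means; the first k rows seed the centroids,
    so the result is deterministic for a given dataset."""
    X = np.asarray(X, dtype=float)
    C = X[:k].copy()
    for _ in range(n_iter):
        dist = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):
                C[j] = members.mean(axis=0)
    return C

def fit_output_weights(H, y):
    """Least-squares output weights via the Moore-Penrose pseudo-inverse."""
    return np.linalg.pinv(H) @ y

# Toy binary "fingerprints" and activity labels (hypothetical data).
rng = np.random.default_rng(0)
X_train = rng.integers(0, 2, size=(40, 32))
y_train = (X_train[:, :4].sum(axis=1) >= 2).astype(float)

W = kmeans_centroids(X_train, k=8)   # hidden weights from clustering, not random
H = sokal_sneath1(X_train, W)        # hidden layer: similarity as activation
beta = fit_output_weights(H, y_train)
scores = H @ beta                    # higher score = more likely active
```

In a screening setting, the same `sokal_sneath1(X_new, W) @ beta` call would score unseen compounds, which could then be ranked by score. Replacing the clustering step with randomly drawn weights would give the non-deterministic behaviour that the abstract attributes to conventional ELM.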

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/84e4/5898726/81416e7a0a23/pone.0195478.g001.jpg
