Zhang Wei, Ji Lijuan, Chen Yanan, Tang Kailin, Wang Haiping, Zhu Ruixin, Jia Wei, Cao Zhiwei, Liu Qi
Department of Central Laboratory, Shanghai Tenth People's Hospital, School of Life Sciences and Technology, Tongji University, Shanghai, China.
Huai'an Second People's Hospital affiliated to Xuzhou Medical College, Huai'an, China.
J Cheminform. 2015 Feb 13;7:5. doi: 10.1186/s13321-015-0052-z. eCollection 2015.
The rapid emergence of novel chemical substances creates substantial demand for more sophisticated computational methodologies for drug discovery. In this study, the idea of Learning to Rank from web search was applied to drug virtual screening. The approach has two distinctive capabilities: 1) it can identify compounds for novel targets when not enough training data are available for those targets, and 2) it can integrate heterogeneous data when compound affinities are measured on different platforms.
A standard pipeline was designed to carry out Learning to Rank in virtual screening. Six Learning to Rank algorithms were investigated based on two public datasets collected from the Binding Database and the newly published Community Structure-Activity Resource benchmark dataset. The results demonstrate that Learning to Rank is an efficient computational strategy for drug virtual screening, particularly because of its novel use in cross-target virtual screening and heterogeneous data integration.
To the best of our knowledge, this is the first application of Learning to Rank in virtual screening. The experimental workflow and algorithm assessment designed in this study will provide a standard protocol for other similar studies. All the datasets as well as the implementations of the Learning to Rank algorithms are available at http://www.tongji.edu.cn/~qiliu/lor_vs.html.
Graphical Abstract: The analogy between web search and ligand-based drug discovery.
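As a rough illustration of the web-search analogy, the sketch below treats each target (or assay platform) as a query and its compounds as documents, and trains a linear scorer with a pairwise logistic loss. Preference pairs are formed only within a group, which is one way affinities measured on different scales could be combined without being compared directly. This is not one of the six algorithms evaluated in the study; the descriptor values, group names, and functions (`make_pairs`, `train_pairwise`, `rank_compounds`) are hypothetical and chosen only for illustration.

```python
import math
import random


def make_pairs(groups):
    """Build (more_active, less_active) descriptor pairs within each group only.

    Affinities are compared only inside one target/platform group, so the
    measurement scales of different groups never need to be reconciled.
    """
    pairs = []
    for compounds in groups.values():
        for x_i, y_i in compounds:
            for x_j, y_j in compounds:
                if y_i > y_j:
                    pairs.append((x_i, x_j))
    return pairs


def train_pairwise(pairs, n_features, epochs=200, lr=0.1, seed=0):
    """Fit a linear scorer w.x by SGD on the pairwise logistic loss."""
    rng = random.Random(seed)
    w = [0.0] * n_features
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for better, worse in pairs:
            diff = [a - b for a, b in zip(better, worse)]
            margin = sum(wk * dk for wk, dk in zip(w, diff))
            # Loss is log(1 + exp(-margin)); its gradient w.r.t. w is
            # -diff / (1 + exp(margin)), so the descent step adds diff * g.
            g = 1.0 / (1.0 + math.exp(margin))
            w = [wk + lr * g * dk for wk, dk in zip(w, diff)]
    return w


def rank_compounds(w, candidates):
    """Return candidate names sorted from highest to lowest predicted score."""
    def score(name):
        return sum(wk * xk for wk, xk in zip(w, candidates[name]))
    return sorted(candidates, key=score, reverse=True)


# Hypothetical training data: (descriptor vector, measured affinity).
# The two groups use different affinity scales on purpose.
groups = {
    "target_A_platform_1": [([0.9, 0.2], 8.5), ([0.5, 0.7], 6.9), ([0.1, 0.4], 5.0)],
    "target_B_platform_2": [([0.8, 0.9], 90.0), ([0.4, 0.1], 45.0), ([0.2, 0.6], 12.0)],
}

# Hypothetical candidates for a new target that has no training data.
candidates = {
    "cmpd_x": [0.9, 0.3],
    "cmpd_y": [0.5, 0.4],
    "cmpd_z": [0.1, 0.4],
}

pairs = make_pairs(groups)
weights = train_pairwise(pairs, n_features=2)
ranking = rank_compounds(weights, candidates)
print(ranking)
```

Because the scorer is trained on relative order rather than absolute affinity values, the same weights can be applied to compounds for a target that was never seen in training, which is the cross-target setting described in the abstract. A real study would use proper molecular descriptors and an established Learning to Rank implementation rather than this toy optimizer.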