School of Life Sciences and Technology, Tongji University, 200092, China.
J Chem Inf Model. 2012 Mar 26;52(3):834-43. doi: 10.1021/ci200481c. Epub 2012 Feb 29.
Current virtual screening (VS) methods for drug discovery fall into two main categories: ligand- or target-structure-based virtual screening, and screening that uses protein-ligand interaction fingerprints derived from large numbers of complex structures. Because the former considers only one side of the interaction while the latter considers the whole complex, the two are complementary and can reinforce each other. A common problem, however, is how to interpret and evaluate, in a comprehensive way, the differing results produced by different VS methods. There is also still an urgent need for an efficient approach that fully integrates various VS methods from a multiview perspective. In this study, we comprehensively tested, with statistical evaluation, a virtual screening scheme based on multiview similarity integration and ranking aggregation, which provides several novel and useful clues on how to perform drug VS from multiple heterogeneous data sources. (1) Eighteen complex structures of HIV-1 protease with ligands from the PDB were curated as a test data set, and VS was performed with five different drug representations. Ritonavir (1HXW) was selected as the query, and the weighted ranks of the query results were aggregated across the multiple views through four similarity integration approaches. (2) Further, one of the ranking aggregation methods was used to integrate the similarity ranks calculated from gene ontology (GO) fingerprints and structural fingerprints on the Connectivity Map data set, with two typical HDAC and HSP90 inhibitors chosen as the queries. The results show that rank aggregation can improve similarity searching in VS when two or more descriptions are involved and yields a more reasonable similarity ranking.
Our study shows that integrated VS based on multiple data fusion can achieve remarkably better performance than the individual methods and thus is a promising route to efficient drug screening that takes advantage of the rapidly accumulating molecular representations and heterogeneous data in pharmacology.
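To make the idea of aggregating ranks across views concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the four similarity integration approaches, so the sketch substitutes a simple weighted mean-rank (Borda-style) fusion. The function names, view names, compound labels, and similarity scores are all hypothetical and chosen for illustration only.

```python
from typing import Dict, List, Optional


def ranks_from_scores(scores: Dict[str, float]) -> Dict[str, int]:
    """Turn similarity scores into ranks (1 = most similar to the query).

    Ties are broken alphabetically by compound name so the result is deterministic.
    """
    ordered = sorted(scores, key=lambda name: (-scores[name], name))
    return {name: position for position, name in enumerate(ordered, start=1)}


def aggregate_ranks(
    views: Dict[str, Dict[str, float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[str]:
    """Fuse per-view rankings by weighted mean rank (an assumed, Borda-style scheme).

    `views` maps a view name to {compound: similarity to the query}.
    Only compounds scored in every view are ranked.
    """
    view_names = list(views)
    if not view_names:
        return []
    if weights is None:
        weights = {view: 1.0 for view in view_names}
    total_weight = sum(weights[view] for view in view_names)

    # Keep only compounds that every view has scored.
    common = set.intersection(*(set(views[view]) for view in view_names))

    # Rank the common compounds separately inside each view.
    per_view_ranks = {
        view: ranks_from_scores({c: views[view][c] for c in common})
        for view in view_names
    }

    # Weighted mean rank across views; a lower value means more similar overall.
    fused = {
        c: sum(weights[view] * per_view_ranks[view][c] for view in view_names)
        / total_weight
        for c in common
    }
    return sorted(fused, key=lambda c: (fused[c], c))


if __name__ == "__main__":
    # Hypothetical similarity scores of three compounds to a query, in two views.
    example_views = {
        "structural_fp": {"cpd_A": 0.9, "cpd_B": 0.7, "cpd_C": 0.4},
        "interaction_fp": {"cpd_A": 0.5, "cpd_B": 0.8, "cpd_C": 0.6},
    }
    print(aggregate_ranks(example_views))
```

With equal weights, a compound that ranks reasonably well in both views can overtake one that ranks first in a single view but last in the other, which is the intuition behind combining complementary representations. Weighting one view more heavily shifts the fused order toward that view's ranking.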