Ropp Patrick J, Spiegel Jacob O, Walker Jennifer L, Green Harrison, Morales Guillermo A, Milliken Katherine A, Ringe John J, Durrant Jacob D
Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA.
Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN, 46202, USA.
J Cheminform. 2019 May 24;11(1):34. doi: 10.1186/s13321-019-0358-3.
Computational techniques such as structure-based virtual screening require carefully prepared 3D models of potential small-molecule ligands. Though powerful, existing commercial programs for virtual-library preparation have restrictive and/or expensive licenses. Freely available alternatives, though often effective, do not fully account for all possible ionization, tautomeric, and ring-conformational variants. Here we present Gypsum-DL, a free, robust, open-source program that addresses these challenges. As input, Gypsum-DL accepts virtual compound libraries in SMILES or flat SDF formats. For each molecule in the virtual library, it enumerates appropriate ionization, tautomeric, chiral, cis/trans isomeric, and ring-conformational forms. As output, Gypsum-DL produces an SDF file containing each molecular form, with 3D coordinates assigned. To demonstrate its utility, we processed 1,558 molecules taken from the NCI Diversity Set VI and 56,608 molecules taken from a Distributed Drug Discovery (D3) combinatorial virtual library. We also used 4,463 high-quality protein-ligand complexes from the PDBBind database to show that Gypsum-DL processing can improve virtual-screening pose prediction. Gypsum-DL is available free of charge under the terms of the Apache License, Version 2.0.
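The enumeration described in the abstract is combinatorial: when a molecule has several independent sites of variation (an ionizable group, an unspecified chiral center, an unspecified double bond), the number of distinct forms is the product of the number of states at each site. The following toy sketch illustrates only that counting behavior. It is not Gypsum-DL's implementation and performs no chemistry. The site labels, the `enumerate_forms` function, and the `max_forms` cap are all hypothetical names introduced for illustration.

```python
from itertools import islice, product


def enumerate_forms(sites, max_forms=None):
    """Return every combination of per-site states as a list of dicts.

    `sites` maps a (hypothetical) site label to the list of states that
    site can adopt. `max_forms`, if given, truncates the enumeration;
    this cap is an assumed safeguard against combinatorial growth.
    """
    labels = list(sites)
    combos = product(*(sites[label] for label in labels))
    if max_forms is not None:
        combos = islice(combos, max_forms)
    return [dict(zip(labels, combo)) for combo in combos]


# Hypothetical molecule with three independent sites of variation.
example_sites = {
    "amine_protonation": ["neutral", "protonated"],
    "stereocenter_C2": ["R", "S"],
    "double_bond_C4_C5": ["cis", "trans"],
}

all_forms = enumerate_forms(example_sites)                  # 2 * 2 * 2 combinations
capped_forms = enumerate_forms(example_sites, max_forms=3)  # truncated to 3

print(len(all_forms), len(capped_forms))
```

Three binary sites already yield eight forms, so a library of tens of thousands of compounds can expand considerably. In practice, a cheminformatics toolkit would be needed to decide which states are chemically appropriate and to assign 3D coordinates to each form.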