Institut de Chimie Organique et Analytique (ICOA), Université d'Orléans, UMR CNRS 7311, BP 6759, 45067 Orléans, France.
Greenpharma S.A.S. 3, allée du Titane, 45100 Orléans, France.
Curr Med Chem. 2020;27(38):6480-6494. doi: 10.2174/0929867326666190614160451.
Drug discovery is a challenging and expensive field. Hence, novel in silico tools have been developed for the early discovery stage to identify and prioritize novel molecules with suitable physicochemical properties. In many in silico drug design projects, molecular databases are screened with virtual screening tools to search for potentially bioactive molecules. The preparation of the molecules is therefore a key step in the success of well-established techniques such as docking, similarity searching or pharmacophore searching. We review here several toolkits used in the different steps of molecular database cleaning, integrated within a KNIME workflow. In the first step of the automated workflow, salts are removed and mixtures are split to obtain one compound per entry. Compounds with unwanted features are then filtered out, and duplicate entries are deleted while taking stereochemistry into account. As a compromise between exhaustiveness and computational time, the most populated tautomers at physiological pH are computed. Additionally, various flags are assigned to molecules using classical molecular descriptors, similarity searches against known libraries, or substructure search rules. Stereoisomers are then enumerated according to the unassigned chiral centers. Finally, three-dimensional coordinates, and optionally conformers, are generated. This workflow has already been applied to several drug design projects and can be used for molecular database preparation upon request.
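The first cleaning steps described in the abstract (salt removal, mixture splitting, filtering of unwanted features, and stereo-aware duplicate removal) can be illustrated with a minimal sketch. This is not the authors' KNIME workflow: it is a standard-library Python toy that works on raw SMILES strings. The names `SALTS_AND_SOLVENTS`, `ALLOWED_ELEMENTS`, `split_entry`, `has_unwanted_element` and `clean` are hypothetical, the salt list and element filter are assumptions chosen for illustration, and duplicates are detected by exact string match, whereas a real pipeline would canonicalize structures with a cheminformatics toolkit first.

```python
import re

# Hypothetical list of counter-ions and small solvents to strip.
# A real workflow would rely on a curated salt dictionary.
SALTS_AND_SOLVENTS = {"[Na+]", "[K+]", "[Cl-]", "[Br-]", "Cl", "Br", "O"}

# Hypothetical "unwanted feature" rule: any bracket atom whose element
# is outside this small organic set causes the compound to be rejected.
ALLOWED_ELEMENTS = {"H", "B", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"}

# Captures the element symbol of a SMILES bracket atom, e.g. "[Na+]" -> "Na",
# "[13C]" -> "C", "[nH]" -> "n" (aromatic atoms are written in lowercase).
BRACKET_ATOM = re.compile(r"\[\d*([A-Za-z][a-z]?)")


def split_entry(smiles):
    """Split a dot-separated SMILES into components and drop salts/solvents."""
    parts = [p for p in smiles.split(".") if p]
    return [p for p in parts if p not in SALTS_AND_SOLVENTS]


def has_unwanted_element(smiles):
    """Return True if any bracket atom is outside ALLOWED_ELEMENTS."""
    for symbol in BRACKET_ATOM.findall(smiles):
        if symbol.capitalize() not in ALLOWED_ELEMENTS:
            return True
    return False


def clean(entries):
    """Return one compound per entry, filtered and de-duplicated.

    The duplicate key is the full SMILES string, so stereo markers
    (@, @@, /, \\) are part of the key and stereoisomers stay distinct.
    """
    seen = set()
    cleaned = []
    for entry in entries:
        for component in split_entry(entry):
            if has_unwanted_element(component):
                continue
            if component in seen:
                continue
            seen.add(component)
            cleaned.append(component)
    return cleaned


raw = [
    "CC(=O)[O-].[Na+]",      # sodium acetate: counter-ion is stripped
    "C[C@H](N)C(=O)O",       # one enantiomer of alanine
    "C[C@@H](N)C(=O)O",      # the other enantiomer: kept as a distinct entry
    "C[C@H](N)C(=O)O.Cl",    # hydrochloride of the first: duplicate after stripping
    "CCO.c1ccccc1",          # mixture: split into two entries
    "C[Hg]C",                # contains mercury: filtered out
]

print(clean(raw))
```

Later steps of the workflow (tautomer computation at physiological pH, descriptor-based flags, stereoisomer enumeration and 3D coordinate generation) require real chemical perception and are deliberately left out of this sketch.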