PAP Rashidah Sa'adatul Bolkiah Institute of Health Sciences, Universiti Brunei Darussalam, Gadong, Brunei Darussalam.
Faculty of Integrated Technologies, Universiti Brunei Darussalam, Gadong, Brunei Darussalam.
Chem Biol Drug Des. 2022 Aug;100(2):185-217. doi: 10.1111/cbdd.14062. Epub 2022 May 8.
Cheminformatics utilizing machine learning (ML) techniques has opened up a new horizon in drug discovery. This is owing to the vast expansion of chemical space, with rapidly growing numbers of expected hits and lead compounds that match druggable macromolecular targets, in particular from natural compounds. Owing to their structural complexity, uniqueness, and diversity, natural products (NPs) could occupy a larger space in pharmaceuticals, allowing the industry to pursue more selective leads with binding affinities in the nanomolar range. ML is an essential part of each step of the drug design pipeline, such as target prediction, compound library preparation, and lead optimization. Notably, molecular mechanics and molecular dynamics simulations, induced-fit docking, and free energy perturbation are essential in predicting the best binding poses, binding free energy values, and molecular mechanics force fields. These applications have benefited from artificial intelligence (AI), which decreases the computational cost of such expensive simulations. This review aimed to describe chemical space and compound libraries related to NPs. High-throughput screening used for fractionating NPs and high-throughput virtual screening are reviewed, along with their strategies and significance. Particular emphasis was given to AI approaches, ML tools, algorithms, and techniques, especially in the drug discovery of macrocyclic compounds, and to approaches in computer-aided and ML-based drug discovery. Anthraquinone derivatives were discussed as a source of new lead compounds that can be developed using ML tools for diverse medicinal uses, such as treating cancer, infectious diseases, and metabolic disorders. Furthermore, the power of principal component analysis in understanding relevant protein conformations and the molecular modeling of protein-ligand interactions were presented.
Apart from being a concise reference for cheminformatics, this review is a useful text for understanding the application of ML-based algorithms to molecular dynamics simulation and to in silico absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction.