Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, 637371, Singapore.
School of Mathematical Science and LPMC, Nankai University, 300071, Tianjin, China.
Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbaa411.
Artificial intelligence (AI)-based drug design has demonstrated great potential to fundamentally change the pharmaceutical industry. A key open issue in AI-based drug design is the construction of efficient, transferable molecular descriptors or fingerprints. Here, we present a hypergraph-based molecular topological representation, hypergraph-based (weighted) persistent cohomology (HPC/HWPC), and HPC/HWPC-based molecular fingerprints for machine learning models in drug design. Molecular structures and their atomic interactions are highly complicated and pose great challenges for efficient mathematical representation. We develop the first hypergraph-based topological framework to characterize detailed molecular structures and interactions at the atomic level. Inspired by the elegant path complex model, hypergraph-based embedded homology and persistent homology have been proposed recently. Building on them, we construct HPC/HWPC and use them to generate molecular descriptors for learning models in protein-ligand binding affinity prediction, one of the key steps in drug design. Our models are tested on the three most commonly used databases, PDBbind-v2007, PDBbind-v2013 and PDBbind-v2016, and outperform all existing machine learning models with traditional molecular descriptors. Our HPC/HWPC models have demonstrated great potential in AI-based drug design.
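To make the idea of a filtered hypergraph concrete, the following is a minimal sketch and not the authors' construction. It assumes a toy rule in which each hyperedge enters the filtration at the largest pairwise distance among its atoms. The coordinates, the hyperedge list, and the function names `hyperedge_filtration`, `hypergraph_at` and `is_simplicial_complex` are all hypothetical and chosen for illustration only. The sketch also shows why a hypergraph is more general than a simplicial complex: a hyperedge may be present without all of its faces.

```python
from itertools import combinations
from math import dist  # Python 3.8+


def hyperedge_filtration(coords, hyperedges):
    """Assign each hyperedge the largest pairwise distance among its vertices.

    Assumption for illustration: this distance rule is a toy choice,
    not the filtration used in the paper. Single vertices get 0.0.
    """
    values = {}
    for edge in hyperedges:
        key = tuple(sorted(edge))
        values[key] = max(
            (dist(coords[i], coords[j]) for i, j in combinations(key, 2)),
            default=0.0,
        )
    return values


def hypergraph_at(values, threshold):
    """Return the hyperedges whose filtration value is at most `threshold`."""
    return {edge for edge, value in values.items() if value <= threshold}


def is_simplicial_complex(hyperedges):
    """Check whether every non-empty proper face of every hyperedge is present."""
    edges = set(hyperedges)
    return all(
        face in edges
        for edge in edges
        for k in range(1, len(edge))
        for face in combinations(edge, k)
    )


# Hypothetical toy data: three "atoms" and a hand-picked set of hyperedges.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
hyperedges = [(0,), (1,), (2,), (0, 1), (0, 1, 2)]

values = hyperedge_filtration(coords, hyperedges)
early = hypergraph_at(values, 1.0)  # the 3-vertex hyperedge is not yet present
late = hypergraph_at(values, 1.5)   # the 3-vertex hyperedge has appeared

print(sorted(early), is_simplicial_complex(early))
print(sorted(late), is_simplicial_complex(late))
```

At the larger threshold the hyperedge on all three atoms is present while the pairs (0, 2) and (1, 2) are not, so the structure is a hypergraph but not a simplicial complex. Computing embedded homology or cohomology on such a filtration is a separate step that this sketch does not attempt.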