Hu Xiuyuan, Liu Guoqing, Yao Quanming, Zhao Yang, Zhang Hao
Department of Electronic Engineering, Tsinghua University, Beijing, China.
Microsoft Research AI for Science, Beijing, China.
J Cheminform. 2024 Aug 7;16(1):94. doi: 10.1186/s13321-024-00883-4.
In recent years, significant advancements have been made in molecular generation algorithms aimed at facilitating drug development, and molecular diversity holds paramount importance within the realm of molecular generation. Nonetheless, the effective quantification of molecular diversity remains an elusive challenge, as extant metrics exemplified by Richness and Internal Diversity fall short in concurrently encapsulating the two main aspects of such diversity: quantity and dissimilarity. To address this quandary, we propose Hamiltonian diversity, a novel molecular diversity metric predicated upon the shortest Hamiltonian circuit. This metric embodies both aspects of molecular diversity in principle, and we implement its calculation with high efficiency and accuracy. Furthermore, through empirical experiments we demonstrate the high consistency of Hamiltonian diversity with real-world chemical diversity, and substantiate its effects in promoting diversity of molecular generation algorithms. Our implementation of Hamiltonian diversity in Python is available at: https://github.com/HXYfighter/HamDiv.

Scientific contribution: We propose a more rational molecular diversity metric for the community of cheminformatics and drug development. This metric can be applied to evaluation of existing molecular generation methods and enhancing drug design algorithms.
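To make the idea of a metric "predicated upon the shortest Hamiltonian circuit" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation (that lives in the HamDiv repository linked above), and it rests on several assumptions. Molecules are represented as hypothetical sets of integer feature IDs standing in for fingerprints. Dissimilarity is taken to be the Tanimoto distance between those sets. The circuit is solved exactly with Held-Karp dynamic programming, which is only practical for roughly a dozen molecules. The function names `tanimoto_distance`, `shortest_hamiltonian_circuit`, and `hamiltonian_diversity` are illustrative.

```python
from itertools import combinations


def tanimoto_distance(a, b):
    """Return 1 - |a & b| / |a | b| for two feature sets (assumed fingerprints)."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / union


def shortest_hamiltonian_circuit(dist):
    """Length of the shortest circuit visiting every node once (Held-Karp, exact)."""
    n = len(dist)
    if n < 2:
        return 0.0  # a single item has no circuit to traverse
    # best[(mask, j)]: cheapest path that starts at node 0, visits exactly
    # the nodes encoded in mask, and ends at node j.
    best = {(1 << j, j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            mask = sum(1 << j for j in subset)
            for j in subset:
                prev = mask ^ (1 << j)
                best[(mask, j)] = min(
                    best[(prev, k)] + dist[k][j] for k in subset if k != j
                )
    full = (1 << n) - 2  # every node except node 0
    # Close the circuit by returning to node 0.
    return min(best[(full, j)] + dist[j][0] for j in range(1, n))


def hamiltonian_diversity(feature_sets):
    """Sketch of the metric: shortest circuit length over pairwise distances."""
    dist = [[tanimoto_distance(a, b) for b in feature_sets] for a in feature_sets]
    return shortest_hamiltonian_circuit(dist)


if __name__ == "__main__":
    # Toy feature sets, purely illustrative.
    mols = [{1, 2}, {2, 3}, {3, 4}, {5, 6}]
    print(hamiltonian_diversity(mols))
```

In this sketch, adding an exact duplicate of an existing molecule leaves the value unchanged, while adding a dissimilar molecule lengthens the circuit, which is how a single number can reflect both quantity and dissimilarity. For realistic set sizes an exact solver is infeasible, so an approximate travelling-salesman heuristic would be needed instead.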