Lopez Perez Kenneth, López-López Edgar, Soulage Flavie, Felix Eloy, Medina-Franco José L, Miranda-Quintana Ramon Alain
Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States.
DIFACQUIM Research Group, School of Chemistry, Department of Pharmacy, National Autonomous University of Mexico, Mexico City 04510, Mexico.
J Chem Inf Model. 2025 Jul 14;65(13):6788-6796. doi: 10.1021/acs.jcim.5c00347. Epub 2025 Jun 13.
The number of known compounds, both synthesized and theoretical, is increasing rapidly, so it seems obvious that chemical space is expanding. But is the chemical diversity of compound libraries growing as well? In this study, we address this question by quantitatively assessing how the chemical diversity of public-domain compound libraries, as measured with molecular fingerprints, has evolved over time. Using iSIM and the BitBIRCH clustering algorithm, we conclude that, for the analyzed libraries and the fingerprints used to represent the chemical structures, an increasing number of molecules alone does not translate directly into greater diversity. With these tools, we identified which releases contributed to the diversity of the library and which regions of chemical space they affected. More importantly, the proposed pipeline can be applied to study the evolution of any chemical library and to assess how it covers chemical space.
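To make the diversity measurement concrete, the following is a minimal sketch of an iSIM-style calculation. It is not the authors' code. It assumes the column-sum formulation, in which the average pairwise Tanimoto similarity of N binary fingerprints is estimated in linear time from the number of "on" bits per column. The function names `isim_tanimoto` and `diversity_over_releases` are hypothetical, and the toy fingerprints stand in for real library releases.

```python
import numpy as np


def isim_tanimoto(fps):
    """Estimate the average pairwise Tanimoto similarity of binary fingerprints.

    Assumption: column-sum (iSIM-style) formulation. For each bit position
    with k "on" bits among n molecules, k*(k-1)/2 pairs share the bit and
    k*(n-k) pairs disagree on it.
    """
    fps = np.asarray(fps, dtype=np.int64)
    n = fps.shape[0]
    k = fps.sum(axis=0)            # number of molecules with each bit set
    on_on = k * (k - 1) / 2        # pairs where both molecules have the bit
    mismatch = k * (n - k)         # pairs where exactly one molecule has it
    denom = (on_on + mismatch).sum()
    return float(on_on.sum() / denom) if denom > 0 else 0.0


def diversity_over_releases(releases):
    """Return the cumulative iSIM value after each release is added.

    A lower value means the accumulated library is, on average, less
    self-similar, i.e. more diverse under this fingerprint.
    """
    seen, values = [], []
    for release in releases:
        seen.append(np.asarray(release))
        values.append(isim_tanimoto(np.vstack(seen)))
    return values


# Toy example: the second release adds a molecule unlike the first two.
release_1 = np.array([[1, 1, 0], [1, 1, 0]])
release_2 = np.array([[0, 0, 1]])
history = diversity_over_releases([release_1, release_2])
```

In this toy case the cumulative value drops when the second release is added, which is the kind of signal that would mark a release as contributing diversity. Adding many near-duplicates of existing molecules would instead leave the value high, however large the library becomes.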