Department of Chemistry and Applied Biosciences, ETH Zurich, Vladimir-Prelog-Weg 2, Zurich 8093, Switzerland.
J Chem Inf Model. 2024 Oct 28;64(20):7917-7924. doi: 10.1021/acs.jcim.4c01513. Epub 2024 Oct 10.
Molecular flexibility is a commonly used but not easily quantified term. It is central to understanding the composition and size of a conformational ensemble and contributes to many molecular properties. Many computational workflows require reducing a conformational ensemble to meaningful representatives; however, defining these representatives and guaranteeing the ensemble's completeness is difficult. We introduce torsion angular bin strings (TABS), a discrete vector representation of a conformer's dihedral angles, and the number of possible TABS (nTABS), an estimate of the size of a molecule's conformational ensemble. We show that nTABS corresponds to an upper limit for the size of the conformational space of small molecules, and we compare the classification of conformer ensembles by TABS with classification by RMSD. TABS overcomes known drawbacks of the RMSD measure, such as its dependence on molecular size and the need to pick a threshold, and is shown to discretize the conformational space meaningfully, which allows, for example, fast checks of how well the conformational space is covered. The current proof-of-concept implementation is based on the ETKDGv3 conformer generator as implemented in the RDKit and on known torsion preferences extracted from small-molecule crystallographic data.
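To make the idea of a discrete vector of binned dihedral angles concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It assumes the dihedral angles of each conformer are already available in degrees and that each torsion has a list of bin boundaries; the boundaries used here are hypothetical placeholders, whereas the abstract states that the actual bins derive from torsion preferences in small-molecule crystallographic data. The function names `torsion_bin`, `tabs`, and `n_tabs_upper_bound` are illustrative, and the count is a plain product of bin counts that ignores any symmetry or ring corrections the published method may apply.

```python
from bisect import bisect_right
from math import prod


def torsion_bin(angle_deg, boundaries):
    """Map one dihedral angle to a bin index.

    `boundaries` is a sorted list of angles in [0, 360) that cut the
    circle into len(boundaries) bins (hypothetical values in this sketch).
    """
    a = angle_deg % 360.0
    # bisect_right counts the boundaries <= a; an angle below the first
    # boundary wraps around the circle into the last bin.
    return (bisect_right(boundaries, a) - 1) % len(boundaries)


def tabs(angles_deg, boundaries_per_torsion):
    """Discrete vector of bin indices, one entry per torsion."""
    return tuple(
        torsion_bin(a, b) for a, b in zip(angles_deg, boundaries_per_torsion)
    )


def n_tabs_upper_bound(boundaries_per_torsion):
    """Naive count of possible bin vectors: product of bins per torsion."""
    return prod(len(b) for b in boundaries_per_torsion)


# Assumed example: one torsion with three bins, one with two bins.
bins = [[0.0, 120.0, 240.0], [90.0, 270.0]]

# Dihedral angles (degrees) of four made-up conformers.
conformers = [(62.0, 178.0), (-65.0, 175.0), (58.0, 5.0), (65.0, 182.0)]

observed = {tabs(c, bins) for c in conformers}
coverage = len(observed) / n_tabs_upper_bound(bins)
```

In this sketch, `coverage` is the fraction of possible bin vectors that the conformer set actually visits, which is the kind of fast coverage check the abstract refers to. Unlike an RMSD comparison, it needs no distance threshold: two conformers fall into the same class exactly when their bin vectors are equal.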