Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstrasse 2, D-53113 Bonn, Germany.
J Chem Inf Model. 2010 Apr 26;50(4):487-99. doi: 10.1021/ci900512g.
Atom pairs were among the first systematically derived fragment-type topological descriptors and are one of the origins of two-dimensional fingerprint searching. These descriptors remain popular and widely used to this day. Herein we introduce a new type of atom pair descriptor, the bonded atom pair, which exclusively captures short-range atom environment information and thus departs in its design from other topological descriptors that enumerate bond paths of varying length. Bonded atom pairs combine different types of structural information, including element type, hybridization state, aliphatic/aromatic character, and cyclic/acyclic arrangement. Systematic design led to a set of 117 bonded atom pairs, all of which occur in synthetic compounds. A further expanded set that accounts for specific halogen atoms and comprises a total of 159 descriptors is also provided. Atom pair distribution and frequency analysis in sets of compounds having different selectivity reveals that conventional and bonded atom pairs capture complementary structural information. In similarity searching, bonded atom pairs meet or exceed the performance of standard atom pairs and structural fragment fingerprints. The complementary nature of the structural information captured by atom pairs of different design is also reflected in individual search calculations. Taken together, our findings indicate that bonded atom pairs extend the current repertoire of topological molecular descriptors.
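To make the descriptor idea concrete, the following is a minimal sketch of how typed pairs over directly bonded atoms could be counted and compared. It is not the authors' implementation: the abstract does not list the 117 (or 159) descriptor definitions, so the `Atom` record, the `atom_type` label format, and the hand-written molecule graphs are hypothetical. The Tanimoto coefficient is likewise an assumption, chosen only because it is a common similarity measure for fingerprints; the abstract does not name the metric used.

```python
from collections import Counter
from typing import NamedTuple


class Atom(NamedTuple):
    """Hypothetical atom record holding the four properties named in the abstract."""
    element: str         # element type, e.g. "C", "N", "O"
    hybridization: str   # hybridization state: "sp", "sp2", or "sp3"
    aromatic: bool       # aromatic (True) vs. aliphatic (False) character
    in_ring: bool        # cyclic (True) vs. acyclic (False) arrangement


def atom_type(atom):
    """Build an illustrative type label; the format is an assumption."""
    character = "ar" if atom.aromatic else "al"
    arrangement = "cyc" if atom.in_ring else "acy"
    return f"{atom.element}.{atom.hybridization}.{character}.{arrangement}"


def bonded_atom_pairs(atoms, bonds):
    """Count typed pairs over directly bonded atoms only (no longer bond paths)."""
    counts = Counter()
    for i, j in bonds:
        # Sort the two labels so that A~B and B~A map to the same descriptor.
        first, second = sorted((atom_type(atoms[i]), atom_type(atoms[j])))
        counts[f"{first}~{second}"] += 1
    return counts


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient on the sets of descriptors present (assumed metric)."""
    keys_a, keys_b = set(fp_a), set(fp_b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


# Hand-specified heavy-atom graphs (hypothetical inputs, hydrogens omitted).
sp3_c = Atom("C", "sp3", False, False)
ethanol_atoms = [sp3_c, sp3_c, Atom("O", "sp3", False, False)]
ethylamine_atoms = [sp3_c, sp3_c, Atom("N", "sp3", False, False)]
chain_bonds = [(0, 1), (1, 2)]

ethanol_fp = bonded_atom_pairs(ethanol_atoms, chain_bonds)
ethylamine_fp = bonded_atom_pairs(ethylamine_atoms, chain_bonds)
similarity = tanimoto(ethanol_fp, ethylamine_fp)
```

In this toy case the two molecules share only the carbon-carbon pair, so one of three distinct descriptors is common to both. A real workflow would derive the atom properties from a cheminformatics toolkit rather than entering them by hand.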