Luan Yue, Li Xianlan, Kong Dingling, Li Wanli, Li Wei, Zhang Qingyou, Pang Aimin
Henan Engineering Research Center of Industrial Circulating Water Treatment, Henan Joint International Research Laboratory of Environmental Pollution Control Materials, Henan University, Kaifeng, 475004, China.
Science and Technology on Aerospace Chemical Power Laboratory, Hubei Institute of Aerospace Chemotechnology, Xiangyang, 441003, Hubei, China.
J Mol Graph Model. 2024 Jun;129:108752. doi: 10.1016/j.jmgm.2024.108752. Epub 2024 Mar 6.
Building on the atomic graph-theoretical index aEAID (atomic Extended Adjacency matrix IDentification) and the molecular adjacent topological index ATID (Adjacent Topological IDentification), both suggested by one of the authors (Zhang Q), a highly selective atomic topological index, aATID (atomic Adjacent Topological IDentification), was suggested in this study to identify equivalent atoms. The aATID index of an atom was derived from the number of hydrogen atoms attached to the atom, with bond types omitted. As a result, the index can identify atoms that are chemically equivalent even though they are perhaps not equivalent in the molecular graph. To test the uniqueness of the aATID index, virtual atomic data sets were derived from alkanes containing 15-20 carbon atoms and from the isomers of Octogen, and a real data set was derived from the NCI database. Only four pairs of atoms, from alkanes containing 20 carbons, could not be discriminated by aATID; that is, four pairs of degenerates were found for this data set. To solve this problem, the aATID index was modified by introducing distance factors between atoms, giving the 2-aATID index. Its uniqueness was examined on 5,939,902 atoms derived from alkanes containing 20 carbons and a further 16,166,984 atoms from alkanes containing 21 carbons, and no degenerates were found. In addition, another large real data set of 16,650,688 atoms derived from the PubChem database was used to test the uniqueness of both aATID and 2-aATID, and every atom was successfully discriminated by either index. Finally, the aATID index was applied to the identification of duplicate atoms as data pretreatment for QSPR (Quantitative Structure-Property Relationship) studies.
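The abstract does not give the aATID formula, so the following is only a minimal sketch of the general idea of finding equivalent atoms from attached-hydrogen counts while ignoring bond types. It swaps in a plain iterative neighbourhood refinement (in the style of Morgan or Weisfeiler-Lehman labelling), not the authors' index; the function names `equivalence_classes` and `_compress` and the input format are assumptions made for illustration. Unlike a highly selective index, this simple refinement is known to fail on some highly regular graphs.

```python
from collections import defaultdict


def _compress(keys):
    """Map arbitrary sortable keys to small integer labels (hypothetical helper)."""
    ranks = {key: rank for rank, key in enumerate(sorted(set(keys)))}
    return [ranks[key] for key in keys]


def equivalence_classes(atoms, bonds):
    """Group heavy atoms that look equivalent (illustrative, NOT the aATID index).

    atoms: list of (element_symbol, attached_hydrogen_count)
    bonds: list of (i, j) index pairs; bond orders are deliberately ignored
    """
    neighbors = defaultdict(list)
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Start from element and attached-hydrogen count only.
    labels = _compress(list(atoms))
    while True:
        # Refine each label with the sorted multiset of neighbour labels.
        refined = _compress([
            (labels[i], tuple(sorted(labels[j] for j in neighbors[i])))
            for i in range(len(atoms))
        ])
        # Refinement can only split classes, so an unchanged count means stable.
        if len(set(refined)) == len(set(labels)):
            break
        labels = refined

    classes = defaultdict(list)
    for index, label in enumerate(labels):
        classes[label].append(index)
    return sorted(classes.values())


# 2-methylbutane, hydrogen-suppressed: C0-C1(-C4)-C2-C3
isopentane = equivalence_classes(
    [("C", 3), ("C", 1), ("C", 2), ("C", 3), ("C", 3)],
    [(0, 1), (1, 2), (2, 3), (1, 4)],
)

# Nitromethane: the two nitro oxygens are grouped together because
# bond types (N=O versus N-O) are not used.
nitromethane = equivalence_classes(
    [("C", 3), ("N", 0), ("O", 0), ("O", 0)],
    [(0, 1), (1, 2), (1, 3)],
)
```

In the 2-methylbutane example the two methyl carbons on the branched carbon fall into one class, while the terminal methyl of the ethyl arm stays separate. Atoms grouped this way are the kind of duplicates that could be collapsed before building a QSPR data set.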