Shionyu-Mitsuyama Clara, Hijikata Atsushi, Tsuji Toshiyuki, Shirai Tsuyoshi
Department of Bioscience, Nagahama Institute of Bio-science and Technology, 1266 Tamura, Nagahama, 526-0829, Japan.
J Struct Funct Genomics. 2016 Dec;17(4):135-146. doi: 10.1007/s10969-016-9209-x. Epub 2016 Dec 23.
COMPLIG, a fast heuristic graph-matching algorithm for small molecules, was improved by adding a structural superposition step that verifies the atom-atom matching. The modified method was used to classify the small-molecule ligands in the Protein Data Bank (PDB) by their three-dimensional structures, and 16,660 types of ligands were classified into 7561 clusters. In contrast, the previous method (without structural superposition) generated 3371 clusters from the same ligand set. The characteristic feature of the current classification system is the increased number of singleton clusters, that is, clusters containing only one ligand molecule. Inspection of ligands that were singletons in the current classification system but not in the previous one implied that the major factors isolating them were differences in chirality, cyclic conformations, separation of substructures, and bond length. Comparisons between the current and previous classification systems revealed that the superposition-based classification was effective in clustering functionally related ligands, such as drugs targeted to specific biological processes, owing to the strictness of the atom-atom matching.
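To illustrate why a superposition check can separate ligands that a graph match alone would group together, the following is a minimal sketch and not the COMPLIG implementation. It assumes the atom-atom matching has already been produced (row i of one coordinate array is matched to row i of the other), uses the standard Kabsch algorithm to find the best proper rotation, and accepts the matching only if the resulting RMSD is below a cutoff. The function names and the 0.5 Å cutoff are hypothetical choices made for this example; the abstract does not state the criterion actually used.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between matched atoms after optimal superposition of P onto Q.

    P and Q are (N, 3) coordinate arrays in which row i of P is the atom
    matched to row i of Q. Only proper rotations are allowed, so mirror
    images are NOT superposed onto each other.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both coordinate sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Correct for a possible reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = (R @ Pc.T).T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


def matching_verified(P, Q, rmsd_cutoff=0.5):
    """Accept an atom-atom matching only if it superposes well.

    rmsd_cutoff (in angstroms) is an assumed value for illustration.
    """
    return kabsch_rmsd(P, Q) <= rmsd_cutoff


# Toy "ligand": a central atom with four substituents in a tetrahedral
# arrangement (coordinates are made up for the example).
ligand = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0],
    [-1.0, -1.0, 1.0],
])

# Same ligand, rotated 40 degrees about z and translated.
theta = np.deg2rad(40.0)
Rz = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
moved = ligand @ Rz.T + np.array([3.0, -2.0, 5.0])

# Mirror image (opposite chirality): identical bond graph, different 3D shape.
mirrored = ligand * np.array([-1.0, 1.0, 1.0])

print("rotated copy verified:", matching_verified(ligand, moved))
print("mirror image verified:", matching_verified(ligand, mirrored))
```

In this toy case the rotated copy passes the check while the mirror image fails it, even though both have the same connectivity as the original. This mirrors the abstract's observation that chirality differences are one of the factors that isolate ligands into singleton clusters under the stricter matching.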