Novartis Institute for Tropical Diseases, Chromos, Singapore, Singapore.
J Comput Aided Mol Des. 2012 Oct;26(10):1127-41. doi: 10.1007/s10822-012-9604-8. Epub 2012 Sep 16.
Compounds known to be potent against a specific protein target may share a signature profile of common substructures that is highly correlated with their potency. Such substructure profiles may be useful for enriching compound libraries or for prioritizing compounds against a specific protein target. With this objective in mind, a set of compounds with known potency against six selected kinases (two from each of three kinase families) was used to generate binary molecular fingerprints. Each fingerprint key represents a substructure found within a compound, and the frequency with which each key occurs was tabulated. A frequent pattern mining technique was then applied to uncover substructures that are well represented among known potent inhibitors but unrepresented among known inactive compounds, and vice versa. Substructure profiles representative of potent inhibitors against each of the three kinase families were thus extracted. In our validation, these substructure profiles showed significant enrichment for highly potent compounds against their respective kinase targets. The advantages of this approach over conventional methods for analyzing such datasets, and its application to mining substructures for enriching compound libraries, are presented.
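The mining step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the fingerprint type, the mining algorithm, or any thresholds, so the function names (`support`, `discriminative_patterns`), the cutoffs (`min_active`, `max_inactive`), and the toy data are all assumptions made for illustration. Each compound is represented as the set of fingerprint keys that are switched on, and a pattern is kept when it is frequent among actives and rare among inactives.

```python
from itertools import combinations


def support(itemset, fingerprints):
    """Fraction of fingerprints (sets of on-bit keys) that contain every key in itemset."""
    if not fingerprints:
        return 0.0
    hits = sum(1 for fp in fingerprints if itemset <= fp)
    return hits / len(fingerprints)


def discriminative_patterns(actives, inactives,
                            min_active=0.6, max_inactive=0.2, max_size=2):
    """Return key combinations frequent in actives and rare in inactives.

    Thresholds are hypothetical defaults, not values taken from the paper.
    """
    # Apriori-style pruning: a frequent itemset can only contain keys
    # that are individually frequent among the actives.
    keys = sorted({k for fp in actives for k in fp})
    frequent_keys = [k for k in keys
                     if support(frozenset([k]), actives) >= min_active]

    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(frequent_keys, size):
            itemset = frozenset(combo)
            s_act = support(itemset, actives)
            if s_act < min_active:
                continue
            s_inact = support(itemset, inactives)
            if s_inact <= max_inactive:
                results.append((combo, s_act, s_inact))

    # Largest gap between active and inactive support first.
    results.sort(key=lambda r: (r[2] - r[1], r[0]))
    return results


# Toy data (invented): integers stand in for fingerprint keys.
actives = [{1, 2, 3}, {1, 2, 4}, {1, 2, 5}, {2, 6}]
inactives = [{2, 3}, {2, 5}, {6, 7}, {1, 7}]

patterns = discriminative_patterns(actives, inactives)
for combo, s_act, s_inact in patterns:
    print(combo, s_act, s_inact)
```

In this toy set neither key 1 nor key 2 passes on its own, because each also appears in more than 20% of the inactives, but the pair (1, 2) occurs in 75% of actives and in no inactive. This is the kind of combined substructure profile that a single-key frequency table would miss. Swapping `actives` and `inactives` in the call gives the "vice versa" case, that is, patterns characteristic of inactive compounds.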