Fontaine Fabien, Pastor Manuel, Gutiérrez-de-Terán Hugo, Lozano Juan J, Sanz Ferran
Research Group on Biomedical Informatics (GRIB), IMIM, Universitat Pompeu Fabra, C/ Dr. Aiguader 80, Barcelona, Spain.
Mol Divers. 2003;6(2):135-47. doi: 10.1023/b:modi.0000006840.89805.e1.
The selection of a sample of diverse compounds is a common strategy for exploring large molecular libraries. However, the success of such an approach depends on the selection of relevant molecular descriptors and the use of appropriate sampling methods. In the context of pharmaceutical research, the molecular descriptors should be based on physicochemical properties related to the pharmacological behaviour of the compounds. In this respect, the alignment-free GRIND and VolSurf molecular descriptors are promising candidates, since they have been used successfully in modelling both pharmacodynamic and pharmacokinetic properties of drugs. This work describes the use of these descriptors in the diversity sampling of a library of primary amines and compares the results with those obtained in a previous study that used quantum-mechanical descriptors. As in the previous work, principal component (PC) analysis was applied to reduce the dimensionality and remove redundant information from the original descriptors, and the compounds were sampled by k-means clustering in the space of the selected PCs. The results of the present study show that VolSurf and GRIND provide sampling of similar quality with respect to global features of the molecules such as hydrophilicity, but that they treat the topology of the compounds differently. The similarity between particular compounds strongly depends on the original descriptors used. However, all the samples selected in the PC space after k-means clustering show the same apparent diversity relative to the whole dataset. The results indicate that no set of descriptors is best on a diversity basis alone; the selection of descriptors must be based on the drug features to be investigated.
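The sampling workflow summarised in the abstract (dimensionality reduction by PC analysis, then k-means clustering in the PC space) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how descriptors were scaled, how many PCs were kept, how the clusters were seeded, or how a compound was picked from each cluster, so the autoscaling, the two retained PCs, the farthest-first seeding and the "nearest to centroid" pick are all assumptions made here for illustration. The descriptor values are invented toy numbers, not GRIND or VolSurf output, and every function name is hypothetical.

```python
import math
import random

def dist2(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def standardize(rows):
    """Autoscale each descriptor column to zero mean, unit variance (assumed preprocessing)."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(m)]
    return [[(r[j] - means[j]) / stds[j] for j in range(m)] for r in rows]

def pca_scores(rows, n_components, n_iter=200):
    """Project mean-centred rows onto the leading PCs (power iteration with deflation)."""
    n, m = len(rows), len(rows[0])
    cov = [[sum(r[i] * r[j] for r in rows) / n for j in range(m)] for i in range(m)]
    rng = random.Random(0)  # fixed seed so the sketch is reproducible
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        length = math.sqrt(sum(x * x for x in v))
        v = [x / length for x in v]
        for _ in range(n_iter):
            w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:  # no variance left to explain
                break
            v = [x / norm for x in w]
        # Rayleigh quotient approximates the eigenvalue; deflate to expose the next PC.
        eigval = sum(v[i] * sum(cov[i][j] * v[j] for j in range(m)) for i in range(m))
        cov = [[cov[i][j] - eigval * v[i] * v[j] for j in range(m)] for i in range(m)]
        components.append(v)
    return [[sum(r[j] * c[j] for j in range(m)) for c in components] for r in rows]

def farthest_first(points, k):
    """Deterministic seeding (an assumption): start at point 0, then add the farthest point."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(dist2(p, s) for s in seeds)))
    return [list(s) for s in seeds]

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means; returns cluster labels and centroids."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def select_diverse(descriptors, n_select, n_components=2):
    """Pick one compound per cluster: the one nearest its centroid (assumed rule)."""
    scores = pca_scores(standardize(descriptors), n_components)
    labels, centroids = kmeans(scores, n_select)
    picks = []
    for c in range(n_select):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            picks.append(min(members, key=lambda i: dist2(scores[i], centroids[c])))
    return sorted(picks)

# Invented toy descriptor matrix: 9 compounds x 4 descriptors, in three obvious groups.
toy = [
    [1.0, 0.2, 5.0, 0.1], [1.1, 0.3, 5.2, 0.2], [0.9, 0.1, 4.9, 0.1],
    [5.0, 3.0, 1.0, 0.2], [5.2, 3.1, 1.1, 0.1], [4.9, 2.9, 0.9, 0.3],
    [9.0, 0.2, 3.0, 4.0], [9.1, 0.1, 3.1, 4.2], [8.8, 0.3, 2.9, 3.9],
]
picks = select_diverse(toy, n_select=3)
print("selected compound indices:", picks)
```

Swapping the descriptor matrix (for example, GRIND for VolSurf values) changes which compounds end up close together in the PC space, and this is the descriptor dependence the abstract reports. In practice a numerical library would normally replace the hand-written PCA and k-means routines.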