Katritzky Alan R, Dobchev Dimitar A, Slavov Svetoslav, Karelson Mati
Institute of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn 19086, Estonia.
J Chem Inf Model. 2008 Nov;48(11):2207-13. doi: 10.1021/ci8002073.
The use of large descriptor pools in multilinear QSAR/QSPR approaches has recently drawn increasing criticism for its sensitivity to "chance correlations". Statistical experiments that replace "real descriptor" pools with random numbers have been said to demonstrate this sensitivity. While they contribute positively to the improvement of QSAR/QSPR methodology, these studies claim complete interchangeability between the molecular descriptors used in QSAR/QSPR models and random numbers. Here, we demonstrate that, when used correctly, large molecular descriptor pools (i) are not comparable with random numbers and (ii) can yield very helpful QSPR conclusions.
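The "chance correlation" effect behind the criticism can be illustrated with a minimal sketch. It is not the procedure used in this paper or in the studies it responds to. It only shows, under assumed settings (30 hypothetical compounds, Gaussian random "descriptors", a random "property", and selection of the single best-correlated column), how the best apparent fit found in a pool of pure random numbers grows as the pool gets larger. The names `best_r2`, `n_compounds`, and `pool_sizes` are illustrative.

```python
import numpy as np


def best_r2(X, y):
    """Largest squared Pearson correlation between y and any single column of X."""
    Xc = X - X.mean(axis=0)          # center each descriptor column
    yc = y - y.mean()                # center the property vector
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return float(np.max(r ** 2))


rng = np.random.default_rng(0)       # fixed seed so the run is repeatable
n_compounds = 30                     # assumed data set size
y = rng.normal(size=n_compounds)     # random "property", unrelated to any descriptor
pool = rng.normal(size=(n_compounds, 1000))  # pool of random "descriptors"

# Nested pools: each larger pool contains the smaller ones,
# so the best R^2 can only stay the same or increase.
pool_sizes = [10, 100, 1000]
results = {k: best_r2(pool[:, :k], y) for k in pool_sizes}

for k, r2 in results.items():
    print(f"pool size {k:5d}: best single-descriptor R^2 = {r2:.3f}")
```

Because the pools are nested, the best R² never decreases as the pool grows, even though no column carries real information. Whether real molecular descriptors behave this way is exactly what the paper disputes, so the sketch illustrates the objection and not the authors' result.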