Ognichenko Liudmyla N, Kuz'min Victor E, Gorb Leonid, Hill Frances C, Artemenko Anatoly G, Polischuk Pavel G, Leszczynski Jerzy
Laboratory of Theoretical Chemistry, Department of Molecular Structure, A.V. Bogatsky Physical-Chemical Institute, National Academy of Sciences of Ukraine, Lustdorfskaya Doroga 86, Odessa 65080, Ukraine.
Badger Technical Services, LLC, Vicksburg, Mississippi, USA.
Mol Inform. 2012 Apr;31(3-4):273-80. doi: 10.1002/minf.201100102. Epub 2012 Mar 12.
The relationship between the octanol-water partition coefficient of more than twelve thousand organic compounds and their structures was investigated using a QSPR approach based on the Simplex Representation of Molecular Structure (SiRMS). The dataset included 10973 compounds with experimental lipophilicity values (LogKow). The Random Forest (RF) method was used for statistical modeling at the 2D level of representation of molecular structure. The developed models are adequate and were successfully validated with external test sets. Because they rely on the simplex representation of molecular structure, the proposed models are clearly interpretable, and they predict LogKow values with accuracy matching that of the best modern models. Thus, the QSPR models proposed in this study represent a powerful and easy-to-use virtual screening tool that can be recommended for prediction of the octanol-water partition coefficient.
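As a rough illustration of what a 2D fragment-count descriptor of this kind might look like, the sketch below counts four-atom fragments in a small molecular graph. It is a toy under stated assumptions, not the authors' SiRMS software: it assumes a simplex is any set of four atoms, and it labels each fragment only by its sorted element symbols and the number of bonds among those four atoms. The function name `simplex_counts` and the input format (a list of element symbols plus a list of bonded index pairs) are hypothetical choices made for this example.

```python
from collections import Counter
from itertools import combinations


def simplex_counts(atoms, bonds):
    """Count four-atom fragments of a molecular graph (toy descriptor).

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs for bonded atoms
    Each fragment is keyed by (sorted element symbols, bonds inside it).
    This key is an assumed simplification, not the SiRMS definition.
    """
    bond_set = {frozenset(b) for b in bonds}
    counts = Counter()
    for quad in combinations(range(len(atoms)), 4):
        elements = tuple(sorted(atoms[i] for i in quad))
        # Number of bonds whose both ends lie inside this four-atom set.
        n_bonds = sum(1 for pair in combinations(quad, 2)
                      if frozenset(pair) in bond_set)
        counts[(elements, n_bonds)] += 1
    return counts


# Heavy-atom skeleton of 1-butanol: C-C-C-C-O
atoms = ["C", "C", "C", "C", "O"]
bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]

for key, n in sorted(simplex_counts(atoms, bonds).items()):
    print(key, n)
```

In a workflow like the one the abstract describes, a vector of such fragment counts per compound would serve as the input features for a Random Forest regression against experimental LogKow values, and the per-fragment features are what would make the resulting model interpretable.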