CSIRO Molecular & Health Technologies, Private Bag 10, Clayton South MDC, Clayton, Victoria 3168, Australia.
J Mol Graph Model. 2010 Apr;28(7):593-7. doi: 10.1016/j.jmgm.2009.12.004. Epub 2009 Dec 16.
Two sparse Bayesian methods were used to derive predictive models of the solubility of organic dyes and polycyclic aromatic compounds in supercritical carbon dioxide (scCO2) over a wide range of temperatures (285.9-423.2 K) and pressures (60-1400 bar): multiple linear regression with an expectation maximization algorithm and a sparse prior (MLREM), and a non-linear Bayesian Regularized Artificial Neural Network with a Laplacian Prior (BRANNLP). A randomly selected test set was used to estimate the predictive ability of the models. The MLREM method produced a model with predictivity similar to that of the less sparse multiple linear regression (MLR) method, while the non-linear BRANNLP method produced models with substantially better predictivity than either the MLREM or the MLR-based models. The BRANNLP method simultaneously generated context-relevant subsets of descriptors and a robust, non-linear quantitative structure-property relationship (QSPR) model for compound solubility in scCO2. The differences between linear and non-linear descriptor selection methods are discussed.
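To illustrate how a sparse (Laplacian) prior can prune descriptors from a linear model, the following is a minimal sketch, not the authors' MLREM implementation, whose update equations are not given in this abstract. It assumes the standard result that the MAP estimate under a Laplacian prior minimizes a least-squares loss plus an L1 penalty, and it uses a generic EM-style iteratively reweighted ridge scheme. The function name `sparse_linear_em`, the `penalty` parameter, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def sparse_linear_em(X, y, penalty=1.0, n_iter=200, eps=1e-8, prune_tol=1e-6):
    """Illustrative sparse linear fit (hypothetical helper, not the paper's code).

    Approximately minimizes 0.5 * ||y - X w||^2 + penalty * sum(|w_j|)
    by repeatedly solving a ridge problem with per-weight penalties.
    """
    # Start from the ordinary least-squares solution.
    w = np.linalg.lstsq(X, y, rcond=None)[0]
    gram = X.T @ X
    xty = X.T @ y
    for _ in range(n_iter):
        # E-step analogue: each weight gets a ridge strength penalty / |w_j|,
        # so small weights are penalized more heavily on the next pass.
        d = penalty / (np.abs(w) + eps)
        # M-step: solve the reweighted ridge system for the new weights.
        w = np.linalg.solve(gram + np.diag(d), xty)
    # Weights driven to numerical zero are treated as pruned descriptors.
    w[np.abs(w) < prune_tol] = 0.0
    return w


# Synthetic example: six descriptors, only two of which affect the response.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
true_w = np.array([2.0, 0.0, -3.0, 0.0, 0.0, 0.0])
y_demo = X_demo @ true_w + 0.1 * rng.normal(size=200)

w_demo = sparse_linear_em(X_demo, y_demo, penalty=10.0)
print("selected descriptors:", np.flatnonzero(w_demo))
print("weights:", np.round(w_demo, 3))
```

In this sketch, descriptors whose weights shrink below `prune_tol` are dropped, which is one simple way a sparse prior can perform descriptor selection and model fitting in a single procedure. A non-linear analogue such as BRANNLP would place a similar prior on neural network weights, which this example does not attempt.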