Mousavi Setare Loh, Sajjadi S Maryam
Faculty of Chemistry, Semnan University, Semnan, Iran
RSC Adv. 2023 Aug 8;13(34):23754-23771. doi: 10.1039/d3ra03177b. eCollection 2023 Aug 4.
In this work, a quantitative structure-activity relationship (QSAR) study was performed on a set of emerging contaminants (ECs) to predict their rejection by reverse osmosis (RO) membranes. A wide range of molecular descriptors was calculated with Dragon software for 72 ECs. The QSAR data were analyzed by an artificial neural network (ANN) method, in which four out of 3000 theoretical molecular descriptors were selected and their significance was computed using the Garson method. In descending order of significance, the descriptors ranked as follows: ESpm14u > R2e > SIC1 > EEig03d. An exploratory study was then conducted on the QSAR data to show how the molecular descriptors and structures relate to the rejection values of the ECs. A multiple linear regression (MLR) algorithm was used to build a linear model, and its results were compared with those of the nonlinear ANN algorithm. The comparison showed that the nonlinear ANN model is needed for these data, which have nonlinear properties. For the whole dataset, the correlation coefficient and root mean squared error (RMSE) were 0.9528 and 6.4224 for ANN, and 0.8753 and 11.3400 for MLR, respectively. These results demonstrate the superiority of ANN modeling in the analysis of QSAR data for ECs.
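The descriptor ranking above rests on the Garson method, which apportions the output of a trained network among its inputs using the absolute connection weights. The following is a minimal sketch, assuming a network with a single hidden layer and one output; the abstract does not state the architecture. The function name `garson_importance` and the weight values are hypothetical, the inputs are deliberately given generic labels, and the printed numbers are illustrative only and do not reproduce the reported ranking.

```python
import numpy as np

def garson_importance(w_in, w_out):
    """Relative importance of each input (Garson-style weight partitioning).

    w_in:  (n_inputs, n_hidden) input-to-hidden weights
    w_out: (n_hidden,) hidden-to-output weights
    Assumes one hidden layer and a single output neuron.
    """
    w_in = np.abs(np.asarray(w_in, dtype=float))
    w_out = np.abs(np.asarray(w_out, dtype=float)).ravel()
    # Share of each input in the total incoming weight of every hidden neuron.
    share = w_in / w_in.sum(axis=0, keepdims=True)
    # Weight each hidden neuron by the strength of its link to the output.
    contrib = (share * w_out).sum(axis=1)
    # Normalise so the importances sum to 1.
    return contrib / contrib.sum()

# Hypothetical weights (random, NOT from the paper): 4 inputs, 3 hidden neurons.
rng = np.random.default_rng(0)
w_in = rng.normal(size=(4, 3))
w_out = rng.normal(size=3)

importance = garson_importance(w_in, w_out)
for label, value in zip(["x1", "x2", "x3", "x4"], importance):
    print(f"{label}: {value:.3f}")
```

Because only absolute weights are used, the result indicates how strongly each descriptor influences the prediction, not the direction of that influence.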