Banerjee Arkaprava, Kar Supratik, Pore Souvik, Roy Kunal
Drug Theoretics and Cheminformatics Laboratory, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, India.
Chemometrics & Molecular Modeling Laboratory, Department of Chemistry, Kean University, Union, NJ, USA.
Nanotoxicology. 2023 Feb;17(1):78-93. doi: 10.1080/17435390.2023.2186280. Epub 2023 Mar 8.
Experimental nanotoxicity data are generally scarce, which warrants both data gap-filling methods and the exploration of novel approaches for effective modeling. Read-Across Structure-Activity Relationship (RASAR) is an emerging cheminformatic approach that combines the strengths of a QSAR model with similarity-based Read-Across predictions. In this work, we generated simple, interpretable, and transferable quantitative RASAR (q-RASAR) models that can efficiently predict the cytotoxicity of TiO2-based multi-component nanoparticles. A data set of 29 TiO2-based nanoparticles with specific amounts of noble metal precursors was rationally divided into training and test sets, and Read-Across-based predictions were generated for the test set. The optimized hyperparameters and the similarity approach that yielded the best predictions were then used to calculate the similarity- and error-based RASAR descriptors. The RASAR descriptors were fused with the chemical descriptors, followed by best subset feature selection. The final set of selected descriptors was used to develop the q-RASAR models, which were validated using the stringent OECD criteria. Finally, a random forest model was also developed with the selected descriptors; it efficiently predicted the cytotoxicity of TiO2-based multi-component nanoparticles and surpassed previously reported models in prediction quality, showing the merits of the q-RASAR approach. To further evaluate its usefulness, we also applied the q-RASAR approach to a second cytotoxicity data set of 34 heterogeneous TiO2-based nanoparticles, which further confirmed that incorporating RASAR descriptors enhances the external prediction quality of QSAR models.
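The descriptor-fusion step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the descriptor definitions, the similarity measure, or the hyperparameter values, so everything below is an assumption made for illustration: a Gaussian-kernel similarity, three example descriptors (similarity-weighted read-across prediction, weighted standard deviation of neighbour responses, mean similarity), and the hypothetical names `gaussian_similarity`, `rasar_descriptors`, `fuse`, `k`, and `sigma`. The toy numbers are not data from the study.

```python
import numpy as np


def gaussian_similarity(x, X_train, sigma=1.0):
    """Gaussian-kernel similarity between one query vector and each training row.

    Assumption: the kernel and sigma are illustrative choices only.
    """
    d2 = np.sum((X_train - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def rasar_descriptors(x, X_train, y_train, k=3, sigma=1.0):
    """Compute three illustrative RASAR-style descriptors for one query compound.

    Returns [read-across prediction, weighted SD of neighbour responses,
    mean similarity], all taken over the k most similar training compounds.
    """
    sim = gaussian_similarity(x, X_train, sigma)
    idx = np.argsort(sim)[::-1][:k]          # indices of the k closest neighbours
    w, y = sim[idx], y_train[idx]
    ra_pred = np.sum(w * y) / np.sum(w)      # similarity-weighted average response
    sd = np.sqrt(np.sum(w * (y - ra_pred) ** 2) / np.sum(w))
    return np.array([ra_pred, sd, w.mean()])


def fuse(X_query, X_train, y_train, k=3, sigma=1.0):
    """Append the RASAR-style descriptors to the original chemical descriptors.

    Note: for training compounds, the compound itself should be left out of
    its own neighbour set; that step is omitted here for brevity.
    """
    rasar = np.vstack(
        [rasar_descriptors(x, X_train, y_train, k, sigma) for x in X_query]
    )
    return np.hstack([X_query, rasar])


# Toy data (made up): four training compounds, two chemical descriptors each.
X_train = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
y_train = np.array([1.0, 2.0, 3.0, 10.0])
X_test = np.array([[0.0, 0.0], [1.5, 1.5]])

fused_test = fuse(X_test, X_train, y_train, k=3, sigma=1.0)
print(fused_test.shape)
```

In a workflow like the one the abstract outlines, a fused matrix of this kind would then go through feature selection and model fitting; `k` and `sigma` stand in for the hyperparameters that the authors optimized on read-across predictions.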