Helma Christoph, Rautenberg Micha, Gebele Denis
In Silico Toxicology GmbH, Basel, Switzerland.
Front Pharmacol. 2017 Jun 16;8:377. doi: 10.3389/fphar.2017.00377. eCollection 2017.
The lazar framework for read-across predictions was expanded to predict nanoparticle toxicities, and a new methodology for calculating nanoparticle descriptors from core and coating structures was implemented. Nano-lazar provides a flexible and reproducible framework for downloading data and ontologies from the open eNanoMapper infrastructure and for developing and validating nanoparticle read-across models, together with open-source code and a free graphical interface for nanoparticle read-across predictions. In this study we compare different nanoparticle descriptor sets and local regression algorithms. Sixty independent cross-validation experiments were performed for the Net Cell Association endpoint of the Protein Corona dataset. The best RMSE and r² results originated from models with protein corona descriptors and the weighted random forest algorithm, but their 95% prediction intervals are significantly less accurate than those of models with simpler descriptor sets (measured and calculated nanoparticle properties). The most accurate prediction intervals were obtained with measured nanoparticle properties (no statistically significant difference (p < 0.05) in RMSE and r² values compared to protein corona descriptors). Calculated descriptors are interesting for cheap and fast high-throughput screening purposes. RMSE and prediction intervals of random forest models are comparable to protein corona models, but r² values are significantly lower.
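The three quantities compared in the abstract (RMSE, r², and the accuracy of the 95% prediction interval) can be illustrated with a minimal Python sketch. It is not taken from the nano-lazar code base: the function names and the toy numbers are hypothetical, r² is assumed here to be the squared Pearson correlation between measured and predicted values, and interval accuracy is assumed to mean the fraction of measurements that fall inside their prediction interval.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def r_squared(measured, predicted):
    """Squared Pearson correlation (an assumed definition of r^2)."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_m * var_p)


def interval_coverage(measured, lower, upper):
    """Fraction of measurements lying inside [lower, upper] (assumed metric)."""
    inside = sum(1 for m, lo, hi in zip(measured, lower, upper) if lo <= m <= hi)
    return inside / len(measured)


# Hypothetical toy data, for illustration only (not from the Protein Corona dataset).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
lower = [p - 0.4 for p in predicted]  # made-up interval bounds
upper = [p + 0.4 for p in predicted]

print("RMSE:", rmse(measured, predicted))
print("r^2:", r_squared(measured, predicted))
print("interval coverage:", interval_coverage(measured, lower, upper))
```

Under these assumptions, a model can score well on RMSE and r² while its intervals still cover fewer measurements than the nominal 95%, which is the kind of trade-off the abstract reports for protein corona descriptors.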