Department of Bioengineering, Izmir Institute of Technology, Izmir, Turkey.
Department of Material Science and Engineering, Izmir Institute of Technology, Izmir, Turkey.
J Drug Target. 2024 Dec;32(1):66-73. doi: 10.1080/1061186X.2023.2288995. Epub 2024 Jan 12.
There is strong interest in improving the therapeutic potential of gold nanoparticles (GNPs) while ensuring their safe development. The utility of GNPs in medicine requires a molecular-level understanding of how GNPs interact with biological systems. Despite considerable research efforts devoted to monitoring the internalisation of GNPs, the factors responsible for the variability in GNP uptake across different cell types remain insufficiently understood. Data-driven models are useful for identifying the sources of this variability. Here, we trained multiple machine learning models on 2077 data points for 193 individual nanoparticles from 59 independent studies to predict the cellular uptake level of GNPs, and compared the predictive performance of the different algorithms. The five ensemble learners (XGBoost, random forest, bootstrap aggregation, gradient boosting, light gradient boosting machine) made the best predictions of GNP uptake, accounting for 80-90% of the variance in the test data. The models identified particle size, zeta potential, GNP concentration and exposure duration as the most important drivers of cellular uptake. We expect this proof-of-concept study will foster more effective use of accumulated cellular uptake data for GNPs and minimise any methodological bias in individual studies that may lead to under- or over-estimation of cellular internalisation rates.
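The "80-90% of the variance in the test data" figure is presumably a coefficient of determination (R²) computed on held-out samples; the abstract does not state how it was calculated. As a minimal sketch of that standard metric, the function below computes R² = 1 - SS_res / SS_tot in plain Python. The function name `r_squared` and the uptake values are hypothetical and are not taken from the study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_y = sum(y_true) / len(y_true)
    # Total variation of the observed values around their mean.
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    # Variation left unexplained by the model's predictions.
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Made-up held-out uptake values (arbitrary units) and model predictions,
# for illustration only; these are NOT data from the study.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 2.5, 2.5, 4.5, 5.5]
score = r_squared(observed, predicted)
print(f"Fraction of variance explained: {score:.3f}")
```

A score near 1 means the predictions track the observed uptake closely, while a score of 0 means the model does no better than always predicting the mean of the test data.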