U.S. Environmental Protection Agency, National Center for Computational Toxicology, Research Triangle Park, North Carolina, United States of America.
Oak Ridge Institute for Science Education Postdoctoral Fellow, Oak Ridge, Tennessee, United States of America.
PLoS One. 2018 Jul 25;13(7):e0196963. doi: 10.1371/journal.pone.0196963. eCollection 2018.
High-throughput screening (HTS) projects such as the U.S. Environmental Protection Agency's ToxCast program are needed to address the large and rapidly increasing number of chemicals for which we have few or no toxicity measurements. Concentration-response parameters such as potency and efficacy are extracted from HTS data using nonlinear regression, and models and analyses built from these parameters are used to predict the in vivo and in vitro toxicity of thousands of chemicals. How these predictions are affected by uncertainties that stem from parameter estimation and are propagated through the models and analyses has not been well explored. While data size and complexity make uncertainty quantification computationally expensive for HTS datasets, continued advances in computational resources have allowed these challenges to be met. This study uses nonparametric bootstrap resampling to calculate uncertainties in concentration-response parameters from a variety of HTS assays. Using the ToxCast estrogen receptor model for bioactivity as a case study, we highlight how these uncertainties can be propagated through models to quantify the uncertainty in model outputs. Uncertainty quantification in model outputs is used to identify potential false positives and false negatives and to determine the distribution of model values around semi-arbitrary activity cutoffs, increasing confidence in model predictions. At the individual chemical-assay level, curves with high variability are flagged for manual inspection or retesting, focusing subject-matter-expert time on results that need further input. This work improves the confidence of predictions made using HTS data, increasing the ability to use these data in risk assessment.
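The bootstrap step described in the abstract can be illustrated with a minimal sketch. It is not the study's implementation: the zero-baseline Hill model, the grid-search fit, the choice to resample replicates within each concentration, the synthetic data, and all function names (`hill`, `fit_hill`, `bootstrap_fits`, `percentile_interval`, `simulate`) are assumptions made for illustration only.

```python
import random


def hill(conc, top, log_ac50, slope):
    """Hill curve with a baseline of zero (assumed model form)."""
    return top / (1.0 + (10.0 ** log_ac50 / conc) ** slope)


def fit_hill(points):
    """Least-squares fit by grid search over (log_ac50, slope).

    For fixed log_ac50 and slope the model is linear in `top`,
    so the best `top` has a closed form.
    """
    best = None
    for i in range(-40, 41):  # log10(AC50) from -2 to 2 in steps of 0.1
        log_ac50 = i / 10
        for slope in (0.5, 1.0, 1.5, 2.0, 3.0):
            f = [hill(c, 1.0, log_ac50, slope) for c, _ in points]
            top = sum(fi * y for fi, (_, y) in zip(f, points)) / sum(fi * fi for fi in f)
            sse = sum((y - top * fi) ** 2 for fi, (_, y) in zip(f, points))
            if best is None or sse < best[0]:
                best = (sse, top, log_ac50, slope)
    return {"top": best[1], "log_ac50": best[2], "slope": best[3]}


def bootstrap_fits(data, n_boot=200, seed=0):
    """Refit the curve on resampled data.

    `data` maps concentration -> list of replicate responses. Replicates are
    resampled with replacement within each concentration (an assumed scheme),
    so every concentration stays represented in each resample.
    """
    rng = random.Random(seed)
    fits = []
    for _ in range(n_boot):
        resampled = [(c, y) for c, ys in data.items()
                     for y in rng.choices(ys, k=len(ys))]
        fits.append(fit_hill(resampled))
    return fits


def percentile_interval(values, lower=2.5, upper=97.5):
    """Percentile interval taken from the sorted bootstrap estimates."""
    ordered = sorted(values)

    def pick(p):
        idx = round(p / 100 * (len(ordered) - 1))
        return ordered[min(len(ordered) - 1, max(0, idx))]

    return pick(lower), pick(upper)


def simulate(seed=1):
    """Synthetic data only: 9 half-log concentrations, 3 noisy replicates each."""
    rng = random.Random(seed)
    concs = [10 ** (k / 2) for k in range(-4, 5)]  # 0.01 to 100
    return {c: [hill(c, 80.0, 0.0, 1.0) + rng.gauss(0, 5) for _ in range(3)]
            for c in concs}


if __name__ == "__main__":
    data = simulate()
    fits = bootstrap_fits(data)
    low, high = percentile_interval([f["log_ac50"] for f in fits])
    print(f"95% bootstrap interval for log10(AC50): [{low:.1f}, {high:.1f}]")
```

Under these assumptions, the width of the interval gives a rough per-curve measure of variability. A curve whose interval is wide would be a natural candidate for the manual inspection or retesting mentioned in the abstract, although the width that counts as "wide" is a judgment call not specified here.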