Lim Changwon, Sen Pranab K, Peddada Shyamal D
Department of Mathematics and Statistics, Loyola University Chicago, 1032 W Sheridan Rd, Chicago, IL 60660.
Technometrics. 2013 May 1;55(2):150-160. doi: 10.1080/00401706.2012.749166.
Quantitative high throughput screening (qHTS) assays use cells or tissues to screen thousands of compounds in a short period of time. Data generated from qHTS assays are then evaluated using nonlinear regression models, such as the Hill model, and decisions regarding toxicity are made using the estimates of the model parameters. For any given compound, the variability in the observed response may either be constant across dose groups (homoscedasticity) or vary with dose (heteroscedasticity). Because thousands of compounds are evaluated simultaneously in a qHTS assay, it is not practically feasible for an investigator to perform residual analysis to determine the variance structure before performing statistical inference on each compound. The variance structure is well known to play an important role in the analysis of linear and nonlinear regression models, so it is important to have practically useful, easy-to-interpret methodology that is robust to the variance structure. Furthermore, given the number of chemicals investigated in a qHTS assay, outliers and influential observations are not uncommon. In this article we describe preliminary test estimation (PTE) based methodology that is robust to the variance structure as well as to any potential outliers and influential observations. Performance of the proposed methodology is evaluated in terms of false discovery rate (FDR) and power using a simulation study mimicking real qHTS data. Of the two methods currently in use, our simulation studies suggest that one is extremely conservative, with very small power in comparison to the proposed PTE based method, whereas the other is very liberal. In contrast, the proposed PTE based methodology achieves better control of FDR while maintaining good power. The proposed methodology is illustrated using a data set obtained from the National Toxicology Program (NTP).
Additional information, simulation results, data and computer code are available online as supplementary materials.
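For readers unfamiliar with the Hill model named in the abstract, the following is a minimal sketch of a four-parameter Hill dose-response curve. The parameterization and the names `hill`, `base`, `vmax`, `slope`, and `ec50` are illustrative assumptions, not taken from the article or its supplementary code, and the sketch does not implement the PTE methodology itself.

```python
def hill(dose, base, vmax, slope, ec50):
    """Four-parameter Hill curve (assumed parameterization, for illustration only):

        f(dose) = base + vmax * dose**slope / (ec50**slope + dose**slope)

    base  : response at zero dose
    vmax  : maximum change in response above (or below) base
    slope : Hill slope, assumed > 0
    ec50  : dose giving half of the maximum change, assumed > 0
    """
    if dose < 0:
        raise ValueError("dose must be non-negative")
    if slope <= 0 or ec50 <= 0:
        raise ValueError("slope and ec50 must be positive")
    if dose == 0:
        # With slope > 0 the dose term vanishes, leaving the baseline.
        return base
    # Equivalent form: dose^h / (ec50^h + dose^h) = 1 / (1 + (ec50 / dose)^h)
    return base + vmax / (1.0 + (ec50 / dose) ** slope)


# Hypothetical dose grid and parameter values, chosen only to show the call.
doses = [0.1, 1.0, 10.0, 100.0]
curve = [hill(d, base=0.0, vmax=100.0, slope=1.5, ec50=10.0) for d in doses]
```

In a screening setting, a curve of this kind would be fitted to each compound's dose-response data, and the fitted parameters would then feed the toxicity decision described in the abstract.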