Fomenko Igor, Durst Mark, Balaban David
Systems Informatics Department, Amgen, One Amgen Center Drive, MS 34-2-A, Thousand Oaks, CA 91320-1799, USA.
Comput Methods Programs Biomed. 2006 Apr;82(1):31-7. doi: 10.1016/j.cmpb.2006.01.008. Epub 2006 Mar 23.
Effective analysis of high throughput screening (HTS) data requires automation of dose-response curve fitting for large numbers of datasets. Datasets with outliers are not handled well by standard non-linear least squares methods, and manual outlier removal after visual inspection is tedious and potentially biased. We propose robust non-linear regression via M-estimation as a statistical technique for automated implementation. The approach of finding M-estimates by Iteratively Reweighted Least Squares (IRLS) and the resulting optimization problem are described. Initial parameter estimates for iterative methods are important, so self-starting methods for our model are presented. We outline the software implementation, done in Matlab and deployed as an Excel application via the Matlab Excel Builder Toolkit. Results of M-estimation are compared with least squares estimates before and after manual editing.
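The IRLS approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' Matlab implementation, and several choices here are assumptions that the abstract does not specify: a four-parameter logistic dose-response model in log10 dose, Huber weights with tuning constant 1.345, a residual scale taken from the median absolute deviation, and a crude data-driven starting point standing in for the paper's self-starting methods. The function names `four_pl`, `initial_guess`, and `irls_fit` are hypothetical, and the data are simulated.

```python
import numpy as np
from scipy.optimize import least_squares


def four_pl(x, p):
    """Four-parameter logistic curve in log10 dose x (assumed model)."""
    bottom, top, log_ec50, hill = p
    return bottom + (top - bottom) / (1.0 + 10.0 ** ((log_ec50 - x) * hill))


def initial_guess(x, y):
    """Crude start (an assumption, not the paper's method): plateaus from
    the response range, log EC50 at the centre of the doses, unit slope."""
    return np.array([y.min(), y.max(), np.median(x), 1.0])


def irls_fit(x, y, c=1.345, max_iter=50):
    """Huber M-estimate of the 4PL parameters via IRLS."""
    theta = initial_guess(x, y)
    w = np.ones_like(y)  # first pass is ordinary least squares
    for _ in range(max_iter):
        # Weighted least squares step: minimise sum(w * r**2).
        step = least_squares(lambda p: np.sqrt(w) * (y - four_pl(x, p)), theta)
        r = y - four_pl(x, step.x)
        # Robust scale from the median absolute deviation of the residuals.
        s = max(1.4826 * np.median(np.abs(r - np.median(r))), 1e-12)
        u = np.abs(r) / s
        # Huber weights: 1 inside the threshold, c/u beyond it.
        w = np.minimum(1.0, c / np.maximum(u, 1e-12))
        converged = np.allclose(step.x, theta, rtol=1e-6, atol=1e-8)
        theta = step.x
        if converged:
            break
    return theta, w


# Simulated example: a clean curve with mild noise and one gross outlier.
rng = np.random.default_rng(0)
x = np.linspace(-3.0, 2.0, 12)             # log10 dose
truth = np.array([0.0, 100.0, 0.0, 1.0])   # bottom, top, log EC50, Hill slope
y = four_pl(x, truth) + rng.normal(0.0, 2.0, x.size)
y[1] += 50.0                               # outlier on the lower plateau

ls_theta = least_squares(lambda p: y - four_pl(x, p), initial_guess(x, y)).x
theta, w = irls_fit(x, y)
print("least squares:", ls_theta)
print("Huber IRLS:   ", theta)
print("final weights:", np.round(w, 2))
```

The final weights show which observations the fit discounted, which gives a reviewer an automatic counterpart to the manual editing step that the abstract compares against. Other weight functions (for example Tukey's bisquare) would slot into the same loop by changing the single line that computes `w`.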