Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA.
Biochim Biophys Acta Gen Subj. 2018 Apr;1862(4):886-894. doi: 10.1016/j.bbagen.2017.12.016. Epub 2017 Dec 29.
Questions about the reliability of parametric standard errors (SEs) from nonlinear least squares (LS) algorithms have led to a general mistrust of these precision estimators that is often unwarranted.
The importance of non-Gaussian parameter distributions is illustrated by converting linear models to nonlinear ones, substituting e^A, ln A, and 1/A for a linear parameter a. Monte Carlo (MC) simulations characterize parameter distributions in more complex cases, including cases where the data have varying uncertainty and should be weighted but the weights are neglected. Neglecting weights leads to loss of precision and erroneous parametric SEs, as is illustrated for the Lineweaver-Burk analysis of enzyme kinetics data and the analysis of isothermal titration calorimetry data.
Non-Gaussian parameter distributions are generally asymmetric and biased. However, when the parametric SE is <10% of the magnitude of the parameter, both the bias and the asymmetry can usually be ignored. Sometimes nonlinear estimators can be redefined to give more normal distributions and better convergence properties.
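The effect of the relative SE on bias and asymmetry can be sketched with a small Monte Carlo experiment. The example below is an illustration under stated assumptions, not the paper's own simulation: it fits a hypothetical through-origin line y = a·x with Gaussian noise, then converts each estimate to A = 1/a. It relies on the fact that a one-to-one reparametrization does not move the LS minimum, so the nonlinear fit of y = x/A would return 1/a for each data set. The function name, the x values, and the noise level are all invented for the illustration.

```python
import numpy as np

def simulate_reciprocal(rel_se, a_true=1.0, n_trials=100_000, seed=0):
    """Monte Carlo sketch: distribution of A = 1/a from fits of y = a*x.

    rel_se is the assumed relative standard error of the linear parameter a.
    All numerical settings here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    x = np.arange(1.0, 6.0)                      # five hypothetical x values
    # Noise level chosen so that SE(a_hat) = sigma_y / sqrt(sum x^2) = rel_se * a_true
    sigma_y = rel_se * a_true * np.sqrt(x @ x)
    y = a_true * x + rng.normal(0.0, sigma_y, size=(n_trials, x.size))
    a_hat = (y @ x) / (x @ x)                    # linear LS estimate, Gaussian
    A_hat = 1.0 / a_hat                          # equivalent nonlinear parameter
    p16, p50, p84 = np.percentile(A_hat, [15.87, 50.0, 84.13])
    A_true = 1.0 / a_true
    return {
        "rel_bias": (A_hat.mean() - A_true) / A_true,   # bias of the mean
        "asymmetry": (p84 - p50) / (p50 - p16),         # 1.0 for a symmetric distribution
        "sd_a_hat": a_hat.std(ddof=1),                  # should be close to rel_se * a_true
    }

if __name__ == "__main__":
    for rel_se in (0.05, 0.20):
        print(rel_se, simulate_reciprocal(rel_se))
```

To first order, the relative bias of 1/a is expected to be about the square of the relative SE, so the 5% case should show a bias well under 1% and an interval that is only mildly lopsided, while the 20% case should show a visibly skewed distribution.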
Variable data uncertainty, or heteroscedasticity, can sometimes be handled by data transforms but more generally requires weighted LS, which in turn requires knowledge of the data variance.
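The consequences of neglecting weights can be sketched with a straight-line Monte Carlo comparison. This is a minimal illustration under assumed conditions, not a reproduction of the Lineweaver-Burk or calorimetry analyses: the noise standard deviation is taken to be proportional to x, and the true intercept and slope, the x grid, and the helper names `fit_line` and `monte_carlo` are hypothetical. The weighted fit uses weights 1/σ² with σ assumed known; the unweighted fit scales its covariance matrix by the residual variance, as is common practice when σ is unknown.

```python
import numpy as np

def fit_line(x, y, sigma=None):
    """Straight-line LS fit; returns (parameters, parametric SEs).

    With sigma given, weights are 1/sigma**2 and the SEs are a priori.
    Without it, the fit is unweighted and the SEs are a posteriori,
    scaled by the residual variance.
    """
    X = np.column_stack([np.ones_like(x), x])
    w = np.ones_like(x) if sigma is None else 1.0 / sigma**2
    cov = np.linalg.inv(X.T @ (w[:, None] * X))
    beta = cov @ (X.T @ (w * y))
    if sigma is None:
        resid = y - X @ beta
        cov = cov * (resid @ resid) / (len(x) - 2)
    return beta, np.sqrt(np.diag(cov))

def monte_carlo(n_trials=4000, seed=1):
    rng = np.random.default_rng(seed)
    x = np.arange(1.0, 11.0)
    sigma = 0.1 * x                # assumed: noise SD proportional to x
    y_true = 1.0 + 2.0 * x         # assumed intercept 1, slope 2
    runs = {"weighted": ([], []), "unweighted": ([], [])}
    for _ in range(n_trials):
        y = y_true + rng.normal(0.0, sigma)
        for key, s in (("weighted", sigma), ("unweighted", None)):
            beta, se = fit_line(x, y, s)
            runs[key][0].append(beta)
            runs[key][1].append(se)
    # Compare the actual scatter of the estimates with the average parametric SE
    return {
        key: {"mc_sd": np.std(betas, axis=0, ddof=1),
              "mean_se": np.mean(ses, axis=0)}
        for key, (betas, ses) in runs.items()
    }

if __name__ == "__main__":
    for key, stats in monte_carlo().items():
        print(key, stats)          # arrays are ordered (intercept, slope)
```

In this setup the weighted fit should show Monte Carlo scatter that matches its parametric SEs, while the unweighted fit should show both larger scatter (lost precision) and parametric SEs that disagree with that scatter, most clearly for the intercept.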
Parametric SEs are rigorously correct in linear LS under the usual assumptions, and are a trustworthy approximation in nonlinear LS provided they are sufficiently small, a condition favored by the abundant, precise data routinely collected in many modern instrumental methods.
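For reference, the parametric SEs discussed here are conventionally obtained from the LS covariance matrix. The notation below is an assumption introduced for illustration and is not taken from the abstract: X is the design matrix, W the diagonal weight matrix, n the number of data points, and p the number of parameters.

```latex
\begin{align}
  \hat{\boldsymbol{\beta}} &= (X^{\mathsf T} W X)^{-1} X^{\mathsf T} W \mathbf{y},
  \qquad W = \operatorname{diag}\!\left(1/\sigma_i^{2}\right), \\
  V &= (X^{\mathsf T} W X)^{-1},
  \qquad \mathrm{SE}(\hat{\beta}_j) = \sqrt{V_{jj}} .
\end{align}
% Unweighted fit with unknown, constant data variance:
\begin{equation}
  V = s^{2}\,(X^{\mathsf T} X)^{-1},
  \qquad s^{2} = \frac{\sum_i \bigl(y_i - \hat{y}_i\bigr)^{2}}{n - p} .
\end{equation}
```

In nonlinear LS the same expressions are commonly applied with X replaced by the Jacobian of the model with respect to the parameters, evaluated at the converged estimates, which is why the resulting SEs are approximate rather than exact.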