Department of Chemistry, University of Alabama at Birmingham, Birmingham, AL, USA.
Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Birmingham, AL, USA.
Biophys Chem. 2021 Dec;279:106682. doi: 10.1016/j.bpc.2021.106682. Epub 2021 Sep 29.
Parameter optimization, or "data fitting", is a computational process that identifies the set of parameter values that best describes an experimental data set. Parameter optimization is commonly carried out with a computer program that uses a non-linear least squares (NLLS) algorithm. These algorithms work by iteratively refining a user-supplied initial guess, resulting in a systematic increase in the goodness of fit. A well-understood problem with this class of algorithms is that, for models with correlated parameters, the optimized output parameters depend on the initial guess. This dependency can introduce user bias into the resultant analysis. While many optimization programs exist, few address this dilemma. Here we present a data analysis tool, MENOTR, that is capable of overcoming the initial guess dependence in parameter optimization. Several case studies with published experimental data are presented to demonstrate the capabilities of this tool. The results presented here demonstrate how to effectively overcome the initial guess dependence of NLLS, leading to greater confidence that the resultant optimized parameters are the best possible set of parameters to describe an experimental data set. While the optimization strategies implemented within MENOTR are not entirely novel, the application of these strategies to optimize parameters in kinetic and thermodynamic biochemical models is uncommon. MENOTR was designed to require minimal modification to accommodate a new model, making it immediately accessible to researchers with a limited programming background. We anticipate that this toolbox can be used in a wide variety of data analysis applications. Prototype versions of this toolbox have already been used in a number of published investigations, as well as in ongoing work with chemical-quenched flow, stopped-flow, and molecular tweezers data sets.
STATEMENT OF SIGNIFICANCE: Non-linear least squares (NLLS) is a common form of parameter optimization in biochemical kinetic and thermodynamic investigations. These algorithms are used to fit experimental data sets and report corresponding parameter values. The algorithms are fast and able to provide good quality solutions for models involving few parameters. However, initial guess dependence is a well-known drawback of this optimization strategy that can introduce user bias. An alternative method of parameter optimization is the genetic algorithm (GA). Genetic algorithms do not have an initial guess dependence but are slow at arriving at the best set of fit parameters. Here, we present MENOTR, a parameter optimization toolbox utilizing a hybrid GA/NLLS algorithm. The toolbox maximizes the strengths of each strategy while minimizing the inherent drawbacks.
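The hybrid idea described above can be illustrated with a minimal sketch: a genetic algorithm explores the parameter space without any user-supplied starting point, and its best candidate then seeds a fast local least-squares refinement. This is not MENOTR's code or interface. The single-exponential model, the parameter bounds, the Levenberg-Marquardt refinement step, and every function name (`genetic_search`, `lm_refine`, `hybrid_fit`) are assumptions chosen only for illustration, and the data are synthetic and noise-free.

```python
import math
import random

def model(params, t):
    """Hypothetical single-exponential decay: y = A * exp(-k * t)."""
    A, k = params
    return A * math.exp(-k * t)

def ssr(params, ts, ys):
    """Sum of squared residuals between the model and the data."""
    return sum((model(params, t) - y) ** 2 for t, y in zip(ts, ys))

def genetic_search(ts, ys, bounds, pop_size=40, generations=30, seed=0):
    """Global stage: evolve random candidates; no user initial guess needed."""
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda p: ssr(p, ts, ys))
        elite = pop[: pop_size // 4]          # keep the best quarter unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            # Gaussian mutation, clipped back into the allowed bounds
            child = [min(hi, max(lo, g + rng.gauss(0.0, 0.05 * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda p: ssr(p, ts, ys))

def lm_refine(params, ts, ys, iterations=100):
    """Local stage: Levenberg-Marquardt refinement of (A, k) from a seed."""
    A, k = params
    lam = 1e-3
    best = ssr((A, k), ts, ys)
    for _ in range(iterations):
        a11 = a12 = a22 = g1 = g2 = 0.0
        for t, y in zip(ts, ys):
            e = math.exp(-k * t)
            r = A * e - y                  # residual
            j1, j2 = e, -A * t * e         # d(model)/dA, d(model)/dk
            a11 += j1 * j1
            a12 += j1 * j2
            a22 += j2 * j2
            g1 += j1 * r
            g2 += j2 * r
        d11, d22 = a11 * (1 + lam), a22 * (1 + lam)   # damped diagonal
        det = d11 * d22 - a12 * a12
        if det == 0:
            break
        # Solve the damped 2x2 normal equations by Cramer's rule
        dA = (-g1 * d22 + a12 * g2) / det
        dk = (-d11 * g2 + a12 * g1) / det
        trial = ssr((A + dA, k + dk), ts, ys)
        if trial < best:                   # accept the step, trust the model more
            A, k, best, lam = A + dA, k + dk, trial, lam / 10
        else:                              # reject the step, increase damping
            lam *= 10
    return A, k

def hybrid_fit(ts, ys, bounds):
    """Genetic search supplies the seed; least squares polishes it."""
    return lm_refine(genetic_search(ts, ys, bounds), ts, ys)

# Synthetic, noise-free data generated with A = 2.0 and k = 0.7
ts = [0.25 * i for i in range(21)]
ys = [model((2.0, 0.7), t) for t in ts]
fit = hybrid_fit(ts, ys, bounds=[(0.0, 10.0), (0.0, 10.0)])
print("A = %.4f, k = %.4f" % fit)
```

In this sketch the user supplies only parameter bounds, not a starting point, so the local refinement always begins from whatever the global search found. A two-parameter model like this one would usually fit well with NLLS alone; the division of labor matters more for models with many correlated parameters, where the result of a purely local fit can depend on where it starts.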