Brendel Karl, Dartois Céline, Comets Emmanuelle, Lemenuel-Diot Annabelle, Laveille Christian, Tranchand Brigitte, Girard Pascal, Laffont Céline M, Mentré France
INSERM, U738, Paris, France.
Clin Pharmacokinet. 2007;46(3):221-34. doi: 10.2165/00003088-200746030-00003.
Model evaluation is an important issue in population analyses. We aimed to perform a systematic review of all population pharmacokinetic and/or pharmacodynamic analyses published between 2002 and 2004, to survey the methods currently used to evaluate models and to assess whether those models were adequately evaluated. We selected 324 articles in MEDLINE using defined key words and built a data abstraction form, consisting of a checklist of items, to extract the information relevant to model evaluation from these articles. In the data abstraction form, evaluation methods were divided into three subsections: basic internal methods (goodness-of-fit [GOF] plots, uncertainty in parameter estimates and model sensitivity), advanced internal methods (data splitting, resampling techniques and Monte Carlo simulations) and external model evaluation. Basic internal evaluation was the most frequently described method in the reports: 65% of the models involved GOF evaluation. Standard errors or confidence intervals were reported for 50% of fixed effects but for only 22% of random effects. Advanced internal methods were used in approximately 25% of models: data splitting was used more often than bootstrap and cross-validation, and simulations were used in 6% of models, in the form of a visual predictive check or a posterior predictive check. External evaluation was performed in only 7% of models. Based on a subjective synthesis of the model evaluation in each article, we judged 28% of pharmacokinetic models and 26% of pharmacodynamic models to be adequately evaluated. Basic internal evaluation was preferred to more advanced methods, probably because it is easily performed with most software. We also noticed that when the aim of modelling was predictive, advanced internal methods or more stringent methods were used more often.