Day S, Fayers P, Harvey D
Leo Pharmaceuticals, Princes Risborough, Buckinghamshire, United Kingdom.
Control Clin Trials. 1998 Feb;19(1):15-24. doi: 10.1016/s0197-2456(97)00096-2.
We challenge the notion that double data entry is either sufficient or necessary to ensure good-quality data in clinical trials. Although we do not completely reject that notion, we quantify some of the effects that poor-quality data have on final study results in terms of estimation, significance testing, and power. By introducing digit errors into simulated blood pressure measurements, we demonstrate that simple range checks detect (and therefore allow us to correct) the main errors that affect the final study results and conclusions. The errors that such range checks cannot easily detect, although possibly numerous, are shown to be of little importance in drawing the correct conclusions from the statistical analysis of the data. Exploratory data analysis cannot identify all errors that a second data entry would detect; on the other hand, not all errors found by exploratory data analysis are detectable by double data entry. Double data entry is concerned solely with ensuring, to a high degree of certainty, that what is recorded on the case record form is transcribed into the database. Exploratory data analysis looks beyond the case record form to challenge the plausibility of the written data. In this sense, a second entry of the data has some benefit, but the use of exploratory data analysis methods, either while data entry is ongoing or at the end of data entry as the first stage of an analysis strategy, should be mandatory.
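The kind of simulation described above can be illustrated with a minimal sketch. It is not the authors' procedure: the distribution of the readings (mean 85 mmHg, SD 10), the 5% error rate, the plausibility limits of 40 to 140 mmHg, and the function names corrupt_digit and simulate are all assumptions chosen for illustration. The sketch mistypes one digit in a random subset of readings, flags values outside the limits, and treats flagged values as corrected against the source document.

```python
import random
import statistics


def corrupt_digit(value, rng):
    """Replace one randomly chosen digit of a non-negative integer with a different digit."""
    digits = list(str(value))
    pos = rng.randrange(len(digits))
    digits[pos] = rng.choice([d for d in "0123456789" if d != digits[pos]])
    return int("".join(digits))


def simulate(n=1000, error_rate=0.05, low=40, high=140, seed=1):
    """Simulate readings, inject digit errors, then apply a simple range check."""
    rng = random.Random(seed)
    # Hypothetical "true" diastolic pressures (mmHg), clipped to the plausible range.
    true = [min(high, max(low, round(rng.gauss(85, 10)))) for _ in range(n)]
    # Data entry: each reading is mistyped with probability error_rate (assumed value).
    entered = [corrupt_digit(v, rng) if rng.random() < error_rate else v for v in true]
    # Range check: flag every entered value outside [low, high].
    flagged = {i for i, v in enumerate(entered) if not low <= v <= high}
    # Flagged values are assumed to be corrected by going back to the source document.
    checked = [true[i] if i in flagged else v for i, v in enumerate(entered)]
    return {
        "true": true,
        "entered": entered,
        "checked": checked,
        "n_errors": sum(t != e for t, e in zip(true, entered)),
        "n_flagged": len(flagged),
    }


result = simulate()
for label in ("true", "entered", "checked"):
    print(label, "mean:", round(statistics.mean(result[label]), 2))
print("errors:", result["n_errors"], "flagged:", result["n_flagged"])
```

Comparing the three means shows how much of the distortion caused by entry errors a range check removes. Errors that stay within the limits (for example 85 entered as 95) remain in the checked data, but each one can shift a reading by no more than the width of the plausible range.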