Department of Biostatistics, Harvard University, Boston, USA.
Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, USA.
BMC Med Res Methodol. 2023 Aug 1;23(1):177. doi: 10.1186/s12874-023-01988-4.
Epidemiologic and medical studies often rely on evaluators to obtain measurements of exposures or outcomes for study participants, and valid estimates of associations depend on the quality of those data. Although statistical methods have been proposed to adjust for measurement error, they often rely on unverifiable assumptions and can yield biased estimates if those assumptions are violated. Methods for detecting potential 'outlier' evaluators are therefore needed to improve data quality during the data collection stage.
In this paper, we propose a two-stage algorithm to detect 'outlier' evaluators whose evaluation results tend to be higher or lower than those of their counterparts. In the first stage, evaluators' effects are estimated by fitting a regression model. In the second stage, hypothesis tests are performed to detect 'outlier' evaluators, taking into account both the power of each individual test and the false discovery rate (FDR) across all tests. We conduct an extensive simulation study to evaluate the proposed method, and illustrate it by detecting potential 'outlier' audiologists in the data collection stage of the Audiology Assessment Arm of the Conservation of Hearing Study, an epidemiologic study examining risk factors for hearing loss in the Nurses' Health Study II.
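The two stages can be illustrated with a minimal sketch. The abstract does not specify the regression model or the test statistics, so everything below is an assumption made for illustration: an ordinary least squares fit with one covariate and one indicator per evaluator, a Wald-type z-test comparing each evaluator's effect with the average effect, and the Benjamini-Hochberg procedure for FDR control. The function names (`fit_evaluator_effects`, `outlier_pvalues`, `benjamini_hochberg`) and the simulated data are hypothetical and are not taken from the paper.

```python
import math
import numpy as np


def fit_evaluator_effects(y, x, evaluator, n_eval):
    """Stage 1 (assumed model): OLS with one covariate and one indicator per evaluator."""
    n = len(y)
    dummies = np.zeros((n, n_eval))
    dummies[np.arange(n), evaluator] = 1.0
    design = np.column_stack([x, dummies])  # no global intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / (n - design.shape[1])
    cov = sigma2 * np.linalg.inv(design.T @ design)
    # Drop the covariate row/column; keep the evaluator effects only.
    return coef[1:], cov[1:, 1:]


def outlier_pvalues(effects, cov):
    """Stage 2a (assumed test): z-test of each effect against the mean effect."""
    k_total = len(effects)
    pvals = np.empty(k_total)
    for k in range(k_total):
        contrast = np.full(k_total, -1.0 / k_total)
        contrast[k] += 1.0  # effect_k minus the average of all effects
        z = contrast @ effects / math.sqrt(contrast @ cov @ contrast)
        pvals[k] = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return pvals


def benjamini_hochberg(pvals, q=0.05):
    """Stage 2b: Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = pvals[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        cutoff = np.max(np.nonzero(passed)[0])
        reject[order[: cutoff + 1]] = True
    return reject


# Simulated example (hypothetical data): 20 evaluators, 100 participants each,
# with evaluator 3 measuring systematically higher than the rest.
rng = np.random.default_rng(0)
n_eval, n_per = 20, 100
evaluator = np.repeat(np.arange(n_eval), n_per)
true_effects = np.zeros(n_eval)
true_effects[3] = 1.5
x = rng.normal(size=n_eval * n_per)
y = 0.5 * x + true_effects[evaluator] + rng.normal(size=n_eval * n_per)

effects, cov = fit_evaluator_effects(y, x, evaluator, n_eval)
pvals = outlier_pvalues(effects, cov)
reject = benjamini_hochberg(pvals, q=0.05)
print("Flagged evaluators:", np.nonzero(reject)[0])
```

Comparing each evaluator with the overall mean is only one possible reference. A strong outlier pulls that mean toward itself, which can make 'normal' evaluators look mildly deviant, so a more robust reference may be preferable in practice. This sketch also does not address the per-test power considerations mentioned above.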
Our simulation study shows that the proposed method not only detects true 'outlier' evaluators but is also less likely to falsely flag truly 'normal' evaluators.
Our two-stage 'outlier' detection algorithm is a flexible approach that can effectively detect 'outlier' evaluators, and can thereby improve data quality during the data collection stage.