Yong Loo Lin School of Medicine, National University of Singapore, 21 Lower Kent Ridge Rd, Singapore, 119077, Singapore.
Biostatistics Unit, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
BMC Med Res Methodol. 2022 Apr 3;22(1):93. doi: 10.1186/s12874-022-01567-z.
Data from certain subgroups of clinical interest may not be presented in primary manuscripts or conference abstract presentations. In an effort to enable secondary data analyses, we propose a workflow to retrieve unreported subgroup survival data from published Kaplan-Meier (KM) plots.
We developed KMSubtraction, an R package that retrieves patients from unreported subgroups by matching participants from the KM plot of the overall cohort to participants from the KM plot of a known subgroup on follow-up time. By excluding the matched patients, the opposing unreported subgroup may be retrieved. Reproducibility and limits of error of the KMSubtraction workflow were assessed by comparing the unmatched patients against the original survival data of subgroups from published datasets and from Monte Carlo simulations.
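The subtraction idea can be illustrated with a minimal sketch. It assumes that participant-level records, each a (follow-up time, event indicator) pair, have already been reconstructed from the overall and subgroup KM plots, and it uses a simple greedy nearest-neighbour match within event status. The function name `subtract_subgroup` and the greedy rule are hypothetical choices made for illustration; they are not the KMSubtraction API and not necessarily the matching algorithm the package uses.

```python
def subtract_subgroup(overall, subgroup):
    """Return overall-cohort records left unmatched after removing the subgroup.

    overall, subgroup: lists of (follow_up_time, event) tuples, event is 0 or 1.
    Illustrative greedy matching only; not the KMSubtraction implementation.
    """
    remaining = list(overall)
    # Process the known subgroup in order of follow-up time.
    for t_sub, e_sub in sorted(subgroup):
        # Only overall-cohort records with the same event status are candidates.
        candidates = [i for i, (_, e) in enumerate(remaining) if e == e_sub]
        if not candidates:
            raise ValueError("no overall-cohort record left with matching event status")
        # Match to the closest follow-up time, then exclude that record.
        best = min(candidates, key=lambda i: abs(remaining[i][0] - t_sub))
        remaining.pop(best)
    # Whatever is left approximates the opposing, unreported subgroup.
    return remaining


overall = [(1.0, 1), (2.0, 0), (3.1, 1), (4.0, 1), (5.0, 0)]
known_subgroup = [(1.1, 1), (5.0, 0)]
unreported = subtract_subgroup(overall, known_subgroup)
print(unreported)
```

A greedy match is easy to follow but can be suboptimal when many follow-up times are close together; an optimal assignment method would be a natural alternative under the same assumptions.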
The validation exercise found no material systematic error and demonstrated the robustness of KMSubtraction in deriving unreported subgroup survival data. Limits of error were small and negligible on marginal Cox proportional hazard models comparing reconstructed and original survival data of unreported subgroups. Extensive Monte Carlo simulations demonstrated that a high reported subgroup proportion (r = 0.467, p < 0.001), a small dataset size (r = -0.374, p < 0.001) and a high proportion of missing data in the unreported subgroup (r = 0.553, p < 0.001) were associated with uncertainty; datasets with these characteristics are likely to yield high limits of error with KMSubtraction.
KMSubtraction demonstrates robustness in deriving survival data from unreported subgroups. The limits of error of KMSubtraction derived from converged Monte Carlo simulations may guide the interpretation of reconstructed survival data of unreported subgroups.