Koola Jejo D, Ramesh Karthik, Mao Jialin, Ahn Minyoung, Davis Sharon E, Govindarajulu Usha, Perkins Amy M, Westerman Dax, Ssemaganda Henry, Speroff Theodore, Ohno-Machado Lucila, Ramsay Craig R, Sedrakyan Art, Resnic Frederic S, Matheny Michael E
Department of Medicine, University of California San Diego, San Diego, CA 92093, United States.
School of Medicine, University of California San Diego, San Diego, CA 92093, United States.
J Am Med Inform Assoc. 2025 Jan 1;32(1):206-217. doi: 10.1093/jamia/ocae273.
Traditional methods for medical device post-market surveillance often fail to accurately account for operator learning effects, leading to biased assessments of device safety. These methods struggle with non-linearity, complex learning curves, and time-varying covariates, such as physician experience. To address these limitations, we sought to develop a machine learning (ML) framework to detect and adjust for operator learning effects.
A gradient-boosted decision tree ML method was used to analyze synthetic datasets that replicate the complexity of clinical scenarios involving high-risk medical devices. We designed this process to detect learning effects using a risk-adjusted cumulative sum method, quantify the excess adverse event rate attributable to operator inexperience, and adjust for these effects alongside patient factors when evaluating device safety signals. To maintain integrity, the data generation and analysis teams were blinded to each other. The synthetic data used underlying distributions and patient feature correlations based on clinical data from the Department of Veterans Affairs between 2005 and 2012. We generated 2494 synthetic datasets with widely varying characteristics, including the number of patient features, the number of operators and institutions, and the form of operator learning. Each dataset contained a hypothetical study device, Device B, and a reference device, Device A. We evaluated accuracy in identifying learning effects and in identifying and estimating the strength of the device safety signal. We also evaluated different clinically relevant thresholds for safety signal detection.
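The risk-adjusted cumulative sum step mentioned above can be illustrated with a minimal sketch. It uses the widely published log-likelihood-ratio form of the risk-adjusted CUSUM, which may differ from the authors' implementation. The function name, the alternative odds ratio of 2.0, and the signal threshold of 4.5 are illustrative assumptions, not values taken from the study.

```python
import math


def risk_adjusted_cusum(outcomes, predicted_risks, odds_ratio_alt=2.0, threshold=4.5):
    """Upper risk-adjusted CUSUM over a sequence of cases.

    outcomes        : 0/1 adverse event indicators, in procedure order
    predicted_risks : model-predicted event probability for each case
    odds_ratio_alt  : odds ratio under the alternative hypothesis (assumed value)
    threshold       : signalling limit h (assumed value)

    Returns (path, first_signal), where first_signal is the 0-based index
    of the first case at which the chart reaches the threshold, or None.
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("outcomes and predicted_risks must have equal length")

    path = []
    score = 0.0
    first_signal = None
    for i, (y, p) in enumerate(zip(outcomes, predicted_risks)):
        # Log-likelihood ratio of the alternative (odds multiplied by
        # odds_ratio_alt) against the null (odds as predicted).
        denom = 1.0 - p + odds_ratio_alt * p
        weight = math.log(odds_ratio_alt / denom) if y else math.log(1.0 / denom)
        # The chart resets at zero so that long runs of good outcomes
        # cannot mask a later deterioration.
        score = max(0.0, score + weight)
        path.append(score)
        if first_signal is None and score >= threshold:
            first_signal = i
    return path, first_signal
```

In a learning-effect setting, one such chart could be run per operator over that operator's first cases with the device. An early signal that fades with accumulated experience would be consistent with a learning curve, although how the study combined charts across operators is not described in the abstract.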
Our framework accurately identified the presence or absence of learning effects in 93.6% of datasets and correctly determined device safety signals in 93.4% of cases. The 95% confidence intervals of the estimated device odds ratios were accurately aligned with the specified odds ratios in 94.7% of datasets. In contrast, a comparison model that excluded operator learning effects performed significantly worse in detecting device signals and in accuracy. Notably, our framework achieved 100% specificity for clinically relevant safety signal thresholds, although sensitivity varied with the threshold applied.
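For readers less familiar with the two metrics reported above, the following sketch shows how sensitivity and specificity are computed from per-dataset results. The function name and the boolean-list inputs are assumptions made for illustration; the study's actual evaluation code is not described in the abstract.

```python
def sensitivity_specificity(truth, flagged):
    """Compute (sensitivity, specificity) from paired boolean lists.

    truth   : True where a dataset was generated with a real safety signal
    flagged : True where the analysis reported a safety signal
    """
    if len(truth) != len(flagged):
        raise ValueError("truth and flagged must have equal length")

    pairs = list(zip(truth, flagged))
    tp = sum(1 for t, f in pairs if t and f)          # signal present, detected
    fn = sum(1 for t, f in pairs if t and not f)      # signal present, missed
    tn = sum(1 for t, f in pairs if not t and not f)  # no signal, none reported
    fp = sum(1 for t, f in pairs if not t and f)      # no signal, false alarm

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

A specificity of 100% corresponds to zero false alarms among datasets generated without a signal. Sensitivity can still change with the detection threshold, because a stricter threshold changes which datasets count as carrying a clinically relevant signal.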
A machine learning framework, tailored for the complexities of post-market device evaluation, may provide superior performance compared to standard parametric techniques when operator learning is present.
Demonstrating the capacity of ML to overcome complex evaluative challenges, our framework addresses the limitations of traditional statistical methods in current post-market surveillance processes. By offering a reliable means to detect and adjust for learning effects, it may significantly improve medical device safety evaluation.