Tuan Wen-Jan, Yan Yifang, Abou Al Ardat Bilal, Felix Todd, Chen Qiushi
Department of Family and Community Medicine, Penn State College of Medicine, Hershey, Pennsylvania
Harold and Inge Marcus Department of Industrial and Manufacturing Engineering, The Pennsylvania State University, University Park, Pennsylvania.
Ann Fam Med. 2025 Jul 28;23(4):294-301. doi: 10.1370/afm.240316.
Factors influencing missed appointments are complex, which makes missed appointments difficult to anticipate and address. To improve appointment adherence, we aimed to use personalized machine learning and big data analytics to predict the risk of, and contributing factors for, no-shows and late cancellations in primary care practices.
We conducted a retrospective longitudinal study using geolinked clinical, care utilization, socioeconomic, and climate data from 15 family medicine clinics at a regional academic health center in Pennsylvania from January 2019 to June 2023. We developed multiclass machine learning models using gradient boosting, random forest, neural network, and logistic regression to predict appointment outcomes, followed by feature importance analysis to identify contributing factors for no-shows and late cancellations at the population and patient levels. We performed stratified analyses by sex and race/ethnicity to assess the fairness of the final model's prediction performance across these sensitive features.
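The stratified evaluation described above can be illustrated with a minimal sketch. It assumes the model outputs one predicted probability per appointment outcome and that discrimination is measured one class against the rest using the area under the receiver operating characteristic curve (AUROC). The function names, outcome labels, group labels, and toy numbers are hypothetical and are not taken from the study's data or code.

```python
from collections import defaultdict

def auroc(scores, labels):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation).
    Returns None when a class is absent and the metric is undefined.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def one_vs_rest_auroc(probs, outcomes, target):
    """AUROC for one outcome (e.g. no-show) against all other outcomes."""
    scores = [p[target] for p in probs]
    labels = [1 if o == target else 0 for o in outcomes]
    return auroc(scores, labels)

def stratified_auroc(probs, outcomes, groups, target):
    """Compute the one-vs-rest AUROC separately within each subgroup."""
    by_group = defaultdict(lambda: ([], []))
    for p, o, g in zip(probs, outcomes, groups):
        by_group[g][0].append(p)
        by_group[g][1].append(o)
    return {g: one_vs_rest_auroc(ps, os_, target)
            for g, (ps, os_) in by_group.items()}

# Hypothetical toy predictions: one probability per outcome class.
probs = [
    {"completed": 0.8, "no_show": 0.1, "late_cancel": 0.1},
    {"completed": 0.3, "no_show": 0.6, "late_cancel": 0.1},
    {"completed": 0.5, "no_show": 0.2, "late_cancel": 0.3},
    {"completed": 0.6, "no_show": 0.3, "late_cancel": 0.1},
    {"completed": 0.4, "no_show": 0.2, "late_cancel": 0.4},
    {"completed": 0.2, "no_show": 0.1, "late_cancel": 0.7},
]
outcomes = ["completed", "no_show", "late_cancel",
            "completed", "no_show", "late_cancel"]
groups = ["F", "F", "F", "M", "M", "M"]  # hypothetical sensitive feature

overall_no_show = one_vs_rest_auroc(probs, outcomes, "no_show")
per_group_no_show = stratified_auroc(probs, outcomes, groups, "no_show")
print(overall_no_show, per_group_no_show)
```

A large gap between subgroup AUROCs would flag a potential fairness concern. With real data, each subgroup needs enough positive and negative cases for the estimate to be stable.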
The analysis included 109,328 patients and 1,118,236 appointments, of which 77,322 (6.9%) were no-shows and 75,545 (6.8%) were late cancellations. The gradient boosting model achieved the best performance, with an area under the receiver operating characteristic curve of 0.852 for predicting no-shows and 0.921 for predicting late cancellations. No bias with respect to sex or race/ethnicity was detected. Schedule lead time was the most important predictor of missed appointments.
Missed appointments remain a challenge for primary care. This study provides a practical and robust framework for predicting missed appointments, laying the foundation for personalized strategies to improve patients' adherence to primary care appointments.