Harrington Lia X, Wei Jason W, Suriawinata Arief A, Mackenzie Todd A, Hassanpour Saeed
Dartmouth College, Hanover, NH, USA.
Dartmouth-Hitchcock Medical Center, Lebanon, NH, USA.
AMIA Jt Summits Transl Sci Proc. 2020 May 30;2020:211-220. eCollection 2020.
Identifying patient characteristics that influence the rate of colorectal polyp recurrence can show which patients are at higher risk for recurrence. We used natural language processing to extract polyp morphological characteristics from the electronic medical records of 953 patients presenting with polyps, 731 of whom experienced recurrence. Using subsequent colonoscopy reports, we examined how the time to polyp recurrence is influenced by these characteristics and by anthropometric features, applying Kaplan-Meier curves, Cox proportional hazards modeling, and random survival forest models. The rate of recurrence differed significantly by polyp size, number, and location, and by patient smoking status. Right-sided colon polyps increased recurrence risk by 30% compared with left-sided polyps, and a history of tobacco use increased polyp recurrence risk by 20% compared with never-users. A random survival forest model achieved an AUC of 0.65 and identified several other predictive variables. These findings can inform the development of personalized polyp surveillance plans.
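As an illustration of the Kaplan-Meier analysis named in the abstract, the following is a minimal sketch of the product-limit estimator in plain Python. It is not the authors' code: the function name `kaplan_meier`, the variable names, and the toy follow-up data are hypothetical and chosen only to show how censored follow-up (patients with no observed recurrence) enters the estimate.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Product-limit estimate of the recurrence-free probability.

    durations: follow-up time for each patient (any consistent unit).
    events:    1 if recurrence was observed at that time, 0 if censored.
    Returns a list of (time, survival_probability) at each event time.
    """
    if len(durations) != len(events):
        raise ValueError("durations and events must have the same length")

    # Number of recurrences at each time, and number of patients leaving
    # the risk set at each time (by recurrence or censoring).
    event_counts = Counter(t for t, e in zip(durations, events) if e)
    exit_counts = Counter(durations)

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d > 0:
            # Multiply by the conditional probability of no recurrence at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone whose follow-up ended at t leaves the risk set.
        at_risk -= exit_counts[t]
    return curve


# Hypothetical toy data (NOT from the study): months to recurrence or
# last colonoscopy, and whether recurrence was observed.
months = [6, 12, 12, 18, 24, 30, 36, 36]
recurred = [1, 1, 0, 1, 0, 1, 1, 0]

curve = kaplan_meier(months, recurred)
for t, s in curve:
    print(f"month {t:>2}: recurrence-free probability {s:.3f}")
```

Comparing such curves between groups (for example, right-sided versus left-sided polyps) is typically done with a log-rank test, which this sketch does not include.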