Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139, United States.
Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, United States.
J Am Med Inform Assoc. 2024 Jun 20;31(7):1578-1582. doi: 10.1093/jamia/ocae092.
To use electronic health record (EHR) audit logs to develop a machine learning (ML) model that predicts which notes a clinician wants to review when seeing oncology patients.
We trained logistic regression models using note metadata and a term frequency-inverse document frequency (TF-IDF) text representation. We evaluated performance with precision, recall, F1, AUC, and a qualitative assessment by clinicians.
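The training setup described above can be sketched roughly as follows. This is a minimal, standard-library illustration and not the authors' implementation: the toy notes, the two metadata features, the smoothed TF-IDF variant, and the gradient-descent settings are all assumptions made for the example.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return (vocab, vectors): one smoothed TF-IDF vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    doc_freq = Counter(tok for toks in tokenized for tok in set(toks))
    # Smoothed IDF: an assumption, the source does not state its TF-IDF variant.
    idf = {tok: math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1.0 for tok in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1
        vectors.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, vectors

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(x, w, b):
    """Predicted probability that the clinician views the note."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Hypothetical toy data: (note text, [is_oncology_note, note_age_in_months], viewed)
notes = [
    ("oncology progress note chemotherapy cycle", [1.0, 0.1], 1),
    ("radiology report chest ct tumor", [1.0, 0.2], 1),
    ("oncology consult staging plan", [1.0, 0.3], 1),
    ("dental cleaning visit", [0.0, 2.0], 0),
    ("ophthalmology routine eye exam", [0.0, 3.0], 0),
    ("physical therapy knee exercise", [0.0, 1.5], 0),
]
vocab, text_vecs = tfidf_vectors([text for text, _, _ in notes])
y = [label for _, _, label in notes]
X_meta = [meta for _, meta, _ in notes]                            # metadata only
X_both = [meta + tv for (_, meta, _), tv in zip(notes, text_vecs)]  # metadata + TF-IDF

w_meta, b_meta = train_logreg(X_meta, y)
w_both, b_both = train_logreg(X_both, y)
probs_meta = [predict_proba(x, w_meta, b_meta) for x in X_meta]
probs_both = [predict_proba(x, w_both, b_both) for x in X_both]
```

In a real setting the labels would come from audit-log events (whether the clinician opened the note), and the model would be scored on held-out encounters instead of the training notes used here.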
The metadata-only model achieved an AUC of 0.930, and the model combining metadata and TF-IDF achieved an AUC of 0.937. Qualitative assessment revealed a need for better text representation and for further customization of predictions to the user.
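For readers less familiar with the reported metrics, the sketch below shows one common way to compute them. It assumes that AUC refers to the area under the ROC curve, which the abstract does not state, and computes it as the fraction of positive-negative pairs ranked correctly. The labels, scores, and 0.5 threshold are hypothetical.

```python
def roc_auc(labels, scores):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def precision_recall_f1(labels, scores, threshold=0.5):
    """Precision, recall, and F1 after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(preds, labels) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(preds, labels) if p == 0 and t == 1)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels (1 = note was viewed) and model scores.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]

auc = roc_auc(labels, scores)  # 8 of the 9 positive-negative pairs are ordered correctly
precision, recall, f1 = precision_recall_f1(labels, scores)
```

Because AUC depends only on how notes are ordered, it suits a task where the goal is to surface the most relevant notes first, while precision, recall, and F1 depend on the chosen threshold.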
Our model effectively surfaces the top 10 notes a clinician wants to review when seeing an oncology patient. Further studies can characterize different types of clinician users and better tailor the task for different care settings.
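Surfacing the top 10 notes amounts to ranking a patient's notes by predicted relevance and keeping the highest-scoring ones. A minimal sketch, assuming each note already carries a model score; the note identifiers and scores are hypothetical:

```python
import heapq

def top_k_notes(scored_notes, k=10):
    """Return the k (note_id, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scored_notes, key=lambda item: item[1])

# Hypothetical scored notes for one patient: (note_id, predicted relevance).
scored_notes = [(f"note_{i}", i / 20) for i in range(12)]
top_notes = top_k_notes(scored_notes, k=10)
```

`heapq.nlargest` avoids sorting the full list, which matters little for a dozen notes but helps when a chart holds thousands.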
EHR audit logs can provide important relevance data for training ML models that assist with note-writing in the oncology setting.