Ando Kenichiro, Okumura Takashi, Komachi Mamoru, Horiguchi Hiromasa, Matsumoto Yuji
Graduate School of Systems Design, Tokyo Metropolitan University, Tokyo, Japan.
Center for Advanced Intelligence Project, RIKEN, Tokyo, Japan.
PLOS Digit Health. 2022 Dec 12;1(12):e0000158. doi: 10.1371/journal.pdig.0000158. eCollection 2022 Dec.
Medical professionals are burdened by clerical work, and artificial intelligence may efficiently support physicians by generating clinical summaries. However, whether hospital discharge summaries can be generated automatically from the inpatient records stored in electronic health records remains unclear. Therefore, this study investigated the sources of information in discharge summaries. First, the discharge summaries were automatically split into fine-grained segments, such as those representing medical expressions, using a machine learning model from a previous study. Second, the segments in the discharge summaries that did not originate from inpatient records were filtered out by calculating the n-gram overlap between the inpatient records and the discharge summaries; the final decision on each segment's origin was made manually. Finally, to reveal the specific sources (e.g., referral documents, prescriptions, and the physician's memory) from which the segments originated, they were classified manually in consultation with medical professionals. For deeper analysis, this study designed and annotated clinical role labels that represent the subjectivity of expressions, and built a machine learning model to assign them automatically. The analysis revealed the following: First, 39% of the information in the discharge summaries originated from external sources other than the inpatient records. Second, patients' past clinical records constituted 43%, and patient referral documents 18%, of the expressions derived from external sources. Third, 11% of the missing information was not derived from any documents and was possibly drawn from physicians' memory or reasoning. Given these results, end-to-end summarization using machine learning is considered infeasible; machine summarization with an assisted post-editing process is the best fit for this problem domain.
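The n-gram overlap filter described in the abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the function names, whitespace tokenization, the choice of n=3, and the overlap threshold are all assumptions introduced here.

```python
def ngrams(text: str, n: int) -> set:
    """Return the set of word n-grams in a whitespace-tokenized text."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(segment: str, inpatient_record: str, n: int = 3) -> float:
    """Fraction of the segment's n-grams that also appear in the inpatient record."""
    seg_grams = ngrams(segment, n)
    if not seg_grams:
        return 0.0
    rec_grams = ngrams(inpatient_record, n)
    return len(seg_grams & rec_grams) / len(seg_grams)

def likely_from_record(segment: str, inpatient_record: str,
                       threshold: float = 0.5) -> bool:
    """Flag a summary segment as plausibly originating from the inpatient
    record when its n-gram overlap exceeds a (hypothetical) threshold."""
    return overlap_ratio(segment, inpatient_record) >= threshold
```

Segments flagged as low-overlap would then be passed to the manual source-origin review described above; the threshold here is illustrative, since the paper states that the final decision was made by hand.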