Gao Hui, Wang Kaipeng, Yuan Yuan, Wang Yueguo, Liu Qingyuan, Wang Yulan, Sun Jian, Wang Wenwen, Wang Huanli, Zhou Shusheng, Jin Kui, Zhang Mengping, Lai Yinglei
School of Mathematical Sciences, University of Science and Technology of China, Hefei, 230026, Anhui, China.
School of Mathematics and Statistics, Nanjing University of Science and Technology, Nanjing, 210094, Jiangsu, China.
Sci Rep. 2025 Jul 14;15(1):25345. doi: 10.1038/s41598-025-07649-4.
Identifying critically ill patients in emergency departments (EDs) is an ongoing challenge, partly because little information is available at the time of admission. The clinical notes in patient records have received attention for their potential to improve prediction. Recent large language models (LLMs) have shown promising performance, but their use for analyzing clinical notes has not been extensively investigated. To improve illness severity assessment and triage-level prediction, we developed a pipeline that uses LLMs (e.g., ChatGLM-2, GLM-4, and Alpaca-2) to extract information from the patient complaint and anamnesis in clinical notes. In this pipeline, an LLM is supplied with a text input containing a patient's complaint and anamnesis; the input is constructed using a prompt template, in-context learning (ICL), and retrieval-augmented generation (RAG). A severity score is then extracted from the LLM output and integrated into a predictive model to improve its performance. We demonstrated the effectiveness of the pipeline on patient records from the Chinese Emergency Triage, Assessment, and Treatment (CETAT) database, where the extracted score was incorporated into logistic regression as a predictor. At the early stage, when vital signs have typically not yet been measured, the score demonstrated the predictive value of the patient complaint and anamnesis: AUC-ROC improved from 0.746 to 0.802. At the later stage, once vital signs were available, the improvement in prediction attributable to the score was smaller but remained statistically significant in most cases. Recent LLMs are therefore capable of extracting valuable information from clinical notes for identifying critical illness, as illustrated in our study. More efficient LLM-based methods are still needed to achieve better performance.
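The input-construction and score-extraction steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the template wording, the 0 to 10 score range, the token-overlap retrieval (a stand-in for whatever retriever the study used), and all function names are assumptions made for illustration, and the call to the LLM itself is left out.

```python
import re

# Hypothetical prompt template; the study's actual wording is not given in the abstract.
PROMPT_TEMPLATE = (
    "You are an emergency triage assistant.\n"
    "Rate the severity of illness from 0 (mild) to 10 (critical).\n"
    "{examples}\n"
    "Complaint: {complaint}\n"
    "Anamnesis: {anamnesis}\n"
    "Severity score:"
)


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(query, cases, k=2):
    """Return the k stored cases most similar to the query.

    Similarity is Jaccard overlap of tokens, a simple stand-in for a
    real retriever (assumption, not the study's method).
    """
    q = tokenize(query)

    def similarity(case):
        c = tokenize(case["text"])
        union = q | c
        return len(q & c) / len(union) if union else 0.0

    return sorted(cases, key=similarity, reverse=True)[:k]


def build_prompt(complaint, anamnesis, cases, k=2):
    """Fill the template, using retrieved cases as in-context examples."""
    retrieved = retrieve(complaint + " " + anamnesis, cases, k)
    examples = "\n".join(
        f"Example note: {c['text']}\nSeverity score: {c['score']}"
        for c in retrieved
    )
    return PROMPT_TEMPLATE.format(
        examples=examples, complaint=complaint, anamnesis=anamnesis
    )


def parse_score(reply, low=0.0, high=10.0):
    """Extract the first number in the LLM reply, clamped to [low, high].

    Returns None when the reply contains no number.
    """
    match = re.search(r"-?\d+(?:\.\d+)?", reply)
    if match is None:
        return None
    return min(high, max(low, float(match.group())))
```

The parsed score would then be added as one more column alongside the other predictors before fitting the logistic regression. Returning `None` for an unparseable reply leaves it to the caller to decide whether to retry the query or treat the score as missing.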