Generalization of finetuned transformer language models to new clinical contexts.

Author information

Xie Kevin, Terman Samuel W, Gallagher Ryan S, Hill Chloe E, Davis Kathryn A, Litt Brian, Roth Dan, Ellis Colin A

Affiliations

Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.

Center for Neuroengineering and Therapeutics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.

Publication information

JAMIA Open. 2023 Aug 16;6(3):ooad070. doi: 10.1093/jamiaopen/ooad070. eCollection 2023 Oct.

Abstract

OBJECTIVE

We have previously developed a natural language processing pipeline using clinical notes written by epilepsy specialists to extract seizure freedom, seizure frequency text, and date of last seizure text for patients with epilepsy. It is important to understand how our methods generalize to new care contexts.

MATERIALS AND METHODS

We evaluated our pipeline on unseen notes from nonepilepsy-specialist neurologists and non-neurologists without any additional algorithm training. We tested the pipeline out-of-institution using epilepsy specialist notes from an outside medical center with only minor preprocessing adaptations. We examined reasons for discrepancies in performance in new contexts by measuring physical and semantic similarities between documents.
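As a rough illustration of what a surface-level similarity between two notes could look like, the sketch below computes a bag-of-words cosine similarity. The abstract does not say which similarity measures were used, so the function name `cosine_similarity`, the whitespace tokenization, and the choice of cosine itself are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Bag-of-words cosine similarity between two documents.

    Hypothetical helper: tokenization is a plain lowercase whitespace
    split, which is an assumption and not taken from the paper.
    """
    counts_a = Counter(doc_a.lower().split())
    counts_b = Counter(doc_b.lower().split())

    # Dot product over the tokens that occur in both documents.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[token] * counts_b[token] for token in shared)

    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty document has no direction, so treat similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


specialist = "patient reports two focal seizures per month"
generalist = "pt had a couple of spells since last visit"
score = cosine_similarity(specialist, generalist)
```

A lexical measure like this only captures shared vocabulary; a semantic comparison would need sentence or document embeddings, which are outside this sketch.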

RESULTS

Our ability to classify patient seizure freedom decreased by at least 0.12 agreement when moving from epilepsy specialists to nonspecialists or other institutions. On notes from our institution, textual overlap between the extracted outcomes and the gold standard annotations attained from manual chart review decreased by at least 0.11 F1 when an answer existed but did not change when no answer existed; here our models generalized on notes from the outside institution, losing at most 0.02 agreement. We analyzed textual differences and found that syntactic and semantic differences in both clinically relevant sentences and surrounding contexts significantly influenced model performance.
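To make the text-overlap score concrete, here is a minimal sketch of a token-level F score between an extracted span and a gold-standard annotation, in the style commonly used for extractive question answering. The abstract does not give the exact formula, so the function name `token_f1`, the whitespace tokenization, and the handling of the no-answer case are assumptions.

```python
from collections import Counter


def token_f1(predicted: str, gold: str) -> float:
    """Token-overlap F score between a predicted span and a gold span.

    Hypothetical helper: lowercase whitespace tokenization is assumed.
    """
    pred_tokens = predicted.lower().split()
    gold_tokens = gold.lower().split()

    # No-answer case: score 1.0 only when both sides are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared token at most as many
    # times as it appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


partial = token_f1("two seizures", "two seizures per month")
```

In this example the prediction recovers half of the gold tokens with no extra tokens, so precision is 1.0 and recall is 0.5. Handling the empty case separately mirrors the abstract's distinction between notes where an answer exists and notes where none exists.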

DISCUSSION AND CONCLUSION

Model generalization performance decreased on notes from nonspecialists; out-of-institution generalization on epilepsy specialist notes required small changes to preprocessing but was especially good for seizure frequency text and date of last seizure text, opening opportunities for multicenter collaborations using these outcomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0292/10432353/bc416172e42e/ooad070f1.jpg