Poole Shane, Sisodia Nikki, Koshal Kanishka, Henderson Kyra, Wijangco Jaeleene, Paredes Danelvis, Chen Chelsea, Rowles William, Akula Amit, Wuerfel Jens, Sharma Vishakha, Rauschecker Andreas M, Henry Roland G, Bove Riley
UCSF Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, USA.
F. Hoffmann-La Roche, Basel, Switzerland.
Ann Neurol. 2025 Aug;98(2):308-316. doi: 10.1002/ana.27251. Epub 2025 Apr 25.
Neuroimaging is routinely used to identify new inflammatory activity in multiple sclerosis (MS). Using a large language model to classify narrative magnetic resonance imaging reports in the electronic health record (EHR) as discrete data could provide significant benefits for MS research. The objectives of the current study were to develop a prompt for this purpose and to illustrate its research applications through a common clinical scenario: monitoring response to B-cell depleting therapy (BCDT).
An institutional ecosystem that securely connects healthcare data with ChatGPT4 was applied to clinical MS magnetic resonance imaging reports in a single institutional EHR (2000-2022). A prompt (msLesionprompt) was developed and iteratively refined to classify the presence or absence of new T2-weighted lesions (newT2w) and contrast-enhancing lesions (CEL). The multistep validation included evaluating efficiency (time and cost), comparing the output with manually annotated reports using a standard confusion matrix, and applying the extracted data to identify predictors of newT2w/CEL after BCDT start.
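The comparison against manually annotated reports can be pictured with a minimal sketch of a binary confusion matrix. The function names and the toy labels below are illustrative assumptions, not the study's code or data; `True` stands for "lesion reported present".

```python
from collections import Counter


def confusion_counts(manual, predicted):
    """Return (tp, fn, fp, tn) for paired boolean labels.

    `manual` is the human annotation (reference standard) and
    `predicted` is the model's classification of the same reports.
    """
    if len(manual) != len(predicted):
        raise ValueError("label lists must have the same length")
    counts = Counter(zip(manual, predicted))
    tp = counts[(True, True)]    # both say lesion present
    fn = counts[(True, False)]   # annotator says present, model says absent
    fp = counts[(False, True)]   # annotator says absent, model says present
    tn = counts[(False, False)]  # both say lesion absent
    return tp, fn, fp, tn


def summarize(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, and specificity from the four cells."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels, invented for illustration only (NOT data from the study).
manual = [True, False, False, True, False, False, True, False]
predicted = [True, False, False, False, False, True, True, False]

cells = confusion_counts(manual, predicted)
metrics = summarize(*cells)
print(cells, metrics)
```

In a setting like the one described, the same calculation would be run once per label (newT2w and CEL), since each is a separate present/absent classification.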
Accuracy of msLesionprompt was high for detection of newT2w (97%) and CEL (96.8%). All 14,888 available reports were categorized in 4.13 hours ($28); 79% showed no newT2w or CEL. The extracted data showed the expected suppression of new activity by BCDT (in >97% of monitoring magnetic resonance images after an initial "rebaseline" scan). Neighborhood poverty (Area Deprivation Index) was identified as a predictor of inflammatory activity (newT2w: OR 1.69, 95% CI 1.10-2.59, p = 0.017; CEL: OR 1.54, 95% CI 1.01-2.34, p = 0.046).
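To put the efficiency figures in per-report terms, the short calculation below divides the reported totals by the number of reports. Only the three input values come from the abstract; the variable names are illustrative, and the derived values are back-of-the-envelope averages rather than figures reported by the authors.

```python
# Inputs taken from the abstract.
reports = 14_888   # reports categorized
hours = 4.13       # total processing time
cost_usd = 28.0    # total cost in US dollars

# Derived averages (not reported in the abstract).
seconds_per_report = hours * 3600 / reports
cents_per_report = cost_usd * 100 / reports

print(f"{seconds_per_report:.2f} s per report")
print(f"{cents_per_report:.2f} cents per report")
```

On these totals, the averages work out to roughly one second and under a fifth of a cent per report.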
Extracting discrete information from narrative imaging reports using a large language model is feasible and efficient. This approach could augment many real-world analyses of MS disease evolution and treatment response. ANN NEUROL 2025;98:308-316.