Institute for Digital Medicine, University Hospital Giessen and Marburg, Philipps-University Marburg, Marburg, Germany.
Stanford Center for Biomedical Informatics Research, Stanford University School of Medicine, Palo Alto, CA, USA.
J Cancer Res Clin Oncol. 2024 Oct 9;150(10):451. doi: 10.1007/s00432-024-05964-3.
Large language models (LLMs) show potential for decision support in breast cancer care. However, their use in clinical care is currently precluded by a lack of control over the sources used for decision-making, limited explainability of the decision-making process, and health data security concerns. Small language models (SLMs), a recent development, have been discussed as a way to address these challenges. This preclinical proof-of-concept study tailors an open-source SLM to the German breast cancer guideline (BC-SLM) and evaluates its initial clinical accuracy and technical functionality in a preclinical simulation.
A multidisciplinary tumor board (MTB) serves as the gold standard for assessing initial clinical accuracy, measured as the concordance of the BC-SLM with the MTB and compared against two publicly available LLMs, ChatGPT3.5 and ChatGPT4. The study includes 20 fictional patient profiles and recommendations for 5 treatment modalities, yielding 100 binary treatment recommendations (recommended or not recommended). Statistical evaluation reports concordance with the MTB as a percentage together with Cohen's kappa statistic (κ). Technical functionality is assessed qualitatively in terms of local hosting, guideline adherence, and information retrieval.
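The two concordance metrics above can be sketched in a few lines of Python. The data here are illustrative placeholders, not the study's actual recommendations; the study compared 100 binary recommendations per model against the MTB.

```python
def cohens_kappa(gold, pred):
    """Cohen's kappa for two binary raters (1 = recommended, 0 = not recommended)."""
    n = len(gold)
    # Observed agreement: fraction of cases where both raters agree
    po = sum(g == p for g, p in zip(gold, pred)) / n
    # Expected chance agreement from each rater's marginal rates
    p_gold = sum(gold) / n
    p_pred = sum(pred) / n
    pe = p_gold * p_pred + (1 - p_gold) * (1 - p_pred)
    return (po - pe) / (1 - pe)

# Illustrative example: 10 binary recommendations (MTB vs. a model)
mtb   = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
model = [1, 1, 0, 1, 0, 1, 1, 0, 0, 1]

concordance = sum(g == p for g, p in zip(mtb, model)) / len(mtb)
print(f"concordance = {concordance:.0%}, kappa = {cohens_kappa(mtb, model):.3f}")
# → concordance = 80%, kappa = 0.583
```

A kappa of 1 indicates perfect agreement beyond chance, 0 indicates chance-level agreement; the study's values (e.g. κ = 0.721 for BC-SLM) fall in the substantial-agreement range.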
The overall concordance amounts to 86% for BC-SLM (κ = 0.721, p < 0.001), 90% for ChatGPT4 (κ = 0.820, p < 0.001), and 83% for ChatGPT3.5 (κ = 0.661, p < 0.001). Concordance for individual treatment modalities ranges from 65-100% for BC-SLM, 85-100% for ChatGPT4, and 55-95% for ChatGPT3.5. The BC-SLM runs locally, adheres to the German breast cancer guideline, and cites the guideline sections underlying its decision-making.
The tailored BC-SLM shows initial clinical accuracy and technical functionality, with concordance to the MTB comparable to that of publicly available LLMs such as ChatGPT4 and ChatGPT3.5. This serves as a proof of concept for adapting an SLM to an oncological disease and its guideline, addressing prevailing issues with LLMs by ensuring decision transparency, explainability, source control, and data security. It represents a necessary step towards clinical validation and the safe use of language models in clinical oncology.