McWilliams School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, United States.
Department of Biomedical Informatics and Data Science, School of Medicine, Yale University, New Haven, CT 06510, United States.
J Am Med Inform Assoc. 2024 Nov 1;31(11):2622-2631. doi: 10.1093/jamia/ocae233.
OBJECTIVE: In acupuncture therapy, accurate location of acupoints is essential for effectiveness. The advanced language understanding capabilities of large language models (LLMs) such as Generative Pre-trained Transformers (GPTs) and Llama present a significant opportunity for extracting acupoint location relations from textual knowledge sources. This study explores the performance of LLMs in extracting acupoint-related location relations and assesses the impact of fine-tuning on GPT's performance.
MATERIALS AND METHODS: We used the World Health Organization Standard Acupuncture Point Locations in the Western Pacific Region (WHO Standard) as our corpus, which consists of descriptions of 361 acupoints. Five types of relations between acupoints ("direction_of", "distance_of", "part_of", "near_acupoint", and "located_near") were annotated (n = 3174). Four models were compared: pre-trained GPT-3.5, fine-tuned GPT-3.5, pre-trained GPT-4, and pre-trained Llama 3. Performance metrics were micro-average exact match precision, recall, and F1 scores.
RESULTS: Fine-tuned GPT-3.5 consistently outperformed the other models in F1 score across all relation types. Overall, it achieved the highest micro-average F1 score, 0.92.
DISCUSSION: The superior performance of the fine-tuned GPT-3.5 model, as shown by its F1 scores, underscores the importance of domain-specific fine-tuning in enhancing relation extraction capabilities for acupuncture-related tasks. These findings offer valuable insights into leveraging LLMs for developing clinical decision support and creating educational modules in acupuncture.
CONCLUSION: This study underscores the effectiveness of LLMs like GPT and Llama in extracting relations related to acupoint locations, with implications for accurately modeling acupuncture knowledge and promoting standard implementation in acupuncture training and practice. The findings also contribute to advancing informatics applications in traditional and complementary medicine, showcasing the potential of LLMs in natural language processing.
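The abstract reports micro-average exact match precision, recall, and F1 but does not spell out the computation. The following is a minimal sketch of one common way to compute these metrics for relation extraction, not the authors' evaluation code. It assumes each relation is represented as a (head, relation, tail) triple and that a prediction counts as correct only when all three elements match a gold triple exactly. The function name `micro_prf` and the placeholder entities are hypothetical, and the toy data is invented for illustration; it is not taken from the WHO Standard or from the study's results.

```python
from typing import List, Set, Tuple

# A relation is modeled as (head, relation_type, tail). This representation is
# an assumption made for illustration; the abstract does not describe the format.
Triple = Tuple[str, str, str]


def micro_prf(gold: List[Set[Triple]], pred: List[Set[Triple]]) -> Tuple[float, float, float]:
    """Micro-averaged exact-match precision, recall, and F1.

    Counts are pooled over all documents (and therefore over all relation
    types) before the ratios are computed, which is what makes the average
    "micro" rather than "macro".
    """
    tp = fp = fn = 0
    for gold_set, pred_set in zip(gold, pred):
        tp += len(gold_set & pred_set)   # predicted triples that exactly match gold
        fp += len(pred_set - gold_set)   # predicted triples with no gold match
        fn += len(gold_set - pred_set)   # gold triples the model missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return precision, recall, f1


# Toy data with placeholder entities (hypothetical, for illustration only).
gold_example = [
    {("point_A", "direction_of", "landmark_1"),
     ("point_A", "distance_of", "landmark_1"),
     ("point_A", "part_of", "region_1")},
    {("point_B", "near_acupoint", "point_A")},
]
pred_example = [
    {("point_A", "direction_of", "landmark_1"),
     ("point_A", "distance_of", "landmark_1"),
     ("point_A", "located_near", "region_1")},   # wrong relation type: not an exact match
    {("point_B", "near_acupoint", "point_A"),
     ("point_B", "part_of", "region_2")},        # spurious extra triple
]

p, r, f = micro_prf(gold_example, pred_example)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Because counts are pooled before dividing, relation types with many annotated instances weigh more heavily in the micro-average than rare ones. Per-type scores, such as those the abstract refers to, could be obtained by filtering the triples to one relation type before calling the same function.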