Li Yunxiang, Li Zihan, Zhang Kai, Dan Ruilong, Jiang Steve, Zhang You
Department of Radiation Oncology, University of Texas Southwestern Medical Center, Dallas, USA.
Department of Computer Science, University of Illinois at Urbana-Champaign, Illinois, USA.
Cureus. 2023 Jun 24;15(6):e40895. doi: 10.7759/cureus.40895. eCollection 2023 Jun.
Objective: The primary aim of this research was to address the limitations in medical knowledge observed in prevalent large language models (LLMs) such as ChatGPT by creating a specialized language model with enhanced accuracy in medical advice.

Methods: We adapted and refined the Large Language Model Meta AI (LLaMA) using a dataset of 100,000 patient-doctor dialogues sourced from a widely used online medical consultation platform. These conversations were cleaned and anonymized to protect patient privacy. In addition to fine-tuning the model, we incorporated a self-directed information retrieval mechanism that allows the model to access and use real-time information from online sources such as Wikipedia, as well as data from curated offline medical databases.

Results: Fine-tuning the model on real-world patient-doctor interactions significantly improved its ability to understand patient needs and provide informed advice. Equipping the model with self-directed information retrieval from reliable online and offline sources yielded substantial further improvements in the accuracy of its responses.

Conclusion: Our proposed model, ChatDoctor, represents a significant advancement in medical LLMs, with marked improvement in understanding patient inquiries and providing accurate advice. Given the high stakes and low error tolerance of the medical field, such gains in accuracy and reliability are not only beneficial but essential.