Nong Tiffany, Britton Sean, Bhanderi Viralkumar, Taylor Justin
Florida State University College of Medicine, Tallahassee, FL, USA.
University of Vermont Larner College of Medicine, Burlington, VT, USA.
Future Sci OA. 2025 Dec;11(1):2546259. doi: 10.1080/20565623.2025.2546259. Epub 2025 Sep 3.
Many patients seek accurate, understandable information about their disease and treatment, either by turning to the internet or by messaging their providers. This study evaluates whether chatbots can deliver accurate information, contributing to the literature on AI's role in cancer care and helping to improve these tools for patients and caregivers. A set of questions about hematologic malignancies was created with input from oncologists and reputable websites and then submitted to ChatGPT 3.5. Hematology-oncology physicians rated each response for accuracy and usefulness to patients on a scale from strongly disagree (1) to strongly agree (5), with multiple reviewers ensuring consistency. The general queries category received a higher average score (3.38) than the novel therapies category (3.06), indicating relatively higher reviewer satisfaction with responses to general queries. However, no question scored above 4 (agree) on the 5-point scale, and most scores ranged from 3.0 to 3.8, reflecting a neutral stance and suggesting room for improvement. ChatGPT struggled to provide current and specific information for patient-specific queries and novel therapies, especially in rapidly advancing fields such as acute myeloid leukemia. These deficiencies likely stem from AI's reliance on large data sets, in which novel therapies carry relatively little weight.