School of Population Medicine and Public Health, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China.
Department of Primary Healthcare and Family Medicine, Faculty of Medicine, Universidad de Chile, Santiago, Chile.
BMC Womens Health. 2024 Sep 2;24(1):482. doi: 10.1186/s12905-024-03320-8.
Cervical cancer (CC) and breast cancer (BC) threaten women's well-being; health-related stigma and a lack of reliable information can lead to delayed diagnosis and premature death. ChatGPT is likely to become a key source of health information, although concerns about its quality could also influence health-seeking behaviours.
This cross-sectional online survey compared ChatGPT's responses with those of five physicians specializing in mammography and five specializing in gynaecology. Twenty frequently asked questions about CC and BC were posed on 26 and 29 April 2023. A panel of seven experts assessed the accuracy, consistency, and relevance of ChatGPT's responses using a 7-point Likert scale. Responses were also analyzed for readability, reliability, and efficiency. ChatGPT's responses were synthesized, and the findings are presented as a radar chart.
ChatGPT achieved an accuracy score of 7.0 (range: 6.6-7.0) for CC and BC questions, surpassing the highest-scoring physicians (P < 0.05). ChatGPT took an average of 13.6 s (range: 7.6-24.0 s) to answer each of the 20 questions. Readability was comparable to that of the experts and physicians involved, but ChatGPT generated longer responses than the physicians. The consistency score for repeated answers was 5.2 (range: 3.4-6.7). Across the different contexts combined, ChatGPT's overall relevance score was 6.5 (range: 4.8-7.0). Radar plot analysis indicated comparably good accuracy, efficiency and, to a certain extent, relevance; however, there were apparent inconsistencies, and the reliability and readability were considered inadequate.
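To illustrate how a radar plot of these evaluation dimensions might be produced, the following is a minimal Python/matplotlib sketch (not taken from the study). It uses the reported mean scores for accuracy (7.0), relevance (6.5), and consistency (5.2); the values for readability, reliability, and efficiency are hypothetical placeholders, since the abstract does not report them on the 7-point scale.

import numpy as np
import matplotlib.pyplot as plt

# Evaluation dimensions described in the abstract.
dimensions = ["Accuracy", "Relevance", "Consistency",
              "Readability", "Reliability", "Efficiency"]
# First three values are the reported means; the last three are
# illustrative placeholders only.
scores = [7.0, 6.5, 5.2, 4.0, 4.5, 6.8]

# Evenly spaced angles around the circle, closing the polygon by
# repeating the first point.
angles = np.linspace(0, 2 * np.pi, len(dimensions), endpoint=False).tolist()
angles += angles[:1]
values = scores + scores[:1]

fig, ax = plt.subplots(subplot_kw={"polar": True})
ax.plot(angles, values, linewidth=1.5)
ax.fill(angles, values, alpha=0.25)
ax.set_xticks(angles[:-1])
ax.set_xticklabels(dimensions)
ax.set_ylim(0, 7)  # 7-point Likert scale
ax.set_title("ChatGPT evaluation profile (illustrative)")
plt.tight_layout()
plt.show()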
ChatGPT shows promise as an initial source of information on CC and BC. It is highly functional, appears to be superior to physicians, and aligns with expert consensus, although there is room for improvement in readability, reliability, and consistency. Future efforts should focus on developing advanced ChatGPT models explicitly designed to improve medical practice and to support people with concerns about their symptoms.