Michael G. DeGroote School of Medicine, McMaster University, Hamilton, Ontario, Canada.
Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.
Retina. 2024 Jun 1;44(6):950-953. doi: 10.1097/IAE.0000000000004044.
To determine whether two popular artificial intelligence chatbots, ChatGPT and Bard, can provide high-quality information on the procedure description, risks, benefits, and alternatives of various ophthalmic surgeries.
ChatGPT and Bard were prompted with questions pertaining to the description, potential risks, benefits, alternatives, and implications of not proceeding with various surgeries in different subspecialties of ophthalmology. Six common ophthalmic procedures were included in the authors' analysis. Two comprehensive ophthalmologists and one subspecialist graded each response independently using a 5-point Likert scale.
Likert grades for accuracy were significantly higher for ChatGPT than for Bard (4.5 ± 0.6 vs. 3.8 ± 0.8, P < 0.0001). ChatGPT generally outperformed Bard even when questions were stratified by type of ophthalmic surgery. Response length did not differ significantly between ChatGPT and Bard (2,104.7 ± 271.4 characters vs. 2,441.0 ± 633.9 characters, P = 0.12). ChatGPT responded significantly more slowly than Bard (46.0 ± 3.0 vs. 6.6 ± 1.2 seconds, P < 0.0001).
Both ChatGPT and Bard may offer accessible and high-quality information relevant to the informed consent process for various ophthalmic procedures. Nonetheless, both artificial intelligence chatbots overlooked the probability of adverse events, hence limiting their potential and introducing patients to information that may be difficult to interpret.