Sanchez-Cordero Sergi, Lopez-Gonzalez Ruth, Fernandez Helena, Pujol-Gebellí Jordi
Bariatric Surgery, Moises Broggi University Hospital, Barcelona, Spain.
Obes Res Clin Pract. 2025 Jul-Aug;19(4):352-355. doi: 10.1016/j.orcp.2025.08.002. Epub 2025 Aug 15.
Selecting the most appropriate bariatric surgery (BS) technique is a complex, individualized process. Artificial intelligence (AI) tools like ChatGPT may assist, but their clinical utility is unclear. This study evaluates whether ChatGPT's recommendations for BS improve after exposure to scientific literature and how they align with real-world clinical decisions.
A retrospective single-center study included 283 patients who underwent primary BS between 2023 and 2025. No exclusion criteria were applied. Clinical variables (age, sex, BMI, comorbidities, and preoperative data) were collected. ChatGPT was asked to recommend the most suitable BS technique for each patient profile, first without context and then after being exposed to 412 open-access scientific articles. Recommendations were compared with actual clinical decisions using percentage agreement and Cohen's Kappa.
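The two agreement measures named above can be illustrated with a minimal sketch. It assumes each patient has exactly one clinically chosen technique and one recommended technique. The function names and the six-patient toy data are hypothetical and are not taken from the study. Cohen's Kappa is computed as (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from each rater's label frequencies.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Fraction of cases on which the two raters chose the same label."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters assigning one nominal label per case."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: sum over labels of the product of marginal proportions.
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical label; Kappa is undefined.
        return float("nan")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical toy data (NOT from the study): six patients.
clinical = ["SG", "RYGB", "OAGB", "SADI-S", "RYGB", "SG"]
model = ["SG", "SG", "RYGB", "SG", "RYGB", "OAGB"]

print(f"agreement = {percent_agreement(clinical, model):.1%}")  # 33.3%
print(f"kappa = {cohens_kappa(clinical, model):.2f}")           # 0.04
```

In this toy case one third of the recommendations match, yet Kappa is close to zero because nearly that much agreement would be expected by chance alone. This is why the two measures are reported together.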
Initially, ChatGPT favored sleeve gastrectomy (SG, 56.8%), followed by Roux-en-Y gastric bypass (RYGB, 26.8%) and one-anastomosis gastric bypass (OAGB, 16.4%); SADI-S was never suggested. Concordance with clinical practice was 20.0% (Kappa = 0.003; p = 0.96). After exposure to the literature, SG recommendations decreased (35.7%), RYGB recommendations increased (30.3%), SADI-S emerged (17.1%), and dual options appeared in 4% of cases. Concordance improved modestly to 25.8% (Kappa = 0.068; p = 0.29), with a significant shift in the overall distribution of recommendations (p < 0.00001).
ChatGPT adapts its recommendations after contextual training, but concordance with clinical judgment remains low. While potentially useful as an educational tool, ChatGPT is not yet reliable for autonomous surgical decision-making.