ChatGPT's Efficacy in Queries Regarding Polycystic Ovary Syndrome and Treatment Strategies for Women Experiencing Infertility.

Author information

Devranoglu Belgin, Gurbuz Tugba, Gokmen Oya

Affiliations

Department of Obstetrics and Gynecology, Zeynep Kamil Maternity/Children, Education and Training Hospital, Istanbul 34480, Turkey.

Department of Gynecology and Obstetrics Clinic, Medistate Hospital, Istanbul 34820, Turkey.

Publication information

Diagnostics (Basel). 2024 May 22;14(11):1082. doi: 10.3390/diagnostics14111082.

DOI:10.3390/diagnostics14111082

PMID:38893609

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11172366/

Abstract

This study assesses the efficacy of ChatGPT-4, an advanced artificial intelligence (AI) language model, in delivering precise and comprehensive answers to inquiries about managing polycystic ovary syndrome (PCOS)-related infertility. The research team, comprising experienced gynecologists, formulated 460 structured queries covering a wide range of common and intricate PCOS scenarios. The queries were true/false (170), open-ended (165), or multiple-choice (125), and were further classified as 'easy', 'moderate', or 'hard'. For true/false questions, ChatGPT-4 achieved 100% accuracy both initially and upon reassessment after 30 days. In the open-ended category, accuracy scores improved significantly, from 5.53 ± 0.89 initially to 5.88 ± 0.43 at the 30-day mark (p < 0.001). Completeness scores for open-ended queries also improved significantly, rising from 2.35 ± 0.58 to 2.92 ± 0.29 (p < 0.001). In the multiple-choice category, the accuracy score showed a minor, non-significant decline from 5.96 ± 0.44 to 5.92 ± 0.63 after 30 days (p > 0.05). Completeness scores for multiple-choice questions remained consistent, with initial and 30-day means of 2.98 ± 0.18 and 2.97 ± 0.25, respectively (p > 0.05). ChatGPT-4 thus performed exceptionally on true/false queries and handled open-ended questions significantly better at the 30-day reassessment than at baseline. These findings emphasize the potential of AI, particularly ChatGPT-4, to enhance decision-making support for healthcare professionals managing PCOS-related infertility.
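
As a quick arithmetic check on the figures reported in the abstract, the following minimal Python sketch tallies the question counts by format and computes the change in each reported mean score between the initial and 30-day assessments. The variable names are illustrative only and are not taken from the study; the sketch uses only the summary numbers quoted above and does not reproduce the authors' statistical tests, which the abstract does not specify.

```python
# Question counts by format, as reported in the abstract.
counts = {"true/false": 170, "open-ended": 165, "multiple-choice": 125}

total = sum(counts.values())  # should match the 460 queries stated
shares = {kind: round(100 * n / total, 1) for kind, n in counts.items()}

# Reported mean scores as (initial, 30-day) pairs.
means = {
    "open-ended accuracy": (5.53, 5.88),
    "open-ended completeness": (2.35, 2.92),
    "multiple-choice accuracy": (5.96, 5.92),
    "multiple-choice completeness": (2.98, 2.97),
}

# Change in mean score: 30-day value minus initial value.
deltas = {name: round(after - before, 2) for name, (before, after) in means.items()}

print("total queries:", total)
for kind, pct in shares.items():
    print(f"{kind}: {pct}% of queries")
for name, delta in deltas.items():
    print(f"{name}: change in mean = {delta:+.2f}")
```

The counts sum to the stated 460 queries, and the two open-ended measures show the larger shifts in mean score, consistent with the significance levels reported for them.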
