Lian Chengxiang, Yuan Xin, Chokkakula Santosh, Wang Guanqing, Song Biao, Wang Zhe, Fan Ge, Yin Chengliang
Department of Dermatology and Venereology, The First Affiliated Hospital of Guang-xi Medical University, Nanning, 530021, China.
Department of Dermatology, GuiZhou Provincial People's Hospital, Guiyang, 550000, China.
Heliyon. 2024 Aug 30;10(17):e37220. doi: 10.1016/j.heliyon.2024.e37220. eCollection 2024 Sep 15.
The efficacy and proficiency of ChatGPT-3.5 and ChatGPT-4.0 in the precise diagnosis and management of conditions such as atopic dermatitis (AD) and autoimmune blistering skin diseases (AIBD) remain to be elucidated. This study therefore examined the accuracy and effectiveness of ChatGPT responses concerning the understanding, treatment, and specific cases of these two conditions.
First, the responses of ChatGPT-3.5 and ChatGPT-4.0 to a set of 50 questionnaires were evaluated by five dermatologists, with adjudication by a third-party reviewer. The comparative analysis set the performance of both ChatGPT-3.5 and ChatGPT-4.0 against the diagnostic abilities of three cohorts of qualified clinical professionals. The diagnostic proficiency of ChatGPT-3.5 and ChatGPT-4.0 was then assessed on specific cases of autoimmune blistering skin diseases.
In generating responses on fundamental knowledge about atopic dermatitis (AD), both versions of ChatGPT, despite their lack of specialized training on medical databases, showed a commendable capacity to produce answers that agreed substantially with evidence-based medical information. ChatGPT-4.0 outperformed ChatGPT-3.5. However, it is crucial to emphasize that ChatGPT-4.0 did not provide answers surpassing those of associate senior and senior medical professionals. In the assessment of how well ChatGPT recognizes particular types of AIBD, both ChatGPT-4.0 and ChatGPT-3.5 failed to give precise and accurate responses for each individual case of this skin condition.
Both ChatGPT-3.5 and ChatGPT-4.0 are satisfactory for addressing fundamental questions about atopic dermatitis, but they prove insufficient for diagnosing AIBD. ChatGPT still has a considerable way to go before it achieves utility in the professional medical domain.