Sinha Ranwir K, Deb Roy Asitava, Kumar Nikhil, Mondal Himel
Pathology, All India Institute of Medical Sciences, Deoghar, Jharkhand, IND.
Physiology, All India Institute of Medical Sciences, Deoghar, Jharkhand, IND.
Cureus. 2023 Feb 20;15(2):e35237. doi: 10.7759/cureus.35237. eCollection 2023 Feb.
Background Artificial intelligence (AI) is evolving for healthcare services. Higher cognitive thinking in AI refers to the ability of a system to perform advanced cognitive processes, such as problem-solving, decision-making, reasoning, and perception. This type of thinking goes beyond simple data processing: it involves understanding and manipulating abstract concepts, interpreting and using information in a contextually relevant way, and generating new insights from past experience and accumulated knowledge. Natural language processing models such as ChatGPT are conversational programs that can interact with humans to answer queries.

Objective We aimed to ascertain the capability of ChatGPT to solve higher-order reasoning questions in the subject of pathology.

Methods This cross-sectional study was conducted on the internet using an AI-based chat program that provides a free service for research purposes. The current version of ChatGPT (January 30 version) was used to converse on a total of 100 higher-order reasoning questions. These questions were randomly selected from the institution's question bank and categorized by organ system. The response to each question was collected and stored for further analysis. The responses were scored by three expert pathologists on a zero-to-five scale and categorized into structure of the observed learning outcome (SOLO) taxonomy categories. The scores were compared with hypothetical values by a one-sample median test to gauge accuracy.

Results The program answered all 100 higher-order reasoning questions, taking an average of 45.31±7.14 seconds per answer. The overall median score was 4.08 (Q1-Q3: 4-4.33), which was below the hypothetical maximum value of five (one-sample median test, p < 0.0001) and similar to four (one-sample median test, p = 0.14). The majority (86%) of the responses fell into the "relational" category of the SOLO taxonomy. There was no difference in the scores of responses to questions from the various organ systems in pathology (Kruskal-Wallis, p = 0.55). The scores given by the three pathologists showed an excellent level of inter-rater reliability (ICC = 0.975 [95% CI: 0.965-0.983]; F = 40.26; p < 0.0001).

Conclusion ChatGPT's capability to solve higher-order reasoning questions in pathology reached a relational level of accuracy: the text output connected its parts to provide a meaningful response. The program's answers scored approximately 80%. Hence, academicians and students can also seek help from the program for reasoning-type questions. As the program is evolving, further studies are needed to assess the accuracy of subsequent versions.
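For readers who wish to see what the reported statistics involve, below is a minimal Python sketch of a one-sample median test (implemented here as a sign test via a binomial test) against the hypothetical values of five and four, plus a Kruskal-Wallis comparison across organ systems. The scores, group labels, and grouping shown are illustrative placeholders, not the study's actual data, and the use of SciPy is my assumption.

```python
# Sketch of the reported analyses; all scores and groups below are
# illustrative placeholders, not the study's actual data.
from scipy.stats import binomtest, kruskal

def sign_test(scores, hypothetical_median):
    """One-sample median (sign) test: compare counts of scores above vs.
    below the hypothetical median (ties discarded) against p = 0.5."""
    above = sum(s > hypothetical_median for s in scores)
    below = sum(s < hypothetical_median for s in scores)
    return binomtest(above, n=above + below, p=0.5).pvalue

scores = [4.0, 4.33, 4.08, 3.67, 4.33, 4.0, 4.67, 4.0]  # placeholder 0-5 ratings

# Is the observed median below the maximum possible score of five?
print("vs. 5:", sign_test(scores, 5))
# Is the observed median distinguishable from four?
print("vs. 4:", sign_test(scores, 4))

# Kruskal-Wallis test across organ-system groups (placeholder grouping).
systems = {
    "cardiovascular": [4.0, 4.33, 4.0],
    "respiratory": [3.67, 4.33],
    "renal": [4.0, 4.67, 4.08],
}
h_stat, p_value = kruskal(*systems.values())
print("Kruskal-Wallis p =", p_value)
```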
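Similarly, the inter-rater reliability figure could be computed as an intraclass correlation on the three pathologists' ratings. The sketch below uses the pingouin library and a long-format data layout, both of which are my assumptions; the ratings shown are placeholders.

```python
# Sketch of an ICC for three raters; data below are placeholders,
# not the study's actual ratings.
import pandas as pd
import pingouin as pg

# Long format: one row per (question, rater) pair.
df = pd.DataFrame({
    "question": sorted([1, 2, 3, 4, 5] * 3),
    "rater":    ["A", "B", "C"] * 5,
    "score":    [4, 4, 5, 3, 4, 4, 5, 5, 4, 4, 3, 4, 5, 4, 4],
})

# Returns all six ICC variants; the two-way forms correspond to the
# fixed-rater design implied by using the same three pathologists.
icc = pg.intraclass_corr(data=df, targets="question",
                         raters="rater", ratings="score")
print(icc[["Type", "ICC", "CI95%"]])
```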