Eravsar Necati Bahadir, Aydin Mahmud, Eryilmaz Atahan, Turemis Cihangir, Surucu Serkan, Jimenez Andrew E
Johns Hopkins University, Department of Orthopaedic Surgery, Baltimore, MD, USA; S.B.U. Haydarpasa Numune Training and Research Hospital, Istanbul 34668, Turkey; Sisli Memorial Hospital, Istanbul 34384, Turkey.
J ISAKOS. 2025 Jun;12:100892. doi: 10.1016/j.jisako.2025.100892. Epub 2025 May 3.
The purpose of this study was to compare the reliability and accuracy of answers to frequently asked questions (FAQs) about hip arthroscopy (HA) provided to patients by Chat Generative Pre-Trained Transformer (ChatGPT), an artificial intelligence (AI)-based large language model (LLM), with those obtained through a contemporary Google Search.
"HA" was entered into Google Search and ChatGPT, and the 15 most common FAQs and the answers were determined. In Google Search, the FAQs were obtained from the "People also ask" section. ChatGPT was queried to provide the 15 most common FAQs and subsequent answers. The Rothwell system groups the questions under 10 subheadings. Responses of ChatGPT and Google Search engines were compared.
Timeline of recovery (23.3%) and technical details (20%) were the most common question categories. ChatGPT generated significantly more questions in the technical details category than Google Search (33.3% vs. 6.6%; p = 0.0455). Academic sources were the most common type of reference for both Google web search (46.6%) and ChatGPT (93.3%), with ChatGPT citing academic references significantly more often (93.3% vs. 46.6%). Conversely, Google web search cited medical practice references (20% vs. 0%), single-surgeon websites (26% vs. 0%), and government websites (6% vs. 0%) more frequently than ChatGPT.
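For readers curious how such category proportions could be compared statistically: the abstract does not name the test used, so the sketch below is an assumption, applying Fisher's exact test to the raw counts implied by the percentages (33.3% and 6.6% of 15 questions correspond to 5 and 1 questions, respectively); the resulting p-value need not match the 0.0455 reported above.

# Hedged sketch: comparing "technical details" category counts between
# ChatGPT and Google Search. The paper does not specify its statistical
# test; Fisher's exact test is assumed here purely for illustration.
from scipy.stats import fisher_exact

# 33.3% of 15 FAQs = 5 questions (ChatGPT); 6.6% of 15 = 1 question (Google).
table = [[5, 15 - 5],   # ChatGPT: technical details vs. all other categories
         [1, 15 - 1]]   # Google Search: technical details vs. all other categories

odds_ratio, p_value = fisher_exact(table, alternative="two-sided")
print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")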
ChatGPT performed similarly to Google Search in providing information about HA. Compared with Google, ChatGPT supported its answers to patient questions with significantly more academic sources.
Level IV.