Herzog Isabel, Mendiratta Dhruv, Para Ashok, Berg Ari, Kaushal Neil, Vives Michael
Rutgers New Jersey Medical School, Newark, New Jersey, USA.
J Exp Orthop. 2024 Jun 13;11(3):e12057. doi: 10.1002/jeo2.12057. eCollection 2024 Jul.
Since its release in November 2022, Chat Generative Pre-Trained Transformer 3.5 (ChatGPT), a complex machine learning model, has garnered more than 100 million users worldwide. The aim of this study is to determine how well ChatGPT can generate novel systematic review ideas on topics within spine surgery.
ChatGPT was instructed to give ten novel systematic review ideas for each of five popular topics in the spine surgery literature: microdiscectomy, laminectomy, spinal fusion, kyphoplasty and disc replacement. A comprehensive literature search was conducted in PubMed, CINAHL, EMBASE and Cochrane. For each ChatGPT-generated idea, the number of published nonsystematic review articles and the number of published systematic reviews were recorded.
Overall, ChatGPT had a 68% accuracy rate in creating novel systematic review ideas. More specifically, the accuracy rates were 80%, 80%, 40%, 70% and 70% for microdiscectomy, laminectomy, spinal fusion, kyphoplasty and disc replacement, respectively. However, for 32% of the ideas ChatGPT generated, no nonsystematic review articles had been published. Among ideas for which nonsystematic reviews had also been published, the success rates for generating novel systematic review ideas were 71.4%, 50%, 22.2%, 50%, 62.5% and 51.2% for microdiscectomy, laminectomy, spinal fusion, kyphoplasty, disc replacement and overall, respectively.
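The overall accuracy rate follows from the per-topic rates. The sketch below reproduces that arithmetic under one assumption that the abstract does not state outright: that ChatGPT produced ten ideas per topic, for 50 ideas in total. The variable names are illustrative only.

```python
# Per-topic accuracy rates as reported in the abstract (percent).
reported_rates = {
    "microdiscectomy": 80,
    "laminectomy": 80,
    "spinal fusion": 40,
    "kyphoplasty": 70,
    "disc replacement": 70,
}

# Assumption: ten ideas were requested for each topic.
IDEAS_PER_TOPIC = 10

# Convert each percentage back to a count of novel ideas for that topic.
novel_counts = {
    topic: rate * IDEAS_PER_TOPIC // 100
    for topic, rate in reported_rates.items()
}

total_novel = sum(novel_counts.values())
total_ideas = IDEAS_PER_TOPIC * len(reported_rates)
overall_rate = 100 * total_novel / total_ideas

print(f"{total_novel}/{total_ideas} novel ideas = {overall_rate:.0f}%")
```

Under that assumption, the per-topic counts sum to 34 of 50 ideas, which matches the reported overall rate of 68%.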
ChatGPT generated novel systematic review ideas at an overall rate of 68%. When used under the supervision of an experienced spine specialist, ChatGPT can help identify knowledge gaps in spine research that warrant further investigation. This technology can be erroneous and lacks intrinsic logic, so it should never be used in isolation.
Level of evidence: Not applicable.