Bernard Nathan, Sagawa Yoshimasa, Bier Nathalie, Lihoreau Thomas, Pazart Lionel, Tannou Thomas
Inserm CIC 1431, CHU Besançon, Besançon, F-25000, France.
Laboratoires de Neurosciences intégratives et clinique, unité de recherche EA 481, Université Marie et Louis Pasteur, INSERM, UMR 1322 LINC, Besançon, F-25000, France.
BMC Med Res Methodol. 2025 Mar 18;25(1):75. doi: 10.1186/s12874-025-02528-y.
Artificial intelligence (AI) tools are increasingly being used to assist researchers with various research tasks, particularly in the systematic review process. Elicit is one such tool; it can generate a summary answering the question asked, which sets it apart from other AI tools. The aim of this study was to determine whether AI-assisted research using Elicit adds value to the systematic review process compared with traditional screening methods.
We compared the results of an umbrella review conducted independently of AI with the results of an AI-based search using the same criteria. Elicit's contribution was assessed against three criteria: repeatability, reliability, and accuracy. For repeatability, the search process was repeated three times in Elicit (trials 1, 2, and 3). For accuracy, the articles retrieved with Elicit were screened using the same inclusion criteria as the umbrella review. Reliability was assessed by comparing the publications retrieved with and without the AI-based search.
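In practice, the reliability comparison amounts to set operations over the article identifiers returned by each search. The sketch below (Python) is illustrative only, not the authors' code; the DOI values are hypothetical placeholders.

    # Compare two hypothetical searches by the article identifiers (e.g., DOIs) they return.
    elicit_hits = {"10.1000/a", "10.1000/b", "10.1000/c"}    # hypothetical Elicit results
    umbrella_hits = {"10.1000/b", "10.1000/c", "10.1000/d"}  # hypothetical umbrella review results

    common = elicit_hits & umbrella_hits         # articles identified by both searches
    elicit_only = elicit_hits - umbrella_hits    # articles identified exclusively by Elicit
    umbrella_only = umbrella_hits - elicit_hits  # articles identified exclusively by the umbrella review

    print(f"common: {len(common)}, Elicit only: {len(elicit_only)}, "
          f"umbrella only: {len(umbrella_only)}")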
The repeatability test returned 246, 169, and 172 results for trials 1, 2, and 3, respectively. Concerning accuracy, 6 articles were included at the end of the selection process. Regarding reliability, the comparison revealed 3 articles identified by both searches, 3 identified exclusively by Elicit (these 3 + 3 account for the 6 articles included via Elicit), and 17 identified exclusively by the AI-independent umbrella review search.
Our findings suggest that AI research assistants such as Elicit can serve as valuable complementary tools for researchers designing or writing systematic reviews. However, AI tools have several limitations and should be used with caution. Researchers using AI tools must follow certain principles to maintain methodological rigour and integrity. Improving the performance of AI tools such as Elicit, and contributing to the development of guidelines for their use in the systematic review process, will enhance their effectiveness.