Benchmarking AI chatbots: assessing their accuracy in identifying hijacked medical journals.

Authors

Hegedűs Mihály, Dadkhah Mehdi, Dávid Lóránt Dénes

Affiliations

Department of Finance and Accounting, Tomori Pál College, Budapest, Hungary.

Chamber of Hungarian Auditors, Budapest, Hungary.

Publication information

Diagnosis (Berl). 2025 May 22. doi: 10.1515/dx-2025-0043.

Abstract

OBJECTIVES

The challenges posed by questionable journals to academia are very real, and being able to detect hijacked journals would be valuable to the research community. Using an artificial intelligence (AI) chatbot may be a promising approach to early detection. The purpose of this research is to analyze and benchmark the performance of different AI chatbots in identifying hijacked medical journals.

METHODS

This study utilized a dataset comprising 21 previously identified hijacked journals and 10 newly detected hijacked journals, alongside their respective legitimate versions. ChatGPT, Gemini, Copilot, DeepSeek, Qwen, Perplexity, and Claude were selected for benchmarking. Three question types were developed to assess AI chatbots' performance in providing information about hijacked journals, identifying hijacked websites, and verifying legitimate ones.
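The abstract does not say how responses were scored. As a minimal sketch, assuming each chatbot answer is simply marked correct or incorrect, per-chatbot accuracy for each question type could be tallied as follows. The chatbot names, question-type labels, and records here are hypothetical placeholders, not data or results from the study.

```python
from collections import defaultdict

# Hypothetical records: (chatbot, question_type, answered_correctly).
# Placeholder values for illustration only; not results from the study.
records = [
    ("ChatbotA", "identify_hijacked", True),
    ("ChatbotA", "identify_hijacked", False),
    ("ChatbotA", "verify_legitimate", True),
    ("ChatbotB", "identify_hijacked", False),
    ("ChatbotB", "verify_legitimate", True),
    ("ChatbotB", "verify_legitimate", True),
]


def accuracy_by_chatbot_and_type(rows):
    """Return {(chatbot, question_type): fraction of correct answers}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for chatbot, question_type, is_correct in rows:
        key = (chatbot, question_type)
        totals[key] += 1
        correct[key] += int(is_correct)
    return {key: correct[key] / totals[key] for key in totals}


scores = accuracy_by_chatbot_and_type(records)
for (chatbot, question_type), acc in sorted(scores.items()):
    print(f"{chatbot:10s} {question_type:20s} {acc:.2f}")
```

Keeping the question types separate matters for this kind of benchmark: a chatbot that correctly confirms legitimate sites may still fail to flag hijacked ones, and a single pooled accuracy would hide that difference.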

RESULTS

The results show that current AI chatbots can provide general information about hijacked journals, but cannot reliably identify either real or hijacked journal titles. While Copilot performed better than others, it was not error-free.

CONCLUSIONS

Current AI chatbots are not yet reliable for detecting hijacked journals and may inadvertently promote them.
