ChatGPT as Research Scientist: Probing GPT's capabilities as a Research Librarian, Research Ethicist, Data Generator, and Data Predictor.

Affiliations

Cangrade, Inc., Watertown, MA 02472.

Information School, University of Washington, Seattle, WA 98195.

Publication information

Proc Natl Acad Sci U S A. 2024 Aug 27;121(35):e2404328121. doi: 10.1073/pnas.2404328121. Epub 2024 Aug 20.

Abstract

How good a research scientist is ChatGPT? We systematically probed the capabilities of GPT-3.5 and GPT-4 across four central components of the scientific process: as a Research Librarian, Research Ethicist, Data Generator, and Novel Data Predictor, using psychological science as a testing field. In Study 1 (Research Librarian), unlike human researchers, GPT-3.5 and GPT-4 hallucinated, authoritatively generating fictional references 36.0% and 5.4% of the time, respectively, although GPT-4 exhibited an evolving capacity to acknowledge its fictions. In Study 2 (Research Ethicist), GPT-4 (though not GPT-3.5) proved capable of detecting violations like p-hacking in fictional research protocols, correcting 88.6% of blatantly presented issues, and 72.6% of subtly presented issues. In Study 3 (Data Generator), both models consistently replicated patterns of cultural bias previously discovered in large language corpora, indicating that ChatGPT can simulate known results, an antecedent to usefulness for both data generation and skills like hypothesis generation. Contrastingly, in Study 4 (Novel Data Predictor), neither model was successful at predicting new results absent in their training data, and neither appeared to leverage substantially new information when predicting more vs. less novel outcomes. Together, these results suggest that GPT is a flawed but rapidly improving librarian, a decent research ethicist already, capable of data generation in simple domains with known characteristics but poor at predicting novel patterns of empirical data to aid future experimentation.
