

The political preferences of LLMs.

Affiliations

ECL, Otago Polytechnic, Dunedin, New Zealand.

Publication information

PLoS One. 2024 Jul 31;19(7):e0306621. doi: 10.1371/journal.pone.0306621. eCollection 2024.

DOI: 10.1371/journal.pone.0306621
PMID: 39083484
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11290627/
Abstract

I report here a comprehensive analysis about the political preferences embedded in Large Language Models (LLMs). Namely, I administer 11 political orientation tests, designed to identify the political preferences of the test taker, to 24 state-of-the-art conversational LLMs, both closed and open source. When probed with questions/statements with political connotations, most conversational LLMs tend to generate responses that are diagnosed by most political test instruments as manifesting preferences for left-of-center viewpoints. This does not appear to be the case for five additional base (i.e. foundation) models upon which LLMs optimized for conversation with humans are built. However, the weak performance of the base models at coherently answering the tests' questions makes this subset of results inconclusive. Finally, I demonstrate that LLMs can be steered towards specific locations in the political spectrum through Supervised Fine-Tuning (SFT) with only modest amounts of politically aligned data, suggesting SFT's potential to embed political orientation in LLMs. With LLMs beginning to partially displace traditional information sources like search engines and Wikipedia, the societal implications of political biases embedded in LLMs are substantial.
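The testing methodology the abstract describes, posing political-orientation test items to a conversational model and mapping its answers onto a signed scale, can be sketched roughly as follows. This is an illustration only, not the paper's actual pipeline: `ask_model`, the Likert mapping, and the sign convention are all assumptions.

```python
# Minimal sketch (not the paper's implementation) of administering
# forced-choice political-orientation test items to a conversational model.
from statistics import mean

# Signed Likert scores; the sign convention here is illustrative only.
LIKERT = {
    "strongly disagree": -2.0,
    "disagree": -1.0,
    "agree": 1.0,
    "strongly agree": 2.0,
}

def ask_model(statement: str) -> str:
    """Hypothetical stand-in for a real chat-completion API call."""
    return "agree"  # stubbed answer so the sketch runs standalone

def score_test(statements):
    """Average signed score over the items the model answered coherently."""
    scores = []
    for s in statements:
        answer = ask_model(s).strip().lower()
        if answer in LIKERT:  # incoherent answers are skipped, mirroring the
            scores.append(LIKERT[answer])  # abstract's note about base models
    return mean(scores) if scores else 0.0
```

Skipping answers outside the Likert vocabulary reflects the abstract's observation that base models often fail to answer test questions coherently, which is why that subset of results is inconclusive.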

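The SFT steering result at the end of the abstract presupposes a dataset of viewpoint-aligned prompt/response pairs. A hedged sketch of formatting such a dataset as JSONL, a common input format for fine-tuning pipelines; the example pairs and the `messages` schema are placeholders, not the paper's training data.

```python
# Hedged sketch: serializing viewpoint-aligned prompt/response pairs as
# JSONL for supervised fine-tuning (SFT). Pairs and schema are illustrative.
import json

pairs = [
    {"prompt": "What should the state's role in the economy be?",
     "response": "A response written from the target political viewpoint."},
    {"prompt": "How should immigration policy be decided?",
     "response": "Another viewpoint-aligned response."},
]

def to_sft_jsonl(pairs):
    """Emit one JSON object per line, chat-style messages per example."""
    return "\n".join(
        json.dumps({"messages": [
            {"role": "user", "content": p["prompt"]},
            {"role": "assistant", "content": p["response"]},
        ]})
        for p in pairs
    )
```

The abstract's claim is that even modest amounts of such data suffice to steer a model's diagnosed position on the political spectrum.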

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a56/11290627/82e0211c8bdf/pone.0306621.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a56/11290627/089e24e45058/pone.0306621.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a56/11290627/9ded089544ef/pone.0306621.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a56/11290627/dd8e5d20169a/pone.0306621.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a56/11290627/7d26547a07d3/pone.0306621.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3a56/11290627/33355f705e70/pone.0306621.g006.jpg

Similar articles

1
The political preferences of LLMs.
PLoS One. 2024 Jul 31;19(7):e0306621. doi: 10.1371/journal.pone.0306621. eCollection 2024.
2
Assessing the Alignment of Large Language Models With Human Values for Mental Health Integration: Cross-Sectional Study Using Schwartz's Theory of Basic Values.
JMIR Ment Health. 2024 Apr 9;11:e55988. doi: 10.2196/55988.
3
A Systematic Review of ChatGPT and Other Conversational Large Language Models in Healthcare.
medRxiv. 2024 Apr 27:2024.04.26.24306390. doi: 10.1101/2024.04.26.24306390.
4
OpenMedLM: prompt engineering can out-perform fine-tuning in medical question-answering with open-source large language models.
Sci Rep. 2024 Jun 19;14(1):14156. doi: 10.1038/s41598-024-64827-6.
5
Bidirectional Encoder Representations from Transformers-like large language models in patient safety and pharmacovigilance: A comprehensive assessment of causal inference implications.
Exp Biol Med (Maywood). 2023 Nov;248(21):1908-1917. doi: 10.1177/15353702231215895. Epub 2023 Dec 12.
6
Tell it like it is: When politically incorrect language promotes authenticity.
J Pers Soc Psychol. 2020 Jul;119(1):75-103. doi: 10.1037/pspi0000206. Epub 2019 Aug 15.
7
A comprehensive evaluation of large language models on benchmark biomedical text processing tasks.
Comput Biol Med. 2024 Mar;171:108189. doi: 10.1016/j.compbiomed.2024.108189. Epub 2024 Feb 20.
8
Model tuning or prompt Tuning? A study of large language models for clinical concept and relation extraction.
J Biomed Inform. 2024 May;153:104630. doi: 10.1016/j.jbi.2024.104630. Epub 2024 Mar 26.
9
Evaluation of the Performance of Generative AI Large Language Models ChatGPT, Google Bard, and Microsoft Bing Chat in Supporting Evidence-Based Dentistry: Comparative Mixed Methods Study.
J Med Internet Res. 2023 Dec 28;25:e51580. doi: 10.2196/51580.
10
Evaluating the persuasive influence of political microtargeting with large language models.
Proc Natl Acad Sci U S A. 2024 Jun 11;121(24):e2403116121. doi: 10.1073/pnas.2403116121. Epub 2024 Jun 7.

Cited by

1
AI-assisted grading and personalized feedback in large political science classes: Results from randomized controlled trials.
PLoS One. 2025 Aug 19;20(8):e0328041. doi: 10.1371/journal.pone.0328041. eCollection 2025.
2
A publicly available benchmark for assessing large language models' ability to predict how humans balance self-interest and the interest of others.
Sci Rep. 2025 Jul 1;15(1):21428. doi: 10.1038/s41598-025-01715-7.
3
Measuring gender and racial biases in large language models: Intersectional evidence from automated resume evaluation.
PNAS Nexus. 2025 Mar 12;4(3):pgaf089. doi: 10.1093/pnasnexus/pgaf089. eCollection 2025 Mar.
4
Asking an AI for salary negotiation advice is a matter of concern: Controlled experimental perturbation of ChatGPT for protected and non-protected group discrimination on a contextual task with no clear ground truth answers.
PLoS One. 2025 Feb 7;20(2):e0318500. doi: 10.1371/journal.pone.0318500. eCollection 2025.

References

1
Diminished diversity-of-thought in a standard large language model.
Behav Res Methods. 2024 Sep;56(6):5754-5770. doi: 10.3758/s13428-023-02307-x. Epub 2024 Jan 9.
2
ChatGPT outperforms crowd workers for text-annotation tasks.
Proc Natl Acad Sci U S A. 2023 Jul 25;120(30):e2305016120. doi: 10.1073/pnas.2305016120. Epub 2023 Jul 18.
3
Accuracy comparison across face recognition algorithms: Where are we on measuring race bias?
IEEE Trans Biom Behav Identity Sci. 2021 Jan;3(1):101-111. doi: 10.1109/TBIOM.2020.3027269. Epub 2020 Sep 29.
4
Wide range screening of algorithmic bias in word embedding models using large sentiment lexicons reveals underreported bias types.
PLoS One. 2020 Apr 21;15(4):e0231189. doi: 10.1371/journal.pone.0231189. eCollection 2020.
5
Semantics derived automatically from language corpora contain human-like biases.
Science. 2017 Apr 14;356(6334):183-186. doi: 10.1126/science.aal4230.