Exploring the occupational biases and stereotypes of Chinese large language models.

Authors

Jiang Leilei, Zhu Guixiang, Sun Jianshan, Cao Jie, Wu Jia

Affiliations

College of Management, Hefei University of Technology, Hefei, 230009, China.

College of Information Engineering, Nanjing University of Finance and Economics, Nanjing, 210023, China.

Publication information

Sci Rep. 2025 May 29;15(1):18777. doi: 10.1038/s41598-025-03893-w.

DOI:10.1038/s41598-025-03893-w
PMID:40436980
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12119973/
Abstract

Large language models (LLMs) are transforming many aspects of daily life and work through the content they generate, known as Artificial Intelligence Generated Content (AIGC). To harness this change effectively, it is essential to understand the limitations of these models. While extensive prior research has addressed biases in OpenAI's ChatGPT, limited attention has been given to biases in Chinese large language models (C-LLMs). This study systematically examines biases in five representative C-LLMs. We collected 90 Chinese surnames derived from authoritative demographic statistics and 12 occupations covering various professional sectors as input prompts. Each prompt was run three times on each C-LLM, resulting in a dataset of 16,200 generated personal profiles. We then evaluated these profiles for biases regarding gender, region, age, and educational background. Our findings reveal that the content produced by every examined C-LLM exhibits significant gender and regional biases, as well as age and educational stereotypes. Notably, most models can generate some unbiased content; ChatGLM is the exception. Tongyiqianwen, in turn, is the only model that may refuse to generate certain content, owing to its strong privacy protection mechanisms. We further analyze the underlying mechanisms of bias formation by examining different stages of the model lifecycle and considering the unique characteristics of the Chinese linguistic and sociocultural context. This paper contributes to the literature on biases in C-LLMs and offers insights for users aiming to use these models more effectively and ethically.
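The dataset size reported in the abstract can be checked with a short sketch. It assumes a full factorial design in which each of the five models receives every surname-occupation pair three times; this is an inference from the stated totals, not a description of the authors' pipeline. The surname, occupation, and model labels are hypothetical placeholders, since the abstract does not list the actual values.

```python
from itertools import product

# Hypothetical placeholders: the abstract gives only the counts
# (90 surnames, 12 occupations, 5 models), not the actual values.
surnames = [f"surname_{i:02d}" for i in range(1, 91)]
occupations = [f"occupation_{i:02d}" for i in range(1, 13)]
models = [f"model_{i}" for i in range(1, 6)]
REPEATS = 3  # each prompt is run three times


def build_requests(surnames, occupations, models, repeats):
    """Enumerate one request per (model, surname, occupation, repetition)."""
    return [
        {"model": m, "surname": s, "occupation": o, "repeat": r}
        for m, s, o, r in product(
            models, surnames, occupations, range(1, repeats + 1)
        )
    ]


requests = build_requests(surnames, occupations, models, REPEATS)
prompts_per_model = len(surnames) * len(occupations)

print(prompts_per_model)  # 1080 distinct prompts per model
print(len(requests))      # 16200 generated profiles in total
```

Under this assumption, 90 × 12 = 1,080 distinct prompts, three runs each, across five models gives the 16,200 profiles the abstract reports.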

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/8ec8b5601023/41598_2025_3893_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/199a8319d7eb/41598_2025_3893_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/51f42f46f2cc/41598_2025_3893_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/dda039c92bec/41598_2025_3893_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/444d71b39991/41598_2025_3893_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/e1bff415b762/41598_2025_3893_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/47b88fb250b0/41598_2025_3893_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/869bd3aad53c/41598_2025_3893_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea36/12119973/25a4b8c8baf4/41598_2025_3893_Fig9_HTML.jpg

Similar articles

1
Exploring the occupational biases and stereotypes of Chinese large language models.
Sci Rep. 2025 May 29;15(1):18777. doi: 10.1038/s41598-025-03893-w.
2
Bias of AI-generated content: an examination of news produced by large language models.
Sci Rep. 2024 Mar 4;14(1):5224. doi: 10.1038/s41598-024-55686-2.
3
Fighting reviewer fatigue or amplifying bias? Considerations and recommendations for use of ChatGPT and other large language models in scholarly peer review.
Res Integr Peer Rev. 2023 May 18;8(1):4. doi: 10.1186/s41073-023-00133-5.
4
What's in a Name? Experimental Evidence of Gender Bias in Recommendation Letters Generated by ChatGPT.
J Med Internet Res. 2024 Mar 5;26:e51837. doi: 10.2196/51837.
5
Evaluating the Influence of Role-Playing Prompts on ChatGPT's Misinformation Detection Accuracy: Quantitative Study.
JMIR Infodemiology. 2024 Sep 26;4:e60678. doi: 10.2196/60678.
6
Examining the Role of Large Language Models in Orthopedics: Systematic Review.
J Med Internet Res. 2024 Nov 15;26:e59607. doi: 10.2196/59607.
7
Academic Surgery in the Era of Large Language Models: A Review.
JAMA Surg. 2024 Apr 1;159(4):445-450. doi: 10.1001/jamasurg.2023.6496.
8
Artificial Intelligence in Dental Education: Opportunities and Challenges of Large Language Models and Multimodal Foundation Models.
JMIR Med Educ. 2024 Sep 27;10:e52346. doi: 10.2196/52346.
9
Fighting reviewer fatigue or amplifying bias? Considerations and recommendations for use of ChatGPT and other Large Language Models in scholarly peer review.
Res Sq. 2023 Feb 20:rs.3.rs-2587766. doi: 10.21203/rs.3.rs-2587766/v1.
10
Assessing the performance of large language models (LLMs) in answering medical questions regarding breast cancer in the Chinese context.
Digit Health. 2024 Oct 7;10:20552076241284771. doi: 10.1177/20552076241284771. eCollection 2024 Jan-Dec.
