Inferring gender from first names: Comparing the accuracy of Genderize, Gender API, and the gender R package on authors of diverse nationality.

Authors

VanHelene Alexander D, Khatri Ishaani, Hilton C Beau, Mishra Sanjay, Gamsiz Uzun Ece D, Warner Jeremy L

Affiliations

Lifespan Cancer Institute, Rhode Island Hospital, Providence, Rhode Island, United States of America.

Center for Clinical Cancer Informatics and Data Science, Legorreta Cancer Center, Brown University, Providence, Rhode Island.

Publication information

PLOS Digit Health. 2024 Oct 29;3(10):e0000456. doi: 10.1371/journal.pdig.0000456. eCollection 2024 Oct.

DOI:10.1371/journal.pdig.0000456
PMID:39471154
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11521266/
Abstract

Meta-researchers commonly leverage tools that infer gender from first names, especially when studying gender disparities. However, tools vary in their accuracy, ease of use, and cost. The objective of this study was to compare the accuracy and cost of the commercial software Genderize and Gender API, and the open-source gender R package. Differences in binary gender prediction accuracy between the three services were evaluated. Gender prediction accuracy was tested on a multinational dataset of 32,968 gender-labeled clinical trial authors. Additionally, two datasets from previous studies with 5,779 and 6,131 names, respectively, were re-evaluated with modern implementations of Genderize and Gender API. The gender inference accuracy of Genderize and Gender API was compared, both with and without supplying trialists' country of origin in the API call. The accuracy of the gender R package was only evaluated without supplying countries of origin. The accuracy of Genderize, Gender API, and the gender R package was defined as the percentage of correct gender predictions. Accuracy differences between methods were evaluated using McNemar's test. Genderize and Gender API demonstrated 96.6% and 96.1% accuracy, respectively, when countries of origin were not supplied in the API calls. Genderize and Gender API achieved the highest accuracy when predicting the gender of German authors, with accuracies greater than 98%. Genderize and Gender API were least accurate with South Korean, Chinese, Singaporean, and Taiwanese authors, demonstrating below 82% accuracy. Genderize can provide similar accuracy to Gender API while being 4.85x less expensive. The gender R package achieved below 86% accuracy on the full dataset. In the replication studies, Genderize and Gender API demonstrated better performance than in the original publications. Our results indicate that Genderize and Gender API achieve similar accuracy on a multinational dataset. The gender R package is uniformly less accurate than Genderize and Gender API.
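The two measurements named in the abstract, accuracy as the percentage of correct predictions and McNemar's test for paired differences between tools, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy labels, the names `tool_a` and `tool_b`, and the choice of the exact (binomial) form of McNemar's test. The abstract does not say which variant of the test the authors used or how unclassified names were handled.

```python
from math import comb


def accuracy(predicted, actual):
    """Percentage of predictions that match the labeled gender."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


def mcnemar_exact(pred_a, pred_b, actual):
    """Two-sided exact McNemar test on paired correct/incorrect outcomes.

    Returns (b, c, p_value), where b counts names only tool A got right
    and c counts names only tool B got right. Names on which both tools
    agree in correctness do not enter the test.
    """
    rows = list(zip(pred_a, pred_b, actual))
    b = sum(pa == y and pb != y for pa, pb, y in rows)
    c = sum(pa != y and pb == y for pa, pb, y in rows)
    n = b + c
    if n == 0:
        return b, c, 1.0
    # Under the null hypothesis, discordant pairs split 50/50,
    # so min(b, c) follows a Binomial(n, 0.5) distribution.
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical toy data, not taken from the study.
actual = ["female", "male", "female", "male", "female", "male", "female", "male"]
tool_a = ["female", "male", "female", "male", "female", "male", "male", "male"]
tool_b = ["female", "male", "male", "female", "female", "male", "male", "male"]

print(accuracy(tool_a, actual))               # 87.5
print(accuracy(tool_b, actual))               # 62.5
print(mcnemar_exact(tool_a, tool_b, actual))  # (2, 0, 0.5)
```

McNemar's test suits this comparison because every tool is scored on the same names, so the outcomes are paired rather than independent. On a dataset the size of the one described (32,968 authors), a chi-square approximation would be a common alternative to the exact form shown here.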

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8816/11521266/258e608a9513/pdig.0000456.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8816/11521266/491761d47c77/pdig.0000456.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8816/11521266/9e45b800c3be/pdig.0000456.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8816/11521266/59fd17b5d5b6/pdig.0000456.g004.jpg

Similar articles

1
Inferring gender from first names: Comparing the accuracy of Genderize, Gender API, and the gender R package on authors of diverse nationality.
PLOS Digit Health. 2024 Oct 29;3(10):e0000456. doi: 10.1371/journal.pdig.0000456. eCollection 2024 Oct.
2
Using genderize.io to infer the gender of first names: how to improve the accuracy of the inference.
J Med Libr Assoc. 2021 Oct 1;109(4):609-612. doi: 10.5195/jmla.2021.1252.
3
Performance of gender detection tools: a comparative study of name-to-gender inference services.
J Med Libr Assoc. 2021 Jul 1;109(3):414-421. doi: 10.5195/jmla.2021.1185.
4
Erratum to "Performance of gender detection tools: a comparative study of name-to-gender inference services," 2021;109(3):414-21 and "Using genderize.io to infer the gender of first names: how to improve the accuracy of the inference," 2021;109(4):609-12.
J Med Libr Assoc. 2022 Apr 1;110(2):E32. doi: 10.5195/jmla.2022.1528.
5
How accurate are gender detection tools in predicting the gender for Chinese names? A study with 20,000 given names in Pinyin format.
J Med Libr Assoc. 2022 Apr 1;110(2):205-211. doi: 10.5195/jmla.2022.1289.
6
Does diversity beget diversity? A scientometric analysis of over 150,000 studies and 49,000 authors published in high-impact medical journals between 2007 and 2022.
medRxiv. 2024 Mar 22:2024.03.21.24304695. doi: 10.1101/2024.03.21.24304695.
7
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
8
Gender and Nationality Trends in Manuscripts Published in Prominent Gastroenterology Journals Between 1997 and 2017.
Dig Dis Sci. 2022 Feb;67(2):367-376. doi: 10.1007/s10620-021-07021-2. Epub 2021 May 18.
9
Persistent Gender Disparity in Authorship of Arthroscopic Surgery Research.
Arthrosc Sports Med Rehabil. 2025 Jan 10;7(2):101076. doi: 10.1016/j.asmr.2025.101076. eCollection 2025 Apr.
10
Diversity in the medical research ecosystem: a descriptive scientometric analysis of over 49 000 studies and 150 000 authors published in high-impact medical journals between 2007 and 2022.
BMJ Open. 2025 Jan 22;15(1):e086982. doi: 10.1136/bmjopen-2024-086982.

Cited by

1
Marked gender inequity in the invited speakers at the European College of Veterinary Surgeons annual scientific congress 2012-2022.
PLoS One. 2025 Sep 2;20(9):e0329147. doi: 10.1371/journal.pone.0329147. eCollection 2025.
2
Towards inclusive authorship: Analyzing author representation in PLOS Global Public Health front matter content.
PLOS Glob Public Health. 2025 Aug 18;5(8):e0005066. doi: 10.1371/journal.pgph.0005066. eCollection 2025.
3
Scientific publications that use promotional language in the abstract receive more citations and public attention.
Commun Psychol. 2025 Aug 5;3(1):118. doi: 10.1038/s44271-025-00293-8.
4
Women climate scientists are connected, productive, and successful but have shorter careers.
Proc Natl Acad Sci U S A. 2025 Jul;122(26):e2506023122. doi: 10.1073/pnas.2506023122. Epub 2025 Jun 23.
5
Gender equality in leadership of HIV care cascade clinical trials: A methodological study.
HIV Med. 2025 Sep;26(9):1356-1366. doi: 10.1111/hiv.70062. Epub 2025 Jun 20.
6
Gender Composition of Invited Speakers and Session Chairs at American Society for Apheresis Annual Meetings Between 2019 and 2024.
J Clin Apher. 2025 Apr;40(2):e70015. doi: 10.1002/jca.70015.
7
Comparative analysis of automatic gender detection from names: evaluating the stability and performance of ChatGPT Namsor, and Gender-API.
PeerJ Comput Sci. 2024 Oct 17;10:e2378. doi: 10.7717/peerj-cs.2378. eCollection 2024.
8
Gender Differences in Citation Rate: An Analysis of Randomized Controlled Trials in Nephrology High-Impact Journals Over Two Decades.
Clin J Am Soc Nephrol. 2024 Nov 1;19(11):1453-1460. doi: 10.2215/CJN.0000000000000511. Epub 2024 Aug 6.

References

1
A new era of the Asian clinical research network: a report from the ATLAS international symposium.
Jpn J Clin Oncol. 2023 Jun 29;53(7):619-628. doi: 10.1093/jjco/hyad033.
2
Scientific authorship by gender: trends before and during a global pandemic.
Humanit Soc Sci Commun. 2022;9(1):348. doi: 10.1057/s41599-022-01365-4. Epub 2022 Oct 4.
3
Global trends in oncology research: A mixed-methods study of publications and clinical trials from 2010 to 2019.
Cancer Rep (Hoboken). 2023 Jan;6(1):e1650. doi: 10.1002/cnr2.1650. Epub 2022 Jun 11.
4
How accurate are gender detection tools in predicting the gender for Chinese names? A study with 20,000 given names in Pinyin format.
J Med Libr Assoc. 2022 Apr 1;110(2):205-211. doi: 10.5195/jmla.2022.1289.
5
Using genderize.io to infer the gender of first names: how to improve the accuracy of the inference.
J Med Libr Assoc. 2021 Oct 1;109(4):609-612. doi: 10.5195/jmla.2021.1252.
6
State-Level Sexism and Gender Disparities in Health Care Access and Quality in the United States.
J Health Soc Behav. 2022 Mar;63(1):2-18. doi: 10.1177/00221465211058153. Epub 2021 Nov 18.
7
Performance of gender detection tools: a comparative study of name-to-gender inference services.
J Med Libr Assoc. 2021 Jul 1;109(3):414-421. doi: 10.5195/jmla.2021.1185.
8
Women's Experiences of Promotion and Tenure in Academic Medicine and Potential Implications for Gender Disparities in Career Advancement: A Qualitative Analysis.
JAMA Netw Open. 2021 Sep 1;4(9):e2125843. doi: 10.1001/jamanetworkopen.2021.25843.
9
Are female authors under-represented in primary healthcare and general internal medicine journals?
Br J Gen Pract. 2021 Jun 24;71(708):302. doi: 10.3399/bjgp21X716249. Print 2021 Jul.
10
Gender Disparity in Citations in High-Impact Journal Articles.
JAMA Netw Open. 2021 Jul 1;4(7):e2114509. doi: 10.1001/jamanetworkopen.2021.14509.