Diversity and inclusion: A hidden additional benefit of Open Data.

Author information

Charpignon Marie-Laure, Celi Leo Anthony, Cobanaj Marisa, Eber Rene, Fiske Amelia, Gallifant Jack, Li Chenyu, Lingamallu Gurucharan, Petushkov Anton, Pierce Robin

Affiliations

Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America.

Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America.

Publication information

PLOS Digit Health. 2024 Jul 23;3(7):e0000486. doi: 10.1371/journal.pdig.0000486. eCollection 2024 Jul.

Abstract

The recent imperative by the National Institutes of Health to share scientific data publicly underscores a significant shift in academic research. Effective as of January 2023, it emphasizes that transparency in data collection and dedicated efforts towards data sharing are prerequisites for translational research, from the lab to the bedside. Given the role of data access in mitigating potential bias in clinical models, we hypothesize that researchers who leverage open-access datasets rather than privately owned ones are more diverse. In this brief report, we test this hypothesis in the transdisciplinary and expanding field of artificial intelligence (AI) for critical care. Specifically, we compared the diversity among authors of publications leveraging open datasets, such as the commonly used MIMIC and eICU databases, with that among authors of publications relying exclusively on private datasets, unavailable to other research investigators (e.g., electronic health records from ICU patients accessible only to Mayo Clinic analysts). To measure the extent of author diversity, we characterized gender balance as well as the presence of researchers from low- and middle-income countries (LMIC) and minority-serving institutions (MSI) located in the United States (US). Our comparative analysis revealed a greater contribution of authors from LMICs and MSIs among researchers leveraging open critical care datasets (treatment group) than among those relying exclusively on private data resources (control group). The participation of women was similar between the two groups, albeit slightly larger in the former. Notably, although over 70% of all articles included at least one author inferred to be a woman, less than 25% had a woman as a first or last author. Importantly, we found that the proportion of authors from LMICs was substantially higher in the treatment than in the control group (10.1% vs. 6.2%, p<0.001), including as first and last authors. Moreover, we found that the proportion of US-based authors affiliated with an MSI was 1.5 times higher among articles in the treatment than in the control group (8.6% vs. 5.6%, p<0.001), suggesting that open data resources attract a larger pool of participants from minority groups. Thus, our study highlights the valuable contribution of the Open Data strategy to underrepresented groups, while also quantifying persisting gender gaps in academic and clinical research at the intersection of computer science and healthcare. In doing so, we hope our work points to the importance of extending open data practices in deliberate and systematic ways.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb51/11265679/1a2f91ef51d8/pdig.0000486.g001.jpg
