Longitudinal assessment of demographic representativeness in the Medical Imaging and Data Resource Center open data commons.

Authors

Whitney Heather M, Baughan Natalie, Myers Kyle J, Drukker Karen, Gichoya Judy, Bower Brad, Chen Weijie, Gruszauskas Nicholas, Kalpathy-Cramer Jayashree, Koyejo Sanmi, Sá Rui C, Sahiner Berkman, Zhang Zi, Giger Maryellen L

Affiliations

University of Chicago, Chicago, Illinois, United States.

The Medical Imaging and Data Resource Center (midrc.org).

Publication information

J Med Imaging (Bellingham). 2023 Nov;10(6):061105. doi: 10.1117/1.JMI.10.6.061105. Epub 2023 Jul 18.

DOI:10.1117/1.JMI.10.6.061105
PMID:37469387
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10353566/
Abstract

PURPOSE

The Medical Imaging and Data Resource Center (MIDRC) open data commons was launched to accelerate the development of artificial intelligence (AI) algorithms to help address the COVID-19 pandemic. The purpose of this study was to quantify longitudinal representativeness of the demographic characteristics of the primary MIDRC dataset compared to the United States general population (US Census) and COVID-19 positive case counts from the Centers for Disease Control and Prevention (CDC).

APPROACH

The Jensen-Shannon distance (JSD), a measure of similarity of two distributions, was used to longitudinally measure the representativeness of the distribution of (1) all unique patients in the MIDRC data to the 2020 US Census and (2) all unique COVID-19 positive patients in the MIDRC data to the case counts reported by the CDC. The distributions were evaluated in the demographic categories of age at index, sex, race, ethnicity, and the combination of race and ethnicity.
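
The comparison described above can be illustrated with a minimal sketch of the Jensen-Shannon distance between two categorical distributions. This is not the authors' code: the function name `jensen_shannon_distance`, the base-2 logarithm (which bounds the distance between 0 and 1), and the example counts are all assumptions made for illustration.

```python
import math


def jensen_shannon_distance(p_counts, q_counts):
    """Jensen-Shannon distance (base-2 logarithm) between two categorical
    distributions given as counts or proportions over the same categories.

    Illustrative helper only; the name and the choice of base 2 are assumptions.
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("both distributions must cover the same categories")
    p_total, q_total = sum(p_counts), sum(q_counts)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("each distribution must have a positive total")

    # Normalize counts to probabilities.
    p = [x / p_total for x in p_counts]
    q = [x / q_total for x in q_counts]
    # Mixture distribution M = (P + Q) / 2.
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl_divergence(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0 because b is the mixture.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    divergence = 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
    # The distance is the square root of the divergence; clamp tiny negative rounding error.
    return math.sqrt(max(divergence, 0.0))


# Hypothetical counts per category (NOT data from the study).
dataset_counts = [5200, 4800]    # e.g. two sex categories in an imaging dataset
reference_counts = [5080, 4920]  # e.g. the same categories in a reference population

print(jensen_shannon_distance(dataset_counts, reference_counts))
```

Under these assumptions, a value of 0 means the two distributions are identical and a value of 1 means they share no categories, so smaller values indicate closer agreement with the reference population. Repeating the calculation on successive snapshots of a dataset would give a longitudinal view of the kind the abstract describes.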

RESULTS

Representativeness of the MIDRC data by ethnicity and the combination of race and ethnicity was impacted by the percentage of CDC case counts for which this was not reported. The distributions by sex and race have retained their level of representativeness over time.

CONCLUSION

The representativeness of the open medical imaging datasets in the curated public data commons at MIDRC has evolved over time as the number of contributing institutions and the overall number of subjects have grown. The use of metrics such as the JSD to support measurement of representativeness is one step needed for fair and generalizable AI algorithm development.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/5d5ecccb3f15/JMI-010-061105-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/678927a0bcb7/JMI-010-061105-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/b85480891d17/JMI-010-061105-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/4f4d502bf083/JMI-010-061105-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/52ad74dccc5b/JMI-010-061105-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/eb6eb1063c62/JMI-010-061105-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/ce1603e51a9d/JMI-010-061105-g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/910e/10353566/eb8e6049e463/JMI-010-061105-g008.jpg
