A data model for population descriptors in genomic research.

Authors

Khan Alyna T, Adebamowo Clement, Fullerton Stephanie M, Hirbo Jibril, Konigsberg Iain R, Kraft Peter, Martin Iman, Nelson Sarah C, Ramsay Michèle, Wojcik Genevieve L, Adebamowo Sally N, Conomos Matthew P, Darst Burcu F, Hysong Micah R, Li Yun, Martin Alicia R, Mathias Rasika A, Rich Stephen S, Sakoda Lori C, Schrider Daniel R, Sharma Jayati, Smith Johanna L, Sun Quan, Zhang Yuji, Gogarten Stephanie M

Affiliations

School of Engineering, Design, and Innovation, Pennsylvania State University, University Park, PA, USA.

Department of Epidemiology and Public Health, University of Maryland School of Medicine, Baltimore, MD, USA.

Publication information

Am J Hum Genet. 2025 Jul 3;112(7):1504-1514. doi: 10.1016/j.ajhg.2025.05.011. Epub 2025 Jun 12.

DOI:10.1016/j.ajhg.2025.05.011

PMID:40513563

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12256887/

Abstract

Population descriptors used in genetic studies have broad social and translational implications. There are no globally agreed-upon definitions or usages of common population descriptors (e.g., race, ethnicity, nationality, and tribe), many of which are applied ad hoc and/or derived from political or bureaucratic conventions. Recent recommendations have encouraged the retention of as much granularity in population descriptors as possible during data preparation, analysis, and interpretation of research results. However, genomic research infrastructures (i.e., current practices, resources, and workflows in genomic research) often lack systematic and flexible organization, structure, and harmonization of multifaceted and detailed population descriptor data. This can lead to loss of information, barriers to international collaboration, and potential issues in clinical translation. Here, we describe a data model, developed by the NIH-funded Polygenic Risk Methods in Diverse Populations (PRIMED) Consortium, that organizes and retains detailed population descriptor data for future research use. The model supports a versatile, traceable, and reproducible harmonization system that offers multiple benefits over existing data structures. This data model affords researchers the flexibility to thoughtfully choose and scientifically justify their choice of population descriptors. It avoids the conflation of social identities with biological categories and guards against harmful typological inferences. Genomic research tools of this kind will be crucial for producing scientifically robust findings that minimize potential harms of descriptor misuse while maximizing benefits for diverse communities.
