Evaluating and sharing global genetic ancestry in biomedical datasets.

Affiliations

Health Department of Biomedical Informatics, University of California, San Diego, La Jolla, California, USA.

Moores Cancer Center, University of California, San Diego, La Jolla, California, USA.

Publication information

J Am Med Inform Assoc. 2019 May 1;26(5):457-461. doi: 10.1093/jamia/ocy194.

Abstract

Genetic ancestry is a critical co-factor to study phenotype-genotype associations using cohorts of human subjects. Most publicly available molecular datasets are, however, missing this information or only share self-reported race and ethnicity, representing a limitation to identify and repurpose datasets to investigate the contribution of ancestry to diseases and traits. We propose an analytical framework to enrich the metadata from publicly available cohorts with genetic ancestry information and a resulting diversity score at continental resolution, calculated directly from the data. We illustrate this framework using The Cancer Genome Atlas datasets searched through the DataMed Data Discovery Index. Data repositories and contributors can use this framework to provide genetic diversity measurements for controlled access datasets, minimizing the work involved in requesting a dataset that may ultimately prove inadequate for a researcher's purpose. With the increasing global scale of human genetics research, studies on disease risk and susceptibility would benefit greatly from the adequate estimation and sharing of genetic diversity in publicly available datasets following a framework such as the one presented.
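The abstract does not state how the diversity score is computed. As a rough illustration only, the sketch below assumes one plausible choice: normalized Shannon entropy over per-sample continental ancestry labels. The function name `diversity_score`, the five group codes (borrowed from the 1000 Genomes super-population naming), and the normalization are assumptions for illustration, not the authors' published method.

```python
import math
from collections import Counter

# Assumed continental groups (1000 Genomes super-population codes);
# the abstract does not list the groups the framework uses.
CONTINENTAL_GROUPS = ("AFR", "AMR", "EAS", "EUR", "SAS")


def diversity_score(labels, groups=CONTINENTAL_GROUPS):
    """Normalized Shannon entropy of ancestry labels.

    Returns 0.0 when every sample falls in one group and 1.0 when
    samples are spread evenly across all groups. Hypothetical metric,
    not taken from the paper.
    """
    counts = Counter(labels)
    unknown = set(counts) - set(groups)
    if unknown:
        raise ValueError(f"unexpected ancestry labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples provided")
    if len(groups) < 2:
        return 0.0
    # Shannon entropy over the observed group proportions.
    entropy = -sum(
        (c / total) * math.log(c / total) for c in counts.values()
    )
    # Divide by the maximum possible entropy for this number of groups.
    return entropy / math.log(len(groups))


if __name__ == "__main__":
    cohort = ["EUR"] * 80 + ["AFR"] * 10 + ["EAS"] * 10
    print(round(diversity_score(cohort), 3))
```

A repository could publish a single number like this alongside a controlled-access dataset's metadata, so that a researcher can judge cohort diversity before requesting access.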

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8a9e/6433181/5ee4304d5780/ocy194f1.jpg
