
Is useful research data usually shared? An investigation of genome-wide association study summary statistics.

Affiliations

Statistical Cybermetrics Research Group, University of Wolverhampton, Wolverhampton, United Kingdom.

MRC Integrative Epidemiology Unit at the University of Bristol, Bristol, United Kingdom.

Publication information

PLoS One. 2020 Feb 21;15(2):e0229578. doi: 10.1371/journal.pone.0229578. eCollection 2020.

Abstract

Primary data collected during a research study is often shared and may be reused for new studies. To assess the extent of data sharing in favourable circumstances and whether data sharing checks can be automated, this article investigates summary statistics from primary human genome-wide association studies (GWAS). This type of data is highly suitable for sharing because it is a standard research output, is straightforward to use in future studies (e.g., for secondary analysis), and may be already stored in a standard format for internal sharing within multi-site research projects. Manual checks of 1799 articles from 2010 and 2017 matching a simple PubMed query for molecular epidemiology GWAS were used to identify 314 primary human GWAS papers. Of these, only 13% reported the location of a complete set of GWAS summary data, increasing from 3% in 2010 to 23% in 2017. Whilst information about whether data was shared was typically located clearly within a data availability statement, the exact nature of the shared data was usually unspecified. Thus, data sharing is the exception even in suitable research fields with relatively strong data sharing norms. Moreover, the lack of clear data descriptions within data sharing statements greatly complicates the task of automatically characterising shared data sets.



