A pooling strategy to effectively use genotype data in quantitative traits genome-wide association studies.

Affiliations

Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland.

Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland.

Publication information

Stat Med. 2018 Nov 30;37(27):4083-4095. doi: 10.1002/sim.7898. Epub 2018 Jul 12.

Abstract

The goal of quantitative trait genome-wide association studies is to identify associations between a phenotypic variable, such as a vitamin level, and genetic variants, often single-nucleotide polymorphisms. When funding limits the number of assays that can be performed to measure the phenotypic variable, a subgroup of subjects is often randomly selected from the genotype database and the phenotypic variable is then measured for each selected subject. Because only a proportion of the genotype data can be used, such simple random sampling may suffer a substantial loss of efficiency, especially when the number of assays is relatively small and the minor allele frequency (the frequency of the less common variant) is low. We propose a pooling strategy in which subjects in a randomly selected reference subgroup are aligned with randomly selected subjects from the remaining study subjects to form independent pools; blood samples from the subjects in each pool are mixed; and the level of the phenotypic variable is measured once for each pool. We demonstrate that the proposed pooling approach produces considerable gains in efficiency over simple random sampling for inference concerning the phenotype-genotype association, resulting in higher precision and power. The methods are illustrated using genotypic and phenotypic data from the Trinity Students Study, a quantitative genome-wide association study.
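The pool-formation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it does not cover their inference procedure, which the abstract does not specify. The function names (`form_pools`, `pooled_measurement`), the pool size of 2, the simulated genotypes and phenotype levels, and the assumption that a pooled assay returns the mean of the individual levels (equal-volume mixing) are all hypothetical choices made for illustration.

```python
import random


def form_pools(subject_ids, n_pools, pool_size, seed=0):
    """Form disjoint pools: one reference subject plus (pool_size - 1) partners.

    The reference subgroup and the partners are both drawn at random,
    without replacement, from the genotyped subjects.
    """
    if n_pools * pool_size > len(subject_ids):
        raise ValueError("not enough subjects to fill every pool")
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    reference = shuffled[:n_pools]                    # random reference subgroup
    partners = shuffled[n_pools:n_pools * pool_size]  # drawn from the remaining subjects
    pools = []
    for i, ref in enumerate(reference):
        start = i * (pool_size - 1)
        pools.append([ref] + partners[start:start + pool_size - 1])
    return pools


def pooled_measurement(pool, level):
    """Assumed assay model: equal-volume mixing, so the pool level is the mean."""
    return sum(level[s] for s in pool) / len(pool)


# Hypothetical simulated data: 1000 genotyped subjects, minor allele frequency 0.1.
rng = random.Random(1)
subjects = list(range(1000))
genotype = {s: sum(rng.random() < 0.1 for _ in range(2)) for s in subjects}
level = {s: 10.0 + 0.5 * genotype[s] + rng.gauss(0, 1) for s in subjects}

# A budget of 100 assays now draws on genotype data from 200 subjects.
pools = form_pools(subjects, n_pools=100, pool_size=2)
assays = [pooled_measurement(p, level) for p in pools]
```

Under these assumptions, the same 100 assays would cover only 100 subjects with simple random sampling, whereas pooling in pairs brings the genotypes of 200 subjects into the analysis.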

