A powerful statistical framework for generalization testing in GWAS, with application to the HCHS/SOL.

Authors

Sofer Tamar, Heller Ruth, Bogomolov Marina, Avery Christy L, Graff Mariaelisa, North Kari E, Reiner Alex P, Thornton Timothy A, Rice Kenneth, Benjamini Yoav, Laurie Cathy C, Kerr Kathleen F

Affiliations

Department of Biostatistics, University of Washington, Seattle, WA, USA.

Department of Statistics and Operations Research, Tel-Aviv University, Tel-Aviv, Israel.

Publication information

Genet Epidemiol. 2017 Apr;41(3):251-258. doi: 10.1002/gepi.22029. Epub 2017 Jan 15.

DOI:10.1002/gepi.22029

PMID:28090672

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5340573/

Abstract

In genome-wide association studies (GWAS), "generalization" is the replication of a genotype-phenotype association in a population with different ancestry than the population in which it was first identified. Current practices for declaring generalizations rely on testing associations while controlling the family-wise error rate (FWER) in the discovery study, then separately controlling error measures in the follow-up study. This approach does not guarantee control over the FWER or false discovery rate (FDR) of the generalization null hypotheses. It also fails to leverage the two-stage design to increase power for detecting generalized associations. We provide a formal statistical framework for quantifying the evidence of generalization that accounts for the (in)consistency between the directions of associations in the discovery and follow-up studies. We develop the directional generalization FWER (FWER_g) and FDR (FDR_g) controlling r-values, which are used to declare associations as generalized. This framework extends to generalization testing when applied to a published list of single nucleotide polymorphism (SNP)-trait associations. Our methods control FWER_g or FDR_g under various SNP selection rules based on P-values in the discovery study. We find that it is often beneficial to use a more lenient P-value threshold than the genome-wide significance threshold. In a GWAS of total cholesterol in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), when testing all SNPs with P-values < 5×10^-8 (15 genomic regions) for generalization in a large GWAS of whites, we generalized SNPs from 15 regions. But when testing all SNPs with P-values < 6.6×10^-5 (89 regions), we generalized SNPs from 27 regions.
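
As a rough illustration of what accounting for the direction of association can look like, the sketch below converts a follow-up test statistic into a one-sided P-value in the direction observed in the discovery study. This is not the r-value procedure, which the abstract does not spell out. The function names are hypothetical, and the sketch assumes the follow-up statistic is an approximately standard normal z-score under the null.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def two_sided_p(z: float) -> float:
    """Usual two-sided P-value for a z-score (assumed N(0, 1) under the null)."""
    return 2.0 * (1.0 - _STD_NORMAL.cdf(abs(z)))


def directional_p(z_followup: float, discovery_sign: int) -> float:
    """One-sided follow-up P-value in the direction seen in discovery.

    discovery_sign is +1 or -1, the sign of the discovery effect estimate
    (a hypothetical input for this illustration).
    """
    if discovery_sign not in (-1, 1):
        raise ValueError("discovery_sign must be +1 or -1")
    # Flip the follow-up z-score so that "same direction" is the upper tail.
    return 1.0 - _STD_NORMAL.cdf(discovery_sign * z_followup)


if __name__ == "__main__":
    z = 2.0
    print("two-sided:", two_sided_p(z))
    print("same direction as discovery:", directional_p(z, +1))
    print("opposite direction:", directional_p(z, -1))
```

When the follow-up effect points the same way as the discovery effect, the one-sided P-value is half the two-sided one. When it points the opposite way, the one-sided P-value is close to 1, so an inconsistent direction counts against generalization rather than for it.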
