

Genetic association analysis under complex survey sampling: the Hispanic Community Health Study/Study of Latinos.

Affiliations

Department of Biostatistics, University of North Carolina, Chapel Hill, NC 27599-7420, USA.


Publication information

Am J Hum Genet. 2014 Dec 4;95(6):675-88. doi: 10.1016/j.ajhg.2014.11.005.

Abstract

The cohort design allows investigators to explore the genetic basis of a variety of diseases and traits in a single study while avoiding major weaknesses of the case-control design. Most cohort studies employ multistage cluster sampling with unequal probabilities to conveniently select participants with desired characteristics, and participants from different clusters might be genetically related. Analysis that ignores the complex sampling design can yield biased estimation of the genetic association and inflation of the type I error. Herein, we develop weighted estimators that reflect unequal selection probabilities and differential nonresponse rates, and we derive variance estimators that properly account for the sampling design and the potential relatedness of participants in different sampling units. We compare, both analytically and numerically, the performance of the proposed weighted estimators with unweighted estimators that disregard the sampling design. We demonstrate the usefulness of the proposed methods through analysis of MetaboChip data in the Hispanic Community Health Study/Study of Latinos, which is the largest health study of the Hispanic/Latino population in the United States aimed at identifying risk factors for various diseases and determining the role of genes and environment in the occurrence of diseases. We provide guidelines on the use of weighted and unweighted estimators, as well as the relevant software.
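The idea of a weighted estimator paired with a design-aware variance can be illustrated with a minimal sketch. It is not the authors' software or their exact estimator: it assumes a linear trait model, known selection probabilities, and a single level of clustering, and it ignores stratification, nonresponse adjustment, and relatedness across sampling units, all of which the abstract says the proposed methods handle. The function name `weighted_linear_fit` and the simulated data are hypothetical.

```python
import numpy as np

def weighted_linear_fit(X, y, w, cluster):
    """Inverse-probability-weighted least squares with a cluster-robust
    sandwich variance (illustrative sketch, not the paper's estimator)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    cluster = np.asarray(cluster)

    # Solve the weighted estimating equation: sum_i w_i x_i (y_i - x_i' beta) = 0.
    XtW = X.T * w                       # p x n, each column scaled by its weight
    bread = XtW @ X                     # p x p
    beta = np.linalg.solve(bread, XtW @ y)

    # Weighted score contribution of each participant (n x p).
    scores = (w * (y - X @ beta))[:, None] * X

    # Sum scores within each cluster, then accumulate outer products.
    meat = np.zeros_like(bread)
    for c in np.unique(cluster):
        s = scores[cluster == c].sum(axis=0)
        meat += np.outer(s, s)

    bread_inv = np.linalg.inv(bread)
    cov = bread_inv @ meat @ bread_inv  # sandwich covariance
    return beta, np.sqrt(np.diag(cov))

# Simulated example (all values hypothetical): selection depends on the trait,
# so an unweighted fit on the sampled participants would generally be biased.
rng = np.random.default_rng(0)
n = 2000
psu = rng.integers(0, 100, size=n)                 # primary sampling unit labels
g = rng.binomial(2, 0.3, size=n).astype(float)     # genotype coded 0/1/2
y = 1.0 + 0.5 * g + rng.normal(size=n)             # quantitative trait
p_sel = np.where(y > 1.5, 0.8, 0.2)                # unequal selection probabilities
sampled = rng.random(n) < p_sel

X = np.column_stack([np.ones(sampled.sum()), g[sampled]])
beta_hat, se = weighted_linear_fit(X, y[sampled], 1.0 / p_sel[sampled], psu[sampled])
print("weighted estimate of genetic effect:", beta_hat[1], "SE:", se[1])
```

With equal weights the point estimate reduces to ordinary least squares, which is a convenient sanity check; the weights matter only when selection probabilities differ across participants.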




