Bias due to two-stage residual-outcome regression analysis in genetic association studies.

Affiliation

Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts 02118, USA.

Publication information

Genet Epidemiol. 2011 Nov;35(7):592-6. doi: 10.1002/gepi.20607. Epub 2011 Jul 18.

Abstract

Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual-outcome or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, a residual outcome is first calculated from a regression of the outcome variable on the covariates; the relationship between this adjusted outcome and the SNP is then evaluated by a simple linear regression of the adjusted outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach yields a biased estimate of the genotypic effect and a loss of power. The bias is always toward the null and increases with the squared correlation between the SNP and the covariate (r²). For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation of the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when the independent variables are highly correlated. While a useful alternative to MLR under r² = 0, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided.
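The attenuation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: the function names (`ols`, `compare_methods`), the additive 0/1/2 SNP coding, the allele frequency of 0.3, the effect sizes, and the normal noise are all assumptions chosen for illustration. It fits both MLR and the two-stage residual-outcome regression to the same simulated data and returns the two SNP effect estimates together with the sample r² between the SNP and the covariate.

```python
import numpy as np


def ols(X, y):
    """Least-squares coefficients for y ~ X (X must already contain an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def compare_methods(r2, beta=1.0, gamma=1.0, n=20000, seed=0):
    """Return (MLR estimate, two-stage estimate, sample r^2) for one simulated data set.

    All parameter values are illustrative assumptions, not taken from the article.
    """
    rng = np.random.default_rng(seed)
    ones = np.ones(n)

    # Hypothetical SNP coded as a 0/1/2 allele count (allele frequency 0.3).
    g = rng.binomial(2, 0.3, size=n).astype(float)

    # Covariate constructed so that its squared correlation with g is roughly r2.
    g_std = (g - g.mean()) / g.std()
    x = np.sqrt(r2) * g_std + np.sqrt(1.0 - r2) * rng.standard_normal(n)

    # Quantitative trait depending on both the SNP and the covariate.
    y = beta * g + gamma * x + rng.standard_normal(n)

    # Multiple linear regression: outcome on SNP and covariate jointly.
    beta_mlr = ols(np.column_stack([ones, g, x]), y)[1]

    # Stage 1: regress the outcome on the covariate and keep the residuals.
    X1 = np.column_stack([ones, x])
    resid = y - X1 @ ols(X1, y)

    # Stage 2: simple linear regression of the residual outcome on the SNP.
    beta_two_stage = ols(np.column_stack([ones, g]), resid)[1]

    r2_sample = np.corrcoef(g, x)[0, 1] ** 2
    return beta_mlr, beta_two_stage, r2_sample


if __name__ == "__main__":
    for target in (0.0, 0.1, 0.5):
        b_mlr, b_ts, r2s = compare_methods(target)
        print(f"target r2={target:.1f}  sample r2={r2s:.3f}  "
              f"MLR={b_mlr:.3f}  two-stage={b_ts:.3f}")
```

With a single covariate, the two-stage estimate equals the MLR estimate multiplied by (1 − r²), where r² is the sample squared correlation, which is consistent with the 0, 10, and 50% attenuation quoted in the abstract. The simulated estimates will vary with the seed and sample size.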

