A novel application of data-consistent inversion to overcome spurious inference in genome-wide association studies.

Affiliations

Department of Mathematical and Statistical Sciences, University of Colorado Denver, Denver, Colorado, USA.

Department of Epidemiology, Colorado School of Public Health, Aurora, Colorado, USA.

Publication information

Genet Epidemiol. 2024 Sep;48(6):270-288. doi: 10.1002/gepi.22563. Epub 2024 Apr 21.

DOI: 10.1002/gepi.22563

PMID: 38644517

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11938999/

Abstract

Genome-wide association studies (GWAS) typically use linear or logistic regression models to identify associations between phenotypes (traits) and genotypes (genetic variants) of interest. However, regression under the additive assumption has potential limitations. First, the assumption that residuals are normally distributed rarely holds in practice, and deviation from normality increases the Type-I error rate. Second, a model built on the additive assumption ignores other genetic structures, such as dominant, recessive, and protective-risk cases. Ignoring these genetic structures may lead to spurious conclusions about the association between a variant and a trait. We propose an assumption-free model built upon data-consistent inversion (DCI), a recently developed measure-theoretic framework for uncertainty quantification. The proposed DCI-derived model builds a nonparametric distribution on model inputs that propagates to the distribution of observed data, without requiring the normality assumption on residuals that the regression model needs. This characteristic enables the proposed DCI-derived model to cover all genetic structures without relying on the additivity of the classic-GWAS model. Simulations and a replication GWAS with data from the COPDGene study demonstrate that this model controls the Type-I error rate at least as well as the classic-GWAS (additive linear model) approach while having similar or greater power to discover variants under different genetic modes of transmission.
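To make the additivity limitation concrete, the following is a minimal sketch, not the authors' DCI method, of how a single-variant linear regression depends on the genotype coding. The coding table, the function names `encode` and `slope_t_statistic`, and the toy data are all assumptions introduced for illustration; the protective-risk case is left out because the abstract does not define its coding.

```python
import math

# Hypothetical coding tables: minor-allele count (0, 1, 2) -> regressor value.
CODINGS = {
    "additive": (0, 1, 2),   # each minor allele adds the same effect
    "dominant": (0, 1, 1),   # one copy is enough for the full effect
    "recessive": (0, 0, 1),  # the effect needs two copies
}


def encode(genotypes, mode):
    """Map minor-allele counts to regressor values under the given mode."""
    table = CODINGS[mode]
    return [table[g] for g in genotypes]


def slope_t_statistic(x, y):
    """t statistic for the slope b in the least-squares fit y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


# Toy data (made up): the trait is shifted only for genotype 2,
# i.e. a recessive effect, plus a small fixed perturbation.
genotypes = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
noise = [0.1, -0.1, 0.2, -0.2] * 3
phenotype = [(1.0 if g == 2 else 0.0) + e for g, e in zip(genotypes, noise)]

t_additive = slope_t_statistic(encode(genotypes, "additive"), phenotype)
t_recessive = slope_t_statistic(encode(genotypes, "recessive"), phenotype)
print(f"additive t = {t_additive:.2f}, recessive t = {t_recessive:.2f}")
```

On this toy data the recessive coding yields the larger t statistic, which illustrates why a test tied to the additive coding can lose power when the true mode of transmission is different. The inference in both cases still rests on the residual-normality assumption that the abstract identifies as a limitation.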
