A multivariate regression approach to association analysis of a quantitative trait network.

Author information

Kim Seyoung, Sohn Kyung-Ah, Xing Eric P

Affiliation

School of Computer Science, Carnegie Mellon University, Pittsburgh, USA.

Publication information

Bioinformatics. 2009 Jun 15;25(12):i204-12. doi: 10.1093/bioinformatics/btp218.

DOI:10.1093/bioinformatics/btp218

PMID:19477989

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2687972/

Abstract

MOTIVATION

Many complex disease syndromes such as asthma consist of a large number of highly related, rather than independent, clinical phenotypes, raising a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. Although a causal genetic variation may influence a group of highly correlated traits jointly, most of the previous association analyses considered each phenotype separately, or combined results from a set of single-phenotype analyses.

RESULTS

We propose a new statistical framework called graph-guided fused lasso to address this issue in a principled way. Our approach represents the dependency structure among the quantitative traits explicitly as a network, and leverages this trait network to encode structured regularizations in a multivariate regression model over the genotypes and traits, so that the genetic markers that jointly influence subgroups of highly correlated traits can be detected with high sensitivity and specificity. While most of the traditional methods examined each phenotype independently, our approach analyzes all of the traits jointly in a single statistical method to discover the genetic markers that perturb a subset of correlated traits jointly rather than a single trait. Using simulated datasets based on the HapMap consortium data and an asthma dataset, we compare the performance of our method with the single-marker analysis, and other sparse regression methods that do not use any structural information in the traits. Our results show that there is a significant advantage in detecting the true causal single nucleotide polymorphisms when we incorporate the correlation pattern in traits using our proposed methods.
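The abstract does not state the objective function, so the following is only a minimal sketch of how a graph-guided fusion penalty could be evaluated, assuming the commonly described form: a squared-error loss over all traits, a lasso penalty on every coefficient, and a fusion term that pulls together the coefficients of traits linked by an edge in the trait network. The function names (`gflasso_objective`, `correlation_edges`), the use of thresholded Pearson correlations to build the network, and the choice of `|r|` as the edge weight are illustrative assumptions, not the API of the GFlasso software.

```python
import numpy as np


def correlation_edges(Y, threshold):
    """Build a trait network by thresholding pairwise Pearson correlations.

    Y is an (n_samples, n_traits) matrix. Returns a list of (m, l, r_ml)
    tuples for trait pairs whose |correlation| exceeds the threshold.
    (Thresholding is an assumed way of constructing the network.)
    """
    R = np.corrcoef(Y, rowvar=False)
    K = R.shape[0]
    return [
        (m, l, R[m, l])
        for m in range(K)
        for l in range(m + 1, K)
        if abs(R[m, l]) > threshold
    ]


def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Evaluate a graph-guided fused lasso style objective (assumed form).

    X: (n_samples, n_markers) genotype matrix
    Y: (n_samples, n_traits) trait matrix
    B: (n_markers, n_traits) regression coefficients
    edges: list of (m, l, r_ml) trait-network edges
    lam: weight of the lasso (sparsity) penalty
    gamma: weight of the fusion penalty
    """
    # Squared-error loss summed over all samples and traits.
    loss = np.sum((Y - X @ B) ** 2)
    # Lasso penalty: encourages most markers to have zero effect.
    sparsity = lam * np.sum(np.abs(B))
    # Fusion penalty: for each edge, encourage a marker's effects on the two
    # linked traits to agree, with the sign flipped for negative correlation.
    # Weighting by |r| is one assumed choice of edge weight.
    fusion = sum(
        abs(r) * np.sum(np.abs(B[:, m] - np.sign(r) * B[:, l]))
        for m, l, r in edges
    )
    return loss + sparsity + gamma * fusion
```

In this sketch, a coefficient matrix in which a marker affects two correlated traits equally incurs no fusion penalty, whereas one in which the marker affects only one of the two traits is penalized. This is the mechanism by which markers that jointly influence a subgroup of correlated traits would be favored.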

AVAILABILITY

Software for GFlasso is available at http://www.sailing.cs.cmu.edu/gflasso.html.
