
Gene- or region-based association study via kernel principal component analysis.

Affiliation

Department of Epidemiology and Health Statistics, School of Public Health, Shandong University, Jinan 250012, China.

Publication information

BMC Genet. 2011 Aug 26;12:75. doi: 10.1186/1471-2156-12-75.

Abstract

BACKGROUND

In genetic association studies, especially GWAS, gene- or region-based methods have become increasingly popular for detecting associations between multiple SNPs and diseases (or traits). Kernel principal component analysis combined with a logistic regression test (KPCA-LRT) has been used successfully to classify gene expression data. However, the purpose of an association study is to detect the correlation between genetic variation and disease rather than to classify samples, and genomic data are categorical rather than numerical. A kernel-based logistic regression model has recently been proposed for association studies; it projects the nonlinear original SNP data into a linear feature space, but it is still affected by multicollinearity between the projections, which may lead to a loss of power. We therefore propose a KPCA-LRT model to avoid this multicollinearity.
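The idea described above can be illustrated with a minimal sketch: compute a kernel matrix over the genotypes of one region, extract kernel principal components (which are mutually orthogonal, so the regressors are not collinear), and compare a logistic model containing those components against an intercept-only model with a likelihood ratio statistic. The abstract does not state the kernel, the rule for choosing the number of components, or the fitting procedure, so the RBF kernel, the `gamma` default, the 80% variance threshold, and all function names below are assumptions made for illustration only, not the authors' implementation.

```python
import numpy as np

def rbf_kernel(G, gamma=None):
    """RBF kernel on an n x p genotype matrix coded 0/1/2 (kernel choice is an assumption)."""
    sq = np.sum(G ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * G @ G.T, 0.0)
    if gamma is None:
        gamma = 1.0 / G.shape[1]  # hypothetical default bandwidth
    return np.exp(-gamma * d2)

def kernel_pcs(K, var_explained=0.8):
    """Scores on the leading kernel principal components of a centered kernel matrix."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    vals, vecs = np.linalg.eigh(H @ K @ H)
    order = np.argsort(vals)[::-1]           # sort eigenvalues in descending order
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > 1e-10                      # drop numerically zero components
    vals, vecs = vals[keep], vecs[:, keep]
    cum = np.cumsum(vals) / vals.sum()
    k = int(np.searchsorted(cum, var_explained) + 1)
    return vecs[:, :k] * np.sqrt(vals[:k])   # columns are mutually orthogonal

def logistic_loglik(X, y, n_iter=50, ridge=1e-8):
    """Maximized log-likelihood of a logistic model, fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (y - p)
        # Tiny ridge on the Hessian only, for numerical stability of the solve.
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + ridge * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    eta = X @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def kpca_lrt(G, y, var_explained=0.8):
    """Likelihood ratio statistic and degrees of freedom for one gene or region."""
    Z = kernel_pcs(rbf_kernel(G), var_explained)
    X0 = np.ones((len(y), 1))                # null model: intercept only
    X1 = np.hstack([X0, Z])                  # alternative: intercept + kernel PCs
    stat = 2.0 * (logistic_loglik(X1, y) - logistic_loglik(X0, y))
    return stat, Z.shape[1]
```

Under the null hypothesis the statistic would conventionally be referred to a chi-square distribution with the returned degrees of freedom, for example via `scipy.stats.chi2.sf(stat, df)`. Replacing `rbf_kernel` with the plain inner product `G @ G.T` reduces the sketch to an ordinary PCA-based test, which is the comparison method named in the abstract.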

RESULTS

Simulation results showed that KPCA-LRT was always more powerful than principal component analysis combined with a logistic regression test (PCA-LRT) across different sample sizes, significance levels, and relative risks, especially at the gene-wide significance level (1E-5) and at lower relative risks (RR = 1.2, 1.3). Application to four gene regions in the rheumatoid arthritis (RA) data from Genetic Analysis Workshop 16 (GAW16) indicated that KPCA-LRT performed better than both the single-locus test and PCA-LRT.

CONCLUSIONS

KPCA-LRT is a valid and powerful gene- or region-based method for analyzing GWAS data sets, especially under lower relative risks and lower significance levels.
