Biomarker Detection in Association Studies: Modeling SNPs Simultaneously via Logistic ANOVA.

Authors

Jung Yoonsuh, Huang Jianhua Z, Hu Jianhua

Affiliations

Department of Statistics, University of Waikato, Private Bag 3105, Hamilton 3240, New Zealand.

Department of Statistics, Texas A&M University, College Station, TX, USA, and Special Term Professor at ISEM, Capital University of Economics and Business, Beijing, China.

Publication information

J Am Stat Assoc. 2014 Dec 1;109(508):1355-1367. doi: 10.1080/01621459.2014.928217.

Abstract

In genome-wide association studies, the primary task is to detect biomarkers in the form of Single Nucleotide Polymorphisms (SNPs) that have nontrivial associations with a disease phenotype and some other important clinical/environmental factors. However, the extremely large number of SNPs compared with the sample size inhibits the application of classical methods such as multiple logistic regression. Currently the most commonly used approach is still to analyze one SNP at a time. In this paper, we propose to consider the genotypes of the SNPs simultaneously via a logistic analysis of variance (ANOVA) model, which expresses the logit-transformed mean of SNP genotypes as the sum of the SNP effects, the effects of the disease phenotype and/or other clinical variables, and the interaction effects. We use a reduced-rank representation of the interaction-effect matrix for dimensionality reduction, and employ the L1-penalty in a penalized likelihood framework to filter out the SNPs that have no associations. We develop a Majorization-Minimization algorithm for computational implementation. In addition, we propose a modified BIC criterion to select the penalty parameters and determine the rank number. The proposed method is applied to a Multiple Sclerosis data set and simulated data sets and shows promise in biomarker detection.
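
The model described in the abstract can be sketched in symbols as follows. This is only one possible reading of the abstract, not the paper's own formulation: the indices $j$ (SNP) and $k$ (phenotype or clinical group), the symbols $\pi_{jk}$, $\mu$, $\alpha_j$, $\beta_k$, $\gamma_{jk}$, the factor matrices $A$ and $B$, the rank $r$, and the tuning parameter $\lambda$ are all notation assumed here for illustration, as is the choice of which factor carries the penalty.

```latex
% Assumed notation: \pi_{jk} is the mean genotype (on the probability scale)
% of SNP j in phenotype/clinical group k.
\[
  \operatorname{logit}(\pi_{jk})
    = \mu + \alpha_j + \beta_k + \gamma_{jk},
  \qquad j = 1,\dots,J,\quad k = 1,\dots,K .
\]
% Reduced-rank representation of the interaction-effect matrix (rank r assumed small):
\[
  \Gamma = (\gamma_{jk}) = A B^{\top},
  \qquad A \in \mathbb{R}^{J \times r},\quad B \in \mathbb{R}^{K \times r}.
\]
% Penalized likelihood with an L1 penalty on the SNP-side factor (an assumption):
\[
  \min_{\mu,\,\alpha,\,\beta,\,A,\,B}\;
    -\ell(\mu,\alpha,\beta,A,B)
    + \lambda \sum_{j=1}^{J} \sum_{l=1}^{r} \lvert a_{jl} \rvert .
\]
```

Under this reading, a SNP whose row of $A$ is shrunk entirely to zero has no interaction with the phenotype or clinical variables and is filtered out as unassociated; $\lambda$ and $r$ would be the quantities chosen by the modified BIC criterion.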


Similar articles

Genome-wide association analysis by lasso penalized logistic regression.
Bioinformatics. 2009 Mar 15;25(6):714-21. doi: 10.1093/bioinformatics/btp041. Epub 2009 Jan 28.

References cited in this article

Singular Value Decomposition-based Alternative Splicing Detection.
J Am Stat Assoc. 2009 Sep 1;104(487):944-953. doi: 10.1198/jasa.2009.ap08283.
