Multiple regression methods show great potential for rare variant association tests.

Affiliation

Lady Davis Institute for Medical Research, Jewish General Hospital, Montreal, Quebec, Canada.

Publication information

PLoS One. 2012;7(8):e41694. doi: 10.1371/journal.pone.0041694. Epub 2012 Aug 8.

DOI:10.1371/journal.pone.0041694
PMID:22916111
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3420665/
Abstract

The investigation of associations between rare genetic variants and diseases or phenotypes has two goals. Firstly, the identification of which genes or genomic regions are associated, and secondly, discrimination of associated variants from background noise within each region. Over the last few years, many new methods have been developed which associate genomic regions with phenotypes. However, classical methods for high-dimensional data have received little attention. Here we investigate whether several classical statistical methods for high-dimensional data, namely ridge regression (RR), principal components regression (PCR), partial least squares regression (PLS), a sparse version of PLS (SPLS), and the LASSO, are able to detect associations with rare genetic variants. These approaches have been extensively used in statistics to identify the true associations in data sets containing many predictor variables. Using genetic variants identified in three genes that were Sanger sequenced in 1,998 individuals, we simulated continuous phenotypes under several different models, and we show that these feature selection and feature extraction methods can substantially outperform several popular methods for rare variant analysis. Furthermore, these approaches can identify which variants are contributing most to the model fit, and therefore both goals of rare variant analysis can be achieved simultaneously with the use of regression regularization methods. These methods are briefly illustrated with an analysis of adiponectin levels and variants in the ADIPOQ gene.
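
To make the idea of regression regularization concrete, the following is a minimal sketch of one of the methods named in the abstract, the LASSO, applied to a toy rare-variant data set. It is not the authors' pipeline: the genotypes are simulated rather than taken from Sanger sequencing, and the sample size, minor allele frequency, effect size, penalty `lam`, and the function names `simulate`, `soft_threshold` and `lasso` are all illustrative assumptions. The sketch fits the standard LASSO objective (1/2n)·||y − Xb||² + lam·||b||₁ by coordinate descent and reports which variants keep a non-zero coefficient, which is how a single fit can address both region-level association and variant-level discrimination.

```python
import random


def simulate(n=1000, p=12, causal=(0, 3), effect=2.0, maf=0.05, seed=1):
    """Hypothetical data: n people, p rare variants, additive causal effects."""
    rng = random.Random(seed)
    # Genotype = number of minor alleles (0, 1 or 2) carried at each variant.
    X = [[sum(rng.random() < maf for _ in range(2)) for _ in range(p)]
         for _ in range(n)]
    # Continuous phenotype = effect * causal allele count + Gaussian noise.
    y = [effect * sum(row[j] for j in causal) + rng.gauss(0.0, 1.0)
         for row in X]
    return X, y


def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - Xb||^2 + lam*||b||_1 (centred data)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    resid = [v - y_mean for v in y]          # residual when all betas are 0
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:             # monomorphic variant: skip
                continue
            # Covariance of variant j with the partial residual (j excluded).
            rho = (sum(Xc[i][j] * resid[i] for i in range(n)) / n
                   + col_ss[j] * beta[j])
            new_beta = soft_threshold(rho, lam) / col_ss[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new_beta
    return beta


if __name__ == "__main__":
    X, y = simulate()
    beta = lasso(X, y, lam=0.08)
    selected = [j for j, b in enumerate(beta) if b != 0.0]
    print("variants with non-zero coefficients:", selected)
```

In this toy setting the penalty sets the coefficients of non-causal variants to exactly zero, while the surviving coefficients are shrunk towards zero. In practice the penalty would be chosen by cross-validation rather than fixed by hand, and ridge regression, PCR, PLS and SPLS would each need their own fitting routine.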

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be38/3420665/d3ddf3c7b9d3/pone.0041694.g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be38/3420665/f3c60753c5fa/pone.0041694.g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be38/3420665/e61f3e36a6aa/pone.0041694.g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/be38/3420665/1b314f02baed/pone.0041694.g004.jpg
