A Lasso multi-marker mixed model for association mapping with population structure correction.

Affiliations

Machine Learning and Computational Biology Research Group, Max Planck Institute for Intelligent Systems and Max Planck Institute for Developmental Biology, 72076 Tübingen, Germany.

Publication information

Bioinformatics. 2013 Jan 15;29(2):206-14. doi: 10.1093/bioinformatics/bts669. Epub 2012 Nov 22.

DOI: 10.1093/bioinformatics/bts669
PMID: 23175758

Abstract

MOTIVATION

Exploring the genetic basis of heritable traits remains one of the central challenges in biomedical research. In traits with simple Mendelian architectures, single polymorphic loci explain a significant fraction of the phenotypic variability. However, many traits of interest seem to be subject to multifactorial control by groups of genetic loci. Accurate detection of such multivariate associations is non-trivial and often compromised by limited statistical power. At the same time, confounding influences, such as population structure, cause spurious association signals that result in false-positive findings.

RESULTS

We propose LMM-Lasso, a linear mixed model that allows for both multi-locus mapping and correction for confounding effects. Our approach is simple and free of tuning parameters; it effectively controls for population structure and scales to genome-wide datasets. LMM-Lasso simultaneously discovers likely causal variants and allows for multi-marker-based phenotype prediction from genotype. We demonstrate the practical use of LMM-Lasso in genome-wide association studies in Arabidopsis thaliana and linkage mapping in mouse, where our method achieves significantly more accurate phenotype prediction for 91% of the considered phenotypes. At the same time, our model dissects the phenotypic variability into components that result from individual single nucleotide polymorphism effects and population structure. Enrichment of known candidate genes suggests that the individual associations retrieved by LMM-Lasso are likely to be genuine.
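The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common way to combine a mixed model with the Lasso, not the authors' released code. It assumes a precomputed kinship matrix `K` that captures population structure, estimates the noise-to-genetic variance ratio `delta` on a null model by grid search, whitens genotypes and phenotype in the eigenbasis of `K`, and then runs a plain coordinate-descent Lasso on the whitened data. The names `fit_delta`, `lasso_cd`, `lmm_lasso`, the grid `DELTA_GRID`, and the hand-set penalty `alpha` are all hypothetical; how the penalty is chosen is outside the scope of this sketch.

```python
import numpy as np

# Assumed search grid for delta = sigma_e^2 / sigma_g^2 (illustrative choice).
DELTA_GRID = np.logspace(-5, 5, 101)


def fit_delta(Uy, S, grid=DELTA_GRID):
    """Grid-search delta under the null model y ~ N(0, sigma_g2 * (K + delta*I)).

    Uy is the phenotype rotated into the eigenbasis of K; S holds K's eigenvalues.
    Returns the best delta and the matching ML estimate of sigma_g2.
    """
    n = Uy.shape[0]
    best_nll, best_delta, best_sigma = np.inf, None, None
    for delta in grid:
        d = S + delta
        sigma_g2 = np.mean(Uy ** 2 / d)  # closed-form ML estimate given delta
        nll = 0.5 * (n * np.log(2 * np.pi * sigma_g2) + np.sum(np.log(d)) + n)
        if nll < best_nll:
            best_nll, best_delta, best_sigma = nll, delta, sigma_g2
    return best_delta, best_sigma


def lasso_cd(X, y, alpha, n_iter=100):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + alpha * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # residual y - X @ beta, kept up to date
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def lmm_lasso(X, y, K, alpha):
    """Whiten (X, y) with the fitted covariance sigma_g2 * (K + delta*I), then Lasso."""
    S, U = np.linalg.eigh(K)
    S = np.clip(S, 0.0, None)          # guard against tiny negative eigenvalues
    y = y - y.mean()                   # assumption: intercept handled by centering
    Uy = U.T @ y
    delta, sigma_g2 = fit_delta(Uy, S)
    w = 1.0 / np.sqrt(sigma_g2 * (S + delta))
    X_white = w[:, None] * (U.T @ X)   # rotate, then rescale each row
    y_white = w * Uy
    return lasso_cd(X_white, y_white, alpha), delta
```

Under these assumptions, calling `lmm_lasso(X, y, K, alpha)` with an `n x p` genotype matrix, a length-`n` phenotype, and an `n x n` kinship matrix returns sparse per-marker effect estimates together with the fitted `delta`. Because the whitening includes the estimated `sigma_g2`, the whitened phenotype has unit mean square, which keeps `alpha` on a comparable scale across traits.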

AVAILABILITY

Code is available at http://webdav.tuebingen.mpg.de/u/karsten/Forschung/research.html.

CONTACT

rakitsch@tuebingen.mpg.de, ippert@microsoft.com or stegle@ebi.ac.uk

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


