Stochastic Lanczos estimation of genomic variance components for linear mixed-effects models.

Affiliations

Institute for Behavioral Genetics, University of Colorado Boulder, Boulder, 80309, CO, USA.

Department of Psychology and Neuroscience, University of Colorado Boulder, Boulder, 80309, CO, USA.

Publication information

BMC Bioinformatics. 2019 Jul 30;20(1):411. doi: 10.1186/s12859-019-2978-z.

DOI:10.1186/s12859-019-2978-z
PMID:31362713
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6668092/
Abstract

BACKGROUND

Linear mixed-effects models (LMM) are a leading method in conducting genome-wide association studies (GWAS) but require residual maximum likelihood (REML) estimation of variance components, which is computationally demanding. Previous work has reduced the computational burden of variance component estimation by replacing direct matrix operations with iterative and stochastic methods and by employing loose tolerances to limit the number of iterations in the REML optimization procedure. Here, we introduce two novel algorithms, stochastic Lanczos derivative-free REML (SLDF_REML) and Lanczos first-order Monte Carlo REML (L_FOMC_REML), that exploit problem structure via the principle of Krylov subspace shift-invariance to speed computation beyond existing methods. Both novel algorithms only require a single round of computation involving iterative matrix operations, after which their respective objectives can be repeatedly evaluated using vector operations. Further, in contrast to existing stochastic methods, SLDF_REML can exploit precomputed genomic relatedness matrices (GRMs), when available, to further speed computation.
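
The shift-invariance idea mentioned above can be illustrated with a small sketch. It is not the authors' implementation. It assumes the covariance matrix can be written, up to scaling, as K + sigma*I, where K is a GRM-like positive semidefinite matrix and sigma is a variance ratio (that parameterization is an assumption here, not stated in the abstract). Because the Krylov subspace built from K and b is the same as the one built from K + sigma*I and b, one Lanczos run yields a basis Q and a tridiagonal matrix T, and each shifted system is then solved through T + sigma*I without further products with K. All function names are hypothetical, and the toy matrix is random data.

```python
import math
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def lanczos(A, b, m):
    """Run up to m Lanczos steps on symmetric A, starting from b.

    Returns the basis vectors Q, the diagonal alpha and the
    off-diagonal beta of the tridiagonal matrix T = Q^T A Q.
    """
    b_norm = math.sqrt(dot(b, b))
    Q = [[x / b_norm for x in b]]
    alpha, beta = [], []
    for j in range(m):
        w = matvec(A, Q[j])
        alpha.append(dot(w, Q[j]))
        # Full reorthogonalization (two passes) for numerical stability.
        for _ in range(2):
            for q in Q:
                c = dot(w, q)
                w = [wi - c * qi for wi, qi in zip(w, q)]
        w_norm = math.sqrt(dot(w, w))
        if j == m - 1 or w_norm < 1e-12:
            break
        beta.append(w_norm)
        Q.append([x / w_norm for x in w])
    return Q, alpha, beta

def solve_shifted(Q, alpha, beta, b_norm, sigma):
    """Approximate (A + sigma*I)^{-1} b from stored Lanczos output only.

    Solves (T + sigma*I) y = b_norm * e1 with the Thomas algorithm,
    then maps back with x = Q y. No product with A is needed.
    """
    k = len(alpha)
    d = [a + sigma for a in alpha]      # shifted diagonal of T
    r = [b_norm] + [0.0] * (k - 1)      # right-hand side b_norm * e1
    for i in range(1, k):               # forward elimination
        f = beta[i - 1] / d[i - 1]
        d[i] -= f * beta[i - 1]
        r[i] -= f * r[i - 1]
    y = [0.0] * k
    y[-1] = r[-1] / d[-1]
    for i in range(k - 2, -1, -1):      # back substitution
        y[i] = (r[i] - beta[i] * y[i + 1]) / d[i]
    n = len(Q[0])
    return [sum(y[i] * Q[i][row] for i in range(k)) for row in range(n)]

def demo(n=30, p=50, seed=0):
    """Relative residuals of the shifted solves for a few values of sigma."""
    rng = random.Random(seed)
    Z = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    K = [[dot(Z[i], Z[j]) / p for j in range(n)] for i in range(n)]  # toy GRM
    b = [rng.gauss(0, 1) for _ in range(n)]
    b_norm = math.sqrt(dot(b, b))
    Q, alpha, beta = lanczos(K, b, n)   # the only stage that multiplies by K
    residuals = {}
    for sigma in (0.1, 1.0, 10.0):
        x = solve_shifted(Q, alpha, beta, b_norm, sigma)
        Kx = matvec(K, x)               # used here only to check the answer
        res = [kx + sigma * xi - bi for kx, xi, bi in zip(Kx, x, b)]
        residuals[sigma] = math.sqrt(dot(res, res)) / b_norm
    return residuals
```

Calling `demo()` returns a dictionary of relative residuals, one per shift. In this toy setting the Lanczos run uses as many steps as the matrix dimension, so the solves are exact up to rounding; in practice far fewer steps would be used and the result would be an approximation.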

RESULTS

Results of numerical experiments are congruent with theory and demonstrate that interpreted-language implementations of both algorithms match or exceed existing compiled-language software packages in speed, accuracy, and flexibility.

CONCLUSIONS

Both the SLDF_REML and L_FOMC_REML algorithms outperform existing methods for REML estimation of variance components for LMM and are suitable for incorporation into existing GWAS LMM software implementations.
