
SLEMM: million-scale genomic predictions with window-based SNP weighting.

Affiliations

Department of Animal Science, North Carolina State University, Raleigh, NC 27695, United States.

Animal Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD 20705, United States.

Publication information

Bioinformatics. 2023 Mar 1;39(3). doi: 10.1093/bioinformatics/btad127.

DOI: 10.1093/bioinformatics/btad127
PMID: 36897019
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10039786/

Abstract

MOTIVATION

The amount of genomic data is increasing exponentially. Using many genotyped and phenotyped individuals for genomic prediction is appealing yet challenging.

RESULTS

We present SLEMM (short for Stochastic-Lanczos-Expedited Mixed Models), a new software tool, to address the computational challenge. SLEMM builds on an efficient implementation of the stochastic Lanczos algorithm for REML in a framework of mixed models. We further implement SNP weighting in SLEMM to improve its predictions. Extensive analyses on seven public datasets, covering 19 polygenic traits in three plant and three livestock species, showed that SLEMM with SNP weighting had overall the best predictive ability among a variety of genomic prediction methods including GCTA's empirical BLUP, BayesR, KAML, and LDAK's BOLT and BayesR models. We also compared the methods using nine dairy traits of ∼300k genotyped cows. All had overall similar prediction accuracies, except that KAML failed to process the data. Additional simulation analyses on up to 3 million individuals and 1 million SNPs showed that SLEMM had better computational performance than its counterparts. Overall, SLEMM can do million-scale genomic predictions with an accuracy comparable to BayesR.

AVAILABILITY AND IMPLEMENTATION

The software is available at https://github.com/jiang18/slemm.
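
The abstract refers to a stochastic Lanczos algorithm for REML. As a rough illustration of that family of techniques, and not SLEMM's actual implementation, the sketch below uses stochastic Lanczos quadrature to estimate log det(V), one of the terms in a REML log-likelihood, using only matrix-vector products. The toy matrix V = 2G + I, the function names `lanczos` and `slq_logdet`, and the probe and step counts are all assumptions made for this example.

```python
import numpy as np

def lanczos(matvec, v0, m):
    """Run up to m Lanczos steps from v0; return the tridiagonal entries.

    alpha holds the diagonal and beta the off-diagonal of T.
    """
    n = v0.shape[0]
    Q = np.zeros((m, n))              # Lanczos basis vectors, one per row
    alpha = np.zeros(m)
    beta = np.zeros(m - 1)
    Q[0] = v0 / np.linalg.norm(v0)
    for j in range(m):
        w = matvec(Q[j])
        alpha[j] = Q[j] @ w
        # Full reorthogonalization against all basis vectors so far
        # (simple and stable; fine for a small illustrative example).
        w -= Q[: j + 1].T @ (Q[: j + 1] @ w)
        if j == m - 1:
            break
        b = np.linalg.norm(w)
        if b < 1e-12:                 # invariant subspace found, stop early
            return alpha[: j + 1], beta[:j]
        beta[j] = b
        Q[j + 1] = w / b
    return alpha, beta

def slq_logdet(matvec, n, num_probes=100, m=30, seed=0):
    """Estimate log det(A) = tr(log A) for a symmetric positive definite A."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(num_probes):
        z = rng.choice([-1.0, 1.0], size=n)       # Rademacher probe vector
        alpha, beta = lanczos(matvec, z, m)
        T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
        theta, Y = np.linalg.eigh(T)
        # Gauss quadrature: z' log(A) z ~ ||z||^2 * sum_k Y[0,k]^2 log(theta_k),
        # and ||z||^2 = n for a Rademacher vector.
        total += n * np.sum(Y[0] ** 2 * np.log(theta))
    return total / num_probes

# Toy example (assumed setup): G is built from random stand-in "genotypes".
rng = np.random.default_rng(1)
n, p = 200, 500
Z = rng.standard_normal((n, p))
G = Z @ Z.T / p
V = 2.0 * G + np.eye(n)

estimate = slq_logdet(lambda v: V @ v, n)
exact = np.linalg.slogdet(V)[1]
print(f"SLQ estimate: {estimate:.2f}, exact: {exact:.2f}")
```

Because the estimator touches V only through `matvec`, the same idea can in principle be applied without ever forming V explicitly, for example by multiplying with the genotype matrix and its transpose in turn. That is the property that makes this kind of method attractive at large sample sizes.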


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/58fe/10039786/bbe2aeac2dd9/btad127f1.jpg

