BLUPmrMLM: A Fast mrMLM Algorithm in Genome-wide Association Studies.

Affiliation

College of Plant Science and Technology, Huazhong Agricultural University, Wuhan 430070, China.

Publication information

Genomics Proteomics Bioinformatics. 2024 Sep 13;22(3). doi: 10.1093/gpbjnl/qzae020.

Abstract

Multilocus genome-wide association study has become the state-of-the-art tool for dissecting the genetic architecture of complex and multiomic traits. However, most existing multilocus methods require relatively long computational time when analyzing large datasets. To address this issue, in this study, we proposed a fast mrMLM method, namely, best linear unbiased prediction multilocus random-SNP-effect mixed linear model (BLUPmrMLM). First, genome-wide single-marker scanning in mrMLM was replaced by vectorized Wald tests based on the best linear unbiased prediction (BLUP) values of marker effects and their variances in BLUPmrMLM. Then, adaptive best subset selection (ABESS) was used to identify potentially associated markers on each chromosome to reduce computational time when estimating marker effects via empirical Bayes. Finally, shared memory and parallel computing schemes were used to reduce the computational time. In simulation studies, BLUPmrMLM outperformed GEMMA, EMMAX, mrMLM, and FarmCPU as well as the control method (BLUPmrMLM with ABESS removed), in terms of computational time, power, accuracy for estimating quantitative trait nucleotide positions and effects, false positive rate, false discovery rate, false negative rate, and F1 score. In the reanalysis of two large rice datasets, BLUPmrMLM significantly reduced the computational time and identified more previously reported genes, compared with the aforementioned methods. This study provides an excellent multilocus model method for the analysis of large-scale and multiomic datasets. The software mrMLM v5.1 is available at BioCode (https://ngdc.cncb.ac.cn/biocode/tool/BT007388) or GitHub (https://github.com/YuanmingZhang65/mrMLM).
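To make the first step more concrete, here is a minimal sketch of how Wald statistics for all markers could be computed in one vectorized pass from BLUP values and their variances. It is not the mrMLM v5.1 implementation. It assumes a simple ridge-type model with an intercept only, y = 1μ + Zu + e with u ~ N(0, σu² I) and e ~ N(0, σe² I), and it treats the two variance components as already known. The function name `blup_wald_scan` and all of its parameters are hypothetical, and the statistic used (squared BLUP divided by the variance of the BLUP) is one common choice that may differ from the formula in the paper.

```python
import math
import numpy as np


def blup_wald_scan(y, Z, sigma_u2, sigma_e2):
    """Wald statistic and p-value for every marker from ridge-type BLUPs.

    Illustrative assumptions (not taken from the paper):
      y = 1*mu + Z u + e,  u ~ N(0, sigma_u2 * I),  e ~ N(0, sigma_e2 * I),
      with both variance components supplied by the caller.
    """
    n = len(y)
    ones = np.ones((n, 1))
    # Phenotypic covariance implied by the assumed model.
    V = sigma_u2 * (Z @ Z.T) + sigma_e2 * np.eye(n)
    Vinv = np.linalg.inv(V)
    Vinv1 = Vinv @ ones
    # P = V^-1 - V^-1 1 (1' V^-1 1)^-1 1' V^-1 removes the intercept.
    P = Vinv - (Vinv1 @ Vinv1.T) / (ones.T @ Vinv1).item()
    PZ = P @ Z
    # BLUPs of all marker effects at once: u_hat = sigma_u2 * Z' P y.
    u_hat = sigma_u2 * (PZ.T @ y)
    # Var(u_hat) = sigma_u2^2 * Z' P Z; only the diagonal is needed.
    var_u = sigma_u2 ** 2 * np.einsum("ij,ij->j", Z, PZ)
    wald = u_hat ** 2 / var_u
    # Upper tail of a chi-square with 1 df: P(X > w) = erfc(sqrt(w / 2)).
    p_values = np.array([math.erfc(math.sqrt(w / 2.0)) for w in wald])
    return u_hat, wald, p_values


# Small simulated example: 200 individuals, 500 markers, one causal marker.
rng = np.random.default_rng(0)
Z_demo = rng.integers(0, 3, size=(200, 500)).astype(float)
Z_demo -= Z_demo.mean(axis=0)          # center each marker column
y_demo = 1.0 * Z_demo[:, 10] + rng.normal(size=200)
u_demo, wald_demo, p_demo = blup_wald_scan(y_demo, Z_demo, 0.01, 1.0)
print("marker with the largest Wald statistic:", int(np.argmax(wald_demo)))
```

In this toy setup the simulated causal marker (index 10) would be expected to rank at or near the top, although the variance components here are arbitrary rather than estimated. The point of the sketch is only that a single matrix product yields the effects and variances for every marker, so no per-marker model fit is needed.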

