HighDimMixedModels.jl: Robust high-dimensional mixed-effects models across omics data.

Authors

Gorstein Evan, Aghdam Rosa, Solís-Lemus Claudia

Affiliations

Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.

Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.

Publication information

PLoS Comput Biol. 2025 Jan 13;21(1):e1012143. doi: 10.1371/journal.pcbi.1012143. eCollection 2025 Jan.

DOI:10.1371/journal.pcbi.1012143
PMID:39804942
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11761659/

Abstract

High-dimensional mixed-effects models are an increasingly important form of regression in which the number of covariates rivals or exceeds the number of samples, which are collected in groups or clusters. The penalized likelihood approach to fitting these models relies on a coordinate descent algorithm that lacks guarantees of convergence to a global optimum. Here, we empirically study the behavior of this algorithm on simulated and real examples of three types of data that are common in modern biology: transcriptome, genome-wide association, and microbiome data. Our simulations provide new insights into the algorithm's behavior in these settings, and, comparing the performance of two popular penalties, we demonstrate that the smoothly clipped absolute deviation (SCAD) penalty consistently outperforms the least absolute shrinkage and selection operator (LASSO) penalty in terms of both variable selection and estimation accuracy across omics data. To empower researchers in biology and other fields to fit models with the SCAD penalty, we implement the algorithm in a Julia package, HighDimMixedModels.jl.
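
To make the comparison between the two penalties concrete, the following is a minimal Python sketch of the standard univariate coordinate-descent updates for LASSO (soft thresholding) and SCAD. It is not the HighDimMixedModels.jl implementation and ignores the random-effects part of the model entirely. It assumes a single standardized covariate, the conventional SCAD shape parameter a = 3.7, and illustrative function names (`soft_threshold`, `scad_penalty`, `scad_threshold`) that do not come from the paper.

```python
import math


def soft_threshold(z, lam):
    """LASSO update for one standardized covariate: shrink z toward zero by lam."""
    return math.copysign(max(abs(z) - lam, 0.0), z)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty value: linear near zero, quadratic transition, then constant."""
    b = abs(beta)
    if b <= lam:
        return lam * b
    if b <= a * lam:
        return (2.0 * a * lam * b - b * b - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


def scad_threshold(z, lam, a=3.7):
    """SCAD update for one standardized covariate (assumes a > 2).

    Small inputs are soft-thresholded as in LASSO, mid-range inputs are
    shrunk less, and large inputs are left unshrunk.
    """
    b = abs(z)
    if b <= 2.0 * lam:
        return soft_threshold(z, lam)
    if b <= a * lam:
        return soft_threshold(z, a * lam / (a - 1.0)) / (1.0 - 1.0 / (a - 1.0))
    return z


if __name__ == "__main__":
    lam = 1.0
    for z in (0.5, 1.5, 3.0, 5.0):
        print(z, soft_threshold(z, lam), scad_threshold(z, lam))
```

For large unpenalized estimates the LASSO update always subtracts `lam`, whereas the SCAD update returns the input unchanged. This reduced shrinkage of strong signals is one commonly cited reason SCAD can give more accurate estimates, which is consistent with the comparison reported in the abstract.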



