HighDimMixedModels.jl: Robust high-dimensional mixed-effects models across omics data.

Authors

Evan Gorstein, Rosa Aghdam, Claudia Solís-Lemus

Affiliations

Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.

Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.

Publication information

PLoS Comput Biol. 2025 Jan 13;21(1):e1012143. doi: 10.1371/journal.pcbi.1012143. eCollection 2025 Jan.

DOI:10.1371/journal.pcbi.1012143

PMID:39804942

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11761659/

Abstract

High-dimensional mixed-effects models are an increasingly important form of regression in which the number of covariates rivals or exceeds the number of samples, which are collected in groups or clusters. The penalized likelihood approach to fitting these models relies on a coordinate descent algorithm that lacks guarantees of convergence to a global optimum. Here, we empirically study the behavior of this algorithm on simulated and real examples of three types of data that are common in modern biology: transcriptome, genome-wide association, and microbiome data. Our simulations provide new insights into the algorithm's behavior in these settings, and, comparing the performance of two popular penalties, we demonstrate that the smoothly clipped absolute deviation (SCAD) penalty consistently outperforms the least absolute shrinkage and selection operator (LASSO) penalty in terms of both variable selection and estimation accuracy across omics data. To empower researchers in biology and other fields to fit models with the SCAD penalty, we implement the algorithm in a Julia package, HighDimMixedModels.jl.
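
To make the comparison between the two penalties concrete, the following is a minimal sketch of the standard textbook LASSO and SCAD penalty functions for a single coefficient. It is written in Python for illustration only and is not the HighDimMixedModels.jl API. The function names and the default shape parameter a = 3.7 (a conventional choice in the SCAD literature) are assumptions, not details taken from the abstract.

```python
def lasso_penalty(beta, lam):
    """LASSO penalty: grows linearly in |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty for one coefficient (standard three-piece form).

    lam is the regularization strength; a > 2 controls where the
    penalty flattens. a = 3.7 is an assumed conventional default.
    """
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    b = abs(beta)
    if b <= lam:
        # Small coefficients: identical to the LASSO penalty.
        return lam * b
    if b <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * b - b * b - lam * lam) / (2 * (a - 1))
    # Large coefficients: the penalty is constant, so no further shrinkage.
    return lam * lam * (a + 1) / 2
```

In this sketch the two penalties coincide for |beta| <= lam, while for |beta| > a * lam the SCAD penalty stays constant and the LASSO penalty keeps growing. That flattening is the usual explanation for why SCAD biases large coefficients less than LASSO does.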
