Ma Xin, Kundu Suprateek
Department of Biostatistics and Bioinformatics, Emory University.
Department of Biostatistics, The University of Texas MD Anderson Cancer Center.
J Am Stat Assoc. 2024;119(545):650-663. doi: 10.1080/01621459.2022.2140052. Epub 2022 Nov 17.
Recent medical imaging studies have given rise to distinct but inter-related datasets corresponding to multiple experimental tasks or longitudinal visits. Standard scalar-on-image regression models that fit each dataset separately are not equipped to leverage information across inter-related images, and existing multi-task learning approaches are compromised by their inability to account for the noise that is often observed in images. We propose a novel joint scalar-on-image regression framework that uses wavelet-based image representations with grouped penalties designed to pool information across inter-related images for joint learning, and that explicitly accounts for noise in high-dimensional images via a projection-based approach. In the presence of non-convexity arising from noisy images, we derive non-asymptotic error bounds under non-convex as well as convex grouped penalties, even when the number of voxels increases exponentially with sample size. A projected gradient descent algorithm is used for computation, and it is shown to approximate the optimal solution via well-defined non-asymptotic optimization error bounds under noisy images. Extensive simulations and an application to a motivating longitudinal Alzheimer's disease study illustrate significantly improved predictive ability and greater power to detect true signals, which are simply missed by existing methods without noise correction due to the attenuation-to-null phenomenon.
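To make the computational idea in the abstract concrete, the following is a minimal sketch of a noise-corrected, group-penalized gradient method for several related regression tasks. It is not the authors' implementation. It assumes that the observed features are the true features plus independent noise with a known common variance `sigma_u2`, that the grouped penalty is a convex group lasso tying each feature together across tasks, and that the projection step is a simple Frobenius-norm ball of radius `radius` (a simplification chosen for brevity). The function names, the parameters `lam`, `radius` and `step`, and the use of raw features in place of wavelet coefficients are all hypothetical choices made for illustration.

```python
import numpy as np


def group_soft_threshold(B, tau):
    """Shrink each row of B (one feature across all tasks) toward zero."""
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - tau / np.maximum(norms, 1e-12))
    return scale * B


def project_l2_ball(B, radius):
    """Project B onto the Frobenius-norm ball of the given radius."""
    norm = np.linalg.norm(B)
    return B if norm <= radius else B * (radius / norm)


def fit_noisy_multitask(W_list, y_list, sigma_u2, lam, radius, step, n_iter=500):
    """Illustrative sketch only (assumed setup, not the paper's algorithm).

    W_list[t] is the noisy (n_t x p) design for task t, y_list[t] its response.
    Returns a (p x T) coefficient matrix whose rows are shared groups.
    """
    p = W_list[0].shape[1]
    T = len(W_list)
    # Noise-corrected surrogate for X'X/n: subtract the assumed noise covariance.
    # This matrix can be indefinite when p > n, which makes the loss non-convex.
    grams = [W.T @ W / W.shape[0] - sigma_u2 * np.eye(p) for W in W_list]
    crosses = [W.T @ y / W.shape[0] for W, y in zip(W_list, y_list)]
    B = np.zeros((p, T))
    for _ in range(n_iter):
        # Gradient of the corrected quadratic loss, one column per task.
        grad = np.column_stack(
            [grams[t] @ B[:, t] - crosses[t] for t in range(T)]
        )
        # Gradient step, then group shrinkage, then projection onto the ball.
        B = group_soft_threshold(B - step * grad, step * lam)
        B = project_l2_ball(B, radius)
    return B
```

Calling `fit_noisy_multitask` with `sigma_u2=0.0` gives the uncorrected fit on the same data, which is a convenient baseline: in a toy simulation with noisy features, the uncorrected coefficients are shrunk toward zero relative to the corrected ones. The projection keeps the iterates bounded even when the corrected Gram matrix is indefinite, which is the role a side constraint plays in this kind of non-convex problem.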