Hung Hung, Huang Su-Yun
Institute of Epidemiology and Preventive Medicine, National Taiwan University, Taiwan.
Institute of Statistical Science, Academia Sinica, Taiwan.
Biometrics. 2019 Mar;75(1):245-255. doi: 10.1111/biom.12926. Epub 2018 Jul 27.
Sufficient dimension reduction (SDR) continues to be an active field of research. When estimating the central subspace (CS), inverse regression based SDR methods involve solving a generalized eigenvalue problem, which can be problematic under the large-p-small-n situation. In recent years, new techniques have emerged in numerical linear algebra, called randomized algorithms or random sketching, for high-dimensional and large scale problems. To overcome the large-p-small-n SDR problem, we combine the idea of statistical inference with random sketching to propose a new SDR method, called integrated random-partition SDR (iRP-SDR). Our method consists of the following three steps: (i) Randomly partition the covariates into subsets to construct an envelope subspace with low dimension. (ii) Obtain a sketch of the CS by applying a conventional SDR method within the constructed envelope subspace. (iii) Repeat the above two steps many times and integrate these multiple sketches to form the final estimate of the CS. After describing the details of these steps, the asymptotic properties of iRP-SDR are established. Unlike existing methods, iRP-SDR does not involve the determination of the structural dimension until the last stage, which makes it more adaptive to a high-dimensional setting. The advantageous performance of iRP-SDR is demonstrated via simulation studies and a practical example analyzing EEG data.
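The three steps above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and everything beyond the abstract is an assumption made for illustration: sliced inverse regression (SIR) stands in for the "conventional SDR method", the envelope is built from the leading eigenvectors of the SIR kernel restricted to each random block of covariates, each sketch is kept as an eigenvalue-weighted matrix so that no structural dimension is chosen early, and the sketches are integrated by simple averaging. The function names and the parameters `n_blocks`, `k_per_block`, `n_slices`, and `n_repeats` are hypothetical.

```python
import numpy as np


def sir_kernel(X, y, n_slices=5):
    """Center X and form the SIR kernel (weighted covariance of slice means)."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    order = np.argsort(y)
    K = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = Xc[idx].mean(axis=0)
        K += (len(idx) / n) * np.outer(m, m)
    return Xc, K


def one_sketch(Xc, K, rng, n_blocks, k_per_block):
    """Steps (i)-(ii): random partition, envelope basis, SIR inside the envelope.

    Assumes every block holds at least k_per_block covariates.
    """
    n, p = Xc.shape
    blocks = np.array_split(rng.permutation(p), n_blocks)
    Q = np.zeros((p, n_blocks * k_per_block))
    for j, idx in enumerate(blocks):
        # Assumed envelope: leading eigenvectors of the kernel on this block
        # (eigh returns eigenvalues in ascending order).
        _, vecs = np.linalg.eigh(K[np.ix_(idx, idx)])
        Q[idx, j * k_per_block:(j + 1) * k_per_block] = vecs[:, -k_per_block:]
    # Generalized eigenproblem (Q'KQ) v = lambda (Q'SQ) v, solved by whitening.
    Z = Xc @ Q
    S = Z.T @ Z / n
    w, U = np.linalg.eigh(S)
    keep = w > 1e-10 * w.max()
    W = U[:, keep] / np.sqrt(w[keep])  # W.T @ S @ W is the identity
    lam, V = np.linalg.eigh(W.T @ (Q.T @ K @ Q) @ W)
    B = Q @ W @ V  # directions mapped back to the p covariates
    # Eigenvalue-weighted sketch: no structural dimension is fixed here.
    return (B * lam) @ B.T


def integrated_partition_sir(X, y, d, n_repeats=50, n_blocks=4,
                             k_per_block=2, n_slices=5, seed=0):
    """Step (iii): average the sketches, then pick d directions at the end."""
    rng = np.random.default_rng(seed)
    Xc, K = sir_kernel(X, y, n_slices)
    p = X.shape[1]
    M = np.zeros((p, p))
    for _ in range(n_repeats):
        M += one_sketch(Xc, K, rng, n_blocks, k_per_block) / n_repeats
    _, vecs = np.linalg.eigh(M)
    return vecs[:, ::-1][:, :d]  # leading d eigenvectors, as columns
```

Calling `integrated_partition_sir(X, y, d=1)` on an `n x p` covariate matrix `X` and a response vector `y` returns a `p x 1` orthonormal basis estimate. In this sketch the envelope dimension `n_blocks * k_per_block` is what keeps each generalized eigenvalue problem small: it should stay below `n` so that the projected sample covariance remains invertible even when `p` exceeds `n`.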