Department of Radiology, University of California at San Diego, La Jolla, CA, USA.
Front Genet. 2012 Sep 27;3:190. doi: 10.3389/fgene.2012.00190. eCollection 2012.
Multivariate distance matrix regression (MDMR) analysis is a statistical technique that allows researchers to relate P variables to an additional M factors collected on N individuals, where P ≫ N. The technique can be applied to a number of research settings involving high-dimensional data types such as DNA sequence data, gene expression microarray data, and imaging data. MDMR analysis involves computing the distance between all pairs of individuals with respect to P variables of interest and constructing an N × N matrix whose elements reflect these distances. Permutation tests can be used to test linear hypotheses that consider whether or not the M additional factors collected on the individuals can explain variation in the observed distances between and among the N individuals as reflected in the matrix. Despite its appeal and utility, properties of the statistics used in MDMR analysis have not been explored in detail. In this paper we consider the level accuracy and power of MDMR analysis assuming different distance measures and analysis settings. We also describe the utility of MDMR analysis in assessing hypotheses about the appropriate number of clusters arising from a cluster analysis.
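The procedure described above (pairwise distances, an N × N matrix, and a permutation test on the M factors) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes Euclidean distance, a Gower-centred distance matrix, and the pseudo-F statistic commonly associated with MDMR, none of which the abstract specifies. All function names are hypothetical, and the predictors are orthonormalised by Gram-Schmidt only to avoid a matrix inverse in pure Python.

```python
import math
import random


def euclidean_distance_matrix(Y):
    """N x N matrix of Euclidean distances between the rows of Y (N x P)."""
    n = len(Y)
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(Y[i], Y[j])))
             for j in range(n)] for i in range(n)]


def gower_center(D):
    """Gower-centred matrix G = C A C, with A = -0.5 * D**2 and C = I - 11'/N."""
    n = len(D)
    A = [[-0.5 * D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in A]
    grand_mean = sum(row_mean) / n
    # A is symmetric, so its column means equal its row means.
    return [[A[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def orthonormal_basis(X):
    """Centre the columns of X (N x M), then orthonormalise them (Gram-Schmidt)."""
    n, m = len(X), len(X[0])
    basis = []
    for k in range(m):
        col = [X[i][k] for i in range(n)]
        mean = sum(col) / n
        v = [c - mean for c in col]
        for q in basis:
            proj = sum(a * b for a, b in zip(v, q))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-12:  # skip constant or collinear columns
            basis.append([a / norm for a in v])
    return basis


def pseudo_f(G, X):
    """Pseudo-F: (tr(HG) / m) / ((tr(G) - tr(HG)) / (N - m - 1))."""
    n = len(G)
    basis = orthonormal_basis(X)
    m = len(basis)
    if m == 0:
        raise ValueError("X has no non-constant columns")
    # Because G is centred, tr(HG) is the sum of q' G q over the basis vectors.
    explained = sum(
        sum(q[i] * G[i][j] * q[j] for i in range(n) for j in range(n))
        for q in basis
    )
    total = sum(G[i][i] for i in range(n))
    return (explained / m) / ((total - explained) / (n - m - 1))


def mdmr_permutation_test(Y, X, n_perm=999, seed=0):
    """Return the observed pseudo-F and a permutation p-value."""
    rng = random.Random(seed)
    G = gower_center(euclidean_distance_matrix(Y))
    observed = pseudo_f(G, X)
    rows = list(X)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(rows)  # break the link between factors and individuals
        if pseudo_f(G, rows) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `mdmr_permutation_test(Y, X)` with `Y` as an N × P list of rows and `X` as an N × M list of rows returns the statistic and its p-value. Swapping `euclidean_distance_matrix` for another distance function is the natural way to explore how the choice of distance measure affects level accuracy and power.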