Coste Joël, Bouée Stéphane, Ecosse Emmanuel, Leplège Alain, Pouchot Jacques
Département de Biostatistique et d'Informatique Médicale, Pavillon Saint Jacques, Hôpital Cochin, Paris, France.
Qual Life Res. 2005 Apr;14(3):641-54. doi: 10.1007/s11136-004-1260-6.
During the early steps of the construction of composite health measures, principal component analysis (PCA) is commonly used to identify 'latent' factors that underlie observed variables and to determine the dimensionality of the instruments. The choice of the number of components to retain is critical to PCA: it markedly influences the factorial model identified and, in turn, the validity of the constructed instrument. However, many researchers developing composite health measures seem to be unaware of the importance of this choice. This paper illustrates (1) the variability of the factorial models obtained when different published rules (n = 10) are used to determine the number of components to retain in PCA applied to two quality-of-life datasets, and (2) the value of the careful and diversified approach to this problem that we suggest in place of the unsatisfactory 'rule-of-thumb' that many researchers use. The suggested approach involves: (1) using robust rules (including parallel analysis and the minimum average partial procedure) to generate a set of possible values for the number of components to retain, (2) repeating the analysis across samples, (3) comprehensively assessing the models obtained, and (4) considering methods complementary to PCA, especially confirmatory factor analysis.
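As an illustration of one of the robust rules named in the abstract, the following is a minimal sketch of parallel analysis written with NumPy. It is not the authors' implementation: the function name `parallel_analysis`, the number of simulations, the 95th-percentile threshold, and the synthetic two-factor dataset are all assumptions made for the example. The sketch compares the eigenvalues of the observed correlation matrix with those obtained from random normal data of the same shape, and retains the leading components whose eigenvalues exceed the random-data threshold.

```python
import numpy as np


def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Sketch of parallel analysis (hypothetical helper, not from the paper).

    data: array of shape (n_subjects, n_items).
    Returns (n_retain, observed_eigenvalues, threshold_eigenvalues).
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape

    # Eigenvalues of the observed correlation matrix, largest first.
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]

    # Eigenvalues from uncorrelated random data of the same shape.
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]

    # Per-position threshold (assumed here: 95th percentile of the simulations).
    threshold = np.percentile(simulated, percentile, axis=0)

    # Retain leading components until the first one fails to beat the threshold.
    exceeds = observed > threshold
    n_retain = p if exceeds.all() else int(np.argmin(exceeds))
    return n_retain, observed, threshold


# Synthetic example (assumed data): 8 items driven by 2 independent factors.
rng = np.random.default_rng(42)
factors = rng.standard_normal((500, 2))
loadings = np.zeros((2, 8))
loadings[0, :4] = 0.8   # items 1-4 load on factor 1
loadings[1, 4:] = 0.8   # items 5-8 load on factor 2
items = factors @ loadings + 0.6 * rng.standard_normal((500, 8))

n_retain, observed, threshold = parallel_analysis(items)
n_kaiser = int((observed > 1.0).sum())  # eigenvalue-greater-than-one rule, for comparison
print("parallel analysis:", n_retain, "| eigenvalue > 1 rule:", n_kaiser)
```

In the spirit of the approach suggested above, the value returned by such a rule would be treated as one candidate among several, to be compared with the results of other rules, checked across samples, and followed by an assessment of the resulting models.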