Key Laboratory of Mental Health, Institute of Psychology, Chinese Academy of Sciences, Beijing, 100101, China.
BMC Bioinformatics. 2013 May 2;14:151. doi: 10.1186/1471-2105-14-151.
Complex traits may be defined by a range of different criteria. Analyzing them solely on the basis of a final dichotomized clinical variable (affected versus unaffected) discards information.
We assess the performance of four alternative approaches for the analysis of multiple phenotypes in genetic association studies. We describe the four methods in detail and discuss their relative theoretical merits and disadvantages. Using simulation, we demonstrate that principal component analysis (PCA) provides the greatest power both when phenotypes are correlated and when the number of phenotypes is large. The multivariate approach had low type I error only with independent phenotypes or small numbers of phenotypes. Our application of the four methods to schizophrenia data provides converging evidence of their relative performance.
Based on power analysis of simulated data and testing of experimental data, we conclude that PCA, which creates a single variable from a linear combination of all the traits, performs best. We expect that our comparison will provide insight into the properties of the methods and help researchers choose an appropriate strategy in future experimental studies.
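The PCA strategy described above, collapsing several traits into one variable through a linear combination and then testing that variable for association, can be illustrated with a minimal sketch. This is not the authors' implementation: the simulated data, the sample size, the effect size, the allele frequency, and the use of a simple linear regression as the association test are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np


def first_principal_component(phenotypes):
    """Return PC1 scores and loadings for an (n_samples, n_traits) array."""
    # Standardize each trait so that no single scale dominates the combination.
    z = (phenotypes - phenotypes.mean(axis=0)) / phenotypes.std(axis=0, ddof=1)
    # SVD of the standardized matrix; the rows of vt are the principal axes.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # PC1 is the linear combination of traits with the largest variance.
    return z @ loadings, loadings


def association_t_statistic(genotype, trait):
    """t statistic for the slope of a simple linear regression of trait on genotype."""
    g = genotype - genotype.mean()
    y = trait - trait.mean()
    slope = (g @ y) / (g @ g)
    residuals = y - slope * g
    n = len(trait)
    standard_error = np.sqrt((residuals @ residuals) / (n - 2) / (g @ g))
    return slope / standard_error


# Hypothetical simulated data (assumed values, not taken from the study).
rng = np.random.default_rng(0)
n_samples, n_traits = 500, 5
genotype = rng.binomial(2, 0.3, size=n_samples).astype(float)  # additive 0/1/2 coding
shared = rng.normal(size=n_samples)  # latent factor that makes the traits correlated
phenotypes = (
    0.3 * genotype[:, None]
    + shared[:, None]
    + rng.normal(size=(n_samples, n_traits))
)

pc1, loadings = first_principal_component(phenotypes)
t_stat = association_t_statistic(genotype, pc1)
print("PC1 loadings:", np.round(loadings, 3))
print("t statistic for genotype vs PC1:", round(float(t_stat), 3))
```

The sign of a principal component is arbitrary, so only the magnitude of the resulting t statistic is meaningful here. Standardizing the traits first means PC1 is derived from the correlation matrix rather than the covariance matrix, which is one reasonable choice among several.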