Yoo Jinkyung, Sun Zequn, Greenacre Michael, Ma Qin, Chung Dongjun, Kim Young Min
Department of Statistics, Kyungpook National University, South Korea.
Department of Preventive Medicine - Biostatistics, Northwestern University, USA.
Commun Stat Appl Methods. 2022 Jul;29(4):453-469. doi: 10.29220/csam.2022.29.4.453. Epub 2022 Jul 31.
The study of immune cellular composition has attracted great scientific interest in immunology as multiple large-scale datasets have become available. From a statistical point of view, such immune cellular data should be treated as compositional. In compositional data, each element is positive and all elements sum to a constant, which can generally be set to one. Standard statistical methods are not directly applicable to compositional data because they do not appropriately handle the correlations between the compositional elements. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses based on log-ratio transformations and on the alternative approach of Dirichlet regression, discuss their theoretical foundations, and illustrate their application to immune cellular fraction data from colorectal cancer patients.
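As a rough illustration of the log-ratio transformations mentioned above, the following Python sketch closes a set of positive values to unit sum and applies the centered log-ratio (clr) and additive log-ratio (alr) transformations. It is a minimal sketch, not the authors' implementation: the function names `closure`, `clr`, and `alr`, and the example values in `counts`, are hypothetical and chosen only for illustration.

```python
import numpy as np

def closure(x):
    """Rescale each row of strictly positive values so that it sums to one."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("compositional parts must be strictly positive")
    return x / x.sum(axis=-1, keepdims=True)

def clr(x):
    """Centered log-ratio: log of each part minus the mean log across parts."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)

def alr(x, ref=-1):
    """Additive log-ratio: log of each part relative to a reference part."""
    logx = np.log(closure(x))
    others = np.delete(logx, ref, axis=-1)   # drop the reference column
    return others - logx[..., [ref]]         # subtract log of the reference part

# Hypothetical values for three cell types in two samples (illustrative only).
counts = np.array([[50.0, 30.0, 20.0],
                   [10.0, 60.0, 30.0]])

fractions = closure(counts)      # each row sums to one
clr_values = clr(counts)         # each row sums to zero
alr_values = alr(counts)         # one fewer column than the input
```

The transformed values are no longer bound by the unit-sum constraint, so they can serve as responses or covariates in ordinary regression models. Zero parts need separate handling before taking logarithms, which this sketch does not attempt.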