Department of Statistics, Quaid-i-Azam University Islamabad, Pakistan.
School of Mathematics and Statistics, University of New South Wales, Sydney, Australia.
Stat Methods Med Res. 2024 Sep;33(9):1624-1636. doi: 10.1177/09622802241267808. Epub 2024 Aug 8.
Regression to the mean occurs when an unusual observation is followed by a more typical outcome closer to the population mean. In pre- and post-intervention studies, treatment is administered to subjects whose initial measurements lie in the tail of a distribution, and a paired-sample t-test can be used to assess the effectiveness of the intervention. The observed change in the pre-post means is the sum of the regression to the mean and treatment effects, so ignoring regression to the mean could lead to erroneous conclusions about the effectiveness of the treatment. In this study, formulae for regression to the mean are derived, and maximum likelihood estimation is employed to numerically estimate the regression to the mean effect when the test statistic follows the bivariate t-distribution and subjects are selected on a baseline criterion or cut-off point. The pre and post degrees of freedom may be equal or unequal, the latter arising, for example, when there are missing data. Additionally, we illustrate how regression to the mean is influenced by cut-off points, mixing angles (which are related to correlation), and degrees of freedom. A simulation study is conducted to assess the unbiasedness, consistency, and asymptotic normality of the regression to the mean estimator. Moreover, the proposed methods are compared with an existing one that assumes bivariate normality. The p-values obtained when regression to the mean is ignored and when it is accounted for are compared to gauge the statistical significance of the paired t-test. The proposed method is applied to real data on schizophrenia patients, and the observed conditional mean difference, called the total effect, is decomposed into the regression to the mean and treatment effects.
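The decomposition of the total effect into regression to the mean and treatment effects can be illustrated with a minimal sketch. It does not implement the bivariate t-distribution method proposed in the paper; it assumes the simpler bivariate normal setting with equal pre and post means and variances, where the expected regression to the mean for subjects selected above a cut-off has the well-known closed form sigma * (1 - rho) * phi(z0) / (1 - Phi(z0)), with z0 = (cutoff - mu) / sigma. The function names, parameter values, and the additive treatment shift are assumptions chosen for illustration only.

```python
import math
import random


def normal_rtm(mu, sigma, rho, cutoff):
    """Expected regression to the mean for subjects with baseline > cutoff.

    Assumes bivariate normality with equal pre/post means and variances
    (an illustrative assumption, not the paper's bivariate t model).
    """
    z0 = (cutoff - mu) / sigma
    pdf = math.exp(-0.5 * z0 * z0) / math.sqrt(2.0 * math.pi)  # phi(z0)
    upper_tail = 0.5 * math.erfc(z0 / math.sqrt(2.0))          # 1 - Phi(z0)
    return sigma * (1.0 - rho) * pdf / upper_tail


def simulate_total_effect(mu, sigma, rho, cutoff, treatment, n, seed=0):
    """Mean pre-minus-post difference among simulated subjects above the cutoff.

    `treatment` is a hypothetical additive reduction applied to every
    post-intervention measurement.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        pre = mu + sigma * z1
        # Post measurement has correlation rho with the pre measurement.
        post = mu + sigma * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2) - treatment
        if pre > cutoff:  # only tail subjects are enrolled
            diffs.append(pre - post)
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    mu, sigma, rho, cutoff, treatment = 100.0, 15.0, 0.7, 115.0, 5.0
    rtm = normal_rtm(mu, sigma, rho, cutoff)
    total = simulate_total_effect(mu, sigma, rho, cutoff, treatment, n=200_000)
    print(f"RTM effect (closed form):        {rtm:.3f}")
    print(f"Total effect (simulated):        {total:.3f}")
    print(f"Treatment effect (total - RTM):  {total - rtm:.3f}")
```

Under these assumptions, subtracting the closed-form regression to the mean from the simulated total effect should recover a value close to the treatment shift that was built into the simulation, whereas reading the total effect alone would overstate it.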