Huang Jing, Huang Jiayan, Chen Yong, Ying Gui-Shuang
a Division of Biostatistics, Center for Clinical Epidemiology and Biostatistics.
b Center for Preventive Ophthalmology and Biostatistics, Department of Ophthalmology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
Ophthalmic Epidemiol. 2018 Feb;25(1):45-54. doi: 10.1080/09286586.2017.1339809. Epub 2017 Sep 11.
To evaluate the performance of commonly used statistical methods for analyzing continuous correlated eye data when the sample size is small.
We simulated correlated continuous data from two designs: (1) the two eyes of a subject are in two different comparison groups; (2) the two eyes of a subject are in the same comparison group. Simulations covered a range of sample sizes (5-50), inter-eye correlations (0-0.75) and effect sizes (0-0.8). Simulated data were analyzed using the paired t-test, the two-sample t-test, the Wald test and score test from generalized estimating equations (GEE), and the F-test from a linear mixed effects model (LMM). We compared type I error rates and statistical power, and demonstrated the analysis approaches on two real datasets.
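The simulation setup described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes a shared subject-level random effect to induce the inter-eye correlation, unit variance per eye, and an effect size expressed in standard deviation units. The function names (`simulate_eye_pairs`, `paired_t_statistic`, `pearson`) are hypothetical.

```python
import math
import random


def simulate_eye_pairs(n_subjects, rho, effect=0.0, seed=None):
    """Simulate two eyes per subject with unit variance and correlation rho.

    Assumption (not from the paper): each eye is sqrt(rho) * b + sqrt(1 - rho) * e,
    where b is a subject-level effect shared by both eyes and e is eye-level
    noise, both standard normal. `effect` shifts the mean of the second eye,
    as in design 1 where the second eye is in the other comparison group.
    Requires 0 <= rho <= 1.
    """
    rng = random.Random(seed)
    shared = math.sqrt(rho)
    own = math.sqrt(1.0 - rho)
    first, second = [], []
    for _ in range(n_subjects):
        b = rng.gauss(0.0, 1.0)  # subject effect shared by both eyes
        first.append(shared * b + own * rng.gauss(0.0, 1.0))
        second.append(effect + shared * b + own * rng.gauss(0.0, 1.0))
    return first, second


def paired_t_statistic(x, y):
    """Paired t statistic for x - y, with len(x) - 1 degrees of freedom."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    return mean_d / math.sqrt(var_d / n)


def pearson(x, y):
    """Sample Pearson correlation, used here to check the simulated rho."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# One simulated design-1 dataset: 20 subjects, rho = 0.5, effect size 0.8.
control_eyes, treated_eyes = simulate_eye_pairs(20, rho=0.5, effect=0.8, seed=1)
print("paired t statistic:", paired_t_statistic(treated_eyes, control_eyes))
```

Repeating this over many simulated datasets with `effect=0.0` and counting how often the statistic exceeds the t critical value would give an empirical type I error rate; with a nonzero effect, the same count estimates power.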
In design 1, the paired t-test and LMM perform better than GEE, with nominal type I error rates and higher statistical power. In design 2, no test performs uniformly well: the two-sample t-test (using the average of two eyes or a random eye) achieves better control of type I error but yields lower statistical power. In both designs, the GEE Wald test inflates the type I error rate and the GEE score test has lower power.
When the sample size is small, some commonly used statistical methods do not perform well. The paired t-test and LMM perform best when the two eyes of a subject are in two different comparison groups, and the t-test using the average of two eyes performs best when the two eyes are in the same comparison group. The study design should be considered when selecting an analysis approach.
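For the design in which both eyes of a subject are in the same comparison group, the "average of two eyes" approach reduces each subject to a single value before comparing groups, so the test no longer treats correlated eyes as independent observations. A minimal sketch follows, assuming a pooled-variance (Student) two-sample t-test; the abstract does not say whether a pooled or unequal-variance version was used. The function names and the numbers in the example are hypothetical and are not data from the study.

```python
import math


def subject_means(first_eyes, second_eyes):
    """Collapse each subject's two eyes into one per-subject average."""
    return [(a + b) / 2.0 for a, b in zip(first_eyes, second_eyes)]


def two_sample_t_statistic(x, y):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    ssx = sum((v - mx) ** 2 for v in x)
    ssy = sum((v - my) ** 2 for v in y)
    df = nx + ny - 2
    pooled_var = (ssx + ssy) / df
    se = math.sqrt(pooled_var * (1.0 / nx + 1.0 / ny))
    return (mx - my) / se, df


# Illustrative values only (hypothetical, not from the paper).
group_a = subject_means([1.2, 0.8, 1.5, 1.1], [1.0, 0.9, 1.7, 1.3])
group_b = subject_means([0.4, 0.9, 0.6, 0.2], [0.5, 1.1, 0.3, 0.4])
t_stat, df = two_sample_t_statistic(group_a, group_b)
print("t statistic:", t_stat, "degrees of freedom:", df)
```

The degrees of freedom here count subjects, not eyes, which is why this approach controls type I error at the cost of some power.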