Walters S J, Campbell M J, Lall R
Sheffield Health Economics Group, School of Health and Related Research, University of Sheffield, United Kingdom.
J Biopharm Stat. 2001;11(3):155-76. doi: 10.1081/BIP-100107655.
Health-Related Quality of Life (HRQoL) measures are increasingly used in clinical trials as both primary and secondary endpoints. Investigators are now asking statisticians for advice on how to plan (e.g., sample size) and analyze studies that use HRQoL measures. HRQoL measures such as the SF-36 are usually measured on an ordered categorical (ordinal) scale. At both the design and analysis stages, the scales are often scored and the scores treated as if they were continuous and normally distributed. However, the ordinal scaling of HRQoL measures leads to problems in determining sample size, and conventional parametric methods of estimation and hypothesis testing may not be appropriate for such outcomes. We present practical guidelines for the design and analysis of trials with HRQoL measures as outcomes. We used conventional statistical methods (i.e., t-tests and multiple regression), various ordinal regression models (proportional odds, continuation ratio, polytomous, and stereotype), and bootstrap methods to analyze an HRQoL dataset. To illustrate the various methods, we used HRQoL data on the SF-36 Role Limitations Emotional dimension for two groups of patients with leg ulcers. The bootstrap, t-test, and multiple regression methods gave similar results. The various ordinal regression models also gave similar results. When the underlying scale really is continuous but is measured imperfectly by an instrument with a limited number of discrete values, an informal rule of thumb is to treat the discrete scale as continuous if it has seven or more ordered categories, most of which are occupied, and as ordinal otherwise.
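As a rough illustration of two ideas in the abstract, the sketch below computes a percentile bootstrap confidence interval for a difference in mean scores and applies the seven-category rule of thumb. It is not the authors' analysis. The data are invented for illustration, the four score levels (0, 33.3, 66.7, 100) are an assumption about how the Role Limitations Emotional dimension is scored, and the function names `bootstrap_mean_diff_ci` and `suggested_treatment` are hypothetical. Counting distinct observed values as "occupied categories" is also an assumption.

```python
import random
import statistics


def bootstrap_mean_diff_ci(group_a, group_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each group with replacement, keeping the original group sizes.
        resample_a = rng.choices(group_a, k=len(group_a))
        resample_b = rng.choices(group_b, k=len(group_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


def suggested_treatment(scores_a, scores_b, threshold=7):
    """Rule of thumb: 'continuous' with >= threshold occupied categories, else 'ordinal'."""
    # Assumption: an occupied category is any distinct value observed in either group.
    n_occupied = len(set(scores_a) | set(scores_b))
    return "continuous" if n_occupied >= threshold else "ordinal"


# Invented example data (NOT from the paper): two groups of 30 patients each.
group_a = [0.0] * 5 + [33.3] * 4 + [66.7] * 6 + [100.0] * 15
group_b = [0.0] * 10 + [33.3] * 6 + [66.7] * 5 + [100.0] * 9

observed_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
lower, upper = bootstrap_mean_diff_ci(group_a, group_b)

print(f"Observed mean difference: {observed_diff:.1f}")
print(f"95% percentile bootstrap CI: ({lower:.1f}, {upper:.1f})")
print(f"Rule of thumb suggests treating the scale as: {suggested_treatment(group_a, group_b)}")
```

With only four occupied score levels, the rule of thumb would point toward an ordinal model (for example, proportional odds) for these data. The abstract reports that, in the authors' example, the bootstrap and conventional methods nonetheless gave similar results.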