Bias-reduced and separation-proof conditional logistic regression with small or sparse data sets.

Author information

Section of Clinical Biometrics, Core Unit of Medical Statistics and Informatics, Medical University of Vienna, Spitalgasse 23, Vienna A-1090, Austria.

Publication information

Stat Med. 2010 Mar 30;29(7-8):770-7. doi: 10.1002/sim.3794.

PMID:20213709

Abstract

Conditional logistic regression is used for the analysis of binary outcomes when subjects are stratified into several subsets, e.g. matched pairs or blocks. Log odds ratio estimates are usually found by maximizing the conditional likelihood. This approach eliminates all strata-specific parameters by conditioning on the number of events within each stratum. However, in the analyses of both an animal experiment and a lung cancer case-control study, conditional maximum likelihood (CML) resulted in infinite odds ratio estimates and monotone likelihood. Estimation can be improved by using Cytel Inc.'s well-known LogXact software, which provides a median unbiased estimate and exact or mid-p confidence intervals. Here, we suggest and outline point and interval estimation based on maximization of a penalized conditional likelihood in the spirit of Firth's (Biometrika 1993; 80:27-38) bias correction method (CFL). We present comparative analyses of both studies, demonstrating some advantages of CFL over competitors. We report on a small-sample simulation study where CFL log odds ratio estimates were almost unbiased, whereas LogXact estimates showed some bias and CML estimates exhibited serious bias. Confidence intervals and tests based on the penalized conditional likelihood had close-to-nominal coverage rates and yielded highest power among all methods compared, respectively. Therefore, we propose CFL as an attractive solution to the stratified analysis of binary data, irrespective of the occurrence of monotone likelihood. A SAS program implementing CFL is available at: http://www.muw.ac.at/msi/biometrie/programs.
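To make the idea of a Firth-type penalized conditional likelihood concrete, here is a minimal sketch for the simplest setting only: 1:1 matched pairs with a single covariate. It is not the authors' SAS program. The function names, the golden-section optimizer, and the toy data are all assumptions chosen for illustration. For a pair with covariate difference d = x_case - x_control, the conditional likelihood contribution is 1 / (1 + exp(-beta * d)), and the penalty adds half the log of the conditional information.

```python
import math

def log_sigmoid(z):
    # Numerically stable log(1 / (1 + exp(-z))).
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))

def cond_loglik(beta, diffs):
    # Conditional log-likelihood for 1:1 matched pairs,
    # where each d is x_case - x_control within a pair.
    return sum(log_sigmoid(beta * d) for d in diffs)

def cond_information(beta, diffs):
    # Information of the conditional likelihood for a single parameter.
    total = 0.0
    for d in diffs:
        p = math.exp(log_sigmoid(beta * d))
        total += d * d * p * (1.0 - p)
    return total

def penalized_loglik(beta, diffs):
    # Firth-type penalty: add 0.5 * log(information).
    # Assumes at least one pair has d != 0 (pairs with d == 0 carry no information).
    return cond_loglik(beta, diffs) + 0.5 * math.log(cond_information(beta, diffs))

def maximize(f, lo=-15.0, hi=15.0, tol=1e-8):
    # Golden-section search; an illustrative choice that assumes
    # f is unimodal on [lo, hi].
    g = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - g * (b - a)
        d = a + g * (b - a)
        if f(c) < f(d):
            a = c   # maximum lies in [c, b]
        else:
            b = d   # maximum lies in [a, d]
    return (a + b) / 2.0

# Hypothetical toy data: 5 discordant pairs, all with the case exposed
# and the control unexposed, so the unpenalized likelihood is monotone.
diffs = [1.0] * 5
beta_hat = maximize(lambda b: penalized_loglik(b, diffs))
print(round(beta_hat, 3))
```

In this toy example the unpenalized conditional log-likelihood keeps increasing in beta, so the CML estimate is infinite, whereas the penalized version has a finite maximum. For this special case of n identical discordant pairs, setting the derivative to zero gives beta = log(2n + 1), i.e. log(11) for n = 5, which the numerical search should reproduce.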
