Demler Olga V, Paynter Nina P, Cook Nancy R
Division of Preventive Medicine, Brigham and Women's Hospital, Harvard Medical School, 900 Commonwealth Ave. East, Boston, MA, 02215, U.S.A.
Stat Med. 2015 May 10;34(10):1659-80. doi: 10.1002/sim.6428. Epub 2015 Feb 11.
To assess the calibration of a predictive model in a survival analysis setting, several authors have extended the Hosmer-Lemeshow goodness-of-fit test to survival data. Grønnesby and Borgan developed a test under the proportional hazards assumption, and Nam and D'Agostino developed a nonparametric test that is applicable in a more general survival setting for data with limited censoring. We analyze the performance of the two tests and show that the Grønnesby-Borgan test attains appropriate size in a variety of settings, whereas the Nam-D'Agostino method has a higher-than-nominal Type 1 error when there is more than trivial censoring. Both tests are sensitive to small cell sizes. We develop a modification of the Nam-D'Agostino test to allow for higher censoring rates. We show that this modified Nam-D'Agostino test has appropriate control of Type 1 error, has power comparable to that of the Grønnesby-Borgan test, and is applicable to settings other than proportional hazards. We also discuss the application to small cell sizes.
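The abstract does not state the test statistic, so the following is only a minimal sketch of a Hosmer-Lemeshow-style calibration check for censored data. It assumes that subjects are grouped by quantiles of predicted risk, that the observed risk in each group is the Kaplan-Meier failure estimate at a fixed horizon, and that the modification for censoring amounts to dividing each squared difference by the Greenwood variance of that estimate. The function names, the equal-size grouping rule, and the reference distribution (chi-square with one degree of freedom fewer than the number of groups) are assumptions made for illustration, not details taken from the abstract.

```python
def km_failure_and_variance(times, events, horizon):
    """Kaplan-Meier failure probability at `horizon` and its Greenwood variance.

    times  : follow-up times
    events : 1 if the event was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv, gw_sum, i = 1.0, 0.0, 0
    while i < len(data) and data[i][0] <= horizon:
        t, deaths, removed = data[i][0], 0, 0
        # Collect everyone who leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            if deaths == at_risk:
                raise ValueError("Greenwood variance undefined: all at risk failed")
            surv *= 1.0 - deaths / at_risk
            gw_sum += deaths / (at_risk * (at_risk - deaths))
        at_risk -= removed
    # Var(1 - S) equals Var(S), so the Greenwood formula applies directly.
    return 1.0 - surv, surv * surv * gw_sum


def calibration_statistic(pred, times, events, horizon, n_groups=10):
    """Sum over risk groups of (observed - expected)^2 / Greenwood variance.

    pred holds each subject's predicted probability of failure by `horizon`.
    Returns (statistic, assumed degrees of freedom).
    """
    n = len(pred)
    order = sorted(range(n), key=lambda k: pred[k])
    stat = 0.0
    for g in range(n_groups):
        idx = order[g * n // n_groups:(g + 1) * n // n_groups]
        if not idx:
            raise ValueError("empty risk group; use fewer groups")
        expected = sum(pred[k] for k in idx) / len(idx)
        observed, var = km_failure_and_variance(
            [times[k] for k in idx], [events[k] for k in idx], horizon
        )
        if var == 0.0:
            raise ValueError("no events in a group before the horizon")
        stat += (observed - expected) ** 2 / var
    return stat, n_groups - 1
```

As a usage example, two groups of four uncensored subjects with event times 1, 2, 3, 4 and predicted risks of 0.5 and 0.9 at horizon 2.5 give a statistic of roughly 2.56 on one degree of freedom, all of it contributed by the overpredicted group. The explicit errors for groups with no usable variance reflect the abstract's remark that these tests are sensitive to small cell sizes; a real analysis would need a deliberate strategy, such as merging groups, rather than an exception.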