Parametric assumptions equate to hidden observations: comparing the efficiency of nonparametric and parametric models for estimating time to AIDS or death in a cohort of HIV-positive women.

Author information

Department of Epidemiology, University of North Carolina at Chapel Hill, 135 Dauer Drive, 2101 McGavran-Greenberg Hall, CB #7435, Chapel Hill, 27599, NC, USA.

Publication information

BMC Med Res Methodol. 2018 Nov 19;18(1):142. doi: 10.1186/s12874-018-0605-8.

Abstract

BACKGROUND

When conducting a survival analysis, researchers might consider two broad classes of models: nonparametric models and parametric models. While nonparametric models are more flexible because they make few assumptions regarding the shape of the data distribution, parametric models are more efficient. Here we sought to make concrete the difference in efficiency between these two model types using effective sample size.

METHODS

We compared the cumulative risk of AIDS or death estimated using four survival models (nonparametric, generalized gamma, Weibull, and exponential) and data from 1164 HIV patients who were alive and AIDS-free in 1995. We added pseudo-observations to the sample until the spread of the 95% confidence limits for the nonparametric model became less than that for the parametric models.
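The pseudo-observation idea can be illustrated with a minimal sketch. It is not the authors' analysis: it assumes no censoring, data that truly follow an exponential distribution, a binomial standard error for the nonparametric risk, and a delta-method standard error for the exponential risk. The function names and the input values (1000 subjects, a rate of 0.1 per year, two-year risk) are hypothetical and are not taken from the study.

```python
import math

def nonparametric_se(risk, n):
    """Standard error of an empirical (binomial) risk estimate from n uncensored subjects."""
    return math.sqrt(risk * (1.0 - risk) / n)

def exponential_se(rate, t, n):
    """Delta-method standard error of the exponential risk 1 - exp(-rate * t).

    Assumes no censoring, so the MLE of the rate has variance rate**2 / n,
    and d/d(rate) [1 - exp(-rate * t)] = t * exp(-rate * t).
    """
    return t * math.exp(-rate * t) * rate / math.sqrt(n)

def hidden_observations(rate, t, n):
    """Count the extra subjects the nonparametric estimator needs before its
    standard error at time t falls below that of the exponential model fit to n subjects."""
    risk = 1.0 - math.exp(-rate * t)
    target = exponential_se(rate, t, n)
    added = 0
    while nonparametric_se(risk, n + added) >= target:
        added += 1
    return added

# Hypothetical inputs, not the study data.
extra = hidden_observations(rate=0.1, t=2.0, n=1000)
print(f"Observations added to match the exponential model's precision: {extra}")
```

Because both 95% confidence intervals in this sketch are proportional to their standard errors, comparing standard errors is equivalent to comparing interval widths. With censored data, a Kaplan-Meier estimator and its variance would replace the binomial formulas used here.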

RESULTS

We found the 3-parameter generalized gamma to be a good fit to the nonparametric risk curve, but the 1-parameter exponential underestimated the risk at some times and overestimated it at others. Using two-year risk as an example, we had to add 354, 593, and 3960 observations for the nonparametric model to be as efficient as the generalized gamma, Weibull, and exponential models, respectively.

CONCLUSIONS

These added observations represent the hidden observations underlying the efficiency gained through parametric model form assumptions. If the model is correctly specified, the efficiency gain may be justified, as appeared to be the case for the generalized gamma model. Otherwise, precision will be improved, but at the cost of specification bias, as was the case for the exponential model.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/af70/6245810/5ccd95bfb0a0/12874_2018_605_Fig1_HTML.jpg
