A Lindley-binomial model for analyzing the proportions with sparseness and excessive zeros.

Authors

Deng Dianliang, Zhang Xiaoqing

Affiliation

Department of Mathematics and Statistics, University of Regina, Saskatchewan, Canada.

Publication information

J Appl Stat. 2023 Jul 22;51(9):1792-1817. doi: 10.1080/02664763.2023.2237212. eCollection 2024.

Abstract

Proportional data arise frequently in a wide variety of fields of study. Such data often exhibit extra variation such as over- or underdispersion, sparseness and zero inflation. For example, the hepatitis data present both sparseness and zero inflation: of 83 annual age groups, 19 have non-zero denominators of 5 or less and 36 have zero seropositive cases. The whitefly data consist of 640 observations with 339 zeros (53%), which demonstrates zero inflation. The catheter management data involve excessive zeros, with over 60% zeros on average across 193 outcomes of urinary tract infections, 194 outcomes of catheter blockages and 193 outcomes of catheter displacements. However, existing models cannot always address such features appropriately. In this paper, a new two-parameter probability distribution, called the Lindley-binomial (LB) distribution, is proposed to analyze proportional data with such features. Probabilistic properties of the distribution, such as its moments and moment generating function, are derived. The Fisher scoring algorithm and the EM algorithm are presented for computing parameter estimates in the proposed LB regression model. Goodness-of-fit issues for the LB model are discussed. A limited simulation study is also performed to evaluate the performance of the derived EM algorithms for estimating parameters in the model with and without covariates. The proposed model is illustrated with the three aforementioned proportional datasets.
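
To make the notion of zero inflation in proportional data concrete, the following minimal sketch compares the observed share of zero counts with the share a plain binomial model would predict. It is not the LB model or any algorithm from the paper; the function name `binomial_zero_excess` and the toy counts are hypothetical and chosen only for illustration.

```python
def binomial_zero_excess(successes, trials):
    """Return (observed, expected) share of zero counts.

    'expected' is the average P(Y = 0) under a plain binomial model
    with one pooled success probability. This is a generic diagnostic,
    not the Lindley-binomial model described in the abstract.
    """
    # Keep only groups with a non-zero denominator.
    pairs = [(y, m) for y, m in zip(successes, trials) if m > 0]
    total_y = sum(y for y, _ in pairs)
    total_m = sum(m for _, m in pairs)
    p_hat = total_y / total_m  # pooled binomial estimate of the success probability

    observed = sum(1 for y, _ in pairs if y == 0) / len(pairs)
    # Under Binomial(m, p_hat), P(Y = 0) = (1 - p_hat) ** m.
    expected = sum((1 - p_hat) ** m for _, m in pairs) / len(pairs)
    return observed, expected


# Hypothetical toy data (NOT taken from the hepatitis, whitefly or catheter data).
successes = [0, 0, 0, 0, 0, 0, 3, 4, 5, 2]
trials = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]

observed, expected = binomial_zero_excess(successes, trials)
print(f"observed zero share: {observed:.3f}")
print(f"expected under plain binomial: {expected:.3f}")
```

When the observed share is well above the expected share, as it is for this toy input, a plain binomial fit understates the number of zeros, which is the kind of feature the abstract says motivates the LB distribution.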
