Adjusting selection bias in German health insurance records for regional prevalence estimation.

Affiliation

Department of Economic and Social Statistics, Trier University, Universitätsring 15, 54286 Trier, Germany.

Publication information

Popul Health Metr. 2019 Aug 27;17(1):13. doi: 10.1186/s12963-019-0189-5.

Abstract

BACKGROUND

Regional prevalence estimation requires epidemiologic data with substantial local detail. National health surveys may lack sufficient local observations due to limited resources, so the corresponding prevalence estimates may not capture regional morbidity patterns with the necessary accuracy. Health insurance records are an alternative data source for this purpose. Fund-specific member populations contain more local observations than surveys, which benefits regional prevalence estimation. However, due to national insurance market regulations, insurance membership can be informative for morbidity. Regional fund-specific prevalence proportions are selective in the sense that the morbidity structure of a fund's member population cannot be extrapolated to the national population. This implies a selection bias that poses a major obstacle to statistical inference. We provide a methodology to adjust for fund-specific selectivity and perform regional prevalence estimation from health insurance records. The methodology is applied to estimate regional cohort-referenced diabetes mellitus type 2 prevalence in Germany.

METHODS

Records of the German Public Health Insurance Company from 2014 and Diagnosis-Related Group Statistics data are combined within a benchmarked multi-level model. The fund-specific selectivity is adjusted for in a two-step procedure. First, the conditional expectation of the insurance company's regional prevalence given the related inpatient diagnosis frequencies of its members is quantified. Second, the regional prevalence is estimated by extrapolating this conditional expectation, using the corresponding inpatient diagnosis frequencies from the Diagnosis-Related Group Statistics as benchmarks. Model assumptions are validated via Monte Carlo simulation. Variable selection is performed via multivariate methods. The optimal model fit is determined by analysis of variance. 95% confidence intervals for the estimates are constructed via semiparametric bootstrapping.
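The two-step idea can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single covariate, replaces the benchmarked multi-level model with ordinary least squares, replaces the semiparametric bootstrap with a plain residual bootstrap, and runs on synthetic data. All names (`fit_ols`, `two_step_estimate`, `bootstrap_ci`, `fund_freq`, `benchmark_freq`) are hypothetical.

```python
import numpy as np


def fit_ols(x, y):
    """Least-squares fit of y on an intercept and one covariate x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def predict(beta, x):
    """Evaluate the fitted line at covariate values x."""
    X = np.column_stack([np.ones(len(x)), x])
    return X @ beta


def two_step_estimate(fund_freq, fund_prev, benchmark_freq):
    # Step 1: model the fund's regional prevalence given its members'
    # inpatient diagnosis frequencies (stand-in for the conditional expectation).
    beta = fit_ols(fund_freq, fund_prev)
    # Step 2: extrapolate by plugging in population-level benchmark frequencies.
    return predict(beta, benchmark_freq)


def bootstrap_ci(fund_freq, fund_prev, benchmark_freq,
                 n_boot=500, level=0.95, seed=0):
    """Percentile intervals from a residual bootstrap (an assumed stand-in)."""
    rng = np.random.default_rng(seed)
    beta = fit_ols(fund_freq, fund_prev)
    fitted = predict(beta, fund_freq)
    resid = fund_prev - fitted
    draws = np.empty((n_boot, len(benchmark_freq)))
    for b in range(n_boot):
        # Resample residuals, refit, and re-extrapolate to the benchmarks.
        y_star = fitted + rng.choice(resid, size=len(resid), replace=True)
        draws[b] = predict(fit_ols(fund_freq, y_star), benchmark_freq)
    alpha = 1.0 - level
    lower = np.quantile(draws, alpha / 2, axis=0)
    upper = np.quantile(draws, 1 - alpha / 2, axis=0)
    return lower, upper


# Synthetic data only: 50 regions, fund members assumed sicker than the population.
rng = np.random.default_rng(42)
n_regions = 50
fund_freq = rng.uniform(0.01, 0.03, n_regions)       # fund inpatient frequency
fund_prev = 0.02 + 4.0 * fund_freq + rng.normal(0, 0.003, n_regions)
benchmark_freq = 0.7 * fund_freq                     # population-level benchmark

estimates = two_step_estimate(fund_freq, fund_prev, benchmark_freq)
lower, upper = bootstrap_ci(fund_freq, fund_prev, benchmark_freq)
```

In this toy setup the benchmark frequencies are lower than the fund's own, so the extrapolated regional prevalences come out below the fund-specific ones, which mirrors the direction of the adjustment described in the abstract.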

RESULTS

The national diabetes mellitus type 2 prevalence is estimated at 8.70% with a 95% confidence interval of [8.48%, 9.35%]. This corresponds to an adjustment of the original fund-specific prevalence ranging from -32.79% to -25.93%. The estimated disease distribution shows significant morbidity differences between regions, especially between eastern and western Germany. However, the cohort-referenced estimates suggest that these differences can be partially explained by regional demography.
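The relation between the confidence interval and the adjustment range can be checked with a few lines of arithmetic. This rests on an assumption the abstract does not state: that each adjustment figure is a relative change, (adjusted - original) / original, evaluated at one bound of the confidence interval. The helper name `implied_baseline` is hypothetical.

```python
def implied_baseline(adjusted, relative_change):
    """Original value implied by an adjusted value and a relative change.

    Solves adjusted = original * (1 + relative_change) for original.
    """
    return adjusted / (1.0 + relative_change)


# Confidence interval bounds (in percent) paired with the reported adjustments.
baseline_from_lower = implied_baseline(8.48, -0.3279)
baseline_from_upper = implied_baseline(9.35, -0.2593)

print(round(baseline_from_lower, 2), round(baseline_from_upper, 2))
```

Under that reading, both bounds imply nearly the same unadjusted fund-specific prevalence, which suggests the two reported figures are internally consistent. The implied value is an inference from the abstract's numbers, not a figure reported in it.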

CONCLUSIONS

The proposed methodology allows regional prevalence estimation in remarkable detail despite fund-specific selectivity. This enhances and encourages the use of health insurance records for future epidemiologic studies.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4b71/6712777/927fdf47c7f2/12963_2019_189_Fig1_HTML.jpg
