Joint models in big data: simulation-based guidelines for required data quality in longitudinal electronic health records.

Author information

Hunsdieck Berit, Bender Christian, Ickstadt Katja, Mielke Johanna

Affiliations

Computational Biology, Bayer AG, Wuppertal, Germany.

Department of Statistics, TU Dortmund University, Dortmund, Germany.

Publication information

BioData Min. 2025 May 13;18(1):35. doi: 10.1186/s13040-025-00450-z.

Abstract

BACKGROUND

Over the past decade, an increase in the use of electronic health records (EHR) by office-based physicians and hospitals has been reported. However, these data come with challenges regarding completeness and data quality, and it is unclear, especially for more complex models, how these characteristics influence model performance.

METHODS

In this paper, we focus on joint models, which combine longitudinal modelling with survival modelling to incorporate all available information. The aim of this paper is to establish simulation-based guidelines for the quality of longitudinal EHR data that is necessary for joint models to perform better than Cox models. We conducted an extensive simulation study by systematically and transparently varying different characteristics of data quality, e.g., measurement frequency, noise, and heterogeneity between patients. We applied the joint models and evaluated their performance relative to traditional Cox survival modelling techniques.
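To make the three data-quality characteristics concrete, the following is a minimal sketch of how such longitudinal data might be simulated. It is not the authors' simulation design. The linear random-intercept, random-slope biomarker trajectory, the hazard that depends on the current true biomarker value, the function name `simulate_cohort`, and all parameter names and default values are assumptions chosen for illustration only. Here `visits_per_year` stands in for measurement frequency, `noise_sd` for measurement noise, and `slope_sd` for heterogeneity between patients.

```python
import numpy as np


def simulate_cohort(n_patients=200, visits_per_year=4.0, noise_sd=0.5,
                    slope_sd=0.2, assoc=0.8, base_hazard=0.05,
                    follow_up_years=5.0, seed=0):
    """Simulate noisy biomarker measurements and event times (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Patient-specific true trajectories: m_i(t) = intercept_i + slope_i * t.
    intercepts = rng.normal(0.0, 1.0, size=n_patients)
    slopes = rng.normal(0.3, slope_sd, size=n_patients)
    # Unit-exponential draws for inverse-transform sampling of event times.
    thresholds = rng.exponential(1.0, size=n_patients)

    # Fine time grid used to integrate the hazard numerically.
    n_steps = 500
    dt = follow_up_years / n_steps
    grid = dt * np.arange(1, n_steps + 1)

    ids, times, values = [], [], []
    event_time = np.full(n_patients, follow_up_years)
    event = np.zeros(n_patients, dtype=bool)

    for i in range(n_patients):
        # Assumed hazard: h_i(t) = base_hazard * exp(assoc * m_i(t)).
        trajectory = intercepts[i] + slopes[i] * grid
        cum_hazard = np.cumsum(base_hazard * np.exp(assoc * trajectory)) * dt
        # The event occurs when the cumulative hazard first reaches the threshold;
        # otherwise the patient is censored at the end of follow-up.
        idx = np.searchsorted(cum_hazard, thresholds[i])
        if idx < n_steps:
            event_time[i] = grid[idx]
            event[i] = True

        # Regularly spaced visits strictly before the event or censoring time.
        visit_times = np.arange(0.0, event_time[i], 1.0 / visits_per_year)
        visit_times = visit_times[visit_times < event_time[i]]
        # Observed value = true trajectory + Gaussian measurement noise.
        observed = (intercepts[i] + slopes[i] * visit_times
                    + rng.normal(0.0, noise_sd, size=visit_times.size))

        ids.append(np.full(visit_times.size, i))
        times.append(visit_times)
        values.append(observed)

    longitudinal = {"id": np.concatenate(ids),
                    "time": np.concatenate(times),
                    "value": np.concatenate(values)}
    survival = {"time": event_time, "event": event}
    return longitudinal, survival
```

Calling `simulate_cohort` repeatedly over a grid of `visits_per_year`, `noise_sd`, and `slope_sd` values would yield datasets on which a joint model and a Cox model could then be fitted and compared. The model fitting itself is deliberately left out of this sketch.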

RESULTS

Key findings suggest that biomarker changes before disease onset must be consistent within similar patient groups. With increasing noise and a higher measurement density, the joint model surpasses the traditional Cox regression model in terms of model performance. We illustrate the usefulness and limitations of the guidelines with two real-world examples, namely the influence of serum bilirubin on primary biliary liver cirrhosis and the influence of the estimated glomerular filtration rate on chronic kidney disease.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09a1/12070788/467fe9b0e0da/13040_2025_450_Fig1_HTML.jpg
