Exploiting missing clinical data in Bayesian network modeling for predicting medical problems.

Authors

Lin Jau-Huei, Haug Peter J

Affiliation

Department of Biomedical Informatics, University of Utah, 26 South 2000 East Room 5775 HSEB, Salt Lake City, UT 84112-5750, USA.

Publication information

J Biomed Inform. 2008 Feb;41(1):1-14. doi: 10.1016/j.jbi.2007.06.001. Epub 2007 Jun 9.

DOI:10.1016/j.jbi.2007.06.001

PMID:17625974

Abstract

When machine learning algorithms are applied to data collected during the course of clinical care, it is generally accepted that the data have not been collected consistently. The absence of expected data elements is common, and the mechanism through which a data element goes missing often involves the clinical relevance of that element in a specific patient. Therefore, the absence of data may have information value of its own. In the process of designing an application intended to support a medical problem list, we studied whether the "missingness" of clinical data can provide useful information in building prediction models. In this study, we experimented with four methods of treating missing values in a clinical data set, two of which explicitly model the absence or "missingness" of data. Each of these data sets was used to build four different kinds of Bayesian classifiers: a naive Bayes structure, a human-composed network structure, and two networks based on structural learning algorithms. We compared the performance of the groups with and without explicit models of missingness using the area under the ROC curve. The results showed that in most cases the classifiers trained using the explicit missing value treatments performed better, which suggests that information may exist in "missingness" itself. Thus, when designing a decision support system, we suggest explicitly representing the presence or absence of data in the underlying logic.
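The comparison described in the abstract can be illustrated with a small sketch. The abstract does not name the four missing-value treatments, so this example assumes two stand-ins: mode imputation (no explicit model of missingness) and an extra "missing" state per variable (explicit model). The synthetic cohort, the two lab variables, the probabilities, and all function names are hypothetical, and only the naive Bayes structure is shown. It is not the authors' implementation.

```python
import math
import random
from collections import Counter, defaultdict

MISSING = None  # marker for a value that was never recorded


def simulate(n, rng):
    """Toy cohort (assumed): each lab is ordered more often when the problem is present."""
    rows, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.5 else 0
        row = []
        for _ in range(2):  # two hypothetical lab tests
            ordered = rng.random() < (0.8 if y else 0.2)
            if not ordered:
                row.append(MISSING)
            else:
                row.append("high" if rng.random() < (0.6 if y else 0.4) else "normal")
        rows.append(row)
        labels.append(y)
    return rows, labels


def column_modes(rows):
    """Most frequent observed value of each column."""
    modes = []
    for j in range(len(rows[0])):
        counts = Counter(r[j] for r in rows if r[j] is not MISSING)
        modes.append(counts.most_common(1)[0][0])
    return modes


def impute_mode(rows, modes):
    """Treatment without a missingness model: fill gaps with the column mode."""
    return [[modes[j] if v is MISSING else v for j, v in enumerate(r)] for r in rows]


def as_state(rows):
    """Treatment with a missingness model: 'missing' becomes its own state."""
    return [["missing" if v is MISSING else v for v in r] for r in rows]


def fit_naive_bayes(rows, labels, alpha=1.0):
    """Count-based categorical naive Bayes with Laplace smoothing."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    states = [set() for _ in rows[0]]
    for r, y in zip(rows, labels):
        for j, v in enumerate(r):
            value_counts[(j, y)][v] += 1
            states[j].add(v)
    return class_counts, value_counts, states, alpha


def log_odds(model, row):
    """Log posterior odds of class 1 versus class 0 for one record."""
    class_counts, value_counts, states, alpha = model
    score = math.log(class_counts[1] / class_counts[0])
    for j, v in enumerate(row):
        k = len(states[j])
        p1 = (value_counts[(j, 1)][v] + alpha) / (class_counts[1] + alpha * k)
        p0 = (value_counts[(j, 0)][v] + alpha) / (class_counts[0] + alpha * k)
        score += math.log(p1 / p0)
    return score


def auc(scores, labels):
    """Area under the ROC curve as the pairwise win rate, ties counted as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
train_rows, train_y = simulate(2000, rng)
test_rows, test_y = simulate(1000, rng)
modes = column_modes(train_rows)

treatments = [
    ("mode imputation", lambda rows: impute_mode(rows, modes)),
    ("explicit missing state", as_state),
]
results = {}
for name, prepare in treatments:
    model = fit_naive_bayes(prepare(train_rows), train_y)
    scores = [log_odds(model, r) for r in prepare(test_rows)]
    results[name] = auc(scores, test_y)
    print(f"{name}: AUC = {results[name]:.3f}")
```

In this toy setup the missingness is informative by construction, so the explicit "missing" state is expected to score a higher AUC than mode imputation. Real clinical data may behave differently, which is consistent with the abstract's finding that explicit treatments performed better in most cases rather than in all of them.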

Similar articles

Exploiting missing clinical data in Bayesian network modeling for predicting medical problems.

J Biomed Inform. 2008 Feb;41(1):1-14. doi: 10.1016/j.jbi.2007.06.001. Epub 2007 Jun 9.

Prognostic Bayesian networks I: rationale, learning procedure, and clinical use.

J Biomed Inform. 2007 Dec;40(6):609-18. doi: 10.1016/j.jbi.2007.07.003. Epub 2007 Jul 25.

Feature selection in Bayesian classifiers for the prognosis of survival of cirrhotic patients treated with TIPS.

J Biomed Inform. 2005 Oct;38(5):376-88. doi: 10.1016/j.jbi.2005.05.004. Epub 2005 Jun 4.

Impact of censoring on learning Bayesian networks in survival modelling.

Artif Intell Med. 2009 Nov;47(3):199-217. doi: 10.1016/j.artmed.2009.08.001. Epub 2009 Oct 14.

A decision support system to facilitate management of patients with acute gastrointestinal bleeding.

Artif Intell Med. 2008 Mar;42(3):247-59. doi: 10.1016/j.artmed.2007.10.003. Epub 2007 Dec 11.

Automated diagnosis of coronary artery disease based on data mining and fuzzy modeling.

IEEE Trans Inf Technol Biomed. 2008 Jul;12(4):447-58. doi: 10.1109/TITB.2007.907985.

Decision strategies that maximize the area under the LROC curve.

IEEE Trans Med Imaging. 2005 Dec;24(12):1626-36. doi: 10.1109/TMI.2005.859210.

Application of irregular and unbalanced data to predict diabetic nephropathy using visualization and feature selection methods.

Artif Intell Med. 2008 Jan;42(1):37-53. doi: 10.1016/j.artmed.2007.09.005. Epub 2007 Nov 7.

Predicting dire outcomes of patients with community acquired pneumonia.

J Biomed Inform. 2005 Oct;38(5):347-66. doi: 10.1016/j.jbi.2005.02.005. Epub 2005 Mar 17.

A hybrid neural network system for pattern classification tasks with missing features.

IEEE Trans Pattern Anal Mach Intell. 2005 Apr;27(4):648-53. doi: 10.1109/TPAMI.2005.64.

Cited by

Data Synthesis Reinvented: Preserving Missing Patterns for Enhanced Analysis.

IEEE Trans Knowl Data Eng. 2025 Jul;37(7):3962-3975. doi: 10.1109/tkde.2025.3563319. Epub 2025 Apr 22.

Preserving Missing Data Distribution in Synthetic Data.

Proc Int World Wide Web Conf. 2023 Apr-May;2023:2110-2121. doi: 10.1145/3543507.3583297. Epub 2023 Apr 30.

A Bayesian Network Approach to Lung Cancer Screening: Assessing the Impact of Data Quantity, Quality, and the Combination of Data from Danish Electronic Health Records.

Cancers (Basel). 2024 Nov 28;16(23):3989. doi: 10.3390/cancers16233989.

Assessing Credibility in Bayesian Networks Structure Learning.

Entropy (Basel). 2024 Sep 30;26(10):829. doi: 10.3390/e26100829.

A Machine Learning Model for Predicting In-Hospital Mortality in Chinese Patients With ST-Segment Elevation Myocardial Infarction: Findings From the China Myocardial Infarction Registry.

J Med Internet Res. 2024 Jul 30;26:e50067. doi: 10.2196/50067.

Missing data matter: an empirical evaluation of the impacts of missing EHR data in comparative effectiveness research.

J Am Med Inform Assoc. 2023 Jun 20;30(7):1246-1256. doi: 10.1093/jamia/ocad066.

Robustness of Multiple Imputation Methods for Missing Risk Factor Data from Electronic Medical Records for Observational Studies.

J Healthc Inform Res. 2022 Sep 10;6(4):385-400. doi: 10.1007/s41666-022-00119-w. eCollection 2022 Dec.

Treatment of missing data in Bayesian network structure learning: an application to linked biomedical and social survey data.

BMC Med Res Methodol. 2022 Dec 19;22(1):326. doi: 10.1186/s12874-022-01781-9.

Machine learning modeling practices to support the principles of AI and ethics in nutrition research.

Nutr Diabetes. 2022 Dec 2;12(1):48. doi: 10.1038/s41387-022-00226-y.

Impact of molecular sequence data completeness on HIV cluster detection and a network science approach to enhance detection.

Sci Rep. 2022 Nov 10;12(1):19230. doi: 10.1038/s41598-022-21924-8.
