Impact of censoring on learning Bayesian networks in survival modelling.

Affiliation

Department of Automation, Electronics and Computing, Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia.

Publication information

Artif Intell Med. 2009 Nov;47(3):199-217. doi: 10.1016/j.artmed.2009.08.001. Epub 2009 Oct 14.

Abstract

OBJECTIVE

Bayesian networks are commonly used for presenting uncertainty and covariate interactions in an easily interpretable way. Because of their efficient inference and ability to represent causal relationships, they are an excellent choice for medical decision support systems in diagnosis, treatment, and prognosis. Although good procedures for learning Bayesian networks from data have been defined, their performance in learning from censored survival data has not been widely studied. In this paper, we explore how to use these procedures to learn about possible interactions between prognostic factors and their influence on the variate of interest. We study how censoring affects the probability of learning correct Bayesian network structures. Additionally, we analyse the potential usefulness of the learnt models for predicting the time-independent probability of an event of interest.

METHODS AND MATERIALS

We analysed the influence of censoring in a simulation study on synthetic data sampled from randomly generated Bayesian networks. We used two well-known methods for learning Bayesian networks from data: a constraint-based method and a score-based method. We compared the performance of each method under different levels of censoring with that of the naive Bayes classifier and the proportional hazards model. We ran additional experiments on several datasets from real-world medical domains. The machine-learning methods treated censored cases in the data as event-free.
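To make the "censored cases as event-free" convention concrete, the following minimal Python sketch relabels a binary event indicator under a chosen censoring level. It is an illustration only: the function name `censor_as_event_free`, the independent random censoring mechanism, and the toy data are assumptions, not details taken from the paper.

```python
import random


def censor_as_event_free(true_events, censoring_fraction, seed=0):
    """Relabel cases so that every censored case counts as event-free (0).

    true_events: list of 0/1 indicators of the (hypothetical) true outcome.
    censoring_fraction: probability that a case is censored, assumed here
        to be independent of the outcome and of the covariates.
    """
    rng = random.Random(seed)
    labels = []
    for event in true_events:
        censored = rng.random() < censoring_fraction
        # A censored case loses its event information and is treated as 0.
        labels.append(0 if censored else event)
    return labels


# Toy data: half of the cases truly experience the event.
true_events = [1] * 500 + [0] * 500
for fraction in (0.0, 0.2, 0.8):
    labels = censor_as_event_free(true_events, fraction)
    print(fraction, sum(labels) / len(labels))
```

Under this convention the observed share of events can only shrink as censoring grows, which is one way to see why models learnt under heavier censoring drift toward the event-free majority class.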

RESULTS

We report and compare results for several commonly used model evaluation metrics. On average, the proportional hazards method outperformed the other methods in most censoring setups. As part of the simulation study, we also analysed structural similarities of the learnt networks. Compared with no censoring, heavy censoring produces up to 5% surplus arcs and up to 10% missing arcs overall. Among the arcs that should be connected to the variate of interest, up to 50% are missing.
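The surplus and missing arc counts can be illustrated with a short Python sketch that compares a learnt arc set with the generating one. This is a simplified assumption about the comparison: arcs are treated as directed `(parent, child)` pairs, whereas the paper may have compared structures differently. The function name `compare_arcs` and the variable names in the example are hypothetical.

```python
def compare_arcs(true_arcs, learnt_arcs, target=None):
    """Count surplus and missing arcs of a learnt network.

    Arcs are directed (parent, child) tuples. If target is given, only
    arcs that touch that variable are compared.
    """
    true_set = set(true_arcs)
    learnt_set = set(learnt_arcs)
    if target is not None:
        true_set = {arc for arc in true_set if target in arc}
        learnt_set = {arc for arc in learnt_set if target in arc}
    surplus = learnt_set - true_set   # learnt but absent from the true network
    missing = true_set - learnt_set   # present in the true network but not learnt
    missing_fraction = len(missing) / len(true_set) if true_set else 0.0
    return {
        "surplus": len(surplus),
        "missing": len(missing),
        "missing_fraction": missing_fraction,
    }


# Hypothetical example networks.
true_arcs = [("age", "event"), ("stage", "event"), ("age", "stage")]
learnt_arcs = [("age", "stage"), ("stage", "event"), ("grade", "stage")]

overall = compare_arcs(true_arcs, learnt_arcs)
around_event = compare_arcs(true_arcs, learnt_arcs, target="event")
```

In this toy case one of the two arcs into `event` is lost, so the missing fraction around the variate of interest is 0.5 even though only one arc in three is missing overall.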

CONCLUSION

The presented methods for learning Bayesian networks from data can be used to learn from censored survival data in the presence of light censoring (up to 20%) by treating censored cases as event-free. Under intermediate or heavy censoring, the learnt models become tuned to the majority class, so a different approach is required.
