Hainke Katrin, Rahnenführer Jörg, Fried Roland
Department of Statistics, TU Dortmund University, 44221 Dortmund, Germany.
Biom J. 2012 Sep;54(5):617-40. doi: 10.1002/bimj.201100186. Epub 2012 Aug 8.
A better understanding of disease progression supports early diagnosis and appropriate individual therapy. Many approaches to the statistical modelling of cumulative disease progression have been proposed in the literature, ranging from simple path models to complex restricted Bayesian networks. Important fields of application are diseases such as cancer and HIV. Tumour progression is measured by means of chromosome aberrations, whereas people infected with HIV develop drug resistance because of genetic changes in the virus. These two very different diseases have typical courses of progression, which can be modelled partly by consecutive and partly by independent steps. This paper gives an overview of the different progression models and points out their advantages and drawbacks. In a simulation study, we compare the models to analyse how they behave when some of their assumptions are violated, evaluating how well they fit the induced multivariate probability distributions and the topological relationships. We often find that the true model class used for generating the data is outperformed by either a less or a more complex model class. The more flexible conjunctive Bayesian networks can be used to fit oncogenetic trees, whereas mixtures of oncogenetic trees with three tree components can be fitted well by mixture models with only two tree components.
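The idea of progression "partly by consecutive and partly by independent steps" can be illustrated with a minimal sketch. It is not the authors' simulation code. It assumes a conjunctive rule in which an event can occur only after all of its parent events have occurred, so that a tree (at most one parent per event) is a special case. The function name sample_genotype, the events A, B, C and all probabilities are hypothetical and chosen only for illustration.

```python
import random


def sample_genotype(parents, theta, order, rng):
    """Draw one binary genotype under a conjunctive progression rule.

    parents: dict mapping each event to the list of its parent events
             (an empty list means the event has no prerequisite)
    theta:   dict mapping each event to the probability that it occurs
             once all of its parents have occurred
    order:   events listed so that every parent precedes its children
    rng:     a random.Random instance, passed in for reproducibility
    """
    genotype = {}
    for event in order:
        # An event is only possible if every parent event has occurred.
        prerequisites_met = all(genotype[p] for p in parents[event])
        genotype[event] = prerequisites_met and rng.random() < theta[event]
    return genotype


# Hypothetical tree: A must occur first (consecutive step); B and C then
# occur independently of each other once A is present.
parents = {"A": [], "B": ["A"], "C": ["A"]}
theta = {"A": 0.8, "B": 0.5, "C": 0.3}  # illustrative values, not from the paper
order = ["A", "B", "C"]

rng = random.Random(0)
samples = [sample_genotype(parents, theta, order, rng) for _ in range(2000)]
```

Giving an event more than one parent turns the same sampler into a simple conjunctive network, which is one way to see why the more flexible class can represent tree-structured data.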