Dipartimento di Statistica G. Parenti, Università degli Studi di Firenze, viale Morgagni 59, Florence, Italy.
BMC Bioinformatics. 2009 Oct 15;10 Suppl 12(Suppl 12):S13. doi: 10.1186/1471-2105-10-S12-S13.
The associations among different biomarkers are important in clinical settings because they help characterise specific pathways related to the natural history of the disease and to its genetic and environmental determinants. Although binary/linear (or at least monotonic) correlation indices are available, fully exploiting molecular information requires knowledge of the direct and indirect conditional independence (and possibly causal) relationships among biomarkers, and between biomarkers and target variables, in the population of interest. In other words, it requires inference on the joint multivariate distribution of markers and target variables. Graphical models, such as Bayesian Networks, are well suited to this purpose. We therefore reconsidered a previously published case study on classical biomarkers in breast cancer, namely estrogen receptor (ER), progesterone receptor (PR), a proliferative index (Ki67/MIB-1) and the proteins HER2/neu (NEU) and p53, to infer the conditional independence relations in their joint distribution by learning the structure of graphs that entail those relations. We also examined the conditional distribution of a special molecular phenotype, called triple-negative, in which ER, PR and NEU are absent. We confirmed that ER is a key marker, and we found that it defines subpopulations of patients characterised by different conditional independence relations among biomarkers. We also found preliminary evidence that, given a triple-negative profile, the distribution of p53 protein is concentrated mostly on the 'zero' and 'high' states, providing useful information for selecting patients who could benefit from adjuvant anthracycline/alkylating agent-based chemotherapy.
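To make the notion of conditional independence concrete, the following minimal Python sketch computes a likelihood-ratio (G) statistic for "X independent of Y given Z" on discrete data. It is only an illustration of the kind of query that a learned graph structure encodes; the abstract does not state which learning algorithm or test was used, and the function name `g_statistic`, the helper `expand`, the state labels and all counts are hypothetical and invented for this example (they are not data from the study).

```python
from collections import Counter
from math import log


def g_statistic(records, x, y, given=()):
    """Likelihood-ratio (G) statistic for 'x independent of y given `given`'.

    records: list of dicts mapping a variable name to a discrete state.
    Returns (G, degrees_of_freedom). G close to 0 is compatible with
    conditional independence in the sample.
    """
    n_xyz, n_xz, n_yz, n_z = Counter(), Counter(), Counter(), Counter()
    for r in records:
        z = tuple(r[v] for v in given)  # stratum defined by the conditioning set
        n_xyz[(r[x], r[y], z)] += 1
        n_xz[(r[x], z)] += 1
        n_yz[(r[y], z)] += 1
        n_z[z] += 1

    # G = 2 * sum n_xyz * ln( n_xyz * n_z / (n_xz * n_yz) ); empty cells contribute 0
    g = 0.0
    for (xv, yv, z), n in n_xyz.items():
        g += 2.0 * n * log(n * n_z[z] / (n_xz[(xv, z)] * n_yz[(yv, z)]))

    # Degrees of freedom: (rows - 1) * (cols - 1), summed over observed strata
    df = 0
    for z in n_z:
        x_levels = sum(1 for (_, zz) in n_xz if zz == z)
        y_levels = sum(1 for (_, zz) in n_yz if zz == z)
        df += (x_levels - 1) * (y_levels - 1)
    return g, df


def expand(table, er):
    """Turn a {(PR, Ki67): count} table into one record per hypothetical patient."""
    return [{"ER": er, "PR": pr, "Ki67": ki}
            for (pr, ki), count in table.items() for _ in range(count)]


# Invented counts, built so that PR and Ki67 are exactly independent
# within each ER stratum but associated once the strata are pooled.
toy = (
    expand({("neg", "low"): 8, ("neg", "high"): 2,
            ("pos", "low"): 4, ("pos", "high"): 1}, er="neg")
    + expand({("neg", "low"): 1, ("neg", "high"): 4,
              ("pos", "low"): 2, ("pos", "high"): 8}, er="pos")
)

g_marginal, df_marginal = g_statistic(toy, "PR", "Ki67")
g_given_er, df_given_er = g_statistic(toy, "PR", "Ki67", given=("ER",))
print(g_marginal, df_marginal)
print(g_given_er, df_given_er)
```

In this toy table the pooled statistic is positive while the statistic conditional on ER is zero, which is the pattern a graph would represent by omitting a direct edge between PR and Ki67 once ER is accounted for. In practice the statistic would be compared with a chi-squared distribution with the returned degrees of freedom, and sparse strata would need care.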