Emmert-Streib Frank, Manjang Kalifa, Dehmer Matthias, Yli-Harja Olli, Auvinen Anssi
Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, 33720 Tampere, Finland.
Department of Computer Science, Swiss Distance University of Applied Sciences, 3900 Brig, Switzerland.
Cancers (Basel). 2021 Oct 12;13(20):5087. doi: 10.3390/cancers13205087.
Prognostic biomarkers can play an important role in clinical practice because they allow patients to be stratified by the predicted outcome of a disorder. Obstacles to developing such markers include a lack of robustness across different data sets and limited concordance among similar signatures. In this paper, we highlight a new problem that concerns the biological meaning of already established prognostic gene expression signatures. Specifically, it is commonly assumed that prognostic markers provide sensible biological information and molecular explanations about the underlying disorder. However, recent studies investigating 80 established prognostic signatures of breast and prostate cancer demonstrated that this is not the case. We show that this surprising result is related to the distinction between causal models and predictive models, and to the confusing way these models are used in the biomedical literature. Furthermore, we suggest a falsification procedure for studies that aim to establish a prognostic signature, to safeguard against false expectations about biological utility.