Bzdok Danilo, Engemann Denis, Thirion Bertrand
Mila - Quebec Artificial Intelligence Institute, Montreal, QC, Canada.
Department of Biomedical Engineering, McConnell Brain Imaging Centre (BIC), Montreal Neurological Institute (MNI), Faculty of Medicine, School of Computer Science, McGill University, Montreal, QC, Canada.
Patterns (N Y). 2020 Oct 8;1(8):100119. doi: 10.1016/j.patter.2020.100119. eCollection 2020 Nov 13.
In the 20th century, many advances in biological knowledge and evidence-based medicine were supported by p values and accompanying methods. In the early 21st century, ambitions toward precision medicine place a premium on detailed predictions for single individuals. The shift causes tension between traditional regression methods used to infer statistically significant group differences and burgeoning predictive analysis tools suited to forecast an individual's future. Our comparison applies linear models for identifying significant contributing variables and for finding the most predictive variable sets. In systematic data simulations and common medical datasets, we explored how variables identified as significantly relevant and variables identified as predictively relevant can agree or diverge. Across analysis scenarios, even small predictive performances typically coincided with finding underlying significant statistical relationships, but not vice versa. A more complete understanding of the different ways to define "important" associations is a prerequisite for reproducible research and advances toward personalizing medical care.
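The contrast between significance and predictive relevance can be illustrated with a small simulation. The sketch below is not the authors' pipeline: it assumes ordinary least squares for both the inferential and the predictive analysis, uses a normal approximation for the p values, and picks a hypothetical sample size and effect size (`n=5000`, `effect=0.1`). All function names are illustrative.

```python
import math
import numpy as np


def ols_p_values(X, y):
    """Two-sided p values for the OLS slopes (normal approximation, an assumption)."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])          # prepend an intercept column
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    sigma2 = resid @ resid / (n - p - 1)           # unbiased noise variance estimate
    cov = sigma2 * np.linalg.inv(Xd.T @ Xd)        # covariance of the coefficients
    z = beta / np.sqrt(np.diag(cov))
    # Skip the intercept; erfc(|z| / sqrt(2)) is the two-sided normal tail area.
    return np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z[1:]])


def cv_r2(X, y, k=5, seed=0):
    """Out-of-sample R^2 of OLS, pooled over k cross-validation folds."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    pred = np.empty(n)
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        Xtr = np.column_stack([np.ones(len(train)), X[train]])
        Xte = np.column_stack([np.ones(len(test)), X[test]])
        beta, *_ = np.linalg.lstsq(Xtr, y[train], rcond=None)
        pred[test] = Xte @ beta                    # predict held-out samples only
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def simulate(n=5000, effect=0.1, seed=0):
    """Five standard-normal variables; only the first one carries a (weak) effect."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, 5))
    y = effect * X[:, 0] + rng.standard_normal(n)
    return X, y


X, y = simulate()
p_values = ols_p_values(X, y)
r2 = cv_r2(X, y)
print("p values:", np.round(p_values, 4))
print("cross-validated R^2:", round(r2, 4))
```

Under these assumptions, the first variable should come out clearly significant while the cross-validated R^2 stays close to zero, which is one way a variable can be significantly relevant without being very useful for forecasting individual outcomes.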