Hope Thomas M H, Bowman Howard, Bruce Rachel M, Leff Alex P, Price Cathy J
Department of Imaging Neuroscience, Institute of Neurology, University College London, London, UK.
Department of Psychology and Social Sciences, John Cabot University, Rome, Italy.
Ann Clin Transl Neurol. 2025 Aug;12(8):1619-1627. doi: 10.1002/acn3.70077. Epub 2025 Jun 12.
Current medicine cannot confidently predict who will recover from post-stroke impairments. Researchers have sought to bridge this gap by treating the post-stroke prognostic problem as a machine learning problem, reporting prediction error metrics across samples of patients whose outcomes are known. This approach effectively shares prediction error equally among the patients, which is contrary to the long-held clinical intuition that some patients' outcomes are more predictable than other patients' outcomes. Here, we test that intuition empirically, by asking whether those 'more predictable' patients can be identified before their outcomes are known.
Drawing on lesion location and demographic data, we use ensemble classifiers to predict the presence of a variety of different language impairments in a large sample of stroke patients. We tune these models to maximise their Positive Predictive Value (or precision): that is, the probability that patients assigned to a class are really members of that class. We test whether those tuned models have high precision on independent data.
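To make the idea of precision tuning concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a classifier has already produced a score per patient (for example, an ensemble's predicted probability of impairment) and swaps in simple threshold selection on a tuning set as a stand-in for whatever tuning procedure the study used. The function names, the scores, and the labels are hypothetical toy values, not study data.

```python
def precision_at_threshold(scores, labels, threshold):
    """Precision (PPV) among cases whose score is at or above the threshold.

    Returns None when no case is classified, since PPV is undefined then.
    """
    selected = [y for s, y in zip(scores, labels) if s >= threshold]
    if not selected:
        return None
    return sum(selected) / len(selected)


def tune_threshold(scores, labels, target_precision):
    """Lowest candidate threshold whose tuning-set precision meets the target.

    Taking the lowest qualifying threshold classifies as many cases as
    possible. Returns None if no threshold reaches the target.
    """
    for t in sorted(set(scores)):
        p = precision_at_threshold(scores, labels, t)
        if p is not None and p >= target_precision:
            return t
    return None


# Hypothetical tuning data: classifier scores and true labels (1 = impaired).
tune_scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
tune_labels = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical independent data, used only to check the tuned threshold.
test_scores = [0.92, 0.88, 0.81, 0.65, 0.50, 0.35]
test_labels = [1, 1, 1, 0, 1, 0]

threshold = tune_threshold(tune_scores, tune_labels, target_precision=0.90)
held_out_precision = precision_at_threshold(test_scores, test_labels, threshold)
classified = sum(s >= threshold for s in test_scores)

print("tuned threshold:", threshold)
print("held-out precision:", held_out_precision)
print("patients classified:", classified, "of", len(test_scores))
```

Patients scoring below the tuned threshold simply receive no prediction in this sketch, which is what allows precision to stay high for those who do.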
Precision-tuned models might only classify a subset of patients, but for that reduced set, the classifications are very likely to be correct, with precision typically > 90% and sometimes > 95%. Small reductions of target precision could rapidly raise the proportion of patients for whom 'high enough precision' predictions can be made.
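The trade-off between target precision and the proportion of patients classified can be illustrated with a small sketch. It assumes each patient has a classifier score and a known outcome, ranks patients by score, and reports the largest fraction that can be classified while precision stays at or above a chosen target. The function name and the data are hypothetical toy values chosen for illustration, so the numbers say nothing about the study's actual results.

```python
def coverage_at_target(scores, labels, target_precision):
    """Largest fraction of cases that can be classified as positive while
    precision stays at or above the target (0.0 if no cut-off qualifies).

    Cut-offs are only considered between distinct scores, so every
    reported coverage corresponds to a realisable score threshold.
    """
    ranked = sorted(zip(scores, labels), reverse=True)
    n = len(ranked)
    best, hits = 0, 0
    for k, (s, y) in enumerate(ranked, start=1):
        hits += y
        last_of_tie = k == n or ranked[k][0] != s
        if last_of_tie and hits / k >= target_precision:
            best = k
    return best / n


# Hypothetical scores and true labels (1 = impaired) for ten patients.
scores = [0.95, 0.90, 0.85, 0.80, 0.70, 0.60, 0.55, 0.40, 0.30, 0.20]
labels = [1, 1, 1, 1, 0, 1, 0, 0, 1, 0]

# Coverage for a range of target precisions, from strict to lenient.
coverage = {t: coverage_at_target(scores, labels, t) for t in (0.95, 0.90, 0.80, 0.70)}
for target, fraction in coverage.items():
    print(f"target precision {target:.2f}: coverage {fraction:.1f}")
```

In this toy data, relaxing the target from 0.90 to 0.80 raises coverage from four patients in ten to six in ten, which mirrors the kind of trade-off described above.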
High precision prognoses are possible when predicting language outcomes after stroke. Providing such predictions for subsets of patients might be a reasonable intermediate step on the way to providing them for all.