Indrayan Abhaya, Mishra Sakshi
Department of Clinical Research, Max Healthcare, New Delhi, India.
Indian J Community Med. 2025 Sep-Oct;50(5):739-744. doi: 10.4103/ijcm.ijcm_567_24. Epub 2025 Mar 31.
Many models claim to predict outcomes with good accuracy, yet few seem to be adopted in practice. This could be because most of them do not have sufficient predictive accuracy. We analyzed 20 recently published papers on prediction models and found that most use inadequate measures to assess predictive performance. The main ones are the area under the ROC curve (C-index), which measures discrimination rather than predictivity and is often accepted at a relatively low value, and aggregate concordance, which is used to assess predictive accuracy in place of individual-level agreement between the observed and predicted values. Some papers also use arbitrary scores in their models, consider only binary outcomes where multiple categories could be more useful, misinterpret values, ignore future dynamics, use inappropriate validation settings, or do not fully consider the process leading to the outcomes. We give details of all these inadequacies and suggest remedies so that models with adequate predictive performance can be developed.
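The distinction between discrimination, aggregate concordance, and individual-level agreement can be illustrated with a small sketch. It is not taken from the paper: the toy outcomes and predicted probabilities are invented for illustration, the 0.5 classification threshold is an assumption, and the function names `auc`, `individual_agreement`, and `aggregate_gap` are hypothetical helpers.

```python
from itertools import product


def auc(y, p):
    """C-index for a binary outcome: share of (event, non-event) pairs in
    which the event has the higher predicted probability (ties count 0.5)."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    score = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a, b in product(events, non_events)
    )
    return score / (len(events) * len(non_events))


def individual_agreement(y, p, threshold=0.5):
    """Share of individuals whose thresholded prediction matches the
    observed outcome. The 0.5 threshold is an assumption of this sketch."""
    hits = sum((pi >= threshold) == bool(yi) for yi, pi in zip(y, p))
    return hits / len(y)


def aggregate_gap(y, p):
    """Observed number of events minus the expected number (sum of
    predicted probabilities): an aggregate-level comparison."""
    return sum(y) - sum(p)


# Toy case 1 (invented data): perfect ranking, but every predicted
# probability is far below the threshold.
y1 = [0, 0, 0, 1, 1, 1]
p1 = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06]
print(auc(y1, p1))                   # 1.0: perfect discrimination
print(individual_agreement(y1, p1))  # 0.5: no event is predicted as an event

# Toy case 2 (invented data): expected and observed event counts match,
# yet every individual prediction points the wrong way.
y2 = [1, 0, 1, 0]
p2 = [0.1, 0.9, 0.1, 0.9]
print(abs(aggregate_gap(y2, p2)) < 1e-9)  # True: aggregate concordance holds
print(individual_agreement(y2, p2))       # 0.0: no individual is predicted correctly
```

In the first case a high C-index coexists with poor individual predictions, and in the second a perfect aggregate match hides complete individual-level disagreement, which is the kind of gap the abstract describes.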