Weaver Colin George Wyllie, Basmadjian Robert B, Williamson Tyler, McBrien Kerry, Sajobi Tolu, Boyne Devon, Yusuf Mohamed, Ronksley Paul Everett
Department of Community Health Sciences, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
Department of Family Medicine, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
JMIR Res Protoc. 2022 Mar 3;11(3):e30956. doi: 10.2196/30956.
With growing excitement about the potential benefits of using machine learning and artificial intelligence in medicine, the number of published clinical prediction models that use these approaches has increased. However, limited evidence suggests that the reporting of machine learning-specific aspects in these studies is poor. Further, there are no reviews assessing the reporting quality of these aspects, nor broadly accepted reporting guidelines for them.
This paper presents the protocol for a systematic review that will assess the reporting quality of machine learning-specific aspects in studies that use machine learning to develop clinical prediction models.
We will include studies that use a supervised machine learning algorithm to develop a prediction model for use in clinical practice (ie, for diagnosis or prognosis of a condition or identification of candidates for health care interventions). We will search MEDLINE for studies published in 2019, pseudorandomly sort the records, and screen until we obtain 100 studies that meet our inclusion criteria. We will assess reporting quality with a novel checklist developed in parallel with this review, which includes content derived from existing reporting guidelines, textbooks, and consultations with experts. The checklist will cover 4 key areas where the reporting of machine learning studies is unique: modeling steps (order and data used for each step), model performance (eg, reporting the performance of each model compared), statistical methods (eg, describing the tuning approach), and presentation of models (eg, specifying the predictors that contributed to the final model).
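The pseudorandom sorting step described above could, for example, be implemented by shuffling the retrieved record identifiers with a fixed seed, so that the screening order is reproducible. A minimal sketch (the record identifiers, function name, and seed value are illustrative assumptions, not taken from the protocol):

```python
import random

def pseudorandom_sort(record_ids, seed=2020):
    """Shuffle retrieved records into a reproducible pseudorandom order.

    Records would then be screened in this order until the target number
    of included studies (eg, 100) is reached. The seed is illustrative.
    """
    rng = random.Random(seed)  # fixed seed makes the ordering reproducible
    shuffled = list(record_ids)
    rng.shuffle(shuffled)
    return shuffled

# Illustrative usage with placeholder identifiers
records = [f"PMID{n}" for n in range(1, 11)]
screening_order = pseudorandom_sort(records)
```

Because the seed is fixed, rerunning the sort on the same search results yields the same screening order, which keeps the selection auditable.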
We completed data analysis in August 2021 and are writing the manuscript. We expect to submit the results to a peer-reviewed journal in early 2022.
This review will contribute to more standardized and complete reporting in the field by identifying areas where reporting is poor and can be improved.
PROSPERO International Prospective Register of Systematic Reviews CRD42020206167; https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=206167.
INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR1-10.2196/30956.