Alapati Rahul, Renslo Bryan, Wagoner Sarah F, Karadaghy Omar, Serpedin Aisha, Kim Yeo Eun, Feucht Maria, Wang Naomi, Ramesh Uma, Bon Nieves Antonio, Lawrence Amelia, Virgen Celina, Sawaf Tuleen, Rameau Anaïs, Bur Andrés M
Department of Otolaryngology-Head & Neck Surgery, University of Kansas Medical Center, Kansas City, Kansas, U.S.A.
Department of Otolaryngology-Head & Neck Surgery, Thomas Jefferson University, Philadelphia, Pennsylvania, U.S.A.
Laryngoscope. 2025 Feb;135(2):687-694. doi: 10.1002/lary.31756. Epub 2024 Sep 11.
This study aimed to assess reporting quality of machine learning (ML) algorithms in the head and neck oncology literature using the TRIPOD-AI criteria.
A comprehensive search was conducted using PubMed, Scopus, Embase, and Cochrane Database of Systematic Reviews, incorporating search terms related to "artificial intelligence," "machine learning," "deep learning," "neural network," and various head and neck neoplasms.
Two independent reviewers analyzed each published study for adherence to the 65-point TRIPOD-AI criteria. Items were classified as "Yes," "No," or "NA" for each publication. The proportion of studies satisfying each TRIPOD-AI criterion was calculated. Additionally, the evidence level for each study was evaluated independently by two reviewers using the Oxford Centre for Evidence-Based Medicine (OCEBM) Levels of Evidence. Discrepancies were reconciled through discussion until consensus was reached.
The study highlights the need for improvements in ML algorithm reporting in head and neck oncology. This includes more comprehensive descriptions of datasets, standardization of model performance reporting, and increased sharing of ML models, data, and code with the research community. Adoption of TRIPOD-AI is necessary for achieving standardized ML research reporting in head and neck oncology.
Current reporting of ML algorithms hinders clinical application, reproducibility, and understanding of the data used for model training. To overcome these limitations and improve patient and clinician trust, ML developers should provide open access to models, code, and source data, fostering iterative progress through community critique, thus enhancing model accuracy and mitigating biases.
Level of Evidence: NA. Laryngoscope, 135:687-694, 2025.