Aung Thazin N, Liu Matthew, Su David, Shafi Saba, Boyaci Ceren, Steen Sanna, Tsiknakis Nikolaos, Vidal Joan Martinez, Maher Nigel, Micevic Goran, Tan Samuel X, Vesely Matthew D, Nourmohammadi Saeed, Bai Yalai, Djureinovic Dijana, Wong Pok Fai, Bates Katherine, Chan Nay N N, Gavirelatou Niki, He Mengni, Burela Sneha, Barna Robert, Bosic Martina, Bräutigam Konstantin, Illabochaca Irineu, Chenhao Zhou, Gama Joao, Kreis Bianca, Mohacsi Reka, Pillar Nir, Pinto Joao, Poulios Christos, Toli Maria Angeliki, Tzoras Evangelos, Bracero Yadriel, Bosisio Francesca, Cserni Gábor, Dema Alis, Fortarezza Francesco, Gonzalez Mercedes Solorzano, Gullo Irene, Queipo Gutiérrez Francisco Javier, Hacihasanoglu Ezgi, Jovic Viktor, Lazar Bianca, Olinca Maria, Neppl Christina, Oliveira Rui Caetano, Pezzuto Federica, Gomes Pinto Daniel, Plotar Vanda, Pop Ovidiu, Rau Tilman, Skok Kristijan, Sun Wenwen, Serbes Ezgi Dicle, Solass Wiebke, Stanowska Olga, Szasz Marcell, Szymonski Krzysztof, Thimm Franziska, Vignati Danielle, Vigdorovits Alon, Prieto Victor, Sinnberg Tobias, Wilmott James, Cowper Shawn, Warrell Jonathan, Saenger Yvonne, Hartman Johan, Plummer Jasmine, Osman Iman, Rimm David L, Acs Balazs
Department of Pathology, Yale University School of Medicine, New Haven, Connecticut.
Yale Cancer Center, New Haven, Connecticut.
JAMA Netw Open. 2025 Jul 1;8(7):e2518906. doi: 10.1001/jamanetworkopen.2025.18906.
IMPORTANCE: Tumor-infiltrating lymphocytes (TILs) are a provocative biomarker in melanoma, influencing diagnosis, prognosis, and immunotherapy outcomes; however, traditional pathologist-read TIL assessment on hematoxylin and eosin-stained slides is prone to interobserver variability, leading to inconsistent clinical decisions. Therefore, development of newer TIL scoring approaches that produce more reliable and consistent readouts is important.
OBJECTIVE: To evaluate the analytical and clinical validity of a machine learning algorithm for TIL quantification in melanoma compared with traditional pathologist-read methods.
DESIGN, SETTING, AND PARTICIPANTS: This multioperator, global, multi-institutional prognostic study compared TIL scoring reproducibility between traditional pathologist-read methods and an artificial intelligence (AI)-driven approach. The study was conducted using retrospective cohorts of patients with melanoma between January 2022 and June 2023 across 45 institutions, with tissue evaluated by participants from academic, clinical, and research institutions. Participants were selected to ensure diverse expertise and professional backgrounds.
MAIN OUTCOMES AND MEASURES: Intraclass correlation coefficient (ICC) values were calculated for the manual and AI-assisted arms using log-transformed data. Kendall W values were calculated for Clark scores (brisk = 3, nonbrisk = 2, and sparse = 1). Reliabilities of ICC and W values were classified as moderate (0.40-0.60), good (0.61-0.80), or excellent (>0.80). AI TIL measurements were dichotomized using the 16.6 and median cutoffs. Univariable and multivariable Cox regression analyses assessed the prognostic value of TIL scores adjusted for clinicopathologic variables.
RESULTS: There were 111 patients with melanoma in the independent testing cohort (median [range] age at diagnosis, 61.0 [25.0-87.0] years; 56 [50.5%] male) who contributed melanoma whole tissue sections. A total of 98 participants evaluated TILs on 60 hematoxylin and eosin-stained melanoma tissue sections. All 40 participants in the manual arm were pathologists, while the AI-assisted arm included 11 pathologists and 47 nonpathologists (scientists). The AI algorithm demonstrated superior reproducibility, with ICCs higher than 0.90 for all machine learning TIL variables, significantly outperforming manual assessments (ICC, 0.61 for AI-derived stromal TILs vs Kendall W, 0.44 for manual Clark TIL scoring). AI-based TIL scores showed prognostic associations with patient outcomes (n = 111) using the median cutoff approach with a hazard ratio (HR) of 0.45 (95% CI, 0.26-0.80; P = .005), and using the cutoff of 16.6, with an HR of 0.56 (95% CI, 0.32-0.98; P = .04).
CONCLUSIONS AND RELEVANCE: In this prognostic study of TIL quantification in melanoma, the AI algorithm demonstrated superior reproducibility and prognostic associations compared with traditional methods. Although the retrospective nature of the cohorts limits demonstration of clinical utility, the publicly available dataset and open-source AI tool offer a foundation for future validation and integration into melanoma management.
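The agreement statistics named under MAIN OUTCOMES AND MEASURES can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the abstract does not say which ICC form was used, so ICC(2,1) (two-way random effects, absolute agreement, single rater) is assumed; the log transform is assumed to be log(1 + x); values falling between the published bands (for example, between 0.60 and 0.61) are assigned to the higher band; and all function names and the "below moderate" label are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def kendall_w(scores):
    """Tie-corrected Kendall W; scores[r][s] is rater r's score for slide s."""
    m, n = len(scores), len(scores[0])
    ranked = [average_ranks(row) for row in scores]
    rank_sums = [sum(ranked[r][s] for r in range(m)) for s in range(n)]
    mean_sum = m * (n + 1) / 2
    s_stat = sum((x - mean_sum) ** 2 for x in rank_sums)
    # Tie correction: sum of (t^3 - t) over each rater's groups of equal scores.
    ties = sum(row.count(v) ** 3 - row.count(v) for row in scores for v in set(row))
    denom = m ** 2 * (n ** 3 - n) - m * ties
    if denom == 0:
        raise ValueError("W is undefined when every rater gives one constant score")
    return 12 * s_stat / denom


def log_transform(scores):
    """Assumed transform: log(1 + x), so zero TIL readouts stay finite."""
    return [[math.log1p(v) for v in row] for row in scores]


def icc_2_1(scores):
    """ICC(2,1) from two-way ANOVA mean squares (assumed ICC form)."""
    k, n = len(scores), len(scores[0])  # k raters, n slides
    grand = sum(sum(row) for row in scores) / (n * k)
    slide_means = [sum(scores[r][s] for r in range(k)) / k for s in range(n)]
    rater_means = [sum(row) / n for row in scores]
    ss_slides = k * sum((x - grand) ** 2 for x in slide_means)
    ss_raters = n * sum((x - grand) ** 2 for x in rater_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ms_slides = ss_slides / (n - 1)
    ms_raters = ss_raters / (k - 1)
    ms_error = (ss_total - ss_slides - ss_raters) / ((n - 1) * (k - 1))
    return (ms_slides - ms_error) / (
        ms_slides + (k - 1) * ms_error + k * (ms_raters - ms_error) / n
    )


def reliability_label(value):
    """Map an ICC or W value onto the bands quoted in the abstract."""
    if value > 0.80:
        return "excellent"
    if value > 0.60:  # assumption: gap between 0.60 and 0.61 counts as good
        return "good"
    if value >= 0.40:
        return "moderate"
    return "below moderate"  # hypothetical label; the abstract names no band here
```

Under these bands, the reported Kendall W of 0.44 for manual Clark scoring maps to "moderate", and ICCs above 0.90 map to "excellent". For continuous AI readouts, a call such as `icc_2_1(log_transform(scores))` mirrors the log-transformed ICC described above, with `scores` holding one row per rater and one column per slide.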