Jairath Neil K, Pahalyants Vartan, Cheraghlou Shayan, Maas Derek, Lee Nayoung, Criscito Maressa C, Stevenson Mary L, Mehta Apoorva, Leibovit-Reiben Zachary, Stockard Alyssa, Doudican Nicole, Mangold Aaron, Carucci John A
The Ronald O. Perelman Department of Dermatology, New York University Grossman School of Medicine, New York.
Department of Dermatology, Mayo Clinic Comprehensive Cancer Center, Phoenix, Arizona.
JAMA Dermatol. 2025 Jun 11. doi: 10.1001/jamadermatol.2025.1614.
IMPORTANCE: Outcomes vary substantially within T stages for patients with cutaneous squamous cell carcinoma (cSCC).
OBJECTIVE: To determine whether a customized generative pretrained transformer model, trained on a comprehensive dataset with more than 1 trillion parameters and equipped with relevant focused context and retrieval-augmented generation (RAG), could excel in aggregating and interpreting vast quantities of data to develop a novel class-based risk stratification system that outperforms the current standards.
DESIGN, SETTING, AND PARTICIPANTS: To build the RAG knowledge base, a systematic review of the literature was conducted that addressed risk factors for poor outcomes in cSCC. Using the RAG-enabled generative pretrained transformer (GPT) model, we developed a novel class-based risk stratification system that assigned point values for risk factors, culminating in a GPT-based prognostication system called the artificial intelligence-derived risk score (AIRIS). The system's performance was validated on a combined prospective and retrospective cohort of 2379 primary cSCC tumors (1996-2023) with at least 36 months of follow-up, against Brigham and Women's Hospital (BWH) and American Joint Committee on Cancer Staging Manual, eighth edition (AJCC8) systems in stratifying risk for locoregional recurrence (LR), nodal metastasis (NM), distant metastasis (DM), and disease-specific death (DSD).
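A class-based, point-assigning system of the kind described above can be sketched in a few lines. The abstract does not list the AIRIS risk factors, their point values, or the class cutoffs, so every name and number below (`factor_a`, `HYPOTHETICAL_POINTS`, `HYPOTHETICAL_CLASSES`) is a placeholder chosen for illustration, not the published score.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical point values. The abstract does not report the actual
# AIRIS risk factors or their weights; these are placeholders only.
HYPOTHETICAL_POINTS: Dict[str, int] = {
    "factor_a": 1,
    "factor_b": 2,
    "factor_c": 3,
}

# Hypothetical class cutoffs as (minimum total points, class label),
# listed from the highest cutoff to the lowest.
HYPOTHETICAL_CLASSES: List[Tuple[int, str]] = [
    (5, "class 3"),
    (2, "class 2"),
    (0, "class 1"),
]


def total_points(present_factors: Iterable[str]) -> int:
    """Sum the points of the risk factors present in one tumor.

    Each factor is counted once even if it is listed twice.
    """
    return sum(HYPOTHETICAL_POINTS[f] for f in set(present_factors))


def risk_class(present_factors: Iterable[str]) -> str:
    """Map a tumor's total points to the first class whose cutoff it meets."""
    score = total_points(present_factors)
    for minimum, label in HYPOTHETICAL_CLASSES:
        if score >= minimum:
            return label
    raise ValueError("score is below every class cutoff")
```

With these placeholder values, a tumor with no listed factors falls in the lowest class, and one with both `factor_b` and `factor_c` (5 points) falls in the highest.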
MAIN OUTCOMES AND MEASURES: Performance metrics evaluated included distinctiveness, homogeneity, and monotonicity, as defined by the AJCC8, as well as sensitivity, specificity, positive predictive value, negative predictive value, accuracy, the area under the receiver operating characteristic curve, and concordance.
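For readers less familiar with these measures, the sketch below shows how the threshold-based metrics and the area under the receiver operating characteristic curve are conventionally computed for a binary outcome. It is a generic illustration, not the study's analysis code. It assumes a high-risk versus low-risk cut has already been chosen, and it ignores follow-up time and censoring, which a time-to-event concordance index would take into account. The function names are illustrative.

```python
from typing import Dict, Sequence


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Confusion-matrix metrics for 0/1 outcomes and 0/1 high-risk calls."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
        "accuracy": _ratio(tp + tn, len(pairs)),
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a tumor with the outcome has a higher
    score than a tumor without it, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one tumor with and one without the outcome")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For a binary outcome without censoring, this pairwise AUC coincides with the concordance statistic. An ordinal risk class can be passed directly as `scores`.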
RESULTS: The median age at diagnosis was 73 (IQR, 64-81) years, with 38.5% female patients and 61.5% male patients. The AIRIS prognostication system demonstrated superior sensitivity across all outcomes (LR, 49.1%; NM, 73.7%; DM, 82.5%; and DSD, 72.2%) and the highest area under the receiver operating characteristic curve values (LR, 0.69; NM, 0.81; DM, 0.85; and DSD, 0.80), indicating significantly enhanced discriminative capability compared with the BWH and AJCC8 systems. While all systems were comparably distinctive, the AIRIS prognostication system consistently demonstrated the lowest proportion of tumors exhibiting poor outcomes in low-risk categories, suggesting its improved homogeneity and monotonicity.
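The homogeneity and monotonicity comparison can be made concrete with a small sketch. It computes the poor-outcome rate in each risk class, checks that the rate does not fall as the class rises, and reports the share of all poor outcomes that occur in classes labeled low risk. This is one simple way to inspect these properties, assuming integer class labels and binary outcomes. It is not the statistical procedure used in the study, and the function names are illustrative.

```python
from collections import defaultdict
from typing import Dict, Iterable, Sequence


def event_rate_by_class(classes: Sequence[int], events: Sequence[int]) -> Dict[int, float]:
    """Proportion of tumors with a poor outcome (event == 1) in each class."""
    counts: Dict[int, int] = defaultdict(int)
    hits: Dict[int, int] = defaultdict(int)
    for c, e in zip(classes, events):
        counts[c] += 1
        hits[c] += e
    return {c: hits[c] / counts[c] for c in sorted(counts)}


def is_monotonic(rates: Dict[int, float]) -> bool:
    """True when the event rate never decreases from one class to the next."""
    ordered = [rates[c] for c in sorted(rates)]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))


def share_of_events_in_low_risk(
    classes: Sequence[int], events: Sequence[int], low_risk_classes: Iterable[int]
) -> float:
    """Fraction of all poor outcomes that fall in the low-risk classes."""
    low = set(low_risk_classes)
    total = sum(events)
    if total == 0:
        raise ValueError("no poor outcomes in the cohort")
    return sum(e for c, e in zip(classes, events) if c in low) / total
```

A lower value from `share_of_events_in_low_risk` means fewer poor outcomes are hidden in the categories the system calls low risk.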
CONCLUSIONS AND RELEVANCE: The results of this diagnostic study suggest that the AIRIS system outperforms the existing BWH and AJCC8 prognostication systems, potentially providing a more effective tool for predicting poor outcomes in cSCC. This study illustrates the potential of large language models in refining prognostic tools, offering implications for treating patients with cancer.