Faculty of Medicine and Health, Sydney Medical School, The University of Sydney, Sydney, NSW, Australia; Sydney Melanoma Diagnostic Centre, Royal Prince Alfred Hospital, Camperdown, NSW, Australia.
Melanoma Institute Australia, The University of Sydney, Sydney, NSW, Australia; Department of Dermatology, Medical University of Vienna, Vienna, Austria.
Lancet Digit Health. 2023 Oct;5(10):e679-e691. doi: 10.1016/S2589-7500(23)00130-9.
Diagnosis of skin cancer requires medical expertise, which is scarce. Mobile phone-powered artificial intelligence (AI) could aid diagnosis, but it is unclear how this technology performs in a clinical scenario. Our primary aim was to test in the clinic whether there was equivalence between AI algorithms and clinicians for the diagnosis and management of pigmented skin lesions.
In this multicentre, prospective, diagnostic, clinical trial, we included specialist and novice clinicians and patients from two tertiary referral centres in Australia and Austria. Specialists had a specialist medical qualification related to diagnosing and managing pigmented skin lesions, whereas novices were dermatology junior doctors or registrars in trainee positions who had experience in examining and managing these lesions. Eligible patients were aged 18-99 years and had a modified Fitzpatrick I-III skin type; those in the diagnostic trial were undergoing routine excision or biopsy of one or more suspicious pigmented skin lesions bigger than 3 mm in the longest diameter, and those in the management trial had baseline total-body photographs taken within 1-4 years. We used two mobile phone-powered AI instruments incorporating a simple optical attachment: a new 7-class AI algorithm and the International Skin Imaging Collaboration (ISIC) AI algorithm, which was previously tested in a large online reader study. The reference standard for excised lesions in the diagnostic trial was histopathological examination; in the management trial, the reference standard was a descending hierarchy based on histopathological examination, comparison of baseline total-body photographs, digital monitoring, and telediagnosis. The main outcome of this study was to compare the accuracy of expert and novice diagnostic and management decisions with the two AI instruments. Possible decisions in the management trial were dismissal, biopsy, or 3-month monitoring. Decisions to monitor were considered equivalent to dismissal (scenario A) or biopsy of malignant lesions (scenario B). The trial was registered at the Australian New Zealand Clinical Trials Registry ACTRN12620000695909 (Universal trial number U1111-1251-8995).
The diagnostic study included 172 suspicious pigmented lesions (84 malignant) from 124 patients and the management study included 5696 pigmented lesions (18 malignant) from the whole body of 66 high-risk patients. The diagnoses of the 7-class AI algorithm were equivalent to the specialists' diagnoses (absolute accuracy difference 1·2% [95% CI -6·9 to 9·2]) and significantly superior to the novices' diagnoses (21·5% [13·1 to 30·0]). The diagnoses of the ISIC AI algorithm were significantly inferior to the specialists' diagnoses (-11·6% [-20·3 to -3·0]) but significantly superior to the novices' diagnoses (8·7% [-0·5 to 18·0]). The best 7-class management AI was significantly inferior to specialists' management (absolute accuracy difference in correct management decision -0·5% [95% CI -0·7 to -0·2] in scenario A and -0·4% [-0·8 to -0·05] in scenario B). Compared with the novices' management, the 7-class management AI was significantly inferior (-0·4% [-0·6 to -0·2]) in scenario A but significantly superior (0·4% [0·0 to 0·9]) in scenario B.
The mobile phone-powered AI technology is simple, practical, and accurate for the diagnosis of suspicious pigmented skin cancer in patients presenting to a specialist setting, although its use for management decisions requires more careful execution. An AI algorithm that was superior in experimental studies was significantly inferior to specialists in a real-world scenario, suggesting that caution is needed when extrapolating the results of experimental studies to clinical practice.
MetaOptima Technology.