Rochon Melissa, Tanner Judith, Jurkiewicz James, Beckhelling Jacqueline, Aondoakaa Akuha, Wilson Keith, Dhoonmoon Luxmi, Underwood Max, Mason Lara, Harris Roy, Cariaga Karen
Guy's and St Thomas' NHS Foundation Trust, London, United Kingdom.
University of Nottingham, Nottingham, United Kingdom.
PLoS One. 2024 Dec 9;19(12):e0315384. doi: 10.1371/journal.pone.0315384. eCollection 2024.
Surgical patients frequently experience post-operative complications at home. Digital remote monitoring of surgical wounds via image-based systems has emerged as a promising solution for early detection and intervention. However, the increased clinician workload from reviewing patient-submitted images presents a challenge. This study utilises artificial intelligence (AI) to prioritise surgical wound images for clinician review, aiming to efficiently manage workload.
The study was conducted from September 2023 to March 2024. Its phases comprised compiling a training dataset of 37,974 images, creating a testing set of 3,634 images, developing an AI algorithm using 'You Only Look Once' (YOLO) models, and conducting prospective tests in which the AI's output was compared against clinical nurse specialists' evaluations. The primary objective was to validate the AI's sensitivity in prioritising wound reviews, alongside assessing intra-rater reliability. Secondary objectives focused on specificity, positive predictive value (PPV), and negative predictive value (NPV) for various wound features.
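The four measures named above follow from a standard 2x2 comparison of AI output against the reference assessment. The sketch below is illustrative only: the class and function names are hypothetical, and the counts are invented to show the arithmetic, not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing AI flags with the reference (nurse) assessment."""
    tp: int  # AI flagged, reference agrees the feature is present
    fp: int  # AI flagged, reference says absent
    tn: int  # AI did not flag, reference says absent
    fn: int  # AI did not flag, reference says present


def diagnostic_metrics(c: ConfusionCounts) -> dict:
    """Return sensitivity, specificity, PPV and NPV as proportions."""
    def ratio(num: int, den: int) -> float:
        # Guard against empty denominators (e.g. no positive cases at all).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(c.tp, c.tp + c.fn),
        "specificity": ratio(c.tn, c.tn + c.fp),
        "ppv": ratio(c.tp, c.tp + c.fp),
        "npv": ratio(c.tn, c.tn + c.fn),
    }


# Invented counts, for illustration only (not the study's data).
example = ConfusionCounts(tp=89, fp=30, tn=270, fn=11)
metrics = diagnostic_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity depend only on how the AI behaves within the positive and negative cases respectively, whereas PPV and NPV also depend on how common the feature is in the tested images, which is one reason they can differ between subgroups.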
The AI demonstrated a sensitivity of 89%, exceeding the target of 85% and proving effective in identifying cases requiring priority review. Intra-rater reliability was perfect, with 100% consistency in repeated assessments. Detection of wound characteristics varied across skin tones: sensitivity was notably lower for incisional separation and discolouration in darker skin tones. Specificity remained high overall, with some results favouring darker skin tones. The NPV was broadly similar for light and dark skin tones, although slightly higher for dark skin tones at 95% (95% CI: 93%-97%) compared with 91% (95% CI: 87%-92%) for light skin tones. Both PPV and NPV varied across wound features, especially in identifying sutures or staples, indicating areas needing further refinement to ensure equitable accuracy.
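The abstract reports 95% confidence intervals for the predictive values but does not say how they were calculated. As one common option, a Wilson score interval for a proportion can be sketched as follows; the function name and the example counts are assumptions for illustration, not the study's method or data.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Wilson score interval for a binomial proportion.

    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")

    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half_width, centre + half_width


# Invented counts: 95 true negatives among 100 negative AI calls.
low, high = wilson_interval(95, 100)
print(f"NPV 95% CI: {low:.3f} to {high:.3f}")
```

The interval narrows as the number of images in a subgroup grows, so apparent differences between skin-tone groups are best read alongside the width of their intervals.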
The AI algorithm not only met but surpassed the expected sensitivity for identifying priority cases, showing high reliability. Nonetheless, the disparities in performance across skin tones, especially in recognising certain wound characteristics like discolouration or incisional separation, underline the need for ongoing training and adaptation of the AI to ensure fairness and effectiveness across diverse patient groups.