Chew Melissa, Yu Catherine, Stojevski Leanne, Conilione Paul, Gust Anthony, Suleiman Mani, Swansson Will, Anderson Bennett, Garg Mayur, Lewis Diana
Department of Gastroenterology, Northern Health, Melbourne, Victoria, Australia.
Department of Client Data Management, Northern Health, Melbourne, Victoria, Australia.
J Gastroenterol Hepatol. 2025 May;40(5):1230-1237. doi: 10.1111/jgh.16933. Epub 2025 Mar 31.
Determining adenoma detection rate (ADR) and serrated polyp detection rate (SDR) can be challenging because it usually requires manual matching of colonoscopy and histology reports. This study aimed to validate a Natural Language Processing (NLP) code that enables rapid and efficient data extraction to calculate ADR and SDR.
An NLP code was developed to automatically extract colonoscopy quality indicators from colonoscopy and histology reports at a tertiary health service. These reports were also manually reviewed to verify the concordance of ADR and SDR between the two methods. This process was applied in the initial training phase, repeated following modification of the code, and applied again with a validation cohort.
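As a rough illustration of how keyword-based extraction of this kind can work, the sketch below flags adenomas and serrated polyps in free-text histology reports and computes per-procedure detection rates. It is not the study's code: the search patterns, the function names `classify` and `detection_rates`, and the sample reports are all assumptions made for illustration. The sketch also assumes that serrated phrases such as "sessile serrated adenoma" should count towards SDR only, which is one possible reading of "overlapping terminologies".

```python
import re

# Hypothetical search terms; the abstract does not list the study's actual terms.
# "serr?ated" tolerates the misspelling "serated".
SERRATED = re.compile(r"\bserr?ated\s+(?:adenoma|polyp|lesion)s?\b", re.I)
ADENOMA = re.compile(r"\badenomas?\b", re.I)


def classify(histology_text):
    """Return (has_adenoma, has_serrated) for one histology report."""
    has_serrated = bool(SERRATED.search(histology_text))
    # Assumption: strip serrated phrases first so that "serrated adenoma"
    # is not also counted as a conventional adenoma.
    remainder = SERRATED.sub(" ", histology_text)
    has_adenoma = bool(ADENOMA.search(remainder))
    return has_adenoma, has_serrated


def detection_rates(reports):
    """Return (ADR, SDR) as fractions of procedures, one report per procedure."""
    if not reports:
        raise ValueError("at least one report is required")
    flags = [classify(text) for text in reports]
    adr = sum(a for a, _ in flags) / len(flags)
    sdr = sum(s for _, s in flags) / len(flags)
    return adr, sdr


# Made-up example reports, not study data.
sample_reports = [
    "Tubular adenoma with low grade dysplasia.",
    "Sessile serated adenoma.",
    "Hyperplastic polyp.",
    "Tubulovillous adenoma and sessile serrated lesion.",
]
adr, sdr = detection_rates(sample_reports)
print(f"ADR = {adr:.0%}, SDR = {sdr:.0%}")
```

A plain keyword match like this does not handle negation ("no adenoma identified") or reports that fail to link to a colonoscopy, so manual review of a sample remains necessary.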
The training and test phases included 5911 colonoscopies, and the validation phase included 2022. The NLP code extracted patient names with 99.9% concordance and had 98.9% accuracy for ADR and SDR in the training phase. Search terms were subsequently modified to account for spelling variations and overlapping terminologies. Using data from the same cohort, accuracy of the NLP code improved to 100% in the test phase, excluding four colonoscopies that had missing histology reports. In the validation cohort, the NLP code had 99.9% accuracy for ADR and SDR. The total time taken for auditing using NLP in the validation phase was less than 1 h.
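The accuracy figures above compare NLP output against manual review. A minimal sketch of such a comparison, assuming one boolean flag per colonoscopy from each method, is shown below. The function name `concordance` and the example flags are hypothetical and do not come from the study.

```python
def concordance(nlp_flags, manual_flags):
    """Return (agreement fraction, indices of discordant cases)."""
    if len(nlp_flags) != len(manual_flags):
        raise ValueError("both methods must cover the same procedures")
    if not nlp_flags:
        raise ValueError("at least one procedure is required")
    # Collect the positions where the two methods disagree, for manual re-review.
    discordant = [
        i for i, (n, m) in enumerate(zip(nlp_flags, manual_flags)) if n != m
    ]
    agreement = 1 - len(discordant) / len(nlp_flags)
    return agreement, discordant


# Made-up adenoma flags for ten procedures, not study data.
nlp = [True, False, True, True, False, False, True, False, True, False]
manual = [True, False, True, False, False, False, True, False, True, False]
agreement, discordant = concordance(nlp, manual)
print(f"agreement = {agreement:.1%}, discordant cases: {discordant}")
```

Returning the discordant indices alongside the rate makes it straightforward to inspect the mismatched reports, which is where adjustments to search terms would be identified.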
An automated NLP code had an accuracy of almost 100% in determining ADR and SDR in a tertiary colonoscopy service. Wider adoption of NLP could make colonoscopy audits both accurate and time efficient.