Bosch Dianne, Kuppen Malou C P, Tascilar Metin, Smilde Tineke J, Mulders Peter F A, Uyl-de Groot Carin A, van Oort Inge M
Department of Urology, Radboud University Medical Center, 6525 GA Nijmegen, The Netherlands.
Department of Radiotherapy, Maastro Clinic, 6229 ET Maastricht, The Netherlands.
Cancers (Basel). 2023 Jul 27;15(15):3808. doi: 10.3390/cancers15153808.
Manual data collection is still the gold standard for disease-specific patient registries. However, CAPRI-3 uses text mining (an artificial intelligence (AI) technology) for patient identification and data collection. The aim of this study is to demonstrate the reliability and efficiency of this AI-driven approach.
CAPRI-3 is an observational retrospective multicenter cohort registry on metastatic prostate cancer. We tested the patient-identification algorithm and automated data extraction through manual validation of the same patients in two pilots in 2019 and 2022.
Pilot one identified 2030 patients and pilot two 9464 patients. The negative predictive value of the algorithm was maximized to prevent false exclusions and reached 94.8%. The completeness and accuracy of the automated data extraction were 92.3% or higher, except for date fields and inaccessible data (images/PDF), for which they ranged from 10% to 88.9%. Additional manual quality control took 105 min per patient, over 3 h less than the 300 min per patient needed for the original fully manual CAPRI registry.
The CAPRI-3 patient-identification algorithm is a sound replacement for excluding ineligible candidates. The AI-driven data extraction is largely accurate and complete, but manual quality control is needed for less reliable and inaccessible data. Overall, the AI-driven approach of the CAPRI-3 registry is reliable and timesaving.
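The two headline figures in the results, the negative predictive value and the time saved per patient, follow from simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the true-negative and false-negative counts are invented to reproduce a value of 94.8%; they are not the study's actual validation counts, which the abstract does not report. Only the 105 and 300 min figures come from the abstract.

```python
def negative_predictive_value(true_negatives: int, false_negatives: int) -> float:
    """NPV = TN / (TN + FN): the share of algorithm-excluded patients
    who were truly ineligible according to manual validation."""
    total_excluded = true_negatives + false_negatives
    if total_excluded == 0:
        raise ValueError("NPV is undefined when no patients were excluded")
    return true_negatives / total_excluded


def minutes_saved(manual_minutes: float, assisted_minutes: float) -> float:
    """Per-patient time difference between fully manual collection
    and AI-driven extraction followed by manual quality control."""
    return manual_minutes - assisted_minutes


# Hypothetical counts (NOT from the study), chosen so that NPV = 94.8%.
npv = negative_predictive_value(true_negatives=948, false_negatives=52)

# Per-patient durations as reported in the abstract: 300 min vs. 105 min.
saved = minutes_saved(manual_minutes=300, assisted_minutes=105)

print(f"NPV: {npv:.1%}")
print(f"Time saved per patient: {saved:.0f} min ({saved / 60:.2f} h)")
```

A high NPV matters here because the algorithm is used to exclude candidates: every false negative is an eligible patient lost from the registry, whereas a false positive is caught later during manual quality control.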