Peter MacCallum Cancer Centre, Department of Radiation Oncology, Melbourne, Australia; University of Melbourne, Sir Peter MacCallum Department of Oncology, Melbourne, Australia; Austin Health, Department of Radiation Oncology, Melbourne, Australia.
The Australian e-Health Research Centre, CSIRO, Brisbane, Australia.
Int J Med Inform. 2019 Jan;121:53-57. doi: 10.1016/j.ijmedinf.2018.10.008. Epub 2018 Oct 23.
To implement a system for unsupervised extraction of tumor stage and prognostic data from clinicopathological and radiology text in patients with genitourinary cancers.
A corpus of 1054 electronic notes (clinician notes, radiology reports and pathology reports) was annotated for tumor stage, prostate-specific antigen (PSA) and Gleason grade. Annotations from five clinicians were reconciled to form a gold standard dataset. A training dataset of 386 documents was sequestered, and the Medtex algorithm was adapted using this training dataset.
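To make the annotation targets concrete, the following is a minimal sketch of pulling a PSA value and a Gleason pattern out of free text with regular expressions. It is not the Medtex algorithm, whose rules are not described here; the patterns, the function name extract_prognostic_fields and the sample note are all hypothetical and would miss many real-world phrasings.

```python
import re

# Hypothetical patterns for illustration only; these are NOT Medtex's rules.
# PSA: the token "PSA", up to 20 non-digit characters, then a number.
PSA_RE = re.compile(r"\bPSA\b[^\d\n]{0,20}?(\d+(?:\.\d+)?)", re.IGNORECASE)
# Gleason: the token "Gleason", up to 20 non-digit characters, then "a + b".
GLEASON_RE = re.compile(r"\bGleason\b[^\d\n]{0,20}?(\d)\s*\+\s*(\d)", re.IGNORECASE)


def extract_prognostic_fields(note: str) -> dict:
    """Return the first PSA value and Gleason pattern found in a note, if any."""
    psa = PSA_RE.search(note)
    gleason = GLEASON_RE.search(note)
    return {
        "psa": float(psa.group(1)) if psa else None,
        "gleason": (int(gleason.group(1)), int(gleason.group(2))) if gleason else None,
    }


# Invented sample text, not taken from the study corpus.
sample_note = "PSA of 7.4 ng/mL. Biopsy: Gleason score 3 + 4 = 7."
fields = extract_prognostic_fields(sample_note)
```

A production system would also need to handle negation, units, dates and multiple values per note, which simple patterns like these do not attempt.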
Adapted Medtex equaled or exceeded human performance in most annotations, except for implicit M stage (F-measure of 0.69 vs 0.84) and PSA (0.92 vs 0.96). Overall, Medtex achieved an F-measure of 0.86, compared with 0.92 for human annotators. There was significant inter-observer variability when comparing human annotators to the gold standard.
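For readers unfamiliar with the metric, the F-measure reported above is conventionally the harmonic mean of precision and recall against the gold standard. The sketch below assumes the balanced F1 form and represents each annotation as a (document, field, value) tuple; that representation and the example values are assumptions for illustration, not the study's data or scoring code.

```python
def f_measure(predicted: set, gold: set) -> float:
    """Balanced F-measure (F1) of predicted annotations against a gold standard."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold annotations recovered
    return 2 * precision * recall / (precision + recall)


# Invented annotations: (document id, field, value).
gold = {
    ("doc1", "T", "T2"),
    ("doc1", "PSA", "7.4"),
    ("doc2", "M", "M0"),
    ("doc2", "Gleason", "3+4"),
}
predicted = {
    ("doc1", "T", "T2"),
    ("doc1", "PSA", "7.4"),
    ("doc2", "M", "M1"),  # wrong value, so it counts as a false positive
}

score = f_measure(predicted, gold)  # precision 2/3, recall 2/4
```

The same function can score a human annotator against the gold standard, which is how a system-versus-human comparison such as 0.86 vs 0.92 could be framed.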
The Medtex algorithm performed similarly to human annotators for extracting stage and prognostic data from varied clinical texts.