Cross-hospital portability of information extraction of cancer staging information.

Affiliations

Department of Computing and Information Systems, The University of Melbourne, Doug McDonell Building, Parkville, 3010 VIC, Australia.

Barwon Health, Geelong Hospital, 1/75 Bellerine Street, Geelong, 3220 VIC, Australia.

Publication information

Artif Intell Med. 2014 Sep;62(1):11-21. doi: 10.1016/j.artmed.2014.06.002. Epub 2014 Jun 21.

Abstract

OBJECTIVE

We address the task of extracting information from free-text pathology reports, focusing on staging information encoded by the TNM (tumour-node-metastases) and ACPS (Australian clinico-pathological stage) systems. Staging information is critical for diagnosing the extent of cancer in a patient and for planning individualised treatment. Extracting such information into more structured form saves time, improves reporting, and underpins the potential for automated decision support.
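To make the target of the extraction concrete, the sketch below pulls TNM-style codes (for example "pT3 N1 MX") out of a line of report text with a regular expression. It is only an illustration of the notation, not the text mining model described in this abstract. The pattern, the function name `extract_tnm`, and the sample strings are all assumptions made for this example, and the pattern only handles codes that are separated by spaces or joined directly after a digit.

```python
import re

# Hypothetical pattern for illustration only (not the authors' method):
# optional lowercase prefix (c, p, y, r), a category letter T/N/M,
# then a value such as 0-4 with an optional a-d suffix, X, or "is".
TNM_PATTERN = re.compile(
    r"(?<![A-Za-z])([cpyr]{0,2})([TNM])([0-4][a-d]?|[Xx]|is)(?![a-z0-9])"
)


def extract_tnm(text):
    """Return the first value found for each of T, N and M in the text."""
    found = {}
    for match in TNM_PATTERN.finditer(text):
        _prefix, category, value = match.groups()
        # Keep only the first occurrence of each category.
        found.setdefault(category, value)
    return found


if __name__ == "__main__":
    # Invented sample sentence, not taken from any real report.
    print(extract_tnm("Summary: pT3 N1 MX."))
```

A rule like this is brittle across reporting styles, which is one reason a learned model and a portability study are of interest.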

METHODS AND MATERIAL

We investigate the portability of a text mining model constructed from records from one health centre, by applying it directly to the extraction task over a set of records from a different health centre, with different reporting narrative characteristics. Other than a simple normalisation step on features associated with target labels, we apply the models from one system directly to the other.

RESULTS

The best F-scores for in-hospital experiments are 81%, 85%, and 94% (for staging T, N, and M respectively), while best cross-hospital F-scores reach 84%, 81%, and 91% for the same respective categories.
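As a reminder of how such a figure is obtained, the sketch below computes an F-score from counts of true positives, false positives and false negatives. It assumes the balanced F1 variant (the harmonic mean of precision and recall); the abstract does not state which variant was used, and the counts in the example are hypothetical, not taken from the study.

```python
def f_score(tp, fp, fn):
    """Balanced F1 from true positive, false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical counts for one staging category.
    print(round(f_score(tp=8, fp=2, fn=2), 2))
```

In a cross-hospital setting the same function would be applied to predictions made on the second centre's records by a model trained only on the first centre's records.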

CONCLUSIONS

Our performance results compare favourably to the best levels reported in the literature, and, most relevant to our aim here, the cross-corpus results demonstrate the portability of the models we developed.
