Supervised machine learning and active learning in classification of radiology reports.

Affiliation

School of Information Technologies, University of Sydney, Sydney, New South Wales, Australia.

Publication information

J Am Med Inform Assoc. 2014 Sep-Oct;21(5):893-901. doi: 10.1136/amiajnl-2013-002516. Epub 2014 May 22.

DOI: 10.1136/amiajnl-2013-002516

PMID: 24853067

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4147614/

Abstract

OBJECTIVE

This paper presents an automated system for classifying the results of imaging examinations (CT, MRI, positron emission tomography) into reportable and non-reportable cancer cases. This system is part of an industrial-strength processing pipeline built to extract content from radiology reports for use in the Victorian Cancer Registry.

MATERIALS AND METHODS

In addition to traditional supervised learning methods such as conditional random fields and support vector machines, active learning (AL) approaches were investigated to optimize the production of training data and further improve classification performance. The project involved two pilot sites in Victoria, Australia (Lake Imaging (Ballarat) and Peter MacCallum Cancer Centre (Melbourne)) and, in collaboration with the NSW Central Registry, one pilot site at Westmead Hospital (Sydney).
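The abstract does not say which AL query strategy was used, so the following is only a minimal sketch of one common choice, pool-based uncertainty sampling. Everything in it is an assumption made for illustration: each report is reduced to a single hypothetical numeric feature, a small logistic regression stands in for the classifiers named above, and the `oracle` callback plays the role of the human annotator.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train(xs, ys, lr=0.1, epochs=200):
    """Fit a 1-D logistic regression (weight w, bias b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def active_learning(pool_x, oracle, seed_idx, budget):
    """Pool-based uncertainty sampling.

    Returns the indices queried (in order) and the dict of all labels obtained.
    """
    # Start from a small hand-labeled seed set.
    labeled = {i: oracle(i) for i in seed_idx}
    queried = []
    for _ in range(budget):
        unlabeled = [i for i in range(len(pool_x)) if i not in labeled]
        if not unlabeled:
            break
        # Retrain on everything labeled so far.
        idx = sorted(labeled)
        w, b = train([pool_x[i] for i in idx], [labeled[i] for i in idx])
        # Query the item whose predicted probability is closest to 0.5,
        # i.e. the one the current model is least sure about.
        pick = min(unlabeled, key=lambda i: abs(sigmoid(w * pool_x[i] + b) - 0.5))
        labeled[pick] = oracle(pick)
        queried.append(pick)
    return queried, labeled

# Toy pool: hypothetical 1-D features; label 1 = reportable, 0 = non-reportable.
pool_x = [-4.0, -3.0, -2.0, -1.0, 0.5, 1.5, 2.5, 4.0]
true_y = [0, 0, 0, 0, 1, 1, 1, 1]
queried, labeled = active_learning(
    pool_x, lambda i: true_y[i], seed_idx=[0, 7], budget=3
)
print("queried indices:", queried)
```

The point of the loop is that the annotator only labels the items the model asks for, rather than the whole pool, which is where the savings in manual classification effort would come from.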

RESULTS

The reportability classifier achieved 98.25% sensitivity and 96.14% specificity on the cancer registry's held-out test set. AL can save up to 92% of the training data that supervised machine learning would otherwise need.
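For readers less familiar with the two metrics, here is a short sketch of how sensitivity and specificity are computed from binary predictions. The labels below are made-up toy values, not the registry's data, and the function name is chosen only for this example.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, with 1 = reportable."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # reportable, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # reportable, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-reportable, passed over
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-reportable, flagged
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # share of reportable cases that were caught
    specificity = tn / (tn + fp)  # share of non-reportable cases correctly excluded
    return sensitivity, specificity

# Toy example: 4 reportable and 6 non-reportable reports.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

In a registry setting, sensitivity is usually the figure to watch most closely, since a missed reportable case is lost to follow-up while a false positive only costs some reviewer time.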

DISCUSSION

AL is a promising method for optimizing the production of the supervised training data used in classifying radiology reports. When an AL strategy is applied during the data selection process, the cost of manual classification can be reduced significantly.

CONCLUSIONS

The most important practical application of the reportability classifier is that it can dramatically reduce human effort in identifying relevant reports from the large imaging pool for further investigation of cancer. The classifier is built on a large real-world dataset and can achieve high performance in filtering relevant reports to support cancer registries.
