Victoria University of Wellington, Wellington, New Zealand.
Medius Health, Sydney, Australia.
Artif Intell Med. 2021 Oct;120:102167. doi: 10.1016/j.artmed.2021.102167. Epub 2021 Sep 10.
Biomedical natural language processing (NLP) plays an important role in extracting consequential information from medical discharge notes. Detecting meaningful features in unstructured notes is a challenging task in medical document classification. Domain-specific phrases and varying synonyms within medical documents make them hard to analyze. The analysis is even harder for short documents such as abstracts. All of these factors can lead to poor classification performance, especially when real-world clinical data are scarce. Two new data augmentation approaches (an ontology-guided approach and a combined ontology-based and dictionary-based approach) are proposed to enrich the training data. Three different deep learning approaches are used to evaluate the classification performance of the proposed methods. The results show that the proposed methods improve classification accuracy on clinical notes.
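The abstract does not describe how the augmentation is implemented, so the following is only a minimal sketch of the general idea of synonym-based text augmentation, not the authors' method. It assumes a tiny hand-written synonym table (`SYNONYMS`, hypothetical) in place of a real ontology or medical dictionary lookup, and the function names `augment` and `augment_dataset` are illustrative.

```python
import random
import re

# Hypothetical toy synonym table; a real system would query an ontology
# or a medical dictionary instead of a hard-coded mapping.
SYNONYMS = {
    "heart attack": ["myocardial infarction"],
    "high blood pressure": ["hypertension"],
}


def augment(text, synonyms, rng):
    """Return a copy of text with each known phrase replaced by a synonym."""
    out = text
    # Process longer phrases first so a short phrase cannot split a longer one.
    for phrase in sorted(synonyms, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        # A function replacement avoids backslash handling in the new text.
        out = re.sub(pattern, lambda _m, p=phrase: rng.choice(synonyms[p]), out)
    return out


def augment_dataset(samples, synonyms, seed=0):
    """Append one augmented copy of every (text, label) pair that changes."""
    rng = random.Random(seed)
    augmented = list(samples)
    for text, label in samples:
        new_text = augment(text, synonyms, rng)
        if new_text != text:
            # The label is kept, since a synonym swap should preserve meaning.
            augmented.append((new_text, label))
    return augmented
```

In this sketch the original samples are kept and paraphrased copies are appended, so the training set grows only for notes that contain a phrase found in the table.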