Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st ST SW, Rochester, MN, 55905, USA.
Division of Rheumatology, Department of Medicine, Mayo Clinic, 200 1st ST SW, Rochester, MN, 55905, USA.
BMC Med Inform Decis Mak. 2019 Jan 7;19(1):1. doi: 10.1186/s12911-018-0723-6.
BACKGROUND: Automatic clinical text classification is a natural language processing (NLP) technology that unlocks information embedded in clinical narratives. Machine learning approaches have been shown to be effective for clinical text classification tasks. However, a successful machine learning model usually requires extensive human effort to create labeled training data and conduct feature engineering. In this study, we propose a clinical text classification paradigm that uses weak supervision and deep representation to reduce this human effort.
METHODS: We develop a rule-based NLP algorithm to automatically generate labels for the training data, and then use pre-trained word embeddings as deep representation features for training machine learning models. Because the machine learning models are trained on labels generated by the automatic NLP algorithm rather than by human annotators, this training process is called weak supervision. We evaluate the paradigm's effectiveness on two institutional case studies at Mayo Clinic, smoking status classification and proximal femur (hip) fracture classification, and on one case study using a public dataset, the i2b2 2006 smoking status classification shared task. We test four widely used machine learning models with this paradigm: Support Vector Machine (SVM), Random Forest (RF), Multilayer Perceptron Neural Networks (MLPNN), and Convolutional Neural Networks (CNN). Precision, recall, and F1 score are used as metrics to evaluate performance.
RESULTS: CNN achieves the best performance in both institutional tasks (F1 score: 0.92 for Mayo Clinic smoking status classification and 0.97 for fracture classification). We show that word embeddings significantly outperform tf-idf and topic modeling features in the paradigm, and that CNN captures additional patterns from the weak supervision compared to the rule-based NLP algorithms. We also observe two drawbacks of the proposed paradigm: CNN is more sensitive to the size of the training data, and the paradigm might not be effective for complex multiclass classification tasks.
CONCLUSION: By leveraging weak supervision and deep representation, the proposed clinical text classification paradigm could reduce the human effort of creating labeled training data and engineering features when applying machine learning to clinical text classification. Experiments on two institutional and one shared clinical text classification tasks validated the effectiveness of the paradigm.
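The weak supervision idea described in METHODS can be illustrated with a small sketch. This is not the authors' implementation: the cue lists, the two-dimensional embedding values, the example notes, and all function names are hypothetical, and a nearest-centroid classifier stands in for the SVM, RF, MLPNN, and CNN models named in the abstract. It only shows the shape of the pipeline: rules label the notes, averaged word embeddings become features, a model is trained on the rule-generated labels, and precision, recall, and F1 score the result.

```python
from collections import defaultdict

# Toy stand-in for pre-trained word embeddings (hypothetical values).
EMBEDDINGS = {
    "smokes": [1.0, 0.0], "smoker": [0.9, 0.1], "tobacco": [0.8, 0.2],
    "cigarettes": [0.9, 0.0], "smoking": [0.95, 0.05],
    "denies": [0.0, 1.0], "never": [0.1, 0.9], "nonsmoker": [0.0, 1.0],
}
DIM = 2

# Hypothetical rule cues; note that "smoking" is deliberately NOT covered.
NEGATION_CUES = {"denies", "never", "nonsmoker"}
SMOKING_CUES = {"smokes", "smoker", "tobacco", "cigarettes"}


def weak_label(note):
    """Rule-based labeler: return a noisy label, or None if no rule fires."""
    tokens = set(note.lower().split())
    if tokens & NEGATION_CUES:
        return "non-smoker"
    if tokens & SMOKING_CUES:
        return "smoker"
    return None


def embed(note):
    """Average the embeddings of known tokens (zero vector if none known)."""
    vectors = [EMBEDDINGS[t] for t in note.lower().split() if t in EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def train_centroids(notes):
    """Fit a nearest-centroid model using only rule-generated labels."""
    grouped = defaultdict(list)
    for note in notes:
        label = weak_label(note)
        if label is not None:  # notes the rules cannot label are skipped
            grouped[label].append(embed(note))
    return {
        label: [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]
        for label, vecs in grouped.items()
    }


def predict(centroids, note):
    """Assign the label whose centroid is closest in squared distance."""
    x = embed(note)
    return min(
        centroids,
        key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
    )


def precision_recall_f1(gold, pred, positive):
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


UNLABELED_NOTES = [
    "patient smokes cigarettes daily",
    "current smoker uses tobacco",
    "patient denies tobacco use",
    "never smoked lifelong nonsmoker",
    "follow up in two weeks",
]
centroids = train_centroids(UNLABELED_NOTES)
```

In this toy setup, a note such as "reports smoking" matches no rule, so `weak_label` returns `None`, yet `predict(centroids, "reports smoking")` can still assign a label because the embedding of "smoking" lies near the embeddings of the rule cues. That is a simplified analogue of the abstract's observation that a model trained on weak labels can capture patterns the rules miss; it says nothing about how large that effect is on real clinical text.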