LASSO Regression Modeling on Prediction of Medical Terms among Seafarers' Health Documents Using Tidy Text Mining.

Authors

Chintalapudi Nalini, Angeloni Ulrico, Battineni Gopi, di Canio Marzio, Marotta Claudia, Rezza Giovanni, Sagaro Getu Gamo, Silenzi Andrea, Amenta Francesco

Affiliations

Clinical Research Centre, School of Medicinal and Health Products Sciences, University of Camerino, 62032 Camerino, Italy.

General Directorate of Health Prevention, Ministry of Health, 00144 Rome, Italy.

Publication information

Bioengineering (Basel). 2022 Mar 17;9(3):124. doi: 10.3390/bioengineering9030124.

Abstract

Seafarers generally face a higher risk of illness and accidents than land workers. In most cases, there are no medical professionals on board seagoing vessels, which makes disease diagnosis even more difficult. In such cases, onshore doctors may be able to provide medical advice through telemedicine, provided they receive better symptomatic and clinical details in seafarers' health abstracts. Text mining techniques can assist in extracting diagnostic information from clinical texts. Because experimental evaluations using computational techniques are lacking, we applied lexicon-based sentiment analysis to explore the automatic labeling of positive and negative healthcare terms in seafarers' text healthcare documents. To classify diseases and their associated symptoms, the LASSO regression algorithm was applied to these text documents. Analyzing TF-IDF values makes it possible to visualize the frequency of symptomatic data for each disease. The proposed approach classified text documents with 93.8% accuracy using a LASSO regression machine learning model, which shows that text documents can be classified effectively with tidy text mining libraries. In addition to delivering health assistance, this method can be used to classify diseases and establish health observatories. Knowledge developed in the present work will be applied to establish an Epidemiological Observatory of Seafarers' Pathologies and Injuries. This Observatory will be a collaborative initiative of the Italian Ministry of Health, the University of Camerino, and the International Radio Medical Centre (C.I.R.M.), the Italian TMAS.
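
The TF-IDF step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the abstract refers to tidy text mining libraries, whereas the example below uses only the Python standard library, and the three toy documents, the category names, and the `tf_idf` helper are hypothetical. It assumes a common TF-IDF definition, with term frequency as count divided by document length and inverse document frequency as the natural log of (number of documents / number of documents containing the term).

```python
import math
from collections import Counter

# Hypothetical toy documents, one per disease category (not data from the study).
docs = {
    "gastrointestinal": "abdominal pain nausea vomiting pain",
    "respiratory": "cough fever chest pain",
    "dermatological": "rash itching fever",
}

def tf_idf(docs):
    """Return {doc: {term: score}} with tf = count / doc length, idf = ln(N / df)."""
    counts = {name: Counter(text.split()) for name, text in docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for c in counts.values() for term in c)
    scores = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores[name] = {
            term: (n / total) * math.log(n_docs / df[term])
            for term, n in c.items()
        }
    return scores

scores = tf_idf(docs)

# List each category's terms from most to least distinctive.
for name, term_scores in scores.items():
    ranked = sorted(term_scores.items(), key=lambda kv: kv[1], reverse=True)
    print(name, [(term, round(score, 3)) for term, score in ranked])
```

In this toy data, "pain" appears in two of the three documents, so it scores lower than a term such as "abdominal" that is unique to one category, even though "pain" occurs more often. That weighting is what makes TF-IDF useful for picking out the symptom terms most characteristic of each disease.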

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9b2b/8945331/5deae5454a13/bioengineering-09-00124-g001.jpg
