Anomaly-based threat detection in smart health using machine learning.

Author information

Department of Computer Science, Bahria University, Islamabad, Pakistan.

Department of Information Systems and Technology, College of Computer Science and Engineering, University of Jeddah, Jeddah, Saudi Arabia.

Publication information

BMC Med Inform Decis Mak. 2024 Nov 19;24(1):347. doi: 10.1186/s12911-024-02760-4.

Abstract

BACKGROUND

Anomaly detection is crucial in healthcare data because of the challenges that arise when smart technologies are integrated with healthcare. An anomaly in an electronic health record can indicate an insider trying to access and manipulate the data. This article focuses on anomalies under different contexts.

METHODOLOGY

This research proposes a methodology to secure Electronic Health Records (EHRs) within a complex environment. We employed a systematic approach encompassing data preprocessing, labeling, modeling, and evaluation. Anomalies are not labelled, so a mechanism is required that predicts them with greater accuracy and fewer false positives. This research used unsupervised machine learning algorithms, including the Isolation Forest and Local Outlier Factor clustering algorithms. By calculating anomaly scores and validating clustering through metrics such as the Silhouette Score and Dunn Score, we enhanced the capacity to secure sensitive healthcare data against evolving digital threats. Three variations of Isolation Forest (IForest) models (SVM, Decision Tree, and Random Forest) and three variations of Local Outlier Factor (LOF) models (SVM, Decision Tree, and Random Forest) are evaluated on accuracy, sensitivity, specificity, and F1 Score.
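The abstract does not spell out how the unsupervised detectors and the classifiers are combined. One plausible reading, sketched below purely as an assumption, is that Isolation Forest (or LOF) assigns pseudo-labels to unlabelled records and a supervised model such as an SVM is then trained on those pseudo-labels. The synthetic data, the feature count, the contamination rate, and all parameter values are hypothetical stand-ins, not values from the paper. The sketch uses scikit-learn.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical stand-in for preprocessed EHR access features (not real data).
typical = rng.normal(loc=0.0, scale=1.0, size=(500, 2))
unusual = rng.normal(loc=8.0, scale=0.5, size=(25, 2))
X = np.vstack([typical, unusual])

# Step 1: unsupervised scoring. Both estimators label outliers -1 and inliers 1.
# contamination=0.05 is an assumed share of anomalies, not a figure from the paper.
iforest = IsolationForest(n_estimators=100, contamination=0.05, random_state=0)
if_labels = iforest.fit_predict(X)
if_scores = iforest.score_samples(X)  # lower score = more anomalous

lof = LocalOutlierFactor(n_neighbors=20, contamination=0.05)
lof_labels = lof.fit_predict(X)
lof_scores = lof.negative_outlier_factor_  # lower score = more anomalous

# Step 2: check how well the inlier/outlier partition separates the data.
sil = silhouette_score(X, if_labels)

# Step 3: treat the Isolation Forest output as pseudo-labels (1 = anomaly)
# and train an SVM on them. This pairing is an assumed interpretation.
y = (if_labels == -1).astype(int)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)
clf = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)

# Agreement with the pseudo-labels on held-out rows, not ground-truth accuracy.
pseudo_accuracy = clf.score(X_test, y_test)
print(f"silhouette={sil:.3f} pseudo-label accuracy={pseudo_accuracy:.3f}")
```

Swapping `if_labels` for `lof_labels` in step 3, or `SVC` for `DecisionTreeClassifier` or `RandomForestClassifier`, would give the other five pairings under the same assumption. Any score obtained this way measures agreement with the pseudo-labels only.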

RESULTS

Isolation Forest SVM achieves the highest accuracy (99.21%), along with high sensitivity (99.75%), high specificity (99.32%), and an F1 Score of 98.72%. The Isolation Forest Decision Tree also performs well, with an accuracy of 98.92% and an F1 Score of 99.35%. However, the Isolation Forest Random Forest exhibits lower specificity (72.84%) than the other models.
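For reference, the four reported metrics follow from a binary confusion matrix in the standard way. The sketch below uses hypothetical counts chosen only to make the arithmetic easy to follow; they are not the paper's confusion matrix, which the abstract does not give.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity, specificity, and F1 from confusion counts.

    tp/fn count true anomalies that were caught/missed;
    tn/fp count normal records that were passed/wrongly flagged.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    sensitivity = ratio(tp, tp + fn)   # recall / true positive rate
    specificity = ratio(tn, tn + fp)   # true negative rate
    precision = ratio(tp, tp + fp)
    f1 = ratio(2 * precision * sensitivity, precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Low specificity, as reported for the Isolation Forest Random Forest variant, corresponds to a large `fp` relative to `tn`, meaning many normal records are flagged as anomalous.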

CONCLUSION

The experimental results show that Isolation Forest SVM is the top performer, demonstrating the effectiveness of these models in anomaly detection tasks. The proposed methodology, which combines Isolation Forest and SVM, produced better results by detecting anomalies with fewer false positives on this specific EHR dataset from a hospital in the north of England. Furthermore, the proposal is also able to identify new contextual anomalies that were not identified by the baseline methodology.
