Optimizing classification of diseases through language model analysis of symptoms.

Affiliations

Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt.

Department of Computer Science, Faculty of Science, Minia University, Minia, 61519, Egypt.

Publication Information

Sci Rep. 2024 Jan 17;14(1):1507. doi: 10.1038/s41598-024-51615-5.

Abstract

This paper investigated the use of language models and deep learning techniques for automating disease prediction from symptoms. Specifically, we explored two Medical Concept Normalization-Bidirectional Encoder Representations from Transformers (MCN-BERT) models and a Bidirectional Long Short-Term Memory (BiLSTM) model, each optimized with a different hyperparameter optimization method, to predict diseases from symptom descriptions. We utilized two distinct datasets, Dataset-1 and Dataset-2. Dataset-1 consists of 1,200 data points, each representing a unique combination of disease label and symptom description. Dataset-2, in contrast, is designed to identify Adverse Drug Reactions (ADRs) from Twitter data and comprises 23,516 rows of tweets categorized as ADR (1) or non-ADR (0). The results indicate that the MCN-BERT model optimized with AdamP achieved 99.58% accuracy on Dataset-1 and 96.15% on Dataset-2. The MCN-BERT model optimized with AdamW performed well, with 98.33% accuracy on Dataset-1 and 95.15% on Dataset-2, while the BiLSTM model optimized with Hyperopt achieved 97.08% accuracy on Dataset-1 and 94.15% on Dataset-2. Our findings suggest that language models and deep learning techniques hold promise for supporting earlier detection and more prompt treatment of diseases, as well as for expanding remote diagnostic capabilities. The MCN-BERT and BiLSTM models demonstrated robust performance in accurately predicting diseases from symptoms, indicating potential for further related research.

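To make the modeling setup concrete, the sketch below illustrates the kind of pipeline the abstract describes for the transformer models: fine-tuning a BERT-style sequence classifier on symptom text. It is a minimal sketch, not the authors' code; `bert-base-uncased` stands in for the MCN-BERT checkpoint, and the label count, sample data, and learning rate are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): fine-tuning a BERT-style
# classifier on symptom descriptions. "bert-base-uncased" stands in for the
# MCN-BERT checkpoint; label count, sample data, and learning rate are assumed.
import torch
from torch.optim import AdamW
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=41  # illustrative: one label per disease
)

texts = ["fever, dry cough, and shortness of breath"]  # symptom description(s)
labels = torch.tensor([3])                             # matching disease id(s)

batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
# AdamP (from the third-party `adamp` package) can be swapped in here as a
# drop-in replacement for AdamW, mirroring the two optimizer settings compared
# in the abstract.
optimizer = AdamW(model.parameters(), lr=2e-5)

model.train()
outputs = model(**batch, labels=labels)  # forward pass; returns cross-entropy loss
outputs.loss.backward()                  # backpropagate
optimizer.step()                         # one optimization step
optimizer.zero_grad()
```

The BiLSTM baseline in the abstract is tuned with Hyperopt. The sketch below shows how a Hyperopt TPE search is typically wired up; the search space and the placeholder objective are assumptions, and a real objective would train the BiLSTM with the sampled hyperparameters and return its validation loss.

```python
# Minimal sketch of a Hyperopt TPE search such as might be used to tune a
# BiLSTM text classifier. Search space and dummy objective are illustrative.
from hyperopt import fmin, tpe, hp, Trials

space = {
    "hidden_size": hp.choice("hidden_size", [64, 128, 256]),
    "dropout": hp.uniform("dropout", 0.1, 0.5),
    "lr": hp.loguniform("lr", -9, -4),  # log-uniform over roughly 1e-4 to 2e-2
}

def objective(params):
    # Placeholder score so the snippet runs end to end; a real objective would
    # train the BiLSTM with these params and return its validation loss.
    return params["dropout"] + abs(params["lr"] - 1e-3)

best = fmin(fn=objective, space=space, algo=tpe.suggest,
            max_evals=20, trials=Trials())
print(best)
```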

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/66a29e3290cf/41598_2024_51615_Fig1_HTML.jpg
