Optimizing classification of diseases through language model analysis of symptoms.

Affiliations

Faculty of Artificial Intelligence, Kafrelsheikh University, Kafrelsheikh, 33516, Egypt.

Department of Computer Science, Faculty of Science, Minia University, Minia, 61519, Egypt.

Publication information

Sci Rep. 2024 Jan 17;14(1):1507. doi: 10.1038/s41598-024-51615-5.

DOI:10.1038/s41598-024-51615-5
PMID:38233458
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10794698/
Abstract

This paper investigated the use of language models and deep learning techniques for automating disease prediction from symptoms. Specifically, we explored the use of two Medical Concept Normalization-Bidirectional Encoder Representations from Transformers (MCN-BERT) models and a Bidirectional Long Short-Term Memory (BiLSTM) model, each optimized with a different hyperparameter optimization method, to predict diseases from symptom descriptions. We used two distinct datasets, Dataset-1 and Dataset-2. Dataset-1 consists of 1,200 data points, each representing a unique combination of disease label and symptom description. Dataset-2 is designed to identify Adverse Drug Reactions (ADRs) from Twitter data and comprises 23,516 rows, each a tweet labeled ADR (1) or Non-ADR (0). The MCN-BERT model optimized with AdamP achieved 99.58% accuracy on Dataset-1 and 96.15% on Dataset-2. The MCN-BERT model optimized with AdamW achieved 98.33% accuracy on Dataset-1 and 95.15% on Dataset-2, while the BiLSTM model optimized with Hyperopt achieved 97.08% on Dataset-1 and 94.15% on Dataset-2. Our findings suggest that language models and deep learning techniques show promise for supporting earlier detection and more prompt treatment of diseases, as well as for expanding remote diagnostic capabilities. The MCN-BERT and BiLSTM models predicted diseases from symptoms accurately on both datasets, indicating potential for further related research.
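
The accuracy figures reported in the abstract are the fraction of predictions that match the reference labels. As a minimal sketch of that metric only (not the authors' evaluation code), the Python below computes it for a binary ADR (1) / Non-ADR (0) task. The function name `accuracy` and the toy label lists are hypothetical and are not drawn from Dataset-1 or Dataset-2.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that equal the reference labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels for illustration: 1 = ADR tweet, 0 = Non-ADR tweet.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1]

# Two of the eight predictions disagree with the reference labels.
print(f"{accuracy(y_true, y_pred):.2%}")
```

Accuracy alone can look strong on a dataset where one class dominates, so it is usually read alongside per-class metrics when the class balance is uneven. The abstract does not state the class balance of Dataset-2.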


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/ddbc60259543/41598_2024_51615_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/66a29e3290cf/41598_2024_51615_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/d8e3ca2fbbf1/41598_2024_51615_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/68fd9734e9a0/41598_2024_51615_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/fcb64cec7d3c/41598_2024_51615_Figa_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/4ce96d007fff/41598_2024_51615_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/d18e6e76beb8/41598_2024_51615_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/528231bcaf43/41598_2024_51615_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/0f3ff8d1f506/41598_2024_51615_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/277b2a065472/41598_2024_51615_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/b30b8a8cb11f/41598_2024_51615_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/f95e9e5ccbc1/41598_2024_51615_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/faa9/10794698/b8e2598f3b90/41598_2024_51615_Fig11_HTML.jpg


