Development of machine learning model for diagnostic disease prediction based on laboratory tests.

Affiliations

Department of Laboratory Medicine, College of Medicine, Ewha Womans University of Korea, Seoul, South Korea.

Department of Laboratory Medicine, St. Vincent's Hospital, The Catholic University of Korea, Seoul, South Korea.

Publication details

Sci Rep. 2021 Apr 7;11(1):7567. doi: 10.1038/s41598-021-87171-5.

DOI:10.1038/s41598-021-87171-5
PMID:33828178
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8026627/
Abstract

The use of deep learning and machine learning (ML) in medical science is increasing, particularly in the visual, audio, and language data fields. We aimed to build a new optimized ensemble model by blending a DNN (deep neural network) model with two ML models for disease prediction using laboratory test results. 86 attributes (laboratory tests) were selected from datasets based on value counts, clinical importance-related features, and missing values. We collected sample datasets on 5145 cases, including 326,686 laboratory test results. We investigated a total of 39 specific diseases based on the International Classification of Diseases, 10th revision (ICD-10) codes. These datasets were used to construct light gradient boosting machine (LightGBM) and extreme gradient boosting (XGBoost) ML models and a DNN model using TensorFlow. The optimized ensemble model achieved an F1-score of 81% and prediction accuracy of 92% for the five most common diseases. The deep learning and ML models showed differences in predictive power and disease classification patterns. We used a confusion matrix and analyzed feature importance using the SHAP value method. Our new ML model achieved high efficiency of disease prediction through classification of diseases. This study will be useful in the prediction and diagnosis of diseases.
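The abstract names the ingredients (a DNN blended with LightGBM and XGBoost, evaluated with a confusion matrix, F1-score, and accuracy) but does not say how the three models are combined. The sketch below assumes one common scheme, weighted soft voting over per-class probabilities, and assumes a macro-averaged F1. The probability tables, the equal weights, and every function name are hypothetical toy values for illustration only; they are not the study's data, code, or results.

```python
# Toy sketch of weighted soft voting plus confusion-matrix metrics.
# Assumption: the three models each output per-class probabilities.
# All numbers below are invented for illustration.

def blend(prob_sets, weights):
    """Weighted average of per-model class-probability rows."""
    total = sum(weights)
    n_samples = len(prob_sets[0])
    n_classes = len(prob_sets[0][0])
    return [
        [
            sum(w * probs[i][c] for probs, w in zip(prob_sets, weights)) / total
            for c in range(n_classes)
        ]
        for i in range(n_samples)
    ]

def predict(probs):
    """Pick the class with the highest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def accuracy(cm):
    """Fraction of samples on the diagonal."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)

def macro_f1(cm):
    """Unweighted mean of per-class F1 = 2TP / (2TP + FP + FN)."""
    n = len(cm)
    scores = []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[r][k] for r in range(n)) - tp
        fn = sum(cm[k]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n

# Hypothetical probabilities for 4 samples and 3 disease classes.
dnn_probs = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1]]
lgbm_probs = [[0.7, 0.2, 0.1], [0.1, 0.7, 0.2], [0.2, 0.2, 0.6], [0.3, 0.6, 0.1]]
xgb_probs = [[0.5, 0.4, 0.1], [0.3, 0.4, 0.3], [0.1, 0.3, 0.6], [0.2, 0.7, 0.1]]
y_true = [0, 1, 2, 1]
weights = [1.0, 1.0, 1.0]  # equal weights; a real blend would tune these

dnn_pred = predict(dnn_probs)
ens_pred = predict(blend([dnn_probs, lgbm_probs, xgb_probs], weights))

dnn_cm = confusion_matrix(y_true, dnn_pred, 3)
ens_cm = confusion_matrix(y_true, ens_pred, 3)

print("DNN alone:", accuracy(dnn_cm), round(macro_f1(dnn_cm), 3))
print("Ensemble :", accuracy(ens_cm), round(macro_f1(ens_cm), 3))
```

In this toy data the DNN mislabels the fourth sample, while the two boosted-tree models outvote it in the blend. In practice the weights would be tuned on a held-out validation split rather than fixed at equal values.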


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b597/8026627/5582b74c1ca9/41598_2021_87171_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b597/8026627/74d8fc2ee186/41598_2021_87171_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b597/8026627/13444135c54d/41598_2021_87171_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b597/8026627/e2d36b63d817/41598_2021_87171_Fig3_HTML.jpg
