Improving an Electronic Health Record-Based Clinical Prediction Model Under Label Deficiency: Network-Based Generative Adversarial Semisupervised Approach.

Authors

Li Runze, Tian Yu, Shen Zhuyi, Li Jin, Li Jun, Ding Kefeng, Li Jingsong

Affiliations

College of Biomedical Engineering and Instrument Science, Zhejiang University, Hangzhou, China.

Institute for Artificial Intelligence in Medicine, School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing, China.

Publication

JMIR Med Inform. 2023 Jun 13;11:e47862. doi: 10.2196/47862.


DOI:10.2196/47862
PMID:37310778
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10337516/
Abstract

BACKGROUND: Observational biomedical studies facilitate a new strategy for large-scale electronic health record (EHR) utilization to support precision medicine. However, data label inaccessibility is an increasingly important issue in clinical prediction, despite the use of synthetic and semisupervised learning from data. Little research has aimed to uncover the underlying graphical structure of EHRs.

OBJECTIVE: A network-based generative adversarial semisupervised method is proposed. The objective is to train clinical prediction models on label-deficient EHRs to achieve comparable learning performance to supervised methods.

METHODS: Three public data sets and one colorectal cancer data set gathered from the Second Affiliated Hospital of Zhejiang University were selected as benchmarks. The proposed models were trained on 5% to 25% labeled data and evaluated on classification metrics against conventional semisupervised and supervised methods. The data quality, model security, and memory scalability were also evaluated.

RESULTS: The proposed method for semisupervised classification outperforms related semisupervised methods under the same setup, with the average area under the receiver operating characteristics curve (AUC) reaching 0.945, 0.673, 0.611, and 0.588 for the four data sets, respectively, followed by graph-based semisupervised learning (0.450, 0.454, 0.425, and 0.5676, respectively) and label propagation (0.475, 0.344, 0.440, and 0.477, respectively). The average classification AUCs with 10% labeled data were 0.929, 0.719, 0.652, and 0.650, respectively, comparable to that of the supervised learning methods logistic regression (0.601, 0.670, 0.731, and 0.710, respectively), support vector machines (0.733, 0.720, 0.720, and 0.721, respectively), and random forests (0.982, 0.750, 0.758, and 0.740, respectively). The concerns regarding the secondary use of data and data security are alleviated by realistic data synthesis and robust privacy preservation.

CONCLUSIONS: Training clinical prediction models on label-deficient EHRs is indispensable in data-driven research. The proposed method has great potential to exploit the intrinsic structure of EHRs and achieve comparable learning performance to supervised methods.
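
The evaluation setup described in the abstract (training with only 5% to 25% of labels available, then scoring by AUC) can be illustrated with a minimal sketch. This is not the authors' code or model: the helper names `mask_labels` and `roc_auc`, the random masking scheme, the fixed seed, and the toy data are all assumptions made for illustration. It only shows how a label-deficient split might be simulated and how AUC can be computed from its pairwise-ranking definition.

```python
import random


def mask_labels(labels, labeled_fraction, seed=0):
    """Simulate label deficiency (hypothetical helper).

    Keeps a random `labeled_fraction` of the labels and replaces the rest
    with None, which stands for "unlabeled".
    """
    rng = random.Random(seed)
    n_keep = max(1, round(len(labels) * labeled_fraction))
    keep = set(rng.sample(range(len(labels)), n_keep))
    return [y if i in keep else None for i, y in enumerate(labels)]


def roc_auc(y_true, scores):
    """AUC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one.
    Ties count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy binary labels; a real study would use EHR outcomes instead.
    labels = [0, 1] * 50

    # The abstract reports 5% to 25% labeled data; the step size is an assumption.
    for fraction in (0.05, 0.10, 0.15, 0.20, 0.25):
        masked = mask_labels(labels, fraction)
        n_labeled = sum(y is not None for y in masked)
        print(f"{fraction:.0%} labeled -> {n_labeled} of {len(labels)} labels kept")

    # Scoring a set of hypothetical model outputs against the true labels.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise definition is quadratic in the number of examples, which is fine for a sketch; a rank-based implementation would be the usual choice on data sets of realistic size.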

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73ed/10337516/b33e97318689/medinform_v11i1e47862_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73ed/10337516/8abc5e8148a0/medinform_v11i1e47862_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73ed/10337516/7499e126d608/medinform_v11i1e47862_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73ed/10337516/d6dbcf3fc399/medinform_v11i1e47862_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73ed/10337516/96606c500eb0/medinform_v11i1e47862_fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/73ed/10337516/23e78f6d9dbd/medinform_v11i1e47862_fig6.jpg

Cited by

[1]
Role of Generative Artificial Intelligence in Personalized Medicine: A Systematic Review.

Cureus. 2025-4-15

[2]
A review on generative AI models for synthetic medical text, time series, and longitudinal data.

NPJ Digit Med. 2025-5-15
