Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records.

Authors

Li Zhao, Lan Lan, Zhou Yujia, Li Ruoxing, Chavin Kenneth D, Xu Hua, Li Liang, Shih David J H, Zheng W Jim

Affiliations

McWilliams School of Biomedical Informatics, University of Texas Health Science Center at Houston, 7000 Fannin Street, Suite 600, Houston, Texas, 77030.

Department of Surgery, Case Western Reserve University School of Medicine, 11100 Euclid Ave, Cleveland OH 44106.

Publication information

medRxiv. 2023 Nov 17:2023.11.17.23298691. doi: 10.1101/2023.11.17.23298691.

DOI:10.1101/2023.11.17.23298691
PMID:38014193
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10680899/
Abstract

BACKGROUND

Deep learning models have shown great success and potential in many biomedical problems. However, when structured electronic health record (EHR) data are used, the accuracy of deep learning models for many disease prediction problems is affected by time-varying covariates, rare incidence, and covariate imbalance. The situation is further exacerbated when predicting the risk of one disease conditional on another, such as the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease, because of slow, chronic progression, the scarcity of data on patients with both conditions, and the sex bias of the diseases.

OBJECTIVE

The goal of this study is to investigate the extent to which time-varying covariates, rare incidence, and covariate imbalance influence deep learning performance, and to devise strategies to tackle these challenges. These strategies were applied to improve hepatocellular carcinoma risk prediction among patients with nonalcoholic fatty liver disease.

METHODS

We evaluated two representative deep learning models on the task of predicting the occurrence of hepatocellular carcinoma in a cohort of patients with nonalcoholic fatty liver disease (n = 220,838) from a national EHR database. The disease prediction task was formulated as a classification problem that takes censoring and the length of follow-up into account.
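
The abstract does not spell out how censoring and follow-up length enter the classification labels. As a minimal sketch of one common way to do this, assuming a fixed prediction horizon (the `Patient` fields, the `horizon_days` parameter, and the exclusion rule are all illustrative assumptions, not the authors' definition):

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical schema; the abstract does not describe one.
    patient_id: str
    followup_days: int              # days from index date to last record
    hcc_day: Optional[int] = None   # days from index date to HCC diagnosis, if any


def label_patient(p: Patient, horizon_days: int) -> Optional[int]:
    """Return 1 (case), 0 (control), or None (censored, excluded from training).

    Assumed rule: a patient is a case if HCC occurs within the horizon,
    a control if followed for the whole horizon without HCC in that window,
    and excluded if follow-up ends before the horizon with no HCC observed.
    """
    if p.hcc_day is not None and 0 < p.hcc_day <= horizon_days:
        return 1
    if p.followup_days >= horizon_days:
        return 0
    return None


# Toy cohort, made up for illustration only.
cohort = [
    Patient("a", followup_days=2000, hcc_day=700),   # HCC inside the horizon
    Patient("b", followup_days=1500),                # event-free through the horizon
    Patient("c", followup_days=300),                 # censored too early
    Patient("d", followup_days=2500, hcc_day=2200),  # HCC only after the horizon
]
labels = {p.patient_id: label_patient(p, horizon_days=1095) for p in cohort}
print(labels)
```

Under this assumed rule, patients whose follow-up is too short to know the outcome are dropped rather than counted as controls, which avoids labeling undetected future cases as negatives.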

RESULTS

We developed a novel backward masking scheme to evaluate how the length of longitudinal information after the index date affects disease prediction. We observed that modeling time-varying covariates improved the performance of the algorithms, and that transfer learning mitigated the performance loss caused by the lack of data. In addition, covariate imbalance, such as sex bias in the data, impaired performance. Deep learning models trained on one sex and evaluated on the other showed reduced performance, which indicates the importance of assessing covariate imbalance when preparing data for model training.
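
The abstract names the backward masking scheme but gives no details. One possible reading, sketched here under that assumption, is to hide the most recent records first, so that only visits within a chosen window after the index date remain visible to the model; repeating the evaluation for several window lengths then shows how much longitudinal information the prediction needs. The `Visit` type, the `mask_backward` function, and the event names are hypothetical.

```python
from typing import List, Tuple

# (days since index date, event name); an assumed record layout.
Visit = Tuple[int, str]


def mask_backward(visits: List[Visit], keep_days: int) -> List[Visit]:
    """Keep visits up to keep_days after the index date and hide later ones."""
    return [(day, code) for day, code in sorted(visits) if day <= keep_days]


# Toy history, made up for illustration only.
history = [(0, "NAFLD"), (90, "T2DM"), (400, "CIRRHOSIS"), (800, "ASCITES")]

for window in (30, 365, 730):
    visible = mask_backward(history, window)
    print(window, [code for _, code in visible])
```

In an actual experiment, each masked history would be fed to the same trained model and the evaluation metric recorded per window.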

CONCLUSIONS

Devising proper strategies to address the challenges posed by time-varying covariates, lack of data, and covariate imbalance can be key to counteracting data bias and accurately predicting disease occurrence with deep learning models. The novel strategies developed in this work can significantly improve the performance of hepatocellular carcinoma risk prediction among patients with nonalcoholic fatty liver disease. Furthermore, these strategies can be generalized to other disease risk predictions that use structured electronic health records, especially predictions of one disease risk conditional on another disease.
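
A simple first check for covariate imbalance of the kind the abstract describes is to compare the outcome rate across groups before training. The sketch below is illustrative only: the function name, the row layout, and the toy numbers are assumptions and do not come from the study.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def outcome_rate_by_group(rows: Iterable[Tuple[str, int]]) -> Dict[str, float]:
    """Compute the fraction of positive labels within each group.

    Each row is (group, label), with label 1 for a case and 0 for a control.
    """
    totals: Counter = Counter()
    cases: Counter = Counter()
    for group, label in rows:
        totals[group] += 1
        cases[group] += label
    return {group: cases[group] / totals[group] for group in totals}


# Toy data, made up for illustration: 100 female and 50 male patients.
rows = [("F", 0)] * 95 + [("F", 1)] * 5 + [("M", 0)] * 40 + [("M", 1)] * 10
rates = outcome_rate_by_group(rows)
print(rates)
```

A large gap between groups, in either group size or outcome rate, suggests evaluating the model separately per group rather than relying on a single pooled metric.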

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbaa/10680899/622006cce60d/nihpp-2023.11.17.23298691v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbaa/10680899/acbea566fa0f/nihpp-2023.11.17.23298691v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbaa/10680899/fe4d8b2b6d55/nihpp-2023.11.17.23298691v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbaa/10680899/e89ab7bbb0e4/nihpp-2023.11.17.23298691v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bbaa/10680899/23a1a2c221cd/nihpp-2023.11.17.23298691v1-f0005.jpg

Similar articles

1
Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records.
medRxiv. 2023 Nov 17:2023.11.17.23298691. doi: 10.1101/2023.11.17.23298691.
2
Developing deep learning-based strategies to predict the risk of hepatocellular carcinoma among patients with nonalcoholic fatty liver disease from electronic health records.
J Biomed Inform. 2024 Apr;152:104626. doi: 10.1016/j.jbi.2024.104626. Epub 2024 Mar 22.
3
Assessment of a Deep Learning Model to Predict Hepatocellular Carcinoma in Patients With Hepatitis C Cirrhosis.
JAMA Netw Open. 2020 Sep 1;3(9):e2015626. doi: 10.1001/jamanetworkopen.2020.15626.
4
Deep learning model for prediction of hepatocellular carcinoma in patients with HBV-related cirrhosis on antiviral therapy.
JHEP Rep. 2020 Aug 30;2(6):100175. doi: 10.1016/j.jhepr.2020.100175. eCollection 2020 Dec.
5
Applying interpretable deep learning models to identify chronic cough patients using EHR data.
Comput Methods Programs Biomed. 2021 Oct;210:106395. doi: 10.1016/j.cmpb.2021.106395. Epub 2021 Sep 4.
6
Folic acid supplementation and malaria susceptibility and severity among people taking antifolate antimalarial drugs in endemic areas.
Cochrane Database Syst Rev. 2022 Feb 1;2(2022):CD014217. doi: 10.1002/14651858.CD014217.
7
Predicting the 5-Year Risk of Nonalcoholic Fatty Liver Disease Using Machine Learning Models: Prospective Cohort Study.
J Med Internet Res. 2023 Sep 12;25:e46891. doi: 10.2196/46891.
8
Looking for low vision: Predicting visual prognosis by fusing structured and free-text data from electronic health records.
Int J Med Inform. 2022 Mar;159:104678. doi: 10.1016/j.ijmedinf.2021.104678. Epub 2021 Dec 30.
9
Accurate Prediction of Coronary Heart Disease for Patients With Hypertension From Electronic Health Records With Big Data and Machine-Learning Methods: Model Development and Performance Evaluation.
JMIR Med Inform. 2020 Jul 6;8(7):e17257. doi: 10.2196/17257.
10
Effect of ethnicity on liver transplant for hepatocellular carcinoma.
Exp Clin Transplant. 2013 Aug;11(4):339-45. doi: 10.6002/ect.2013.0008.
