Multivariate longitudinal data for survival analysis of cardiovascular event prediction in young adults: insights from a comparative explainable study.

Affiliations

Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.

Department of Cardiology, Johns Hopkins University, Baltimore, MD, USA.

Publication information

BMC Med Res Methodol. 2023 Jan 25;23(1):23. doi: 10.1186/s12874-023-01845-4.

DOI: 10.1186/s12874-023-01845-4
PMID: 36698064
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9878947/
Abstract

BACKGROUND

Multivariate longitudinal data are under-utilized for survival analysis compared to cross-sectional data (CS; data collected once across the cohort). In cardiovascular risk prediction in particular, despite the availability of methods for longitudinal data analysis, the value of longitudinal information has not been established in terms of improved predictive accuracy and clinical applicability.

METHODS

We investigated the value of longitudinal data over and above the use of cross-sectional data via 6 distinct modeling strategies from statistics, machine learning, and deep learning that incorporate repeated measures for survival analysis of the time-to-cardiovascular event in the Coronary Artery Risk Development in Young Adults (CARDIA) cohort. We then examined and compared the use of model-specific interpretability methods (Random Survival Forest Variable Importance) and model-agnostic methods (SHapley Additive exPlanation (SHAP) and Temporal Importance Model Explanation (TIME)) in cardiovascular risk prediction using the top-performing models.
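The models in this comparison are scored with the C-index, so a minimal sketch of how a concordance index can be computed on right-censored data may help. It uses Harrell's estimator on made-up toy values. The abstract does not state which estimator the authors used, and the function name `harrell_c_index` and all data below are illustrative assumptions rather than anything taken from the study.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times       -- observed follow-up time per participant
    events      -- 1 if the event was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the participant with the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an observed
        # event; pairs with tied times are skipped in this simplified version.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical, not from CARDIA): five participants.
toy_times = [2, 5, 7, 9, 12]
toy_events = [1, 0, 1, 1, 0]
toy_risk = [0.9, 0.4, 0.3, 0.7, 0.1]

toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking of event times by predicted risk, which is the scale on which the reported values of 0.72 to 0.78 should be read.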

RESULTS

In a cohort of 3539 participants, longitudinal information from 35 variables that were repeatedly collected in 6 exam visits over 15 years improved subsequent long-term (17 years later) risk prediction by up to 8.3% in C-index compared to using baseline data (0.78 vs. 0.72), and by up to approximately 4% compared to using the last observed CS data (0.75). Time-varying AUC was also higher in models using longitudinal data (0.86-0.87 at 5 years, 0.79-0.81 at 10 years) than in models using baseline or last observed CS data (0.80-0.86 at 5 years, 0.73-0.77 at 10 years). Comparative model interpretability analysis revealed the impact of longitudinal variables on model prediction on both the individual and global scales among different modeling strategies, and identified the best time windows and the best timing within each window for event prediction. The best strategy to incorporate longitudinal data for accuracy was time series massive feature extraction, and the easiest interpretable strategy was trajectory clustering.
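The percentage gains quoted above are consistent with relative, not absolute, changes in C-index. The short check below works through that arithmetic. Reading the figures as relative changes is an inference from the reported numbers, not something the abstract states, and the helper name `relative_gain_percent` is hypothetical.

```python
def relative_gain_percent(new, reference):
    """Relative change of `new` over `reference`, expressed in percent."""
    return (new - reference) / reference * 100.0


# C-index values as reported in the abstract.
c_longitudinal = 0.78   # best model using longitudinal data
c_baseline = 0.72       # baseline cross-sectional data
c_last_observed = 0.75  # last observed cross-sectional data

gain_vs_baseline = relative_gain_percent(c_longitudinal, c_baseline)
gain_vs_last = relative_gain_percent(c_longitudinal, c_last_observed)

print(f"vs. baseline:      {gain_vs_baseline:.1f}%")
print(f"vs. last observed: {gain_vs_last:.1f}%")
```

In absolute terms the same differences are 0.06 and 0.03 C-index points, which is worth keeping in mind when comparing these gains with results reported elsewhere.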

CONCLUSION

Our analysis demonstrates the added value of longitudinal data in predictive accuracy and epidemiological utility in cardiovascular risk survival analysis in young adults via a unified, scalable framework that compares model performance and explainability. The framework can be extended to a larger number of variables and other longitudinal modeling methods.

TRIAL REGISTRATION

ClinicalTrials.gov Identifier: NCT00005130, Registration Date: 26/05/2000.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c63/9878947/037dd07b0a0b/12874_2023_1845_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c63/9878947/e9478bf1ec5c/12874_2023_1845_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c63/9878947/651fc2537cc0/12874_2023_1845_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c63/9878947/c22fb9690052/12874_2023_1845_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c63/9878947/f028fe550eeb/12874_2023_1845_Fig5_HTML.jpg
