Semi-supervised approach to event time annotation using longitudinal electronic health records.

Affiliations

Department of Biostatistics, Harvard T. H. Chan School of Public Health, Boston, MA, USA.

Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, MA, USA.

Publication information

Lifetime Data Anal. 2022 Jul;28(3):428-491. doi: 10.1007/s10985-022-09557-5. Epub 2022 Jun 26.


DOI: 10.1007/s10985-022-09557-5
PMID: 35753014
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10044535/
Abstract

Large clinical datasets derived from insurance claims and electronic health record (EHR) systems are valuable sources for precision medicine research. These datasets can be used to develop models for personalized prediction of risk or treatment response. Efficiently deriving prediction models from real-world data, however, faces practical and methodological challenges. Precise information on important clinical outcomes, such as time to cancer progression, is not readily available in these databases. The true clinical event times typically cannot be approximated well from simple extracts of billing or procedure codes, and annotating event times manually is prohibitive in both time and resources. In this paper, we propose a two-step semi-supervised multi-modal automated time annotation (MATA) method leveraging multi-dimensional longitudinal EHR encounter records. In step I, we employ a functional principal component analysis approach to estimate the underlying intensity functions based on observed point processes from the unlabeled patients. In step II, we fit a penalized proportional odds model to the event time outcomes in the labeled data, using features derived in step I and approximating the non-parametric baseline function with B-splines. Under regularity conditions, the resulting estimator of the feature effect vector is shown to be root-n consistent. We demonstrate the superiority of our approach relative to existing approaches through simulations and a real data example on annotating lung cancer recurrence in an EHR cohort of lung cancer patients from the Veterans Health Administration.
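
The two steps can be illustrated with a toy sketch. This is not the authors' implementation: it swaps in Gaussian kernel smoothing and a power-iteration principal component for the functional principal component analysis of step I, and for step II it only evaluates a proportional odds survival curve rather than fitting a penalized model with a B-spline baseline. The function names, the simulated encounter times, and the parameterization logit P(T <= t | z) = h(t) + beta'z are all assumptions made for illustration.

```python
import math
import random


def smoothed_intensity(event_times, grid, bandwidth):
    """Gaussian-kernel estimate of an encounter intensity on a time grid."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - t) / bandwidth) ** 2) for t in event_times) / norm
        for g in grid
    ]


def first_component(curves, n_iter=200):
    """Leading principal component of discretized curves via power iteration."""
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    centered = [[c[j] - mean[j] for j in range(m)] for c in curves]
    # Deterministic, non-constant starting vector, normalized to unit length.
    v = [float(j + 1) for j in range(m)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(n_iter):
        # Apply the sample covariance without forming it: X^T (X v) / n.
        proj = [sum(r[j] * v[j] for j in range(m)) for r in centered]
        w = [sum(centered[i][j] * proj[i] for i in range(n)) / n for j in range(m)]
        w_norm = math.sqrt(sum(x * x for x in w))
        if w_norm == 0.0:  # degenerate case: no variation along v
            break
        v = [x / w_norm for x in w]
    scores = [sum(r[j] * v[j] for j in range(m)) for r in centered]
    return mean, v, scores


def po_survival(h_t, beta, z):
    """Survival under an assumed proportional odds form:
    logit P(T <= t | z) = h(t) + beta'z, so S(t | z) = 1 / (1 + exp(h(t) + beta'z))."""
    lin = h_t + sum(b * x for b, x in zip(beta, z))
    return 1.0 / (1.0 + math.exp(lin))


# Step I on simulated (hypothetical) data: half the patients have encounters
# clustered early in follow-up, half late.
random.seed(0)
grid = [i / 10 for i in range(51)]
patients = [
    [random.gauss(1.5 if i % 2 == 0 else 3.5, 0.5) for _ in range(20)]
    for i in range(30)
]
curves = [smoothed_intensity(p, grid, bandwidth=0.3) for p in patients]
mean_curve, component, scores = first_component(curves)

# Step II (evaluation only): plug one patient's score in as the feature z.
# The baseline value h(t) and the coefficient beta are made-up numbers here.
example_survival = po_survival(h_t=-1.0, beta=[0.5], z=[scores[0]])
```

In the method described above, the baseline h(t) would be approximated with B-splines and the coefficients estimated with a penalty from the labeled patients; here both are supplied by hand so that only the shape of the model is visible.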

