Department of Biostatistics and Informatics, Colorado School of Public Health, University of Colorado Anschutz Medical Campus, 13001 East 17th Place, 3rd Floor, Mail Stop B119, Aurora, CO 80045, United States of America; Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America.
Department of Biomedical Informatics, University of Colorado School of Medicine, Anschutz Health Sciences Building, 1890 N. Revere Court, Mailstop F600, Aurora, CO 80045, United States of America; Department of Biomedical Engineering, University of Colorado, 12705 East Montview Boulevard, Suite 100, Aurora, CO 80045, United States of America.
J Biomed Inform. 2023 Dec;148:104547. doi: 10.1016/j.jbi.2023.104547. Epub 2023 Nov 18.
Computing phenotypes that provide high-fidelity, time-dependent characterizations and yield personalized interpretations is challenging, especially given the complexity of physiological and healthcare systems and the variable quality of clinical data. This paper develops a methodological pipeline that estimates unmeasured physiological parameters from electronic health record (EHR) data and produces high-fidelity, personalized phenotypes anchored to physiological mechanics.
A methodological phenotyping pipeline is developed that computes new phenotypes defined by computational biomarkers, which quantify specific physiological properties that cannot be measured directly, in real time. Working within the inverse problem framework, the pipeline is applied to the glucose-insulin system of ICU patients, using data assimilation with stochastic optimization to estimate an established mathematical physiological model. This produces, for each individual patient, physiological model parameter vectors describing clinically unmeasured endocrine properties, here insulin secretion, clearance, and resistance. These parameter vectors are used as inputs to unsupervised machine learning methods to produce phenotypic labels and discrete physiological phenotypes. The phenotypes are inherently interpretable because they are based on parametric physiological descriptors. To establish potential clinical utility, the computed phenotypes are evaluated against external EHR data for consistency and reliability and through clinician face validation.
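The two stages described above (per-patient parameter estimation by stochastic optimization, then unsupervised grouping of the parameter vectors) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' method: the one-compartment toy model in `simulate` stands in for the established glucose-insulin model, a shrinking-step random search stands in for the stochastic optimizer, plain k-means stands in for the unspecified unsupervised method, and the data are synthetic.

```python
import random

def simulate(theta, g0, n_steps):
    """Toy model (an assumption): G[t+1] = G[t] - k * (G[t] - g_base)."""
    k, g_base = theta
    g, trace = g0, []
    for _ in range(n_steps):
        trace.append(g)
        g = g - k * (g - g_base)
    return trace

def loss(theta, observed, g0):
    """Mean squared error between simulated and observed glucose."""
    predicted = simulate(theta, g0, len(observed))
    return sum((p - o) ** 2 for p, o in zip(predicted, observed)) / len(observed)

def random_search(observed, g0, bounds, n_iter=4000, seed=0):
    """Hypothetical stochastic optimizer: perturb the best candidate with a
    Gaussian step that shrinks over time, keeping only improvements."""
    rng = random.Random(seed)
    best = [rng.uniform(lo, hi) for lo, hi in bounds]
    best_loss = loss(best, observed, g0)
    for i in range(n_iter):
        scale = 0.5 * (1.0 - i / n_iter) + 0.01
        cand = [min(hi, max(lo, b + rng.gauss(0.0, scale * (hi - lo))))
                for b, (lo, hi) in zip(best, bounds)]
        cand_loss = loss(cand, observed, g0)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
    return best, best_loss

def standardize(rows):
    """Z-score each parameter so no single unit dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, stds)] for r in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with deterministic farthest-point initialization."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j]))
                  for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Synthetic cohort: two groups with different (rate k, baseline glucose).
true_thetas = [(0.05, 100.0), (0.06, 105.0), (0.045, 95.0),
               (0.30, 140.0), (0.28, 145.0), (0.32, 138.0)]
bounds = [(0.01, 0.5), (60.0, 200.0)]  # search box for (k, g_base)
noise = random.Random(42)

thetas, fit_losses = [], []
for i, theta in enumerate(true_thetas):
    observed = [g + noise.gauss(0.0, 1.0) for g in simulate(theta, 200.0, 30)]
    theta_hat, fit_loss = random_search(observed, 200.0, bounds, seed=i)
    thetas.append(theta_hat)
    fit_losses.append(fit_loss)

# Cluster the estimated parameter vectors into discrete "phenotype" labels.
labels = kmeans(standardize(thetas), k=2)
print(labels)
```

Because each cluster is defined over estimated model parameters rather than raw measurements, a label can be read back in physiological terms (for example, "fast return to a high baseline"), which is the sense in which such phenotypes are interpretable.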
The phenotype computation was performed on a cohort of 109 ICU patients who received either no insulin therapy or only short-acting insulin therapy. It rendered continuous and discrete physiological phenotypes as specific computational biomarkers of unmeasured insulin secretion, clearance, and resistance over time windows of three days. Six, six, and five discrete phenotypes were found in the first, middle, and last three-day periods of ICU stays, respectively. Computed phenotypic labels were predictive with an average accuracy of 89%. External validation of the discrete phenotypes showed coherent and consistent clinically observable differences, based on laboratory measurements and ICD-9/10 codes, as well as clinical concordance in face validation. A particularly clinically impactful parameter, insulin secretion, had a concordance accuracy of 83%±27%.
The new physiological phenotypes, computed from individual patient ICU data and defined by estimates of mechanistic model parameters, have high physiological fidelity and are continuous, time-specific, personalized, interpretable, and predictive. This methodology is generalizable to other clinical and physiological settings and opens the door to discovering deeper physiological information for personalizing medical care.