Department of Biomedical Informatics, Columbia University, New York, New York 10032, USA.
J Am Med Inform Assoc. 2011 Dec;18 Suppl 1(Suppl 1):i109-15. doi: 10.1136/amiajnl-2011-000463. Epub 2011 Nov 23.
To demonstrate that a large, heterogeneous clinical database can reveal fine temporal patterns in clinical associations; to illustrate several types of associations; and to ascertain the value of exploiting time.
Lagged linear correlation was calculated between seven clinical laboratory values and 30 clinical concepts extracted from resident signout notes from a 22-year, 3-million-patient database of electronic health records. Time points were interpolated, and patients were normalized to reduce inter-patient effects.
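The steps named above (interpolation, per-patient normalization, lagged linear correlation) can be illustrated with a minimal sketch for a single patient. This is not the authors' implementation: the function names, the use of linear interpolation onto an evenly spaced grid, and z-score normalization are all assumptions made for illustration, and the synthetic series stand in for a laboratory value and a concept signal.

```python
import numpy as np

def interpolate_to_grid(times, values, grid):
    """Linearly interpolate irregular samples onto a regular time grid (assumed scheme)."""
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    order = np.argsort(times)  # np.interp requires increasing sample times
    return np.interp(grid, times[order], values[order])

def normalize(values):
    """Z-score one patient's series (assumed normalization); constant series map to zeros."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    if std == 0:
        return np.zeros_like(values)
    return (values - values.mean()) / std

def lagged_correlation(x, y, max_lag):
    """Pearson correlation between x[t] and y[t + lag] for each lag in [-max_lag, max_lag].

    A positive lag means y is taken later in time than x.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:n - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:n + lag]
        if len(a) < 2 or a.std() == 0 or b.std() == 0:
            result[lag] = float("nan")  # correlation undefined for this overlap
        else:
            result[lag] = float(np.corrcoef(a, b)[0, 1])
    return result

# Synthetic example: y repeats x three grid steps later.
grid = np.arange(60, dtype=float)
x = normalize(np.sin(2 * np.pi * grid / 20))
y = normalize(np.sin(2 * np.pi * (grid - 3) / 20))
correlations = lagged_correlation(x, y, max_lag=5)
best_lag = max(correlations, key=correlations.get)  # lag with the highest correlation
```

In this synthetic case the correlation peaks at a lag of 3 grid steps, matching the shift built into `y`. Combining results across many patients, as the study did, would need an additional pooling step that is not shown here.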
The method revealed several types of associations with detailed temporal patterns. Definitional associations included low blood potassium preceding 'hypokalemia.' Low potassium preceding the drug spironolactone exemplified an intentional association, and high potassium following spironolactone exemplified a physiologic one. Counterintuitive results, such as diseases appearing to follow their effects, may be due to the workflow of healthcare, in which clinical findings precede the clinician's diagnosis of a disease even though the disease actually preceded the findings. Fully exploiting time by interpolating time points produced less noisy results.
Electronic health records are not direct reflections of the patient state, but rather reflections of the healthcare process and the recording process. With proper techniques and understanding, and with proper incorporation of time, interpretable associations can be derived from a large clinical database.
A large, heterogeneous clinical database can reveal clinical associations, time is an important feature, and care must be taken to interpret the results.