Lee Suhyeon, Kim Suhyun, Koh Gayoun, Ahn Hongryul
Division of Data Science, The University of Suwon, Hwaseong-si 16419, Republic of Korea.
DS&ML Center, The University of Suwon, Hwaseong-si 16419, Republic of Korea.
J Pers Med. 2024 Jul 31;14(8):812. doi: 10.3390/jpm14080812.
Electronic Health Records (EHRs) are a significant source of big data used to track health variables over time. The analysis of EHR data can uncover medical markers or risk factors, aiding in the diagnosis and monitoring of diseases. We introduce a novel method for identifying markers with various temporal trend patterns, including monotonic and fluctuating trends, using machine learning models such as Long Short-Term Memory (LSTM). Applying our method to pneumonia patients in the intensive care unit from the MIMIC-III dataset, we identified markers exhibiting both monotonic and fluctuating trends. Specifically, monotonic markers such as red cell distribution width, urea nitrogen, creatinine, calcium, morphine sulfate, bicarbonate, sodium, troponin T, albumin, and prothrombin time were observed more frequently in the mortality group than in the recovery group throughout the 10-day period before discharge. Conversely, fluctuating-trend markers such as dextrose in sterile water, polystyrene sulfonate, free calcium, and glucose were observed more frequently in the mortality group as the discharge date approached. Our study presents a method for detecting time-series pattern markers in EHR data that respond differently according to disease progression. These markers can contribute to monitoring disease progression and enable stage-specific treatment, thereby advancing precision medicine.
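The two findings above distinguish markers that are elevated in the mortality group across the whole 10-day window from markers whose elevation appears only as discharge approaches. The sketch below is a toy illustration of that distinction only. It is not the LSTM-based method of the paper, and the function name `gap_pattern`, the labels it returns, the `min_gap` threshold, and the example numbers are all hypothetical.

```python
from typing import Sequence


def gap_pattern(mortality_freq: Sequence[float],
                recovery_freq: Sequence[float],
                min_gap: float = 0.05) -> str:
    """Label when a marker is observed more often in the mortality group.

    Inputs are per-day observation frequencies (oldest day first, discharge
    day last) for each group. A day counts as "elevated" when the mortality
    frequency exceeds the recovery frequency by at least `min_gap`.
    The threshold and labels are illustrative assumptions.
    """
    if len(mortality_freq) != len(recovery_freq) or not mortality_freq:
        raise ValueError("need two non-empty series of equal length")

    elevated = [(m - r) >= min_gap
                for m, r in zip(mortality_freq, recovery_freq)]

    # Elevated on every day of the window.
    if all(elevated):
        return "throughout"

    # Elevated only in one unbroken run that ends on the discharge day.
    if elevated[-1]:
        first = len(elevated) - 1
        while first > 0 and elevated[first - 1]:
            first -= 1
        if not any(elevated[:first]):
            return "near_discharge"

    # Anything else (never elevated, or elevated intermittently).
    return "other"


# Synthetic example values, not taken from MIMIC-III.
steady = gap_pattern([0.5] * 10, [0.2] * 10)
late = gap_pattern([0.2] * 7 + [0.4, 0.5, 0.6], [0.2] * 10)
print(steady, late)
```

In practice the per-day frequencies would come from counting how often each marker is recorded per patient-day in each outcome group; how the paper derives and scores its patterns is not specified in this abstract.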