Employing heat maps to mine associations in structured routine care data.

Affiliations

Chair of Medical Informatics, University of Erlangen-Nuremberg, Krankenhausstr. 12, 91054 Erlangen, Germany.

Medical Center for Information and Communication, Erlangen University Hospital, Krankenhausstr. 12, 91054 Erlangen, Germany.

Publication information

Artif Intell Med. 2014 Feb;60(2):79-88. doi: 10.1016/j.artmed.2013.12.003. Epub 2013 Dec 15.

Abstract

OBJECTIVE

Mining the electronic medical record (EMR) has the potential to deliver new medical knowledge about causal effects, which are hidden in statistical associations between different patient attributes. Our goal is to detect such causal mechanisms in current research projects, for example the detection of determinants of imminent ICU readmission. An iterative statistical approach that examines every considered attribute pair delivers potential answers, but the resulting matrices are difficult to interpret. We therefore aimed to improve the interpretation of these matrices by using heat maps, and we propose strategies for adapting heat maps to the search for associations and causal effects within routine EMR data.

METHODS

Heat maps visualize tabulated metric datasets as grid-like choropleth maps and thus present measures of association between numerous attribute pairs in a clearly arranged form. Basic assumptions about plausible exposures and outcomes are used to allocate distinct attribute sets to the two matrix dimensions. The image then avoids certain redundant graphical elements and provides a clearer picture of the supposed associations. Specific color schemes were chosen to incorporate preexisting information about similarities between attributes. Because measures of association serve as the clustering input, we apply transformations which ensure that distance metrics always assume finite values and treat positive and negative associations in the same way. To evaluate the general capability of the approach, we analyzed simulated datasets and assessed diagnostic and procedural codes in a large routine care dataset.
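The abstract does not name the measure of association or the transformations used, so the following is only a minimal sketch under stated assumptions: binary exposure and outcome attributes, the log odds ratio as the measure, a continuity correction of 0.5 per cell to keep every value finite, and the absolute value to treat positive and negative associations alike. The function names and the simulated data are hypothetical and serve illustration only.

```python
import numpy as np

def log_odds_ratio_matrix(exposures, outcomes, correction=0.5):
    """Log odds ratio for every exposure/outcome pair.

    exposures: (n_cases, n_exposures) array of 0/1 values
    outcomes:  (n_cases, n_outcomes) array of 0/1 values
    correction: value added to each 2x2 cell (an assumption, not from the
                paper) so that empty cells never yield infinite values.
    """
    exposures = np.asarray(exposures, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    # Cell counts of the 2x2 table for all pairs at once.
    a = exposures.T @ outcomes                # exposed, outcome present
    b = exposures.T @ (1 - outcomes)          # exposed, outcome absent
    c = (1 - exposures).T @ outcomes          # unexposed, outcome present
    d = (1 - exposures).T @ (1 - outcomes)    # unexposed, outcome absent
    a, b, c, d = (cell + correction for cell in (a, b, c, d))
    return np.log(a * d / (b * c))

def association_strength(log_or):
    """Treat positive and negative associations the same way (assumed transform)."""
    return np.abs(log_or)

# Hypothetical simulated data: one planted positive and one planted negative association.
rng = np.random.default_rng(0)
n_cases = 2000
exposures = rng.integers(0, 2, size=(n_cases, 3))
outcomes = rng.integers(0, 2, size=(n_cases, 2))
outcomes[:, 0] = exposures[:, 0]        # outcome 0 always follows exposure 0
outcomes[:, 1] = 1 - exposures[:, 1]    # outcome 1 never occurs with exposure 1

log_or = log_odds_ratio_matrix(exposures, outcomes)
strength = association_strength(log_or)
```

Rows of `log_or` correspond to exposure attributes and columns to outcome attributes, which matches the idea of allocating distinct attribute sets to the two matrix dimensions. The `strength` matrix could then be passed to a clustering routine or rendered as a heat map.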

RESULTS

Simulation results demonstrate that the proposed clustering procedure rearranges attributes in line with the simulated statistical associations. Thus, heat maps are an excellent tool to indicate whether associations concern the same attributes or different ones, and whether affected attribute sets conform to any preexisting relationship between attributes. The dendrograms help in deciding whether contiguous sequences of attributes actually correspond to homogeneous attribute associations. The exemplary analysis of a routine care dataset revealed patterns of associations that follow plausible medical constellations for several diseases and the associated medical procedures and activities. Cases with breast cancer (ICD C50), for example, appeared to be associated with radiation therapy (8-52). As a cross-check, approximately 60 percent of the attribute pairs in this dataset showed a strong negative association, which can be explained by diseases being treated in a medical specialty that does not routinely perform the respective procedures in these cases. The corresponding diagram clearly reflects these relationships in the shape of coherent subareas.
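The kind of check described for the simulation can be sketched as follows. This is not the authors' simulation design: the matrix size, noise level, average linkage, and Euclidean distance are all assumptions chosen for illustration. The sketch plants two groups of row attributes with different association patterns, interleaves them, and tests whether hierarchical clustering places the members of each group next to each other.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list

rng = np.random.default_rng(1)

# Hypothetical association matrix: 8 row attributes, 6 column attributes.
# The two planted groups are interleaved so the initial order hides them.
group = np.array([0, 1, 0, 1, 0, 1, 0, 1])
assoc = rng.normal(0.0, 0.1, size=(8, 6))   # weak background noise
assoc[group == 0, :3] += 2.0                # group 0: positive association with columns 0-2
assoc[group == 1, 3:] -= 2.0                # group 1: negative association with columns 3-5

# Assumed transform: the absolute value treats positive and negative associations alike.
profile = np.abs(assoc)

# Agglomerative clustering of the rows; the leaf order is the row order
# a clustered heat map would display.
tree = linkage(profile, method="average", metric="euclidean")
order = leaves_list(tree)
reordered_groups = group[order]

# Number of places where neighbouring rows belong to different planted groups.
boundaries = int(np.sum(reordered_groups[1:] != reordered_groups[:-1]))
print("row order:", order, "group boundaries:", boundaries)
```

A single group boundary in the reordered rows means each planted group forms one contiguous block, which is the pattern that would appear as a coherent subarea in the heat map.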

CONCLUSION

We demonstrated that heat maps of measures of association are effective for visualizing patterns in routine care EMRs. The adjustable method for assigning attributes to image dimensions permits a balance between the display of ample information and a favorable level of graphical complexity. The scope of the search can be adapted by using preexisting assumptions about plausible effects to select exposure and outcome attributes. Thus, the proposed method promises to simplify the detection of undiscovered causal effects within routine EMR data.
