Banerjee Ashis Gopal, Khan Mridul, Higgins John, Giani Annarita, Das Amar K
General Electric Global Research, Niskayuna, NY.
Department of Computer Science.
AMIA Annu Symp Proc. 2015 Nov 5;2015:306-13. eCollection 2015.
A major challenge in advancing scientific discoveries using data-driven clinical research is the fragmentation of relevant data among multiple information systems. This fragmentation requires significant data-engineering work before correlations can be found among data attributes in multiple systems. In this paper, we focus on integrating information on breast cancer care and present a novel computational approach that, through rapid data aggregation and analysis, identifies correlations between administered drugs captured in an electronic medical record and biological factors obtained from a tumor registry. We use an associative memory (AM) model to encode all existing associations among the data attributes from both systems in a high-dimensional vector space. The AM model stores highly associated data items in neighboring memory locations to enable efficient querying operations. Applying the AM model to a set of integrated data on tumor markers and drug administrations revealed anomalies between clinical recommendations and derived associations.
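To make the idea of storing and querying associations among attributes from two systems concrete, here is a minimal sketch in Python. It is not the authors' AM model: it only counts pairwise co-occurrences and does not perform the high-dimensional vector encoding or the neighboring-memory layout described in the abstract. The class name `CooccurrenceMemory`, its methods, and the placeholder records (`marker:A`, `drug:X`, and so on) are all hypothetical and carry no clinical meaning.

```python
from collections import defaultdict
from itertools import combinations


class CooccurrenceMemory:
    """Toy associative store: counts how often attribute values co-occur.

    Hypothetical illustration only; not the AM model used in the paper.
    """

    def __init__(self):
        self.pair_counts = defaultdict(int)

    def observe(self, record):
        """Record one integrated row, e.g. registry markers plus EMR drugs."""
        items = sorted(set(record))
        # Each unordered pair is stored once, under a sorted key.
        for a, b in combinations(items, 2):
            self.pair_counts[(a, b)] += 1

    def strength(self, a, b):
        """Return how many observed records contained both a and b."""
        key = (a, b) if a <= b else (b, a)
        return self.pair_counts.get(key, 0)

    def associated(self, item, top=3):
        """Return up to `top` items most often seen together with `item`."""
        scores = {}
        for (a, b), count in self.pair_counts.items():
            if a == item:
                scores[b] = count
            elif b == item:
                scores[a] = count
        # Highest count first; ties broken alphabetically for stable output.
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


# Synthetic placeholder records (assumed for illustration, not real data).
records = [
    ("marker:A", "drug:X"),
    ("marker:A", "drug:X"),
    ("marker:A", "drug:Y"),
    ("marker:B", "drug:Y"),
]

memory = CooccurrenceMemory()
for record in records:
    memory.observe(record)

print(memory.strength("marker:A", "drug:X"))
print(memory.associated("marker:A"))
```

In a sketch like this, an anomaly check would amount to comparing the stored association strengths against the marker and drug pairings that a guideline recommends and flagging pairs whose counts diverge from expectation.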