Integrated Program in Quantitative Biology, University of California San Francisco, San Francisco, CA, USA.
Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, CA, USA.
Nat Commun. 2019 Jul 10;10(1):3045. doi: 10.1038/s41467-019-11069-0.
To advance precision medicine, detailed clinical features ought to be described in a way that leverages current knowledge. Although data collected from biomedical research is expanding at an almost exponential rate, our ability to transform that information into patient care has not kept pace. A major barrier preventing this transformation is that multi-dimensional data collection and analysis are usually carried out without much understanding of the underlying knowledge structure. Here, in an effort to bridge this gap, Electronic Health Records (EHRs) of individual patients are connected to a heterogeneous knowledge network called the Scalable Precision Medicine Oriented Knowledge Engine (SPOKE). An unsupervised machine-learning algorithm then creates Propagated SPOKE Entry Vectors (PSEVs) that encode the importance of each SPOKE node for any code in the EHRs. We argue that these results, alongside the natural integration of PSEVs into any EHR machine-learning platform, provide a key step toward precision medicine.
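The abstract does not name the propagation algorithm, so the following is only an illustrative sketch of how a vector of node importances could be propagated from an entry point in a knowledge graph. It assumes a random walk with restart (personalized PageRank style) on a tiny undirected toy graph. The function name `propagate_entry_vector`, the restart probability, and the node labels are all hypothetical and are not taken from SPOKE or from the paper.

```python
def propagate_entry_vector(adjacency, entry_weights, restart=0.33,
                           tol=1e-10, max_iter=1000):
    """Random walk with restart over a small graph (illustrative only).

    adjacency:     dict mapping node -> list of neighbor nodes
    entry_weights: dict mapping node -> non-negative weight describing
                   where the walk (re)starts, e.g. nodes linked to an EHR code
    Returns a dict mapping node -> stationary visit probability.
    """
    nodes = sorted(adjacency)
    total = sum(entry_weights.values())
    if total <= 0:
        raise ValueError("entry_weights must contain positive mass")

    # Normalize the entry weights into a restart distribution.
    restart_dist = {n: entry_weights.get(n, 0.0) / total for n in nodes}
    rank = dict(restart_dist)

    for _ in range(max_iter):
        # Every step, a fraction `restart` of the walkers jumps back
        # to the entry nodes.
        new_rank = {n: restart * restart_dist[n] for n in nodes}
        for n in nodes:
            neighbors = adjacency[n]
            if not neighbors:
                # Dangling node: return its mass to the entry nodes.
                for m in nodes:
                    new_rank[m] += (1 - restart) * rank[n] * restart_dist[m]
                continue
            # Otherwise split the remaining mass evenly among neighbors.
            share = (1 - restart) * rank[n] / len(neighbors)
            for m in neighbors:
                new_rank[m] += share

        delta = sum(abs(new_rank[n] - rank[n]) for n in nodes)
        rank = new_rank
        if delta < tol:
            break
    return rank


# Hypothetical toy graph: one disease node linked to a gene and a compound.
toy_graph = {
    "disease_A": ["gene_1", "compound_X"],
    "gene_1": ["disease_A", "gene_2"],
    "gene_2": ["gene_1"],
    "compound_X": ["disease_A"],
}

# Assume an EHR code maps only to disease_A (the "entry" node).
psev = propagate_entry_vector(toy_graph, {"disease_A": 1.0})
for node, score in sorted(psev.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.4f}")
```

In this sketch, nodes closer to the entry node receive more weight than distant ones, which is the general sense in which a propagated vector can encode the importance of each node for a given EHR code. A real pipeline would also need a step, not shown here, that maps EHR codes to entry nodes and weights them across a patient cohort.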