Simmons Michael, Singhal Ayush, Lu Zhiyong
National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), 8600 Rockville Pike, Bldg 38A, 10N1003A, Bethesda, MD, 20894, USA.
Adv Exp Med Biol. 2016;939:139-166. doi: 10.1007/978-981-10-1503-8_7.
The key question of precision medicine is whether it is possible to find clinically actionable granularity in diagnosing disease and classifying patient risk. The advent of next-generation sequencing and the widespread adoption of electronic health records (EHRs) have provided clinicians and researchers with a wealth of data and made possible the precise characterization of individual patient genotypes and phenotypes. Unstructured text, found in biomedical publications and clinical notes, is an important component of genotype and phenotype knowledge. Publications in the biomedical literature provide essential information for interpreting genetic data. Likewise, clinical notes contain the richest source of phenotype information in EHRs. Text mining can render these texts computationally accessible and support information extraction and hypothesis generation. This chapter reviews the mechanics of text mining in precision medicine and discusses several specific use cases, including database curation for personalized cancer medicine, patient outcome prediction from EHR-derived cohorts, and pharmacogenomic research. Taken as a whole, these use cases demonstrate how text mining enables effective utilization of existing knowledge sources and thus promotes increased value for patients and healthcare systems. Text mining is an indispensable tool for translating genotype-phenotype data into effective clinical care, and it will undoubtedly play an important role in the eventual realization of precision medicine.