The Donnelly Centre, University of Toronto, Toronto, Canada.
The Donnelly Centre, University of Toronto, Toronto, Canada; Department of Molecular Genetics, University of Toronto, Toronto, Canada; Department of Computer Science, University of Toronto, Toronto, Canada; The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada.
J Mol Biol. 2018 Sep 14;430(18 Pt A):2924-2938. doi: 10.1016/j.jmb.2018.05.037. Epub 2018 Jun 1.
Clinical research and practice in the 21st century are poised to be transformed by analysis of computable electronic medical records and population-level genome-scale patient profiles. Genomic data capture genetic and environmental state, providing information on heterogeneity in disease and treatment outcome, but genomics-based clinical risk scores remain limited. Achieving the goal of routine precision medicine that takes advantage of these rich genomic data will require computational methods that support heterogeneous data, have excellent predictive performance and, ideally, provide biologically interpretable results. Traditional machine-learning approaches achieve strong predictive performance but often have limited interpretability. Patient similarity networks are an emerging paradigm for precision medicine, in which patients are clustered or classified based on their similarities in various features, including genomic profiles. This strategy is analogous to standard medical diagnosis, has excellent performance, is interpretable, and can preserve patient privacy. We review new methods based on patient similarity networks, including Similarity Network Fusion for patient clustering and netDx for patient classification. While these methods are already useful, much work is required to improve their scalability for contemporary genetic cohorts, optimize parameters, and incorporate a wide range of genomic and clinical data. The next 5 years will provide an opportunity to assess the utility of network-based algorithms for precision medicine.
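To make the idea of a patient similarity network concrete, the following is a minimal sketch, not an implementation of Similarity Network Fusion or netDx. It assumes a Gaussian kernel as the similarity measure, averages edge weights across data types as a naive form of integration, and classifies a new patient by mean similarity to labelled patients. All patient IDs, feature values, group labels, function names, and the `sigma` parameter are hypothetical and chosen only for illustration.

```python
import math


def gaussian_similarity(a, b, sigma=1.0):
    """Similarity in (0, 1] derived from squared Euclidean distance (assumed kernel)."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2 * sigma ** 2))


def similarity_network(profiles, sigma=1.0):
    """Return {(p, q): weight} for every unordered pair of patients in one data type."""
    ids = sorted(profiles)
    return {
        (p, q): gaussian_similarity(profiles[p], profiles[q], sigma)
        for i, p in enumerate(ids)
        for q in ids[i + 1:]
    }


def combine_networks(networks):
    """Average edge weights across data types (naive integration, NOT SNF)."""
    return {
        edge: sum(net[edge] for net in networks) / len(networks)
        for edge in networks[0]
    }


def classify(patient, network, labels):
    """Assign the label whose known members are most similar to `patient` on average."""
    totals, counts = {}, {}
    for (p, q), weight in network.items():
        if patient not in (p, q):
            continue
        other = q if p == patient else p
        if other in labels:
            label = labels[other]
            totals[label] = totals.get(label, 0.0) + weight
            counts[label] = counts.get(label, 0) + 1
    return max(totals, key=lambda label: totals[label] / counts[label])


# Toy, made-up data: two data types measured on the same five patients.
expression = {"P1": [1.0, 0.9], "P2": [0.9, 1.1], "P3": [-1.0, -0.8],
              "P4": [-0.9, -1.1], "P5": [0.8, 1.0]}
clinical = {"P1": [0.2], "P2": [0.3], "P3": [1.5], "P4": [1.4], "P5": [0.1]}
labels = {"P1": "group_A", "P2": "group_A", "P3": "group_B", "P4": "group_B"}

# One network per data type, then a single combined network.
combined = combine_networks([similarity_network(expression),
                             similarity_network(clinical)])

# P5 is unlabelled; classify it from its edges to labelled patients.
predicted = classify("P5", combined, labels)
print("P5 ->", predicted)
```

Because each data type yields its own network before the networks are combined, the contribution of every data type to a given prediction can be inspected edge by edge, which is one way to read the interpretability claim in the abstract.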