Turku Centre for Computer Science (TUCS), Turku, Finland.
BioData Min. 2013 Mar 1;6(1):5. doi: 10.1186/1756-0381-6-5.
A central challenge in systems biology and medical genetics is to understand how interactions among genetic loci contribute to complex phenotypic traits and human diseases. While most studies have so far relied on statistical modeling and association testing procedures, machine learning and predictive modeling approaches are increasingly being applied to mining genotype-phenotype relationships, including associations that do not necessarily reach statistical significance at the level of individual variants yet still contribute to the combined predictive power at the level of variant panels. Network-based analysis of genetic variants and their interaction partners is another emerging trend, used to explore how sub-network level features contribute to complex disease processes and related phenotypes. In this review, we describe the basic concepts and algorithms behind machine learning-based genetic feature selection approaches, their potential benefits and limitations in genome-wide settings, and how physical or genetic interaction networks could be used as a priori information to improve predictive power and provide mechanistic insights into disease networks. These developments are geared toward explaining a part of the missing heritability, and when combined with individual genomic profiling, such systems medicine approaches may also provide a principled means for tailoring personalized treatment strategies in the future.