Institute for Cardiovascular Regeneration, Goethe University, 60590 Frankfurt am Main, Germany.
Institute of Computer Sciences and Engineering, University of Jeddah, 21959 Jeddah, Saudi Arabia.
Biol Chem. 2021 Jul 5;402(8):871-885. doi: 10.1515/hsz-2021-0109. Print 2021 Jul 27.
Using results from genome-wide association studies to understand complex traits remains a challenge. Here we review how different machine learning (ML) methods can be applied to genotype data to predict phenotype occurrence and severity. We discuss common feature encoding schemes and how studies handle the typically small number of samples relative to the very large number of variants. We compare the ML methods being applied, including recent results obtained with deep neural networks. Further, we review the application of methods for feature explanation and interpretation.
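The abstract does not say which feature encoding schemes the review covers. As an illustration only, the sketch below shows two encodings that are widely used for biallelic variants: additive (allele-count) encoding and one-hot encoding of the resulting dosage. The function names, the `A/G` genotype string format, and the handling of missing calls are assumptions made for this example, not details taken from the paper.

```python
# Illustrative sketch: helper names and input format are hypothetical,
# not taken from the reviewed studies.

def additive_encode(genotype, alt):
    """Count copies of the alternate allele (0, 1 or 2).

    Returns None for a missing or malformed call such as "./.".
    """
    alleles = genotype.replace("|", "/").split("/")
    if "." in alleles or len(alleles) != 2:
        return None
    return sum(1 for allele in alleles if allele == alt)


def one_hot_encode(dosage):
    """Map a dosage of 0/1/2 to a 3-element indicator vector.

    A missing dosage (None) becomes an all-zero vector.
    """
    return [1 if dosage == k else 0 for k in range(3)]


# One variant with reference allele A and alternate allele G,
# observed in four samples (the last call is missing).
genotypes = ["A/A", "A/G", "G/G", "./."]
dosages = [additive_encode(g, alt="G") for g in genotypes]
one_hot = [one_hot_encode(d) for d in dosages]
```

Additive encoding keeps one feature per variant and assumes a linear effect of allele count, whereas one-hot encoding triples the feature count but lets a model fit non-additive effects. That trade-off matters when samples are few relative to variants.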