Department of Plant Biology, Michigan State University, East Lansing, MI, USA; Bioinformatics and Cellular Genomics, St. Vincent's Institute of Medical Research, Fitzroy, Victoria, Australia.
Department of Computer Science and Engineering, Michigan State University, East Lansing, MI, USA.
Trends Genet. 2020 Jun;36(6):442-455. doi: 10.1016/j.tig.2020.03.005. Epub 2020 Apr 17.
Because of its ability to find complex patterns in high-dimensional and heterogeneous data, machine learning (ML) has emerged as a critical tool for making sense of the growing amount of genetic and genomic data available. While the complexity of ML models is what makes them powerful, it also makes them difficult to interpret. Fortunately, efforts to develop approaches that make the inner workings of ML models understandable to humans have improved our ability to gain novel biological insights. Here, we discuss the importance of interpretable ML, different strategies for interpreting ML models, and examples of how these strategies have been applied. Finally, we identify challenges and promising future directions for interpretable ML in genetics and genomics.