Chicco Davide, Faultless Trent
Krembil Research Institute, Toronto, Ontario, Canada.
Toronto General Hospital, Toronto, Ontario, Canada.
Methods Mol Biol. 2021;2212:169-179. doi: 10.1007/978-1-0716-0947-7_11.
In biology, the term "epistasis" denotes the effect of one gene's interaction with another gene. A gene can interact with an independently assorted gene, located far away on the same chromosome or on an entirely different chromosome, and this interaction can strongly affect the function of both genes. These changes can in turn alter the outcome of biological processes, influencing the organism's phenotype. Machine learning is an area of computer science that develops statistical methods able to recognize patterns in data. A typical machine learning algorithm consists of a training phase, where the model learns to recognize specific trends in the data, and a test phase, where the trained model applies what it has learned to recognize trends in external data. Scientists have applied machine learning to epistasis problems multiple times, especially to identify gene-gene interactions from genome-wide association study (GWAS) data. In this brief survey, we report and describe the main scientific articles published on data mining and epistasis. Our article confirms the effectiveness of machine learning in this genetics subfield.
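The training and test phases described above, and the reason interacting loci are hard to detect one at a time, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the surveyed articles: the data are simulated, the two loci are binary, the phenotype follows a purely interactive (XOR-like) rule with 10% label noise, and the "model" is a simple majority-vote lookup table over genotype combinations rather than any specific published method.

```python
import random


def simulate(n, seed):
    """Simulate n individuals as (locus_a, locus_b, phenotype) tuples.

    Hypothetical setup: the phenotype depends only on the combination of
    the two loci (a XOR b), with 10% of labels flipped as noise.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.randint(0, 1)
        b = rng.randint(0, 1)
        y = a ^ b
        if rng.random() < 0.1:
            y = 1 - y
        rows.append((a, b, y))
    return rows


def train(rows, features):
    """Training phase: learn the majority phenotype for each genotype combination."""
    counts = {}
    for row in rows:
        key = tuple(row[i] for i in features)
        tally = counts.setdefault(key, [0, 0])
        tally[row[2]] += 1
    return {key: int(tally[1] > tally[0]) for key, tally in counts.items()}


def accuracy(model, rows, features):
    """Test phase: apply the trained lookup table to held-out individuals."""
    correct = 0
    for row in rows:
        key = tuple(row[i] for i in features)
        correct += int(model.get(key, 0) == row[2])
    return correct / len(rows)


data = simulate(2000, seed=0)
train_rows, test_rows = data[:1500], data[1500:]  # held-out data plays the role of "external data"

results = {}
for name, features in [
    ("locus A alone", (0,)),
    ("locus B alone", (1,)),
    ("A and B jointly", (0, 1)),
]:
    model = train(train_rows, features)
    results[name] = accuracy(model, test_rows, features)
    print(f"{name}: test accuracy = {results[name]:.2f}")
```

Under these simulated conditions, each locus on its own should predict the phenotype at roughly chance level, while the model that sees both loci together should recover the interaction. Real GWAS data involve far more loci, three-valued genotypes, and confounding, so this is only a toy picture of the problem.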