Max Delbrück Center for Molecular Medicine (MDC), Helmholtz Association of German Research Centers, Berlin Institute for Medical Systems Biology (BIMSB), Berlin, Germany.
Digital Health-Machine Learning, Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany.
Annu Rev Biomed Data Sci. 2023 Aug 10;6:191-210. doi: 10.1146/annurev-biodatasci-122120-110102. Epub 2023 Jun 1.
Understanding the noncoding part of the genome, which encodes gene regulation, is necessary to identify genetic mechanisms of disease and translate findings from genome-wide association studies into actionable results for treatments and personalized care. Here we provide an overview of the computational analysis of noncoding regions, starting from gene-regulatory mechanisms and their representation in data. Deep learning methods, when applied to these data, highlight important regulatory sequence elements and predict the functional effects of genetic variants. These and other algorithms are used to predict damaging sequence variants. Finally, we introduce rare-variant association tests that incorporate functional annotations and predictions in order to increase interpretability and statistical power.