Department of Informatics, Technical University of Munich, Boltzmannstraße, Garching, 85748, Germany.
Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, München, Germany.
Genome Biol. 2019 Mar 1;20(1):48. doi: 10.1186/s13059-019-1653-z.
Predicting the effects of genetic variants on splicing is highly relevant for human genetics. We describe MMSplice (modular modeling of splicing), the framework with which we built the winning model of the CAGI5 exon skipping prediction challenge. The MMSplice modules are neural networks that score exons, introns, and splice sites, each trained on a distinct large-scale genomics dataset. These modules are combined to predict the effects of variants on exon skipping, splice site choice, splicing efficiency, and pathogenicity, with performance matching or exceeding the state of the art. Our models, available in the Kipoi repository, apply to variants, including indels, directly from VCF files.
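The modular idea can be illustrated with a minimal sketch: each module scores its own region for the reference and the alternative sequence, and the per-module differences are combined into a single variant effect score. Everything below is a hypothetical illustration and not the MMSplice implementation. The toy scorer `gc_fraction` stands in for a trained neural network, and the region names, the weights, and the linear combination are assumptions chosen only to keep the example self-contained.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a trained module: maps a sequence to a score.
Scorer = Callable[[str], float]


def gc_fraction(seq: str) -> float:
    """Toy scorer (NOT a real MMSplice module): fraction of G/C bases."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def module_deltas(
    ref_regions: Dict[str, str],
    alt_regions: Dict[str, str],
    modules: Dict[str, Scorer],
) -> Dict[str, float]:
    """Score each region with its module and return alt minus ref per module."""
    return {
        name: scorer(alt_regions[name]) - scorer(ref_regions[name])
        for name, scorer in modules.items()
    }


def combine(deltas: Dict[str, float], weights: Dict[str, float], bias: float = 0.0) -> float:
    """Assumed linear combination of per-module score differences."""
    return bias + sum(weights[name] * delta for name, delta in deltas.items())


# Illustrative sequences: the variant changes one base in the exon only.
ref = {"intron": "TTTCAG", "exon": "AAAA", "splice_site": "GTAAGT"}
alt = {"intron": "TTTCAG", "exon": "AAGA", "splice_site": "GTAAGT"}

modules = {name: gc_fraction for name in ref}                 # one toy scorer per region
weights = {"intron": 1.0, "exon": 2.0, "splice_site": 1.5}    # made-up weights

deltas = module_deltas(ref, alt, modules)
score = combine(deltas, weights)
print(deltas, score)
```

Because each region is scored independently, a module can be retrained or replaced without touching the others; only the combining step needs to be refit for a new prediction task.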