Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, Rhode Island.
Center for Computational Molecular Biology, Brown University, Providence, Rhode Island.
Hum Mutat. 2019 Sep;40(9):1225-1234. doi: 10.1002/humu.23866. Epub 2019 Aug 17.
Classification of variants of unknown significance is a challenging technical problem in clinical genetics. As up to one-third of disease-causing mutations are thought to affect pre-mRNA splicing, it is important to accurately classify splicing mutations in patient sequencing data. Several consortia and healthcare systems have conducted large-scale patient sequencing studies, which discover novel variants faster than they can be classified. Here, we compare the advantages and limitations of several high-throughput splicing assays aimed at mitigating this bottleneck, and describe a data set of ~5,000 variants that we analyzed using our Massively Parallel Splicing Assay (MaPSy). The Critical Assessment of Genome Interpretation (CAGI) group organized a challenge in which participants submitted machine learning models to predict the splicing effects of variants in this data set. We discuss the winning submission of the challenge (MMSplice), which outperformed existing software. Finally, we highlight methods to overcome the limitations of MaPSy and similar assays, which include capturing tissue-specific splicing, accounting for the effect of surrounding sequence context, classifying intronic variants, synthesizing large exons, and amplifying complex libraries of minigene species. Further development of these assays will greatly benefit the field of clinical genetics, which lacks high-throughput methods for variant interpretation.