Computational prediction of human deep intronic variation.

Author affiliations

LASIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisboa, Portugal.

Instituto de Medicina Molecular João Lobo Antunes, Faculdade de Medicina, Universidade de Lisboa, 1649-028, Lisboa, Portugal.

Publication information

Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad085. Epub 2023 Oct 25.

Abstract

BACKGROUND

The adoption of whole-genome sequencing in genetic screens has facilitated the detection of genetic variation in the intronic regions of genes, far from annotated splice sites. However, selecting an appropriate computational tool to discriminate functionally relevant genetic variants from those with no effect is challenging, particularly for deep intronic regions where independent benchmarks are scarce.

RESULTS

In this study, we have provided an overview of the computational methods available and the extent to which they can be used to analyze deep intronic variation. We leveraged diverse datasets to extensively evaluate tool performance across different intronic regions, distinguishing between variants that are expected to disrupt splicing through different molecular mechanisms. Notably, we compared the performance of SpliceAI, a widely used sequence-based deep learning model, with that of more recent methods that extend its original implementation. We observed considerable differences in tool performance depending on the region considered, with variants generating cryptic splice sites being better predicted than those that potentially affect splicing regulatory elements. Finally, we devised a novel quantitative assessment of tool interpretability and found that tools providing mechanistic explanations of their predictions are often correct with respect to the ground truth information, but the use of these tools results in decreased predictive power when compared to black-box methods.

CONCLUSIONS

Our findings translate into practical recommendations for tool usage and provide a reference framework for applying prediction tools in deep intronic regions, enabling more informed decision-making by practitioners.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c995/10599398/5cba11e24713/giad085fig1.jpg
