Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), 75005 Paris, France.
Bioinformatics Unit (MF1), Department for Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany.
Genome Res. 2021 Aug;31(8):1462-1473. doi: 10.1101/gr.274696.120. Epub 2021 Jun 15.
Understanding how protein function has evolved and diversified is of great importance for human genetics and medicine. Here, we tackle the problem of describing the whole transcript variability observed in several species by generalizing the definition of splicing graph. We provide a practical solution to construct parsimonious evolutionary splicing graphs where each node is a minimal transcript building block defined across species. We show a clear link between the functional relevance, tissue regulation, and conservation of alternative transcripts on a set of 50 genes. By scaling up to the whole human protein-coding genome, we identify a few thousand genes where alternative splicing modulates the number and composition of pseudorepeats. We have implemented our approach in ThorAxe, an efficient, versatile, robust, and freely available computational tool.