Applying genetic programming to the prediction of alternative mRNA splice variants.

Authors

Vukusic Ivana, Grellscheid Sushma Nagaraja, Wiehe Thomas

Affiliation

Institut für Genetik, Universität zu Köln, Zülpicher Strasse 47, 50674 Köln, Germany.

Publication information

Genomics. 2007 Apr;89(4):471-9. doi: 10.1016/j.ygeno.2007.01.001. Epub 2007 Feb 5.

Abstract

Genetic programming (GP) can be used to classify a given gene sequence as either constitutively or alternatively spliced. We describe the principles of GP and apply it to a well-defined data set of alternatively spliced genes. A feature matrix of sequence properties, such as nucleotide composition or exon length, was passed to the GP system "Discipulus." To test its performance we concentrated on cassette exons (SCE) and retained introns (SIR). We analyzed 27,519 constitutively spliced and 9641 cassette exons including their neighboring introns; in addition we analyzed 33,316 constitutively spliced introns compared to 2712 retained introns. We find that the classifier yields highly accurate predictions on the SIR data with a sensitivity of 92.1% and a specificity of 79.2%. Prediction accuracies on the SCE data are lower, 47.3% (sensitivity) and 70.9% (specificity), indicating that alternative splicing of introns can be better captured by sequence properties than that of exons.
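The sensitivity and specificity figures quoted above can be made concrete with a small sketch. It assumes the standard confusion-matrix definitions, sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP); the abstract does not state which definitions the authors used, and some gene-prediction work defines specificity differently. The function name and the toy labels are hypothetical and are not taken from the paper's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed encoding (not from the paper): 1 = alternatively spliced,
    0 = constitutively spliced.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # alternative, predicted alternative
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # alternative, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # constitutive, predicted constitutive
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # constitutive, flagged as alternative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels for illustration only; these are not results from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Under these definitions, a classifier with high sensitivity but lower specificity, as reported for the SIR data, recovers most alternatively spliced cases while also flagging a larger share of constitutive ones.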
