PALMA: mRNA to genome alignments using large margin algorithms.

Authors

Schulze Uta, Hepp Bettina, Ong Cheng Soon, Rätsch Gunnar

Affiliation

Friedrich Miescher Laboratory, Max Planck Society, Tübingen, Germany.

Publication

Bioinformatics. 2007 Aug 1;23(15):1892-900. doi: 10.1093/bioinformatics/btm275. Epub 2007 May 30.

DOI: 10.1093/bioinformatics/btm275
PMID: 17537755

Abstract

MOTIVATION

Despite many years of research on how to properly align sequences in the presence of sequencing errors, alternative splicing and micro-exons, the correct alignment of mRNA sequences to genomic DNA is still a challenging task.

RESULTS

We present a novel approach based on large margin learning that combines accurate splice site predictions with common sequence alignment techniques. By solving a convex optimization problem, our algorithm, called PALMA, tunes the parameters of the model such that true alignments score higher than other alignments. We study the accuracy of alignments of mRNAs containing artificially generated micro-exons to genomic DNA. In a carefully designed experiment, we show that our algorithm accurately identifies the intron boundaries as well as boundaries of the optimal local alignment. It outperforms all other methods: for 5702 artificially shortened EST sequences from Caenorhabditis elegans and human, it correctly identifies the intron boundaries in all except two cases. The best other method is a recently proposed method called exalin, which misaligns 37 of the sequences. Our method also demonstrates robustness to mutations, insertions and deletions, retaining accuracy even at high noise levels.
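
The large margin idea in the paragraph above can be illustrated with a minimal sketch. It is not PALMA's implementation: the four alignment features (matches, mismatches, gaps, summed splice site score), the toy numbers, and the function name `train_large_margin` are all hypothetical, and plain subgradient descent on a regularized hinge loss stands in for whatever convex solver the authors use. The sketch only shows the training criterion: a weight vector is tuned so that the true alignment outscores competing alignments by a margin.

```python
import numpy as np

def train_large_margin(true_feats, competitor_feats, reg=0.01, lr=0.1, epochs=200):
    """Fit weights w so that each true alignment outscores its competitors.

    true_feats:       (n, d) array, one feature vector per true alignment.
    competitor_feats: list of n arrays of shape (k_i, d), the feature
                      vectors of wrong alignments for each example.
    Minimizes reg/2 * ||w||^2 + mean hinge loss (a convex objective)
    by subgradient descent.
    """
    n, d = true_feats.shape
    w = np.zeros(d)
    for _ in range(epochs):
        grad = reg * w
        for phi_true, phi_wrong in zip(true_feats, competitor_feats):
            # Violation of the margin constraint score_true - score_wrong >= 1
            violations = 1.0 - (phi_true @ w - phi_wrong @ w)
            worst = int(np.argmax(violations))
            if violations[worst] > 0:
                # Subgradient of the hinge term for the most violating competitor
                grad += (phi_wrong[worst] - phi_true) / n
        w -= lr * grad
    return w

# Hypothetical features: [matches, mismatches, gaps, splice site score]
true_feats = np.array([[20.0, 0.0, 0.0, 0.9]])
competitor_feats = [np.array([
    [20.0, 0.0, 0.0, 0.1],   # same matches, but implausible intron boundaries
    [18.0, 2.0, 0.0, 0.9],   # good splice sites, two mismatches
    [19.0, 0.0, 1.0, 0.5],   # one gap, weaker splice sites
])]

w = train_large_margin(true_feats, competitor_feats)
true_score = true_feats[0] @ w
wrong_scores = competitor_feats[0] @ w
print("weights:", w)
print("true alignment wins:", bool(np.all(true_score > wrong_scores)))
```

In a real aligner the competing alignments would be produced by the alignment algorithm itself under the current parameters rather than listed by hand; the fixed list here keeps the example self-contained.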

AVAILABILITY

Datasets for training, evaluation and testing, additional results and a stand-alone alignment tool implemented in C++ and Python are available at http://www.fml.mpg.de/raetsch/projects/palma
