Department of Biology, Boston College, 140 Commonwealth Avenue, Chestnut Hill, MA 02467, USA.
Genome Biol. 2009;10(11):R133. doi: 10.1186/gb-2009-10-11-r133. Epub 2009 Nov 20.
Coding nucleotide sequences contain myriad functions independent of their encoded protein sequences. We present the COMIT algorithm to detect functional noncoding motifs in coding regions using sequence conservation, explicitly separating nucleotide from amino acid effects. COMIT concurs with diverse experimental datasets, including splicing enhancers, silencers, replication motifs, and microRNA targets, and predicts many novel functional motifs. Intriguingly, COMIT scores are well correlated with scores computed without calibrating for amino acid effects, suggesting that nucleotide motifs often override peptide-level constraints.