School of Computational Science and Engineering at Georgia Tech, Atlanta, GA 30332, USA.
Nucleic Acids Res. 2013 Jul;41(13):6514-30. doi: 10.1093/nar/gkt274. Epub 2013 May 6.
Our goal was to identify evolutionarily conserved frame transitions in protein-coding regions and to uncover the functional role underlying these structural aberrations. We used the ab initio frameshift prediction program GeneTack to detect reading frame transitions in 206 991 genes (fs-genes) from 1106 complete prokaryotic genomes. We grouped 102 731 fs-genes into 19 430 clusters based on sequence similarity between their protein products (fs-proteins) and on conservation of the predicted position and direction of the frameshift. We identified 4010 pseudogene clusters and 146 clusters of fs-genes that apparently use recoding (a local deviation from the standard genetic code), owing to specific sequence motifs near the frameshift positions. A particularly interesting finding was a novel type of organization of the dnaX gene, in which recoding is required for synthesis of the longer subunit, τ. We selected 20 clusters of predicted recoding candidates and designed a series of genetic constructs with a reporter gene or affinity tag whose expression would require a frameshift event. Expression of the constructs in Escherichia coli demonstrated that the set of candidates is enriched in sequences that trigger genuine programmed ribosomal frameshifting; we experimentally confirmed four new families of programmed frameshifts.
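As a rough illustration of what a reading frame transition does to a protein product, the sketch below translates a short DNA sequence with and without a +1 or -1 shift at a given position. It is not GeneTack's prediction algorithm, and it is not taken from the paper: the function names, the toy sequence, and the simplification that the shift occurs exactly at a codon boundary are all assumptions made for this example. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 0, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def translate_with_frameshift(dna, position, shift):
    """Translate with a frame transition at `position` (hypothetical helper).

    position: 0-based nucleotide index, assumed to lie on a codon boundary.
    shift:    +1 skips one nucleotide, -1 re-reads one nucleotide.
    """
    if position % 3 != 0:
        raise ValueError("this sketch assumes the shift occurs at a codon boundary")
    if shift not in (1, -1):
        raise ValueError("shift must be +1 or -1")
    upstream = translate(dna[:position])
    downstream = translate(dna[position + shift:])
    return upstream + downstream


# Made-up toy sequence: ATG AAA TTT GGG TAA
toy = "ATGAAATTTGGGTAA"
in_frame = translate(toy)
plus_one = translate_with_frameshift(toy, 6, +1)
minus_one = translate_with_frameshift(toy, 6, -1)
```

In this toy case the protein upstream of the shift is unchanged, while every residue downstream differs from the in-frame product. This is why a reporter gene or affinity tag placed downstream in the shifted frame is expressed only if a frameshift event occurs.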