Comparison of codon usage measures and their applicability in prediction of microbial gene expressivity.

Authors

Supek Fran, Vlahovicek Kristian

Affiliation

Division of Biology, Department of Molecular Biology, Faculty of Science, Zagreb University, Rooseveltov trg 6, 10000 Zagreb, Croatia.

Publication

BMC Bioinformatics. 2005 Jul 19;6:182. doi: 10.1186/1471-2105-6-182.

Abstract

BACKGROUND

Several methods (also called measures) are currently used to quantify codon usage in genes. These measures are often influenced by other sequence properties, such as length. This can introduce strong methodological bias into measurements; we therefore attempted to develop a method free from such dependencies. One common application of codon usage analysis is the quantitative prediction of gene expressivity.

RESULTS

We compared the performance of several commonly used measures with that of a novel method introduced in this paper, the Measure Independent of Length and Composition (MILC). Large, randomly generated sequence sets were used to test for dependence on (i) sequence length, (ii) overall amount of codon bias and (iii) codon bias discrepancy in the sequences. A derivative of the method, named MELP (MILC-based Expression Level Predictor), can be used to quantitatively predict gene expression levels from genomic data. MELP was compared to other similar predictors by examining their correlation with actual, experimentally obtained mRNA or protein abundances.
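The length-dependence test described above can be illustrated with a toy simulation. The sketch below is not MILC and not the authors' procedure: it assumes a hypothetical set of four synonymous codons with equal expected frequencies and uses a naive, uncorrected log-likelihood-ratio distance chosen purely for illustration. All names (`CODONS`, `naive_distance`, `mean_distance`) are made up for this example.

```python
import math
import random
from collections import Counter

# Hypothetical toy setup (an assumption, not the authors' data):
# four synonymous alanine codons with equal expected frequencies.
CODONS = ["GCU", "GCC", "GCA", "GCG"]
EXPECTED = {c: 0.25 for c in CODONS}


def random_gene(n_codons, rng):
    """Draw n_codons codons at random from the EXPECTED distribution."""
    weights = [EXPECTED[c] for c in CODONS]
    return rng.choices(CODONS, weights=weights, k=n_codons)


def naive_distance(gene):
    """Naive per-codon distance: 2 * sum(O * ln(O / E)) / L.

    O is the observed count of a codon, E its expected count, and L the
    gene length in codons. No length correction is applied on purpose.
    """
    counts = Counter(gene)
    length = len(gene)
    g = 2.0 * sum(
        o * math.log(o / (EXPECTED[c] * length)) for c, o in counts.items()
    )
    return g / length


def mean_distance(n_codons, n_genes=2000, seed=0):
    """Average the naive distance over n_genes random genes of one length."""
    rng = random.Random(seed)
    total = sum(
        naive_distance(random_gene(n_codons, rng)) for _ in range(n_genes)
    )
    return total / n_genes


if __name__ == "__main__":
    # Every gene is drawn from the same codon distribution, so an ideal
    # measure would report about the same value at every length.
    for n in (50, 200, 800):
        print(n, round(mean_distance(n), 4))
```

In this toy setting the mean distance shrinks as genes get longer, even though every gene is sampled from the same distribution. That is the kind of sampling artefact a length-independent measure is meant to remove.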

CONCLUSION

We have established that MILC is a generally applicable measure: it is resistant to changes in gene length and overall nucleotide composition, and it introduces little noise into measurements. Other methods, however, may also be appropriate in certain applications. Our efforts to quantitatively predict gene expression levels in several prokaryotes and unicellular eukaryotes met with varying levels of success, depending on the experimental dataset and predictor used. Of all the methods, MELP and Rainer Merkl's GCB method behaved most consistently. A 'reference set' containing known ribosomal protein genes appears to be a valid starting point for codon usage-based expressivity prediction.
