Inferring an organism-specific optimal threshold for predicting protein coding regions in eukaryotes based on a bootstrapping algorithm.

Affiliation

School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 610054, People's Republic of China.

Publication information

Biotechnol Lett. 2011 May;33(5):889-96. doi: 10.1007/s10529-011-0525-8. Epub 2011 Jan 14.

Abstract

The accuracy of prediction methods based on power spectrum analysis depends on the threshold used to discriminate between protein coding and non-coding sequences in eukaryotic genomes. Because gene structure varies among eukaryotes, it is difficult to determine the best prediction threshold for a given eukaryote from prior biological knowledge alone. To improve the accuracy of prediction methods based on power spectrum analysis, we developed a novel method based on a bootstrap algorithm to infer organism-specific optimal thresholds for eukaryotes. As prior information, our method requires only a few annotated protein coding regions from the organism being studied. Our results show that, using the calculated optimal thresholds for our test datasets, the average prediction accuracy of our method is 81%, an increase of 19% over that obtained using the same empirical threshold P=4 for all datasets. The proposed method is simple and convenient, and is easily applied to infer optimal thresholds that can be used to predict coding regions in the genomes of most organisms.
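The abstract does not spell out how P is computed or how the bootstrap selects the threshold, so the following is only a minimal sketch under stated assumptions. It assumes P is the ratio of the spectral power at frequency 1/3 to the mean power of the spectrum, a common formulation of period-3 analysis, and it uses a hypothetical selection rule: the bootstrap mean of a low quantile of P over the annotated coding regions. The function names, the `quantile` parameter, and the selection rule are illustrative and are not taken from the paper.

```python
import cmath
import random
from statistics import mean

BASES = "ACGT"

def period3_snr(seq: str) -> float:
    """Power at frequency 1/3 divided by the mean power (DC term excluded).

    Assumed definition of P; the abstract does not give the formula.
    """
    seq = seq.upper()
    n = len(seq) - len(seq) % 3        # trim so that N/3 is an integer bin
    seq = seq[:n]
    if n < 3:
        raise ValueError("sequence must contain at least 3 bases")
    w = cmath.exp(-2j * cmath.pi / 3)  # DFT kernel at k = N/3
    rot = [1, w, w * w]                # kernel value depends only on i mod 3
    peak = 0.0
    total = 0.0
    for base in BASES:
        count = seq.count(base)
        # DFT of the 0/1 indicator sequence of this base, evaluated at k = N/3
        u = sum(rot[i % 3] for i, c in enumerate(seq) if c == base)
        peak += abs(u) ** 2
        # Parseval: the sum of |U(k)|^2 over all k is N * count;
        # subtract the DC term count^2 to leave k = 1 .. N-1
        total += n * count - count ** 2
    if total == 0:
        return 0.0
    return peak / (total / (n - 1))

def bootstrap_threshold(coding_seqs, n_boot=1000, quantile=0.05, seed=0):
    """Hypothetical rule: bootstrap mean of a low quantile of P."""
    rng = random.Random(seed)
    scores = [period3_snr(s) for s in coding_seqs]
    estimates = []
    for _ in range(n_boot):
        # Resample the annotated regions with replacement
        sample = sorted(rng.choices(scores, k=len(scores)))
        estimates.append(sample[int(quantile * (len(sample) - 1))])
    return mean(estimates)

def is_coding(seq: str, threshold: float) -> bool:
    """Call a window coding when its P reaches the threshold."""
    return period3_snr(seq) >= threshold
```

In use, `bootstrap_threshold` would be given the handful of annotated coding regions for the organism, and the returned value would replace the fixed P=4 when scanning windows with `is_coding`. Whether a low quantile is the right target depends on the trade-off between sensitivity and specificity, which the abstract does not describe.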
