Department of Computational Biology and Bioinformatics, University of Kerala, Trivandrum 695581, India.

Biochem Biophys Res Commun. 2012 May 25;422(1):36-41. doi: 10.1016/j.bbrc.2012.04.090. Epub 2012 Apr 25.

Accurate prediction of short protein-coding DNA from genome sequence information remains an unsolved problem in DNA sequence analysis. Popular gene-finding tools show a drastic reduction in accuracy when attempting to predict genes shorter than 400 nt, the length we define as short. This study quantitatively evaluates a set of selected coding measures in terms of their discriminative power in recognizing short genes in prokaryotic genomes. Using the Fast Correlation Based Feature Selection (FCBF) technique, we identified a subset of coding measures with high discriminative power. Using the measures thus identified, we present a novel approach for short-gene recognition. A short-gene predictor employing AdaBoost.M1 with random forests as the base classifier gives 92.74% accuracy, 94.77% sensitivity and 90.06% specificity on short genes.
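
The FCBF step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the coding measures have already been discretized, ranks each one by its symmetrical uncertainty (SU) with the class label, and drops any measure that is at least as correlated with an already-kept measure as it is with the class. The feature names (`hexamer_score`, `codon_usage`, and so on), the toy data, and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)), a value in [0, 1]."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0
    hxy = entropy(list(zip(x, y)))  # joint entropy H(X, Y)
    info_gain = hx + hy - hxy       # mutual information between X and Y
    return 2.0 * info_gain / (hx + hy)


def fcbf(features, labels, threshold=0.0):
    """Select features by relevance to the class, then prune redundant ones.

    features: dict mapping a feature name to a list of discrete values.
    labels:   list of class labels (e.g. 1 = coding, 0 = non-coding).
    """
    relevance = {
        name: symmetrical_uncertainty(column, labels)
        for name, column in features.items()
    }
    # Keep only features whose SU with the class exceeds the threshold,
    # ordered from most to least relevant.
    ranked = sorted(
        (name for name in relevance if relevance[name] > threshold),
        key=lambda name: relevance[name],
        reverse=True,
    )
    selected = []
    for name in ranked:
        # A feature is redundant if some already-kept feature predicts it
        # at least as well as it predicts the class.
        redundant = any(
            symmetrical_uncertainty(features[name], features[kept])
            >= relevance[name]
            for kept in selected
        )
        if not redundant:
            selected.append(name)
    return selected


# Hypothetical toy data: 8 sequences, 4 coding (1) and 4 non-coding (0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
features = {
    "hexamer_score": [1, 1, 1, 0, 0, 0, 0, 0],
    "hexamer_score_dup": [1, 1, 1, 0, 0, 0, 0, 0],  # exact duplicate
    "codon_usage": [0, 1, 1, 1, 1, 0, 0, 0],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],              # independent of the class
}
selected = fcbf(features, labels)
print(selected)
```

On this toy data the duplicate measure is pruned as redundant and the class-independent measure never passes the relevance threshold. In practice the selected measures would then be passed to the classifier, which the abstract describes as AdaBoost.M1 with random forests as the base learner.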

The elusive short gene--an ensemble method for recognition for prokaryotic genome.
