A hidden Markov support vector machine framework incorporating profile geometry learning for identifying microbial RNA in tiling array data.

Affiliation

Department of Molecular Genetics, The Forsyth Institute, Boston, MA 02115, USA.

Publication information

Bioinformatics. 2010 Jun 1;26(11):1423-30. doi: 10.1093/bioinformatics/btq162. Epub 2010 Apr 15.

Abstract

MOTIVATION

RNA expression signals detected by high-density genomic tiling microarrays contain comprehensive transcriptomic information about the target organism. Current methods for determining RNA transcription units are still computationally intensive and lack discriminative power. This article describes an efficient and accurate methodology for revealing complicated transcriptional architecture, including small regulatory RNAs, in microbial transcriptome profiles.

RESULTS

Normalized microarray data were first subjected to support vector regression to estimate the profile tendency by reducing noise interference. A hybrid supervised machine learning algorithm, hidden Markov support vector machines, was then used to classify the underlying state of each probe as 'expression' or 'silence', under the assumption that the consecutive state sequence was a heterogeneous Markov chain. For model construction, we introduced a profile geometry learning method to construct the feature vectors, which considered both intensity profiles and changes in intensity over the probe spacing. In addition, a robust strategy was used to dynamically evaluate and select the training set based only on prior computational gene annotation. On simulated data, the algorithm was more accurate than other methods, especially for small expressed regions with a low (<1) signal-to-noise ratio (SNR), and is hence more sensitive for detecting small RNAs.
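
The labelling step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that each probe is described by its intensity plus the slopes to its neighbouring probes (one possible reading of "changes in intensity over the probe spacing"), that each state is scored by a linear function of those features, and that the best 'expression'/'silence' path is found by Viterbi decoding. The function names, the weights, and the transition scores are hypothetical and hand-picked for illustration; in a hidden Markov support vector machine they would be learned from the training set. The support vector regression smoothing step is omitted, and a single fixed transition table is used rather than a heterogeneous chain.

```python
from typing import List, Sequence, Tuple

Feature = Tuple[float, float, float]


def geometry_features(positions: Sequence[float],
                      intensities: Sequence[float]) -> List[Feature]:
    """Per-probe features: (intensity, slope from previous probe, slope to next probe).

    Slopes are intensity differences divided by the genomic distance between
    adjacent probes; boundary probes get a slope of 0.0 on the missing side.
    """
    n = len(intensities)
    feats: List[Feature] = []
    for i in range(n):
        left = 0.0
        right = 0.0
        if i > 0:
            left = (intensities[i] - intensities[i - 1]) / (positions[i] - positions[i - 1])
        if i < n - 1:
            right = (intensities[i + 1] - intensities[i]) / (positions[i + 1] - positions[i])
        feats.append((intensities[i], left, right))
    return feats


def viterbi(features: Sequence[Feature],
            weights: Sequence[Sequence[float]],
            bias: Sequence[float],
            transition: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring state path (0 = silence, 1 = expression).

    The path score is the sum of linear emission scores (weights[s] . x + bias[s])
    and transition scores transition[prev][s] between consecutive probes.
    """
    if not features:
        return []

    def emit(s: int, x: Feature) -> float:
        return sum(w * v for w, v in zip(weights[s], x)) + bias[s]

    score = [emit(s, features[0]) for s in (0, 1)]
    back: List[List[int]] = []
    for x in features[1:]:
        new_score = []
        pointers = []
        for s in (0, 1):
            # Best previous state for landing in state s at this probe.
            prev = max((0, 1), key=lambda p: score[p] + transition[p][s])
            pointers.append(prev)
            new_score.append(score[prev] + transition[prev][s] + emit(s, x))
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    state = max((0, 1), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical, hand-set parameters (a trained model would supply these).
WEIGHTS = [(-1.0, 0.0, 0.0),   # silence: favoured by low intensity
           (1.0, 2.0, -2.0)]   # expression: high intensity, rising in, falling out
BIAS = [1.0, -1.0]
TRANSITION = [[0.5, -0.5],     # staying in a state is rewarded,
              [-0.5, 0.5]]     # switching is penalised

positions = [0, 25, 50, 75, 100, 125, 150, 175]
intensities = [0.1, 0.2, 0.1, 1.9, 2.1, 2.0, 0.2, 0.1]
labels = viterbi(geometry_features(positions, intensities), WEIGHTS, BIAS, TRANSITION)
```

With these toy values the three high-intensity probes are labelled as one expressed region. The transition penalty is what keeps a single moderately elevated probe inside a silent stretch from being called as expression on its own.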

AVAILABILITY AND IMPLEMENTATION

Detailed implementation steps of the algorithm and the complete results of the transcriptome analysis for the microbial genome Porphyromonas gingivalis W83 can be viewed at http://bioinformatics.forsyth.org/mtd.
