Paul Sushmita, Vera Julio
Laboratory of Systems Tumor Immunology, Department of Dermatology, University of Erlangen-Nürnberg, Hartmannstr. 14, 91052 Erlangen, Germany.
Mol Biosyst. 2015 Jul;11(7):2068-81. doi: 10.1039/c5mb00213c.
MicroRNAs (miRNAs) are small, endogenous non-coding RNAs found in plants, animals, and some viruses, which function in RNA silencing and post-transcriptional regulation of gene expression. Various genome-wide studies suggest that a substantial fraction of miRNA genes is likely to form clusters. The coherent expression of such miRNA clusters can then be used to classify samples according to clinical outcome. In this regard, a new clustering algorithm, termed rough hypercuboid based supervised attribute clustering (RH-SAC), is proposed to find such groups of miRNAs. The proposed algorithm is based on rough set theory and directly incorporates the information of sample categories into the miRNA clustering process, yielding a supervised clustering algorithm for miRNAs. The effectiveness of the new approach is demonstrated on several publicly available miRNA expression data sets using a support vector machine. The B.632+ bootstrap error estimate is used to minimize the variability and bias of the derived results. The association of the miRNA clusters with various biological pathways is also shown through pathway enrichment analysis.
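The B.632+ bootstrap error estimate mentioned in the abstract can be illustrated with a minimal sketch. It follows the general .632+ recipe: combine the resubstitution error with the out-of-bag bootstrap error, weighted by a relative overfitting rate. This is not the authors' implementation. The abstract reports using a support vector machine; here a simple nearest-centroid classifier is swapped in so the example needs only the standard library, and all function names (`fit_centroids`, `predict`, `b632_plus`) and parameters (`n_boot`, `seed`) are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y):
    """Stand-in classifier (assumption, not the paper's SVM): one mean vector per class."""
    sums, counts = {}, Counter(y)
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, v in enumerate(xi):
            acc[j] += v
    return {c: [s / counts[c] for s in acc] for c, acc in sums.items()}


def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean distance)."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, centroids[c]))
    return min(sorted(centroids), key=sqdist)


def b632_plus(X, y, n_boot=100, seed=0):
    """Sketch of the .632+ bootstrap estimate of classification error."""
    rng = random.Random(seed)
    n = len(y)

    # Resubstitution error: train and test on the full data set.
    model = fit_centroids(X, y)
    preds = [predict(model, x) for x in X]
    err_resub = sum(p != t for p, t in zip(preds, y)) / n

    # No-information error rate: expected error if labels and predictions were independent.
    p, q = Counter(y), Counter(preds)
    gamma = sum((p[c] / n) * (1 - q[c] / n) for c in p)

    # Leave-one-out bootstrap error: each sample is scored only by
    # models trained on bootstrap samples that do not contain it.
    wrong, seen = [0] * n, [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        m = fit_centroids([X[i] for i in idx], [y[i] for i in idx])
        for i in range(n):
            if i not in in_bag:
                seen[i] += 1
                wrong[i] += predict(m, X[i]) != y[i]
    rates = [w / s for w, s in zip(wrong, seen) if s > 0]
    err_oob = min(sum(rates) / len(rates), gamma)  # clipped at gamma

    # Relative overfitting rate R in [0, 1], then the .632+ weight.
    if err_oob > err_resub and gamma > err_resub:
        R = (err_oob - err_resub) / (gamma - err_resub)
    else:
        R = 0.0
    w = 0.632 / (1 - 0.368 * R)
    return (1 - w) * err_resub + w * err_oob
```

With no overfitting (R = 0) the weight reduces to the plain .632 estimator, and with maximal overfitting (R = 1) the estimate falls back to the out-of-bag error alone, which is what limits the optimistic bias of the resubstitution error.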