Lan Chaowang, Chen Qingfeng, Li Jinyan
School of Computer, Electronic and Information, and State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, No.100 Daxue Road, Nanning, 530004, China.
Advanced Analytics Institute, Faculty of Engineering and IT, University of Technology Sydney, PO Box 123, Broadway, Sydney, NSW 2007, Australia.
BMC Bioinformatics. 2016 Dec 22;17(Suppl 19):507. doi: 10.1186/s12859-016-1367-0.
Regulation mechanisms between miRNAs and genes are complicated. To accomplish a biological function, a miRNA may regulate multiple target genes, and similarly a target gene may be regulated by multiple miRNAs. Wet-lab knowledge of co-regulating miRNAs is limited. This work introduces a computational method that groups miRNAs with similar functions, using a miRNA similarity matrix, in order to identify co-regulating miRNAs.
We define a novel information content for Gene Ontology (GO) terms to measure the similarity between two sets of GO graphs, which correspond to the target gene sets of two miRNAs. This between-graph similarity is then taken as the functional similarity between the two miRNAs. Our information content is based on the number of a GO term's descendants, adjusted by a weight derived from the term's depth and from the GO relationships along its path to the root node or to the most informative common ancestor (MICA). Further, a self-tuning technique and the eigenvalues of the normalized Laplacian matrix are applied to determine the optimal parameters for the spectral clustering of the miRNA similarity matrix.
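The parameter-selection step for spectral clustering can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "self-tuning" refers to the common local-scaling affinity (each item gets its own scale from the distance to its k-th nearest neighbour), that a distance is obtained as 1 minus similarity, and that the number of clusters is chosen at the largest gap between consecutive eigenvalues of the normalized Laplacian. The function names, the default `k`, and the toy similarity matrix are all hypothetical.

```python
import numpy as np

def self_tuning_affinity(dist, k=7):
    """Local-scaling affinity: A_ij = exp(-d_ij^2 / (sigma_i * sigma_j)).

    sigma_i is the distance from item i to its k-th nearest neighbour
    (assumed form of the self-tuning step; k is a free choice).
    """
    n = dist.shape[0]
    k = min(k, n - 1)
    # After sorting each row, column 0 is the self-distance (0),
    # so column k is the k-th nearest neighbour.
    sigma = np.sort(dist, axis=1)[:, k]
    sigma = np.maximum(sigma, 1e-12)  # guard against division by zero
    affinity = np.exp(-dist ** 2 / np.outer(sigma, sigma))
    np.fill_diagonal(affinity, 0.0)
    return affinity

def estimate_num_clusters(affinity, max_k=10):
    """Pick the cluster count at the largest eigengap of L = I - D^-1/2 A D^-1/2."""
    n = affinity.shape[0]
    degree = affinity.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    laplacian = np.eye(n) - d_inv_sqrt[:, None] * affinity * d_inv_sqrt[None, :]
    eigvals = np.sort(np.linalg.eigvalsh(laplacian))
    upper = min(max_k, n - 1)
    gaps = np.diff(eigvals[: upper + 1])
    return int(np.argmax(gaps)) + 1, eigvals

# Toy similarity matrix (made-up values): two groups of three miRNAs each.
sim = np.full((6, 6), 0.1)
sim[:3, :3] = 0.9
sim[3:, 3:] = 0.9
np.fill_diagonal(sim, 1.0)

dist = 1.0 - sim  # assumed conversion from similarity to distance
affinity = self_tuning_affinity(dist, k=2)
n_clusters, eigvals = estimate_num_clusters(affinity)
```

With the estimated cluster count in hand, the usual next step would be to cluster the rows of the matrix formed by the leading eigenvectors (for example with k-means). The sketch stops at parameter selection because that is the part the paragraph above describes.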
Experimental results show that our method achieves better clustering performance than existing edge-based, node-based, or hybrid methods. The detailed case studies also show that the method is useful for the functional annotation of new miRNAs.