Huang Ting-Hua, Fan Bin, Rothschild Max F, Hu Zhi-Liang, Li Kui, Zhao Shu-Hong
Key Lab of Agricultural Animal Genetics, Breeding, and Reproduction of Ministry of Education & Key Laboratory of Swine Genetics and Breeding of Ministry of Agriculture, Huazhong Agricultural University, Wuhan, P R China.
BMC Bioinformatics. 2007 Sep 17;8:341. doi: 10.1186/1471-2105-8-341.
MicroRNAs (miRNAs) are one of the most important families of non-coding RNAs and act as sequence-specific post-transcriptional regulators of gene expression. Identifying miRNAs is a prerequisite for understanding the mechanisms of post-transcriptional regulation. Hundreds of miRNAs have been identified in several species by direct cloning and computational approaches. However, many miRNAs remain to be identified, owing to a lack of either informative sequence features or robust algorithms to identify them efficiently.
We evaluated features that are valuable for pre-miRNA prediction, such as differences in the local secondary structure of the stem region between miRNA and non-miRNA hairpins. We also established correlations between different types of mutations and the secondary structures of pre-miRNAs. Using these features, together with some improvements to current pre-miRNA prediction methods, we applied a support vector machine (SVM), a machine learning method, to build a high-throughput, well-performing computational pre-miRNA prediction tool called MiRFinder. The tool was designed for genome-wide, pair-wise sequences from two related species. Its method consists of two major steps: 1) a genome-wide search for hairpin candidates and 2) exclusion of non-robust structures based on SVM analysis of 18 parameters. Results from applying the tool to chicken/human and D. melanogaster/D. pseudoobscura pair-wise genome alignments showed that it can be used for genome-wide pre-miRNA prediction.
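The two-step idea described above (scan for hairpin candidates, then describe each stem by local structural features that a classifier could use) can be illustrated with a minimal Python sketch. This is not MiRFinder's implementation: the function names, the ungapped mirror-pairing model, the window and loop sizes, and the 0.8 threshold are all assumptions made for illustration, and the 18 SVM parameters used by the tool are not reproduced here.

```python
# Toy illustration only; all names and thresholds below are hypothetical.

# Watson-Crick pairs plus the G-U wobble pair, as allowed in RNA stems.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def stem_pairing(window, loop_len=4):
    """Return one boolean per stem position: does base i pair with its mirror?

    Ungapped model (an assumption): base i of the 5' arm is compared with
    base len-1-i of the 3' arm; the central loop_len bases stay unpaired.
    """
    arm = (len(window) - loop_len) // 2
    return [(window[i], window[-1 - i]) in PAIRS for i in range(arm)]


def find_hairpin_candidates(seq, window=30, loop_len=4, min_paired=0.8):
    """Step 1 (toy): slide a window and keep positions with a well-paired stem."""
    seq = seq.upper().replace("T", "U")  # accept DNA or RNA input
    hits = []
    for start in range(len(seq) - window + 1):
        pairing = stem_pairing(seq[start:start + window], loop_len)
        frac = sum(pairing) / len(pairing)
        if frac >= min_paired:
            hits.append((start, frac))
    return hits


def local_stem_features(pairing, k=4):
    """Step 2 input (toy): paired fraction in consecutive k-pair stem blocks.

    Vectors like this could be passed to a classifier such as an SVM; the
    classifier itself is omitted from this sketch.
    """
    return [sum(pairing[i:i + k]) / k for i in range(0, len(pairing) - k + 1, k)]
```

In a real pipeline the pairing pattern would come from a secondary-structure prediction rather than from mirror comparison, and the feature vectors would be used to train and apply the SVM that filters the candidates.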
MiRFinder can be a good alternative to current miRNA discovery software. The tool is available at http://www.bioinformatics.org/mirfinder/.