Lindow Morten, Gorodkin Jan
Bioinformatics Centre, University of Copenhagen, Copenhagen N, Denmark.
DNA Cell Biol. 2007 May;26(5):339-51. doi: 10.1089/dna.2006.0551.
In 2001, there were four PubMed entries matching the word "microRNA" (miRNA). Interestingly, this number has now far exceeded 1300 and is still rapidly increasing. This more than anything demonstrates the extreme attention this field has had within a short period of time. With the large amounts of sequence data being generated, the need for analysis by computational approaches is obvious. Here, we review the general principles used in computational gene and target finding, and discuss the strengths and weaknesses of the methods. Several methods rely on detection of evolutionarily conserved candidates, but recent methods have challenged this paradigm by simultaneously searching for the gene and the corresponding target(s). Whereas the early methods made predictions based on sets of hand-derived rules from precursor-miRNA structure or observed target-miRNA interactions, recent methods apply machine learning techniques. Even though these methods are already powerful, the amount of data they rely on is still limited. Since it is evident that data are continuously being generated, it must be anticipated that these methods will further improve their performance.