Sengupta Debarka, Maulik Ujjwal, Bandyopadhyay Sanghamitra
Indian Statistical Institute, Kolkata.
IEEE/ACM Trans Comput Biol Bioinform. 2012 May-Jun;9(3):924-933. doi: 10.1109/TCBB.2012.28. Epub 2012 Jan 31.
The scope and effectiveness of rank aggregation are well established in contemporary bioinformatics research. Rank aggregation supports meta-analysis of putative results collected from different analytic or experimental sources. For example, target prediction algorithms or microarray studies often produce considerably different ranked lists of genes or microRNAs. Combining these lists can, in some sense, yield a more effective ordering of the set of objects. Assigning a level of confidence to each source of ranking is also a natural requirement of aggregation, and the weights for the sources can be assigned by experts. Several rank aggregation approaches exist in the literature, including those based on Markov chains (MC) and evolutionary algorithms. Markov chains are, in general, faster than evolutionary approaches. Unlike evolutionary computing approaches, however, Markov chains have not been used in weighted aggregation scenarios, because no formal framework for weighted Markov chains has been available. In this article, we propose the use of a modified version of MC4 (one of the Markov chains proposed by Dwork et al., 2001), followed by a weighted analog of local Kemenization, to perform rank aggregation in which the sources of rankings can be prioritized by an expert.
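The abstract does not spell out how MC4 is modified, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes that the unweighted majority vote of MC4 is replaced by a weighted majority over the source lists, that a small teleportation term (the hypothetical `damping` parameter) keeps the chain ergodic, and that the weighted analog of local Kemenization bubbles each item upward while the weighted majority prefers it to the item directly above. The function names and the gene labels `g1` to `g4` are illustrative.

```python
def _positions(rankings):
    # One {item: rank index} dictionary per source list (lower index = better).
    return [{x: i for i, x in enumerate(r)} for r in rankings]


def _prefers(pos, weights, q, p):
    # True if the total weight of lists ranking q above p exceeds the
    # total weight of lists ranking p above q (lists missing either item abstain).
    w_q = sum(w for d, w in zip(pos, weights) if q in d and p in d and d[q] < d[p])
    w_p = sum(w for d, w in zip(pos, weights) if q in d and p in d and d[p] < d[q])
    return w_q > w_p


def weighted_mc4(rankings, weights, damping=0.05, iters=1000, tol=1e-12):
    """Assumed weighted MC4: order items by stationary probability."""
    items = sorted({x for r in rankings for x in r})
    n = len(items)
    pos = _positions(rankings)

    # From state p, pick q uniformly; move to q only if the weighted
    # majority prefers q to p, otherwise stay at p.
    P = [[0.0] * n for _ in range(n)]
    for i, p in enumerate(items):
        for j, q in enumerate(items):
            if i != j and _prefers(pos, weights, q, p):
                P[i][j] = 1.0 / n
        P[i][i] = 1.0 - sum(P[i])

    # Teleportation (an assumption of this sketch) makes the chain ergodic.
    P = [[(1 - damping) * P[i][j] + damping / n for j in range(n)]
         for i in range(n)]

    # Power iteration for the stationary distribution.
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        done = max(abs(a - b) for a, b in zip(new, pi)) < tol
        pi = new
        if done:
            break

    order = sorted(range(n), key=lambda i: -pi[i])
    return [items[i] for i in order]


def weighted_local_kemenize(order, rankings, weights):
    """Assumed weighted analog of local Kemenization (insertion with bubbling)."""
    pos = _positions(rankings)
    result = []
    for x in order:
        k = len(result)
        # Bubble x up while the weighted majority prefers it to the item above.
        while k > 0 and _prefers(pos, weights, x, result[k - 1]):
            k -= 1
        result.insert(k, x)
    return result


if __name__ == "__main__":
    rankings = [["g1", "g2", "g3", "g4"],
                ["g2", "g1", "g4", "g3"],
                ["g1", "g3", "g2", "g4"]]
    weights = [0.5, 0.3, 0.2]  # expert-assigned confidence per source
    initial = weighted_mc4(rankings, weights)
    print(weighted_local_kemenize(initial, rankings, weights))
    # → ['g1', 'g2', 'g3', 'g4']
```

Shifting most of the weight to the second source, for instance `weights = [0.1, 0.8, 0.1]`, makes the aggregate follow that list, which illustrates how an expert's prioritization of sources changes the result.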