A comparative study of rank aggregation methods for partial and top ranked lists in genomic applications.

Author information

Department of Statistical Science at Southern Methodist University, Dallas, TX.

Department of Clinical Sciences, University of Texas Southwestern Medical Center, Dallas, TX.

Publication information

Brief Bioinform. 2019 Jan 18;20(1):178-189. doi: 10.1093/bib/bbx101.

Abstract

Rank aggregation (RA), the process of combining multiple ranked lists into a single ranking, has played an important role in integrating information from individual genomic studies that address the same biological question. In previous research, attention has been focused on aggregating full lists. However, partial and/or top ranked lists are prevalent because of the great heterogeneity of genomic studies and limited resources for follow-up investigation. To be able to handle such lists, some ad hoc adjustments have been suggested in the past, but how RA methods perform on them (after the adjustments) has never been fully evaluated. In this article, a systematic framework is proposed to define different situations that may occur based on the nature of individually ranked lists. A comprehensive simulation study is conducted to examine the performance characteristics of a collection of existing RA methods that are suitable for genomic applications under various settings simulated to mimic practical situations. A non-small cell lung cancer data example is provided for further comparison. Based on our numerical results, general guidelines about which methods perform the best/worst, and under what conditions, are provided. Also, we discuss key factors that substantially affect the performance of the different methods.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/afe4/6357556/604c8967a5eb/bbx101f1.jpg
