Greehey Children's Cancer Research Institute, The University of Texas Health Science Center at San Antonio, San Antonio, TX, USA.
BMC Genomics. 2012;13 Suppl 6(Suppl 6):S18. doi: 10.1186/1471-2164-13-S6-S18. Epub 2012 Oct 26.
BACKGROUND: One way to understand and evaluate an experiment that produces a large set of genes, such as a gene expression microarray analysis, is to identify overrepresented or enriched biological pathways. Because pathways describe a gene set functionally, much effort has gone into collecting curated biological pathways in publicly accessible databases. Combining disparate databases yields highly related or redundant pathways, so consolidating them into pathway concepts is essential. Consolidation facilitates unbiased, comprehensive, yet streamlined analysis of experiments that result in large gene sets.
METHODS: After gene set enrichment identifies representative pathways for a large gene set, the pathways are consolidated into representative pathway concepts. Three complementary but distinct methods of pathway consolidation are explored. Enrichment Consolidation merges the pathways enriched for the signature gene list by iteratively combining enriched pathways with other pathways that have similar signature gene sets. Weighted Consolidation uses a gene-weighting approach based on a Protein-Protein Interaction network to find clusters of both enriched and non-enriched pathways, limited to the experiment's resulting gene list. Finally, de novo Consolidation uses several measures of pathway similarity to find static pathway clusters that are independent of any given experiment.
RESULTS: We demonstrate that the three consolidation methods provide unified yet different functional insights into a gene set derived from a genome-wide profiling experiment. Results from the methods are presented to demonstrate their application in biological studies and are compared with a web-based pathway framework that also combines several pathway databases. Additionally, a web-based consolidation framework that encompasses all three methods discussed in this paper, Pathway Distiller (http://cbbiweb.uthscsa.edu/PathwayDistiller), gives researchers access to the methods and example microarray data described in this manuscript and lets them analyze their own gene lists with these consolidation methods.
CONCLUSIONS: By combining several pathway systems, implementing different but complementary pathway consolidation methods, and providing a user-friendly, web-accessible tool, we enable users to extract functional explanations of their genome-wide experiments.
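The general idea behind consolidating enriched pathways by their signature gene overlap can be illustrated with a minimal sketch. This is not the Pathway Distiller implementation: the hypergeometric overrepresentation test, the Jaccard similarity measure, the 0.5 merge threshold, the greedy merge order, and all function names and toy pathways below are assumptions chosen only for illustration.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """Upper-tail hypergeometric probability P(X >= k).

    N: genes in the background, K: genes in the pathway,
    n: genes in the signature list, k: signature genes in the pathway.
    """
    upper = min(K, n)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def jaccard(a, b):
    """Jaccard similarity of two gene sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def consolidate(pathways, signature, threshold=0.5):
    """Greedily merge pathways whose signature-gene overlaps are similar.

    Each pathway is first restricted to the signature genes it contains.
    Pairs whose restricted sets reach the (assumed) Jaccard threshold are
    merged, and the scan restarts until no pair qualifies.
    """
    concepts = [
        ({name}, genes & signature)
        for name, genes in pathways.items()
        if genes & signature
    ]
    merged = True
    while merged:
        merged = False
        for i in range(len(concepts)):
            for j in range(i + 1, len(concepts)):
                if jaccard(concepts[i][1], concepts[j][1]) >= threshold:
                    names = concepts[i][0] | concepts[j][0]
                    genes = concepts[i][1] | concepts[j][1]
                    concepts[i] = (names, genes)
                    del concepts[j]
                    merged = True
                    break
            if merged:
                break
    return concepts


# Toy data (hypothetical pathway names and memberships).
pathways = {
    "DB1_APOPTOSIS": {"TP53", "BAX", "CASP3", "CASP9", "BCL2"},
    "DB2_APOPTOSIS": {"TP53", "BAX", "CASP3", "FAS"},
    "DB1_CELL_CYCLE": {"CDK1", "CCNB1", "TP53", "RB1"},
}
signature = {"TP53", "BAX", "CASP3", "CDK1", "CCNB1"}

concepts = consolidate(pathways, signature)
for names, genes in concepts:
    print(sorted(names), sorted(genes))
```

In this toy example the two apoptosis pathways share the same signature genes and collapse into one concept, while the cell cycle pathway stays separate. A single fixed threshold is the simplest possible choice; the abstract describes three methods, including network-based gene weighting and several similarity measures, that this sketch does not attempt to reproduce.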