Department of Pharmacological and Biomolecular Sciences, Università Degli Studi Di Milano, Milan, Italy.
Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, Italy.
Sci Rep. 2021 Dec 6;11(1):23458. doi: 10.1038/s41598-021-02528-0.
Functional enrichment analysis is an analytical method for extracting biological insights from gene expression data, popularized by the ever-growing application of high-throughput techniques. Typically, expression profiles are generated for hundreds to thousands of genes/proteins from samples belonging to two experimental groups, and after ad-hoc statistical tests, researchers are left with lists of statistically significant entities that may lack any unifying biological theme. Functional enrichment places overall gene expression changes in a broader biological context by drawing on pre-existing reference knowledge bases: database collections of known expression regulation, relationships and molecular interactions. STRING is among the most popular tools, providing both protein-protein interaction networks and functional enrichment analysis for any given set of identifiers. For complex experimental designs, retrieving, interpreting, analyzing and condensing functional enrichment results is a daunting task, usually performed by hand by the average wet-biology researcher. We have developed reString, a cross-platform software that seamlessly retrieves functional enrichments from STRING for multiple user-supplied gene sets, with just a few clicks and without any need for specific bioinformatics skills. Further, it aggregates all findings into human-readable table summaries, with built-in features to easily produce user-customizable, publication-grade clustermaps and bubble plots. Herein, we outline a complete reString protocol, showcasing its features on a real use case.
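The retrieve-then-aggregate workflow described above can be sketched roughly as follows. This is a minimal illustration, not reString's actual implementation: the helper names `enrichment_url` and `aggregate` are hypothetical, the row keys (`category`, `term`, `description`, `fdr`) and the `"Process"` category label are assumed to match the columns returned by the public STRING enrichment endpoint, and the 0.05 FDR cutoff is an arbitrary choice.

```python
from urllib.parse import urlencode

# Base address of the public STRING REST API.
STRING_API = "https://string-db.org/api"


def enrichment_url(genes, species, fmt="tsv"):
    """Build an enrichment request URL for one gene set (hypothetical helper).

    STRING expects identifiers separated by a carriage return and the
    species given as an NCBI taxonomy ID (e.g. 9606 for human).
    """
    query = urlencode({"identifiers": "\r".join(genes), "species": species})
    return f"{STRING_API}/{fmt}/enrichment?{query}"


def aggregate(results, category="Process", fdr_cutoff=0.05):
    """Merge per-gene-set enrichment rows into one term-by-set table of FDRs.

    `results` maps a gene-set name to a list of row dicts. The row keys
    used here are an assumption about the STRING output columns.
    """
    table = {}
    for set_name, rows in results.items():
        for row in rows:
            fdr = float(row["fdr"])
            # Keep only rows of the requested category that pass the cutoff.
            if row["category"] != category or fdr > fdr_cutoff:
                continue
            key = (row["term"], row["description"])
            table.setdefault(key, {})[set_name] = fdr
    return table
```

In such a sketch, each user-supplied gene set would yield one request URL, the downloaded rows would be collected in a dictionary keyed by gene-set name, and `aggregate` would return a single table with one entry per enriched term and one FDR per gene set in which that term is significant. A table of this shape is the kind of input a clustermap or bubble plot would typically be drawn from.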