Silver: Forging almost Gold Standard Datasets.

Affiliations

Augmented Intelligence & Precision Health Laboratory, Institute of the McGill University Health Centre, McGill University, Montreal, QC H4A 3S5, Canada.

Department of Computer Science, University of Saskatchewan, Saskatoon, SK S7N 5C9, Canada.

Publication information

Genes (Basel). 2021 Sep 28;12(10):1523. doi: 10.3390/genes12101523.

DOI: 10.3390/genes12101523
PMID: 34680918
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8535810/
Abstract

Gene set analysis has been widely used to gain insight from high-throughput expression studies. Although various tools and methods have been developed for gene set analysis, there is no consensus among researchers regarding best practice(s). Most often, evaluation studies have reported contradictory recommendations of which methods are superior. Therefore, an unbiased quantitative framework for evaluations of gene set analysis methods will be valuable. Such a framework requires gene expression datasets where enrichment status of gene sets is known. In the absence of such gold standard datasets, artificial datasets are commonly used for evaluations of gene set analysis methods; however, they often rely on oversimplifying assumptions that make them biased in favor of or against a given method. In this paper, we propose a quantitative framework for evaluation of gene set analysis methods by synthesizing expression datasets using real data, without relying on oversimplifying or unrealistic assumptions, while preserving complex gene-gene correlations and retaining the distribution of expression values. The utility of the quantitative approach is shown by evaluating ten widely used gene set analysis methods. An implementation of the proposed method is publicly available. We suggest using Silver to evaluate existing and new gene set analysis methods. Evaluation using Silver provides a better understanding of current methods and can aid in the development of gene set analysis methods to achieve higher specificity without sacrificing sensitivity.
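
The abstract frames evaluation in terms of sensitivity and specificity against datasets whose enrichment status is known. As a rough illustration only, the sketch below shows how those two quantities could be computed once such ground truth is available. It is not the Silver implementation; the function name, its arguments, and the toy gene set labels are all hypothetical.

```python
import math


def sensitivity_specificity(truly_enriched, reported, all_gene_sets):
    """Score one method's calls against a known enrichment status.

    truly_enriched: gene sets known to be enriched (the ground truth)
    reported:       gene sets the method under evaluation called enriched
    all_gene_sets:  every gene set that was tested
    All three arguments are hypothetical inputs chosen for illustration.
    """
    truly_enriched = set(truly_enriched)
    reported = set(reported)
    negatives = set(all_gene_sets) - truly_enriched

    tp = len(truly_enriched & reported)   # enriched and reported
    fn = len(truly_enriched - reported)   # enriched but missed
    fp = len(negatives & reported)        # not enriched but reported
    tn = len(negatives - reported)        # not enriched and not reported

    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return sensitivity, specificity


# Toy example: five tested gene sets, two of them truly enriched.
sens, spec = sensitivity_specificity(
    truly_enriched={"A", "B"},
    reported={"A", "C"},
    all_gene_sets={"A", "B", "C", "D", "E"},
)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In the toy example the method recovers one of the two enriched sets and wrongly reports one of the three non-enriched sets, so sensitivity is 1/2 and specificity is 2/3.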

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e62f/8535810/664b62ea5938/genes-12-01523-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e62f/8535810/a87cbea556b7/genes-12-01523-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e62f/8535810/b4781f2d609e/genes-12-01523-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e62f/8535810/ac5fc9ea2b7f/genes-12-01523-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e62f/8535810/025157064e94/genes-12-01523-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e62f/8535810/24e7242b0e03/genes-12-01523-g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e62f/8535810/6e8c9c160908/genes-12-01523-g007.jpg

Similar articles

1
Silver: Forging almost Gold Standard Datasets.
Genes (Basel). 2021 Sep 28;12(10):1523. doi: 10.3390/genes12101523.
2
Toward a gold standard for benchmarking gene set enrichment analysis.
Brief Bioinform. 2021 Jan 18;22(1):545-556. doi: 10.1093/bib/bbz158.
3
Finding function: evaluation methods for functional genomic data.
BMC Genomics. 2006 Jul 25;7:187. doi: 10.1186/1471-2164-7-187.
4
Dintor: functional annotation of genomic and proteomic data.
BMC Genomics. 2015 Dec 21;16:1081. doi: 10.1186/s12864-015-2279-5.
5
CGPS: A machine learning-based approach integrating multiple gene set analysis tools for better prioritization of biologically relevant pathways.
J Genet Genomics. 2018 Sep 20;45(9):489-504. doi: 10.1016/j.jgg.2018.08.002. Epub 2018 Sep 13.
6
Translational Metabolomics of Head Injury: Exploring Dysfunctional Cerebral Metabolism with Ex Vivo NMR Spectroscopy-Based Metabolite Quantification
7
Forming Big Datasets through Latent Class Concatenation of Imperfectly Matched Databases Features.
Genes (Basel). 2019 Sep 19;10(9):727. doi: 10.3390/genes10090727.
8
Size matters: how sample size affects the reproducibility and specificity of gene set analysis.
Hum Genomics. 2019 Oct 22;13(Suppl 1):42. doi: 10.1186/s40246-019-0226-2.
9
A strategy for evaluating pathway analysis methods.
BMC Bioinformatics. 2017 Oct 13;18(1):453. doi: 10.1186/s12859-017-1866-7.
10
GOModeler--a tool for hypothesis-testing of functional genomics datasets.
BMC Bioinformatics. 2010 Oct 7;11 Suppl 6(Suppl 6):S29. doi: 10.1186/1471-2105-11-S6-S29.

Cited by

1
Editorial: Advancement in Gene Set Analysis: Gaining Insight From High-Throughput Data.
Front Genet. 2022 May 26;13:928724. doi: 10.3389/fgene.2022.928724. eCollection 2022.
