Comprior: facilitating the implementation and automated benchmarking of prior knowledge-based feature selection approaches on gene expression data sets.

Affiliation

Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Potsdam, Germany.

Publication information

BMC Bioinformatics. 2021 Aug 12;22(1):401. doi: 10.1186/s12859-021-04308-z.

DOI: 10.1186/s12859-021-04308-z
PMID: 34384353
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8361636/
Abstract

BACKGROUND

Reproducible benchmarking is important for assessing the effectiveness of novel feature selection approaches applied on gene expression data, especially for prior knowledge approaches that incorporate biological information from online knowledge bases. However, no full-fledged benchmarking system exists that is extensible, provides built-in feature selection approaches, and a comprehensive result assessment encompassing classification performance, robustness, and biological relevance. Moreover, the particular needs of prior knowledge feature selection approaches, i.e. uniform access to knowledge bases, are not addressed. As a consequence, prior knowledge approaches are not evaluated amongst each other, leaving open questions regarding their effectiveness.

RESULTS

We present the Comprior benchmark tool, which facilitates the rapid development and effortless benchmarking of feature selection approaches, with a special focus on prior knowledge approaches. Comprior is extensible by custom approaches, offers built-in standard feature selection approaches, enables uniform access to multiple knowledge bases, and provides a customizable evaluation infrastructure to compare multiple feature selection approaches regarding their classification performance, robustness, runtime, and biological relevance.

CONCLUSION

Comprior allows reproducible benchmarking especially of prior knowledge approaches, which facilitates their applicability and for the first time enables a comprehensive assessment of their effectiveness.
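
One of the assessment criteria named in the Results, robustness, is commonly quantified as the overlap between the feature sets selected on different subsamples of the same data. The following is a minimal sketch of that idea and not Comprior's own code or API: the variance filter, the function names, the subsampling parameters, and the synthetic expression matrix are all assumptions made for illustration.

```python
import random
import statistics
from itertools import combinations


def select_top_k_by_variance(samples, k):
    """Hypothetical filter approach: return the indices of the k features
    (columns) with the highest variance across samples (rows)."""
    n_features = len(samples[0])
    variances = [
        statistics.pvariance([row[j] for row in samples])
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: variances[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard index of two feature sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def robustness(samples, selector, k, n_rounds=10, fraction=0.8, seed=0):
    """Mean pairwise Jaccard index of the feature sets that `selector`
    returns on `n_rounds` random subsamples of the data (n_rounds >= 2)."""
    rng = random.Random(seed)
    size = max(2, int(len(samples) * fraction))
    selections = [selector(rng.sample(samples, size), k) for _ in range(n_rounds)]
    pairs = list(combinations(selections, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Synthetic "expression matrix": 40 samples x 20 features, where the first
# three features vary strongly and the rest are nearly constant.
data_rng = random.Random(42)
matrix = [
    [data_rng.gauss(0, 5.0 if j < 3 else 0.1) for j in range(20)]
    for _ in range(40)
]

score = robustness(matrix, select_top_k_by_variance, k=3)
print(f"robustness (mean pairwise Jaccard): {score:.2f}")
```

A score close to 1 means the approach picks nearly the same features regardless of which samples it sees; a benchmark in the spirit of the abstract would report such a score alongside classification performance and runtime for each approach being compared.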

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3643/8361636/b34ba2e90216/12859_2021_4308_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3643/8361636/705547bae1f1/12859_2021_4308_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3643/8361636/2c84caf7301f/12859_2021_4308_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3643/8361636/ce32220524d4/12859_2021_4308_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3643/8361636/18d9d0531e55/12859_2021_4308_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3643/8361636/30b1ccada9a3/12859_2021_4308_Fig5_HTML.jpg
