CGPS: A machine learning-based approach integrating multiple gene set analysis tools for better prioritization of biologically relevant pathways.

Affiliations

Center for Bioinformatics, State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing 100871, China.

Publication information

J Genet Genomics. 2018 Sep 20;45(9):489-504. doi: 10.1016/j.jgg.2018.08.002. Epub 2018 Sep 13.

DOI:10.1016/j.jgg.2018.08.002
PMID:30292791
Abstract

Gene set enrichment (GSE) analyses play an important role in the interpretation of large-scale transcriptome datasets. Multiple GSE tools can be integrated into a single method as obtaining optimal results is challenging due to the plethora of GSE tools and their discrepant performances. Several existing ensemble methods lead to different scores in sorting pathways as integrated results; furthermore, it is difficult for users to choose a single ensemble score to obtain optimal final results. Here, we develop an ensemble method using a machine learning approach called Combined Gene set analysis incorporating Prioritization and Sensitivity (CGPS) that integrates the results provided by nine prominent GSE tools into a single ensemble score (R score) to sort pathways as integrated results. Moreover, to the best of our knowledge, CGPS is the first GSE ensemble method built based on a priori knowledge of pathways and phenotypes. Compared with 10 widely used individual methods and five types of ensemble scores from two ensemble methods, we demonstrate that sorting pathways based on the R score can better prioritize relevant pathways, as established by an evaluation of 120 simulated datasets and 45 real datasets. Additionally, CGPS is applied to expression data involving the drug panobinostat, which is an anticancer treatment against multiple myeloma. The results identify cell processes associated with cancer, such as the p53 signaling pathway (hsa04115); by contrast, according to two ensemble methods (EnrichmentBrowser and EGSEA), this pathway has a rank higher than 20, which may cause users to miss the pathway in their analyses. We show that this method, which is based on a priori knowledge, can capture valuable biological information from numerous types of gene set collections, such as KEGG pathways, GO terms, Reactome, and BioCarta. CGPS is publicly available as a standalone source code at ftp://ftp.cbi.pku.edu.cn/pub/CGPS_download/cgps-1.0.0.tar.gz.
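
To make the idea of an ensemble ranking concrete, here is a minimal sketch that merges per-tool pathway p-values into one ordering by mean rank. It is not the CGPS R score, which the abstract describes as learned with a machine learning approach; mean-rank aggregation is swapped in only as a simple illustration. The function name `rank_pathways`, the tool names, and all p-values are hypothetical.

```python
from statistics import mean


def rank_pathways(pvalues_by_tool):
    """Order pathways by their mean rank across tools (best first).

    pvalues_by_tool maps a tool name to {pathway_id: p_value}.
    Assumption: simple mean-rank aggregation, NOT the CGPS R score.
    """
    ranks = {}
    for pvalues in pvalues_by_tool.values():
        # Rank 1 goes to the smallest p-value reported by this tool.
        ordered = sorted(pvalues, key=pvalues.get)
        for position, pathway in enumerate(ordered, start=1):
            ranks.setdefault(pathway, []).append(position)
    mean_rank = {pathway: mean(r) for pathway, r in ranks.items()}
    # Break ties on the pathway ID so the output is deterministic.
    return sorted(mean_rank, key=lambda p: (mean_rank[p], p))


# Hypothetical p-values from three hypothetical tools.
example = {
    "tool_a": {"hsa04115": 0.001, "hsa04110": 0.020, "hsa00010": 0.500},
    "tool_b": {"hsa04115": 0.030, "hsa04110": 0.010, "hsa00010": 0.700},
    "tool_c": {"hsa04115": 0.002, "hsa04110": 0.040, "hsa00010": 0.300},
}
ranking = rank_pathways(example)
print(ranking)
```

In this toy input, `hsa04115` ranks first for two of the three tools and second for the other, so it ends up at the top of the combined ordering even though the tools disagree.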

Similar articles

1
CGPS: A machine learning-based approach integrating multiple gene set analysis tools for better prioritization of biologically relevant pathways.
J Genet Genomics. 2018 Sep 20;45(9):489-504. doi: 10.1016/j.jgg.2018.08.002. Epub 2018 Sep 13.
2
Combining multiple tools outperforms individual methods in gene set enrichment analyses.
Bioinformatics. 2017 Feb 1;33(3):414-424. doi: 10.1093/bioinformatics/btw623.
3
KOBAS-i: intelligent prioritization and exploratory visualization of biological functions for gene enrichment analysis.
Nucleic Acids Res. 2021 Jul 2;49(W1):W317-W325. doi: 10.1093/nar/gkab447.
4
Testing gene set enrichment for subset of genes: Sub-GSE.
BMC Bioinformatics. 2008 Sep 2;9:362. doi: 10.1186/1471-2105-9-362.
5
Easy and efficient ensemble gene set testing with EGSEA.
F1000Res. 2017 Nov 14;6:2010. doi: 10.12688/f1000research.12544.1. eCollection 2017.
6
Ensemble Prediction of Synergistic Drug Combinations Incorporating Biological, Chemical, Pharmacological, and Network Knowledge.
IEEE J Biomed Health Inform. 2019 May;23(3):1336-1345. doi: 10.1109/JBHI.2018.2852274. Epub 2018 Jul 2.
7
A transfer learning approach via procrustes analysis and mean shift for cancer drug sensitivity prediction.
J Bioinform Comput Biol. 2018 Jun;16(3):1840014. doi: 10.1142/S0219720018400140.
8
PeNGaRoo, a combined gradient boosting and ensemble learning framework for predicting non-classical secreted proteins.
Bioinformatics. 2020 Feb 1;36(3):704-712. doi: 10.1093/bioinformatics/btz629.
9
Pathway analysis using random forests classification and regression.
Bioinformatics. 2006 Aug 15;22(16):2028-36. doi: 10.1093/bioinformatics/btl344. Epub 2006 Jun 29.
10
GOLabeler: improving sequence-based large-scale protein function prediction by learning to rank.
Bioinformatics. 2018 Jul 15;34(14):2465-2473. doi: 10.1093/bioinformatics/bty130.

Cited by

1
PSD3 as a context-dependent modulator of immune landscape and tumor aggressiveness in esophageal squamous cell carcinoma.
Front Immunol. 2025 Aug 15;16:1641254. doi: 10.3389/fimmu.2025.1641254. eCollection 2025.
2
CrWRKY57 and CrABF3 cooperatively activate to modulate drought tolerance and root development.
Hortic Res. 2025 Jun 20;12(9):uhaf158. doi: 10.1093/hr/uhaf158. eCollection 2025 Sep.
3
Treponema pallidum inhibits CD4+ T-cell proliferation through METAP2: insights from Mendelian randomization analysis.
AMB Express. 2025 Aug 25;15(1):126. doi: 10.1186/s13568-025-01940-3.
4
Study on Single Nucleotide Polymorphism of Gene and Its Correlation with Dairy Quality Traits of Gannan Yak.
Foods. 2024 Sep 18;13(18):2953. doi: 10.3390/foods13182953.
5
P.O.L.A.R. Star: A New Framework Developed and Applied by One Mid-Sized Pharmaceutical Company to Drive Digital Transformation in R&D.
Pharmaceut Med. 2024 Sep;38(5):343-353. doi: 10.1007/s40290-024-00533-y. Epub 2024 Aug 9.
6
Identification, validation and candidate gene analysis of major QTL for Supernumerary spikelets in wheat.
BMC Genomics. 2024 Jul 8;25(1):675. doi: 10.1186/s12864-024-10540-7.
7
Preosteoclast plays a pathogenic role in syndesmophyte formation of ankylosing spondylitis through the secreted PDGFB - GRB2/ERK/RUNX2 pathway.
Arthritis Res Ther. 2023 Oct 5;25(1):194. doi: 10.1186/s13075-023-03142-3.
8
Tuberous Sclerosis Complex 1 Deficiency in Macrophages Promotes Unclassical Inflammatory Response to Lipopolysaccharide and Dextran Sodium Sulfate-Induced Colitis in Mice.
Aging Dis. 2022 Dec 1;13(6):1875-1890. doi: 10.14336/AD.2022.0408.
9
Tissue-specific transcriptome responses to Fusarium head blight and Fusarium root rot.
Front Plant Sci. 2022 Oct 24;13:1025161. doi: 10.3389/fpls.2022.1025161. eCollection 2022.
10
Glia Maturation Factor β as a Novel Independent Prognostic Biomarker and Potential Therapeutic Target of Kidney Renal Clear Cell Carcinoma.
Front Oncol. 2022 Jul 4;12:880100. doi: 10.3389/fonc.2022.880100. eCollection 2022.