Probabilistic prioritization of candidate pathway association with pathway score.

Affiliations

Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, 10055, Taiwan.

Bioinformatics and Biostatistics Core, Center of Genomic Medicine, National Taiwan University, Taipei, 10055, Taiwan.

Publication information

BMC Bioinformatics. 2018 Oct 24;19(1):391. doi: 10.1186/s12859-018-2411-z.

DOI: 10.1186/s12859-018-2411-z
PMID: 30355338
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6201593/

Abstract

BACKGROUND

Current methods for gene-set or pathway analysis are usually designed to test the enrichment of a single gene-set. Once the analysis has been carried out for each set under study, a list of significant sets is obtained. However, no quantitative measure is available for further prioritizing these sets by importance or strength of association. Ranking pathways by the magnitude of the p-value may not be appropriate, because a p-value does not measure strength of association. In addition, the test for each pathway is often implicitly affected by the number of differentially expressed genes in the set, by the dependence among genes, or by both.
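
The point about p-values can be illustrated with a small sketch. It assumes a one-sample z-test with unit variance, which is not the test used in the paper; the function name `two_sided_p` and the numbers are hypothetical. The same standardized effect gives very different p-values at different sample sizes, so the p-value alone does not rank effect strength.

```python
import math


def two_sided_p(effect, n):
    """Two-sided p-value of a one-sample z-test (illustrative assumption).

    effect: standardized mean difference (mean / standard deviation)
    n:      number of samples
    """
    z = effect * math.sqrt(n)
    # For a standard normal, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Identical effect size, different sample sizes.
p_small = two_sided_p(0.2, 25)    # z = 1, p is roughly 0.32
p_large = two_sided_p(0.2, 400)   # z = 4, p is roughly 6e-5
print(p_small, p_large)
```

Both calls use the same effect of 0.2, yet only the larger study would be called significant at conventional thresholds.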

RESULTS

Here we propose a two-stage procedure to prioritize pathways or gene-sets. In the first stage, we develop a pathway-level score with three properties. First, it includes all genes in the set, differentially expressed or not, and summarizes their collective effect per sample. Second, it accounts for the correlation between genes by synchronizing their correlation directions. Third, it includes a rank transformation to enhance the variation among samples and to avoid the influence of extreme heterogeneity among genes. In the second stage, all scores are entered simultaneously into a Bayesian logistic regression model, which evaluates the strength of association for each set and ranks the sets by posterior probability. The method is demonstrated and compared with existing methods using simulations based on Gaussian distributions and on human microarray data, and a breast cancer RNA-Seq study.
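
A minimal sketch of what a first-stage score with these three properties could look like follows. The abstract does not give the formula, so everything here is an assumption for illustration: the function names `rank_transform` and `pathway_score`, the use of the first gene as the reference for direction, and the plain sum of ranks are hypothetical choices and not the authors' definition. The second-stage Bayesian model is not shown.

```python
def rank_transform(values):
    """Ranks of one gene across samples (1 = smallest), averaging ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def pathway_score(expression):
    """Per-sample score for one gene-set (hypothetical formula).

    expression: list of genes, each a list of expression values per sample.
    All genes are used, whether differentially expressed or not.
    """
    n_samples = len(expression[0])
    ranked = [rank_transform(gene) for gene in expression]
    reference = ranked[0]  # assumption: first gene fixes the direction
    mean_rank = (n_samples + 1) / 2
    scores = [0.0] * n_samples
    for gene_ranks in ranked:
        covariance = sum((a - mean_rank) * (b - mean_rank)
                         for a, b in zip(reference, gene_ranks))
        if covariance < 0:
            # Synchronize direction: reverse ranks of genes that move
            # against the reference gene.
            gene_ranks = [n_samples + 1 - r for r in gene_ranks]
        for s in range(n_samples):
            scores[s] += gene_ranks[s]
    return scores


# Three genes, four samples; the second gene runs opposite to the first.
example = [[1, 2, 3, 4], [10, 8, 6, 4], [0.1, 0.5, 0.2, 0.9]]
print(pathway_score(example))
```

Without the direction step, the first two genes in this example would cancel each other and the score would carry little information about the samples, which is the motivation for synchronizing directions before summing.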

CONCLUSIONS

The proposed summary pathway score gives each sample an overall evaluation of gene expression in a gene-set. It demonstrates the advantages of including all genes in the set and of synchronizing correlation directions. Using all pathway-level scores simultaneously in a Bayesian model offers a probabilistic evaluation and ranking of pathway association, and also shows good accuracy in identifying the top-ranking pathways. The resulting list of ranked pathways can serve as a reference for potential targeted therapy or for future allocation of research resources.
