Gene set analysis using sufficient dimension reduction.

Authors

Hsueh Huey-Miin, Tsai Chen-An

Affiliations

Department of Statistics, National Chengchi University, Zhinan Road, Taipei, 116, Taiwan.

Department of Agronomy, National Taiwan University, No. 1, Section 4, Roosevelt Road, Taipei, 106, Taiwan.

Publication information

BMC Bioinformatics. 2016 Feb 6;17:74. doi: 10.1186/s12859-016-0928-6.

DOI:10.1186/s12859-016-0928-6
PMID:26852017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4744442/
Abstract

BACKGROUND

Gene set analysis (GSA) aims to evaluate the association between the expression of biological pathways, or a priori defined gene sets, and a particular phenotype. Numerous GSA methods have been proposed to assess the enrichment of sets of genes. However, most methods are developed with respect to a specific alternative scenario, such as a differential mean pattern or a differential coexpression. Moreover, a very limited number of methods can handle either binary, categorical, or continuous phenotypes. In this paper, we develop two novel GSA tests, called SDRs, based on the sufficient dimension reduction technique, which aims to capture sufficient information about the relationship between genes and the phenotype. The advantages of our proposed methods are that they allow for categorical and continuous phenotypes, and they are also able to identify a variety of enriched gene sets.
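To make the idea of a dimension-reduction-based gene set test concrete, here is a minimal sketch. It is not the authors' method: the abstract does not say which sufficient dimension reduction estimator or which null distribution the two SDR tests use. The sketch assumes sliced inverse regression (SIR), a standard SDR estimator, and assumes a permutation null; the function names `sir_statistic` and `sir_permutation_test` and all parameters are hypothetical. It also assumes the gene set is smaller than the number of samples, so the covariance matrix can be inverted after a small ridge term is added.

```python
import numpy as np

def sir_statistic(X, y, n_slices=5, n_dirs=1, categorical=False):
    """Sum of the leading eigenvalues of the SIR kernel matrix.

    X: (n_samples, n_genes) expression matrix for one gene set.
    y: phenotype vector (continuous, or class labels if categorical=True).
    """
    n, p = X.shape
    # Center and whiten the expression matrix (small ridge for stability).
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (n - 1) + 1e-6 * np.eye(p)
    evals, evecs = np.linalg.eigh(cov)
    Z = Xc @ ((evecs * evals ** -0.5) @ evecs.T)
    # Slice the samples: one slice per class, or equal-sized slices of sorted y.
    if categorical:
        slices = [np.flatnonzero(y == c) for c in np.unique(y)]
    else:
        slices = np.array_split(np.argsort(y, kind="stable"), n_slices)
    # Weighted covariance of the slice means of the whitened data.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    top = np.linalg.eigvalsh(M)[::-1][:n_dirs]  # largest eigenvalues first
    return float(top.sum())

def sir_permutation_test(X, y, n_perm=500, seed=0, **kwargs):
    """Permutation p-value: shuffle the phenotype to break any association."""
    rng = np.random.default_rng(seed)
    observed = sir_statistic(X, y, **kwargs)
    null = np.array([sir_statistic(X, rng.permutation(y), **kwargs)
                     for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return observed, float(p_value)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((100, 5))              # simulated gene set
    y = 2.0 * X[:, 0] + rng.standard_normal(100)   # phenotype driven by gene 0
    print(sir_permutation_test(X, y, n_perm=200))
```

The same statistic covers categorical and continuous phenotypes because only the slicing step changes. One known limitation of SIR specifically is that it cannot see dependence that is symmetric around the mean, which is one reason other SDR estimators exist.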

RESULTS

Through simulation studies, we compared the type I error and power of SDRs with existing GSA methods for binary, triple, and continuous phenotypes. We found that SDR methods adequately control the type I error rate at the pre-specified nominal level, and they have a satisfactory power to detect gene sets with differential coexpression and to test non-linear associations between gene sets and a continuous phenotype. In addition, the SDR methods were compared with seven widely-used GSA methods using two real microarray datasets for illustration.
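The kind of evaluation described here, checking type I error at a nominal level and measuring power, can be sketched as a small Monte Carlo loop. This is an illustration only: the data-generating model, the sample sizes, and the stand-in test (a permutation test on the largest absolute gene-phenotype correlation) are all assumptions chosen for brevity, not the simulation design or the SDR tests from the paper. The names `max_abs_corr`, `perm_p_value`, and `rejection_rate` are hypothetical.

```python
import numpy as np

def max_abs_corr(X, y):
    """Stand-in statistic: largest |Pearson correlation| between a gene and y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    r = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    return np.abs(r).max()

def perm_p_value(X, y, rng, n_perm=99):
    """Permutation p-value for the stand-in statistic."""
    observed = max_abs_corr(X, y)
    null = np.array([max_abs_corr(X, rng.permutation(y))
                     for _ in range(n_perm)])
    return (1 + np.sum(null >= observed)) / (n_perm + 1)

def rejection_rate(effect, n_rep=200, n=40, p=5, alpha=0.05, seed=0):
    """Fraction of simulated datasets rejected at level alpha.

    effect = 0 simulates the null, so the result estimates the type I error;
    effect > 0 makes gene 0 drive the phenotype, so the result estimates power.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(n_rep):
        X = rng.standard_normal((n, p))
        y = effect * X[:, 0] + rng.standard_normal(n)
        rejections += int(perm_p_value(X, y, rng) <= alpha)
    return rejections / n_rep

if __name__ == "__main__":
    print("empirical type I error:", rejection_rate(effect=0.0))
    print("empirical power:", rejection_rate(effect=1.0))
```

A test controls the type I error adequately when the first number stays close to `alpha` across scenarios. Comparing methods then amounts to swapping the test function and the data-generating model while keeping the loop fixed.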

CONCLUSIONS

We concluded that the SDR methods outperform the others because of their flexibility with regard to handling different kinds of phenotypes and their power to detect a wide range of alternative scenarios. Our real data analysis highlights the differences between GSA methods for detecting enriched gene sets.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/5001d1439065/12859_2016_928_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/9dbab0700a0a/12859_2016_928_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/012d41af5e28/12859_2016_928_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/cee0555e2cc1/12859_2016_928_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/983d24fa17bf/12859_2016_928_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/7d3219596a96/12859_2016_928_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/a100cfa4ac9e/12859_2016_928_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d85e/4744442/d2e1ab74f0fb/12859_2016_928_Fig8_HTML.jpg
