GSAn: an alternative to enrichment analysis for annotating gene sets.

Authors

Ayllon-Benitez Aaron, Bourqui Romain, Thébault Patricia, Mougin Fleur

Affiliations

University of Bordeaux, Inserm UMR 1219, Bordeaux Population Health Research Center, team ERIAS, Bordeaux 33000, France.

University of Bordeaux, CNRS UMR 5800, LaBRI, Bordeaux 33400, France.

Publication information

NAR Genom Bioinform. 2020 Mar 14;2(2):lqaa017. doi: 10.1093/nargab/lqaa017. eCollection 2020 Jun.

DOI:10.1093/nargab/lqaa017

PMID:33575577

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7671311/

Abstract

Advances in sequencing technologies are leading to a new understanding of the relationship between genotype and phenotype. To interpret and analyze data that are grouped according to a phenotype of interest, methods based on statistical enrichment have become a standard in biology. However, these methods synthesize the biological information by selecting over-represented terms, and they may suffer from focusing on the most studied genes, which represent limited coverage of the annotated genes within a gene set. Semantic similarity measures have shown strong results in pairwise gene comparison by taking advantage of the underlying structure of the Gene Ontology. We developed GSAn, a novel gene set annotation method that uses semantic similarity measures to synthesize Gene Ontology annotation terms. The originality of our approach is to identify the best compromise between the number of retained annotation terms, which has to be drastically reduced, and the number of related genes, which has to be as large as possible. Moreover, GSAn offers interactive visualization facilities dedicated to the multi-scale analysis of gene set annotations. Compared to enrichment analysis tools, GSAn has shown excellent results in terms of maximizing the gene coverage while minimizing the number of terms.
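The trade-off described in the abstract (few retained terms, many covered genes) resembles a set-cover problem. The sketch below is only an illustration of that trade-off, not GSAn's actual algorithm: it uses a plain greedy set-cover heuristic and ignores semantic similarity entirely. The function names `greedy_term_cover` and `coverage`, and the toy term and gene identifiers, are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set


def greedy_term_cover(term_to_genes: Dict[str, Set[str]], gene_set: Set[str]) -> List[str]:
    """Greedy set-cover heuristic (an assumption, not the GSAn method):
    repeatedly keep the term that annotates the most still-uncovered genes."""
    uncovered = set(gene_set)
    selected: List[str] = []
    while uncovered and term_to_genes:
        # Iterating over sorted term names makes ties resolve alphabetically.
        best = max(sorted(term_to_genes), key=lambda t: len(term_to_genes[t] & uncovered))
        gained = term_to_genes[best] & uncovered
        if not gained:
            break  # the remaining genes are not annotated by any term
        selected.append(best)
        uncovered -= gained
    return selected


def coverage(selected: List[str], term_to_genes: Dict[str, Set[str]], gene_set: Set[str]) -> float:
    """Fraction of the gene set annotated by at least one selected term."""
    covered: Set[str] = set()
    for term in selected:
        covered |= term_to_genes[term] & gene_set
    return len(covered) / len(gene_set) if gene_set else 0.0


# Toy data (hypothetical identifiers, for illustration only).
genes = {"g1", "g2", "g3", "g4", "g5"}
annotations = {
    "term_A": {"g1", "g2", "g3"},
    "term_B": {"g3", "g4"},
    "term_C": {"g4"},
    "term_D": {"g1"},
}

if __name__ == "__main__":
    kept = greedy_term_cover(annotations, genes)
    print(kept, coverage(kept, annotations, genes))
```

In this toy case two of the four terms are enough to reach every gene that has an annotation at all, which is the kind of compromise the abstract describes. A faithful implementation would additionally need the Gene Ontology structure and a semantic similarity measure to group related terms.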

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/650e/7671311/31f36a47fc38/lqaa017fig1.jpg
