Optimizing gene set annotations combining GO structure and gene expression data.

Authors

Wang Dong, Li Jie, Liu Rui, Wang Yadong

Affiliation

School of Computer Science and Technology, Harbin Institute of Technology, West Da-Zhi Street, Harbin, China.

Publication information

BMC Syst Biol. 2018 Dec 31;12(Suppl 9):133. doi: 10.1186/s12918-018-0659-6.

DOI:10.1186/s12918-018-0659-6
PMID:30598093
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6311910/
Abstract

BACKGROUND

With the rapid accumulation of genomic data, annotating and interpreting these data has become a challenging problem. Gene set enrichment analysis, a representative approach, has been widely used to interpret large molecular datasets generated by biological experiments. Its results rely heavily on the quality and completeness of gene set annotations. Although several methods have been developed to annotate gene sets, high-quality annotation methods are still lacking. Here, we propose a novel method that improves annotation accuracy by combining the Gene Ontology (GO) structure with gene expression data.

RESULTS

We propose a novel approach for optimizing gene set annotations to obtain more accurate annotation results. The method filters inconsistent annotations using GO structure information and probabilistic gene set clusters computed over a range of cluster sizes across multiple bootstrap-resampled datasets. We applied it to p53 cell line, colon cancer, and breast cancer gene expression data. The experimental results show that the method filters out a number of annotations unrelated to the experimental data, increases gene set enrichment power, and reduces annotation inconsistency.
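
The abstract does not spell out the algorithm, so the following is only a rough sketch of the general idea of "probabilistic clusters over a range of cluster sizes and bootstrap-resampled datasets", not the authors' implementation. It assumes a plain k-means on gene expression rows, and all function names, the mean-probability filter rule, and the threshold are hypothetical. The GO-structure part of the method is not modeled here.

```python
import random

def kmeans_labels(points, k, rng, iters=20):
    """Tiny k-means (assumed clustering step); returns one label per point."""
    centers = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        # Recompute centers; an empty cluster keeps its previous center.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def coclustering_probability(expr, ks=(2, 3), n_boot=20, seed=0):
    """expr: one row per gene, one column per sample.

    Returns P where P[i][j] is the fraction of (bootstrap, k) runs in which
    genes i and j were placed in the same cluster.
    """
    rng = random.Random(seed)
    n_genes, n_samples = len(expr), len(expr[0])
    counts = [[0] * n_genes for _ in range(n_genes)]
    runs = 0
    for _ in range(n_boot):
        # Bootstrap: resample the samples (columns) with replacement.
        cols = [rng.randrange(n_samples) for _ in range(n_samples)]
        boot = [[row[c] for c in cols] for row in expr]
        for k in ks:
            labels = kmeans_labels(boot, k, rng)
            runs += 1
            for i in range(n_genes):
                for j in range(n_genes):
                    if labels[i] == labels[j]:
                        counts[i][j] += 1
    return [[c / runs for c in row] for row in counts]

def filter_gene_set(gene_set, prob, threshold=0.5):
    """Hypothetical filter: keep genes (given as row indices) whose mean
    co-clustering probability with the other set members is >= threshold."""
    kept = []
    for g in gene_set:
        others = [h for h in gene_set if h != g]
        if not others:
            kept.append(g)
            continue
        mean_p = sum(prob[g][h] for h in others) / len(others)
        if mean_p >= threshold:
            kept.append(g)
    return kept
```

Calling `coclustering_probability(expr)` on an expression matrix and passing the result to `filter_gene_set(gene_indices, prob)` would drop set members that rarely cluster with the rest of the set, which is one plausible reading of "filtering annotations unrelated to the experimental data".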

CONCLUSIONS

We propose a novel optimization approach that improves the quality of gene set annotations. Experimental results indicate that, by combining the GO structure with gene expression data, the method effectively improves annotation quality.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2941/6311910/64ba71fd9ea0/12918_2018_659_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2941/6311910/76cae46c75a2/12918_2018_659_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2941/6311910/a6ff42a21543/12918_2018_659_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2941/6311910/f49ad90b0600/12918_2018_659_Fig4_HTML.jpg
