Predicting gene ontology from a global meta-analysis of 1-color microarray experiments.

Affiliation

Arthritis and Clinical Immunology Research Program, Oklahoma Medical Research Foundation 825 NE 13th Street, Oklahoma City, Oklahoma 73104-5005, USA.

Publication information

BMC Bioinformatics. 2011 Oct 18;12 Suppl 10(Suppl 10):S14. doi: 10.1186/1471-2105-12-S10-S14.

DOI:10.1186/1471-2105-12-S10-S14
PMID:22166114
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3236836/
Abstract

BACKGROUND

Global meta-analysis (GMA) of microarray data to identify genes with highly similar co-expression profiles is emerging as an accurate method to predict gene function and phenotype, even in the absence of published data on the gene(s) being analyzed. With a third of human genes still uncharacterized, this approach is a promising way to direct experiments and rapidly understand the biological roles of genes. To predict the function of a gene of interest, GMA relies on a guilt-by-association approach: it identifies sets of genes with known functions that are consistently co-expressed with that gene across different experimental conditions, which suggests coordinated regulation for a specific biological purpose. Our goal here is to define how sample and dataset size and ranking parameters affect prediction performance.
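
The guilt-by-association idea described above can be illustrated with a minimal sketch. It assumes Pearson correlation as the co-expression measure and a simple vote count in place of a statistical overrepresentation test; the abstract specifies neither, and the gene names, term labels, and function names below are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

def top_coexpressed(target, profiles, k):
    """Return the k genes whose profiles correlate most strongly with the target."""
    scores = [(pearson(profiles[target], p), g)
              for g, p in profiles.items() if g != target]
    scores.sort(reverse=True)
    return [g for _, g in scores[:k]]

def predict_terms(target, profiles, annotations, k, min_votes):
    """Transfer terms shared by at least min_votes of the top-k co-expressed genes."""
    votes = {}
    for g in top_coexpressed(target, profiles, k):
        for term in annotations.get(g, ()):
            votes[term] = votes.get(term, 0) + 1
    return sorted(t for t, v in votes.items() if v >= min_votes)

# Toy data (hypothetical): expression of each gene across five experiments.
profiles = {
    "GENE_X": [1, 2, 3, 4, 5],   # uncharacterized gene of interest
    "A": [2, 4, 6, 8, 10],
    "B": [1, 2, 3, 5, 4],
    "C": [5, 4, 3, 2, 1],
    "D": [3, 3, 3, 3, 3],
}
annotations = {
    "A": {"term:cell_cycle"},
    "B": {"term:cell_cycle", "term:dna_repair"},
    "C": {"term:lipid_metabolism"},
    "D": {"term:lipid_metabolism"},
}

print(predict_terms("GENE_X", profiles, annotations, k=2, min_votes=2))
```

In a real analysis the vote threshold would be replaced by a significance test for overrepresentation, and k corresponds to the size of the co-expressed gene set whose effect the study evaluates.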

RESULTS

In total, 13,000 human 1-color microarrays were downloaded from GEO for GMA. Prediction performance was benchmarked by calculating the distance within the Gene Ontology (GO) tree between predicted and annotated functions for sets of 100 randomly selected genes. The number of newly predicted functions rises as more datasets are added but begins to saturate at a sample size of approximately 2,000 experiments. For the gene set used to predict function, precision is higher with smaller set sizes but recall is correspondingly poor; as set size increases, recall and F-measure tend to increase, at the cost of precision.
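
The precision/recall trade-off mentioned above can be made concrete with a small sketch. It assumes exact matching between predicted and annotated terms, which is a simplification: the study scores predictions by distance within the GO tree. The term labels and the function name are hypothetical.

```python
def precision_recall_f(predicted, annotated):
    """Precision, recall, and F-measure for two sets of terms (exact-match scoring)."""
    hits = len(predicted & annotated)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(annotated) if annotated else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Hypothetical annotated terms for one benchmark gene.
annotated = {"t1", "t2", "t3", "t4"}

# A small co-expressed gene set yields few, mostly correct predictions.
small_set_predictions = {"t1"}
# A larger set recovers more true terms but also adds false ones.
large_set_predictions = {"t1", "t2", "t3", "x1", "x2", "x3"}

for label, pred in [("small", small_set_predictions), ("large", large_set_predictions)]:
    p, r, f = precision_recall_f(pred, annotated)
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

The toy numbers are chosen only to mirror the reported pattern: the smaller set has higher precision and lower recall, and the larger set has higher recall and F-measure with lower precision.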

CONCLUSIONS

Of the 20,813 genes expressed in 50 or more experiments, at least one predicted GO category was found for 72.5% of them. Of the 5,720 genes without GO annotation, 4,189 had at least one predicted ontology using top 40 co-expressed genes for prediction analysis. For the remaining 1,531 genes without GO predictions or annotations, ~17% (257 genes) had sufficient co-expression data yet no statistically significantly overrepresented ontologies, suggesting their regulation may be more complex.
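
The counts in the conclusions can be cross-checked with simple arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative, and the absolute count derived from the 72.5% figure is an approximation rather than a number reported in the abstract.

```python
expressed_genes = 20813        # genes expressed in 50 or more experiments
fraction_with_prediction = 0.725
unannotated = 5720             # genes without GO annotation
unannotated_predicted = 4189   # of those, genes with at least one predicted ontology
no_enrichment = 257            # sufficient co-expression data, no significant ontology

# Approximate number of expressed genes with at least one predicted GO category.
approx_predicted = round(expressed_genes * fraction_with_prediction)

# Genes left with neither annotation nor prediction.
remaining = unannotated - unannotated_predicted

# Share of unannotated genes that received a prediction, and share of the
# remaining genes that had enough data but no overrepresented ontology.
pct_unannotated_predicted = 100 * unannotated_predicted / unannotated
pct_no_enrichment = 100 * no_enrichment / remaining

print(approx_predicted, remaining)
print(f"{pct_unannotated_predicted:.1f}% of unannotated genes gained a prediction")
print(f"{pct_no_enrichment:.1f}% of the remaining genes had data but no enrichment")
```

The difference 5,720 − 4,189 reproduces the 1,531 remaining genes, and 257 of 1,531 is consistent with the "~17%" stated in the abstract.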

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b453/3236836/9829159e52e1/1471-2105-12-S10-S14-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b453/3236836/d0163cb01bef/1471-2105-12-S10-S14-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b453/3236836/d992b8730db8/1471-2105-12-S10-S14-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b453/3236836/4855edbba5e5/1471-2105-12-S10-S14-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b453/3236836/4b6a0ab076f8/1471-2105-12-S10-S14-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b453/3236836/f07944fd3a0a/1471-2105-12-S10-S14-6.jpg
