Microarray Data Processing Techniques for Genome-Scale Network Inference from Large Public Repositories.

Authors

Chockalingam Sriram, Aluru Maneesha, Aluru Srinivas

Affiliations

Department of Computer Science and Engineering, Indian Institute of Technology Bombay, Mumbai 400076, India.

School of Biology, Georgia Institute of Technology, Atlanta, GA 30332, USA.

Publication information

Microarrays (Basel). 2016 Sep 19;5(3):23. doi: 10.3390/microarrays5030023.

DOI: 10.3390/microarrays5030023
PMID: 27657141
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5040970/

Abstract

Pre-processing of microarray data is a well-studied problem. Furthermore, all popular platforms come with their own recommended best practices for differential analysis of genes. However, for genome-scale network inference using microarray data collected from large public repositories, these methods filter out a considerable number of genes. This is primarily due to the effects of aggregating a diverse array of experiments with different technical and biological scenarios. Here we introduce a pre-processing pipeline suitable for inferring genome-scale gene networks from large microarray datasets. We show that partitioning of the available microarray datasets according to biological relevance into tissue- and process-specific categories significantly extends the limits of downstream network construction. We demonstrate the effectiveness of our pre-processing pipeline by inferring genome-scale networks for the model plant Arabidopsis thaliana using two different construction methods and a collection of 11,760 Affymetrix ATH1 microarray chips. Our pre-processing pipeline and the datasets used in this paper are made available at http://alurulab.cc.gatech.edu/microarray-pp.
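
The partitioning idea in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: the variance filter, the threshold, the gene names, and the expression values below are all hypothetical, chosen only to show how a gene whose signal is specific to one tissue can be filtered out when all samples are pooled, yet retained when the same filter is applied within a tissue-specific partition.

```python
from collections import defaultdict
from statistics import pvariance


def partition_samples(samples, labels):
    """Group expression profiles by a category label (e.g. tissue)."""
    groups = defaultdict(list)
    for profile, label in zip(samples, labels):
        groups[label].append(profile)
    return dict(groups)


def retained_genes(profiles, gene_names, min_variance):
    """Keep genes whose population variance across profiles meets the threshold.

    A plain variance filter is an assumption made for illustration;
    the paper's actual filtering criteria may differ.
    """
    kept = []
    for idx, gene in enumerate(gene_names):
        values = [profile[idx] for profile in profiles]
        if pvariance(values) >= min_variance:
            kept.append(gene)
    return kept


# Hypothetical toy data: one row per chip, one column per gene.
genes = ["g1", "g2", "g3"]
samples = [
    [0, 1, 3], [10, 9, 3],                # root chips: g2 varies here only
    [0, 5, 3], [10, 5, 3], [0, 5, 3],     # leaf chips: g2 is flat
    [10, 5, 3], [0, 5, 3], [10, 5, 3],
]
labels = ["root", "root"] + ["leaf"] * 6
threshold = 8  # hypothetical cutoff

pooled = retained_genes(samples, genes, threshold)
per_tissue = {
    tissue: retained_genes(profiles, genes, threshold)
    for tissue, profiles in partition_samples(samples, labels).items()
}

print("pooled:", pooled)
print("per tissue:", per_tissue)
```

In this toy setup, g2 falls below the cutoff in the pooled data because the six flat leaf chips dilute its variance, while the root partition still retains it. A downstream network construction step would then run separately on each partition.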
