Filtering genes for cluster and network analysis.

Author information

Tritchler David, Parkhomenko Elena, Beyene Joseph

Affiliation

Department of Biostatistics, University of Toronto, Toronto, Ontario, Canada.

Publication information

BMC Bioinformatics. 2009 Jun 23;10:193. doi: 10.1186/1471-2105-10-193.

DOI:10.1186/1471-2105-10-193

PMID:19549335

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2708160/

Abstract

BACKGROUND

Prior to cluster analysis or genetic network analysis, it is customary to filter the data, that is, to remove genes considered irrelevant from the set of genes to be analyzed. Often, genes whose variation across samples falls below an arbitrary threshold value are deleted. This can improve interpretability and reduce bias.
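The customary variance filter described here can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `variance_filter`, the dictionary layout, and the threshold of 0.5 are all assumptions chosen for the example.

```python
from statistics import variance


def variance_filter(expr, threshold):
    """Keep genes whose sample variance across samples is >= threshold.

    expr maps a gene name to its expression values, one value per sample.
    Both the data layout and the threshold are illustrative assumptions.
    """
    return {gene: values for gene, values in expr.items()
            if variance(values) >= threshold}


# Toy expression data (made up for the example): three genes, four samples.
expr = {
    "geneA": [1.0, 1.1, 0.9, 1.0],   # nearly constant
    "geneB": [0.2, 2.5, 1.1, 3.9],   # varies strongly
    "geneC": [5.0, 5.0, 5.0, 5.0],   # constant
}
kept = variance_filter(expr, threshold=0.5)
print(sorted(kept))
```

The threshold is arbitrary, as the text notes: a different cutoff keeps a different gene set, and nothing in the filter itself looks at relationships between genes.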

RESULTS

This paper introduces modular models for representing network structure in order to study the relative effects of different filtering methods. We show that cluster analysis and principal components are strongly affected by filtering. Filtering methods intended specifically for cluster and network analysis are introduced and compared by simulating modular networks with known statistical properties. To study more realistic situations, we analyze simulated "real" data based on well-characterized E. coli and S. cerevisiae regulatory networks.
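To make the idea of simulating a modular network with known statistical properties concrete, the sketch below generates toy data in which genes within a module share one latent factor and the remaining genes are independent noise. The abstract does not specify the paper's models, so the generator, its name `simulate_modular`, and the parameters `loading` and `n_noise` are hypothetical.

```python
import random


def simulate_modular(n_samples, module_sizes, n_noise, loading=0.8, seed=0):
    """Simulate expression data with a simple modular structure.

    Hypothetical generator: each module has one standard-normal latent
    factor; a module gene is loading * factor + residual noise, with the
    residual scaled so every gene has unit variance by construction.
    Noise genes are independent standard normals.
    """
    rng = random.Random(seed)
    resid_sd = (1.0 - loading ** 2) ** 0.5
    data = {}
    for m, size in enumerate(module_sizes):
        factor = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        for k in range(size):
            data[f"m{m}_g{k}"] = [loading * f + rng.gauss(0.0, resid_sd)
                                  for f in factor]
    for k in range(n_noise):
        data[f"noise_g{k}"] = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    return data


data = simulate_modular(n_samples=50, module_sizes=[3, 2], n_noise=4)
print(len(data), "genes,", len(data["m0_g0"]), "samples each")
```

In this particular sketch every gene has the same population variance, so a variance threshold cannot tell module genes from noise genes. That is a property of the toy generator, chosen to show why a filter aimed at network structure might need to look beyond per-gene variance.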

CONCLUSION

The methods introduced apply very generally, to any similarity matrix describing gene expression. One of the proposed methods, SUMCOV, performed well for all models simulated.
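A filter that works from any similarity matrix can be sketched as follows. The abstract names SUMCOV but does not define it, so the scoring rule used here (the sum of squared off-diagonal similarities in each gene's row) is an assumption made for illustration and should not be read as the authors' definition. The helper names are likewise hypothetical.

```python
def covariance_matrix(rows):
    """Sample covariance between every pair of genes.

    rows[i] holds the expression values of gene i, one per sample.
    """
    n = len(rows[0])
    means = [sum(r) / n for r in rows]
    centered = [[x - m for x in r] for r, m in zip(rows, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1)
             for cj in centered] for ci in centered]


def sum_sq_scores(sim):
    """Hypothetical score: sum of squared off-diagonal entries per row."""
    return [sum(s * s for j, s in enumerate(row) if j != i)
            for i, row in enumerate(sim)]


def top_genes(sim, names, k):
    """Return the k gene names with the highest scores, best first."""
    ranked = sorted(zip(names, sum_sq_scores(sim)),
                    key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Toy data (made up): g1 and g2 move together, g3 is weakly related.
rows = [[1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.5],
        [1.0, -1.0, 1.0, -1.0]]
sim = covariance_matrix(rows)
print(top_genes(sim, ["g1", "g2", "g3"], k=2))
```

Because `sum_sq_scores` only reads the matrix, any other symmetric similarity measure (for example a correlation matrix) could be passed in place of the covariance matrix.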

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f484/2708160/a02679183ee1/1471-2105-10-193-1.jpg

Similar articles

Filtering genes for cluster and network analysis.

BMC Bioinformatics. 2009 Jun 23;10:193. doi: 10.1186/1471-2105-10-193.

Reconstructing Genetic Regulatory Networks Using Two-Step Algorithms with the Differential Equation Models of Neural Networks.

Interdiscip Sci. 2018 Dec;10(4):823-835. doi: 10.1007/s12539-017-0254-3. Epub 2017 Jul 26.

Identification of functional modules using network topology and high-throughput data.

BMC Syst Biol. 2007 Jan 26;1:8. doi: 10.1186/1752-0509-1-8.

Estimating genomic coexpression networks using first-order conditional independence.

Genome Biol. 2004;5(12):R100. doi: 10.1186/gb-2004-5-12-r100. Epub 2004 Nov 30.

Reverse engineering module networks by PSO-RNN hybrid modeling.

BMC Genomics. 2009 Jul 7;10 Suppl 1(Suppl 1):S15. doi: 10.1186/1471-2164-10-S1-S15.

Comparative analysis of the transcription-factor gene regulatory networks of E. coli and S. cerevisiae.

BMC Syst Biol. 2008 Jan 31;2:13. doi: 10.1186/1752-0509-2-13.

SuMO-Fil: Supervised multi-omic filtering prior to performing network analysis.

PLoS One. 2021 Aug 3;16(8):e0255579. doi: 10.1371/journal.pone.0255579. eCollection 2021.

Predicting essential genes based on network and sequence analysis.

Mol Biosyst. 2009 Dec;5(12):1672-8. doi: 10.1039/B900611G.

Analysis of gene sets based on the underlying regulatory network.

J Comput Biol. 2009 Mar;16(3):407-26. doi: 10.1089/cmb.2008.0081.

H∞ filtering for discrete-time genetic regulatory networks with random delays.

Math Biosci. 2012 Sep;239(1):97-105. doi: 10.1016/j.mbs.2012.05.002. Epub 2012 May 28.

Cited by

Molecular Phenogroups in Heart Failure: Large-Scale Proteomics in a Population-Based Cohort.

Circ Genom Precis Med. 2025 Jul 16:e004953. doi: 10.1161/CIRCGEN.124.004953.

Integrated Multi-Omic Analysis Reveals Immunosuppressive Phenotype Associated with Poor Outcomes in High-Grade Serous Ovarian Cancer.

Cancers (Basel). 2023 Jul 17;15(14):3649. doi: 10.3390/cancers15143649.

Ten quick tips for biomarker discovery and validation analyses using machine learning.

PLoS Comput Biol. 2022 Aug 11;18(8):e1010357. doi: 10.1371/journal.pcbi.1010357. eCollection 2022 Aug.

Model-Based Feature Selection and Clustering of RNA-Seq Data for Unsupervised Subtype Discovery.

Ann Appl Stat. 2021 Mar;15(1):481-508. doi: 10.1214/20-aoas1407. Epub 2021 Mar 18.

SuMO-Fil: Supervised multi-omic filtering prior to performing network analysis.

PLoS One. 2021 Aug 3;16(8):e0255579. doi: 10.1371/journal.pone.0255579. eCollection 2021.

Riemannian Variance Filtering: An Independent Filtering Scheme for Statistical Tests on Manifold-valued Data.

Conf Comput Vis Pattern Recognit Workshops. 2017 Jul;2017:699-708. doi: 10.1109/CVPRW.2017.99. Epub 2017 Aug 24.

Hydra: A mixture modeling framework for subtyping pediatric cancer cohorts using multimodal gene expression signatures.

PLoS Comput Biol. 2020 Apr 10;16(4):e1007753. doi: 10.1371/journal.pcbi.1007753. eCollection 2020 Apr.

Chronic Chemogenetic Stimulation of the Nucleus Accumbens Produces Lasting Reductions in Binge Drinking and Ameliorates Alcohol-Related Morphological and Transcriptional Changes.

Brain Sci. 2020 Feb 18;10(2):109. doi: 10.3390/brainsci10020109.

Sex-Specific Co-expression Networks and Sex-Biased Gene Expression in the Salmonid Brook Charr.

G3 (Bethesda). 2019 Mar 7;9(3):955-968. doi: 10.1534/g3.118.200910.

Co-expression of long non-coding RNAs and autism risk genes in the developing human brain.

BMC Syst Biol. 2018 Dec 14;12(Suppl 7):91. doi: 10.1186/s12918-018-0639-x.
