An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

Authors

Botía Juan A, Vandrovcova Jana, Forabosco Paola, Guelfi Sebastian, D'Sa Karishma, Hardy John, Lewis Cathryn M, Ryten Mina, Weale Michael E

Affiliations

Department of Molecular Neuroscience, Institute of Neurology, University College London, Queen Square, London, WC1N, UK.

Department of Medical & Molecular Genetics, School of Medical Sciences, King's College London, Guy's Hospital, London, SE1 9RT, UK.

Publication information

BMC Syst Biol. 2017 Apr 12;11(1):47. doi: 10.1186/s12918-017-0420-6.

DOI:10.1186/s12918-017-0420-6

PMID:28403906

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5389000/

Abstract

BACKGROUND

Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for generating gene co-expression networks (GCN). WGCNA generates both a GCN and a derived partitioning of genes into clusters (modules). We propose k-means clustering as an additional processing step to conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network; https://github.com/juanbot/km2gcn).
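As a rough illustration of what such a post-processing step could look like, the sketch below takes an initial module assignment (as WGCNA would produce) and repeatedly reassigns each gene to the module whose centroid its expression profile correlates with most strongly. It is not the km2gcn implementation: the mean-profile centroid, the Pearson-correlation similarity, the function names and the `max_iter` default are all assumptions made for this example.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def module_centroid(profiles):
    """Mean expression profile of a module (assumed here as the centroid)."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def refine_modules(expr, labels, max_iter=20):
    """k-means-style refinement of an initial module partition.

    expr:   dict mapping gene -> list of expression values across samples
    labels: dict mapping gene -> initial module label (same genes as expr)
    Returns (refined labels, number of iterations run). Inputs are not modified.
    """
    labels = dict(labels)
    for iteration in range(1, max_iter + 1):
        # Update step: recompute one centroid per non-empty module.
        members = {}
        for gene, module in labels.items():
            members.setdefault(module, []).append(expr[gene])
        centroids = {m: module_centroid(p) for m, p in members.items()}

        # Assignment step: move each gene to its most correlated centroid.
        names = sorted(centroids)  # fixed order so ties resolve deterministically
        new_labels = {
            gene: max(names, key=lambda m: pearson(profile, centroids[m]))
            for gene, profile in expr.items()
        }

        if new_labels == labels:  # no gene moved, so the partition is stable
            return labels, iteration
        labels = new_labels
    return labels, max_iter
```

With a toy input in which one gene with a rising profile starts in a module of falling profiles, the loop moves that gene to the rising module and then stops once no assignment changes.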

RESULTS

We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (3.1-fold on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices.
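The abstract does not name the similarity indices used on the simulated data. As one assumed example, the Rand index and the Jaccard index both compare two partitions by counting gene pairs that are grouped together or apart. A minimal sketch, with hypothetical function names:

```python
from itertools import combinations


def pair_counts(true_labels, pred_labels):
    """Classify every gene pair by whether each partition co-clusters it.

    Both arguments are dicts mapping gene -> module label over the same genes.
    Returns (both, only_true, only_pred, neither).
    """
    both = only_true = only_pred = neither = 0
    for a, b in combinations(sorted(true_labels), 2):
        same_true = true_labels[a] == true_labels[b]
        same_pred = pred_labels[a] == pred_labels[b]
        if same_true and same_pred:
            both += 1
        elif same_true:
            only_true += 1
        elif same_pred:
            only_pred += 1
        else:
            neither += 1
    return both, only_true, only_pred, neither


def rand_index(true_labels, pred_labels):
    """Fraction of gene pairs on which the two partitions agree."""
    both, only_true, only_pred, neither = pair_counts(true_labels, pred_labels)
    total = both + only_true + only_pred + neither
    return (both + neither) / total if total else 1.0


def jaccard_index(true_labels, pred_labels):
    """Agreement restricted to pairs co-clustered by at least one partition."""
    both, only_true, only_pred, _ = pair_counts(true_labels, pred_labels)
    denom = both + only_true + only_pred
    return both / denom if denom else 1.0
```

Both indices depend only on which genes share a module, so renaming the modules of a partition leaves the scores unchanged.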

CONCLUSIONS

Our results indicate that the k-means method, applied as an adjunct to standard WGCNA, yields better network partitions. These improved partitions enable more fruitful downstream analyses because the resulting gene modules are more biologically meaningful.

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0534/5389000/1847a2630404/12918_2017_420_Fig1_HTML.jpg

Similar articles

An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

BMC Syst Biol. 2017 Apr 12;11(1):47. doi: 10.1186/s12918-017-0420-6.

K-Module Algorithm: An Additional Step to Improve the Clustering Results of WGCNA Co-Expression Networks.

Genes (Basel). 2021 Jan 12;12(1):87. doi: 10.3390/genes12010087.

WGCNA: an R package for weighted correlation network analysis.

BMC Bioinformatics. 2008 Dec 29;9:559. doi: 10.1186/1471-2105-9-559.

SGCP: a spectral self-learning method for clustering genes in co-expression networks.

BMC Bioinformatics. 2024 Jul 2;25(1):230. doi: 10.1186/s12859-024-05848-w.

Assessment of complementarity of WGCNA and NERI results for identification of modules associated to schizophrenia spectrum disorders.

PLoS One. 2019 Jan 15;14(1):e0210431. doi: 10.1371/journal.pone.0210431. eCollection 2019.

Weighted gene co-expression network analysis revealed key biomarkers associated with the diagnosis of hypertrophic cardiomyopathy.

Hereditas. 2020 Oct 24;157(1):42. doi: 10.1186/s41065-020-00155-9.

multiWGCNA: an R package for deep mining gene co-expression networks in multi-trait expression data.

BMC Bioinformatics. 2023 Mar 24;24(1):115. doi: 10.1186/s12859-023-05233-z.

Targeted co-expression networks for the study of traits.

Sci Rep. 2024 Jul 19;14(1):16675. doi: 10.1038/s41598-024-67329-7.

A Novel Calibration Step in Gene Co-Expression Network Construction.

Front Bioinform. 2021 Nov 23;1:704817. doi: 10.3389/fbinf.2021.704817. eCollection 2021.

Juxtapose: a gene-embedding approach for comparing co-expression networks.

BMC Bioinformatics. 2021 Mar 16;22(1):125. doi: 10.1186/s12859-021-04055-1.

Cited by

Construction and validation of immune prognosis model for lung adenocarcinoma based on machine learning.

Front Oncol. 2025 Jul 22;15:1630663. doi: 10.3389/fonc.2025.1630663. eCollection 2025.

Female gametophyte development, pollen‒pistil interactions and embryogenic patterns in chicory (Cichorium intybus): a self-incompatibility perspective.

Plant Cell Rep. 2025 Jun 25;44(7):156. doi: 10.1007/s00299-025-03546-2.

Increased burden of rare risk variants across gene expression networks predisposes to sporadic Parkinson's disease.

Cell Rep. 2025 May 2;44(5):115636. doi: 10.1016/j.celrep.2025.115636.

Hypergraph-based analysis of weighted gene co-expression hypernetwork.

Front Genet. 2025 Apr 4;16:1560841. doi: 10.3389/fgene.2025.1560841. eCollection 2025.

Analysis of key lncRNA related to Parkinson's disease based on gene co-expression weight networks.

Neurosciences (Riyadh). 2025 Jan;30(1):20-29. doi: 10.17712/nsj.2025.1.20230112.

Human longevity and Alzheimer's disease variants act via microglia and oligodendrocyte gene networks.

Brain. 2025 Mar 6;148(3):969-984. doi: 10.1093/brain/awae339.

NUF2 is associated with cancer stem cell characteristics and a potential drug target for prostate cancer.

Front Mol Biosci. 2024 Dec 5;11:1481375. doi: 10.3389/fmolb.2024.1481375. eCollection 2024.

Identification of Hub Biomarkers and Immune and Inflammation Pathways Contributing to Kawasaki Disease Progression with RT-qPCR Verification.

J Immunol Res. 2023 Apr 6;2023:1774260. doi: 10.1155/2023/1774260. eCollection 2023.

Microglia contribute to the production of the amyloidogenic ABri peptide in familial British dementia.

Acta Neuropathol. 2024 Nov 15;148(1):65. doi: 10.1007/s00401-024-02820-z.

The Transcriptional Landscape of Berry Skin in Red and White PIWI ("Pilzwiderstandsfähig") Grapevines Possessing QTLs for Partial Resistance to Downy and Powdery Mildews.

Plants (Basel). 2024 Sep 13;13(18):2574. doi: 10.3390/plants13182574.
