An additional k-means clustering step improves the biological features of WGCNA gene co-expression networks.

Author information

Botía Juan A, Vandrovcova Jana, Forabosco Paola, Guelfi Sebastian, D'Sa Karishma, Hardy John, Lewis Cathryn M, Ryten Mina, Weale Michael E

Affiliations

Department of Molecular Neuroscience, Institute of Neurology, University College London, Queen Square, London, WC1N, UK.

Department of Medical & Molecular Genetics, School of Medical Sciences, King's College London, Guy's Hospital, London, SE1 9RT, UK.

Publication information

BMC Syst Biol. 2017 Apr 12;11(1):47. doi: 10.1186/s12918-017-0420-6.

Abstract

BACKGROUND

Weighted Gene Co-expression Network Analysis (WGCNA) is a widely used R software package for generating gene co-expression networks (GCNs). WGCNA produces both a GCN and a derived partition of the genes into clusters (modules). We propose k-means clustering as an additional processing step after conventional WGCNA, which we have implemented in the R package km2gcn (k-means to gene co-expression network, https://github.com/juanbot/km2gcn).
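The idea of refining an existing module partition with k-means can be illustrated with a minimal sketch. This is not the km2gcn implementation (which is written in R), and the abstract does not specify the algorithmic details. The sketch assumes that the initial module labels seed the clusters, that each centroid is the mean expression profile of a module's genes, and that the distance is 1 minus the Pearson correlation. The function names, the toy data, and the `max_iter` default are all hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = math.sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))
    return num / den if den else 0.0


def centroid(profiles):
    """Per-sample mean across the genes of one module (assumed centroid definition)."""
    return [statistics.fmean(column) for column in zip(*profiles)]


def refine_modules(expr, modules, max_iter=30):
    """Reassign genes to the closest module centroid until the partition is stable.

    expr:    dict mapping gene -> list of expression values (one per sample)
    modules: dict mapping gene -> initial module label (e.g. from a prior clustering)
    """
    assignment = dict(modules)
    for _ in range(max_iter):
        # Modules that lose all their genes drop out, so k can only shrink.
        labels = sorted(set(assignment.values()))
        centroids = {
            m: centroid([expr[g] for g in expr if assignment[g] == m])
            for m in labels
        }
        new_assignment = {
            g: min(labels, key=lambda m: 1.0 - pearson(profile, centroids[m]))
            for g, profile in expr.items()
        }
        if new_assignment == assignment:  # converged
            break
        assignment = new_assignment
    return assignment


# Toy data: g5 follows the rising pattern of module A but starts in module B.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [8.0, 6.0, 4.0, 2.5],
    "g5": [1.1, 2.0, 3.2, 3.9],
}
initial = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "B"}
refined = refine_modules(expr, initial)
print(refined)
```

In this toy example the misplaced gene g5 moves from module B to module A, while the other genes keep their labels.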

RESULTS

We assessed our method on networks created from UKBEC data (10 different human brain tissues), on networks created from GTEx data (42 human tissues, including 13 brain tissues), and on simulated networks derived from GTEx data. We observed substantially improved module properties, including: (1) few or zero misplaced genes; (2) increased counts of replicable clusters in alternate tissues (3.1-fold on average); (3) improved enrichment of Gene Ontology terms (seen in 48/52 GCNs); (4) improved cell type enrichment signals (seen in 21/23 brain GCNs); and (5) more accurate partitions in simulated data according to a range of similarity indices.
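To illustrate how a partition similarity index scores agreement with a known simulated partition, here is a small sketch of the Rand index. The abstract does not name the indices used, so the choice of the Rand index is an assumption, and the gene names and labels are invented for illustration.

```python
from itertools import combinations


def rand_index(partition_a, partition_b):
    """Fraction of gene pairs on which two partitions agree.

    A pair counts as agreement when both partitions put the two genes in the
    same module, or both put them in different modules. Module label names
    themselves do not need to match between the partitions.
    """
    if set(partition_a) != set(partition_b):
        raise ValueError("partitions must cover the same genes")
    genes = sorted(partition_a)
    if len(genes) < 2:
        raise ValueError("need at least two genes")
    agree = 0
    total = 0
    for g, h in combinations(genes, 2):
        same_a = partition_a[g] == partition_a[h]
        same_b = partition_b[g] == partition_b[h]
        agree += same_a == same_b
        total += 1
    return agree / total


# Hypothetical simulated truth and two candidate partitions of five genes.
truth = {"g1": "M1", "g2": "M1", "g3": "M2", "g4": "M2", "g5": "M1"}
before = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "B"}  # g5 misplaced
after = {"g1": "A", "g2": "A", "g3": "B", "g4": "B", "g5": "A"}

print(rand_index(truth, before))  # 0.6
print(rand_index(truth, after))   # 1.0
```

One misplaced gene out of five disturbs four of the ten gene pairs, which is why the index drops to 0.6 for the first candidate partition.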

CONCLUSIONS

Our results indicate that the k-means method, applied as an adjunct to standard WGCNA, yields better network partitions. These improved partitions enable more fruitful downstream analyses because the resulting gene modules are more biologically meaningful.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0534/5389000/1847a2630404/12918_2017_420_Fig1_HTML.jpg
