Department of Bio-AI Convergence, Chungnam National University, Daejeon, 305-764, Korea.
Division of Animal and Dairy Science, Chungnam National University, Daejeon, 305-764, Korea.
Sci Rep. 2022 Jun 14;12(1):9854. doi: 10.1038/s41598-022-13796-9.
In the general framework of weighted gene co-expression network analysis (WGCNA), a hierarchical clustering algorithm is commonly used for module definition. However, hierarchical clustering depends strongly on the topological overlap measure. In other words, this algorithm may assign two genes with low topological overlap to different modules even though their expression patterns are similar. Here, a novel gene module clustering algorithm for WGCNA is proposed. We develop a gene module clustering network (gmcNet), which simultaneously addresses both single-level expression and the topological overlap measure. The proposed gmcNet includes a "co-expression pattern recognizer" (CEPR) and a "module classifier". The CEPR incorporates the expression features of single genes into the topological features of co-expressed genes. Given this CEPR-embedded feature, the module classifier computes module assignment probabilities. We validated the performance of gmcNet using 4,976 genes from 20 native Korean cattle. We observed that the CEPR generates more robust features than either single-level expression or the topological overlap measure. Given the CEPR-embedded feature, gmcNet achieved the best performance in terms of modularity (0.261) and the differentially expressed signal (27.739) among the clustering methods tested. Furthermore, gmcNet detected interesting biological functions related to carcass weight, backfat thickness, intramuscular fat, and beef tenderness in Korean native cattle. Therefore, gmcNet is a useful framework for WGCNA module clustering.
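For readers unfamiliar with the two quantities named in the abstract, the following is a minimal sketch of how a topological overlap measure and a modularity score are commonly computed for a weighted co-expression network. It is not the gmcNet implementation: the function names, the soft-threshold power `beta=6`, the unsigned adjacency, and the toy data are all assumptions made for illustration, and the abstract does not state which variants the authors used.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned WGCNA-style adjacency a_ij = |cor(x_i, x_j)|^beta.

    expr has shape (n_samples, n_genes); beta=6 is an assumed default.
    """
    corr = np.corrcoef(expr, rowvar=False)
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)  # no self-connections
    return adj


def topological_overlap(adj):
    """TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij),
    where l_ij = sum_u a_iu * a_uj and k_i = sum_u a_iu."""
    shared = adj @ adj                      # l_ij: strength of shared neighbours
    k = adj.sum(axis=1)                     # connectivity of each gene
    denom = np.minimum.outer(k, k) + 1.0 - adj
    tom = (shared + adj) / denom
    np.fill_diagonal(tom, 1.0)              # a gene fully overlaps with itself
    return tom


def weighted_modularity(adj, labels):
    """Newman modularity for a weighted, undirected network:
    Q = (1 / 2m) * sum_ij (a_ij - k_i * k_j / 2m) * [c_i == c_j]."""
    k = adj.sum(axis=1)
    two_m = k.sum()
    same_module = labels[:, None] == labels[None, :]
    return float(((adj - np.outer(k, k) / two_m) * same_module).sum() / two_m)


# Toy data (assumed): 20 samples, 6 genes driven by 2 latent factors.
rng = np.random.default_rng(0)
factors = rng.normal(size=(20, 2))
expr = np.repeat(factors, 3, axis=1) + 0.3 * rng.normal(size=(20, 6))

adj = soft_threshold_adjacency(expr)
tom = topological_overlap(adj)
labels = np.array([0, 0, 0, 1, 1, 1])       # hypothetical module assignment
print("modularity of the assumed partition:", weighted_modularity(adj, labels))
```

In a standard WGCNA workflow, `1 - tom` would then serve as the dissimilarity passed to hierarchical clustering, which is the dependence on topological overlap that the abstract describes.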