Yu Zhezhou, Wang Zhuo, Yu Xiangchun, Zhang Zhe
College of Computer Science and Technology, Jilin University, Changchun, China.
School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, China.
Comput Intell Neurosci. 2020 Oct 29;2020:4737969. doi: 10.1155/2020/4737969. eCollection 2020.
Breast invasive carcinoma (BRCA) is not a single disease: each subtype has a distinct morphological structure. Although several computational methods have been proposed for breast cancer subtype identification, the specific interaction mechanisms of the genes involved in each subtype are still incompletely understood. Identifying and exploring these gene interaction mechanisms for each breast cancer subtype could have an important impact on personalized treatment for different patients.
We integrated the biological importance of genes, derived from gene regulatory networks, into the differential expression analysis to obtain weighted differentially expressed genes (weighted DEGs). A gene with a high weight regulates more target genes and is therefore considered more biologically important. We also constructed gene coexpression networks for the control and experiment groups; the significantly different interaction structures between them motivated a Gene Ontology (GO) enrichment analysis based on gene coexpression networks (GOEGCN). GOEGCN performs a two-sided analysis of the differences between the control and experiment coexpression networks, which lets us study how the modulated coexpressed gene pairs affect biological functions at the GO level.
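The weighting idea can be illustrated with a minimal Python sketch. The abstract does not give the weighting formula, so everything here is an assumption made for illustration: the weight is a regulator's number of distinct targets scaled to (0, 1], and the score is the absolute log2 fold change multiplied by (1 + weight). The function names, the pseudocount, and the toy data are hypothetical and are not taken from the authors' R code.

```python
from collections import defaultdict
from math import log2
from statistics import mean


def out_degree_weights(edges):
    """Weight each regulator by its number of distinct targets, scaled to (0, 1].

    ASSUMPTION: out-degree stands in for "biological importance"; the paper's
    actual weighting scheme is not stated in the abstract.
    """
    targets = defaultdict(set)
    for regulator, target in edges:
        targets[regulator].add(target)
    max_degree = max((len(t) for t in targets.values()), default=1)
    return {gene: len(t) / max_degree for gene, t in targets.items()}


def weighted_deg_scores(control, experiment, edges, pseudocount=1.0):
    """Score each gene as |log2 fold change| * (1 + network weight).

    ASSUMPTION: this combination rule is illustrative only.
    """
    weights = out_degree_weights(edges)
    scores = {}
    for gene in control:
        fold_change = log2(
            (mean(experiment[gene]) + pseudocount)
            / (mean(control[gene]) + pseudocount)
        )
        # Genes that regulate nothing keep weight 0 and are scored by fold change alone.
        scores[gene] = abs(fold_change) * (1.0 + weights.get(gene, 0.0))
    return scores


# Toy regulatory network: TF1 regulates three genes, TF2 regulates one.
toy_edges = [("TF1", "A"), ("TF1", "B"), ("TF1", "C"), ("TF2", "A")]
# Toy expression values with the same fold change for every gene.
toy_control = {"TF1": [3, 3, 3], "TF2": [3, 3, 3], "A": [3, 3, 3]}
toy_experiment = {"TF1": [7, 7, 7], "TF2": [7, 7, 7], "A": [7, 7, 7]}

toy_scores = weighted_deg_scores(toy_control, toy_experiment, toy_edges)
ranked_genes = sorted(toy_scores, key=toy_scores.get, reverse=True)
```

Because all three toy genes share the same fold change, the ranking in `ranked_genes` is driven entirely by the network weight, which is the effect the weighting is meant to capture.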
We built a binary classification model from the weighted DEGs of each subtype. The classifiers predicted unseen samples well, and the experimental results supported the effectiveness of the proposed approaches. For each subtype, the novel GO terms enriched by GOEGCN for the control and experiment groups explain, to some extent, the specific changes in biological function associated with the two-sided differences in coexpression network structure.
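The two-sided comparison of coexpression networks can be sketched as follows. This is a simplified illustration, not the GOEGCN procedure itself: it assumes that an edge exists when the absolute Pearson correlation between two genes reaches a fixed threshold, and that the "two sides" are the edges found only in the control network and the edges found only in the experiment network. The threshold, function names, and toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def coexpression_edges(expression, threshold=0.8):
    """Return gene pairs whose |correlation| is at least `threshold`.

    ASSUMPTION: hard thresholding is used here only to keep the sketch short.
    """
    return {
        frozenset((g1, g2))
        for g1, g2 in combinations(sorted(expression), 2)
        if abs(pearson(expression[g1], expression[g2])) >= threshold
    }


def two_sided_difference(control, experiment, threshold=0.8):
    """Edges present only in control, and edges present only in experiment."""
    control_edges = coexpression_edges(control, threshold)
    experiment_edges = coexpression_edges(experiment, threshold)
    return control_edges - experiment_edges, experiment_edges - control_edges


# Toy data: A and B are coexpressed in control; A and C are coexpressed in experiment.
toy_control_expr = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]}
toy_experiment_expr = {"A": [1, 2, 3, 4], "B": [5, 1, 4, 2], "C": [2, 3, 5, 9]}

control_only, experiment_only = two_sided_difference(
    toy_control_expr, toy_experiment_expr
)
```

Under these assumptions, the genes appearing in `control_only` and in `experiment_only` would each form a separate gene list for GO enrichment, so that functions lost and functions gained can be inspected side by side.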
The weighted DEGs carry biological importance derived from the gene regulatory network. Based on the weighted DEGs, five binary classifiers were trained and performed well on the sensitivity, specificity, accuracy, F1, and AUC metrics. GOEGCN with weighted DEGs for the control and experiment groups produced novel GO enrichment results, and the newly enriched GO terms help, to some extent, to reveal the changes in specific biological functions across the BRCA subtypes. The R code for this research is available at https://github.com/yxchspring/GOEGCN_BRCA_Subtypes.
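For reference, the five evaluation metrics named above can be computed for a binary classifier as in the sketch below. It uses the standard textbook definitions; the function names and toy labels are illustrative and do not reproduce the authors' evaluation code or results.

```python
def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, and F1 from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if tp + fn else 0.0  # true positive rate (recall)
    specificity = tn / (tn + fp) if tn + fp else 0.0  # true negative rate
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if precision + sensitivity
        else 0.0
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f1": f1,
    }


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Toy labels, hard predictions, and classifier scores (illustrative only).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_pred = [1, 1, 0, 0, 0, 1, 0, 1]
toy_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

toy_metrics = binary_metrics(toy_true, toy_pred)
toy_auc = auc(toy_true, toy_prob)
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a sketch; a rank-based formulation would be preferable for large cohorts.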