
Statistical Approaches for Gene Selection, Hub Gene Identification and Module Interaction in Gene Co-Expression Network Analysis: An Application to Aluminum Stress in Soybean (Glycine max L.).

Author information

Das Samarendra, Meher Prabina Kumar, Rai Anil, Bhar Lal Mohan, Mandal Baidya Nath

Affiliations

Division of Statistical Genetics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India.

Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, New Delhi, India.

Publication information

PLoS One. 2017 Jan 5;12(1):e0169605. doi: 10.1371/journal.pone.0169605. eCollection 2017.

Abstract

Selection of informative genes is an important problem in gene expression studies. The small sample size and the large number of genes in gene expression data make the selection process complex. The selected informative genes may, in turn, serve as a vital input for gene co-expression network analysis. Moreover, the identification of hub genes and module interactions in gene co-expression networks is yet to be fully explored. This paper presents a statistically sound gene selection technique, based on the support vector machine algorithm, for selecting informative genes from high-dimensional gene expression data. A statistical approach for identifying hub genes in the gene co-expression network is also developed. In addition, a differential hub gene analysis approach has been developed to classify the identified hub genes into groups based on their gene connectivity in a case vs. control study. Based on this proposed approach, an R package, dhga (https://cran.r-project.org/web/packages/dhga), has been developed. The comparative performance of the proposed gene selection technique and of the hub gene identification approach was evaluated on three different crop microarray datasets. The proposed gene selection technique outperformed most of the existing techniques in selecting a robust set of informative genes. The proposed hub gene identification approach identified fewer hub genes than the existing approach, which is in accordance with the scale-free property of real networks. In this study, some key genes along with their Arabidopsis orthologs have been reported, which can be used for engineering the aluminum toxicity stress response in soybean. The functional analysis of various selected key genes revealed the underlying molecular mechanisms of the aluminum toxicity stress response in soybean.
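The abstract does not spell out how gene connectivity is computed or how hub genes are grouped across conditions, so the following is only an illustrative sketch under stated assumptions, not the dhga implementation. It assumes a soft-thresholded absolute Pearson correlation as the adjacency (a common choice in weighted co-expression analysis), takes the top-k genes by connectivity as hubs instead of using a statistical test, and groups hubs by simple set comparison between case and control. The function names, the exponent `beta`, and the simulated data are all hypothetical.

```python
import numpy as np

def weighted_connectivity(expr, beta=6):
    """Per-gene connectivity from a genes x samples expression matrix.

    Assumption: adjacency = |Pearson correlation| ** beta.
    """
    corr = np.corrcoef(expr)        # gene-by-gene Pearson correlation
    adj = np.abs(corr) ** beta      # soft-threshold adjacency
    np.fill_diagonal(adj, 0.0)      # ignore self-connections
    return adj.sum(axis=1)

def top_hubs(expr, k, beta=6):
    """Indices of the k most connected genes (a stand-in for a formal test)."""
    conn = weighted_connectivity(expr, beta)
    return set(np.argsort(conn)[::-1][:k].tolist())

def group_hubs(case_expr, control_expr, k, beta=6):
    """Split hubs into case-only, control-only, and shared groups."""
    case_hubs = top_hubs(case_expr, k, beta)
    control_hubs = top_hubs(control_expr, k, beta)
    return {
        "case_only": case_hubs - control_hubs,
        "control_only": control_hubs - case_hubs,
        "shared": case_hubs & control_hubs,
    }

# Simulated data: 20 genes, 12 samples per condition (purely illustrative).
rng = np.random.default_rng(0)
n_genes, n_samples = 20, 12
control = rng.normal(size=(n_genes, n_samples))
case = rng.normal(size=(n_genes, n_samples))

# Genes 0..4 follow a common driver under the case condition only,
# so they form a tightly co-expressed group there.
driver = rng.normal(size=n_samples)
case[:5] = driver + 0.1 * rng.normal(size=(5, n_samples))

case_hubs = top_hubs(case, k=5)
groups = group_hubs(case, control, k=5)
```

In this toy setting the five driven genes dominate connectivity in the case network, while the control network has no such structure, so the grouping separates condition-specific hubs from shared ones. A real analysis would replace the top-k rule with a significance test on connectivity.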
