Homoeologous gene expression and co-expression network analyses and evolutionary inference in allopolyploids.

Affiliation

Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011, USA.

Publication information

Brief Bioinform. 2021 Mar 22;22(2):1819-1835. doi: 10.1093/bib/bbaa035.

Abstract

Polyploidy is a widespread phenomenon throughout eukaryotes. Due to the coexistence of duplicated genomes, polyploids offer unique challenges for estimating gene expression levels, which is essential for understanding the massive and various forms of transcriptomic responses accompanying polyploidy. Although previous studies have explored the bioinformatics of polyploid transcriptomic profiling, the causes and consequences of inaccurate quantification of transcripts from duplicated gene copies have not been addressed. Using transcriptomic data from the cotton genus (Gossypium) as an example, we present an analytical workflow to evaluate a variety of bioinformatic method choices at different stages of RNA-seq analysis, from homoeolog expression quantification to downstream analysis used to infer key phenomena of polyploid expression evolution. In general, EAGLE-RC and GSNAP-PolyCat outperform other quantification pipelines tested, and their derived expression dataset best represents the expected homoeolog expression and co-expression divergence. The performance of co-expression network analysis was less affected by homoeolog quantification than by network construction methods, where weighted networks outperformed binary networks. By examining the extent and consequences of homoeolog read ambiguity, we illuminate the potential artifacts that may affect our understanding of duplicate gene expression, including an overestimation of homoeolog co-regulation and the incorrect inference of subgenome asymmetry in network topology. Taken together, our work points to a set of reasonable practices that we hope are broadly applicable to the evolutionary exploration of polyploids.
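The contrast between weighted and binary co-expression networks mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the soft-threshold power `beta`, the hard-threshold cutoff `tau`, and the simulated expression matrix are all assumptions chosen for illustration. A weighted network keeps a graded edge strength for every gene pair, whereas a binary network reduces each pair to present or absent.

```python
import numpy as np

def correlation_matrix(expr):
    """Pearson correlation between genes; expr has shape (genes, samples)."""
    return np.corrcoef(expr)

def weighted_adjacency(expr, beta=6):
    """Soft-thresholded (unsigned) adjacency: |r| ** beta.

    beta is an illustrative assumption, not a value taken from the paper.
    """
    adj = np.abs(correlation_matrix(expr)) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-connections
    return adj

def binary_adjacency(expr, tau=0.8):
    """Hard-thresholded adjacency: edge present when |r| >= tau.

    tau is an illustrative assumption, not a value taken from the paper.
    """
    adj = (np.abs(correlation_matrix(expr)) >= tau).astype(int)
    np.fill_diagonal(adj, 0)  # ignore self-connections
    return adj

# Hypothetical toy data: three homoeolog pairs measured in 12 samples.
# Each pair shares an underlying signal plus independent noise per copy.
rng = np.random.default_rng(0)
n_samples = 12
shared_signal = rng.normal(size=(3, n_samples))
expr_copy_a = shared_signal + 0.3 * rng.normal(size=shared_signal.shape)
expr_copy_d = shared_signal + 0.3 * rng.normal(size=shared_signal.shape)
expr = np.vstack([expr_copy_a, expr_copy_d])  # rows 0-2: copy A, rows 3-5: copy D

w = weighted_adjacency(expr)
b = binary_adjacency(expr)

print("weighted edge, pair 0 (A vs D):", round(float(w[0, 3]), 3))
print("binary edge,   pair 0 (A vs D):", int(b[0, 3]))
```

In this toy setup, a small change in correlation near `tau` flips a binary edge on or off, while the weighted edge changes only gradually. That sensitivity to the cutoff is one plausible reason a binary network would be less robust, though the abstract itself does not state the mechanism.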

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/769a/7986634/9063607415fe/bbaa035f1.jpg
