Zea mays RNA-seq estimated transcript abundances are strongly affected by read mapping bias.

Authors

Zhan Shuhua, Griswold Cortland, Lukens Lewis

Affiliations

Department of Plant Agriculture, University of Guelph, Guelph, Ontario, Canada.

Department of Integrative Biology, University of Guelph, Guelph, Ontario, Canada.

Publication information

BMC Genomics. 2021 Apr 20;22(1):285. doi: 10.1186/s12864-021-07577-3.

DOI:10.1186/s12864-021-07577-3

PMID:33874908

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8056621/

Abstract

BACKGROUND

Genetic variation for gene expression is a source of phenotypic variation for natural and agricultural species. The common approach to map and to quantify gene expression from genetically distinct individuals is to assign their RNA-seq reads to a single reference genome. However, RNA-seq reads from alleles dissimilar to this reference genome may fail to map correctly, causing transcript levels to be underestimated. Presently, the extent of this mapping problem is not clear, particularly in highly diverse species. We investigated whether mapping bias occurred and whether chromosomal features were associated with mapping bias. Zea mays presents a model species to assess these questions, given that it has genotypically distinct and well-studied genetic lines.

RESULTS

In Zea mays, the inbred B73 genome is the standard reference genome and template for RNA-seq read assignments. In the absence of mapping bias, B73 and a second inbred line, Mo17, would each have an approximately equal number of regulatory alleles that increase gene expression. Remarkably, Mo17 had 2-4 times fewer such positively acting alleles than did B73 when RNA-seq reads were aligned to the B73 reference genome. Reciprocally, over one-half of the B73 alleles that increased gene expression were not detected when reads were aligned to the Mo17 genome template. Genes at dissimilar chromosomal ends were strongly affected by mapping bias, and genes at more similar pericentromeric regions were less affected. Biased transcript estimates were higher in untranslated regions and lower in splice junctions. Bias occurred across software and alignment parameters.
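The expectation stated above (roughly equal numbers of expression-increasing alleles in B73 and Mo17 when there is no mapping bias) can be illustrated with an exact binomial test. The sketch below is only an illustration of that reasoning, not the authors' analysis: the counts `b73_up` and `mo17_up` are hypothetical placeholders chosen to give a 3-fold difference, and the helper `binomial_two_sided_p` is a name introduced here.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value.

    Sums the probabilities of every outcome that is no more likely than
    the observed count k under Binomial(n, p).
    """
    def pmf(i: int) -> float:
        # Work in log space so large n does not overflow.
        log_coeff = lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
        return exp(log_coeff + i * log(p) + (n - i) * log(1 - p))

    probs = [pmf(i) for i in range(n + 1)]
    cutoff = probs[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in probs if q <= cutoff))


# HYPOTHETICAL counts of genes whose expression-increasing allele comes
# from each parent; these numbers are NOT taken from the paper.
b73_up = 300
mo17_up = 100

n_genes = b73_up + mo17_up
fold_difference = b73_up / mo17_up
p_value = binomial_two_sided_p(mo17_up, n_genes)

print(f"B73/Mo17 ratio of expression-increasing alleles: {fold_difference:.1f}")
print(f"Two-sided binomial p-value under a 1:1 expectation: {p_value:.3g}")
```

A very small p-value here only says that the counts depart from a 1:1 split; attributing that departure to mapping bias rather than biology still requires the reciprocal comparison described above, in which reads are aligned to the other parent's genome.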

CONCLUSIONS

Mapping bias very strongly affects gene transcript abundance estimates in maize, and bias varies across chromosomal features. Individual genome or transcriptome templates are likely necessary for accurate transcript estimation across genetically variable individuals in maize and other species.
