Center for Plant Science Innovation and Department of Agronomy and Horticulture, University of Nebraska, NE, USA.
Genome Biol Evol. 2012;4(5):670-86. doi: 10.1093/gbe/evs042. Epub 2012 Apr 25.
Despite intense investigation for over 25 years, the in vivo structure of plant mitochondrial genomes remains uncertain. Mapping studies and genome sequencing generally produce large circular chromosomes, whereas electrophoretic and microscopic studies typically reveal linear and multibranched molecules. To more fully assess the structure of plant mitochondrial genomes, the complete sequence of the monkeyflower (Mimulus guttatus DC. line IM62) mitochondrial DNA was constructed from a large (35 kb) paired-end shotgun sequencing library to a high depth of coverage (~30×). The complete genome maps as a 525,671 bp circular molecule and exhibits a fairly conventional set of features including 62 genes (encoding 35 proteins, 24 transfer RNAs, and 3 ribosomal RNAs), 22 introns, 3 large repeats (2.7, 9.6, and 29 kb), and 96 small repeats (40-293 bp). Most paired-end reads (71%) mapped to the consensus sequence at the expected distance and orientation across the entire genome, validating the accuracy of assembly. Another 10% of reads provided clear evidence of alternative genomic conformations due to apparent rearrangements across large repeats. Quantitative assessment of these repeat-spanning read pairs revealed that all large repeat arrangements are present at appreciable frequencies in vivo, although not always in equimolar amounts. The observed stoichiometric differences for some arrangements are inconsistent with a predominant master circular structure for the mitochondrial genome of M. guttatus IM62. Finally, because IM62 contains a cryptic cytoplasmic male sterility (CMS) system, an in silico search for potential CMS genes was undertaken. The three chimeric open reading frames (ORFs) identified in this study, in addition to the previously identified ORFs upstream of the nad6 gene, are the most likely CMS candidate genes in this line.
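The quantitative assessment of repeat-spanning read pairs described above can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes a single two-copy repeat whose copies are flanked by unique regions, here given the hypothetical labels A/B (copy 1) and C/D (copy 2), and it assumes each read pair has already been reduced to the pair of flanks in which its two mates anchor. The counts are invented for illustration only.

```python
from collections import Counter


def arrangement_frequencies(pairs):
    """Return the relative frequency of each flank combination.

    `pairs` is an iterable of (left_flank, right_flank) labels, one per
    read pair whose mates anchor in unique sequence on opposite sides of
    a repeat copy. The labels are hypothetical, not from the source.
    """
    counts = Counter(pairs)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {arrangement: n / total for arrangement, n in counts.items()}


# Invented example: A-B and C-D are the arrangements in the consensus
# assembly; A-D and C-B are the alternatives expected after
# recombination across the repeat.
example_pairs = (
    [("A", "B")] * 40
    + [("C", "D")] * 40
    + [("A", "D")] * 10
    + [("C", "B")] * 10
)

freqs = arrangement_frequencies(example_pairs)
for arrangement, freq in sorted(freqs.items()):
    print(arrangement, round(freq, 2))
```

Under this toy model, equal frequencies for all four combinations would correspond to equimolar arrangements, while skewed frequencies would correspond to the kind of stoichiometric differences the abstract reports.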