Graph-based models of the mitochondrial genome capture the enormous complexity of higher plant mitochondrial DNA organization.

Author information

Fischer Axel, Dotzek Jana, Walther Dirk, Greiner Stephan

Affiliations

Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany.

Publication information

NAR Genom Bioinform. 2022 Mar 31;4(2):lqac027. doi: 10.1093/nargab/lqac027. eCollection 2022 Jun.

Abstract

Plant mitochondrial genomes display an enormous structural complexity, as recombining repeat-pairs lead to the generation of various sub-genomic molecules, rendering these genomes extremely challenging to assemble. We present a novel bioinformatic data-processing pipeline called SAGBAC (Semi-Automated Graph-Based Assembly Curator) that identifies recombinogenic repeat-pairs and reconstructs plant mitochondrial genomes. SAGBAC processes assembly outputs and applies our novel ISEIS (Iterative Sequence Ends Identity Search) algorithm to obtain a graph-based visualization. We applied this approach to three mitochondrial genomes of evening primrose (Oenothera), a plant genus used for cytoplasmic genetics studies. All identified repeat pairs were found to be flanked by two alternative and unique sequence-contigs defining so-called 'double forks', resulting in four possible contig-repeat-contig combinations for each repeat pair. Based on the inferred structural models, the stoichiometry of the different contig-repeat-contig combinations was analyzed using Illumina mate-pair and PacBio RSII data. This uncovered a remarkable structural diversity of the three closely related mitochondrial genomes, as well as substantial phylogenetic variation of the underlying repeats. Our model allows predicting all recombination events and, thus, all possible sub-genomes. In future work, the proposed methodology may prove useful for the investigation of the sub-genome organization and dynamics in different tissues and at various developmental stages.
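
The 'double fork' arrangement described in the abstract can be illustrated with a minimal Python sketch. It is not the SAGBAC or ISEIS implementation; it only assumes, as one reading of the abstract, that each repeat has two alternative unique contigs on either side, which gives 2 x 2 = 4 contig-repeat-contig paths. The function names, the contig labels (A, B, C, D, R1) and the read counts are hypothetical placeholders, not values from the study.

```python
from itertools import product


def double_fork_paths(repeat, left_contigs, right_contigs):
    """Enumerate every contig-repeat-contig path through one repeat.

    Assumes exactly two alternative unique contigs on each side of the repeat.
    """
    if len(left_contigs) != 2 or len(right_contigs) != 2:
        raise ValueError("a double fork needs two contigs on each side")
    return [(left, repeat, right)
            for left, right in product(left_contigs, right_contigs)]


def relative_stoichiometry(support):
    """Turn per-path read support counts into fractions that sum to 1."""
    total = sum(support.values())
    if total == 0:
        raise ValueError("no supporting reads")
    return {path: count / total for path, count in support.items()}


# Hypothetical labels: repeat "R1" flanked by A/B on one side and C/D on the other.
paths = double_fork_paths("R1", ("A", "B"), ("C", "D"))
for path in paths:
    print("-".join(path))

# Placeholder counts for illustration only, not data from the study.
placeholder_support = dict(zip(paths, (40, 10, 10, 40)))
fractions = relative_stoichiometry(placeholder_support)
```

In practice the support counts would come from reads that span a repeat together with both flanking contigs, for example the mate-pair or long-read data mentioned in the abstract; how such reads are counted is outside the scope of this sketch.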

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1dde/8969700/b33cc87241e8/lqac027fig1.jpg
