Intra-individual polymorphism in chloroplasts from NGS data: where does it come from and how to handle it?

Author information

Scarcelli N, Mariac C, Couvreur T L P, Faye A, Richard D, Sabot F, Berthouly-Salazar C, Vigouroux Y

Affiliations

UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394, Montpellier Cedex 5, France.

Département des Sciences Biologiques, Laboratoire de Botanique Systématique et d'Ecologie, Ecole Normale Supérieure, Université de Yaoundé I, BP 047, Yaoundé, Cameroon.

Publication information

Mol Ecol Resour. 2016 Mar;16(2):434-45. doi: 10.1111/1755-0998.12462. Epub 2015 Sep 20.

Abstract

Next-generation sequencing (NGS) gives access to large quantities of genomic data. In plants, several studies have used whole chloroplast genome sequences to infer phylogeography or phylogeny. Even though the chloroplast is a haploid organelle, NGS plastome data have revealed a non-negligible number of intra-individual polymorphic SNPs. Such observations could have several causes, such as sequencing errors, heteroplasmy, or the transfer of chloroplast sequences into the nuclear and mitochondrial genomes. The occurrence of this allelic diversity has important practical consequences for the identification of diversity and the analysis of chloroplast data, and beyond that it raises significant evolutionary questions. In this study, we show that the observed intra-individual polymorphism in chloroplast sequence data is probably the result of plastid DNA transferred into the mitochondrial and/or nuclear genomes. We further assess the error rates of nine bioinformatics pipelines for SNP and genotype calling, using SNPs identified by Sanger sequencing as the reference. Specific pipelines are adequate to deal with this issue, optimizing both specificity and sensitivity. Our results will allow proper use of whole chloroplast NGS sequences and better handling of NGS chloroplast sequence diversity.
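The abstract describes scoring SNP-calling pipelines by sensitivity and specificity against Sanger-identified SNPs but does not give the computation. The following is a minimal sketch of how such a comparison could be set up, not the authors' actual procedure. It assumes SNPs are represented as sets of integer positions and that the set of positions covered by Sanger sequencing is known; the function name `evaluate_calls`, the `CallMetrics` type, and the toy data are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CallMetrics:
    sensitivity: float  # share of reference SNPs that the pipeline recovered
    specificity: float  # share of non-SNP positions that the pipeline left uncalled


def evaluate_calls(called: set[int], truth: set[int], surveyed: set[int]) -> CallMetrics:
    """Compare pipeline SNP calls with reference SNPs over the surveyed positions.

    Assumption: `truth` holds Sanger-identified SNP positions and `surveyed`
    holds every position covered by Sanger sequencing. Positions outside
    `surveyed` are ignored because they cannot be validated.
    """
    called = called & surveyed
    truth = truth & surveyed

    tp = len(called & truth)             # called and confirmed
    fn = len(truth - called)             # confirmed but missed
    fp = len(called - truth)             # called but not confirmed
    tn = len(surveyed - called - truth)  # neither called nor confirmed

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return CallMetrics(sensitivity, specificity)


# Toy example with made-up positions (not data from the study).
surveyed = set(range(1, 11))  # ten positions covered by Sanger sequencing
truth = {2, 5, 8}             # SNPs confirmed by Sanger
called = {2, 5, 9}            # SNPs reported by a hypothetical pipeline

metrics = evaluate_calls(called, truth, surveyed)
print(metrics)
```

Running the same function on each pipeline's calls would give one sensitivity and specificity pair per pipeline, which is the kind of trade-off the abstract refers to when it says specific pipelines optimize both.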
