Redwan R M, Saidin A, Kumar S V
Biotechnology Research Institute, Universiti Malaysia Sabah, Jalan UMS, 88400, Kota Kinabalu, Sabah, Malaysia.
Novocraft Technology Sdn. Bhd., 3 Two Square, Seksyen 19, Petaling Jaya, Selangor, Malaysia.
BMC Plant Biol. 2015 Aug 12;15:196. doi: 10.1186/s12870-015-0587-1.
Pineapple (Ananas comosus var. comosus) is known as the king of fruits for its crown and is the third most important tropical fruit after banana and citrus. The plant, which is indigenous to South America, is the most important species in the Bromeliaceae family and is largely traded for fresh fruit consumption. Here, we report the complete chloroplast genome sequence of the MD-2 pineapple, obtained using PacBio sequencing technology.
In this study, the high error rate of PacBio long reads from A. comosus total genomic DNA was reduced by using highly accurate but short Illumina reads for error correction, via the latest error-correction module from Novocraft. The error-corrected PacBio long reads were assembled with a single tool to produce one contig representing the pineapple chloroplast genome. The genome is 159,636 bp long and has the conserved quadripartite chloroplast structure: a large single copy (LSC) region of 87,482 bp, a small single copy (SSC) region of 18,622 bp, and two inverted repeat regions (IRA and IRB) of 26,766 bp each. Overall, the genome contains 117 unique coding regions, 30 of which are repeated in the IR regions, and its gene content, structure and arrangement are similar to those of its sister taxon, Typha latifolia. A total of 35 repeat structures were detected across the coding and non-coding regions, the majority being tandem repeats. In addition, 205 SSRs were detected in the genome, with six protein-coding genes containing more than two SSRs. Comparison of chloroplast genomes from the subclass Commelinidae revealed that protein-coding genes are conserved even though they are located in highly divergent regions. Analysis of selection pressure on protein-coding genes using the Ka/Ks ratio showed significant positive selection on the rps7 gene of the pineapple chloroplast (P < 0.05). Phylogenetic analysis confirmed the recent taxonomic relationships among members of the commelinids, supporting a monophyletic relationship between Arecales and Dasypogonaceae and between Zingiberales and Poales, the order that includes A. comosus.
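As a rough illustration of two of the figures above, the Python sketch below checks that the reported region sizes add up to the reported genome length (LSC + SSC + two IR copies) and shows one simple way perfect SSRs can be located with a regular expression. The function names and the minimum repeat thresholds are assumptions made for this example; the abstract does not state which SSR tool or settings were used, so this is not the authors' pipeline.

```python
import re

# Region sizes in bp, as reported in the abstract.
LSC, SSC, IR = 87_482, 18_622, 26_766
TOTAL = 159_636


def quadripartite_length(lsc, ssc, ir):
    """Length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect SSRs in seq.

    min_repeats maps motif length -> minimum number of repeat units.
    The defaults are illustrative assumptions, not the paper's settings.
    """
    if min_repeats is None:
        min_repeats = {1: 8, 2: 4, 3: 3}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return hits


if __name__ == "__main__":
    print(quadripartite_length(LSC, SSC, IR) == TOTAL)
    print(find_ssrs("GGATATATATCC" + "A" * 9))
```

Running the same scan over the full chloroplast sequence would give an SSR count that depends strongly on the chosen thresholds, so it should not be expected to reproduce the 205 SSRs reported here without the original settings.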
The complete sequence of the pineapple chloroplast provides insights into the divergence of genic chloroplast sequences among members of the subclass Commelinidae. The complete pineapple chloroplast genome will serve as a reference for in-depth taxonomic studies of the Bromeliaceae family as more species in the family are sequenced. The sequence information will also make other molecular applications of the pineapple chloroplast feasible for plant genetic improvement.