Gupta Yogesh, Pathak Ashish K, Singh Kashmir, Mantri Shrikant S, Singh Sudhir P, Tuli Rakesh
National Agri-Food Biotechnology Institute (NABI), Department of Biotechnology (DBT), C-127, Industrial Area, Phase-8, Mohali 160071, India.
University Institute of Engineering and Technology, Panjab University, Chandigarh, India.
BMC Genomics. 2015 Feb 14;16(1):86. doi: 10.1186/s12864-015-1248-3.
Annona squamosa L., a popular fruit tree, is the most widely cultivated species of the genus Annona. It bears aggregate fruits with numerous seeds, although a few rare accessions with very few seeds have been reported for Annona. The lack of transcriptomic and genomic information limits the scope of genomic investigations in this important shrub. Massive pyrosequencing (Roche 454 GS FLX+) of the transcriptome from early stages of fruit development (0, 4, 8 and 12 days after pollination) was performed to produce expression datasets for two genotypes, Sitaphal and NMK-1, that contrast in the number of seeds set in their fruits. The data reported here are the first genome-wide differential transcriptome sequence resource for two genotypes of A. squamosa, and they identify several candidate genes related to seed development.
Approximately 1.9 million high-quality clean reads, with an average length of about 568 bp, were obtained from the cDNA libraries of developing fruits of both genotypes. The high-quality reads were assembled de novo into between 2074 and 11004 contigs in the fruit samples from the different stages of development. The contig sequences from all four stages of each genotype were then combined into larger units, resulting in 14921 (Sitaphal) and 14178 (NMK-1) unigenes with a mean size of more than 1 kb. The assembled unigenes were functionally annotated by querying them against the protein sequences of five public databases (NCBI non-redundant (NR), Prunus persica, Vitis vinifera, Fragaria vesca, and Amborella trichopoda) with an E-value cut-off of 10^-5. A total of 4588 (Sitaphal) and 2502 (NMK-1) unigenes did not match any known protein in the NR database. These sequences could be genes specific to Annona sp. or could belong to untranslated regions. Several unigenes representing pathways related to primary and secondary metabolism, and to seed and fruit development, were expressed at a higher level in Sitaphal, the densely seeded cultivar, than in the poorly seeded NMK-1. A total of 2629 (Sitaphal) and 3445 (NMK-1) Simple Sequence Repeat (SSR) motifs were identified in the two genotypes. These are potential candidates for transcript-based microsatellite analysis in A. squamosa.
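As an illustration of the SSR identification step mentioned above, the following is a minimal Python sketch of how perfect microsatellite motifs could be located in a unigene sequence. It is not the tool or the parameter set used in the study, neither of which is stated here. The function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for the example, and mononucleotide and compound repeats are ignored.

```python
import re

# ASSUMPTION: minimum number of tandem copies required per motif length.
# These thresholds are illustrative; the study's actual settings are not given.
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect di- to hexanucleotide repeats.

    `start` is the 0-based position of the repeat in `seq`.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One captured unit followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip units that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"), so each repeat is reported once
            # under its shortest motif.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), motif, copies))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence containing an (AT)7 and a (GAA)5 repeat.
    example = "GGC" + "AT" * 7 + "CCGT" + "GAA" * 5 + "TTC"
    for start, motif, copies in find_ssrs(example):
        print(start, motif, copies)
```

Running such a scan over every unigene and counting the hits per genotype would give totals comparable in kind to those reported above, although the exact numbers depend on the thresholds chosen.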
The present work provides an early-stage, fruit-specific transcriptome sequence resource for A. squamosa. This repository will serve as a useful resource for investigating the molecular mechanisms of fruit development and for improving fruit-related traits in A. squamosa and related species.