Debat Humberto J, Grabiele Mauro, Aguilera Patricia M, Bubillo Rosana E, Otegui Mónica B, Ducasse Daniel A, Zapata Pedro D, Marti Dardo A
Instituto de Patología Vegetal, Centro de Investigaciones Agropecuarias, Instituto Nacional de Tecnología Agropecuaria (IPAVE-CIAP-INTA), Córdoba, Argentina.
Instituto de Biología Subtropical, Universidad Nacional de Misiones (IBS-UNaM-CONICET), Posadas, Misiones, Argentina; Instituto de Biotecnología de Misiones, Facultad de Ciencias Exactas Químicas y Naturales, Universidad Nacional de Misiones (INBIOMIS-FCEQyN-UNaM), Misiones, Argentina.
PLoS One. 2014 Oct 16;9(10):e109835. doi: 10.1371/journal.pone.0109835. eCollection 2014.
Yerba mate (Ilex paraguariensis A. St.-Hil.) is an important subtropical tree crop cultivated on 326,000 ha in Argentina, Brazil and Paraguay, with a total production of more than 1,000,000 t. Sequence information for yerba mate is severely limited. The NCBI GenBank lacks an EST database for yerba mate and holds only 80 DNA sequences, mostly uncharacterized. In this scenario, in order to elucidate the yerba mate gene landscape by means of NGS, we explored and discovered a vast collection of I. paraguariensis transcripts. Total RNA from I. paraguariensis was sequenced on an Illumina HiSeq-2000, yielding 72,031,388 paired-end 100 bp sequences. High-quality reads were de novo assembled into 44,907 transcripts encompassing 40 million bases, with an estimated coverage of 180X. Multiple sequence analysis allowed us to predict that yerba mate contains ∼32,355 genes and 12,551 gene variants or isoforms. We identified and categorized members of more than 100 metabolic pathways. Overall, we identified ∼1,000 putative transcription factors, as well as genes involved in heat and oxidative stress, pathogen response, disease resistance and hormone response. Based on sequence homology searches, we also identified novel transcripts related to osmotic, drought, salinity and cold stress, senescence and early flowering. We also pinpointed several members of the gene silencing pathway and characterized the silencing effector Argonaute1. We predicted a diverse supply of putative microRNA precursors involved in developmental processes. We present here the first draft of the transcribed genomes of the yerba mate chloroplast and mitochondrion. The putative sequence and predicted structure of the caffeine synthase of yerba mate are presented. Moreover, we provide a collection of over 10,800 SSRs accessible to the scientific community interested in yerba mate genetic improvement.
This contribution broadly expands the limited knowledge of yerba mate genes, and is presented as the first genomic resource of this important crop.
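The reported 180X coverage can be checked with simple arithmetic. The following is a minimal sketch, assuming that coverage is estimated as total sequenced bases divided by assembly size, that all 72,031,388 sequences are counted as single 100 bp reads, and that the assembly is exactly 40 million bases (the abstract gives only a rounded figure). The function name is illustrative and is not taken from the paper.

```python
def estimated_coverage(num_reads: int, read_length_bp: int, assembly_size_bp: int) -> float:
    """Return the average depth: total sequenced bases / assembled bases."""
    return num_reads * read_length_bp / assembly_size_bp


# Figures from the abstract; the assembly size is a rounded value (assumption).
reads = 72_031_388
read_length = 100
assembly_size = 40_000_000

coverage = estimated_coverage(reads, read_length, assembly_size)
print(f"Estimated coverage: {coverage:.1f}X")
```

Under these assumptions the result is close to the 180X stated in the abstract. Because only high-quality reads were assembled, the raw read count used here gives an upper bound on the depth actually contributing to the assembly.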