Zhernakov Aleksandr, Rotter Björn, Winter Peter, Borisov Alexey, Tikhonovich Igor, Zhukov Vladimir
All-Russia Research Institute for Agricultural Microbiology, Podbelsky chausse 3, 196608 Saint-Petersburg - Pushkin, Russia.
GenXPro GmbH, Altenhöferallee 3, 60438 Frankfurt am Main, Germany.
Genom Data. 2016 Dec 12;11:75-76. doi: 10.1016/j.gdata.2016.12.004. eCollection 2017 Mar.
Aimed at gene-based marker design, we generated and analyzed transcriptome sequencing datasets for six pea (Pisum sativum L.) genetic lines that have not previously been massively genotyped. Five cDNA libraries obtained from nodules or nodulated roots of the genetic lines Finale, Frisson, Sparkle, Sprint-2 and NGB1238 were sequenced using a versatile 3'-RNA-seq protocol called MACE (Massive Analysis of cDNA Ends). MACE delivers a single next-generation sequencing read from the 3'-end of each individual cDNA molecule, which allows precise quantification of the respective transcripts. Because the contig assembled from these 3'-end reads encompasses the highly polymorphic 3'-untranslated region (3'-UTR), MACE efficiently detects single nucleotide variants (SNVs). Mapping MACE reads to the reference nodule transcriptome assembly of the pea line SGE (Transcriptome Shotgun Assembly GDTM00000000.1) identified over 34,000 polymorphic sites in more than 9700 contigs. Several of these SNVs were located within recognition sequences of restriction endonucleases, which allowed the design of co-dominant CAPS markers for the respective transcripts. Cleaned reads of the sequenced libraries are available from the European Nucleotide Archive (http://www.ebi.ac.uk/) under accessions PRJEB18101, PRJEB18102, PRJEB18103, PRJEB18104, PRJEB17691.
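The link between an SNV and a CAPS marker can be illustrated with a minimal sketch: an SNV is a CAPS candidate when a restriction site covering that position is present in one allele and absent in the other. The code below is an assumption-laden illustration, not the authors' pipeline. The function names (`sites_overlapping`, `caps_candidates`), the three-enzyme table and the toy contig are all hypothetical; only the recognition sequences of EcoRI, BamHI and HindIII are standard.

```python
# Recognition sequences of three common enzymes (all palindromic,
# so scanning one strand is sufficient). Illustrative subset only.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def sites_overlapping(seq, site, pos):
    """Return start indices of `site` occurrences in `seq` that cover index `pos`."""
    starts = []
    lo = max(0, pos - len(site) + 1)
    hi = min(pos, len(seq) - len(site))
    for start in range(lo, hi + 1):
        if seq[start:start + len(site)] == site:
            starts.append(start)
    return starts


def caps_candidates(ref_seq, pos, alt_base, enzymes=ENZYMES):
    """List enzymes that cut exactly one of the two alleles at a 0-based SNV position.

    Each hit is (enzyme_name, allele_that_is_cut), where the allele is "ref" or "alt".
    """
    ref_seq = ref_seq.upper()
    alt_seq = ref_seq[:pos] + alt_base.upper() + ref_seq[pos + 1:]
    hits = []
    for name, site in enzymes.items():
        in_ref = bool(sites_overlapping(ref_seq, site, pos))
        in_alt = bool(sites_overlapping(alt_seq, site, pos))
        if in_ref != in_alt:  # site gained or lost by the SNV
            hits.append((name, "ref" if in_ref else "alt"))
    return hits


# Toy contig fragment (made up): the SNV at index 7 turns GAATTC into GAGTTC,
# so EcoRI cuts only the reference allele.
toy_contig = "ACGTTGAATTCCGTA"
print(caps_candidates(toy_contig, 7, "G"))
```

A practical marker design would additionally have to check that the enzyme does not cut elsewhere in the amplicon and that primers can be placed around the site; those steps are omitted here.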