Dutta Debabrata, Awon Vivek Kumar, Gangopadhyay Gaurab
Division of Plant Biology, Bose Institute (Main Campus), 93/1 APC Road, Kolkata - 700009, India.
Data Brief. 2020 Oct 21;33:106448. doi: 10.1016/j.dib.2020.106448. eCollection 2020 Dec.
We report here transcriptome sequencing data for control and infected sesame genotypes. Sesame is an emerging oilseed crop [1]. The destructive soil-borne fungus Macrophomina phaseolina (Tassi) Goid. causes charcoal rot of sesame, leading to high (>50%) yield loss. Most of the high-yielding sesame cultivars (Sesamum indicum) of India are susceptible to charcoal rot. Wild sesame shows a high degree of tolerance against many pathogens [2]. We earlier developed an interspecific hybrid between Indian cultivated sesame and wild sesame. The two parents and the F recombinant constitute the three experimental genotypes in the present report. The seedlings were infected with M. phaseolina, and the transcriptome data of the infected and control (mock-inoculated) seedlings are presented. RNA-seq on the Illumina NovaSeq 6000 platform generated 2.9 × 10 paired-end reads. We deposited the data in the NCBI Sequence Read Archive (SRA) under accession number PRJNA642699. Assembly of the clean reads generated 106,295 unigenes with an average length of 1,342 bp, covering 1.42 × 10^8 nucleotides. Screening the 106,295 unigenes with the MISA and SAMtools software identified 26,880 simple sequence repeats (SSRs), 90,181 single nucleotide polymorphisms (SNPs), and 25,063 insertions/deletions (InDels). Apart from mono-nucleotide repeats, di-nucleotide repeats (42.51%) were the most abundant among the SSRs, followed by tri-nucleotide repeats (14.28%). Subsequently, we designed 22,494 primer pairs based on perfect di- and tri-nucleotide SSRs. Transitions (Ts, 60%) were the most abundant substitution type among the SNPs, followed by transversions (Tv, 40%), giving a Ts/Tv ratio of 1.48. The genic-SSR markers and SNP information will pave the way for molecular marker-assisted breeding of sesame for tolerance against charcoal rot.
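The Ts/Tv figure quoted above can be illustrated with a minimal sketch of how substitutions are classified. A transition exchanges two purines (A, G) or two pyrimidines (C, T); every other substitution is a transversion. This is not the authors' SAMtools-based pipeline: the function names and the toy SNP list are hypothetical and serve only to show the arithmetic.

```python
# Minimal sketch: classify SNPs as transitions or transversions and
# compute the Ts/Tv ratio. Names and data below are illustrative only.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref: str, alt: str) -> bool:
    """Return True for purine<->purine or pyrimidine<->pyrimidine changes."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError(f"unexpected base in SNP: {ref}>{alt}")
    if ref == alt:
        raise ValueError("reference and alternate alleles are identical")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def ts_tv_ratio(snps) -> float:
    """Compute transitions / transversions for (ref, alt) pairs."""
    snps = list(snps)
    ts = sum(1 for ref, alt in snps if is_transition(ref, alt))
    tv = len(snps) - ts
    if tv == 0:
        raise ValueError("no transversions; ratio is undefined")
    return ts / tv


# Hypothetical toy data: three transitions and two transversions.
toy_snps = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("G", "T")]
ratio = ts_tv_ratio(toy_snps)
print(f"Ts/Tv = {ratio:.2f}")
```

The rounded shares reported above (60% Ts, 40% Tv) would give 1.5; the stated ratio of 1.48 presumably reflects the unrounded counts, which are not given in this abstract.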