The Bioinformatics Group, School of Water, Energy and Environment, Cranfield University, College Road, Bedford, MK43 0AL, UK.
Division of Crop Improvement, ICAR-Indian Institute of Vegetable Research, Varanasi, India.
Bioinformatics. 2021 Aug 4;37(14):1941–1945. doi: 10.1093/bioinformatics/btab048. Epub 2021 Jan 30.
Solanum sitiens is a self-incompatible wild relative of tomato, characterised by salt and drought resistance traits, with the potential to contribute to the improvement of cultivated tomato through breeding programmes. This species has a distinct morphology, classification and ecotype compared to other stress-resistant wild tomato relatives such as S. pennellii and S. chilense. Therefore, the availability of a reference genome for S. sitiens will facilitate the genetic and molecular understanding of salt and drought resistance.
A high-quality de novo genome and transcriptome assembly for S. sitiens (Accession LA1974) has been developed. A hybrid assembly strategy was followed using Illumina short reads (∼159X coverage) and PacBio long reads (∼44X coverage), generating a total of ∼262 Gbp of DNA sequence. A reference genome of 1,245 Mbp, arranged in 1,483 scaffolds with an N50 of 1.826 Mbp, was generated. Genome completeness was estimated at 95% using the Benchmarking Universal Single-Copy Orthologs (BUSCO) and the K-mer Analysis Tool (KAT). In addition, ∼63 Gbp of RNA-Seq data were generated to support the prediction of 31,164 genes from the assembly, and to perform a de novo transcriptome assembly. Lastly, we identified three large inversions compared to S. lycopersicum, containing several drought-resistance-related genes, such as beta-amylase 1 and YUCCA7.
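For readers unfamiliar with the scaffold N50 statistic reported above, the following is a minimal sketch of how it is conventionally computed: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembled bases. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the authors' scripts or from the S. sitiens assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)  # longest scaffold first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths in bp (hypothetical, not from the S. sitiens assembly).
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))  # prints 70: 80 + 70 = 150, half of the 300 bp total
```

Applied to the real assembly, the input would be the lengths of all 1,483 scaffolds, which could be read from the deposited FASTA file.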
S. sitiens (LA1974) raw sequencing data, transcriptome and genome assemblies have been deposited at the NCBI's Sequence Read Archive, under the BioProject number "PRJNA633104". All the commands and scripts necessary to generate the assembly are available at the following GitHub repository: https://github.com/MCorentin/Solanum_sitiens_assembly.
Supplementary data are available at Bioinformatics online.