Long-read sequencing and de novo genome assembly of Ammopiptanthus nanus, a desert shrub.

Affiliations

College of Life and Environmental Sciences, Minzu University of China, 27 Zhongguancun South Street, Beijing, 100081, China.

Biomarker Technologies Corporation, Floor 8, Shunjie Building, 12 Fuqian Road, Nanfaxin Town, Shunyi District, Beijing, 101300, China.

Publication information

Gigascience. 2018 Jul 1;7(7). doi: 10.1093/gigascience/giy074.

Abstract

BACKGROUND

Ammopiptanthus nanus is a rare broad-leaved shrub found in the desert and arid regions of Central Asia. The species exhibits extremely high tolerance to drought and freezing and has been used in research on abiotic stress tolerance in plants. As a relic of the Tertiary period, A. nanus is of great significance to plant biogeographic research in the ancient Mediterranean region. Here, we report a draft genome assembly, generated on the Pacific Biosciences (PacBio) platform, and gene annotation for A. nanus.

FINDINGS

A total of 64.72 Gb of raw PacBio Sequel reads were generated from four 20-kb libraries. After filtering, 64.53 Gb of clean reads were obtained, giving 72.59× coverage depth. Assembly using Canu gave an assembly length of 823.74 Mb, with a contig N50 of 2.76 Mb. The final size of the assembled A. nanus genome was close to the 889 Mb estimated by k-mer analysis. The gene annotation completeness was evaluated using Benchmarking Universal Single-Copy Orthologs (BUSCO); 1,327 of the 1,440 conserved genes (92.15%) could be found in the A. nanus assembly. Genome annotation revealed that 74.08% of the A. nanus genome is composed of repetitive elements and 53.44% is composed of long terminal repeat elements. We predicted 37,188 protein-coding genes, of which 96.53% were functionally annotated.
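The headline figures above can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not the authors' pipeline. It assumes that coverage depth is clean read bases divided by the k-mer genome size estimate (which reproduces the reported 72.59×), that 1 Gb = 1,000 Mb, and it uses the common definition of contig N50. The function names and the toy contig lengths are hypothetical.

```python
from typing import Iterable


def coverage_depth(read_bases_gb: float, genome_size_mb: float) -> float:
    """Depth = total read bases / genome size (assumes 1 Gb = 1,000 Mb)."""
    return read_bases_gb * 1000.0 / genome_size_mb


def percent(part: float, whole: float) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


def n50(contig_lengths: Iterable[int]) -> int:
    """Length of the contig at which the running total, taken from the
    longest contig downwards, first reaches half of the assembly length."""
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs supplied")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Figures taken from the abstract.
depth = round(coverage_depth(64.53, 889.0), 2)    # clean reads vs. k-mer estimate
busco = round(percent(1327, 1440), 2)             # conserved genes recovered
assembled = round(percent(823.74, 889.0), 2)      # assembly vs. k-mer estimate

# Toy contig lengths (hypothetical) to illustrate the N50 definition.
toy_n50 = n50([2, 3, 4, 5, 6])

print(depth, busco, assembled, toy_n50)
```

Under these assumptions the assembly spans roughly 93% of the k-mer estimate, which is consistent with the statement that the two sizes are close.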

CONCLUSIONS

The genomic sequences of A. nanus could be a valuable resource for comparative genomic analysis in the legume family and will be useful for understanding the phylogenetic relationships of the Thermopsideae and the evolutionary response of plant species to the uplift of the Qinghai-Tibetan Plateau.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b511/6048559/c3fec8039e6d/giy074fig1.jpg