Exploring the transcriptome of non-model oleaginous microalga Dunaliella tertiolecta through high-throughput sequencing and high performance computing.

Author information

Yao Lina, Tan Kenneth Wei Min, Tan Tin Wee, Lee Yuan Kun

Affiliations

Department of Microbiology and Immunology, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117545, Singapore.

Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117596, Singapore.

Publication information

BMC Bioinformatics. 2017 Feb 22;18(1):122. doi: 10.1186/s12859-017-1551-x.

Abstract

BACKGROUND

RNA-Seq technology has received considerable attention in recent years for global transcriptomic profiling of microalgae. It is widely used for transcriptome-wide analysis of gene expression, particularly in microalgal strains with potential as biofuel sources. However, insufficient genomic and transcriptomic information on non-model microalgae has limited the understanding of their regulatory mechanisms and hampered genetic manipulation to enhance biofuel production. Constructing an optimal microalgal transcriptome database is therefore a subject of urgent investigation.

RESULTS

Dunaliella tertiolecta, a non-model oleaginous microalgal species, was sequenced on Illumina MiSeq and HiSeq 4000 platforms in RNA-Seq studies. The high-quality, high-throughput sequencing data were processed using high performance computing (HPC) in a petascale data center, where they were subjected to de novo assembly and a parallelized mpiBLASTX search against multiple species. As a result, a transcriptome database of 17,845 entries was constructed (~95% completeness). This enlarged database supported the RNA-Seq data analysis, as validated by a nitrogen deprivation (ND) study, a treatment that induces triacylglycerol (TAG) production.
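
As a rough illustration of why a parallelized similarity search scales on HPC, the sketch below splits a set of assembled sequences into length-balanced batches, one per worker. This is a generic query-partitioning scheme and not a description of how mpiBLAST distributes work internally, which the abstract does not specify. The names `parse_fasta` and `partition_by_length` are hypothetical helpers introduced only for this example.

```python
from typing import Iterator, List, Tuple

Record = Tuple[str, str]  # (header, sequence); a hypothetical record type


def parse_fasta(text: str) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def partition_by_length(records: List[Record], n_workers: int) -> List[List[Record]]:
    """Greedy longest-first assignment: each sequence goes to the worker
    that currently holds the fewest bases, so batches end up similar in size."""
    if n_workers < 1:
        raise ValueError("n_workers must be at least 1")
    bins: List[List[Record]] = [[] for _ in range(n_workers)]
    loads = [0] * n_workers
    for rec in sorted(records, key=lambda r: len(r[1]), reverse=True):
        i = loads.index(min(loads))  # least-loaded worker so far
        bins[i].append(rec)
        loads[i] += len(rec[1])
    return bins
```

Each batch could then be written to its own file and searched independently, with the per-batch hit tables concatenated afterwards. Balancing by total sequence length rather than by sequence count is an assumption made here, on the grounds that search time tends to grow with query length.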

CONCLUSIONS

The new parallelized assembly and annotation method under HPC presented here allows large-scale data processing problems to be solved in acceptable computation time. The amount of transcriptomic data obtained increased significantly, and there was observable heterogeneity in the performance of identifying differentially expressed genes in the ND treatment paradigm. The results provide new insights into how the response to ND treatment is regulated in microalgae. The ND analyses highlight the advantages of the database generated in this study, which could also serve as a useful resource for future gene manipulation and transcriptome-wide analysis. We thus demonstrate the usefulness of exploring the transcriptome as an informative platform for functional studies and genetic manipulations in similar species.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a761/5322580/bd9208b84131/12859_2017_1551_Fig1_HTML.jpg
