Genome Sequence and Analysis of a Stress-Tolerant, Wild-Derived Strain of Saccharomyces cerevisiae Used in Biofuels Research.

Author information

McIlwain Sean J, Peris David, Sardi Maria, Moskvin Oleg V, Zhan Fujie, Myers Kevin S, Riley Nicholas M, Buzzell Alyssa, Parreiras Lucas S, Ong Irene M, Landick Robert, Coon Joshua J, Gasch Audrey P, Sato Trey K, Hittinger Chris Todd

Affiliations

Department of Energy (DOE) Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Wisconsin 53706.

Department of Energy (DOE) Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Wisconsin 53706; Laboratory of Genetics, University of Wisconsin-Madison, Wisconsin 53706; Genome Center of Wisconsin, University of Wisconsin-Madison, Wisconsin 53706; Wisconsin Energy Institute, J. F. Crow Institute for the Study of Evolution, University of Wisconsin-Madison, Wisconsin 53706.

Publication information

G3 (Bethesda). 2016 Jun 1;6(6):1757-66. doi: 10.1534/g3.116.029389.

Abstract

The genome sequences of more than 100 strains of the yeast Saccharomyces cerevisiae have been published. Unfortunately, most of these genome assemblies contain dozens to hundreds of gaps at repetitive sequences, including transposable elements, tRNAs, and subtelomeric regions, which is where novel genes generally reside. Relatively few strains have been chosen for genome sequencing based on their biofuel production potential, leaving an additional knowledge gap. Here, we describe the nearly complete genome sequence of GLBRCY22-3 (Y22-3), a strain of S. cerevisiae derived from the stress-tolerant wild strain NRRL YB-210 and subsequently engineered for xylose metabolism. After benchmarking several genome assembly approaches, we developed a pipeline to integrate Pacific Biosciences (PacBio) and Illumina sequencing data and achieved one of the highest quality genome assemblies for any S. cerevisiae strain. Specifically, the contig N50 is 693 kbp, and the sequences of most chromosomes, the mitochondrial genome, and the 2-micron plasmid are complete. Our annotation predicts 92 genes that are not present in the reference genome of the laboratory strain S288c, over 70% of which were expressed. We predicted functions for 43 of these genes, 28 of which were previously uncharacterized and unnamed. Remarkably, many of these genes are predicted to be involved in stress tolerance and carbon metabolism and are shared with a Brazilian bioethanol production strain, even though the strains differ dramatically at most genetic loci. The Y22-3 genome sequence provides an exceptionally high-quality resource for basic and applied research in bioenergy and genetics.
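The contig N50 reported above is a standard summary of assembly contiguity: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembled length. The following is a minimal sketch of that calculation, not the authors' pipeline. The function name `contig_n50` and the toy contig lengths are hypothetical and are not taken from the Y22-3 assembly.

```python
def contig_n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk from the longest contig down, accumulating covered bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (illustrative only, not real data).
toy_lengths = [100, 80, 60, 40, 20]
print(contig_n50(toy_lengths))  # prints 80
```

In the toy example the total length is 300, so the two longest contigs (100 + 80 = 180) are the first to cover half of it, and the N50 is 80. A larger N50 generally indicates that more of the assembly sits in long, unbroken contigs.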

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dfde/4889671/7e9febbdb83b/1757f1.jpg
