An improved genome release (version Mt4.0) for the model legume Medicago truncatula.

Authors

Tang Haibao, Krishnakumar Vivek, Bidwell Shelby, Rosen Benjamin, Chan Agnes, Zhou Shiguo, Gentzbittel Laurent, Childs Kevin L, Yandell Mark, Gundlach Heidrun, Mayer Klaus F X, Schwartz David C, Town Christopher D

Affiliation

J. Craig Venter Institute, 9704 Medical Center Drive, Rockville, MD, USA.

Publication information

BMC Genomics. 2014 Apr 27;15:312. doi: 10.1186/1471-2164-15-312.

Abstract

BACKGROUND

Medicago truncatula, a close relative of alfalfa, is a preeminent model for studying nitrogen fixation, symbiosis, and legume genomics. The Medicago sequencing project began in 2003 with the goal of deciphering sequences originating from the euchromatic portion of the genome. The initial sequencing approach was based on a BAC tiling path, culminating in a BAC-based assembly (Mt3.5) and an in-depth analysis of the genome, published in 2011.

RESULTS

Here we describe a further improved and refined version of the M. truncatula genome (Mt4.0), based on de novo whole-genome shotgun assembly of a majority of the Illumina and 454 reads using ALLPATHS-LG. The ALLPATHS-LG scaffolds were anchored onto the pseudomolecules on the basis of alignments to both the optical map and the genotyping-by-sequencing (GBS) map. The Mt4.0 pseudomolecules encompass 360 Mb of actual sequence spanning 390 Mb, of which ~330 Mb align perfectly with the optical map. This is a drastic improvement over the BAC-based Mt3.5, which contained only 70% (250 Mb) of the sequence in the current version. Most of the sequences and genes that previously resided on the unanchored portion of Mt3.5 have now been incorporated into the Mt4.0 pseudomolecules, with the exception of ~28 Mb of unplaced sequence. The genome has also been re-annotated through our gene prediction pipeline, which integrates EST, RNA-seq, protein and gene prediction evidence. A total of 50,894 genes (31,661 high confidence and 19,233 low confidence) are included in Mt4.0; these overlap ~82% of the gene loci annotated in Mt3.5. Of the remaining genes, 14% of the Mt3.5 genes have been deprecated to an "unsupported" status and 4% are absent from the Mt4.0 predictions.
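The figures quoted above can be cross-checked with simple arithmetic. The following is a minimal sketch in Python, not part of the authors' pipeline; the variable names are illustrative, and reading the difference between the 390 Mb span and the 360 Mb of sequence as gap content is an assumption rather than something the abstract states.

```python
# Figures quoted in the abstract (Mb = megabases). Names are illustrative only.
MT40_SEQUENCE_MB = 360        # actual sequence in the Mt4.0 pseudomolecules
MT40_SPAN_MB = 390            # total span of the Mt4.0 pseudomolecules
OPTICAL_MAP_ALIGNED_MB = 330  # approximate amount aligning to the optical map
MT35_SEQUENCE_MB = 250        # sequence in the BAC-based Mt3.5
HIGH_CONFIDENCE = 31_661
LOW_CONFIDENCE = 19_233
TOTAL_GENES = 50_894


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# The two confidence classes should add up to the reported gene total.
gene_sum = HIGH_CONFIDENCE + LOW_CONFIDENCE

# Mt3.5 sequence as a share of Mt4.0 sequence (the abstract rounds this to 70%).
mt35_share = percent(MT35_SEQUENCE_MB, MT40_SEQUENCE_MB)

# Assumption: span minus sequence is taken here to be gaps in the pseudomolecules.
assumed_gap_mb = MT40_SPAN_MB - MT40_SEQUENCE_MB

# Share of Mt4.0 sequence that aligns to the optical map.
optical_share = percent(OPTICAL_MAP_ALIGNED_MB, MT40_SEQUENCE_MB)

# Fates of Mt3.5 gene loci: overlapped, "unsupported", absent (percentages).
mt35_fates = 82 + 14 + 4

print(f"gene total consistent: {gene_sum == TOTAL_GENES}")
print(f"Mt3.5 / Mt4.0 sequence: {mt35_share:.1f}%")
print(f"assumed gap content: {assumed_gap_mb} Mb")
print(f"optical-map-aligned share: {optical_share:.1f}%")
print(f"Mt3.5 loci accounted for: {mt35_fates}%")
```

The gene counts sum exactly to the reported total, and the three Mt3.5 categories account for all of the Mt3.5 loci; the 70% figure is a rounding of roughly 69%.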

CONCLUSIONS

Mt4.0 and its associated resources, such as genome browsers, BLAST-able datasets and gene information pages, can be found on the JCVI Medicago web site (http://www.jcvi.org/medicago). The assembly and annotation have been deposited in GenBank (BioProject: PRJNA10791). The heavily curated chromosomal sequences and associated gene models of Medicago will serve as a better reference for legume biology and comparative genomics.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8353/4234490/8b95fada7a9d/1471-2164-15-312-1.jpg
