Low-pass shotgun sequencing of the barley genome facilitates rapid identification of genes, conserved non-coding sequences and novel repeats.

Authors

Wicker Thomas, Narechania Apurva, Sabot Francois, Stein Joshua, Vu Giang T H, Graner Andreas, Ware Doreen, Stein Nils

Affiliation

Institute of Plant Biology, University Zurich, Zollikerstrasse 107, 8008 Zurich, Switzerland.

Publication information

BMC Genomics. 2008 Oct 31;9:518. doi: 10.1186/1471-2164-9-518.

Abstract

BACKGROUND

Barley has one of the largest and most complex genomes of all economically important food crops. The rise of new short-read sequencing technologies such as Illumina/Solexa permits such large genomes to be sampled effectively at relatively low cost. From the resulting sequence reads, a Mathematically Defined Repeat (MDR) index can be generated to map repetitive regions in genomic sequences.
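To make the idea of an MDR index concrete, the following is a minimal sketch, not the authors' pipeline. It assumes the index is simply a table of how often each k-mer (substring of length k) occurs in the sampled reads, and that a genomic sequence is scored by looking up the k-mer starting at each position. The function names, the toy reads, and the choice of k = 4 are all hypothetical; a real index would use a much longer k, would also count reverse complements, and would need a disk-backed data structure.

```python
from collections import Counter

def build_kmer_index(reads, k):
    """Count every k-mer (length-k substring) across all reads."""
    index = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            index[read[i:i + k]] += 1
    return index

def mdr_profile(sequence, index, k):
    """For each k-mer start position in `sequence`, look up its count in the index."""
    return [index.get(sequence[i:i + k], 0) for i in range(len(sequence) - k + 1)]

# Toy reads invented for illustration: the motif "ACGT" recurs, "GGCA" is rare.
reads = ["ACGTACGTAA", "TTACGTACGT", "GGCATTCAGG"]
index = build_kmer_index(reads, k=4)

# Score a query sequence: high counts suggest repeats, low counts suggest low-copy DNA.
profile = mdr_profile("ACGTACGTGGCA", index, k=4)
print(profile)
```

Plotting such a profile along a sequence gives the kind of per-position repeat map the text describes: peaks mark k-mers that are frequent in the read sample.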

RESULTS

We generated 574 Mbp of Illumina/Solexa sequences from barley total genomic DNA, representing about 10% of a genome equivalent. From these sequences we generated an MDR index, which was then used to identify and mark repetitive regions in the barley genome. Comparison of the MDR plots with expert repeat annotation, which draws on the information already available for known repetitive elements, revealed a significant correspondence between the two methods. MDR-based annotation nevertheless identified dozens of novel repeat sequences that were not recognised by manual annotation. The MDR data were also used to identify gene-containing regions by masking repetitive sequences in eight de novo sequenced bacterial artificial chromosome (BAC) clones. Gene sequences could indeed be identified in half of the candidate gene islands. MDR data were of only limited use when mapped onto genomic sequences from the closely related species Triticum monococcum, as only a fraction of the repetitive sequences was recognised.
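The masking step described above can be illustrated with a short sketch. It assumes a per-position k-mer count profile is already available for the sequence, masks every base covered by a k-mer whose count reaches a threshold, and reports the remaining unmasked stretches as candidate low-copy islands. The threshold, the minimum island length, the toy sequence, and the profile values are hypothetical and chosen only for illustration; the abstract does not state the cutoffs the authors used.

```python
def mask_repeats(sequence, profile, k, threshold):
    """Replace with 'N' every base covered by a k-mer whose count is >= threshold."""
    masked = list(sequence)
    for start, count in enumerate(profile):
        if count >= threshold:
            for j in range(start, start + k):
                masked[j] = "N"
    return "".join(masked)

def low_copy_islands(masked, min_length):
    """Return half-open (start, end) intervals of unmasked runs of at least min_length."""
    islands = []
    start = None
    for i, base in enumerate(masked + "N"):  # trailing sentinel closes a final run
        if base != "N" and start is None:
            start = i
        elif base == "N" and start is not None:
            if i - start >= min_length:
                islands.append((start, i))
            start = None
    return islands

# Hypothetical input: one count per k-mer start position (len(sequence) - k + 1 values).
sequence = "ACGTACGTGGCATTCA"
profile = [4, 3, 2, 3, 4, 0, 0, 0, 1, 1, 1, 1, 1]

masked = mask_repeats(sequence, profile, k=4, threshold=2)
islands = low_copy_islands(masked, min_length=5)
print(masked)
print(islands)
```

In this toy case the first eight bases are masked and the remainder is reported as a single island. In practice each island would still need to be checked for genes, consistent with the finding that only half of the candidate islands contained identifiable gene sequences.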

CONCLUSION

An MDR index for barley, obtained by whole-genome Illumina/Solexa sequencing, proved as efficient in repeat identification as manual expert annotation. By circumventing the labour-intensive step of producing a specific repeat library for expert annotation, an MDR index provides an elegant and efficient resource for identifying repetitive and low-copy (i.e. potentially gene-containing) regions in uncharacterised genomic sequences. The restriction that a particular MDR index cannot be used across species is outweighed by the low cost of Illumina/Solexa sequencing, which makes any chosen genome accessible for whole-genome sequence sampling.

