Improving the ostrich genome assembly using optical mapping data.

Author information

Zhang Jilin, Li Cai, Zhou Qi, Zhang Guojie

Affiliations

China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China.

China National GeneBank, BGI-Shenzhen, Shenzhen, 518083 China; Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Øster Voldgade 5-7, 1350 Copenhagen, Denmark.

Publication information

Gigascience. 2015 May 12;4:24. doi: 10.1186/s13742-015-0062-9. eCollection 2015.

Abstract

BACKGROUND

The ostrich (Struthio camelus) is the tallest and heaviest living bird. Ostrich meat is considered a healthy red meat, with an annual worldwide production ranging from 12,000 to 15,000 tons. As part of the avian phylogenomics project, we sequenced the ostrich genome for phylogenetic and comparative genomics analyses. The initial Illumina-based assembly of this genome had a scaffold N50 of 3.59 Mb and a total size of 1.23 Gb. Since longer scaffolds are critical for many genomic analyses, particularly for chromosome-level comparative analysis, we generated optical mapping (OM) data to obtain an improved assembly. The OM technique is a non-PCR-based method to generate genome-wide restriction enzyme maps, which improves the quality of de novo genome assembly.

FINDINGS

To generate OM data, we digested the ostrich genome with KpnI, which yielded 1.99 million DNA molecules (>250 kb) covering the genome at least 500×. The restriction patterns of these molecules were then assembled and aligned with the Illumina-based assembly to achieve sequence extension. This resulted in an OM assembly with a scaffold N50 of 17.71 Mb, about 5 times that of the initial assembly (3.59 Mb). The number of scaffolds covering 90% of the genome was reduced from 414 to 75, which corresponds to an average of ~3 super-scaffolds per chromosome. Upon integrating the OM data with previously published FISH (fluorescence in situ hybridization) markers, we recovered the full PAR (pseudoautosomal region) on the ostrich Z chromosome with 4 super-scaffolds, as well as most of the degenerated regions.

CONCLUSIONS

The OM data significantly improved the assembled scaffolds of the ostrich genome and facilitated chromosome evolution studies in birds. Similar strategies can be applied to other genome sequencing projects to obtain better assemblies.
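The abstract reports two contiguity statistics: scaffold N50 and the number of scaffolds covering 90% of the genome. As a minimal sketch of how such statistics are commonly computed (not the authors' pipeline), the following Python uses a hypothetical helper, `nxx`, and invented toy scaffold lengths. Only the N50 values of 3.59 Mb and 17.71 Mb come from the abstract.

```python
def nxx(lengths, fraction=0.5):
    """Return (Nxx, Lxx) for a list of scaffold lengths.

    Nxx is the length of the scaffold at which the running total of
    lengths, sorted from longest to shortest, first reaches `fraction`
    of the assembly size. Lxx is how many scaffolds that took.
    Hypothetical helper written for illustration only.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    threshold = fraction * sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count


# Toy scaffold lengths (arbitrary units), invented for this example.
toy_lengths = [50, 40, 30, 20, 10]

n50, l50 = nxx(toy_lengths, 0.5)   # contiguity at 50% of the assembly
n90, l90 = nxx(toy_lengths, 0.9)   # scaffolds needed to cover 90%

# Fold change in scaffold N50 using the two values quoted in the abstract.
fold_change = 17.71 / 3.59

print(n50, l50, n90, l90, round(fold_change, 2))
```

With real data, `lengths` would hold the scaffold lengths of an assembly. The second value returned for `fraction=0.9` plays the same role as the 414 and 75 scaffold counts quoted above.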

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df43/4427950/2bdfc7f5b653/13742_2015_62_Fig1_HTML.jpg
