Optimizing hybrid assembly of next-generation sequence data from Enterococcus faecium: a microbe with highly divergent genome.

Author information

Wang Yajun, Yu Yao, Pan Bohu, Hao Pei, Li Yixue, Shao Zhifeng, Xu Xiaogang, Li Xuan

Affiliation

Shanghai Center for Systems Biomedicine, Shanghai Jiaotong University, Shanghai 200240, China.

Publication information

BMC Syst Biol. 2012;6 Suppl 3(Suppl 3):S21. doi: 10.1186/1752-0509-6-S3-S21. Epub 2012 Dec 17.

DOI:10.1186/1752-0509-6-S3-S21
PMID:23282199
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3524012/
Abstract

BACKGROUND

Sequencing of bacterial genomes has become an essential approach for studying pathogen virulence and the phylogenetic relationships among closely related strains. The bacterium Enterococcus faecium has emerged as an important nosocomial pathogen that is often associated with resistance to common antibiotics in hospitals. Its highly divergent gene content presents a challenge to next-generation sequencing (NGS) technologies, which feature high throughput but short read lengths. This study was designed to investigate the properties and systematic biases of NGS technologies and to evaluate the critical parameters that influence the outcome of hybrid assemblies built from combinations of NGS data.

RESULTS

A hospital strain of E. faecium was sequenced on three NGS platforms, 454 GS-FLX, Illumina GAIIx, and ABI SOLiD4.0, to approximately 28-, 500-, and 400-fold coverage depth, respectively. We built a pipeline that merged the contigs from each NGS dataset into hybrid assemblies. The results revealed that each single-platform assembly had a ceiling in continuity that could not be overcome simply by increasing coverage depth. Each NGS technology displayed intrinsic properties, such as base-calling errors and systematic bias. The gaps and low-coverage regions of each NGS assembly were associated with lower GC content. To optimize the hybrid assembly approach, we tested varying amounts and different combinations of NGS data and obtained optimal conditions for assembly continuity. We also showed, for the first time, that SOLiD data can substantially improve assemblies of the E. faecium genome when combined with other types of NGS data in the hybrid approach.
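The quantities discussed in this section (fold coverage, assembly continuity, and GC content) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names are hypothetical, N50 is used here as one common continuity measure (the abstract does not say which metric the study used), and the 3 Mb genome size in the usage note is an assumed round figure chosen only to make the arithmetic easy to follow.

```python
from typing import Iterable


def fold_coverage(total_bases: int, genome_size: int) -> float:
    """Average coverage depth: sequenced bases divided by genome length."""
    return total_bases / genome_size


def n50(contig_lengths: Iterable[int]) -> int:
    """Length of the contig at which the running total, taken from the
    longest contig downward, first reaches half of the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty assembly


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt
```

For example, under the assumed 3 Mb genome, `fold_coverage(84_000_000, 3_000_000)` corresponds to 28-fold depth. Applying `gc_content` to the sequence flanking each assembly gap is one simple way to check for the kind of association between gaps and lower GC content that the abstract reports.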

CONCLUSIONS

The current study addressed the difficult question of how to construct a complete microbial genome most effectively with today's state-of-the-art sequencing technologies. We characterized the sequence data and genome assembly from each NGS technology, tested conditions for hybrid assembly with combinations of NGS data, and obtained optimized parameters for achieving the most cost-efficient assembly. Our study helps form guidelines to direct genomic work on other microorganisms and thus has important practical implications.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e95/3524012/7bc3857c8b20/1752-0509-6-S3-S21-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e95/3524012/c8390412878f/1752-0509-6-S3-S21-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e95/3524012/ecfd81823063/1752-0509-6-S3-S21-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e95/3524012/f3c17a945515/1752-0509-6-S3-S21-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e95/3524012/ade4ee2ae524/1752-0509-6-S3-S21-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e95/3524012/9717a3b442af/1752-0509-6-S3-S21-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e95/3524012/c2b73f224d39/1752-0509-6-S3-S21-7.jpg

