SIS: a program to generate draft genome sequence scaffolds for prokaryotes.

Affiliation

Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, Brazil.

Publication information

BMC Bioinformatics. 2012 May 14;13:96. doi: 10.1186/1471-2105-13-96.

DOI:10.1186/1471-2105-13-96
PMID:22583530
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3674793/
Abstract

BACKGROUND

Decreasing costs of DNA sequencing have made prokaryotic draft genome sequences increasingly common. A contig scaffold is an ordering of contigs in the correct orientation. A scaffold can help genome comparisons and guide gap closure efforts. One popular technique for obtaining contig scaffolds is to map contigs onto a reference genome. However, rearrangements that may exist between the query and reference genomes may result in incorrect scaffolds, if these rearrangements are not taken into account. Large-scale inversions are common rearrangement events in prokaryotic genomes. Even in draft genomes it is possible to detect the presence of inversions given sufficient sequencing coverage and a sufficiently close reference genome.
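
The reference-mapping technique mentioned above can be pictured with a minimal sketch. It assumes each contig has already been aligned to the reference and reduced to a single best hit; the `Match` record and the `naive_scaffold` function are hypothetical names used only for illustration and are not part of sis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One contig-to-reference hit (hypothetical record layout)."""
    contig: str     # contig identifier
    ref_start: int  # start coordinate of the hit on the reference
    strand: str     # "+" if the contig matches the forward strand, "-" otherwise


def naive_scaffold(matches):
    """Order contigs by reference position and orient them by hit strand.

    This is the plain reference-mapping approach: it assumes the query
    genome has the same layout as the reference.
    """
    ordered = sorted(matches, key=lambda m: m.ref_start)
    return [(m.contig, m.strand) for m in ordered]


if __name__ == "__main__":
    hits = [Match("c3", 900, "+"), Match("c1", 10, "+"), Match("c2", 400, "-")]
    print(naive_scaffold(hits))
```

Because this ordering trusts the reference layout completely, an inversion between the query and reference genomes would be copied into the scaffold, which is the failure mode described in the paragraph above.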

RESULTS

We present a linear-time algorithm that can generate a set of contig scaffolds for a draft genome sequence represented in contigs given a reference genome. The algorithm is aimed at prokaryotic genomes and relies on the presence of matching sequence patterns between the query and reference genomes that can be interpreted as the result of large-scale inversions; we call these patterns inversion signatures. Our algorithm is capable of correctly generating a scaffold if at least one member of every inversion signature pair is present in contigs and no inversion signatures have been overwritten in evolution. The algorithm is also capable of generating scaffolds in the presence of any kind of inversion, even though in this general case there is no guarantee that all scaffolds in the scaffold set will be correct. We compare the performance of sis, the program that implements the algorithm, to seven other scaffold-generating programs. The results of our tests show that sis has overall better performance.
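
One simple way to look for evidence of an inversion in a draft assembly is to flag contigs whose alignments to the reference fall on both strands, since such a contig may span an inversion breakpoint. The sketch below is only an illustration under that assumption. It is not the paper's definition of an inversion signature, nor the algorithm implemented in sis, and the hit layout `(contig, ref_start, strand)` and the function name are hypothetical.

```python
from collections import defaultdict


def mixed_strand_contigs(hits):
    """Return the names of contigs that have hits on both strands.

    `hits` is an iterable of (contig, ref_start, strand) tuples, where
    strand is "+" or "-". This tuple layout is an assumption made for
    this example. The result is sorted by contig name.
    """
    strands = defaultdict(set)
    for contig, _ref_start, strand in hits:
        strands[contig].add(strand)
    return sorted(c for c, seen in strands.items() if seen == {"+", "-"})


if __name__ == "__main__":
    hits = [
        ("c1", 100, "+"),
        ("c1", 5000, "-"),  # same contig, opposite strand: candidate breakpoint
        ("c2", 900, "+"),
        ("c2", 1200, "+"),
    ]
    print(mixed_strand_contigs(hits))
```

A single pass over the hits is enough here, so the check runs in time linear in the number of hits (plus the final sort of the flagged names). Contigs flagged this way would still need inspection, because repeats and misassemblies can also produce mixed-strand hits.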

CONCLUSIONS

sis is a new easy-to-use tool to generate contig scaffolds, available both as stand-alone and as a web server. The good performance of sis in our tests adds evidence that large-scale inversions are widespread in prokaryotic genomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/50ac06225965/1471-2105-13-96-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/2e213f12a9d6/1471-2105-13-96-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/12fd5aa02819/1471-2105-13-96-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/a4dca27f3cbc/1471-2105-13-96-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/765550b8f91b/1471-2105-13-96-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/868a522aaee1/1471-2105-13-96-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/5275f596f1c8/1471-2105-13-96-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/10450a79baec/1471-2105-13-96-8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c017/3674793/29ff3533de7c/1471-2105-13-96-9.jpg

Similar articles

1
SIS: a program to generate draft genome sequence scaffolds for prokaryotes.
BMC Bioinformatics. 2012 May 14;13:96. doi: 10.1186/1471-2105-13-96.
2
CAR: contig assembly of prokaryotic draft genomes using rearrangements.
BMC Bioinformatics. 2014 Nov 28;15(1):381. doi: 10.1186/s12859-014-0381-3.
3
Multi-CAR: a tool of contig scaffolding using multiple references.
BMC Bioinformatics. 2016 Dec 23;17(Suppl 17):469. doi: 10.1186/s12859-016-1328-7.
4
Assembling contigs in draft genomes using reversals and block-interchanges.
BMC Bioinformatics. 2013;14 Suppl 5(Suppl 5):S9. doi: 10.1186/1471-2105-14-S5-S9. Epub 2013 Apr 10.
5
CSAR-web: a web server of contig scaffolding using algebraic rearrangements.
Nucleic Acids Res. 2018 Jul 2;46(W1):W55-W59. doi: 10.1093/nar/gky337.
6
A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.
Nat Protoc. 2012 Jun 7;7(7):1260-84. doi: 10.1038/nprot.2012.068.
7
Mapping contigs using CONTIGuator.
Methods Mol Biol. 2015;1231:163-76. doi: 10.1007/978-1-4939-1720-4_11.
8
CSA: A high-throughput chromosome-scale assembly pipeline for vertebrate genomes.
Gigascience. 2020 May 1;9(5). doi: 10.1093/gigascience/giaa034.
9
Multi-CSAR: a multiple reference-based contig scaffolder using algebraic rearrangements.
BMC Syst Biol. 2018 Dec 31;12(Suppl 9):139. doi: 10.1186/s12918-018-0654-y.
10
Construction of Whole Genomes from Scaffolds Using Single Cell Strand-Seq Data.
Int J Mol Sci. 2021 Mar 31;22(7):3617. doi: 10.3390/ijms22073617.

Cited by

1
Draft Genomic Sequences of Streptomyces misionensis ACT66 and Streptomyces albidoflavus ACT77, Bacteria with Potential Application for Phytopathogen Biocontrol.
Microbiol Resour Announc. 2019 Sep 5;8(36):e00949-19. doi: 10.1128/MRA.00949-19.
2
Rapid genetic and phenotypic changes in Pseudomonas aeruginosa clinical strains during ventilator-associated pneumonia.
Sci Rep. 2019 Mar 18;9(1):4720. doi: 10.1038/s41598-019-41201-5.
3
Multi-CSAR: a multiple reference-based contig scaffolder using algebraic rearrangements.
BMC Syst Biol. 2018 Dec 31;12(Suppl 9):139. doi: 10.1186/s12918-018-0654-y.
4
Draft Genome Sequences of the 1,2-Dichloropropane-Respiring Dehalococcoides mccartyi Strains RC and KS.
Microbiol Resour Announc. 2018 Sep 13;7(10). doi: 10.1128/MRA.01081-18. eCollection 2018 Sep.
5
The genome sequence of Dyella jiangningensis FCAV SCS01 from a lignocellulose-decomposing microbial consortium metagenome reveals potential for biotechnological applications.
Genet Mol Biol. 2018;41(2):507-513. doi: 10.1590/1678-4685-GMB-2017-0155. Epub 2018 May 14.
6
Phylogenetic signal from rearrangements in 18 Anopheles species by joint scaffolding extant and ancestral genomes.
BMC Genomics. 2018 May 9;19(Suppl 2):96. doi: 10.1186/s12864-018-4466-7.
7
CSAR-web: a web server of contig scaffolding using algebraic rearrangements.
Nucleic Acids Res. 2018 Jul 2;46(W1):W55-W59. doi: 10.1093/nar/gky337.
8
Closed Genome Sequence of Phytopathogen Biocontrol Agent Strain AGVL-005, Isolated from Soybean.
Genome Announc. 2018 Feb 15;6(7):e00057-18. doi: 10.1128/genomeA.00057-18.
9
Genomic repeats, misassembly and reannotation: a case study with long-read resequencing of Porphyromonas gingivalis reference strains.
BMC Genomics. 2018 Jan 16;19(1):54. doi: 10.1186/s12864-017-4429-4.
10
Approaches for in silico finishing of microbial genome sequences.
Genet Mol Biol. 2017;40(3):553-576. doi: 10.1590/1678-4685-GMB-2016-0230.

References

1
Opera: reconstructing optimal genomic scaffolds with high-throughput paired-end sequences.
J Comput Biol. 2011 Nov;18(11):1681-91. doi: 10.1089/cmb.2011.0170. Epub 2011 Sep 19.
2
CONTIGuator: a bacterial genomes finishing tool for structural insights on draft genomes.
Source Code Biol Med. 2011 Jun 21;6:11. doi: 10.1186/1751-0473-6-11.
3
Scaffold filling, contig fusion and comparative gene order inference.
BMC Bioinformatics. 2010 Jun 4;11:304. doi: 10.1186/1471-2105-11-304.
4
r2cat: synteny plots and comparative assembly.
Bioinformatics. 2010 Feb 15;26(4):570-1. doi: 10.1093/bioinformatics/btp690. Epub 2009 Dec 16.
5
BLAST+: architecture and applications.
BMC Bioinformatics. 2009 Dec 15;10:421. doi: 10.1186/1471-2105-10-421.
6
PGA4genomics for comparative genome assembly based on genetic algorithm optimization.
Genomics. 2009 Oct;94(4):284-6. doi: 10.1016/j.ygeno.2009.06.006. Epub 2009 Jun 30.
7
Reordering contigs of draft genomes using the Mauve aligner.
Bioinformatics. 2009 Aug 15;25(16):2071-3. doi: 10.1093/bioinformatics/btp356. Epub 2009 Jun 10.
8
ABACAS: algorithm-based automatic contiguation of assembled sequences.
Bioinformatics. 2009 Aug 1;25(15):1968-9. doi: 10.1093/bioinformatics/btp347. Epub 2009 Jun 3.
9
Inversion-based genomic signatures.
BMC Bioinformatics. 2009 Jan 30;10 Suppl 1(Suppl 1):S7. doi: 10.1186/1471-2105-10-S1-S7.
10
A genomic distance based on MUM indicates discontinuity between most bacterial species and genera.
J Bacteriol. 2009 Jan;191(1):91-9. doi: 10.1128/JB.01202-08. Epub 2008 Oct 31.