Scaffold filling, contig fusion and comparative gene order inference.

Affiliation

Department of Mathematics and Statistics, University of Ottawa, Ottawa, K1N 6N5, Canada.

Publication information

BMC Bioinformatics. 2010 Jun 4;11:304. doi: 10.1186/1471-2105-11-304.

DOI: 10.1186/1471-2105-11-304
PMID: 20525342
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2902449/

Abstract

BACKGROUND

There has been a trend in increasing the phylogenetic scope of genome sequencing without finishing the sequence of the genome. Increasing numbers of genomes are being published in scaffold or contig form. Rearrangement algorithms, however, including gene order-based phylogenetic tools, require whole genome data on gene order or syntenic block order. How then can we use rearrangement algorithms to compare genomes available in scaffold form only? Can the comparative evidence predict the location of unsequenced genes?

RESULTS

Our method involves optimally filling in genes missing from the scaffolds, while incorporating the augmented scaffolds directly into the rearrangement algorithms as if they were chromosomes. This is accomplished by an exact, polynomial-time algorithm. We then correct for the number of extra fusion/fission operations required to make scaffolds comparable to full assemblies. We model the relationship between the ratio of missing genes actually absent from the genome versus merely unsequenced ones, on one hand, and the increase of genomic distance after scaffold filling, on the other. We estimate the parameters of this model through simulations and by comparing the angiosperm genomes Ricinus communis and Vitis vinifera.
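As a rough illustration of what "filling in genes missing from the scaffolds" means, the sketch below treats each scaffold as a chromosome, finds the genes present in a reference genome but absent from the scaffolds, and places one missing gene by brute force so that the number of adjacencies shared with the reference is maximized. This is not the exact polynomial-time algorithm described in the abstract, and shared adjacencies are only a stand-in for the genomic distance the authors use. All names (`missing_genes`, `adjacencies`, `best_insertion`) and the toy genomes are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# A genome is a list of chromosomes or scaffolds; each is a list of signed
# gene identifiers, where the sign encodes orientation (assumed encoding).
Genome = List[List[int]]
Extremity = Tuple[int, str]  # (gene, 'h' for head or 't' for tail)


def gene_set(genome: Genome) -> Set[int]:
    """Return the unsigned gene identifiers present in a genome."""
    return {abs(g) for chrom in genome for g in chrom}


def missing_genes(scaffolds: Genome, reference: Genome) -> Set[int]:
    """Genes found in the reference but not in any scaffold."""
    return gene_set(reference) - gene_set(scaffolds)


def adjacencies(genome: Genome) -> Set[FrozenSet[Extremity]]:
    """Unordered pairs of gene extremities that are neighbours."""
    adj: Set[FrozenSet[Extremity]] = set()
    for chrom in genome:
        for a, b in zip(chrom, chrom[1:]):
            right_end_of_a = (abs(a), "h" if a > 0 else "t")
            left_end_of_b = (abs(b), "t" if b > 0 else "h")
            adj.add(frozenset((right_end_of_a, left_end_of_b)))
    return adj


def shared_adjacencies(g1: Genome, g2: Genome) -> int:
    """Number of adjacencies the two genomes have in common."""
    return len(adjacencies(g1) & adjacencies(g2))


def best_insertion(scaffolds: Genome, reference: Genome, gene: int):
    """Try every scaffold, position and orientation for one missing gene.

    Returns the filled scaffolds with the most adjacencies shared with the
    reference, and that score. Ties keep the first candidate found.
    """
    best, best_score = None, -1
    for i, scaffold in enumerate(scaffolds):
        for pos in range(len(scaffold) + 1):
            for signed in (gene, -gene):
                candidate = [list(s) for s in scaffolds]
                candidate[i].insert(pos, signed)
                score = shared_adjacencies(candidate, reference)
                if score > best_score:
                    best, best_score = candidate, score
    return best, best_score


# Toy data: one finished chromosome versus two scaffolds lacking gene 3.
reference = [[1, 2, 3, 4, 5]]
scaffolds = [[1, 2], [4, 5]]

absent = missing_genes(scaffolds, reference)
filled, score = best_insertion(scaffolds, reference, 3)
print(absent, filled, score)
```

In this toy case the filled scaffolds still form two pieces where the reference has one chromosome, which is the kind of leftover fusion/fission difference that the correction step in the abstract accounts for.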

CONCLUSIONS

The algorithm solves the comparison of genomes with 18,300 genes, including 4,500 missing from one genome, in less than a minute on a MacBook, putting virtually all genomes within range of the method.
