Ultra-fast genome comparison for large-scale genomic experiments.

Affiliations

Computer Architecture Department, University of Málaga - Instituto de Investigación Biomédica de Málaga-IBIMA, Málaga, Spain.

Publication information

Sci Rep. 2019 Jul 16;9(1):10274. doi: 10.1038/s41598-019-46773-w.

DOI:10.1038/s41598-019-46773-w
PMID:31312019
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6635410/
Abstract

In the last decade, a technological shift in the bioinformatics field has occurred: larger genomes can now be sequenced quickly and cost-effectively, resulting in the computational need to efficiently compare large and abundant sequences. Furthermore, detecting conserved similarities across large collections of genomes remains a problem. The size of chromosomes, along with the substantial amount of noise and number of repeats found in DNA sequences (particularly in mammals and plants), leads to a scenario where executing and waiting for complete outputs is both time- and resource-consuming. Filtering steps, manual examination and annotation, very long execution times and a high demand for computational resources represent a few of the many difficulties faced in large genome comparisons. In this work, we provide a method designed for comparisons of considerable amounts of very long sequences that employs a heuristic algorithm capable of separating noise and repeats from conserved fragments in pairwise genomic comparisons. We provide a software implementation that runs in linear time, needs as little as one core, and has a small, constant memory footprint. The method produces both a previsualization of the comparison and a collection of indices to drastically reduce computational complexity when performing exhaustive comparisons. Last, the method scores the comparison to automate classification of sequences and produces a list of detected synteny blocks to enable new evolutionary studies.
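
The abstract does not spell out the heuristic, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes that "noise and repeats" can be approximated by discarding k-mers that occur more than once in a sequence, and that the "previsualization" can be approximated by a fixed-size hit matrix. The function names `unique_kmers` and `hit_matrix` and the parameters `k` and `dim` are hypothetical. Note that in this toy version only the matrix has a fixed size; the k-mer dictionaries still grow with the input.

```python
def unique_kmers(seq, k):
    """Map each k-mer that occurs exactly once in seq to its start position."""
    seen = {}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # Mark repeated k-mers with -1 so they can be dropped afterwards.
        seen[kmer] = i if kmer not in seen else -1
    return {kmer: pos for kmer, pos in seen.items() if pos >= 0}


def hit_matrix(ref, query, k=8, dim=10):
    """Count shared unique k-mers in a dim x dim grid (a coarse dot plot).

    Assumption for illustration: a k-mer is kept only if it is unique in
    both sequences, which is a crude stand-in for repeat filtering.
    """
    ref_idx = unique_kmers(ref, k)
    qry_idx = unique_kmers(query, k)
    matrix = [[0] * dim for _ in range(dim)]
    for kmer, qpos in qry_idx.items():
        rpos = ref_idx.get(kmer)
        if rpos is not None:
            # Scale sequence coordinates down to the fixed grid.
            row = rpos * dim // len(ref)
            col = qpos * dim // len(query)
            matrix[row][col] += 1
    return matrix


seq = "AAAACCCCGGGGTTTT"
for row in hit_matrix(seq, seq, k=4, dim=4):
    print(row)
# [4, 0, 0, 0]
# [0, 4, 0, 0]
# [0, 0, 4, 0]
# [0, 0, 0, 1]
```

Comparing a sequence with itself puts every hit on the main diagonal, which is how a conserved, collinear region would appear in such a plot. A purely repetitive query such as `"ACGTACGTACGTACGT"` contributes no hits at all, because none of its 4-mers is unique. Each sequence is scanned once, so the running time of this sketch grows linearly with the input length.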


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a05b/6635410/6f55b8146217/41598_2019_46773_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a05b/6635410/7873bb6721c2/41598_2019_46773_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a05b/6635410/8b62337294ce/41598_2019_46773_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a05b/6635410/9617b8a42f13/41598_2019_46773_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a05b/6635410/6a20b6afa384/41598_2019_46773_Fig5_HTML.jpg

Similar articles

1
Ultra-fast genome comparison for large-scale genomic experiments.
Sci Rep. 2019 Jul 16;9(1):10274. doi: 10.1038/s41598-019-46773-w.
2
Breaking the computational barriers of pairwise genome comparison.
BMC Bioinformatics. 2015 Aug 11;16(1):250. doi: 10.1186/s12859-015-0679-9.
3
Screening synteny blocks in pairwise genome comparisons through integer programming.
BMC Bioinformatics. 2011 Apr 18;12:102. doi: 10.1186/1471-2105-12-102.
4
DiagHunter and GenoPix2D: programs for genomic comparisons, large-scale homology discovery and visualization.
Genome Biol. 2003;4(10):R68. doi: 10.1186/gb-2003-4-10-r68. Epub 2003 Sep 19.
5
DRIMM-Synteny: decomposing genomes into evolutionary conserved segments.
Bioinformatics. 2010 Oct 15;26(20):2509-16. doi: 10.1093/bioinformatics/btq465. Epub 2010 Aug 24.
6
SynChro: a fast and easy tool to reconstruct and visualize synteny blocks along eukaryotic chromosomes.
PLoS One. 2014 Mar 20;9(3):e92621. doi: 10.1371/journal.pone.0092621. eCollection 2014.
7
i-ADHoRe 3.0--fast and sensitive detection of genomic homology in extremely large data sets.
Nucleic Acids Res. 2012 Jan;40(2):e11. doi: 10.1093/nar/gkr955. Epub 2011 Nov 18.
8
mySyntenyPortal: an application package to construct websites for synteny block analysis.
BMC Bioinformatics. 2018 Jun 5;19(1):216. doi: 10.1186/s12859-018-2219-x.
9
Unraveling Genome Evolution Throughout Visual Analysis: The XCout Portal.
Bioinform Biol Insights. 2021 Jun 8;15:11779322211021422. doi: 10.1177/11779322211021422. eCollection 2021.
10
SyDiG: uncovering Synteny in Distant Genomes.
Int J Bioinform Res Appl. 2011;7(1):43-62. doi: 10.1504/IJBRA.2011.039169.

Cited by

1
Chromosome-scale assemblies of three Ormosia species: repetitive sequences distribution and structural rearrangement.
Gigascience. 2025 Jan 6;14. doi: 10.1093/gigascience/giaf047.
2
A telomere-to-telomere genome assembly coupled with multi-omic data provides insights into the evolution of hexaploid bread wheat.
Nat Genet. 2025 Apr;57(4):1008-1020. doi: 10.1038/s41588-025-02137-x. Epub 2025 Apr 7.
3
Dispensable genome and segmental duplications drive the genome plasticity in .
Front Fungal Biol. 2025 Feb 5;6:1432339. doi: 10.3389/ffunb.2025.1432339. eCollection 2025.
4
Genome assembly of a diversity panel of Chenopodium quinoa.
Sci Data. 2024 Dec 18;11(1):1366. doi: 10.1038/s41597-024-04200-4.
5
Chromosome-scale pearl millet genomes reveal CLAMT1b as key determinant of strigolactone pattern and Striga susceptibility.
Nat Commun. 2024 Aug 12;15(1):6906. doi: 10.1038/s41467-024-51189-w.
6
Chromosome-scale assembly of the wild wheat relative Aegilops umbellulata.
Sci Data. 2023 Oct 25;10(1):739. doi: 10.1038/s41597-023-02658-2.
7
Einkorn genomics sheds light on history of the oldest domesticated wheat.
Nature. 2023 Aug;620(7975):830-838. doi: 10.1038/s41586-023-06389-7. Epub 2023 Aug 2.
8
Population genomics reveals mechanisms and dynamics of de novo expressed open reading frame emergence in .
Genome Res. 2023 Jun;33(6):872-890. doi: 10.1101/gr.277482.122. Epub 2023 Jul 13.
9
Multi-Omics Pipeline and Omics-Integration Approach to Decipher Plant's Abiotic Stress Tolerance Responses.
Genes (Basel). 2023 Jun 16;14(6):1281. doi: 10.3390/genes14061281.
10
Insights from a chum salmon (Oncorhynchus keta) genome assembly regarding whole-genome duplication and nucleotide variation influencing gene function.
G3 (Bethesda). 2023 Aug 9;13(8). doi: 10.1093/g3journal/jkad127.

References

1
Ensembl 2018.
Nucleic Acids Res. 2018 Jan 4;46(D1):D754-D761. doi: 10.1093/nar/gkx1098.
2
The first near-complete assembly of the hexaploid bread wheat genome, Triticum aestivum.
Gigascience. 2017 Nov 1;6(11):1-7. doi: 10.1093/gigascience/gix097.
3
Genome Sequence of " Carsonella ruddii" Strain BC, a Nutritional Endosymbiont of .
Genome Announc. 2017 Apr 27;5(17):e00236-17. doi: 10.1128/genomeA.00236-17.
4
Coming of age: ten years of next-generation sequencing technologies.
Nat Rev Genet. 2016 May 17;17(6):333-51. doi: 10.1038/nrg.2016.49.
5
SynFind: Compiling Syntenic Regions across Any Set of Genomes on Demand.
Genome Biol Evol. 2015 Nov 11;7(12):3286-98. doi: 10.1093/gbe/evv219.
6
Finding and Characterizing Repeats in Plant Genomes.
Methods Mol Biol. 2016;1374:293-337. doi: 10.1007/978-1-4939-3167-5_17.
7
Fast Edge Detection Using Structured Forests.
IEEE Trans Pattern Anal Mach Intell. 2015 Aug;37(8):1558-70. doi: 10.1109/TPAMI.2014.2377715.
8
Breaking the computational barriers of pairwise genome comparison.
BMC Bioinformatics. 2015 Aug 11;16(1):250. doi: 10.1186/s12859-015-0679-9.
9
Ancient hybridizations among the ancestral genomes of bread wheat.
Science. 2014 Jul 18;345(6194):1250092. doi: 10.1126/science.1250092.
10
Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies.
Genome Biol. 2014 Mar 4;15(3):R59. doi: 10.1186/gb-2014-15-3-r59.