Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer.

Affiliation

INRIA Rennes - Bretagne Atlantique, EPI Symbiose, Rennes, France.

Publication information

BMC Bioinformatics. 2012 Mar 23;13:48. doi: 10.1186/1471-2105-13-48.

DOI: 10.1186/1471-2105-13-48
PMID: 22443449
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3514201/

Abstract

BACKGROUND

The analysis of next-generation sequencing data from large genomes is a timely research topic. Sequencers are producing billions of short sequence fragments from newly sequenced organisms. Computational methods for reconstructing whole genomes/transcriptomes (de novo assemblers) are typically employed to process such data. However, these methods require large memory resources and computation time. Many basic biological questions could be answered targeting specific information in the reads, thus avoiding complete assembly.

RESULTS

We present Mapsembler, an iterative micro and targeted assembler which processes large datasets of reads on commodity hardware. Mapsembler checks whether given regions of interest can be constructed from the reads and builds a short assembly around each of them, either as a plain sequence or as a graph showing contextual structure. We introduce new algorithms to retrieve approximate occurrences of a sequence from reads and to construct an extension graph. Among other results presented in this paper, Mapsembler made it possible to retrieve previously described human breast cancer candidate fusion genes and to detect new ones.
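
The idea of targeted assembly described above can be illustrated with a small sketch. This is not Mapsembler's actual algorithm or code. It is a minimal Python illustration under simplifying assumptions: reads are matched to a starter sequence through exact k-mer seeds, a hit is accepted when the overlap has at most a few substitutions (Hamming distance, no indels), and the starter is extended to the right by a per-column majority vote. The function names (`map_reads`, `extend_right`) and the parameters (`k`, `max_mismatches`, `min_support`) are hypothetical and chosen for this example only.

```python
from collections import Counter


def kmers(seq, k):
    """Yield (offset, k-mer) pairs for every k-mer of seq."""
    for i in range(len(seq) - k + 1):
        yield i, seq[i:i + k]


def mismatches(a, b):
    """Count substitutions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_reads(starter, reads, k=8, max_mismatches=1):
    """Return (shift, read) pairs for reads that approximately occur on starter.

    shift is the starter coordinate of the read's first base (may be negative).
    Only the starter is indexed, so memory depends on the starter, not on the
    number of reads; reads can be any iterable, e.g. a file stream.
    """
    index = {}
    for pos, km in kmers(starter, k):
        index.setdefault(km, []).append(pos)

    hits = []
    for read in reads:
        seen = set()  # each candidate shift is checked once per read
        for rpos, km in kmers(read, k):
            for spos in index.get(km, ()):
                shift = spos - rpos
                if shift in seen:
                    continue
                seen.add(shift)
                # Compare the read and the starter over their whole overlap.
                s_start = max(0, shift)
                s_end = min(len(starter), shift + len(read))
                r_start = s_start - shift
                overlap_s = starter[s_start:s_end]
                overlap_r = read[r_start:r_start + len(overlap_s)]
                if mismatches(overlap_s, overlap_r) <= max_mismatches:
                    hits.append((shift, read))
    return hits


def extend_right(starter, reads, k=8, max_mismatches=1, min_support=2):
    """One extension round: append the majority base of each column that lies
    past the starter's right end, stopping at the first weakly supported column."""
    columns = {}
    for shift, read in map_reads(starter, reads, k, max_mismatches):
        for j in range(len(starter) - shift, len(read)):
            col = shift + j - len(starter)  # 0 = first base after the starter
            columns.setdefault(col, Counter())[read[j]] += 1

    extension = []
    col = 0
    while col in columns:
        base, count = columns[col].most_common(1)[0]
        if count < min_support:
            break
        extension.append(base)
        col += 1
    return starter + "".join(extension)
```

Calling `extend_right` repeatedly, each time with the previous result as the new starter, gives an iterative extension in the spirit of the description above. A graph-shaped output would instead keep every well-supported base of a column as a separate branch rather than taking only the majority base.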

CONCLUSIONS

Mapsembler is the first software that enables de novo discovery around a region of interest of repeats, SNPs, exon skipping, gene fusion, as well as other structural events, directly from raw sequencing reads. As indexing is localized, the memory footprint of Mapsembler is negligible. Mapsembler is released under the CeCILL license and can be freely downloaded from http://alcovna.genouest.org/mapsembler/.
