
SeQuiLa-cov: A fast and scalable library for depth of coverage calculations.

Affiliation

Institute of Computer Science, Warsaw University of Technology, ul. Nowowiejska 15/19, 00-665 Warsaw, Poland.

Publication information

Gigascience. 2019 Aug 1;8(8). doi: 10.1093/gigascience/giz094.

DOI:10.1093/gigascience/giz094
PMID:31378808
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6680061/
Abstract

BACKGROUND

Depth of coverage calculation is an important and computationally intensive preprocessing step in a variety of next-generation sequencing pipelines, including the analysis of RNA-sequencing data, detection of copy number variants, or quality control procedures.

RESULTS

Building upon big data technologies, we have developed SeQuiLa-cov, an extension to the recently released SeQuiLa platform, which provides efficient depth of coverage calculations, reaching >100× speedup over the state-of-the-art tools. The performance and scalability of our solution allow exome and genome-wide calculations to run locally or on a cluster, while a Structured Query Language (SQL) application programming interface (API) hides the complexity of distributed computing.

CONCLUSIONS

SeQuiLa-cov provides a significant performance gain in depth of coverage calculations, streamlining widely used bioinformatic processing pipelines.
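
To make the term concrete, the following is a minimal Python sketch of what a depth of coverage calculation produces: for a set of aligned reads, the number of reads overlapping each position, reported as blocks of constant depth. It is an illustration only and is not SeQuiLa-cov's implementation or API. The function name `coverage_blocks`, the 0-based half-open interval convention, and the choice to omit zero-depth regions are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

# Hypothetical read representation: (start, end), 0-based, end exclusive.
Interval = Tuple[int, int]


def coverage_blocks(reads: Iterable[Interval]) -> List[Tuple[int, int, int]]:
    """Return (start, end, depth) blocks of constant, non-zero depth."""
    # Record +1 where a read starts covering and -1 where it stops.
    events = defaultdict(int)
    for start, end in reads:
        if end <= start:
            continue  # skip empty or malformed intervals
        events[start] += 1
        events[end] -= 1

    blocks: List[Tuple[int, int, int]] = []
    depth = 0
    prev = None
    # Sweep positions in order, keeping a running depth between events.
    for pos in sorted(events):
        if prev is not None and depth > 0:
            if blocks and blocks[-1][1] == prev and blocks[-1][2] == depth:
                # Depth did not change at prev: extend the previous block.
                blocks[-1] = (blocks[-1][0], pos, depth)
            else:
                blocks.append((prev, pos, depth))
        depth += events[pos]
        prev = pos
    return blocks


if __name__ == "__main__":
    example_reads = [(0, 5), (3, 8), (10, 12)]
    print(coverage_blocks(example_reads))
    # [(0, 3, 1), (3, 5, 2), (5, 8, 1), (10, 12, 1)]
```

The sweep touches each read twice and sorts only the distinct start and end positions, so its cost depends on the number of reads rather than on the length of the genome. Real tools work on BAM or CRAM alignments per contig and handle details such as CIGAR operations, which this sketch leaves out.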

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de01/6680061/cda040febad9/giz094fig1.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/de01/6680061/d94d4ced21d6/giz094fig2.jpg
