NGS-QCbox and Raspberry for Parallel, Automated and Rapid Quality Control Analysis of Large-Scale Next Generation Sequencing (Illumina) Data.

Authors

Katta Mohan A V S K, Khan Aamir W, Doddamani Dadakhalandar, Thudi Mahendar, Varshney Rajeev K

Affiliations

International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India.

International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India; School of Plant Biology and Institute of Agriculture, The University of Western Australia, Crawley, Australia.

Publication information

PLoS One. 2015 Oct 13;10(10):e0139868. doi: 10.1371/journal.pone.0139868. eCollection 2015.


DOI:10.1371/journal.pone.0139868
PMID:26460497
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4604202/
Abstract

Rapid popularity and adaptation of next generation sequencing (NGS) approaches have generated huge volumes of data. High throughput platforms like Illumina HiSeq produce terabytes of raw data that requires quick processing. Quality control of the data is an important component prior to the downstream analyses. To address these issues, we have developed a quality control pipeline, NGS-QCbox that scales up to process hundreds or thousands of samples. Raspberry is an in-house tool, developed in C language utilizing HTSlib (v1.2.1) (http://htslib.org), for computing read/base level statistics. It can be used as stand-alone application and can process both compressed and uncompressed FASTQ format files. NGS-QCbox integrates Raspberry with other open-source tools for alignment (Bowtie2), SNP calling (SAMtools) and other utilities (bedtools) towards analyzing raw NGS data at higher efficiency and in high-throughput manner. The pipeline implements batch processing of jobs using Bpipe (https://github.com/ssadedin/bpipe) in parallel and internally, a fine grained task parallelization utilizing OpenMP. It reports read and base statistics along with genome coverage and variants in a user friendly format. The pipeline developed presents a simple menu driven interface and can be used in either quick or complete mode. In addition, the pipeline in quick mode outperforms in speed against other similar existing QC pipeline/tools. The NGS-QCbox pipeline, Raspberry tool and associated scripts are made available at the URL https://github.com/CEG-ICRISAT/NGS-QCbox and https://github.com/CEG-ICRISAT/Raspberry for rapid quality control analysis of large-scale next generation sequencing (Illumina) data.
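
To make the phrase "read/base level statistics" concrete, the following is a minimal Python sketch of the kind of summary a FASTQ QC tool might report, reading both gzip-compressed and uncompressed files. It is not Raspberry's implementation (which the abstract describes as written in C on top of HTSlib); the function names `open_fastq` and `fastq_stats`, the chosen metrics (mean quality, Q30 fraction, GC fraction, N count), and the Phred+33 quality offset are all assumptions made for illustration.

```python
import gzip
import io
from collections import Counter


def open_fastq(path):
    """Open a FASTQ file as text, detecting gzip by its two-byte magic number."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def fastq_stats(handle, phred_offset=33):
    """Compute simple read- and base-level statistics from a FASTQ text stream.

    Assumes four-line FASTQ records and (by default) Phred+33 qualities.
    These metrics are illustrative, not the exact set Raspberry reports.
    """
    reads = 0
    bases = 0
    qual_sum = 0
    q30_bases = 0
    composition = Counter()

    while True:
        header = handle.readline()
        if not header:          # end of file
            break
        if not header.strip():  # tolerate blank lines between records
            continue
        seq = handle.readline().strip()
        handle.readline()       # '+' separator line, ignored
        qual = handle.readline().strip()
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header.strip()}")

        reads += 1
        bases += len(seq)
        composition.update(seq.upper())
        for ch in qual:
            q = ord(ch) - phred_offset
            qual_sum += q
            if q >= 30:
                q30_bases += 1

    gc = composition["G"] + composition["C"]
    return {
        "reads": reads,
        "bases": bases,
        "mean_read_length": bases / reads if reads else 0.0,
        "mean_quality": qual_sum / bases if bases else 0.0,
        "q30_fraction": q30_bases / bases if bases else 0.0,
        "gc_fraction": gc / bases if bases else 0.0,
        "n_bases": composition["N"],
    }
```

A file would be summarized with `with open_fastq(path) as fh: stats = fastq_stats(fh)`. Because each file is processed independently, per-sample calls like this are the unit a batch runner could distribute across workers, which is the role the abstract assigns to Bpipe in the actual pipeline.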

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc22/4604202/71e356b84ace/pone.0139868.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc22/4604202/a52d20d0a634/pone.0139868.g002.jpg
