FANSe2: a robust and cost-efficient alignment tool for quantitative next-generation sequencing applications.

Authors

Xiao Chuan-Le, Mai Zhi-Biao, Lian Xin-Lei, Zhong Jia-Yong, Jin Jing-Jie, He Qing-Yu, Zhang Gong

Affiliation

Key Laboratory of Functional Protein Research of Guangdong Higher Education Institutes, Institute of Life and Health Engineering, College of Life Science and Technology, Jinan University, Guangzhou, China.

Publication information

PLoS One. 2014 Apr 17;9(4):e94250. doi: 10.1371/journal.pone.0094250. eCollection 2014.

DOI:10.1371/journal.pone.0094250

PMID:24743329

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3990525/

Abstract

Correct and bias-free interpretation of deep sequencing data inevitably depends on the complete mapping of all mappable reads to the reference sequence, especially for quantitative RNA-seq applications. Seed-based algorithms are generally slow but robust, while Burrows-Wheeler Transform (BWT) based algorithms are fast but less robust. To combine both advantages, we developed FANSe2, an algorithm with an iterative mapping strategy based on the statistics of real-world sequencing error distribution, which substantially accelerates mapping without compromising accuracy. Its sensitivity and accuracy are higher than those of the BWT-based algorithms in tests using both prokaryotic and eukaryotic sequencing datasets. The gene identification results of FANSe2 are experimentally validated, while the previous algorithms produce false positives and false negatives. FANSe2 showed remarkably better consistency with microarray data than most other algorithms in terms of gene expression quantification. We implemented a scalable and almost maintenance-free parallelization method that can utilize the computational power of multiple office computers, a novel feature not present in any other mainstream algorithm. With three normal office computers, we demonstrated that FANSe2 mapped an RNA-seq dataset generated from an entire Illumina HiSeq 2000 flowcell (8 lanes, 608 M reads) to the masked human genome within 4.1 hours with higher sensitivity than Bowtie/Bowtie2. FANSe2 thus provides robust accuracy, full indel sensitivity, fast speed, versatile compatibility and economical computational utilization, making it a useful and practical tool for deep sequencing applications. FANSe2 is freely available at http://bioinformatics.jnu.edu.cn/software/fanse2/.
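To make the idea of a seed-based, iterative mapping strategy more concrete, the toy sketch below may help. It is not FANSe2's implementation: the abstract does not give algorithmic details, so the function names, the seed length, the tolerance schedule and the substitution-only (no indel) scoring are all assumptions chosen for illustration. The sketch relies on the pigeonhole principle: a read with at most k mismatches must contain at least one error-free seed among any k + 1 non-overlapping seeds. Each pass therefore looks up only k + 1 seeds, so the many reads with few errors are resolved cheaply and only the remainder is retried with a higher tolerance.

```python
from collections import defaultdict


def build_seed_index(reference, seed_len):
    """Map every seed_len-mer of the reference to its start positions."""
    index = defaultdict(list)
    for pos in range(len(reference) - seed_len + 1):
        index[reference[pos:pos + seed_len]].append(pos)
    return index


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, seed_len, max_mismatches):
    """Return (start, mismatches) of the best hit within tolerance, or None."""
    # Use the first max_mismatches + 1 non-overlapping seeds that fit in the read.
    offsets = [i * seed_len for i in range(max_mismatches + 1)
               if (i + 1) * seed_len <= len(read)]
    best = None
    for offset in offsets:
        seed = read[offset:offset + seed_len]
        for hit in index.get(seed, ()):
            start = hit - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # candidate alignment would run off the reference
            mm = hamming(read, reference[start:start + len(read)])
            if mm <= max_mismatches and (best is None or mm < best[1]):
                best = (start, mm)
    return best


def iterative_map(reads, reference, seed_len=4, tolerances=(0, 1, 2)):
    """Map reads in passes of increasing mismatch tolerance (hypothetical schedule).

    Returns (mapped, unmapped), where mapped is a dict read -> (start, mismatches).
    Identical reads collapse into one dict key, which is acceptable for a toy.
    """
    index = build_seed_index(reference, seed_len)
    mapped, pending = {}, list(reads)
    for tol in tolerances:
        still_unmapped = []
        for read in pending:
            hit = map_read(read, reference, index, seed_len, tol)
            if hit is None:
                still_unmapped.append(read)
            else:
                mapped[read] = hit
        pending = still_unmapped  # only the leftovers go to the next, costlier pass
    return mapped, pending
```

Calling `iterative_map(reads, reference)` on a short reference string returns the mapped reads with their positions and mismatch counts, plus the list of reads that no pass could place. A real aligner would additionally handle indels, reverse-complement strands and multi-mapping reads, none of which this sketch attempts.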

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a521/3990525/d2b4f31fa443/pone.0094250.g001.jpg

Similar articles

Fast and memory efficient approach for mapping NGS reads to a reference genome.

J Bioinform Comput Biol. 2019 Apr;17(2):1950008. doi: 10.1142/S0219720019500082.

CLAST: CUDA implemented large-scale alignment search tool.

BMC Bioinformatics. 2014 Dec 11;15(1):406. doi: 10.1186/s12859-014-0406-y.

TotalReCaller: improved accuracy and performance via integrated alignment and base-calling.

Bioinformatics. 2011 Sep 1;27(17):2330-7. doi: 10.1093/bioinformatics/btr393. Epub 2011 Jun 30.

The Ultrafast and Accurate Mapping Algorithm FANSe3: Mapping a Human Whole-Genome Sequencing Dataset Within 30 Minutes.

Phenomics. 2021 Feb 22;1(1):22-30. doi: 10.1007/s43657-020-00008-5. eCollection 2021 Feb.

AlignerBoost: A Generalized Software Toolkit for Boosting Next-Gen Sequencing Mapping Accuracy Using a Bayesian-Based Mapping Quality Framework.

PLoS Comput Biol. 2016 Oct 5;12(10):e1005096. doi: 10.1371/journal.pcbi.1005096. eCollection 2016 Oct.

Fast inexact mapping using advanced tree exploration on backward search methods.

BMC Bioinformatics. 2015 Jan 28;16:18. doi: 10.1186/s12859-014-0438-3.

Ψ-RA: a parallel sparse index for genomic read alignment.

BMC Genomics. 2011;12 Suppl 2(Suppl 2):S7. doi: 10.1186/1471-2164-12-S2-S7. Epub 2011 Jul 27.

A Fast Approximate Algorithm for Mapping Long Reads to Large Reference Databases.

J Comput Biol. 2018 Jul;25(7):766-779. doi: 10.1089/cmb.2018.0036. Epub 2018 Apr 30.

Comparative analysis of algorithms for next-generation sequencing read alignment.

Bioinformatics. 2011 Oct 15;27(20):2790-6. doi: 10.1093/bioinformatics/btr477. Epub 2011 Aug 19.

Cited by

Construction and characteristics of an adjustable biomechanical in vitro corneal stromal model simulating keratoconus pathological features.

Mater Today Bio. 2025 Jul 23;34:102130. doi: 10.1016/j.mtbio.2025.102130. eCollection 2025 Oct.

The influence of femtosecond laser intrastromal lenticules on the characteristics and maturity in tissue-engineered stem cell-derived retinal pigment epithelium sheets.

Stem Cell Res Ther. 2025 Jun 20;16(1):316. doi: 10.1186/s13287-025-04463-7.

Comparative analysis of translatomics and transcriptomics in the longissimus dorsi muscle of Luchuan and Duroc pigs.

PLoS One. 2025 Mar 18;20(3):e0319399. doi: 10.1371/journal.pone.0319399. eCollection 2025.

Mechanisms of Overexpression and Membrane Potential Reduction Leading to Ciprofloxacin Heteroresistance in a Isolate.

Int J Mol Sci. 2025 Mar 6;26(5):2372. doi: 10.3390/ijms26052372.

Circulating exosomal miR-16-5p and let-7e-5p are associated with bladder fibrosis of diabetic cystopathy.

Sci Rep. 2024 Jan 8;14(1):837. doi: 10.1038/s41598-024-51451-7.

Transcriptomic and Translatomic Analyses Reveal Insights into the Signaling Pathways of the Innate Immune Response in the Spleens of SPF Chickens Infected with Avian Reovirus.

Viruses. 2023 Nov 29;15(12):2346. doi: 10.3390/v15122346.

The Ultrafast and Accurate Mapping Algorithm FANSe3: Mapping a Human Whole-Genome Sequencing Dataset Within 30 Minutes.

Phenomics. 2021 Feb 22;1(1):22-30. doi: 10.1007/s43657-020-00008-5. eCollection 2021 Feb.

Translatomics Probes Into the Role of Lycopene on Improving Hepatic Steatosis Induced by High-Fat Diet.

Front Nutr. 2021 Nov 2;8:727785. doi: 10.3389/fnut.2021.727785. eCollection 2021.

Integrated Transcriptomic and Translatomic Inquiry of the Role of Betaine on Lipid Metabolic Dysregulation Induced by a High-Fat Diet.

Front Nutr. 2021 Oct 11;8:751436. doi: 10.3389/fnut.2021.751436. eCollection 2021.

Mechanotransduction Regulates Reprogramming Enhancement in Adherent 3D Keratocyte Cultures.

Front Bioeng Biotechnol. 2021 Sep 10;9:709488. doi: 10.3389/fbioe.2021.709488. eCollection 2021.
