A tandem simulation framework for predicting mapping quality.

Author

Ben Langmead

Affiliations

Department of Computer Science, Whiting School of Engineering, Johns Hopkins University, 3400 North Charles St, Baltimore, 21218-2682, USA.

Department of Biostatistics, Bloomberg School of Public Health, Johns Hopkins University, 615 N Wolfe St, Baltimore, 21205, USA.

Publication information

Genome Biol. 2017 Aug 10;18(1):152. doi: 10.1186/s13059-017-1290-3.

DOI: 10.1186/s13059-017-1290-3
PMID: 28806977
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5557537/

Abstract

Read alignment is the first step in most sequencing data analyses. Because a read's point of origin can be ambiguous, aligners report a mapping quality, which is the probability that the reported alignment is incorrect. Despite its importance, there is no established and general method for calculating mapping quality. I describe a framework for predicting mapping qualities that works by simulating a set of tandem reads. These are like the input reads in important ways, but the true point of origin is known. I implement this method in an accurate and low-overhead tool called Qtip, which is compatible with popular aligners.
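
As a rough illustration of the idea in the abstract, the sketch below first converts an error probability to the Phred-scaled integer that aligners conventionally write to the SAM MAPQ field (MAPQ = -10 log10 p), then fits a toy model on simulated reads whose correctness is known and uses it to predict a mapping quality. Everything here is an assumption made for illustration: the function names, the `feature_bin` labels, the cap of 60, and the simple per-bin frequency estimate with add-one smoothing. It is not Qtip's code or its actual model.

```python
import math
from collections import defaultdict


def prob_to_mapq(p_incorrect, cap=60):
    """Phred-scale an error probability: MAPQ = -10 * log10(p).

    The cap is an illustrative choice, not something stated in the abstract.
    """
    if p_incorrect <= 0.0:
        return cap
    return min(cap, round(-10.0 * math.log10(p_incorrect)))


def mapq_to_prob(mapq):
    """Invert the Phred scaling: p = 10 ** (-MAPQ / 10)."""
    return 10.0 ** (-mapq / 10.0)


def fit_binned_model(tandem_reads):
    """Estimate P(incorrect) per feature bin from simulated reads.

    tandem_reads: iterable of (feature_bin, is_correct) pairs, where
    is_correct is known because the read's true origin was simulated.
    Uses add-one smoothing so no bin gets probability exactly 0 or 1.
    """
    totals = defaultdict(int)
    wrong = defaultdict(int)
    for feature_bin, is_correct in tandem_reads:
        totals[feature_bin] += 1
        if not is_correct:
            wrong[feature_bin] += 1
    return {b: (wrong[b] + 1) / (totals[b] + 2) for b in totals}


def predict_mapq(model, feature_bin, default_p=0.5):
    """Predict a MAPQ for an input read that falls in the given bin."""
    return prob_to_mapq(model.get(feature_bin, default_p))


if __name__ == "__main__":
    # Hypothetical simulated reads: bin label plus whether the aligner
    # placed the read at its true origin.
    simulated = [("high_score_gap", True)] * 999
    simulated += [("low_score_gap", True), ("low_score_gap", False)]
    model = fit_binned_model(simulated)
    print(predict_mapq(model, "high_score_gap"))
    print(predict_mapq(model, "low_score_gap"))
```

In this toy setup, a read in a bin where simulated reads were almost always placed correctly receives a high predicted mapping quality, while a read in a bin with frequent errors receives a low one. A real predictor would draw on richer alignment features and a more capable model.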

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f0f2/5557537/45e814fd6228/13059_2017_1290_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f0f2/5557537/baf02280c769/13059_2017_1290_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f0f2/5557537/390cec6bce51/13059_2017_1290_Fig3_HTML.jpg

