
Suitability of Different Mapping Algorithms for Genome-Wide Polymorphism Scans with Pool-Seq Data.

Authors

Kofler Robert, Langmüller Anna Maria, Nouhaud Pierre, Otte Kathrin Anna, Schlötterer Christian

Affiliations

Institut für Populationsgenetik, Vetmeduni Vienna, Veterinärplatz 1, 1210 Wien, Austria.

Vienna Graduate School of Population Genetics, 1210, Austria.

Publication information

G3 (Bethesda). 2016 Nov 8;6(11):3507-3515. doi: 10.1534/g3.116.034488.

DOI: 10.1534/g3.116.034488
PMID: 27613752
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5100849/
Abstract

The cost-effectiveness of sequencing pools of individuals (Pool-Seq) provides the basis for the popularity and widespread use of this method for many research questions, ranging from unraveling the genetic basis of complex traits, to the clonal evolution of cancer cells. Because the accuracy of Pool-Seq could be affected by many potential sources of error, several studies have determined, for example, the influence of sequencing technology, the library preparation protocol, and mapping parameters. Nevertheless, the impact of the mapping tools has not yet been evaluated. Using simulated and real Pool-Seq data, we demonstrate a substantial impact of the mapping tools, leading to characteristic false positives in genome-wide scans. The problem of false positives was particularly pronounced when data with different read lengths and insert sizes were compared. Out of 14 evaluated algorithms, novoalign, bwa mem, and clc4 are most suitable for mapping Pool-Seq data. Nevertheless, no single algorithm is sufficient for avoiding all false positives. We show that the intersection of the results of two mapping algorithms provides a simple, yet effective, strategy to eliminate false positives. We propose that the implementation of a consistent Pool-Seq bioinformatics pipeline, building on the recommendations of this study, can substantially increase the reliability of Pool-Seq results, in particular when libraries generated with different protocols are being compared.
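
The intersection strategy mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the same Pool-Seq reads were mapped with two different algorithms, that a genome-wide scan was run on each alignment, and that each scan produced a list of candidate sites. The function names and the tab-separated `chrom<TAB>pos` input format are hypothetical and are not taken from the paper; only sites flagged in both scans are kept.

```python
from typing import Iterable, Set, Tuple

# A candidate site is (chromosome, 1-based position).
Site = Tuple[str, int]


def parse_sites(lines: Iterable[str]) -> Set[Site]:
    """Parse 'chrom<TAB>pos' lines into a set of sites.

    Hypothetical input format for illustration: blank lines and
    lines starting with '#' are skipped, extra columns are ignored.
    """
    sites: Set[Site] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        chrom, pos = line.split("\t")[:2]
        sites.add((chrom, int(pos)))
    return sites


def intersect_candidates(a: Set[Site], b: Set[Site]) -> Set[Site]:
    """Keep only the sites reported by both mapping pipelines."""
    return a & b


# Toy candidate lists from two hypothetical mapper-specific scans.
mapper_a = parse_sites(["2L\t1050", "2L\t7731", "3R\t402"])
mapper_b = parse_sites(["2L\t1050", "3R\t402", "X\t88"])

shared = sorted(intersect_candidates(mapper_a, mapper_b))
print(shared)  # [('2L', 1050), ('3R', 402)]
```

Sites reported by only one pipeline (here `2L:7731` and `X:88`) are dropped as likely mapper-specific artifacts. A set intersection is conservative by design: it can also discard true positives that only one mapper recovers.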

