Next Generation Sequencing of Pooled Samples: Guideline for Variants' Filtering.

Authors

Anand Santosh, Mangano Eleonora, Barizzone Nadia, Bordoni Roberta, Sorosina Melissa, Clarelli Ferdinando, Corrado Lucia, Martinelli Boneschi Filippo, D'Alfonso Sandra, De Bellis Gianluca

Affiliations

Institute for Biomedical Technologies, National Research Council, Segrate (MI), Italy.

Department of Science and Technology, University of Sannio, Benevento, Italy.

Publication information

Sci Rep. 2016 Sep 27;6:33735. doi: 10.1038/srep33735.

DOI:10.1038/srep33735
PMID:27670852
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5037392/
Abstract

Sequencing a large number of individuals, which is often needed for population genetics studies, is still economically challenging despite the falling costs of Next Generation Sequencing (NGS). Pool-seq is an alternative cost- and time-effective option in which DNA from several individuals is pooled for sequencing. However, pooling of DNA creates new problems and challenges for accurate variant calling and allele frequency (AF) estimation. In particular, sequencing errors can be confounded with alleles present at low frequency in the pools, possibly giving rise to false positive variants. We sequenced 996 individuals in 83 pools (12 individuals/pool) in a targeted re-sequencing experiment. We show that Pool-seq AFs are robust and reliable by comparing them with public variant databases and in-house SNP-genotyping data of the individual subjects in the pools. Furthermore, we propose a simple filtering guideline for the removal of spurious variants based on the Kolmogorov-Smirnov statistical test. We experimentally validated our filters by comparing Pool-seq to individual sequencing data, showing that the filters remove most of the false variants while retaining the majority of true variants. The proposed guideline is fairly generic in nature and could be easily applied in other Pool-seq experiments.
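
The abstract names the Kolmogorov-Smirnov test as the basis of the filter but does not say which distributions are compared. The Python sketch below is therefore only an illustration under stated assumptions, not the authors' procedure. It assumes diploid individuals, and it assumes that some per-read quality metric is compared between reads supporting the reference allele and reads supporting the candidate allele. All function names and the toy values are hypothetical.

```python
import math

POOLS = 83
INDIVIDUALS_PER_POOL = 12
TOTAL_INDIVIDUALS = POOLS * INDIVIDUALS_PER_POOL  # 996, as stated in the abstract


def singleton_allele_frequency(individuals_per_pool, ploidy=2):
    """Expected AF of an allele present on exactly one chromosome in a pool.

    Assumption: equal DNA contribution from every individual and diploidy.
    """
    return 1.0 / (individuals_per_pool * ploidy)


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D (max distance between ECDFs)."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past the current value (handles ties).
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value for D; only a rough guide for small samples."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


def looks_spurious(ref_metric, alt_metric, alpha=0.05):
    """Flag a candidate variant when the two metric distributions differ."""
    d = ks_statistic(ref_metric, alt_metric)
    return d > ks_critical_value(len(ref_metric), len(alt_metric), alpha)


# Toy, made-up quality values for illustration only.
ref_quals = [38, 37, 39, 36, 38, 40, 37, 39, 38, 36]
alt_similar = [37, 38, 36, 39, 38, 37, 40, 36]  # resembles the reference reads
alt_low = [12, 15, 20, 18, 14, 22, 16, 19]      # consistently lower quality

print("individuals sequenced:", TOTAL_INDIVIDUALS)
print("singleton AF in one pool:", singleton_allele_frequency(INDIVIDUALS_PER_POOL))
print("alt_similar flagged:", looks_spurious(ref_quals, alt_similar))
print("alt_low flagged:", looks_spurious(ref_quals, alt_low))
```

Under these assumptions a single variant chromosome in a pool of 12 has an expected frequency of 1/24, roughly 4%. That is low enough for recurrent sequencing errors to be mistaken for a true allele, which is the problem a distribution-based filter is meant to address.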

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea5c/5037392/28d6048d187e/srep33735-f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea5c/5037392/2fe5f945ed9f/srep33735-f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea5c/5037392/af5dee9bbdea/srep33735-f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea5c/5037392/dc1753cd6502/srep33735-f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ea5c/5037392/b069e4a1510e/srep33735-f5.jpg
