Calculation of Tajima's D and other neutrality test statistics from low depth next-generation sequencing data.

Affiliation

Centre for GeoGenetics, Natural History Museum of Denmark, University of Copenhagen, Oestervoldgade 5-7, DK-1350, Copenhagen, Denmark.

Publication information

BMC Bioinformatics. 2013 Oct 2;14:289. doi: 10.1186/1471-2105-14-289.

DOI: 10.1186/1471-2105-14-289
PMID: 24088262
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4015034/
Abstract

BACKGROUND

A number of different statistics are used for detecting natural selection from DNA sequencing data, including statistics that summarize the site frequency spectrum, such as Tajima's D. These statistics are now often applied in the analysis of Next Generation Sequencing (NGS) data. However, estimates of frequency spectra from NGS data are strongly affected by low sequencing coverage; the inherent, technology-dependent variation in sequencing depth causes systematic differences in the value of the statistic among genomic regions.
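As a point of reference for what "a summary of the frequency spectrum" means here, the following is a minimal sketch of the classical Tajima's D computed from an unfolded site frequency spectrum with known allele counts. It uses the standard textbook constants (a1, a2, b1, b2, c1, c2, e1, e2); the function name `tajimas_d` and the input layout (`sfs[i]` = number of sites with `i` derived alleles among `n` chromosomes) are assumptions made for illustration and are not taken from the paper's software.

```python
import math


def tajimas_d(sfs):
    """Classical Tajima's D from an unfolded site frequency spectrum.

    sfs[i] is the number of sites at which i of the n sampled chromosomes
    carry the derived allele, for i = 0..n (hypothetical input layout).
    Returns NaN when there are no segregating sites.
    """
    n = len(sfs) - 1
    if n < 4:
        raise ValueError("need at least 4 chromosomes")

    # Segregating sites: derived allele counts 1..n-1 only.
    seg_sites = sum(sfs[1:n])
    if seg_sites == 0:
        return float("nan")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))

    # Watterson's estimator and mean pairwise differences (pi).
    theta_w = seg_sites / a1
    pairs = n * (n - 1) / 2.0
    theta_pi = sum(i * (n - i) * sfs[i] for i in range(1, n)) / pairs

    # Variance constants for the difference theta_pi - theta_w.
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (theta_pi - theta_w) / math.sqrt(variance)


if __name__ == "__main__":
    # Spectrum proportional to 1/i (the neutral expectation) for n = 5.
    print(tajimas_d([0, 12, 6, 4, 3, 0]))
    # Excess of singletons, which pushes D below zero.
    print(tajimas_d([0, 20, 0, 0, 0, 0]))
```

Because both estimators are computed from the same spectrum, any depth-dependent distortion of that spectrum shifts D directly, which is the problem the abstract describes for low-coverage data.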

RESULTS

We have developed an approach that accommodates the uncertainty of the data when calculating site frequency based neutrality test statistics. A salient feature of this approach is that it implicitly handles varying sequencing depth and missing data. It also removes the need to infer variable sites for the analysis, and thereby avoids the ascertainment problems introduced by a SNP discovery process.

CONCLUSION

Using an empirical Bayes approach for fast computations, we show that this method produces results for low-coverage NGS data comparable to those achieved when the genotypes are known without uncertainty. We also validate the method in an analysis of data from the 1000 Genomes Project. The method is implemented in a fast framework which enables researchers to perform these neutrality tests on a genome-wide scale.
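The abstract does not spell out the computation, so the following is only a sketch of one way an empirical Bayes step of this kind can be arranged, not the authors' implementation. It assumes that, for each site, a likelihood is available for every possible derived allele count, and that a genome-wide frequency spectrum serves as the prior. All names (`site_posterior`, `expected_sfs`, `theta_estimates`) and the input layout are hypothetical.

```python
def site_posterior(likelihoods, prior):
    """Posterior over the derived allele count at one site.

    likelihoods[i] is proportional to P(reads at this site | i derived alleles);
    prior[i] is the prior probability of i derived alleles (assumed here to be
    a genome-wide frequency spectrum normalised to sum to 1).
    """
    weights = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(weights)
    if total <= 0:
        raise ValueError("likelihoods and prior have no overlapping support")
    return [w / total for w in weights]


def expected_sfs(site_likelihoods, prior):
    """Sum per-site posteriors into an expected frequency spectrum.

    No site is ever declared variable or invariant; each site contributes
    fractional counts to every allele-count category.
    """
    sfs = [0.0] * len(prior)
    for likelihoods in site_likelihoods:
        for i, p in enumerate(site_posterior(likelihoods, prior)):
            sfs[i] += p
    return sfs


def theta_estimates(sfs):
    """Watterson's theta and pi from a (possibly fractional) spectrum."""
    n = len(sfs) - 1
    a1 = sum(1.0 / i for i in range(1, n))
    theta_w = sum(sfs[1:n]) / a1
    theta_pi = sum(i * (n - i) * sfs[i] for i in range(1, n)) / (n * (n - 1) / 2.0)
    return theta_w, theta_pi


if __name__ == "__main__":
    prior = [0.9, 0.05, 0.025, 0.015, 0.01]   # made-up prior for n = 4
    sites = [
        [1.0, 0.0, 0.0, 0.0, 0.0],            # reads fully support count 0
        [0.2, 0.6, 0.2, 0.0, 0.0],            # ambiguous, most support for count 1
        [1.0, 1.0, 1.0, 1.0, 1.0],            # no reads: posterior equals prior
    ]
    sfs = expected_sfs(sites, prior)
    print(sfs, theta_estimates(sfs))
```

In this arrangement a site with no reads simply falls back to the prior, which is how missing data and uneven depth can be absorbed without a separate SNP-calling step.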

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/c82bb2b6324a/1471-2105-14-289-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/90aaa5a57136/1471-2105-14-289-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/330858a634e0/1471-2105-14-289-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/0043077af6e7/1471-2105-14-289-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/9c65f5cfa831/1471-2105-14-289-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/b3accd72ebb7/1471-2105-14-289-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/bc54710f283b/1471-2105-14-289-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0392/4015034/c3f73d6e3358/1471-2105-14-289-8.jpg
