Average weighted nucleotide diversity is more precise than pixy in estimating the true value of π from sequence sets containing missing data.

Author information

Konopiński Maciej K

Affiliation

Institute of Nature Conservation Polish Academy of Sciences, Kraków, Poland.

Publication information

Mol Ecol Resour. 2023 Feb;23(2):348-354. doi: 10.1111/1755-0998.13707. Epub 2022 Sep 6.

DOI:10.1111/1755-0998.13707
PMID:36031871
Abstract

Nucleotide diversity remains an important statistic in population genetic/genomic studies. Although recent advances in massive sequencing make generating sequence data sets cheaper and faster, currently used technologies often introduce substantial amounts of missing nucleotides in their output. A novel method of estimating π from data sets containing missing data - pixy - has also recently been proposed. In this study, the pixy estimator, π_pixy, was compared to average weighted nucleotide diversity, π_w. The estimators were tested both on sequences simulated in fastsimcoal and on real sequence sets. Both sets were modified by random insertion of missing nucleotides. Weighted nucleotide diversity performed better in all pairwise comparisons. It was characterized by a smaller error and a narrower distribution of the results. π_pixy tends to overestimate the nucleotide diversity when both the proportion of missing data and the level of variation are low. Of the two estimators, only π_w estimated the true nucleotide diversity in a part of the simulations. A simple formula for estimating π_w allows for easy integration of the estimator in packages such as pixy, which would allow obtaining more precise estimates of nucleotide diversity either in a sliding window or for discrete genomic regions.
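The abstract does not give either formula, so the following is only a rough sketch of how two estimators of this kind can differ once missing nucleotides are present. The first function follows the ratio-of-sums idea that pixy is documented to use (total differences divided by total valid comparisons). The second is an assumption, not the paper's definition: it averages each pair's own per-site difference rate, which is one plausible way of weighting by the number of comparable sites. All function names and the `N` missing-data symbol are hypothetical choices for illustration.

```python
from itertools import combinations

MISSING = "N"  # assumed symbol for a missing nucleotide


def pairwise_counts(a, b):
    """Return (differences, comparable sites) for two aligned sequences."""
    diffs = comps = 0
    for x, y in zip(a, b):
        if x == MISSING or y == MISSING:
            continue  # a site is comparable only if both bases are known
        comps += 1
        if x != y:
            diffs += 1
    return diffs, comps


def pi_ratio_of_sums(seqs):
    """pixy-style estimate: total differences / total comparisons."""
    total_d = total_c = 0
    for a, b in combinations(seqs, 2):
        d, c = pairwise_counts(a, b)
        total_d += d
        total_c += c
    return total_d / total_c if total_c else float("nan")


def pi_mean_of_pair_ratios(seqs):
    """Assumed weighted variant: mean over pairs of (differences / comparable sites)."""
    ratios = []
    for a, b in combinations(seqs, 2):
        d, c = pairwise_counts(a, b)
        if c:  # skip pairs with no comparable site
            ratios.append(d / c)
    return sum(ratios) / len(ratios) if ratios else float("nan")


if __name__ == "__main__":
    with_gaps = ["ACGT", "ACGA", "NCGT"]
    print(pi_ratio_of_sums(with_gaps))
    print(pi_mean_of_pair_ratios(with_gaps))
```

With complete sequences the two functions return the same value; they diverge only when missing sites make the number of comparable positions differ between pairs, which is the situation the study examines.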

Similar articles

1
Average weighted nucleotide diversity is more precise than pixy in estimating the true value of π from sequence sets containing missing data.
Mol Ecol Resour. 2023 Feb;23(2):348-354. doi: 10.1111/1755-0998.13707. Epub 2022 Sep 6.
2
pixy: Unbiased estimation of nucleotide diversity and divergence in the presence of missing data.
Mol Ecol Resour. 2021 May;21(4):1359-1368. doi: 10.1111/1755-0998.13326. Epub 2021 Feb 5.
3
Genotyping-by-sequencing for estimating relatedness in nonmodel organisms: Avoiding the trap of precise bias.
Mol Ecol Resour. 2018 May;18(3):381-390. doi: 10.1111/1755-0998.12739. Epub 2018 Jan 29.
4
Viral Diversity Based on Next-Generation Sequencing of HIV-1 Provides Precise Estimates of Infection Recency and Time Since Infection.
J Infect Dis. 2019 Jun 19;220(2):254-265. doi: 10.1093/infdis/jiz094.
5
Fragmentation and Coverage Variation in Viral Metagenome Assemblies, and Their Effect in Diversity Calculations.
Front Bioeng Biotechnol. 2015 Sep 17;3:141. doi: 10.3389/fbioe.2015.00141. eCollection 2015.
6
Small- and large-scale heterogeneity in genetic variation across the collared flycatcher genome: implications for estimating genetic diversity in nonmodel organisms.
Mol Ecol Resour. 2017 Jul;17(4):583-585. doi: 10.1111/1755-0998.12632.
7
Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment.
G3 (Bethesda). 2014 Mar 13;4(5):891-900. doi: 10.1534/g3.114.010942.
8
Understanding and utilizing crop genome diversity via high-resolution genotyping.
Plant Biotechnol J. 2016 Apr;14(4):1086-94. doi: 10.1111/pbi.12456. Epub 2015 Aug 19.
9
Whole genome sequence analysis of equid gammaherpesvirus -2 field isolates reveals high levels of genomic diversity and recombination.
BMC Genomics. 2022 Aug 30;23(1):622. doi: 10.1186/s12864-022-08789-x.
10
Genome-wide nucleotide diversity of hatchery-reared Atlantic and Mediterranean strains of brown trout Salmo trutta compared to wild Mediterranean populations.
J Fish Biol. 2016 Dec;89(6):2717-2734. doi: 10.1111/jfb.13131. Epub 2016 Sep 25.

Cited by

1
Ecological genetics of isolated loach populations indicate compromised adaptive potential.
Heredity (Edinb). 2024 Aug;133(2):88-98. doi: 10.1038/s41437-024-00695-0. Epub 2024 Jul 3.