Shannon diversity index: a call to replace the original Shannon's formula with unbiased estimator in the population genetics studies.

Author information

Konopiński Maciej K

Affiliation

Institute of Nature Conservation, Polish Academy of Sciences, Kraków, Poland.

Publication information

PeerJ. 2020 Jun 29;8:e9391. doi: 10.7717/peerj.9391. eCollection 2020.

DOI:10.7717/peerj.9391

PMID:32655992

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7331625/

Abstract

BACKGROUND

The Shannon diversity index has been widely used in population genetics studies. Recently, it was proposed as a unifying measure of diversity at different levels, from genes and populations to whole species and ecosystems. The index, however, has been shown to be negatively biased at small sample sizes. Modifications to the original Shannon's formula have been proposed to obtain an unbiased estimator.
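
The negative bias at small sample sizes can be illustrated with a minimal Python sketch. It is not the author's R code: the function names, the hypothetical locus with 10 equally frequent alleles, the replicate count and the use of the natural logarithm are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def shannon_plugin(counts):
    """Original (plug-in) Shannon index from allele counts, natural log assumed."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def mean_plugin_estimate(freqs, sample_size, replicates=2000, seed=1):
    """Average the plug-in estimate over repeated random samples of alleles."""
    rng = random.Random(seed)
    alleles = range(len(freqs))
    total = 0.0
    for _ in range(replicates):
        sample = rng.choices(alleles, weights=freqs, k=sample_size)
        total += shannon_plugin(Counter(sample).values())
    return total / replicates


# Hypothetical locus with 10 equally frequent alleles (illustrative only).
freqs = [0.1] * 10
true_h = -sum(p * math.log(p) for p in freqs)

for n in (10, 50, 500):
    print(n, round(mean_plugin_estimate(freqs, n), 3), "true:", round(true_h, 3))
```

Under these assumptions the average estimate stays below the parametric value and moves closer to it as the sample grows.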

METHODS

In this study, the performance of four estimators of the Shannon index (the original Shannon's formula and those of Zahl, Chao and Shen, and Chao et al.) was tested on simulated microsatellite data. Both the simulation and the analysis of the results were performed in the R language environment. A new R function was created to calculate all four indices from the genind data format.
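
As a rough guide to what such estimators compute, the sketch below implements three of the four for a single locus in Python. It is not the R function described in the abstract. It assumes that Zahl's estimator is the first-order jackknife of the plug-in formula and that the Chao and Shen estimator is the coverage-adjusted Horvitz-Thompson form. The function names are hypothetical, natural logarithms are assumed, and the Chao et al. estimator is omitted.

```python
import math


def shannon_plugin(counts):
    """Original (plug-in) Shannon index from allele counts."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


def shannon_jackknife(counts):
    """First-order jackknife of the plug-in index (assumed form of Zahl's estimator)."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sampled alleles are required")
    h_all = shannon_plugin(counts)
    # Dropping any one of the c copies of allele i yields the same reduced sample,
    # so each leave-one-out value is weighted by c instead of being recomputed.
    loo_sum = 0.0
    for i, c in enumerate(counts):
        reduced = counts[:i] + [c - 1] + counts[i + 1:]
        loo_sum += c * shannon_plugin(reduced)
    return n * h_all - (n - 1) * loo_sum / n


def shannon_chao_shen(counts):
    """Coverage-adjusted estimator (assumed form of the Chao and Shen estimator)."""
    counts = [c for c in counts if c > 0]
    n = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    if singletons == n:
        singletons = n - 1  # common safeguard so estimated coverage is not zero
    coverage = 1.0 - singletons / n
    total = 0.0
    for c in counts:
        p_adj = coverage * c / n                # coverage-adjusted frequency
        inclusion = 1.0 - (1.0 - p_adj) ** n    # chance the allele is seen at all
        total -= p_adj * math.log(p_adj) / inclusion
    return total


# Hypothetical allele counts at one locus (illustrative only).
counts = [5, 3, 1, 1]
print(shannon_plugin(counts), shannon_jackknife(counts), shannon_chao_shen(counts))
```

For these hypothetical counts both corrected estimators return a larger value than the plug-in formula, which is the direction expected when the plug-in estimate is biased downward.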

RESULTS

Sample size dependence was detected in all the estimators analysed; however, the deviation from parametric values was substantially smaller in the derived measures than in the original Shannon's formula. Error rate was negatively associated with population heterozygosity. Comparisons among loci showed that fast-mutating loci were less affected by the error, except for the original Shannon's estimator which, in the smallest sample, was more strongly affected by loci with a higher number of alleles. The Zahl and Chao et al. estimators performed notably better than the original Shannon's formula.

CONCLUSION

The results of this study show that the original Shannon index should no longer be used as a measure of genetic diversity and should be replaced by Zahl's unbiased estimator.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8086/7331625/45db00becff3/peerj-08-9391-g001.jpg
