Evaluating the impact of scoring parameters on the structure of intra-specific genetic variation using RawGeno, an R package for automating AFLP scoring.

Authors

Arrigo Nils, Tuszynski Jarek W, Ehrich Dorothee, Gerdes Tommy, Alvarez Nadir

Affiliation

Laboratory of Evolutionary Botany, Institute of Biology, University of Neuchâtel, 11 rue Emile-Argand, CH-2000 Neuchâtel, Switzerland.

Publication information

BMC Bioinformatics. 2009 Jan 26;10:33. doi: 10.1186/1471-2105-10-33.

DOI:10.1186/1471-2105-10-33

PMID:19171029

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2656475/

Abstract

BACKGROUND

Since the transfer and application of modern sequencing technologies to the analysis of amplified fragment-length polymorphisms (AFLP), evolutionary biologists have included an increasing number of samples and markers in their studies. Although justified in this context, the use of automated scoring procedures may result in technical biases that weaken the power and reliability of further analyses.

RESULTS

Using a new scoring algorithm, RawGeno, we show that scoring errors--in particular "bin oversplitting" (i.e. when variant sizes of the same AFLP marker are not considered as homologous) and "technical homoplasy" (i.e. when two AFLP markers that differ slightly in size are mistakenly considered as being homologous)--induce a loss of discriminatory power, decrease the robustness of results and, in extreme cases, introduce erroneous information in genetic structure analyses. In the present study, we evaluate several descriptive statistics that can be used to optimize the scoring of the AFLP analysis, and we describe a new statistic, the information content per bin (Ibin) that represents a valuable estimator during the optimization process. This statistic can be computed at any stage of the AFLP analysis without requiring the inclusion of replicated samples. Finally, we show that downstream analyses are not equally sensitive to scoring errors. Indeed, although a reasonable amount of flexibility is allowed during the optimization of the scoring procedure without causing considerable changes in the detection of genetic structure patterns, notable discrepancies are observed when estimating genetic diversities from differently scored datasets.
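The two scoring errors named above can be illustrated with a toy binning routine. The sketch below is not RawGeno's algorithm; it assumes a simple gap-based rule (a new bin starts whenever two consecutive peak sizes differ by more than a hypothetical `max_gap`), and the peak sizes and sample names are invented for illustration. A tolerance that is too small splits the size variants of one marker into separate bins (bin oversplitting), while one that is too large merges two distinct markers into a single bin (technical homoplasy).

```python
from typing import Dict, List, Tuple

Peak = Tuple[str, float]  # (sample name, fragment size in bp)


def bin_peaks(peaks: List[Peak], max_gap: float) -> List[List[Peak]]:
    """Group peaks into bins by size.

    Assumed rule (illustrative only): peaks are sorted by size and a new
    bin is opened whenever the gap to the previous peak exceeds max_gap.
    """
    bins: List[List[Peak]] = []
    for sample, size in sorted(peaks, key=lambda p: p[1]):
        if bins and size - bins[-1][-1][1] <= max_gap:
            bins[-1].append((sample, size))
        else:
            bins.append([(sample, size)])
    return bins


def to_binary_matrix(
    peaks: List[Peak], samples: List[str], max_gap: float
) -> Dict[str, List[int]]:
    """Score presence (1) or absence (0) of each sample in each bin."""
    bins = bin_peaks(peaks, max_gap)
    return {
        s: [int(any(name == s for name, _ in b)) for b in bins]
        for s in samples
    }


# Invented data: marker A appears as size variants 99.8 and 100.4 bp,
# marker B sits at 101.5 bp, marker C clusters tightly around 150 bp.
peaks = [
    ("s1", 99.8), ("s2", 100.4), ("s3", 101.5),
    ("s1", 150.0), ("s2", 150.1), ("s3", 150.05),
]
samples = ["s1", "s2", "s3"]

for gap in (0.3, 0.7, 1.2):
    matrix = to_binary_matrix(peaks, samples, gap)
    print(f"max_gap={gap}: {len(bin_peaks(peaks, gap))} bins -> {matrix}")
```

With these invented values, the smallest tolerance places s1 and s2 in different bins for marker A even though they carry the same marker, and the largest tolerance scores all three samples identically, erasing the distinction between markers A and B.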

CONCLUSION

Our algorithm appears to perform as well as a commercial program in automating AFLP scoring, at least in the context of population genetics or phylogeographic studies. To our knowledge, RawGeno is the only freely available public-domain software for fully automated AFLP scoring, from electropherogram files to user-defined working binary matrices. RawGeno was implemented in an R CRAN package (with a user-friendly GUI) and can be found at http://sourceforge.net/projects/rawgeno.

Similar articles

Evaluating the impact of scoring parameters on the structure of intra-specific genetic variation using RawGeno, an R package for automating AFLP scoring.

BMC Bioinformatics. 2009 Jan 26;10:33. doi: 10.1186/1471-2105-10-33.

Automated scoring of AFLPs using RawGeno v 2.0, a free R CRAN library.

Methods Mol Biol. 2012;888:155-75. doi: 10.1007/978-1-61779-870-2_10.

Influence of parameter settings in automated scoring of AFLPs on population genetic analysis.

Mol Ecol Resour. 2013 Jan;13(1):128-34. doi: 10.1111/1755-0998.12033. Epub 2012 Nov 26.

Optimizing automated AFLP scoring parameters to improve phylogenetic resolution.

Syst Biol. 2008 Jun;57(3):347-66. doi: 10.1080/10635150802044037.

optiFLP: software for automated optimization of amplified fragment length polymorphism scoring parameters.

Mol Ecol Resour. 2011 Nov;11(6):1113-8. doi: 10.1111/j.1755-0998.2011.03043.x. Epub 2011 Jun 28.

Selection criteria for scoring amplified fragment length polymorphisms (AFLPs) positively affect the reliability of population genetic parameter estimates.

Genome. 2010 Apr;53(4):302-10. doi: 10.1139/g10-006.

Automated masking of AFLP markers improves reliability of phylogenetic analyses.

PLoS One. 2012;7(11):e49119. doi: 10.1371/journal.pone.0049119. Epub 2012 Nov 9.

AFLPMax: a user-friendly application for computing the optimal number of amplified fragment length polymorphism markers needed in phylogenetic reconstruction.

Mol Ecol Resour. 2012 May;12(3):566-9. doi: 10.1111/j.1755-0998.2011.03113.x. Epub 2012 Jan 23.

Impact of amplified fragment length polymorphism size homoplasy on the estimation of population genetic diversity and the detection of selective loci.

Genetics. 2008 May;179(1):539-54. doi: 10.1534/genetics.107.083246.

AFLP-AFLP in silico-NGS approach reveals polymorphisms in repetitive elements in the malignant genome.

PLoS One. 2018 Nov 8;13(11):e0206620. doi: 10.1371/journal.pone.0206620. eCollection 2018.

Cited by

Integrative taxonomy reveals cryptic diversity within the alliance (Euphorbiaceae) in the central Balkan Peninsula.

Front Plant Sci. 2025 Apr 14;16:1558466. doi: 10.3389/fpls.2025.1558466. eCollection 2025.

Patterns of Genetic and Morphological Variability of (Lamiaceae) on the Balkan Peninsula.

Plants (Basel). 2024 Dec 23;13(24):3596. doi: 10.3390/plants13243596.

Apomictic Mountain Whitebeam (, Rosaceae) Comprises Several Genetically and Morphologically Divergent Lineages.

Biology (Basel). 2023 Feb 27;12(3):380. doi: 10.3390/biology12030380.

Disentangling Relationships among the Alpine Species of Sect. (Juncaceae) in the Eastern Alps.

Plants (Basel). 2023 Feb 20;12(4):973. doi: 10.3390/plants12040973.

When ecological marginality is not geographically peripheral: exploring genetic predictions of the centre-periphery hypothesis in the endemic plant .

PeerJ. 2021 Mar 10;9:e11039. doi: 10.7717/peerj.11039. eCollection 2021.

Phenotypic Responses, Reproduction Mode and Epigenetic Patterns under Temperature Treatments in the Alpine Plant Species (Ranunculaceae).

Biology (Basel). 2020 Sep 29;9(10):315. doi: 10.3390/biology9100315.

DNA methylation patterns respond to thermal stress in the viviparous cockroach .

Epigenetics. 2021 Mar;16(3):313-326. doi: 10.1080/15592294.2020.1795603. Epub 2020 Aug 10.

Epigenetic Patterns and Geographical Parthenogenesis in the Alpine Plant Species (Ranunculaceae).

Int J Mol Sci. 2020 May 7;21(9):3318. doi: 10.3390/ijms21093318.

Effects of Temperature Treatments on Cytosine-Methylation Profiles of Diploid and Autotetraploid Plants of the Alpine Species (Ranunculaceae).

Front Plant Sci. 2020 Apr 8;11:435. doi: 10.3389/fpls.2020.00435. eCollection 2020.

Ancestral remnants or peripheral segregates? Phylogenetic relationships of two narrowly endemic species (Orobanchaceae) from the eastern European Alps.

AoB Plants. 2019 Feb 19;11(2):plz007. doi: 10.1093/aobpla/plz007. eCollection 2019 Apr.

References

An objective, rapid and reproducible method for scoring AFLP peak-height data that minimizes genotyping error.

Mol Ecol Resour. 2008 Jul;8(4):725-35. doi: 10.1111/j.1755-0998.2007.02073.x.

Optimizing automated AFLP scoring parameters to improve phylogenetic resolution.

Syst Biol. 2008 Jun;57(3):347-66. doi: 10.1080/10635150802044037.

Impact of amplified fragment length polymorphism size homoplasy on the estimation of population genetic diversity and the detection of selective loci.

Genetics. 2008 May;179(1):539-54. doi: 10.1534/genetics.107.083246.

Statistical analysis of amplified fragment length polymorphism data: a toolbox for molecular ecologists and evolutionists.

Mol Ecol. 2007 Sep;16(18):3737-58. doi: 10.1111/j.1365-294X.2007.03435.x.

Almost forgotten or latest practice? AFLP applications, analyses and advances.

Trends Plant Sci. 2007 Mar;12(3):106-17. doi: 10.1016/j.tplants.2007.02.001. Epub 2007 Feb 14.

Towards unbiased parentage assignment: combining genetic, behavioural and spatial data in a Bayesian framework.

Mol Ecol. 2006 Oct;15(12):3715-30. doi: 10.1111/j.1365-294X.2006.03050.x.

PSMIX: an R package for population structure inference via maximum likelihood method.

BMC Bioinformatics. 2006 Jun 22;7:317. doi: 10.1186/1471-2105-7-317.

Genotyping errors: causes, consequences and solutions.

Nat Rev Genet. 2005 Nov;6(11):847-59. doi: 10.1038/nrg1707.

How to track and assess genotyping errors in population genetics studies.

Mol Ecol. 2004 Nov;13(11):3261-73. doi: 10.1111/j.1365-294X.2004.02346.x.

Systematic differences in electropherogram peak heights reported by different versions of the GeneScan software.

J Forensic Sci. 2004 Jan;49(1):92-5.
