Verbruggen Heroen, Uthanumallian Kavitha, Powrie Felix, Jalali Tara, Cremen Chiela, Preuss Maren, Duchene Sebastian, Diaz-Tapia Pilar
Melbourne Integrative Genomics, School of BioSciences, University of Melbourne, Melbourne, Australia.
CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Campus de Vairão, Universidade do Porto, Vairão, Portugal.
Mol Ecol Resour. 2025 Oct;25(7):e14132. doi: 10.1111/1755-0998.14132. Epub 2025 Jun 10.
Molecular sequence data have become a ubiquitous tool for delimiting species and are particularly important in organisms where morphological traits are not informative about species boundaries. A range of statistical methods have been developed to derive species limits from molecular data, for example, by quantifying changes in branching patterns in phylogenetic trees. We aim to investigate how such methods scale up from single genes to whole organelle genomes. We gathered chloroplast genome data from 38 samples of the red algal genus Dasyclonium and analysed them with the popular species delimitation methods Assemble Species by Automatic Partitioning (ASAP), General Mixed Yule Coalescent (GMYC), and Poisson Tree Processes (PTP). We show extensive variation in inferred species boundaries depending on the method and dataset used. Genome-scale analyses differed substantially between methods, with ASAP predicting the fewest species, PTP an intermediate number, and GMYC the most. Based on a series of simulations, we identify a tendency of GMYC to overestimate species numbers as alignments increase in length, while the other two methods are not sensitive to this scaling. Gene-by-gene analyses show strong differences in predicted species limits, which is unexpected given that all genes are on a single uniparentally inherited chromosome, and highlight that choosing a particular gene as a DNA barcode has significant consequences for species diversity estimates. We show extensive cryptic diversity in the genus Dasyclonium and propose a consensus solution for species limits based on our combined results, enriched with biogeographic and morphological interpretations. Finally, we make recommendations for interpreting the results and improving the inferences drawn from species delimitation methods.