Systematic Botany and Mycology, University of Munich, Menzinger Str. 67, 80638 Munich, Germany.
Syst Biol. 2012 Oct;61(5):785-92. doi: 10.1093/sysbio/sys031. Epub 2012 Feb 14.
Chronograms from molecular dating are increasingly being used to infer rates of diversification and their change over time. A major limitation in such analyses is incomplete species sampling, which moreover is usually nonrandom. While the widely used γ statistic with the Monte Carlo constant-rates test or the birth-death likelihood analysis with the ΔAICrc test statistic are appropriate for comparing the fit of different diversification models in phylogenies with random species sampling, no objective automated method has been developed for fitting diversification models to nonrandomly sampled phylogenies. Here, we introduce a novel approach, CorSiM, which involves simulating missing splits under a constant-rate birth-death model and allows the user to specify whether species sampling in the phylogeny being analyzed is random or nonrandom. The completed trees can be used in subsequent model-fitting analyses. This is fundamentally different from previous diversification rate estimation methods, which were based on null distributions derived from the incomplete trees. CorSiM is automated in an R package and can easily be applied to large data sets. We illustrate the approach in two Araceae clades, one with random species sampling of 52% and one with nonrandom sampling of 55%. In the latter clade, the CorSiM approach detects and quantifies an increase in diversification rate, whereas classic approaches prefer a constant-rate model; in the former clade, results do not differ among methods (as expected, since the classic approaches are valid only for randomly sampled phylogenies). The CorSiM method greatly reduces the type I error in diversification analysis, but type II error remains a methodological problem.
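To make concrete what the γ statistic mentioned above measures, the following is a minimal Python sketch of the standard Pybus and Harvey formula, computed from the node ages of an ultrametric tree. It is not code from the CorSiM R package; the function name `gamma_statistic` and its input format (a plain list of node ages measured backward from the present) are assumptions made for illustration.

```python
import math


def gamma_statistic(branching_times):
    """Gamma statistic from internal-node ages (time before present).

    Illustrative helper, not part of CorSiM. Assumes a fully
    bifurcating ultrametric tree, so n tips have n - 1 node ages.
    """
    ages = sorted(branching_times, reverse=True)  # root age first
    n = len(ages) + 1                             # number of tips
    if n < 3:
        raise ValueError("gamma needs a tree with at least 3 tips")

    # g_k is the length of the interval with exactly k lineages (k = 2..n);
    # the last interval ends at the present (age 0).
    bounds = ages + [0.0]
    weighted = [k * (bounds[k - 2] - bounds[k - 1]) for k in range(2, n + 1)]
    total = sum(weighted)  # T = sum over k of k * g_k

    # Mean of the partial sums sum_{k=2..i} k * g_k for i = 2..n-1.
    running = 0.0
    acc = 0.0
    for w in weighted[:-1]:
        running += w
        acc += running
    mean_partial = acc / (n - 2)

    return (mean_partial - total / 2) / (total * math.sqrt(1.0 / (12 * (n - 2))))


# Splits concentrated near the root give a negative value,
# splits concentrated near the present give a positive one.
early = gamma_statistic([10, 9, 8, 7])
late = gamma_statistic([10, 3, 2, 1])
```

Under a pure-birth process with complete sampling, γ is approximately standard normal, so strongly negative values are usually read as a slowdown in diversification. Because missing species tend to remove recent splits, incomplete sampling pushes γ downward on its own, which is why a sampling correction such as the Monte Carlo constant-rates test, or tree completion as in CorSiM, is needed before interpreting the value.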