Watanabe Akinobu
Division of Paleontology, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA.
Richard Gilder Graduate School, American Museum of Natural History, Central Park West at 79th Street, New York, NY, 10024, USA.
Cladistics. 2016 Jun;32(3):317-334. doi: 10.1111/cla.12130. Epub 2015 Jul 15.
Despite its ubiquity in the natural world, polymorphism is commonly disregarded or poorly sampled in phylogenetic analyses due to deliberate sampling strategy, inadequate sampling effort and limited specimen availability. Poor sampling of intraspecific variation engenders differential sampling of morphs within polymorphic species, which could generate conflicting tree topologies by altering the character-based affinity among taxa. To assess the potential magnitude of this impact, Polymorphic Entry Replacement Data Analysis (PERDA) was developed as a new script for the TNT phylogenetic program. This script simulates poor sampling of polymorphic taxa on a matrix of discrete characters by iteratively replacing each polymorphic state (e.g. [01]) with a randomly selected single state included in the original polymorphic coding (e.g. 0 or 1). The trees recovered from these subsampled data sets provide a distribution of tree distances, which indicates the level of incongruent trees resulting from different combinations of single states. Performing PERDA on empirical data sets shows alarming frequencies and magnitudes of conflicting tree topologies, demonstrating that poor sampling within polymorphic taxa could yield highly incompatible trees in many data sets. This troubling outcome undermines phylogenetic inferences based on data with poor intraspecific sampling, which is typical for palaeontological studies. With trees obtained from subsampled data sets, PERDA also generates a metaconsensus tree revealing interspecific relationships that become ambiguous due to documented levels of intraspecific variation. These collapsed clades point to taxa for which evidence should be sought to justify their taxonomic classification.
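The replacement step described above can be illustrated with a minimal Python sketch. This is not the PERDA script itself, which is written for TNT; the function names (`parse_row`, `subsample`, `replicates`), the toy matrix and the bracketed string encoding of rows are assumptions made for illustration only. The sketch covers only the random replacement of polymorphic entries; tree searches, tree distances and the metaconsensus tree are left to the phylogenetic program.

```python
import random


def parse_row(row):
    """Split a row such as '0[01]1' into cells: ['0', '01', '1']."""
    cells, i = [], 0
    while i < len(row):
        if row[i] == "[":
            j = row.index("]", i)      # closing bracket of the polymorphic cell
            cells.append(row[i + 1:j])
            i = j + 1
        else:
            cells.append(row[i])       # single state, '?' or '-'
            i += 1
    return cells


def subsample(matrix, rng):
    """Return a copy of the matrix in which every polymorphic cell is
    replaced by one randomly chosen state from its original coding."""
    return {
        taxon: "".join(rng.choice(cell) for cell in parse_row(row))
        for taxon, row in matrix.items()
    }


def replicates(matrix, n, seed=0):
    """Generate n subsampled matrices (hypothetical helper, seeded for repeatability)."""
    rng = random.Random(seed)
    return [subsample(matrix, rng) for _ in range(n)]


if __name__ == "__main__":
    # Toy matrix (invented): taxon name -> character states.
    toy = {"taxon_A": "0[01]1", "taxon_B": "[012]10", "taxon_C": "1?0"}
    for rep in replicates(toy, 3):
        print(rep)
```

Each subsampled matrix would then be analysed separately, and the resulting trees compared to obtain the distribution of tree distances.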