Department of Biology, Colorado State University, Fort Collins, Colorado, 80523-1878, USA.
Royal Botanic Gardens, Kew, Richmond, Surrey, TW9 3AE, UK.
Cladistics. 2022 Oct;38(5):595-611. doi: 10.1111/cla.12507. Epub 2022 May 15.
We examined the impact of successive alignment quality-control steps on downstream phylogenomic analyses. We applied a recently published phylogenomics pipeline that was developed for the Angiosperms353 target-sequence-capture probe set to the flowering plant order Celastrales. Our final dataset consists of 158 species, including at least one exemplar from all 109 currently recognized Celastrales genera. We performed nine quality-control steps and compared the resolution, branch support, and topological congruence of the inferred gene and species trees with those generated after each of the first six steps. We describe and justify each of our quality-control steps, including manual masking, in detail so that they may be readily applied to other lineages. We found that highly supported clades could generally be relied upon even if stringent orthology and alignment quality-control measures had not been applied. However, for both concatenation and coalescence, we identified separate instances in which a clade was highly supported before manual masking but contradicted afterward. These results are generally reassuring for broad-scale analyses that use phylogenomics pipelines, but also indicate that we cannot rely exclusively on these analyses to conclude how challenging phylogenetic problems are best resolved.