Adams Richard, Lozano Jenniffer Roa, Duncan Mataya, Green Jack, Assis Raquel, DeGiorgio Michael
Department of Entomology and Plant Pathology, University of Arkansas, Fayetteville, AR, USA.
Center for Agricultural Data Analytics, University of Arkansas, Fayetteville, AR, USA.
Mol Biol Evol. 2025 Mar 5;42(3). doi: 10.1093/molbev/msaf032.
Just exactly which tree(s) should we assume when testing evolutionary hypotheses? This question has plagued comparative biologists for decades. Though all phylogenetic comparative methods require input trees, we seldom know with certainty whether even a perfectly estimated tree (if this is possible in practice) is appropriate for our studied traits. Yet, we also know that phylogenetic conflict is ubiquitous in modern comparative biology, and we are still learning about its dangers when testing evolutionary hypotheses. Here, we investigate the consequences of tree-trait mismatch for phylogenetic regression in the presence of gene tree-species tree conflict. Our simulation experiments reveal excessively high false positive rates for mismatched models with both small and large trees, simple and complex traits, and known and estimated phylogenies. In some cases, we find evidence of a directionality of error: assuming a species tree for traits that evolved according to a gene tree sometimes fares worse than the opposite. We also explored the impacts of tree choice using an expansive, cross-species gene expression dataset as an arguably "best-case" scenario in which one may have a better chance of matching tree with trait. Offering a potential path forward, we found promise in the application of a robust estimator as a potential, albeit imperfect, solution to some issues raised by tree mismatch. Collectively, our results emphasize the importance of careful study design for comparative methods, highlighting the need to fully appreciate the role of accurate and thoughtful phylogenetic modeling.