Laboratoire CEDRIC, Ecole Nationale Supérieure d'Informatique pour l'Industrie et l'Entreprise 1, square de la Résistance, 91025 Evry cedex, France.
Syst Biol. 2013 Jan 1;62(1):147-56. doi: 10.1093/sysbio/sys081. Epub 2012 Sep 20.
The phylogenetic diversity (PD) of a set of species is a measure of the evolutionary distance among the species in the collection, based on a phylogenetic tree. Such a tree is composed of a root, internal nodes, and leaves that correspond to the set of taxa under study. With each edge of the tree is associated a non-negative branch length (evolutionary distance). If a particular survival probability is associated with each taxon, the PD measure becomes the expected PD measure. In the Noah's Ark Problem (NAP) introduced by Weitzman (1998), these survival probabilities can be increased at some cost. The problem is to determine how best to allocate a limited amount of resources to maximize the expected PD of the considered species. It is easy to formulate the NAP as a (difficult) nonlinear 0-1 programming problem. The aim of this article is to show that a general version of the NAP (GNAP) can be solved simply and efficiently with any set of edge weights and any set of survival probabilities by using standard mixed-integer linear programming software. The crucial point to move from a nonlinear program in binary variables to a mixed-integer linear program, is to approximate the logarithmic function by the lower envelope of a set of tangents to the curve. Solving the obtained mixed-integer linear program provides not only a near-optimal solution but also an upper bound on the value of the optimal solution. We also applied this approach to a generalization of the nature reserve problem (GNRP) that consists of selecting a set of regions to be conserved so that the expected PD of the set of species present in these regions is maximized. In this case, the survival probabilities of different taxa are not independent of each other. Computational results are presented to illustrate potentialities of the approach. Near-optimal solutions with hypothetical phylogenetic trees comprising about 4000 taxa are obtained in a few seconds or minutes of computing time for the GNAP, and in about 30 min for the GNRP. In all the cases the average guarantee varies from 0% to 1.20%.
一组物种的系统发育多样性(PD)是基于系统发育树衡量集合中物种进化距离的一种度量。这样的树由根、内部节点和叶组成,对应于研究的一组分类单元。树的每一条边都与一个非负的分支长度(进化距离)相关联。如果与每个分类单元相关联一个特定的生存概率,则 PD 度量成为预期 PD 度量。在 Weitzman(1998)提出的诺亚方舟问题(NAP)中,可以以一定的成本增加这些生存概率。问题是确定如何最好地分配有限的资源,以最大化所考虑物种的预期 PD。很容易将 NAP 表述为(困难)非线性 0-1 规划问题。本文的目的是表明,通过使用标准的混合整数线性规划软件,对于任何边权重集和任何生存概率集,都可以简单有效地解决一般版本的 NAP(GNAP)。从二进制变量的非线性规划到混合整数线性规划的关键在于,通过曲线的一组切线的下包络来逼近对数函数。求解得到的混合整数线性规划不仅提供了一个接近最优的解,而且提供了最优解值的上界。我们还将这种方法应用于自然保护区问题(GNRP)的推广,该问题包括选择一组要保护的区域,以使这些区域中存在的物种的预期 PD 最大化。在这种情况下,不同分类单元的生存概率彼此不独立。计算结果表明了该方法的潜力。对于 GNAP,使用包含约 4000 个分类单元的假设系统发育树,在几秒钟或几分钟的计算时间内获得近最优解,对于 GNRP,大约需要 30 分钟。在所有情况下,平均保证从 0%到 1.20%不等。