Abdo Zaid, Golding G Brian
Department of Mathematics, University of Idaho, Moscow, Idaho 83844, USA.
Syst Biol. 2007 Feb;56(1):44-56. doi: 10.1080/10635150601167005.
A major part of the barcoding-of-life problem is assigning newly sequenced or sampled individuals to existing groups that have been identified in advance by external means (by a taxonomist, for example). This requires evaluating the statistical evidence for associating a sequence from a new individual with one group or another. The main concern of our current research is to perform this task quickly and accurately. To this end, we have developed a model-based, decision-theoretic framework based on coalescent theory. Under this framework, we assign a newly sampled individual using both distance and the posterior probability of each group, given the sequences from the members of that group and the sequence from the new individual. We believe that this approach makes efficient use of the information available in the data. Our preliminary results indicate that it is more accurate than assignment based on a simple measure of distance.
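The assignment step described in the abstract, which combines a distance with a posterior probability for each candidate group, can be illustrated with a minimal Python sketch. This is not the authors' coalescent-based model: the likelihood here is a deliberately simple stand-in that treats aligned sites as independent and uses smoothed per-site base frequencies. The function names, the pseudocount, the uniform prior, and the toy sequences are all assumptions made for illustration.

```python
import math


def p_distance(a, b):
    """Proportion of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def mean_distance(query, members):
    """Average p-distance from the query to every member of one group."""
    return sum(p_distance(query, m) for m in members) / len(members)


def log_likelihood(query, members, pseudocount=0.5, alphabet="ACGT"):
    """Stand-in likelihood (NOT a coalescent model): independent sites,
    with base frequencies at each site estimated from the group members
    and smoothed by a pseudocount (an assumed value)."""
    denom = len(members) + pseudocount * len(alphabet)
    total = 0.0
    for i, base in enumerate(query):
        count = sum(m[i] == base for m in members)
        total += math.log((count + pseudocount) / denom)
    return total


def posteriors(query, groups, priors=None):
    """Posterior probability of each group given the query sequence.
    A uniform prior over groups is assumed unless priors are supplied."""
    names = list(groups)
    if priors is None:
        priors = {name: 1.0 / len(names) for name in names}
    log_post = {
        name: math.log(priors[name]) + log_likelihood(query, groups[name])
        for name in names
    }
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post.values())
    weights = {name: math.exp(v - peak) for name, v in log_post.items()}
    norm = sum(weights.values())
    return {name: w / norm for name, w in weights.items()}


def assign(query, groups):
    """Return the group with the highest posterior, plus both evidence measures."""
    post = posteriors(query, groups)
    dist = {name: mean_distance(query, members) for name, members in groups.items()}
    best = max(post, key=post.get)
    return best, post, dist


# Toy, made-up reference groups and query sequence.
groups = {
    "group_A": ["ACGTACGT", "ACGTACGA"],
    "group_B": ["TTGTACCC", "TTGTACCA"],
}
best, post, dist = assign("ACGTACGT", groups)
print(best, post, dist)
```

Reporting the posterior alongside the distance lets a caller see how strongly the evidence favors one group over the others, rather than only which group is nearest. In a coalescent-based treatment such as the one the abstract describes, the stand-in `log_likelihood` would be replaced by a model-based likelihood.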