Not that kind of tree: Assessing the potential for decision tree-based plant identification using trait databases.

Author information

Almeida Brianna K, Garg Manish, Kubat Miroslav, Afkhami Michelle E

Affiliations

Department of Biology, University of Miami, 1301 Memorial Drive, Coral Gables, Florida 33143, USA.

Department of Electrical and Computer Engineering, University of Miami, 1251 Memorial Drive, Coral Gables, Florida 33143, USA.

Publication information

Appl Plant Sci. 2020 Jul 31;8(7):e11379. doi: 10.1002/aps3.11379. eCollection 2020 Jul.

Abstract

PREMISE

Advancements in machine learning and the rise of accessible "big data" provide an important opportunity to improve trait-based plant identification. Here, we applied decision-tree induction to a subset of data from the TRY plant trait database to (1) assess the potential of decision trees for plant identification and (2) determine informative traits for distinguishing taxa.

METHODS

Decision trees were induced using 16 vegetative and floral traits (689 species, 20 genera). We assessed how well the algorithm classified species from test data and pinpointed those traits that were important for identification across diverse taxa.
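To make the induction step concrete, the following is a minimal sketch of entropy-based (ID3-style) decision-tree induction on categorical traits. The abstract does not state which algorithm variant or software the authors used, so this is an illustration under that assumption, not their method. The trait names, the placeholder genera (`GenusA`, `GenusB`, `GenusC`), and the six toy records are all hypothetical and are not taken from the TRY database.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, trait):
    """Entropy reduction obtained by splitting the rows on one categorical trait."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[trait], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def induce_tree(rows, labels, traits):
    """Build a tree recursively; leaves are genus names, internal nodes are dicts."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not traits:
        return majority
    # Split on the trait with the highest information gain at this node.
    best = max(traits, key=lambda t: information_gain(rows, labels, t))
    node = {"trait": best, "children": {}, "default": majority}
    remaining = [t for t in traits if t != best]
    for value in {row[best] for row in rows}:
        pairs = [(r, g) for r, g in zip(rows, labels) if r[best] == value]
        node["children"][value] = induce_tree(
            [r for r, _ in pairs], [g for _, g in pairs], remaining
        )
    return node

def classify(tree, row):
    """Walk from the root to a leaf; unseen trait values fall back to the majority genus."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["trait"]], tree["default"])
    return tree

# Hypothetical toy records: one dict of categorical traits per species.
rows = [
    {"leaf_type": "broad", "leaf_arrangement": "alternate", "flower_color": "green"},
    {"leaf_type": "broad", "leaf_arrangement": "alternate", "flower_color": "yellow"},
    {"leaf_type": "broad", "leaf_arrangement": "opposite", "flower_color": "green"},
    {"leaf_type": "broad", "leaf_arrangement": "opposite", "flower_color": "red"},
    {"leaf_type": "needle", "leaf_arrangement": "alternate", "flower_color": "none"},
    {"leaf_type": "needle", "leaf_arrangement": "opposite", "flower_color": "none"},
]
genera = ["GenusA", "GenusA", "GenusB", "GenusB", "GenusC", "GenusC"]

tree = induce_tree(rows, genera, list(rows[0]))
training_accuracy = sum(classify(tree, r) == g for r, g in zip(rows, genera)) / len(rows)
print(tree["trait"], training_accuracy)
```

The sketch grows the tree without pruning and scores it on the same records used to build it, which is only loosely analogous to the unpruned-tree evaluation reported in the abstract. Assessing classification of new species would require holding out test records, and pruning is not implemented here.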

RESULTS

The unpruned tree correctly placed 98% of the species in our data set into genera, indicating the promise of decision trees for distinguishing among the species used to construct them. Furthermore, in the pruned tree, an average of 89% of the species from the test data sets were properly classified into their genera, demonstrating that decision trees can also classify new species into genera already represented in the tree. Closer inspection revealed that seven of the 16 traits were sufficient for the classification, and these traits yielded approximately twice the initial information gain of those not included.
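The abstract does not say how information gain was computed. Under the standard entropy-based definition used in decision-tree induction, it could be written as follows, where the symbols are assumptions introduced here: S is the set of species, G the set of genera, p_g the proportion of species in S belonging to genus g, and S_v the subset of S that takes value v for trait t.

```latex
H(S) = -\sum_{g \in G} p_g \log_2 p_g,
\qquad
\mathrm{IG}(S, t) = H(S) - \sum_{v \in \mathrm{values}(t)} \frac{|S_v|}{|S|}\, H(S_v)
```

With 20 genera, H(S) is at most log2(20) ≈ 4.32 bits, a bound reached only if every genus were equally represented, which the abstract does not state. "Initial" information gain presumably refers to the gain of a trait evaluated on the full data set, before any split has been made.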

DISCUSSION

Our findings demonstrate the potential for tree-based machine learning and big data in distinguishing among taxa and determining which traits are important for plant identification.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/95a8/7394705/15c79d62b747/APS3-8-e11379-g001.jpg
