Automated 3D phenotype analysis using data mining.

Author information

Plyusnin Ilya, Evans Alistair R, Karme Aleksis, Gionis Aristides, Jernvall Jukka

Affiliation

Institute of Biotechnology, University of Helsinki, Helsinki, Finland.

Publication information

PLoS One. 2008 Mar 5;3(3):e1742. doi: 10.1371/journal.pone.0001742.

Abstract

The ability to analyze and classify three-dimensional (3D) biological morphology has lagged behind the analysis of other biological data types such as gene sequences. Here, we introduce data mining techniques to the study of 3D biological shapes to bring the analysis of phenomes closer to the efficiency of studying genomes. We compiled five training sets of highly variable morphologies of mammalian teeth from the MorphoBrowser database. Samples were labeled either by dietary class or by conventional dental type (e.g., carnassial, selenodont). We automatically extracted a multitude of topological attributes using Geographic Information Systems (GIS)-like procedures. These attributes were then used in several combinations of feature selection schemes and probabilistic classification models to build and optimize classifiers for predicting the labels of the training sets. In terms of classification accuracy, computational time, and size of the feature sets used, non-repeated best-first search combined with a 1-nearest-neighbor classifier was the best approach. However, several other classification models combined with the same search scheme proved practical. The current study represents a first step in the automatic analysis of 3D phenotypes, which will become increasingly valuable as 3D morphology and phenomics databases grow.
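As a rough illustration of the kind of pipeline the abstract describes, the following sketch pairs a forward best-first search over feature subsets with a 1-nearest-neighbor classifier scored by leave-one-out accuracy. It is not the authors' implementation: the function names, the `patience` stopping rule, the leave-one-out scoring, and the toy data are all assumptions made for illustration, and the "non-repeated" search variant named in the abstract is not reproduced here.

```python
import heapq
import math
from itertools import count


def nearest_label(train, query, features):
    """Return the label of the training sample closest to `query` (1-NN)."""
    closest = min(
        train,
        key=lambda s: math.dist([s[0][f] for f in features],
                                [query[f] for f in features]),
    )
    return closest[1]


def loo_accuracy(samples, features):
    """Leave-one-out accuracy of 1-NN restricted to the given feature indices."""
    if not features:
        return 0.0
    hits = 0
    for i, (vec, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        hits += nearest_label(rest, vec, features) == label
    return hits / len(samples)


def best_first_selection(samples, n_features, patience=5):
    """Forward best-first search over feature subsets (illustrative only).

    Always expands the most accurate unexpanded subset and stops after
    `patience` consecutive expansions that fail to improve the best score.
    """
    tie = count()  # tie-breaker so the heap never has to compare frozensets
    start = frozenset()
    open_heap = [(0.0, next(tie), start)]  # min-heap keyed on negative accuracy
    seen = {start}
    best_set, best_acc = start, 0.0
    stale = 0
    while open_heap and stale < patience:
        _, _, subset = heapq.heappop(open_heap)
        improved = False
        for f in range(n_features):
            child = subset | {f}
            if f in subset or child in seen:
                continue
            seen.add(child)
            acc = loo_accuracy(samples, sorted(child))
            heapq.heappush(open_heap, (-acc, next(tie), child))
            if acc > best_acc:
                best_set, best_acc, improved = child, acc, True
        stale = 0 if improved else stale + 1
    return sorted(best_set), best_acc


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
samples = [
    ([0.1, 5.0], "a"), ([0.2, 1.0], "a"), ([0.3, 9.0], "a"),
    ([0.9, 5.2], "b"), ([1.0, 1.1], "b"), ([1.1, 9.1], "b"),
]
selected, accuracy = best_first_selection(samples, n_features=2)
print(selected, accuracy)  # prints: [0] 1.0
```

On this toy set the search keeps only feature 0, because adding the noisy feature 1 pulls each sample toward a neighbor of the wrong class. With real attribute tables, a wrapper search of this kind evaluates many subsets, which is why the abstract weighs computational time and feature-set size alongside accuracy.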

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9876/2254194/f9beda28b7db/pone.0001742.g001.jpg
