Multi-Layer and Recursive Neural Networks for Metagenomic Classification.

Authors

Ditzler Gregory, Polikar Robi, Rosen Gail

Publication

IEEE Trans Nanobioscience. 2015 Sep;14(6):608-16. doi: 10.1109/TNB.2015.2461219. Epub 2015 Aug 24.

Abstract

Recent advances in machine learning, specifically in deep learning with neural networks, have made a profound impact on fields such as natural language processing, image classification, and language modeling; however, the feasibility and potential benefits of these approaches for metagenomic data analysis have been largely under-explored. Deep learning exploits many layers of learned nonlinear feature representations, typically trained in an unsupervised fashion, and recent results have shown outstanding generalization performance on previously unseen data. Furthermore, some deep learning methods can also represent the structure in a data set. Consequently, deep learning and neural networks may prove to be an appropriate approach for metagenomic data. To determine whether such approaches are indeed appropriate for metagenomics, we experiment with two deep learning methods: i) a deep belief network, and ii) a recursive neural network, the latter of which provides a tree representing the structure of the data. We compare these approaches to the standard multi-layer perceptron, which is well established in the machine learning community as a powerful prediction algorithm, though it is largely absent from the metagenomics literature. We find that traditional neural networks can be quite powerful classifiers on metagenomic data compared to baseline methods such as random forests. On the other hand, while the deep learning approaches did not improve classification accuracy, they do provide the ability to learn hierarchical representations of a data set, which standard classification methods do not allow. Our goal in this effort is not to determine the best algorithm in terms of accuracy, as that depends on the specific application, but rather to highlight the benefits and drawbacks of each of the approaches we discuss and to provide insight into how they can be improved for predictive metagenomic analysis.
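
The kind of comparison the abstract describes, a multi-layer perceptron against a random forest baseline, can be sketched roughly as follows. This is only an illustrative sketch and not the authors' pipeline: the data are synthetic relative abundances generated by a hypothetical helper (`make_synthetic_abundances`), the use of scikit-learn is an assumption, and the hidden-layer size, number of trees, and fold count are arbitrary choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_abundances(n_samples=200, n_taxa=50, seed=0):
    """Hypothetical stand-in for a samples-by-taxa abundance table."""
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, 2, size=n_samples)
    # One taxon profile per class, drawn from a Dirichlet distribution.
    profiles = rng.dirichlet(np.ones(n_taxa), size=2)
    # Simulate 1000 sequencing reads per sample from its class profile.
    counts = np.vstack([rng.multinomial(1000, profiles[y]) for y in labels])
    # Convert raw counts to relative abundances (each row sums to 1).
    abundances = counts / counts.sum(axis=1, keepdims=True)
    return abundances, labels


def compare_classifiers(X, y, folds=5):
    """Return mean cross-validated accuracy for an MLP and a random forest."""
    models = {
        # Standardizing inputs usually helps gradient-based MLP training.
        "mlp": make_pipeline(
            StandardScaler(),
            MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000, random_state=0),
        ),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    return {
        name: float(cross_val_score(model, X, y, cv=folds).mean())
        for name, model in models.items()
    }


if __name__ == "__main__":
    X, y = make_synthetic_abundances()
    for name, accuracy in compare_classifiers(X, y).items():
        print(f"{name}: {accuracy:.3f}")
```

Accuracies on synthetic data like this say nothing about real metagenomic samples; the sketch only shows how such a side-by-side evaluation might be set up.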
