Finding Principal Paths in Data Space.

Author information

Ferrarotti Marco Jacopo, Rocchia Walter, Decherchi Sergio

Publication information

IEEE Trans Neural Netw Learn Syst. 2019 Aug;30(8):2449-2462. doi: 10.1109/TNNLS.2018.2884792. Epub 2018 Dec 25.

Abstract

In this paper, we introduce the concept of principal paths in data space; we show that this is a well-characterized problem from the point of view of cognition, and that it can lead to salient insights in the analyzed data, enabling topological/holistic descriptions. These paths, interestingly, can be interpreted as local principal curves, and in this paper, we suggest that they are analogous to what, in the statistical mechanics realm, are called minimum free-energy paths. Here, we move that concept from physics to data space and compute the paths in both the original and the kernel space. The algorithm is a regularized version of the well-known k-means clustering algorithm. The regularization parameter is derived via an in-sample model selection process based on Bayesian evidence maximization. Interestingly, we show that this choice of regularization parameter consistently leads to the same manifold even when the number of clusters changes. We apply the method to common data sets, dynamical systems, and, in particular, molecular dynamics trajectories, showing the generality and usefulness of the approach and its superiority over other related approaches.
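
To make the phrase "a regularized version of the k-means clustering algorithm" concrete, here is a minimal sketch of one way such a method could look. The abstract does not state the cost function, so everything specific here is an assumption for illustration: the centroids are treated as an ordered chain of waypoints between two fixed endpoints, and a quadratic penalty `s * ||w[j+1] - w[j]||^2` on consecutive waypoints is added to the usual k-means cost. The function name `regularized_kmeans_path` and its parameters are hypothetical, and `s` is fixed by hand rather than selected by Bayesian evidence maximization as the paper describes.

```python
import numpy as np


def regularized_kmeans_path(X, start, end, n_waypoints=10, s=1.0, n_iter=50):
    """Illustrative chain-regularized k-means (assumed penalty, see lead-in).

    Minimizes  sum_i ||x_i - w_{u(i)}||^2 + s * sum_j ||w_{j+1} - w_j||^2
    over the interior waypoints, with w_0 = start and w_{k+1} = end fixed.
    """
    if s <= 0:
        raise ValueError("s must be positive so the update system is solvable")
    X = np.asarray(X, dtype=float)
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    k = n_waypoints

    # Initialize all waypoints on the straight segment between the endpoints.
    t = np.linspace(0.0, 1.0, k + 2)[:, None]
    W = (1.0 - t) * start + t * end  # shape (k + 2, d)

    for _ in range(n_iter):
        # Assignment step (as in plain k-means): nearest waypoint per sample.
        d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # Per-waypoint sample counts and coordinate sums.
        counts = np.bincount(labels, minlength=k + 2).astype(float)
        sums = np.zeros_like(W)
        np.add.at(sums, labels, X)

        # Update step: setting the gradient to zero gives, for interior j,
        #   (n_j + 2 s) w_j - s w_{j-1} - s w_{j+1} = sum of samples in cluster j
        # which is a tridiagonal linear system in the interior waypoints.
        A = np.zeros((k, k))
        B = sums[1:-1].copy()
        for j in range(k):
            A[j, j] = counts[j + 1] + 2.0 * s
            if j > 0:
                A[j, j - 1] = -s
            if j < k - 1:
                A[j, j + 1] = -s
        B[0] += s * W[0]    # contribution of the fixed start point
        B[-1] += s * W[-1]  # contribution of the fixed end point
        W[1:-1] = np.linalg.solve(A, B)

    return W
```

Under these assumptions, a very small `s` lets each waypoint drift toward its cluster mean, as in ordinary k-means, while a very large `s` pulls the waypoints onto the straight segment between the endpoints. The choice of `s` therefore controls how closely the path follows the data, which is the role the abstract assigns to the regularization parameter.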
