Finding Principal Paths in Data Space.

Author information

Ferrarotti Marco Jacopo, Rocchia Walter, Decherchi Sergio

Publication information

IEEE Trans Neural Netw Learn Syst. 2019 Aug;30(8):2449-2462. doi: 10.1109/TNNLS.2018.2884792. Epub 2018 Dec 25.

DOI:10.1109/TNNLS.2018.2884792

Abstract

In this paper, we introduce the concept of principal paths in data space; we show that this is a well-characterized problem from the point of view of cognition, and that it can lead to salient insights in the analyzed data enabling topological/holistic descriptions. These paths, interestingly, can be interpreted as local principal curves, and in this paper, we suggest that they are analogous to what, in the statistical mechanics realm, are called minimum free-energy paths. Here, we move that concept from physics to data space and compute them in both the original and the kernel space. The algorithm is a regularized version of the well-known k-means clustering algorithm. The regularization parameter is derived via an in-sample model selection process based on the Bayesian evidence maximization. Interestingly, we show that this choice for the regularization parameter consistently leads to the same manifold even when changing the number of clusters. We apply the method to common data sets, dynamical systems, and, in particular, to molecular dynamics trajectories showing the generality, the usefulness of the approach and its superiority with respect to other related approaches.
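The abstract describes the algorithm only as a regularized version of k-means. As a rough illustration of what such a regularization can look like, the sketch below assumes a cost of the form sum_i ||x_i - w_{u_i}||^2 + s * sum_j ||w_{j+1} - w_j||^2, where the waypoints w_j form a chain between two fixed endpoints. The cost form, the fixed endpoints, the function name `regularized_kmeans_path`, and all parameter names are assumptions made for this example and are not taken from the abstract. The regularization weight `s` is passed in by hand here; the Bayesian evidence maximization mentioned in the abstract is not implemented.

```python
import numpy as np


def regularized_kmeans_path(X, start, end, n_waypoints=10, s=1.0, n_iter=50):
    """Chain-regularized k-means sketch (illustrative, not the paper's code).

    Assumed cost: sum_i ||x_i - w_{u_i}||^2 + s * sum_j ||w_{j+1} - w_j||^2,
    with w_0 = start and w_{k+1} = end held fixed. Requires s > 0.
    """
    X = np.asarray(X, dtype=float)
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    k = n_waypoints

    # Initialise the k free waypoints on the straight segment between endpoints.
    t = np.linspace(0.0, 1.0, k + 2)[1:-1, None]
    W = (1.0 - t) * start + t * end

    # Tridiagonal coupling matrix produced by the chain penalty.
    L = 2.0 * np.eye(k) - np.eye(k, k=1) - np.eye(k, k=-1)

    for _ in range(n_iter):
        # Assignment step (as in plain k-means): nearest waypoint per sample.
        d2 = ((X[:, None, :] - W[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)

        # Update step: set the gradient of the assumed cost to zero, which
        # gives the linear system (diag(counts) + s * L) W = sums + boundary.
        counts = np.bincount(labels, minlength=k).astype(float)
        rhs = np.zeros_like(W)
        np.add.at(rhs, labels, X)  # per-cluster sums of the samples
        rhs[0] += s * start        # contribution of the fixed first endpoint
        rhs[-1] += s * end         # contribution of the fixed last endpoint
        W_new = np.linalg.solve(np.diag(counts) + s * L, rhs)

        if np.allclose(W_new, W):
            break
        W = W_new

    # Return the full path including both fixed endpoints.
    return np.vstack([start, W, end])
```

Under these assumptions, a small `s` lets each waypoint drift toward the mean of its assigned samples, as in ordinary k-means, while a very large `s` pulls the waypoints toward an evenly spaced straight line between the endpoints.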

Similar articles

Enhanced manifold regularization for semi-supervised classification.

J Opt Soc Am A Opt Image Sci Vis. 2016 Jun 1;33(6):1207-13. doi: 10.1364/JOSAA.33.001207.

Computing Leapfrog Regularization Paths with Applications to Large-Scale K-mer Logistic Regression.

A Nonparametric Deep Generative Model for Multimanifold Clustering.

IEEE Trans Cybern. 2019 Jul;49(7):2664-2677. doi: 10.1109/TCYB.2018.2832171. Epub 2018 May 16.

Semisupervised Support Vector Machines With Tangent Space Intrinsic Manifold Regularization.

IEEE Trans Neural Netw Learn Syst. 2016 Sep;27(9):1827-39. doi: 10.1109/TNNLS.2015.2461009. Epub 2015 Aug 10.

Feature Selection and Kernel Learning for Local Learning-Based Clustering.

IEEE Trans Pattern Anal Mach Intell. 2011 Aug;33(8):1532-47. doi: 10.1109/TPAMI.2010.215. Epub 2010 Dec 10.

Laplacian embedded regression for scalable manifold regularization.

IEEE Trans Neural Netw Learn Syst. 2012 Jun;23(6):902-15. doi: 10.1109/TNNLS.2012.2190420.

Regularized Gaussian Mixture Model for High-Dimensional Clustering.

IEEE Trans Cybern. 2019 Oct;49(10):3677-3688. doi: 10.1109/TCYB.2018.2846404. Epub 2018 Jun 27.

Regularization of Mixture Models for Robust Principal Graph Learning.

IEEE Trans Pattern Anal Mach Intell. 2022 Dec;44(12):9119-9130. doi: 10.1109/TPAMI.2021.3124973. Epub 2022 Nov 7.

On Bayesian mechanics: a physics of and by beliefs.

Interface Focus. 2023 Apr 14;13(3):20220029. doi: 10.1098/rsfs.2022.0029. eCollection 2023 Jun 6.

Cited by

Binding Free Energy Calculations Based on the Path Collective Variable along a String Pathway.

J Phys Chem B. 2025 Jul 10;129(27):6805-6816. doi: 10.1021/acs.jpcb.5c02258. Epub 2025 Jun 30.

Path-Based Nonequilibrium Binding Free Energy Estimation, from Protein-Ligand to RNA-Ligand Binding.

J Chem Inf Model. 2025 Jun 23;65(12):6057-6072. doi: 10.1021/acs.jcim.5c00452. Epub 2025 Jun 6.

Nonequilibrium Binding Free Energy Simulations: Minimizing Dissipation.

J Chem Theory Comput. 2025 Feb 25;21(4):2079-2094. doi: 10.1021/acs.jctc.4c01453. Epub 2025 Feb 5.

Machine Learning and Enhanced Sampling Simulations for Computing the Potential of Mean Force and Standard Binding Free Energy.

J Chem Theory Comput. 2021 Aug 10;17(8):5287-5300. doi: 10.1021/acs.jctc.1c00177. Epub 2021 Jul 14.

Editorial: Molecular Dynamics and Machine Learning in Drug Discovery.

Front Mol Biosci. 2021 Apr 13;8:673773. doi: 10.3389/fmolb.2021.673773. eCollection 2021.

Thermodynamics and Kinetics of Drug-Target Binding by Molecular Simulation.

Chem Rev. 2020 Dec 9;120(23):12788-12833. doi: 10.1021/acs.chemrev.0c00534. Epub 2020 Oct 2.

Structural Transition States Explored With Minimalist Coarse Grained Models: Applications to Calmodulin.

Front Mol Biosci. 2019 Oct 15;6:104. doi: 10.3389/fmolb.2019.00104. eCollection 2019.
