International School for Advanced Studies (SISSA), 34014 Trieste, Italy.
Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany.
Chem Rev. 2021 Aug 25;121(16):9722-9758. doi: 10.1021/acs.chemrev.0c01195. Epub 2021 May 4.
Unsupervised learning is becoming an essential tool to analyze the increasingly large amounts of data produced by atomistic and molecular simulations in materials science, solid-state physics, biophysics, and biochemistry. In this Review, we provide a comprehensive overview of the unsupervised learning methods most commonly used to investigate simulation data and indicate likely directions for further developments in the field. In particular, we discuss feature representation of molecular systems and present state-of-the-art algorithms for dimensionality reduction, density estimation, and clustering, as well as kinetic models. We divide our discussion into self-contained sections, each devoted to a specific method. In each section, we briefly touch upon the mathematical and algorithmic foundations of the method, highlight its strengths and limitations, and describe the specific ways in which it has been used, or can be used, to analyze molecular simulation data.
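As a concrete illustration of unsupervised analysis of simulation-like data, the following minimal sketch projects a set of per-frame feature vectors onto two principal components and then groups the frames with k-means. It is not taken from the Review: the data are synthetic (two Gaussian "states" standing in for metastable conformations), and the names `pca_project`, `kmeans`, and `true_state` are hypothetical helpers introduced only for this example.

```python
import numpy as np

def pca_project(X, n_components=2):
    """Project rows of X onto the leading principal components (via SVD)."""
    Xc = X - X.mean(axis=0)                      # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T              # coordinates in PC space

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # distance of every point to every center, shape (n_points, k)
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):    # converged
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for a trajectory: 400 frames, 10 features per frame
# (assumption: features could be distances or dihedral angles in practice).
rng = np.random.default_rng(0)
state_a = rng.normal(loc=0.0, scale=1.0, size=(200, 10))
state_b = rng.normal(loc=3.0, scale=1.0, size=(200, 10))
frames = np.vstack([state_a, state_b])
true_state = np.repeat([0, 1], 200)              # known only because data are synthetic

proj = pca_project(frames, n_components=2)       # dimensionality reduction
labels, centers = kmeans(proj, k=2)              # clustering in the reduced space

# Cluster labels are arbitrary, so compare up to a label swap.
agreement = max(np.mean(labels == true_state), np.mean(labels != true_state))
print(f"fraction of frames assigned consistently with the true state: {agreement:.2f}")
```

On real trajectories the choice of features, the number of retained components, and the number of clusters all require validation; the values here are chosen only to keep the example small.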