Bendory Tamir, Bartesaghi Alberto, Singer Amit
Electrical Engineering, Tel Aviv University, Tel Aviv, Israel.
Computer Science, Biochemistry, and Electrical and Computer Engineering, Duke University, Durham, NC, USA.
IEEE Signal Process Mag. 2020 Mar;37(2):58-76. doi: 10.1109/msp.2019.2957822. Epub 2020 Feb 27.
In recent years, an abundance of new molecular structures has been elucidated using cryo-electron microscopy (cryo-EM), largely due to advances in hardware technology and data processing techniques. Owing to these exciting new developments, cryo-EM was selected by Nature Methods as Method of the Year 2015, and the 2017 Nobel Prize in Chemistry was awarded to three pioneers in the field. The main goal of this article is to introduce the challenging and exciting computational tasks involved in reconstructing 3-D molecular structures by cryo-EM. Determining molecular structures requires a wide range of computational tools from a variety of fields, including signal processing, estimation and detection theory, high-dimensional statistics, convex and non-convex optimization, spectral algorithms, dimensionality reduction, and machine learning. The tools from these fields must be adapted to work under exceptionally challenging conditions, including extreme noise levels, missing data, and massive datasets of up to several terabytes. In addition, we present two statistical models, multi-reference alignment and multi-target detection, that abstract away many of the intricacies of cryo-EM while retaining some of its essential features. Based on these abstractions, we discuss some intriguing recent results in the mathematical theory of cryo-EM and delineate relations with group theory, invariant theory, and information theory.
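As a rough illustration of the multi-reference alignment model named in the abstract, the sketch below assumes its commonly used one-dimensional form: each observation is a copy of an unknown signal x, cyclically shifted by an unknown uniformly random amount and corrupted by i.i.d. Gaussian noise of standard deviation sigma. The abstract itself does not spell out this form, so treat it as an assumption. The function names `simulate_mra` and `estimate_invariants` are hypothetical, chosen for this example only. The second function estimates two shift-invariant quantities, the signal mean and the power spectrum, without ever recovering the shifts. This is one simple way to connect the model to invariant theory, not necessarily the approach taken in the article.

```python
import numpy as np


def simulate_mra(x, n_obs, sigma, rng):
    """Draw n_obs noisy, cyclically shifted copies of the 1-D signal x.

    Assumed model: y_i = roll(x, s_i) + sigma * noise_i, with s_i uniform
    on {0, ..., L-1} and noise_i standard Gaussian.
    """
    L = x.shape[0]
    shifts = rng.integers(0, L, size=n_obs)
    # Row i of idx selects x[(n - s_i) mod L], which equals np.roll(x, s_i).
    idx = (np.arange(L)[None, :] - shifts[:, None]) % L
    noise = sigma * rng.standard_normal((n_obs, L))
    return x[idx] + noise, shifts


def estimate_invariants(y, sigma):
    """Estimate the mean and power spectrum of x from the observations y.

    Both quantities are unchanged by cyclic shifts, so the unknown shifts
    never need to be estimated. With NumPy's unnormalized FFT, the noise
    adds L * sigma**2 to each expected power spectrum entry, which is
    subtracted here (sigma is assumed known).
    """
    L = y.shape[1]
    mean_est = y.mean()
    power_est = (np.abs(np.fft.fft(y, axis=1)) ** 2).mean(axis=0) - L * sigma**2
    return mean_est, power_est


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal(16)
    y, _ = simulate_mra(x, n_obs=20000, sigma=1.0, rng=rng)
    mean_est, power_est = estimate_invariants(y, sigma=1.0)
    true_power = np.abs(np.fft.fft(x)) ** 2
    print("mean error:", abs(mean_est - x.mean()))
    print("max power spectrum error:", np.max(np.abs(power_est - true_power)))
```

The mean and power spectrum alone do not determine x, since the Fourier phases are lost. Higher-order invariants such as the bispectrum are typically needed for that, and they are omitted here to keep the sketch short.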