Automatic identification of crossovers in cryo-EM images of murine amyloid protein A fibrils with machine learning.

Affiliations

Institute of Stochastics, Ulm University, Ulm, Germany.

Visual Computing Group, Institute of Media Informatics, Ulm University, Ulm, Germany.

Publication information

J Microsc. 2020 Jan;277(1):12-22. doi: 10.1111/jmi.12858. Epub 2019 Dec 29.

Abstract

Detecting crossovers in cryo-electron microscopy images of protein fibrils is an important step towards determining the morphological composition of a sample. Currently, crossover locations are picked by hand, which is time-consuming and introduces errors. With the rise of deep learning in computer vision, automating such tasks has become increasingly feasible. However, because the raw data are of insufficient quality and labels are missing, neural networks alone cannot be applied successfully to this problem. We therefore propose an approach that combines conventional computer vision techniques with deep learning to automatically detect fibril crossovers in two-dimensional cryo-electron microscopy image data, and we apply it to murine amyloid protein A fibrils. Direct image processing methods are first used to simplify the image data so that a convolutional neural network can be applied to the remaining segmentation problem.

Lay description

The ability of proteins to form fibrillar structures underlies important cellular functions but can also give rise to disease, as in the group of disorders termed amyloid diseases. These diseases are characterised by the formation of abnormal protein filaments, so-called amyloid fibrils, that deposit inside the tissue. Many amyloid fibrils are helically twisted, which leads to periodic variations in the apparent width of the fibril when it is observed with microscopy techniques such as cryogenic electron microscopy (cryo-EM). Because of the two-dimensional projection, parts of the fibril orthogonal to the projection plane appear narrower than parts parallel to the plane. These narrow parts are called crossovers. The distance between two adjacent crossovers is an important characteristic in the analysis of amyloid fibrils, because it is informative about the fibril morphology and because it can be determined from raw data by eye.

A given protein can typically form different fibril morphologies. The morphology can vary with the chemical and physical conditions of fibril formation, but even when fibrils are formed under identical solution conditions, different morphologies may be present in a sample. Because crossovers make it possible to distinguish fibril morphologies in a heterogeneous sample, detecting them is an important first step in sample analysis. In the present paper, we introduce a method for the automated detection of fibril crossovers in cryo-EM image data. The data consist of greyscale images, each showing an unknown number of potentially overlapping fibrils. In a first step, techniques from image analysis and pattern detection are employed to detect single fibrils in the raw data. Then, a convolutional neural network is used to find the locations of crossovers on each single fibril. As these predictions may contain errors, further postprocessing steps assess their quality and may slightly alter or reject the predicted crossovers.
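To make the notion of a crossover concrete, the following is a minimal sketch of the geometric idea only: crossovers treated as local minima of the apparent fibril width, with the crossover distance taken as the spacing between adjacent minima. It is not the method of the paper, which uses image processing followed by a convolutional neural network. The width profile is synthetic, and all function names, the width range, and the pixel size are assumptions made for illustration.

```python
import math


def synthetic_width_profile(n_points, period, w_min=6.0, w_max=12.0):
    """Hypothetical apparent-width profile of a twisted fibril.

    The width oscillates between w_min and w_max (arbitrary units).
    Minima occur every `period` samples and stand in for crossovers.
    """
    amplitude = w_max - w_min
    return [
        w_min + amplitude * abs(math.cos(math.pi * i / period))
        for i in range(n_points)
    ]


def find_crossovers(widths):
    """Return indices of strict local minima of the width profile."""
    return [
        i
        for i in range(1, len(widths) - 1)
        if widths[i] < widths[i - 1] and widths[i] < widths[i + 1]
    ]


def crossover_distances(crossovers, pixel_size=1.0):
    """Distances between adjacent crossovers, scaled by an assumed pixel size."""
    return [
        (b - a) * pixel_size
        for a, b in zip(crossovers, crossovers[1:])
    ]


if __name__ == "__main__":
    widths = synthetic_width_profile(n_points=200, period=40)
    crossovers = find_crossovers(widths)
    print(crossovers)                       # sample indices of the width minima
    print(crossover_distances(crossovers))  # spacing between adjacent crossovers
```

On real micrographs a plain minimum search like this would likely be unreliable, since widths measured from noisy, overlapping fibrils fluctuate strongly. That is consistent with the motivation given above for simplifying the images first and then applying a learned model with postprocessing.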
