Quantification of Protein Secondary Structures from Discrete Frequency Infrared Images Using Machine Learning.

Authors

Edmonds Harrison, Mukherjee Sudipta S, Holcombe Brooke, Yeh Kevin, Bhargava Rohit, Ghosh Ayanjeet

Affiliations

Department of Chemistry and Biochemistry, University of Alabama, Tuscaloosa, Alabama 35487, USA.

Beckman Institute for Advanced Science and Technology, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, USA.

Publication information

Appl Spectrosc. 2025 Mar 31:37028251325553. doi: 10.1177/00037028251325553.

DOI:10.1177/00037028251325553

PMID:40165369

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12353105/

Abstract

Discrete frequency infrared (IR) imaging is an experimental technique that has shown promise in a range of biomedical applications. It typically involves acquiring IR absorptive images at specific frequencies of interest that provide pathologically relevant chemical contrast. However, certain applications, such as tracking spatial variations in the protein secondary structure of tissue specimens, which is necessary for characterizing neurodegenerative diseases, require deeper analysis of the spectral data. In such cases, the conventional approach is to band fit the hyperspectral data and extract the relative populations of different structures from the fitted areas under the curve (AUCs). While Gaussian fitting of a single spectrum is viable, extending it to an image with millions of pixels, as is common for tissue specimens, becomes computationally expensive. Alternatives such as principal component analysis (PCA) are less structurally interpretable and are incompatible with sparsely sampled data. Furthermore, band fitting detracts from the key advantages of discrete frequency imaging because it requires more finely sampled spectral data suited to curve fitting, which leads to significantly longer acquisition times, larger datasets, and additional computational overhead. In this work, we demonstrate that a simple two-step regressive neural network model can mitigate these challenges and allow discrete frequency imaging to recover the results of band fitting without significant loss of fidelity. The model first interpolates spectral information at a higher resolution from only seven wavenumbers, which reduces the data acquisition time nearly six-fold. It then uses the upscaled spectra to predict the component AUCs, which is more than 3000 times faster than spectral fitting. Our approach thus drastically cuts data acquisition and analysis time and predicts key differences in protein structure, which can be vital for broadening the potential applications of discrete frequency imaging.
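
The band-fitting readout that the abstract refers to can be illustrated with a minimal sketch. It assumes each fitted component is a Gaussian band, whose area under the curve is amplitude × sigma × sqrt(2π), and it takes the relative population of each structure as that component's share of the total area. The component names, band centers, amplitudes, and widths below are hypothetical values chosen for illustration only. They are not taken from the paper, and this is not the authors' implementation or their neural network model.

```python
import math


def gaussian(x, amplitude, center, sigma):
    """Value of a Gaussian band at wavenumber x (cm^-1)."""
    return amplitude * math.exp(-0.5 * ((x - center) / sigma) ** 2)


def gaussian_auc(amplitude, sigma):
    """Closed-form area under a Gaussian band: A * sigma * sqrt(2 * pi)."""
    return amplitude * sigma * math.sqrt(2.0 * math.pi)


def relative_populations(components):
    """Share of the total fitted area contributed by each component."""
    aucs = {name: gaussian_auc(a, s) for name, (a, _center, s) in components.items()}
    total = sum(aucs.values())
    return {name: auc / total for name, auc in aucs.items()}


def trapezoid_auc(amplitude, center, sigma, lo, hi, step=0.5):
    """Numerical area by the trapezoid rule, used to check the closed form."""
    n = int(round((hi - lo) / step))
    ys = [gaussian(lo + i * step, amplitude, center, sigma) for i in range(n + 1)]
    return sum(0.5 * (ys[i] + ys[i + 1]) * step for i in range(n))


# Hypothetical fitted components: (amplitude, center in cm^-1, sigma in cm^-1).
# These numbers are illustrative assumptions, not results from the paper.
components = {
    "beta_sheet": (0.30, 1630.0, 8.0),
    "alpha_helix": (0.50, 1655.0, 10.0),
    "turns": (0.20, 1680.0, 8.0),
}

populations = relative_populations(components)
for name, fraction in populations.items():
    print(f"{name}: {fraction:.3f}")
```

In a per-pixel workflow, the amplitudes and widths would come from fitting each pixel's spectrum, which is the expensive step that the abstract's model is meant to replace by predicting the component AUCs directly.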
