College of Computer, Hubei University of Education, Wuhan, China.
Rapid Commun Mass Spectrom. 2024 Apr 30;38(8):e9717. doi: 10.1002/rcm.9717.
Mass spectrometry imaging (MSI) is widely used in biomedical research. Each pixel in an MSI dataset holds a mass spectrum that reflects the molecular features of the corresponding tissue spot. Because MSI datasets are high-dimensional, computational methods for data mining and for constructing tissue segmentation maps are highly desirable.
To visualize different tissue regions based on mass spectral features and to process large volumes of data more efficiently, we propose a computational strategy consisting of four procedures: preprocessing, data reduction, clustering, and quantitative validation.
In this study, we examined the combination of t-distributed stochastic neighbor embedding (t-SNE) and hierarchical clustering (HC) for MSI data analysis. Using publicly available MSI datasets (one of mouse urinary bladder and one of human colorectal cancer), we demonstrated that the tissue segmentation maps generated by this combination were superior to those produced by other data reduction and clustering algorithms. Using the stained image as a reference, we assessed the performance of the clustering algorithms with external and internal clustering validation measures, including purity, adjusted Rand index (ARI), Davies-Bouldin index (DBI), and spatial aggregation index (SAI). The results indicated that SAI delivered excellent performance for automatic segmentation of tissue regions in MSI.
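The combination of t-SNE and HC described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn, total ion current (TIC) normalization as the preprocessing step, a two-dimensional embedding, and Ward linkage, none of which are specified in the abstract. The function name `segment_msi` and its parameters are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.manifold import TSNE


def segment_msi(spectra, n_regions, random_state=0):
    """Cluster MSI pixels: TIC normalization -> t-SNE -> hierarchical clustering.

    spectra: (n_pixels, n_mz_bins) array of non-negative intensities.
    Returns one integer cluster label per pixel.
    """
    spectra = np.asarray(spectra, dtype=float)

    # Preprocessing (assumed): scale each pixel by its total ion current.
    tic = spectra.sum(axis=1, keepdims=True)
    normalized = spectra / np.where(tic > 0, tic, 1.0)

    # Data reduction: embed every spectrum into two dimensions with t-SNE.
    # Perplexity must stay below the number of pixels, so cap it for small inputs.
    perplexity = min(30.0, (len(normalized) - 1) / 3.0)
    embedding = TSNE(
        n_components=2,
        perplexity=perplexity,
        init="pca",
        random_state=random_state,
    ).fit_transform(normalized)

    # Clustering: agglomerative (hierarchical) clustering, Ward linkage assumed.
    model = AgglomerativeClustering(n_clusters=n_regions, linkage="ward")
    return model.fit_predict(embedding)
```

For an image of `height` by `width` pixels stored row by row, `segment_msi(spectra, n_regions).reshape(height, width)` would give a segmentation map that can be displayed next to the stained image.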
We used a clustering algorithm to perform automatic tissue segmentation on MSI datasets. Performance was evaluated by comparing the segmentation with the stained image and by calculating clustering validation indexes. The results indicated that SAI, unlike traditional clustering validation measures, is important for automatic tissue segmentation in MSI. Compared with reports that relied on internal clustering validation measures such as DBI, our method offers a more effective evaluation of clustering results for MSI segmentation. We envision that the proposed automatic image segmentation strategy can facilitate deep learning in molecular feature extraction and biomarker discovery for biomedical applications of MSI.
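As a rough illustration of the validation step, the sketch below computes three of the named measures with scikit-learn. Purity and ARI are external measures that need reference labels (for example, regions annotated on the stained image), while DBI is internal and needs only the features and the cluster labels. SAI is left out because the abstract does not define it. The helper names `purity_score` and `validate_clustering` are hypothetical.

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score, davies_bouldin_score
from sklearn.metrics.cluster import contingency_matrix


def purity_score(reference, predicted):
    """Fraction of pixels that belong to the majority reference class of their cluster."""
    # Rows are reference classes, columns are predicted clusters.
    table = contingency_matrix(reference, predicted)
    return table.max(axis=0).sum() / table.sum()


def validate_clustering(features, reference, predicted):
    """Return purity and ARI (external) and DBI (internal) for one clustering."""
    return {
        "purity": purity_score(reference, predicted),
        "ARI": adjusted_rand_score(reference, predicted),
        # Lower DBI means more compact, better separated clusters.
        "DBI": davies_bouldin_score(features, predicted),
    }
```

Higher purity and ARI indicate closer agreement with the reference, whereas DBI is read in the opposite direction, so the three values should not be averaged or ranked on a common scale without care.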