Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany.
Neuromation OU, Tallinn, Estonia.
Bioinformatics. 2020 May 1;36(10):3215-3224. doi: 10.1093/bioinformatics/btaa085.
Imaging mass spectrometry (imaging MS) is a prominent technique for capturing distributions of molecules in tissue sections. Various computational methods for imaging MS rely on quantifying spatial correlations between ion images, referred to as co-localization. However, no comprehensive evaluation of co-localization measures has ever been performed; this leads to arbitrary choices and hinders method development.
We present ColocML, a machine learning approach addressing this gap. With the help of 42 imaging MS experts from nine laboratories, we created a gold standard of 2210 pairs of ion images ranked by their co-localization. We evaluated existing co-localization measures and developed novel measures using term frequency-inverse document frequency and deep neural networks. The semi-supervised deep learning Pi model and the cosine score applied after median thresholding performed best (Spearman correlations of 0.797 and 0.794 with expert rankings, respectively). We illustrate these measures by inferring co-localization properties of 10 273 molecules from 3685 public METASPACE datasets.
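As an illustration of the simpler of the two best-performing measures, the cosine score after median thresholding could be sketched as follows. This is a minimal sketch and not the ColocML implementation: the function names `median_threshold` and `cosine_coloc` are hypothetical, and the thresholding rule (zeroing every pixel below the image's own median intensity) is an assumption, since the abstract does not specify the exact preprocessing.

```python
import numpy as np


def median_threshold(img):
    """Zero out pixels below the image's median intensity.

    Assumption: the median is taken over all pixels of the image;
    the abstract does not state the exact rule.
    """
    img = np.asarray(img, dtype=float)
    out = img.copy()
    out[out < np.median(img)] = 0.0
    return out


def cosine_coloc(img_a, img_b):
    """Cosine similarity between two median-thresholded ion images.

    Both images must have the same shape. Returns 0.0 if either
    image is all zeros after thresholding.
    """
    a = median_threshold(img_a).ravel()
    b = median_threshold(img_b).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0:
        return 0.0
    return float(np.dot(a, b) / denom)


# Toy ion images (hypothetical intensities on a 2x2 pixel grid).
same = [[1, 2], [3, 4]]
left = [[5, 5], [0, 0]]
right = [[0, 0], [5, 5]]

score_same = cosine_coloc(same, same)       # identical spatial pattern
score_disjoint = cosine_coloc(left, right)  # no overlapping signal
```

For non-negative intensities the score lies between 0 and 1, with higher values indicating stronger spatial overlap. Scores computed this way for many ion-image pairs could then be rank-correlated with expert rankings, as in the evaluation described above.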
Availability and implementation: https://github.com/metaspace2020/coloc.
Supplementary data are available at Bioinformatics online.