Zhvansky Evgeny, Sorokin Anatoly, Shurkhay Vsevolod, Zavorotnyuk Denis, Bormotov Denis, Pekov Stanislav, Potapov Alexander, Nikolaev Evgeny, Popov Igor
Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region, Russian Federation.
Institute of Cell Biophysics RAS, Pushchino, Russian Federation.
Mass Spectrom (Tokyo). 2021;10(1):A0094. doi: 10.5702/massspectrometry.A0094. Epub 2021 Mar 13.
Recently developed ambient ionization methods allow mass spectrometric datasets for biological and medical applications to be collected at an unprecedented pace. One area that could benefit from such analysis is neurosurgery, where fast identification of dissected tissue could assist the surgical procedure. In this paper, astrocytoma and glioblastoma tumor tissues are compared. Most data representation methods are hard to apply because the number of features is high and the number of samples is limited. Furthermore, the high ratio of features to samples restricts the use of many machine learning methods. The number of features can be reduced through feature selection algorithms or dimensionality reduction methods. Several dimensionality reduction algorithms are considered alongside traditional noise thresholding of the mass spectra. In our analysis, Isomap appears to be the most effective dimensionality reduction algorithm for negative mode data, whereas positive mode data can be processed with simple threshold-based noise reduction. In addition, the negative and positive modes reflect different sample properties: the negative mode captures the inner variability and details of the sample, whereas the positive mode describes the measurement in general.
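The two processing routes named in the abstract, noise thresholding and Isomap, can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the function names `threshold_noise` and `isomap`, the threshold value, the neighbour count, and the toy helix data are all assumptions chosen for illustration, and the abstract does not state the parameters actually used. The Isomap routine follows the textbook recipe (nearest-neighbour graph, shortest-path geodesic distances, classical multidimensional scaling).

```python
import numpy as np


def threshold_noise(spectra, threshold):
    """Zero out intensities below `threshold`, then drop all-zero features.

    `spectra` is a (samples x m/z bins) intensity matrix (assumed layout).
    """
    spectra = np.asarray(spectra, dtype=float)
    cleaned = np.where(spectra >= threshold, spectra, 0.0)
    keep = cleaned.any(axis=0)  # features with at least one surviving peak
    return cleaned[:, keep], keep


def isomap(points, n_neighbors=3, n_components=2):
    """Minimal Isomap: kNN graph -> geodesic distances -> classical MDS."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    # Pairwise Euclidean distances between samples.
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Keep only edges to each sample's nearest neighbours (symmetrised).
    # Column 0 of the sort is the sample itself, assuming no duplicate rows.
    graph = np.full((n, n), np.inf)
    nearest = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = nearest.ravel()
    graph[rows, cols] = dist[rows, cols]
    graph[cols, rows] = dist[cols, rows]
    np.fill_diagonal(graph, 0.0)
    # Floyd-Warshall shortest paths approximate geodesic distances.
    for k in range(n):
        graph = np.minimum(graph, graph[:, k:k + 1] + graph[k:k + 1, :])
    if not np.isfinite(graph).all():
        raise ValueError("neighbour graph is disconnected; increase n_neighbors")
    # Classical MDS on the geodesic distance matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (graph ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))


# Toy usage: a helix arc in 3-D is unrolled to a single coordinate.
t = np.linspace(0.0, 1.0, 20)
helix = np.column_stack([np.cos(3 * t), np.sin(3 * t), t])
embedding = isomap(helix, n_neighbors=3, n_components=1)
```

For real spectra with few samples and many m/z bins, an established implementation such as scikit-learn's `sklearn.manifold.Isomap` would likely be preferable to this hand-rolled version, which is meant only to make the steps visible.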