Graduate School of Chemical Sciences and Engineering, Materials Chemistry and Engineering Course, Hokkaido University, Kita 13, Nishi 8, Kita-ku, Sapporo, 060-8628, Hokkaido, Japan.
Department of Mathematics, Bangladesh University of Engineering and Technology, Dhaka-1000, Bangladesh.
Analyst. 2023 Jul 26;148(15):3574-3583. doi: 10.1039/d3an00516j.
A line illumination Raman microscope extracts the underlying spatial and spectral information of a sample, typically a few hundred times faster than raster scanning. This makes it possible to measure a wide range of biological samples such as cells and tissues, which tolerate only modest illumination intensity if damage is to be avoided, within a feasible time frame. However, the non-uniform intensity distribution of laser line illumination may introduce artifacts into the data and lower the accuracy of machine learning models trained to predict sample class membership. Here, using cancerous and normal human thyroid follicular epithelial cell lines (FTC-133 and Nthy-ori 3-1, respectively), whose Raman spectra differ only slightly, we show that the standard pre-processing for spectral analysis widely used with raster scanning microscopes introduces artifacts. To address this issue, we propose a detrending scheme based on random forest regression, a nonparametric, model-free machine learning algorithm, combined with a position-dependent wavenumber calibration scheme along the illumination line. The detrending scheme minimizes the artifactual biases arising from non-uniform laser sources and significantly enhances the differentiability of the sample states, i.e., cancerous or normal epithelial cells, compared with the standard pre-processing scheme.
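As a rough illustration of the detrending idea, the following Python sketch fits a random forest regression of total spectral intensity against position along the illumination line and divides each spectrum by the fitted trend. It is not the authors' implementation. The array layout, the function names (`estimate_line_trend`, `detrend`, `make_synthetic_lines`), the use of summed intensity as the regression target, the multiplicative form of the trend, and all parameter values are assumptions made for this example. The position-dependent wavenumber calibration is not covered here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def estimate_line_trend(spectra, min_samples_leaf=20, seed=0):
    """Estimate a relative illumination profile along the line (assumed approach).

    spectra: array of shape (n_lines, n_positions, n_wavenumbers).
    Returns an array of shape (n_positions,) rescaled to have mean 1.
    """
    n_lines, n_positions, _ = spectra.shape
    # Total intensity of every spectrum; an assumed proxy for local laser power.
    total = spectra.sum(axis=2)
    # One regression sample per spectrum: feature = position along the line.
    positions = np.tile(np.arange(n_positions), n_lines).reshape(-1, 1)
    model = RandomForestRegressor(
        n_estimators=100, min_samples_leaf=min_samples_leaf, random_state=seed
    )
    model.fit(positions, total.ravel())
    trend = model.predict(np.arange(n_positions).reshape(-1, 1))
    return trend / trend.mean()


def detrend(spectra, trend):
    """Divide out the position-dependent trend (assumes a multiplicative effect)."""
    return spectra / trend[None, :, None]


def make_synthetic_lines(n_lines=30, n_positions=100, n_wavenumbers=50, seed=0):
    """Hypothetical test data: a Gaussian-like laser profile times random samples."""
    rng = np.random.default_rng(seed)
    pos = np.arange(n_positions)
    # Illumination is brightest at the line centre and weaker at both ends.
    profile = 0.5 + 0.5 * np.exp(-(((pos - n_positions / 2) / 30.0) ** 2))
    # Arbitrary smooth spectral shape shared by all pixels.
    spectrum = 1.0 + np.sin(np.linspace(0.0, 3.0, n_wavenumbers)) ** 2
    # Pixel-to-pixel variation standing in for real sample differences.
    sample = rng.uniform(0.8, 1.2, size=(n_lines, n_positions, 1))
    raw = sample * spectrum[None, None, :] * profile[None, :, None]
    return raw, profile


raw, profile = make_synthetic_lines()
trend = estimate_line_trend(raw)
flat = detrend(raw, trend)
```

In this synthetic setting, the mean intensity per position in `flat` should vary much less along the line than in `raw`. With real data, the choice of regression target and the leaf size would need to be checked so that genuine sample structure is not removed along with the illumination trend.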