Leveraging a foundation model zoo for cell similarity search in oncological microscopy across devices.

Author information

Kalweit Gabriel, Klett Anusha, Silvestrini Paula, Rahnfeld Jens, Naouar Mehdi, Vogt Yannick, Infante Diana, Berger Rebecca, Duque-Afonso Jesús, Hartmann Tanja Nicole, Follo Marie, Bodurova-Spassova Elitsa, Lübbert Michael, Mertelsmann Roland, Boedecker Joschka, Ullrich Evelyn, Kalweit Maria

Affiliations

Collaborative Research Institute Intelligent Oncology (CRIION), Freiburg, Germany.

Neurorobotics Lab, Department of Computer Science, University of Freiburg, Freiburg, Germany.

Publication information

Front Oncol. 2025 Jun 18;15:1480384. doi: 10.3389/fonc.2025.1480384. eCollection 2025.

BACKGROUND

Cellular imaging analysis using the traditional retrospective approach is extremely time-consuming and labor-intensive. Although AI-based solutions are available, they rely heavily on supervised learning techniques, which need large, high-quality labeled datasets from the same microscope to be reliable. In addition, primary patient samples are often heterogeneous cell populations and need to be stained to distinguish the cellular subsets. The resulting imaging data are analyzed and labeled manually by experts. A method that distinguishes cell populations across imaging devices without staining or extensive manual labeling would therefore help immensely in gaining real-time insights into cell population dynamics. This holds especially true for recognizing specific cell types and states in response to treatments.

OBJECTIVE

We aim to develop an unsupervised approach that uses general vision foundation models, trained on diverse and extensive imaging datasets, to extract rich visual features for cell analysis across devices, including both stained and unstained live cells. Our method, Entropy-guided Weighted Combinational FAISS (EWC-FAISS), uses these models in inference-only mode, without task-specific retraining on the cellular data. Combining the generated embeddings in an efficient and adaptive k-nearest neighbor search allows for automated, cross-device identification of cell types and states, providing a strong basis for AI-assisted cancer therapy.

METHODS

We used two publicly available datasets. The WBC dataset includes 14,424 images of stained white blood cell samples from patients with acute myeloid and lymphoid leukemia, as well as those without leukemic pathology. The LISC dataset comprises 257 images of white blood cell samples from healthy individuals. We generated four in-house datasets using the JIMT-1 breast cancer cell line, as well as Jurkat and K562 (leukemic cell lines). These datasets were acquired using the Nanolive 3D Cell Explorer-fluo (CX-A) holotomographic microscope and the BioTek Lionheart FX automated brightfield microscope. The images from the in-house datasets were manually annotated using Roboflow software. To generate the embeddings, we used and optimized a concatenated combination of SAM, DINO, ConvNeXT, SWIN, CLIP and ViTMAE. The combined embeddings were used as input for the adaptive k-nearest neighbor search, building an approximate Hierarchical Navigable Small World FAISS index. We compared EWC-FAISS to fully fine-tuned ViT classifiers with DINO and SWIN backbones, a ConvNeXT architecture, and NMTune as a lightweight domain-adaptation method with a frozen backbone.
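The retrieval step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the per-model weights, the toy two-dimensional embeddings, and the helper names `combine_embeddings` and `knn_predict` are all hypothetical, and an exact brute-force search stands in for the approximate Hierarchical Navigable Small World FAISS index. How the entropy-guided weights and the adaptive choice of k are computed is not specified in this abstract, so both are fixed by hand here.

```python
import math
from collections import Counter


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine_embeddings(per_model, weights):
    """Concatenate L2-normalized per-model embeddings, each scaled by its weight.

    Models are visited in sorted name order so every image yields the same layout.
    """
    combined = []
    for name in sorted(weights):
        combined.extend(weights[name] * x for x in l2_normalize(per_model[name]))
    return combined


def knn_predict(query, index, k=3):
    """Majority vote over the k nearest indexed vectors (squared Euclidean distance).

    Exact search over all entries; a stand-in for an approximate HNSW index.
    """
    ranked = sorted(
        (sum((q - v) ** 2 for q, v in zip(query, vec)), label)
        for vec, label in index
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical weights and toy embeddings; real backbones emit hundreds of dimensions.
weights = {"dino": 1.0, "clip": 0.5}
reference = [
    ({"dino": [1.0, 0.0], "clip": [0.9, 0.1]}, "live"),
    ({"dino": [0.9, 0.1], "clip": [1.0, 0.0]}, "live"),
    ({"dino": [0.0, 1.0], "clip": [0.1, 0.9]}, "apoptotic"),
    ({"dino": [0.1, 0.9], "clip": [0.0, 1.0]}, "apoptotic"),
]
index = [(combine_embeddings(emb, weights), label) for emb, label in reference]

query = combine_embeddings({"dino": [0.8, 0.2], "clip": [0.7, 0.3]}, weights)
prediction = knn_predict(query, index, k=3)
print(prediction)
```

Because the backbones stay frozen, adding labeled reference images only means appending entries to the index, with no retraining involved.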

RESULTS

EWC-FAISS performed competitively with the baselines on the original datasets in terms of macro accuracy, the unweighted average of the class-specific accuracies, which treats all classes equally. EWC-FAISS ranked second for the WBC dataset (macro accuracy: 97.6 ± 0.2), first for cell state classification from Nanolive (macro accuracy: 90 ± 0), and performed comparably for cell type classification from Lionheart (macro accuracy: 87 ± 0). For the transfer to out-of-distribution (OOD) datasets, which the models had not seen during training, EWC-FAISS consistently outperformed the baselines. For the LISC dataset, EWC-FAISS achieved a macro accuracy of 78.5 ± 0.3, compared to DINO FT's 17 ± 1, SWIN FT's 44 ± 14, ConvNeXT FT's 45 ± 9, and NMTune's 52 ± 10. For the cell state classification from Lionheart, EWC-FAISS had a macro accuracy of 86 ± 1, while DINO FT, SWIN FT, and ConvNeXT FT achieved 65 ± 11, 68 ± 16, and 81 ± 1, respectively, and NMTune 81 ± 7. For the transfer of cell type classification from Nanolive, EWC-FAISS attained a macro accuracy of 85 ± 0, compared to DINO FT's 24.5 ± 0.9, SWIN FT's 57 ± 6, ConvNeXT FT's 54 ± 4, and NMTune's 63 ± 4. Additionally, building EWC-FAISS after embedding generation was significantly faster than training DINO FT (∼ 6 minutes compared to 10 hours). Lastly, EWC-FAISS performed comparably in distinguishing cancerous cell lines from Peripheral Blood Mononuclear Cells, with a mean accuracy of 80 ± 5 compared to 79.7 for CellMixer.
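To make the macro accuracy metric concrete, the following sketch computes it on toy labels. It assumes that "class-specific accuracy" means the fraction of each class's samples that are predicted correctly; the abstract does not spell this out, and the function name and data are illustrative only.

```python
from collections import defaultdict


def macro_accuracy(y_true, y_pred):
    """Unweighted mean of per-class accuracies (correct / total within each true class)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    per_class = [correct[c] / total[c] for c in total]
    return sum(per_class) / len(per_class)


# Imbalanced toy example: 8 "live" cells, 2 "necrotic" cells.
y_true = ["live"] * 8 + ["necrotic"] * 2
y_pred = ["live"] * 8 + ["live", "necrotic"]

macro = macro_accuracy(y_true, y_pred)
plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(macro, plain)  # 0.75 0.9
```

The rare class pulls the macro figure (0.75) below the plain accuracy (0.9), which is why the metric suits imbalanced cell populations.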

CONCLUSION

We present a novel approach to identify various cell lines and primary cells based on their identity and state, using images acquired across imaging platforms that vary in resolution, magnification and image quality. Despite these differences, we show that our efficient, adaptive k-nearest neighbor search pipeline can be applied to a large image dataset containing different cell types and can effectively differentiate between the cells and their states, such as live, apoptotic or necrotic. The approach has several applications, particularly in distinguishing cell populations in patient samples and in monitoring therapy.
