Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80303, United States.
Department of Pharmaceutics, Friedrich Alexander University Erlangen-Nürnberg, Erlangen 91058, Germany; Merck KGaA, Darmstadt 64293, Germany.
J Pharm Sci. 2024 May;113(5):1177-1189. doi: 10.1016/j.xphs.2024.03.003. Epub 2024 Mar 12.
Subvisible particles may be encountered throughout the processing of therapeutic protein formulations. Flow imaging microscopy (FIM) and backgrounded membrane imaging (BMI) are techniques commonly used to record digital images of these particles, which may be analyzed to provide particle size distributions, concentrations, and identities. Although both techniques record digital images of particles within a sample, FIM analyzes particles suspended in flowing liquids, whereas BMI records images of dry particles after collection by filtration onto a membrane. This study compared the performance of convolutional neural networks (CNNs) in classifying images of subvisible particles recorded by both imaging techniques. Initially, CNNs trained on BMI images appeared to provide higher classification accuracies than those trained on FIM images. However, attribution analyses showed that classification predictions from CNNs trained on BMI images relied on features contributed by the membrane background, whereas predictions from CNNs trained on FIM images were based largely on features of the particles. Segmenting images to minimize the contributions from image backgrounds reduced the apparent accuracy of CNNs trained on BMI images but caused minimal reduction in the accuracy of CNNs trained on FIM images. Thus, the seemingly superior classification accuracy of CNNs trained on BMI images compared to those trained on FIM images was an artifact caused by subtle features in the backgrounds of BMI images. Our findings emphasize the importance of examining machine learning algorithms for image analysis with attribution methods to ensure the robustness of trained models and to mitigate the potential influence of artifacts within training data sets.
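The two checks described in the abstract, suppressing the image background and asking how much of a model's attribution falls on the background, can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline: the fixed-threshold segmentation, the assumption that particles are darker than the background, and the function names `particle_mask`, `suppress_background`, and `background_attribution_fraction` are all hypothetical choices made for illustration. The attribution map is assumed to come from any per-pixel attribution method applied to a trained CNN.

```python
import numpy as np


def particle_mask(image, threshold):
    """Hypothetical segmentation: pixels darker than `threshold` count as particle.

    Assumes a dark particle on a brighter background; a real workflow would
    likely need a more robust segmentation step.
    """
    return image < threshold


def suppress_background(image, mask, fill_value=None):
    """Replace every non-particle pixel with one constant value.

    After this step the background texture carries no information, so a
    classifier cannot use it as a shortcut.
    """
    background = ~mask
    if fill_value is None:
        # Default to the mean background intensity (assumption for this sketch).
        fill_value = float(image[background].mean()) if background.any() else 0.0
    cleaned = image.astype(float).copy()
    cleaned[background] = fill_value
    return cleaned


def background_attribution_fraction(attribution, mask):
    """Share of the total absolute attribution that lies outside the particle mask."""
    weights = np.abs(attribution)
    total = weights.sum()
    if total == 0:
        return 0.0
    return float(weights[~mask].sum() / total)
```

A fraction close to 1 would suggest that predictions lean mostly on background pixels, the situation the abstract reports for CNNs trained on unsegmented BMI images; a fraction close to 0 would suggest reliance on the particle itself. Retraining on the output of `suppress_background` and comparing accuracies mirrors, in simplified form, the segmentation experiment described above.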