Monge Nicolas, Amini Massih Reza, Deschamps Alexis
Xenocs, Grenoble, France.
LIG, University of Grenoble Alpes, CNRS, Grenoble, France.
Acta Crystallogr A Found Adv. 2024 Nov 1;80(Pt 6):405-413. doi: 10.1107/S2053273324007988. Epub 2024 Sep 23.
Small-angle X-ray scattering (SAXS) is a widely used method for nanoparticle characterization. A common approach to analysing nanoparticles in solution by SAXS involves fitting the curve using a parametric model that relates real-space parameters, such as nanoparticle size and electron density, to intensity values in reciprocal space. Selecting the optimal model is a crucial step in terms of analysis quality and can be time-consuming and complex. Several studies have proposed effective methods, based on machine learning, to automate the model selection step. Deploying these methods in software intended for both researchers and industry raises several issues. The diversity of SAXS instrumentation requires assessment of the robustness of these methods on data from various machine configurations, involving significant variations in the q-space ranges and highly variable signal-to-noise ratios (SNR) from one data set to another. In the case of laboratory instrumentation, data acquisition can be time-consuming and there is no universal criterion for defining an optimal acquisition time. This paper presents an approach that revisits the nanoparticle model selection method proposed by Monge et al. [Acta Cryst. (2024), A80, 202-212], evaluating and enhancing its robustness on data from device configurations not seen during training, by expanding the data set used for training. The influence of SNR on predictor robustness is then assessed, improved, and used to propose a stopping criterion for optimizing the trade-off between exposure time and data quality.
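As a rough illustration of the two ideas summarized above (a parametric model mapping real-space parameters to reciprocal-space intensity, and the trade-off between exposure time and SNR), the following minimal sketch may help. It is not the method of the paper: it uses the textbook form factor of a dilute, monodisperse sphere and assumes pure Poisson counting statistics, so that SNR grows as the square root of exposure time. The function names (`sphere_intensity`, `poisson_snr`, `exposure_for_target_snr`) and the fixed target-SNR rule are hypothetical; the stopping criterion proposed by the authors is based on predictor robustness and is not reproduced here.

```python
import numpy as np


def sphere_intensity(q, radius, scale=1.0, background=0.0):
    """Textbook intensity of a dilute, monodisperse sphere (q must be > 0).

    Illustrative parametric model only: `radius` is the real-space
    parameter, `q` the reciprocal-space coordinate.
    """
    qr = np.asarray(q, dtype=float) * radius
    amplitude = 3.0 * (np.sin(qr) - qr * np.cos(qr)) / qr**3
    return scale * amplitude**2 + background


def poisson_snr(count_rate, exposure_time):
    """Per-point SNR assuming pure Poisson counting noise.

    With N = rate * time counts, SNR = N / sqrt(N) = sqrt(rate * time).
    """
    return np.sqrt(np.asarray(count_rate, dtype=float) * exposure_time)


def exposure_for_target_snr(count_rate, target_snr):
    """Shortest exposure at which the weakest q-point reaches target_snr.

    Hypothetical rule for illustration; not the paper's stopping criterion.
    """
    return target_snr**2 / np.min(np.asarray(count_rate, dtype=float))
```

Under these assumptions, doubling the SNR requires four times the exposure, which is why a criterion for when to stop acquiring matters most on laboratory sources with low count rates.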