Data reduction for serial crystallography using a robust peak finder.

Author information

Hadian-Jazi Marjan, Sadri Alireza, Barty Anton, Yefanov Oleksandr, Galchenkova Marina, Oberthuer Dominik, Komadina Dana, Brehm Wolfgang, Kirkwood Henry, Mills Grant, de Wijn Raphael, Letrun Romain, Kloos Marco, Vakili Mohammad, Gelisio Luca, Darmanin Connie, Mancuso Adrian P, Chapman Henry N, Abbey Brian

Affiliations

ARC Centre of Excellence in Advanced Molecular Imaging, La Trobe Institute for Molecular Sciences, La Trobe University, Melbourne, Australia.

Australian Nuclear Science and Technology Organisation (ANSTO), Australia.

Publication information

J Appl Crystallogr. 2021 Sep 13;54(Pt 5):1360-1378. doi: 10.1107/S1600576721007317. eCollection 2021 Oct 1.

Abstract

A peak-finding algorithm for serial crystallography (SX) data analysis based on the principle of 'robust statistics' has been developed. Methods which are statistically robust are generally more insensitive to any departures from model assumptions and are particularly effective when analysing mixtures of probability distributions. For example, these methods enable the discretization of data into a group comprising inliers (i.e. the background noise) and another group comprising outliers (i.e. Bragg peaks). Our robust statistics algorithm has two key advantages, which are demonstrated through testing using multiple SX data sets. First, it is relatively insensitive to the exact value of the input parameters and hence requires minimal optimization. This is critical for the algorithm to be able to run unsupervised, allowing for automated selection or 'vetoing' of SX diffraction data. Secondly, the processing of individual diffraction patterns can be easily parallelized. This means that it can analyse data from multiple detector modules simultaneously, making it ideally suited to real-time data processing. These characteristics mean that the robust peak finder (RPF) algorithm will be particularly beneficial for the new class of MHz X-ray free-electron laser sources, which generate large amounts of data in a short period of time.
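
The inlier/outlier split described in the abstract can be illustrated with a very small sketch. This is not the RPF algorithm itself, whose details the abstract does not give. It swaps in a generic median and median-absolute-deviation (MAD) threshold, a standard robust-statistics technique, to show why a robust background estimate is barely affected by the peaks it is trying to find. The function name `robust_outliers`, the threshold of 6 and the synthetic pixel values are all assumptions made for illustration.

```python
import random
import statistics


def robust_outliers(values, threshold=6.0):
    """Return indices of values far above a robust background estimate.

    Hypothetical helper, NOT the RPF algorithm from the paper:
    background level = median, background width = scaled MAD.
    """
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    sigma = 1.4826 * mad  # converts MAD to a standard deviation for Gaussian noise
    if sigma == 0:
        return []  # flat data: no meaningful noise scale, report nothing
    return [i for i, v in enumerate(values) if (v - med) / sigma > threshold]


# Synthetic data (assumed values): Gaussian background plus three bright "peak" pixels.
rng = random.Random(0)
pixels = [rng.gauss(10.0, 2.0) for _ in range(1000)]
peak_positions = [100, 500, 900]
for p in peak_positions:
    pixels[p] = 200.0

found = robust_outliers(pixels)
print(found)
```

Because the median and MAD depend on the bulk of the data, the three bright pixels hardly shift the background estimate, so the result changes little over a wide range of `threshold` values. That is the kind of parameter insensitivity the abstract attributes to robust methods, shown here only for this simplified stand-in.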

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c268/8493619/a707b0503830/j-54-01360-fig1.jpg
