Real-time data processing for serial crystallography experiments.

Author information

White Thomas, Schoof Tim, Yakubov Sergey, Tolstikova Aleksandra, Middendorf Philipp, Karnevskiy Mikhail, Mariani Valerio, Henkel Alessandra, Klopprogge Bjarne, Hannappel Juergen, Oberthuer Dominik, De Gennaro Aquino Ivan, Egorov Dmitry, Munke Anna, Sprenger Janina, Pompidor Guillaume, Taberman Helena, Gruzinov Andrey, Meyer Jan, Hakanpää Johanna, Gasthuber Martin

Affiliations

Center for Data and Computing in Natural Science CDCS, Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany.

Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany.

Publication information

IUCrJ. 2025 Jan 1;12(Pt 1):97-108. doi: 10.1107/S2052252524011837.

Abstract

We report the use of streaming data interfaces to perform fully online data processing for serial crystallography experiments, without storing intermediate data on disk. The system produces Bragg reflection intensity measurements suitable for scaling and merging, with a latency of less than 1 s per frame. Our system uses the CrystFEL software in combination with the ASAP::O data framework. In a series of user experiments at PETRA III, frames from a 16 megapixel Dectris EIGER2 X detector were searched for peaks, indexed and integrated at the maximum full-frame readout speed of 133 frames per second. The computational resources required depend on various factors, most significantly the fraction of non-blank frames ('hits'). The average single-thread processing time per frame was 242 ms for blank frames and 455 ms for hits, meaning that a single 96-core computing node was sufficient to keep up with the data, with ample headroom for unexpected throughput reductions. Further significant improvements are expected, for example by binning pixel intensities together to reduce the pixel count. We discuss the implications of real-time data processing on the 'data deluge' problem from recent and future photon-science experiments, in particular on calibration requirements, computing access patterns and the need for the preservation of raw data.
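The claim that one 96-core node can keep up follows from simple arithmetic on the figures quoted in the abstract. The sketch below is an illustration only, not the authors' method: it assumes ideal parallel scaling (one frame per thread, no scheduling or I/O overhead), and the function names and the `hit_fraction` parameter are hypothetical.

```python
import math

# Figures quoted in the abstract.
FRAME_RATE_HZ = 133.0   # maximum full-frame readout rate
T_BLANK_S = 0.242       # mean single-thread time for a blank frame
T_HIT_S = 0.455         # mean single-thread time for a hit
NODE_CORES = 96         # cores in the single computing node


def mean_time_per_frame(hit_fraction: float) -> float:
    """Mean single-thread processing time, weighted by the hit fraction."""
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must lie in [0, 1]")
    return (1.0 - hit_fraction) * T_BLANK_S + hit_fraction * T_HIT_S


def cores_required(hit_fraction: float,
                   frame_rate_hz: float = FRAME_RATE_HZ) -> int:
    """Smallest number of busy cores that sustains the frame rate.

    Assumption: perfect scaling, so the demand is simply
    frame rate x mean time per frame, rounded up.
    """
    return math.ceil(frame_rate_hz * mean_time_per_frame(hit_fraction))


if __name__ == "__main__":
    for h in (0.0, 0.1, 0.5, 1.0):
        n = cores_required(h)
        print(f"hit fraction {h:.1f}: {n} cores "
              f"({NODE_CORES - n} of {NODE_CORES} spare)")
```

Under these assumptions the demand ranges from 33 cores (all frames blank) to 61 cores (all frames hits), so even the worst case leaves roughly a third of the node idle, consistent with the "ample headroom" described above.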

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6fde/11707691/bd12fce15835/m-12-00097-fig1.jpg
