
Data reduction in protein serial crystallography.

Affiliations

Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607 Hamburg, Germany.

Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607 Hamburg, Germany.

Publication information

IUCrJ. 2024 Mar 1;11(Pt 2):190-201. doi: 10.1107/S205225252400054X.

DOI: 10.1107/S205225252400054X
PMID: 38327201
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10916297/
Abstract

Serial crystallography (SX) has become an established technique for protein structure determination, especially when dealing with small or radiation-sensitive crystals and investigating fast or irreversible protein dynamics. The advent of newly developed multi-megapixel X-ray area detectors, capable of capturing over 1000 images per second, has brought about substantial benefits. However, this advancement also entails a notable increase in the volume of collected data. Today, up to 2 PB of data per experiment could be easily obtained under efficient operating conditions. The combined costs associated with storing data from multiple experiments provide a compelling incentive to develop strategies that effectively reduce the amount of data stored on disk while maintaining the quality of scientific outcomes. Lossless data-compression methods are designed to preserve the information content of the data but often struggle to achieve a high compression ratio when applied to experimental data that contain noise. Conversely, lossy compression methods offer the potential to greatly reduce the data volume. Nonetheless, it is vital to thoroughly assess the impact on data quality and scientific outcomes when employing lossy compression, as it inherently involves discarding information. The evaluation of lossy compression effects on data requires proper data quality metrics. In our research, we assess various approaches for both lossless and lossy compression techniques applied to SX data, and equally importantly, we describe metrics suitable for evaluating SX data quality.
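
The trade-off described in the abstract can be illustrated with a small sketch. This is not the authors' method: the synthetic frame, the choice of zlib as the lossless codec, and the quantization step of 8 are all assumptions made for illustration only. It shows that noise limits what a lossless codec can achieve, while discarding low-order information (here by rounding) shrinks the output at the cost of a bounded error per pixel.

```python
import random
import struct
import zlib


def synthetic_frame(n_pixels, seed=0):
    """Hypothetical detector frame: noisy background plus a few bright peaks."""
    rng = random.Random(seed)
    frame = [max(0, int(rng.gauss(20.0, 4.0))) for _ in range(n_pixels)]
    for _ in range(n_pixels // 500):
        i = rng.randrange(n_pixels)
        frame[i] = min(65535, frame[i] + rng.randint(500, 5000))
    return frame


def compressed_size(values):
    """Size in bytes after packing as unsigned 16-bit integers and applying zlib."""
    raw = struct.pack(f"<{len(values)}H", *values)
    return len(zlib.compress(raw, 9))


def quantize(values, step):
    """Lossy step (assumed for illustration): round to the nearest multiple of `step`."""
    return [min(65535, step * ((v + step // 2) // step)) for v in values]


frame = synthetic_frame(20000)
raw_bytes = 2 * len(frame)               # 16 bits per pixel, uncompressed
lossless = compressed_size(frame)        # original values fully recoverable
lossy_frame = quantize(frame, step=8)
lossy = compressed_size(lossy_frame)     # smaller, but information was discarded
max_err = max(abs(a - b) for a, b in zip(frame, lossy_frame))

print(f"raw: {raw_bytes} B")
print(f"lossless: {lossless} B (ratio {raw_bytes / lossless:.1f})")
print(f"lossy: {lossy} B (ratio {raw_bytes / lossy:.1f}), max abs error {max_err}")
```

A per-pixel error bound such as `max_err` says little about scientific outcomes on its own, which is why the abstract stresses the need for data quality metrics specific to SX.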


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c44b/10916297/fc2d54ba7d42/m-11-00190-fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c44b/10916297/a5e442fd2a05/m-11-00190-fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c44b/10916297/cb46c8224b61/m-11-00190-fig2.jpg
