Khavnekar Sagar, Erdmann Philipp S, Wan William
Max Planck Institute of Biochemistry.
Human Technopole.
bioRxiv. 2024 Aug 7:2024.05.02.589639. doi: 10.1101/2024.05.02.589639.
Cryo-electron tomography (cryo-ET) and subtomogram averaging (STA) are becoming the preferred methodologies for investigating subcellular and macromolecular structures in native or near-native environments. While cryo-ET is amenable to a wide range of biological problems, these problems often have data processing requirements that need to be individually optimized, precluding a one-size-fits-all processing pipeline. Cryo-ET data processing is also becoming progressively more complex due to an increasing number of packages for each processing step. Though each package has its own strengths and weaknesses, independent development and differing data formats make them difficult to interface with one another. TOMOMAN (TOMOgram MANager) is an extensible package for streamlining the interoperability of packages, enabling users to develop project-specific processing workflows. It does this by maintaining an internal metadata format and wrapping external packages to manage and perform preprocessing, from raw tilt-series data to reconstructed tomograms. TOMOMAN can also export this metadata for use across various STA packages, and it includes tools for archiving projects to data repositories, allowing subsequent users to download TOMOMAN projects and directly resume processing where it was left off. By tracking essential metadata, TOMOMAN streamlines data sharing, which improves the reproducibility of published results, reduces computational costs by minimizing reprocessing, and enables distributed cryo-ET projects across multiple groups and institutions. TOMOMAN provides a way for users to test different software packages, develop processing workflows that meet the specific needs of their biological questions, and share their results with the broader scientific community.