National Center for Biological Sciences, Tata Institute of Fundamental Research, Bangalore 560065, India.
Bioinformatics. 2024 Mar 4;40(3). doi: 10.1093/bioinformatics/btae106.
Integrative structural modeling combines data from experiments, physical principles, statistics of previous structures, and prior models to obtain structures of macromolecular assemblies that are challenging to characterize experimentally. The choice of model representation is a key decision in integrative modeling, as it dictates the accuracy of scoring, the efficiency of sampling, and the resolution of analysis. Currently, however, this choice is usually made manually and ad hoc.
Here, we report NestOR (Nested Sampling for Optimizing Representation), a fully automated, statistically rigorous method based on Bayesian model selection that identifies the optimal coarse-grained representation for a given integrative modeling setup. It selects the optimal representations from a set of candidate representations based on their model evidence and sampling efficiency. We evaluated the performance of NestOR on a benchmark of four macromolecular assemblies.
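To illustrate the idea of ranking candidates by model evidence estimated with nested sampling, the following is a minimal toy sketch, not the NestOR implementation or its API. It assumes a single parameter with a uniform prior on [0, 1] and two hypothetical candidates (`candidate_A`, `candidate_B`) with unnormalized Gaussian likelihoods. All function names, parameter values, and the stopping rule are illustrative assumptions, and the sketch ignores sampling efficiency, which NestOR also takes into account.

```python
import math
import random


def nested_sampling_log_evidence(log_likelihood, n_live=200, tol=1e-2, seed=0):
    """Toy nested sampling: estimate log Z, where Z is the integral of
    L(u) over a uniform prior on [0, 1]."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    live_logl = [log_likelihood(u) for u in live]
    z = 0.0        # evidence accumulated so far
    x_prev = 1.0   # prior volume enclosed by the previous likelihood contour
    i = 0
    while True:
        i += 1
        # The live point with the lowest likelihood defines the next contour.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]
        x = math.exp(-i / n_live)  # expected prior volume after i shrinkages
        z += math.exp(logl_min) * (x_prev - x)
        x_prev = x
        # Replace the worst point with a prior draw of higher likelihood.
        # Plain rejection sampling is only viable for a toy problem like this.
        while True:
            u = rng.random()
            logl = log_likelihood(u)
            if logl > logl_min:
                break
        live[worst], live_logl[worst] = u, logl
        # Stop when the live points can add only a small fraction to Z.
        if math.exp(max(live_logl)) * x < tol * z:
            break
    # Add the contribution of the remaining live points.
    z += x * sum(math.exp(l) for l in live_logl) / n_live
    return math.log(z)


def gaussian_log_likelihood(mu, sigma):
    """Hypothetical unnormalized Gaussian log-likelihood with peak value 0."""
    def log_l(u):
        return -0.5 * ((u - mu) / sigma) ** 2
    return log_l


# Two hypothetical candidates; the values are illustrative only.
candidates = {
    "candidate_A": gaussian_log_likelihood(0.3, 0.05),
    "candidate_B": gaussian_log_likelihood(0.3, 0.02),
}
log_evidence = {
    name: nested_sampling_log_evidence(f) for name, f in candidates.items()
}
best = max(log_evidence, key=log_evidence.get)
print(log_evidence, "->", best)
```

In this toy setting the candidate whose likelihood is compatible with a larger share of the prior volume receives the higher evidence. A real integrative modeling setup would replace the rejection step with constrained sampling over the actual scoring function.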
NestOR is implemented in the Integrative Modeling Platform (https://integrativemodeling.org) and is available at https://github.com/isblab/nestor. Benchmark data are available at https://www.doi.org/10.5281/zenodo.10360718.