Figueroa-Navedo Amanda M, Kapre Rohan, Gupta Tushita, Xu Yingrong, Phaneuf Clifford G, Jean Beltran Pierre M, Xue Liang, Ivanov Alexander R, Vitek Olga
Barnett Institute of Chemical and Biological Analysis, Department of Chemistry and Chemical Biology, Northeastern University, Boston, Massachusetts, USA.
Khoury College of Computer Science, Northeastern University, Boston, Massachusetts, USA.
Mol Cell Proteomics. 2025 May 27;24(8):100999. doi: 10.1016/j.mcpro.2025.100999.
Thermal proteome profiling investigates protein-protein, protein-nucleic acid, or protein-drug interactions, and the impact of metabolite binding and post-translational modifications on these interactions. The experiments quantitatively characterize biological samples treated with small molecules versus controls and subjected to timed exposures to multiple temperatures. Typically, each enzymatically digested sample is labeled with a tandem mass tag (TMT), where each TMT channel corresponds to a specific temperature treatment, and profiled using liquid chromatography coupled with mass spectrometry in data-dependent acquisition mode. The resulting mass spectra are processed with computational tools to identify and quantify proteins and filter out noise. Protein-drug interactions are detected by fitting curves to the protein-level reporter ion abundances across the temperatures. Interacting proteins are identified by shifts in the fitted curves between treated samples and controls. In this article, we focus on data processing and curve fitting in thermal proteome profiling. We review the statistical methods currently used for thermal proteome profiling and demonstrate that such methods can yield substantially different results. We advocate for the statistical analysis strategy implemented in the open-source R package MSstatsTMT, as it requires neither subjective pre-filtering of the data nor curve fitting, and it appropriately represents all the sources of variation. It supports experimental designs that trade off temperatures for a larger number of biological replicates and handles multiple drug concentrations or pools of samples treated with multiple temperatures, thus increasing the sensitivity of the results. We demonstrate these advantages of MSstatsTMT over the currently used alternatives in a series of simulated and experimental datasets, which include conventional thermal proteome profiling and its OnePot counterpart, which pools the samples treated at multiple temperatures into one sample and incorporates multiple doses of a drug. The suggested MSstatsTMT-based workflow is documented in publicly available and fully reproducible R vignettes.
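As a rough illustration of the conventional curve-based detection described in the abstract (not the MSstatsTMT strategy, which avoids curve fitting), the sketch below fits a simple sigmoid to soluble-fraction values across temperatures and compares the fitted melting temperatures of a treated sample and a control. Everything here is an assumption made for illustration: the two-parameter sigmoid, the grid-search fit, the temperature series, the function names `melting_curve` and `fit_tm`, and the noise-free synthetic data. Published tools typically use richer curve models and formal statistical tests.

```python
import math

def melting_curve(temp, tm, slope):
    """Hypothetical two-parameter sigmoid: soluble fraction at temperature `temp`.

    `tm` is the temperature at which half of the protein remains soluble.
    """
    return 1.0 / (1.0 + math.exp((temp - tm) / slope))

def fit_tm(temps, fractions):
    """Least-squares grid search over (tm, slope); returns the best-fitting pair."""
    best = None
    for tm_tenths in range(370, 671):          # tm from 37.0 to 67.0 in 0.1 steps
        tm = tm_tenths / 10.0
        for slope_tenths in range(5, 101):     # slope from 0.5 to 10.0 in 0.1 steps
            slope = slope_tenths / 10.0
            sse = sum((melting_curve(t, tm, slope) - y) ** 2
                      for t, y in zip(temps, fractions))
            if best is None or sse < best[0]:
                best = (sse, tm, slope)
    return best[1], best[2]

# Synthetic, noise-free data for one protein (illustrative values only).
temps = [37, 41, 44, 47, 50, 53, 56, 59, 63, 67]
control = [melting_curve(t, 50.0, 2.0) for t in temps]
treated = [melting_curve(t, 54.0, 2.0) for t in temps]   # stabilized by the drug

control_tm, _ = fit_tm(temps, control)
treated_tm, _ = fit_tm(temps, treated)
delta_tm = treated_tm - control_tm   # shift between treated and control curves

print(f"control Tm = {control_tm}, treated Tm = {treated_tm}, shift = {delta_tm}")
```

A positive shift suggests that the treatment stabilizes the protein against thermal denaturation. With real data, the size of the shift would have to be judged against replicate-to-replicate variation, which is the step where the abstract reports that current methods can disagree.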