Ille Alexander M, Markosian Christopher, Burley Stephen K, Pasqualini Renata, Arap Wadih
Rutgers Cancer Institute, Newark, NJ, USA.
Division of Cancer Biology, Department of Radiation Oncology, Rutgers New Jersey Medical School, Newark, NJ, USA.
bioRxiv. 2025 Jul 3:2025.07.03.663068. doi: 10.1101/2025.07.03.663068.
In humans, protein-protein interactions mediate numerous biological processes and are central to both normal physiology and disease. Extensive research efforts have aimed to elucidate the human protein interactome, and comprehensive databases now catalog these interactions at scale. However, structural coverage of the human protein interactome is limited and remains challenging to resolve through experimental methodology alone. Recent advances in artificial intelligence/machine learning (AI/ML)-based approaches for protein interaction structure prediction present opportunities for large-scale structural characterization of the human interactome. One such model, Boltz-2, which is capable of predicting the structures of protein complexes, may serve this objective. Here, we present computed models of 1,394 binary human protein interaction structures predicted using Boltz-2. These structural predictions were based on biochemically determined interaction data sourced from the IntAct database, together with corresponding protein sequences obtained from UniProt. We then assessed the predicted interaction structures through various confidence metrics, which consider both overall structure and the interaction interface. These analyses indicated that prediction confidence tended to be greater for smaller complexes, while increased multiple sequence alignment (MSA) depth tended to improve prediction confidence. Additionally, we performed a limited evaluation against experimentally determined human protein complex structures not included in the Boltz-2 training regimen, which indicated prediction accuracy consistent with AlphaFold3. This work demonstrates the utility of Boltz-2 for structural modeling of the human protein interactome, while highlighting both strengths and limitations. Ultimately, such modeling is expected to yield broad structural insights with relevance across multiple domains of biomedical research.
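As a rough illustration of the input-preparation step described in the abstract (binary interactions paired with their protein sequences before structure prediction), the following is a minimal sketch and not the authors' pipeline. The function names, the accessions, and the toy sequences are hypothetical, and the YAML layout (`version`, `sequences`, `protein`, `id`, `sequence`) is an assumption about the Boltz input schema that should be checked against the Boltz documentation before use.

```python
from typing import Dict, Iterable, List, Tuple


def unique_binary_pairs(pairs: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Deduplicate unordered pairs so (A, B) and (B, A) are counted once."""
    seen = set()
    ordered = []
    for a, b in pairs:
        key = tuple(sorted((a, b)))
        if key not in seen:
            seen.add(key)
            ordered.append(key)
    return ordered


def complex_yaml(seq_a: str, seq_b: str) -> str:
    """Render a two-chain input as YAML text (assumed Boltz-style schema)."""
    lines = ["version: 1", "sequences:"]
    for chain_id, seq in (("A", seq_a), ("B", seq_b)):
        lines += [
            "  - protein:",
            f"      id: {chain_id}",
            f"      sequence: {seq}",
        ]
    return "\n".join(lines) + "\n"


def build_inputs(
    pairs: Iterable[Tuple[str, str]], sequences: Dict[str, str]
) -> Dict[str, str]:
    """Map 'ACC1_ACC2' to YAML text, skipping pairs with a missing sequence."""
    inputs = {}
    for a, b in unique_binary_pairs(pairs):
        if a in sequences and b in sequences:
            inputs[f"{a}_{b}"] = complex_yaml(sequences[a], sequences[b])
    return inputs


if __name__ == "__main__":
    # Hypothetical accessions and toy sequences, for illustration only.
    demo_pairs = [("P11111", "P22222"), ("P22222", "P11111")]
    demo_seqs = {"P11111": "MKT", "P22222": "GSH"}
    for name, text in build_inputs(demo_pairs, demo_seqs).items():
        print(name)
        print(text)
```

Sorting each pair before deduplication treats an interaction as unordered, which matches the notion of a binary interaction; pairs lacking a sequence are skipped rather than guessed.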