
Phylogenetic correlations can suffice to infer protein partners from sequences.

Affiliations

Sorbonne Université, CNRS, Laboratoire Jean Perrin (UMR 8237), F-75005 Paris, France.

Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire de Biologie Computationnelle et Quantitative (LCQB, UMR 7238), F-75005 Paris, France.

Publication information

PLoS Comput Biol. 2019 Oct 14;15(10):e1007179. doi: 10.1371/journal.pcbi.1007179. eCollection 2019 Oct.

DOI:10.1371/journal.pcbi.1007179
PMID:31609984
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6812855/
Abstract

Determining which proteins interact together is crucial to a systems-level understanding of the cell. Recently, algorithms based on Direct Coupling Analysis (DCA) pairwise maximum-entropy models have made it possible to identify interaction partners among paralogous proteins from sequence data. The success of DCA at predicting protein-protein interactions could stem mainly from its known ability to identify pairs of residues that are in contact in the three-dimensional structure of protein complexes and that coevolve to remain physicochemically complementary. However, interacting proteins also possess similar evolutionary histories. What is the role of purely phylogenetic correlations in the performance of DCA-based methods to infer interaction partners? To address this question, we employ controlled synthetic data that only involve phylogeny and no interactions or contacts. We find that DCA accurately identifies the pairs of synthetic sequences that share evolutionary history. While phylogenetic correlations confound the identification of contacting residues by DCA, they are thus useful to predict interacting partners among paralogs. We find that DCA performs as well as phylogenetic methods to this end, and slightly better than them with large and accurate training sets. Employing DCA or phylogenetic methods within an Iterative Pairing Algorithm (IPA) makes it possible to predict pairs of evolutionary partners without a training set. We further demonstrate the ability of these various methods to correctly predict pairings among real paralogous proteins with genome proximity but no known direct physical interaction, illustrating the importance of phylogenetic correlations in natural data. However, for physically interacting and strongly coevolving proteins, DCA and mutual information outperform phylogenetic methods. We finally discuss how to distinguish physically interacting proteins from proteins that only share a common evolutionary history.
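The idea that shared evolutionary history alone can identify partners among paralogs can be illustrated with a small sketch. The code below is not the authors' method or code. It is a toy, Mirrortree-like scoring scheme under stated assumptions: sequences are aligned strings of equal length, a training set of known partner pairs is available, each species has as many A paralogs as B paralogs, and the species is small enough to try every one-to-one pairing by brute force. All function names (hamming, distance_profile, pearson, pair_within_species) and the toy sequences are hypothetical.

```python
from itertools import permutations
from math import sqrt


def hamming(a, b):
    """Number of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def distance_profile(seq, references):
    """Hamming distances from one sequence to each reference sequence."""
    return [hamming(seq, ref) for ref in references]


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def pair_within_species(seqs_a, seqs_b, train_a, train_b):
    """Return a tuple p where seqs_a[i] is paired with seqs_b[p[i]].

    train_a[k] and train_b[k] are assumed to be known partners. A candidate
    pair scores highly when its two members sit at similar distances from
    the known partners, i.e. when they appear to share evolutionary history.
    """
    prof_a = [distance_profile(s, train_a) for s in seqs_a]
    prof_b = [distance_profile(s, train_b) for s in seqs_b]
    score = [[pearson(pa, pb) for pb in prof_b] for pa in prof_a]
    # Brute-force search over one-to-one pairings (fine for a few paralogs).
    return max(
        permutations(range(len(seqs_b))),
        key=lambda perm: sum(score[i][j] for i, j in enumerate(perm)),
    )


# Toy data: three known partner pairs and one species with two paralogs each.
train_a = ["AAAA", "AATT", "TTTT"]
train_b = ["CCCC", "CCGG", "GGGG"]
species_a = ["AAAT", "TTTA"]
species_b = ["GGGC", "CCCG"]

pairing = pair_within_species(species_a, species_b, train_a, train_b)
print(pairing)  # (1, 0): "AAAT" pairs with "CCCG", "TTTA" with "GGGC"
```

No contacts or couplings enter this score, only similarity to the training sequences. For species with many paralogs, the brute-force search would need to be replaced by a proper assignment solver.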
