Identifying protein complexes directly from high-throughput TAP data with Markov random fields.

Authors

Rungsarityotin Wasinee, Krause Roland, Schödl Arno, Schliep Alexander

Affiliation

Max Planck Institute for Molecular Genetics, Department of Computational Molecular Biology, Ihnestr. 73, D-14195 Berlin, Germany.

Publication information

BMC Bioinformatics. 2007 Dec 19;8:482. doi: 10.1186/1471-2105-8-482.

DOI:10.1186/1471-2105-8-482

PMID:18093306

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2222659/

Abstract

BACKGROUND

Predicting protein complexes from experimental data remains a challenge due to the limited resolution and stochastic errors of high-throughput methods. Current algorithms for reconstructing complexes typically rely on a two-step process: they first construct an interaction graph from the data, predominantly using heuristics, and then cluster its vertices to identify protein complexes.
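The two-step process described above can be illustrated with a minimal sketch. It is not taken from any particular published method. It assumes the "spoke" heuristic (each bait is linked to every prey it pulled down) for graph construction, and it uses connected components as a stand-in for a real graph-clustering algorithm. The function names and the toy protein names are hypothetical.

```python
from collections import defaultdict

def spoke_graph(purifications):
    """Step 1 (assumed heuristic): link each bait to every prey it pulled down."""
    adj = defaultdict(set)
    for bait, preys in purifications:
        adj[bait]  # make sure a bait without preys still appears as a vertex
        for prey in preys:
            if prey != bait:
                adj[bait].add(prey)
                adj[prey].add(bait)
    return adj

def connected_components(adj):
    """Step 2 (stand-in clustering): group vertices by connected component."""
    seen, clusters = set(), []
    for start in sorted(adj):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            v = stack.pop()
            if v in seen:
                continue
            seen.add(v)
            component.add(v)
            stack.extend(adj[v] - seen)
        clusters.append(sorted(component))
    return clusters

# Toy purifications with hypothetical protein names: (bait, preys)
toy = [("A", ["B", "C"]), ("B", ["A"]), ("D", ["E"]), ("F", [])]
print(connected_components(spoke_graph(toy)))
# prints [['A', 'B', 'C'], ['D', 'E'], ['F']]
```

Both steps in this sketch treat every observed bait-prey pair as a true interaction, so errors made while building the graph are passed straight to the clustering step.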

RESULTS

We propose a model-based identification of protein complexes directly from the experimental observations. Our model of protein complexes, based on Markov random fields, explicitly incorporates false negative and false positive errors and exhibits high robustness to noise. A model-based quality score for the resulting clusters allows us to identify reliable predictions in the complete data set. Comparisons with prior work on reference data sets show favorable results, particularly for larger unfiltered data sets. Additional information on predictions, including the source code under the GNU Public License, can be found at http://algorithmics.molgen.mpg.de/Static/Supplements/ProteinComplexes.
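To make the idea of explicit false negative and false positive errors concrete, the sketch below scores a candidate assignment of proteins to clusters against pairwise observations. It is a simplified illustration and not the authors' Markov random field. It assumes each protein pair is observed independently, that a pair in the same cluster is missed with probability `fn_rate`, and that a pair in different clusters is observed with probability `fp_rate`. All names and the toy data are hypothetical.

```python
import math
from itertools import combinations

def log_likelihood(assignment, observed, fn_rate, fp_rate):
    """Log-likelihood of pairwise observations given a cluster assignment.

    assignment: dict mapping protein -> cluster label
    observed:   set of frozenset({a, b}) pairs seen together in the data
    """
    total = 0.0
    for a, b in combinations(sorted(assignment), 2):
        seen = frozenset((a, b)) in observed
        if assignment[a] == assignment[b]:
            p = 1.0 - fn_rate if seen else fn_rate   # a missed pair is a false negative
        else:
            p = fp_rate if seen else 1.0 - fp_rate   # a seen pair is a false positive
        total += math.log(p)
    return total

# Toy data: the pair B-C is missing and the pair C-D is spurious.
observed = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("D", "E"), ("C", "D")]}
good = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
merged = {p: 0 for p in good}                        # everything in one cluster
singletons = {p: i for i, p in enumerate(good)}      # every protein alone

for name, assignment in [("good", good), ("merged", merged), ("singletons", singletons)]:
    print(name, round(log_likelihood(assignment, observed, 0.3, 0.05), 3))
```

With the assumed rates of 0.3 and 0.05, the assignment that tolerates one missing and one spurious pair scores higher than either merging everything or splitting everything.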

CONCLUSION

We can identify complexes in the data obtained from high-throughput experiments without prior elimination of proteins or weak interactions. The few parameters of our model, which does not rely on heuristics, can be estimated using maximum likelihood without a reference data set. This is particularly important for protein complex studies in organisms that do not have an established reference frame of known protein complexes.
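As a rough illustration of estimating error parameters by maximum likelihood without a reference set, the sketch below computes closed-form estimates from the data and a fixed cluster assignment. The abstract does not describe the estimation procedure, so this is an assumption-laden simplification: it treats each protein pair as an independent Bernoulli observation, in which case the maximum likelihood estimates are simple frequencies. The function name and toy data are hypothetical.

```python
from itertools import combinations

def estimate_error_rates(assignment, observed):
    """Frequency-based ML estimates of (false negative, false positive) rates.

    assignment: dict mapping protein -> cluster label
    observed:   set of frozenset({a, b}) pairs seen together in the data
    """
    same_total = same_missed = diff_total = diff_seen = 0
    for a, b in combinations(sorted(assignment), 2):
        seen = frozenset((a, b)) in observed
        if assignment[a] == assignment[b]:
            same_total += 1
            same_missed += int(not seen)   # same cluster but not observed
        else:
            diff_total += 1
            diff_seen += int(seen)         # different clusters but observed
    fn_rate = same_missed / same_total if same_total else 0.0
    fp_rate = diff_seen / diff_total if diff_total else 0.0
    return fn_rate, fp_rate

# Toy data: the pair B-C is missing and the pair C-D is spurious.
observed = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("D", "E"), ("C", "D")]}
assignment = {"A": 0, "B": 0, "C": 0, "D": 1, "E": 1}
fn_rate, fp_rate = estimate_error_rates(assignment, observed)
print(fn_rate, fp_rate)
```

In practice the assignment is itself unknown, so a scheme of this kind would typically alternate between updating the assignment and re-estimating the rates. That alternation is an assumption here, not something the abstract states.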



