Computational identification of potential molecular interactions in Arabidopsis.

Authors

Lin Mingzhi, Hu Bin, Chen Lijuan, Sun Peng, Fan Yi, Wu Ping, Chen Xin

Affiliation

Department of Bioinformatics, Zhejiang University, Hangzhou, People's Republic of China, 310058.

Publication information

Plant Physiol. 2009 Sep;151(1):34-46. doi: 10.1104/pp.109.141317. Epub 2009 Jul 10.

DOI: 10.1104/pp.109.141317

PMID: 19592425

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2735983/

Abstract

Knowledge of the protein interaction network is useful to assist molecular mechanism studies. Several major repositories have been established to collect and organize reported protein interactions. Many interactions have been reported in several model organisms, yet a very limited number of plant interactions can thus far be found in these major databases. Computational identification of potential plant interactions, therefore, is desired to facilitate relevant research. In this work, we constructed a support vector machine model to predict potential Arabidopsis (Arabidopsis thaliana) protein interactions based on a variety of indirect evidence. In a 100-iteration bootstrap evaluation, the confidence of our predicted interactions was estimated to be 48.67%, and these interactions were expected to cover 29.02% of the entire interactome. The sensitivity of our model was validated with an independent evaluation data set consisting of newly reported interactions that did not overlap with the examples used in model training and testing. Results showed that our model successfully recognized 28.91% of the new interactions, similar to its expected sensitivity (29.02%). Applying this model to all possible Arabidopsis protein pairs resulted in 224,206 potential interactions, which is the largest and most accurate set of predicted Arabidopsis interactions at present. In order to facilitate the use of our results, we present the Predicted Arabidopsis Interactome Resource, with detailed annotations and more specific per-interaction confidence measurements. This database and related documents are freely accessible at http://www.cls.zju.edu.cn/pair/.
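The two figures quoted in the abstract, confidence and sensitivity, correspond to what is usually called precision (the fraction of predicted interactions that are real) and recall (the fraction of real interactions that are predicted). The following is a minimal sketch of how such figures could be averaged over bootstrap resamples. It is not the authors' procedure: the function names, the toy labels, and the choice to resample a fixed list of predictions (rather than retrain a support vector machine in each iteration) are all assumptions made for illustration.

```python
import random


def precision_recall(labels, predictions):
    """Return (precision, recall) for binary labels; 1 means 'interacting'."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # true positives
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # false positives
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # false negatives
    # Fall back to 0.0 when a ratio is undefined (no predicted or no real positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def bootstrap_precision_recall(labels, predictions, iterations=100, seed=0):
    """Average precision and recall over bootstrap resamples of the examples.

    Hypothetical helper: each iteration draws len(labels) examples with
    replacement and scores the already-made predictions on that resample.
    """
    rng = random.Random(seed)
    n = len(labels)
    precisions, recalls = [], []
    for _ in range(iterations):
        idx = [rng.randrange(n) for _ in range(n)]
        p, r = precision_recall(
            [labels[i] for i in idx],
            [predictions[i] for i in idx],
        )
        precisions.append(p)
        recalls.append(r)
    return sum(precisions) / iterations, sum(recalls) / iterations


if __name__ == "__main__":
    # Toy data only; these are not values from the paper.
    toy_labels = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]
    toy_predictions = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
    mean_precision, mean_recall = bootstrap_precision_recall(toy_labels, toy_predictions)
    print(f"mean precision: {mean_precision:.3f}, mean recall: {mean_recall:.3f}")
```

Precision measured this way depends on the ratio of positive to negative examples in the evaluation set, so an estimate meant to hold for the whole interactome would need that ratio to reflect the expected proportion of interacting pairs. The sketch does not attempt that correction.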
