Markov mean properties for cell death-related protein classification.

Authors

Fernandez-Lozano Carlos, Gestal Marcos, González-Díaz Humberto, Dorado Julián, Pazos Alejandro, Munteanu Cristian R

Affiliation

Information and Communication Technologies Department, Faculty of Computer Science, University of A Coruña, 15071 A Coruña, Spain.

Publication information

J Theor Biol. 2014 May 21;349:12-21. doi: 10.1016/j.jtbi.2014.01.033. Epub 2014 Jan 31.

DOI: 10.1016/j.jtbi.2014.01.033
PMID: 24491256
Abstract

Cell death (CD) is a dynamic biological function involved in physiological and pathological processes. Because CD is complex, there is a demand for fast theoretical methods that can help find new CD molecular targets. This work presents the first classification model to predict CD-related proteins based on Markov Mean Properties. These protein descriptors were calculated with the MInD-Prot tool using the topological information of the amino acid contact networks of 2423 protein chains, five atom physicochemical properties, and the protein 3D regions. Machine Learning algorithms from Weka were used to find the best classification model for CD-related protein chains using all 20 attributes. The most accurate algorithm for this problem was K*. After several feature subset selection methods were applied, the best model found is based on only 11 variables and has an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.992 and a true positive rate (TP Rate) of 88.2% (validation set). A total of 7409 protein chains labeled "unknown function" in the Protein Data Bank (PDB) were analyzed with the best model to predict CD-related biological activity. Several proteins were thus predicted to have CD-related function in Homo sapiens: 3DRX, involved in virus-host interaction biological process, protein homooligomerization; 4DWF, involved in cell differentiation, chromatin modification, DNA damage response, protein stabilization; 1IUR, involved in ATP binding, chaperone binding; 1J7D, involved in DNA double-strand break processing, histone ubiquitination, nucleotide-binding oligomerization; 1UTU, linked with DNA repair, regulation of transcription; 3EEC, participating in cellular membrane organization, egress of virus within host cell, class mediator resulting in cell cycle arrest, negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle, and apoptotic process. Other proteins, from bacteria, predicted as CD-related are 2G3V, a CAG pathogenicity island protein 13 from Helicobacter pylori; 4G5A, a hypothetical protein in Bacteroides thetaiotaomicron; 1YLK, involved in the nitrogen metabolism of Mycobacterium tuberculosis; and 1XSV, with possible DNA/RNA binding domains. The results demonstrate that CD-related proteins can be predicted using molecular information encoded in the protein 3D structure, and therefore that new molecular targets involved in cell-death processes can be predicted.
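The two figures of merit quoted in the abstract, AUROC and TP Rate, can be illustrated with a minimal sketch. This is not the authors' Weka workflow: the labels, scores, the 0.5 decision threshold, and the function names `auroc` and `tp_rate` are all hypothetical, chosen only to show how the two quantities are defined.

```python
def tp_rate(labels, predictions):
    """True positive rate (recall): TP / (TP + FN)."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    return tp / (tp + fn)


def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = CD-related chain, 0 = not CD-related.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

# Assumed decision threshold of 0.5 for turning scores into class labels.
predictions = [1 if s >= 0.5 else 0 for s in scores]

print("AUROC  :", auroc(labels, scores))
print("TP rate:", tp_rate(labels, predictions))
```

AUROC is threshold-independent and summarizes how well the scores rank positives above negatives, whereas the TP Rate depends on the chosen threshold, which is why a model can report a very high AUROC alongside a lower TP Rate.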

Similar articles

1
Markov mean properties for cell death-related protein classification.
J Theor Biol. 2014 May 21;349:12-21. doi: 10.1016/j.jtbi.2014.01.033. Epub 2014 Jan 31.
2
Improving enzyme regulatory protein classification by means of SVM-RFE feature selection.
Mol Biosyst. 2014 May;10(5):1063-71. doi: 10.1039/c3mb70489k.
3
Enzymes/non-enzymes classification model complexity based on composition, sequence, 3D and topological indices.
J Theor Biol. 2008 Sep 21;254(2):476-82. doi: 10.1016/j.jtbi.2008.06.003. Epub 2008 Jun 14.
4
Prediction of small molecule binding property of protein domains with Bayesian classifiers based on Markov chains.
Comput Biol Chem. 2009 Dec;33(6):457-60. doi: 10.1016/j.compbiolchem.2009.09.005. Epub 2009 Oct 9.
5
Identification of transcription factor binding sites with variable-order Bayesian networks.
Bioinformatics. 2005 Jun 1;21(11):2657-66. doi: 10.1093/bioinformatics/bti410. Epub 2005 Mar 29.
6
Classification of signaling proteins based on molecular star graph descriptors using Machine Learning models.
J Theor Biol. 2015 Nov 7;384:50-8. doi: 10.1016/j.jtbi.2015.07.038. Epub 2015 Aug 20.
7
MISS-Prot: web server for self/non-self discrimination of protein residue networks in parasites; theory and experiments in Fasciola peptides and Anisakis allergens.
Mol Biosyst. 2011 Jun;7(6):1938-55. doi: 10.1039/c1mb05069a. Epub 2011 Apr 6.
8
Prediction of ubiquitin proteins using artificial neural networks, hidden markov model and support vector machines.
In Silico Biol. 2007;7(6):559-68.
9
2D MI-DRAGON: a new predictor for protein-ligands interactions and theoretic-experimental studies of US FDA drug-target network, oxoisoaporphine inhibitors for MAO-A and human parasite proteins.
Eur J Med Chem. 2011 Dec;46(12):5838-51. doi: 10.1016/j.ejmech.2011.09.045. Epub 2011 Oct 1.
10
Statistical geometry based prediction of nonsynonymous SNP functional effects using random forest and neuro-fuzzy classifiers.
Proteins. 2008 Jun;71(4):1930-9. doi: 10.1002/prot.21838.

Cited by

1
Predicting Calcein Release from Ultrasound-Targeted Liposomes: A Comparative Analysis of Random Forest and Support Vector Machine.
Technol Cancer Res Treat. 2024 Jan-Dec;23:15330338241296725. doi: 10.1177/15330338241296725.
2
Comparative analysis of weka-based classification algorithms on medical diagnosis datasets.
Technol Health Care. 2023;31(S1):397-408. doi: 10.3233/THC-236034.
3
A review on machine learning approaches and trends in drug discovery.
Comput Struct Biotechnol J. 2021 Aug 12;19:4538-4558. doi: 10.1016/j.csbj.2021.08.011. eCollection 2021.
4
Prediction of high anti-angiogenic activity peptides in silico using a generalized linear model and feature selection.
Sci Rep. 2018 Oct 24;8(1):15688. doi: 10.1038/s41598-018-33911-z.
5
A methodology for the design of experiments in computational intelligence with multiple regression models.
PeerJ. 2016 Dec 1;4:e2721. doi: 10.7717/peerj.2721. eCollection 2016.
6
Texture analysis in gel electrophoresis images using an integrative kernel-based approach.
Sci Rep. 2016 Jan 13;6:19256. doi: 10.1038/srep19256.