Re-fraction: a machine learning approach for deterministic identification of protein homologues and splice variants in large-scale MS-based proteomics.

Affiliation

School of Information Technologies, University of Sydney, NSW 2006, Australia.

Publication information

J Proteome Res. 2012 May 4;11(5):3035-45. doi: 10.1021/pr300072j. Epub 2012 Mar 30.

DOI:10.1021/pr300072j
PMID:22428558
Abstract

A key step in the analysis of mass spectrometry (MS)-based proteomics data is the inference of proteins from identified peptide sequences. Here we describe Re-Fraction, a novel machine learning algorithm that enhances deterministic protein identification. Re-Fraction utilizes several protein physical properties to assign proteins to expected protein fractions that comprise large-scale MS-based proteomics data. This information is then used to appropriately assign peptides to specific proteins. This approach is sensitive, highly specific, and computationally efficient. We provide algorithms and source code for the current version of Re-Fraction, which accepts output tables from the MaxQuant environment. Nevertheless, the principles behind Re-Fraction can be applied to other protein identification pipelines where data are generated from samples fractionated at the protein level. We demonstrate the utility of this approach through reanalysis of data from a previously published study and generate lists of proteins deterministically identified by Re-Fraction that were previously only identified as members of a protein group. We find that this approach is particularly useful in resolving protein groups composed of splice variants and homologues, which are frequently expressed in a cell- or tissue-specific manner and may have important biological consequences.
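The core idea in the abstract, using the fraction a protein is expected to appear in to decide between members of an ambiguous protein group, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not list which physical properties are used or how the learned model works, so the sketch assumes a single property (theoretical molecular weight), replaces the machine learning step with a fixed lookup rule, and does not read MaxQuant tables. The names `Candidate`, `expected_fraction`, `resolve_group`, and the mass bounds are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """One member of an ambiguous protein group (hypothetical record)."""
    accession: str
    mass_kda: float  # theoretical molecular weight; the only property used in this sketch


def expected_fraction(mass_kda: float, upper_bounds_kda: Sequence[float]) -> int:
    """Return the index of the first fraction whose upper mass bound covers the protein.

    Assumption: fractions are ordered from light to heavy, as in a gel cut into slices.
    A protein heavier than every bound falls into one extra, final fraction.
    """
    for index, bound in enumerate(upper_bounds_kda):
        if mass_kda <= bound:
            return index
    return len(upper_bounds_kda)


def resolve_group(
    candidates: Sequence[Candidate],
    observed_fraction: int,
    upper_bounds_kda: Sequence[float],
    tolerance: int = 0,
) -> Optional[Candidate]:
    """Pick the single group member expected in the fraction where the peptides were seen.

    Returns None when zero or several members fit, so the group stays unresolved
    rather than being assigned by guesswork.
    """
    matches: List[Candidate] = [
        c
        for c in candidates
        if abs(expected_fraction(c.mass_kda, upper_bounds_kda) - observed_fraction) <= tolerance
    ]
    return matches[0] if len(matches) == 1 else None
```

With hypothetical bounds `[25.0, 50.0, 100.0]` and a group containing an 82 kDa and a 41 kDa isoform, peptides observed in fraction 1 would be assigned to the 41 kDa isoform, while peptides observed in a fraction that neither or both isoforms are expected in would leave the group unresolved.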

Similar articles

1
Re-fraction: a machine learning approach for deterministic identification of protein homologues and splice variants in large-scale MS-based proteomics.
J Proteome Res. 2012 May 4;11(5):3035-45. doi: 10.1021/pr300072j. Epub 2012 Mar 30.
2
Proteomics-grade de novo sequencing approach.
J Proteome Res. 2005 Nov-Dec;4(6):2348-54. doi: 10.1021/pr050288x.
3
STEM: a software tool for large-scale proteomic data analyses.
J Proteome Res. 2005 Sep-Oct;4(5):1826-31. doi: 10.1021/pr050167x.
4
VEMS 3.0: algorithms and computational tools for tandem mass spectrometry based identification of post-translational modifications in proteins.
J Proteome Res. 2005 Nov-Dec;4(6):2338-47. doi: 10.1021/pr050264q.
5
How do shotgun proteomics algorithms identify proteins?
Nat Biotechnol. 2007 Jul;25(7):755-7. doi: 10.1038/nbt0707-755.
6
Analysis of mass spectrometry data in proteomics.
Methods Mol Biol. 2008;453:105-22. doi: 10.1007/978-1-60327-429-6_4.
7
Detection of alternative splice variants at the proteome level in Aspergillus flavus.
J Proteome Res. 2010 Mar 5;9(3):1209-17. doi: 10.1021/pr900602d.
8
Semi-supervised learning for peptide identification from shotgun proteomics datasets.
Nat Methods. 2007 Nov;4(11):923-5. doi: 10.1038/nmeth1113. Epub 2007 Oct 21.
9
A support vector machine model for the prediction of proteotypic peptides for accurate mass and time proteomics.
Bioinformatics. 2008 Jul 1;24(13):1503-9. doi: 10.1093/bioinformatics/btn218. Epub 2008 May 3.
10
Added value for tandem mass spectrometry shotgun proteomics data validation through isoelectric focusing of peptides.
J Proteome Res. 2005 Nov-Dec;4(6):2273-82. doi: 10.1021/pr050193v.

Cited by

1
Mitochondrial CoQ deficiency is a common driver of mitochondrial oxidants and insulin resistance.
Elife. 2018 Feb 6;7:e32111. doi: 10.7554/eLife.32111.
2
Novel protein isoforms of carcinoembryonic antigen are secreted from pancreatic, gastric and colorectal cancer cells.
BMC Res Notes. 2013 Sep 26;6:381. doi: 10.1186/1756-0500-6-381.