Applied Computational Biology and Bioinformatics Group, Cancer Research UK, Paterson Institute for Cancer Research, The University of Manchester, Manchester, United Kingdom.
PLoS One. 2010 Jan 28;5(1):e8949. doi: 10.1371/journal.pone.0008949.
Most protein mass spectrometry (MS) experiments rely on searches against a database of known or predicted proteins, which limits their usefulness as a gene discovery tool.
Using a search against an in silico translation of the entire human genome, combined with a series of annotation filters, we identified 346 putative novel peptides [False Discovery Rate (FDR) < 5%] in an MS dataset derived from two human breast epithelial cell lines. A subset of these peptides was then successfully validated by a different MS technique. Two of the validated peptides correspond to novel isoforms of Heterogeneous Ribonuclear Proteins, while the rest correspond to novel loci.
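To make the idea of an in silico genome translation concrete, the following is a minimal sketch of a six-frame translation in Python. The abstract does not state how the translation was performed, so the six-frame approach, the function names (`six_frame_translation`, `candidate_orfs`) and the `min_length` cutoff are all illustrative assumptions, not the authors' pipeline.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate both strands in all three reading frames (assumed approach)."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


def candidate_orfs(protein, min_length=7):
    """Split a translated frame at stop codons; min_length is a hypothetical cutoff."""
    return [segment for segment in protein.split("*") if len(segment) >= min_length]
```

In a workflow of this kind, the stop-to-stop segments from every frame would form the search database against which MS spectra are matched, with annotation filters applied afterwards to separate known proteins from putative novel peptides.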
MS technology can be used for ab initio gene discovery in human data. Because it rests on different underlying assumptions from other techniques, it identifies protein-coding genes that those techniques do not find. As MS technology continues to evolve, such approaches will become increasingly powerful.