Fundamental and Computational Sciences Directorate, Pacific Northwest National Laboratory, 999 Battelle Boulevard, Richland, WA 99352, USA.
Bioinformatics. 2010 Jul 1;26(13):1601-7. doi: 10.1093/bioinformatics/btq245. Epub 2010 May 21.
Ion mobility spectrometry (IMS) has gained significant traction over the past few years for rapid, high-resolution separation of analytes based on gas-phase ion structure, with significant potential impact on proteomic analysis. Coupled with mass spectrometry (MS), IMS offers several improvements over traditional proteomics techniques: elucidation of secondary structure information, identification of post-translational modifications, and higher identification rates with shorter experiment times. The high-throughput nature of this technique benefits from accurate calculation of peptide cross sections, mobilities and associated drift times, which in turn enhances downstream data analysis. Here, we present a model that uses physicochemical properties of peptides to accurately predict a peptide's drift time directly from its amino acid sequence. The model is used in conjunction with two regression techniques: partial least squares regression and support vector regression.
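As an illustration of what predicting from sequence can involve, the sketch below turns a peptide sequence and charge state into a small set of physicochemical descriptors that a regression model could take as input. The abstract does not list the properties actually used, so the choice of features here (length, monoisotopic mass, m/z, mean Kyte-Doolittle hydropathy, number of basic residues) and the function name `peptide_features` are assumptions for illustration, not the imPredict feature set.

```python
# Hypothetical descriptor extraction; not the published imPredict features.

# Monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}

# Kyte-Doolittle hydropathy index per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

WATER_MASS = 18.01056   # added once per peptide for the free termini
PROTON_MASS = 1.007276  # mass of each charge-carrying proton


def peptide_features(sequence, charge):
    """Return a dict of simple descriptors for an unmodified peptide."""
    seq = sequence.upper()
    unknown = sorted(set(seq) - set(RESIDUE_MASS))
    if not seq or unknown:
        raise ValueError(f"empty sequence or unknown residues: {unknown}")
    if charge < 1:
        raise ValueError("charge must be a positive integer")

    mass = sum(RESIDUE_MASS[r] for r in seq) + WATER_MASS
    return {
        "length": len(seq),
        "mass": mass,
        "mz": (mass + charge * PROTON_MASS) / charge,
        "mean_hydropathy": sum(KYTE_DOOLITTLE[r] for r in seq) / len(seq),
        "basic_residues": sum(seq.count(r) for r in "KRH"),
    }


if __name__ == "__main__":
    for name, value in peptide_features("GAK", 2).items():
        print(name, value)
```

Feature vectors of this kind, paired with measured drift times, would form the training table for a partial least squares or support vector regression. Fitting a separate model per charge state would be one plausible design, given that the results are reported per charge state.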
When tested on an experimentally created, high-confidence database of 8675 peptide sequences with measured drift times, both techniques statistically significantly outperform calculations based on intrinsic size parameters, the current practice in the field, for all charge states (+2, +3 and +4).
The software executable, imPredict, is available for download from http://omics.pnl.gov/software/imPredict.php
Supplementary data are available at Bioinformatics online.