Chang Chih-Hsiang, Yeung Darien, Spicer Victor, Ogata Kosuke, Krokhin Oleg, Ishihama Yasushi
Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto 606-8501, Japan.
Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, Manitoba R3E 0J9, Canada.
J Proteome Res. 2021 Jun 16. doi: 10.1021/acs.jproteome.1c00185.
The contribution of peptide amino acid sequence to collision cross section (CCS) values has been investigated using a dataset of ∼134 000 peptides in four different charge states (1+ to 4+). The migration data were acquired by two-dimensional liquid chromatography (LC)/trapped ion mobility spectrometry/quadrupole/time-of-flight mass spectrometry (MS) analysis of HeLa cell digests generated with seven different proteases, and were converted to CCS values. Building on previously reported modeling approaches based on intrinsic size parameters (ISPs), we extended the methodology to encode the position of individual residues within a peptide sequence. A generalized prediction model was built by dividing the dataset into eight groups (four charge states for each of tryptic and nontryptic peptides). Position-dependent ISPs were optimized independently for the eight subsets of peptides, resulting in a prediction accuracy of ∼0.981 for the entire population of peptides. We find that ion mobility is strongly affected by the peptide's ability to solvate its positively charged sites. Internal positioning of polar residues and proline leads to decreased CCS values because these residues improve charge solvation; conversely, this ability decreases with increasing peptide charge due to electrostatic repulsion. Furthermore, higher helical propensity and peptide hydrophobicity result in preferential formation of extended structures with higher than predicted CCS values. Finally, acidic and basic residues exhibit position-dependent ISP behavior consistent with electrostatic interaction with the peptide macrodipole, which affects peptide helicity. The MS raw data files have been deposited with the ProteomeXchange Consortium via the jPOST partner repository (http://jpostdb.org) with the dataset identifiers PXD021440/JPST000959, PXD022800/JPST001017, and PXD026087/JPST001176.
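The grouping and position encoding described above can be illustrated with a minimal Python sketch. It is not the authors' model: the three-way position encoding (N-terminal, internal, C-terminal), the rule that a peptide counts as tryptic when it ends in K or R, the averaging of parameters, and every numeric value in `toy_params` are assumptions made purely for illustration. The abstract does not specify the encoding, the functional form, or any fitted ISP values.

```python
from collections import defaultdict


def is_tryptic(sequence):
    """Simplifying assumption: a peptide is 'tryptic' if it ends in K or R."""
    return sequence[-1] in "KR"


def group_key(sequence, charge):
    """Assign a peptide to one of eight groups: charge state x tryptic/nontryptic."""
    return (charge, is_tryptic(sequence))


def position_bin(index, length):
    """Hypothetical position encoding: N-terminal, internal, or C-terminal."""
    if index == 0:
        return "n_term"
    if index == length - 1:
        return "c_term"
    return "internal"


def isp_score(sequence, params, default=1.0):
    """Average the position-dependent parameters over all residues.

    Residue/position pairs missing from `params` fall back to `default`.
    """
    n = len(sequence)
    total = sum(
        params.get((aa, position_bin(i, n)), default)
        for i, aa in enumerate(sequence)
    )
    return total / n


# Placeholder parameters for one group (charge 2+, tryptic); values are invented.
toy_params = {
    (2, True): {
        ("P", "internal"): 0.90,
        ("S", "internal"): 0.95,
        ("L", "internal"): 1.05,
    },
}

# Sort a few made-up peptides into their groups.
peptides = [("ASPLK", 2), ("LLSPR", 2), ("ASPLE", 2)]
groups = defaultdict(list)
for seq, z in peptides:
    groups[group_key(seq, z)].append(seq)

# Score one peptide with the parameter set of its own group.
score = isp_score("ASPLK", toy_params[group_key("ASPLK", 2)])
print(dict(groups))
print(round(score, 3))
```

In this sketch each group carries its own parameter table, mirroring the statement that position-dependent ISPs were optimized independently for the eight subsets; fitting those tables to measured CCS values is outside the scope of the example.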