Yu Fengchao, Li Ning, Yu Weichuan
Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Hong Kong, China.
Division of Life Science, The Hong Kong University of Science and Technology, Hong Kong, China.
BMC Bioinformatics. 2016 May 20;17(1):217. doi: 10.1186/s12859-016-1073-y.
Chemical cross-linking combined with mass spectrometry (CX-MS) is a high-throughput approach to studying protein-protein interactions. The number of peptide-peptide combinations grows quadratically with the number of proteins, which makes the search computationally expensive. Widely used methods, including xQuest (Rinner et al., Nat Methods 5(4):315-8, 2008; Walzthoeni et al., Nat Methods 9(9):901-3, 2012), pLink (Yang et al., Nat Methods 9(9):904-6, 2012), ProteinProspector (Chu et al., Mol Cell Proteomics 9:25-31, 2010; Trnka et al., Mol Cell Proteomics 13(2):420-34, 2014) and Kojak (Hoopmann et al., J Proteome Res 14(5):2190-198, 2015), avoid searching all peptide-peptide combinations by pre-selecting peptides with heuristic approaches. However, pre-selection can discard true cross-linked peptides before they are ever scored, so some identifications may be missed. The most intuitive alternative is to search all possible candidates. A tool that can exhaustively search a whole database without any heuristic pre-selection procedure is therefore desirable.
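The quadratic growth described above can be made concrete with a minimal sketch. It is not ECL's algorithm; it only illustrates a brute-force enumeration of every peptide pair whose combined mass matches a precursor mass. The function names, the inclusion of self-pairs, and the mass values are assumptions chosen for illustration.

```python
from itertools import combinations_with_replacement


def count_pairs(n_peptides):
    """Number of unordered peptide pairs, self-pairs included: n(n+1)/2."""
    return n_peptides * (n_peptides + 1) // 2


def exhaustive_pairs(peptide_masses, linker_mass, precursor_mass, tol_da):
    """Return every index pair (i, j), i <= j, whose cross-linked mass
    (mass_i + mass_j + linker_mass) lies within tol_da of precursor_mass.

    Hypothetical helper: no pre-selection is applied, so all
    n(n+1)/2 pairs are visited.
    """
    hits = []
    indices = range(len(peptide_masses))
    for i, j in combinations_with_replacement(indices, 2):
        total = peptide_masses[i] + peptide_masses[j] + linker_mass
        if abs(total - precursor_mass) <= tol_da:
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    masses = [500.0, 700.0, 900.0]   # illustrative peptide masses (Da)
    linker = 138.0                   # illustrative linker mass (Da)
    print(count_pairs(len(masses)))
    print(exhaustive_pairs(masses, linker, 1538.0, 0.01))
```

Doubling the number of peptides roughly quadruples the number of pairs visited, which is why existing tools pre-select candidates and why an exhaustive search needs a more efficient strategy than this naive loop.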
We have developed a cross-linked peptide identification tool named ECL. It can exhaustively search a whole database in a reasonable period of time without any heuristic pre-selection procedure. In our tests, searching a database containing 5200 proteins took 7 h. ECL identified more non-redundant cross-linked peptides than xQuest, pLink, and ProteinProspector. Experiments showed that about 30% of these additionally identified peptides were not pre-selected by Kojak. We used protein crystal structures from the Protein Data Bank to check the intra-protein cross-linked peptides. Most of the distances between cross-linking sites were smaller than 30 Å.
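The structural check mentioned above amounts to measuring the distance between the two cross-linked sites in a crystal structure and comparing it with a cutoff. The sketch below assumes the site coordinates (in Å) have already been extracted from a structure file; the function names are hypothetical, and which atoms the authors measured between is not stated here.

```python
import math


def site_distance(coord_a, coord_b):
    """Euclidean distance in angstroms between two (x, y, z) coordinates."""
    return math.dist(coord_a, coord_b)


def fraction_within(site_pairs, cutoff=30.0):
    """Fraction of cross-linked site pairs whose distance is below cutoff.

    site_pairs is a list of ((x, y, z), (x, y, z)) tuples; the 30 A default
    mirrors the threshold quoted in the text.
    """
    if not site_pairs:
        return 0.0
    within = sum(1 for a, b in site_pairs if site_distance(a, b) < cutoff)
    return within / len(site_pairs)


if __name__ == "__main__":
    pairs = [
        ((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)),   # 5 A apart
        ((0.0, 0.0, 0.0), (0.0, 0.0, 40.0)),  # 40 A apart
    ]
    print(fraction_within(pairs))
```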
To the best of our knowledge, ECL is the first tool that can exhaustively search all candidates in cross-linked peptide identification. The experiments showed that ECL could identify more peptides than xQuest, pLink, and ProteinProspector. Further analysis indicated that some of the additional identifications were attributable to the exhaustive search.