Clinical and Experimental Pharmacology Group, CRUK Manchester Institute, University of Manchester, Manchester M20 4BX, UK, Stem Cell and Leukaemia Proteomics Laboratory, Institute of Cancer Sciences, Manchester Academic Health Science Centre, Wolfson Molecular Imaging Centre, University of Manchester, Manchester M20 3LJ, UK and Centre for Biostatistics, Institute of Population Health, University of Manchester, Oxford Road, Manchester M13 9PL, UK.
Bioinformatics. 2014 Feb 15;30(4):549-58. doi: 10.1093/bioinformatics/btt722. Epub 2013 Dec 15.
Isobaric tags for relative and absolute quantitation (iTRAQ) is a widely used method in quantitative proteomics. A robust data analysis strategy is required to assess the reliability of protein quantification, i.e. to distinguish changes due to biological regulation from technical variation, so that differentially expressed proteins can be identified.
Samples were created by mixing 5, 10, 15 and 20 μg of Escherichia coli cell lysate with 100 μg of mouse cell lysate, corresponding to expected relative fold changes of 1 for mouse proteins and from 0.25 to 4 for E. coli proteins. Relative quantification was carried out using eight-channel isobaric tagging with iTRAQ reagent, and proteins were identified using a TripleTOF 5600 mass spectrometer. The technical variation inherent in this iTRAQ dataset was systematically investigated.
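The expected fold changes quoted above follow directly from the spiked-in amounts. As a minimal sketch, the arithmetic can be reproduced as below; the variable and function names are illustrative assumptions, and the assignment of the four amounts to the eight iTRAQ channels is not stated here, so it is not modelled.

```python
from fractions import Fraction
from itertools import permutations

# Spiked-in amounts from the sample design (micrograms of lysate).
ecoli_ug = [5, 10, 15, 20]
mouse_ug = 100  # constant background in every sample

def expected_fold_changes(amounts):
    """Return the expected ratio a/b for every ordered pair of distinct amounts.

    Hypothetical helper for illustration only; it is not part of WHATraq.
    """
    return {(a, b): Fraction(a, b) for a, b in permutations(amounts, 2)}

ecoli_fc = expected_fold_changes(ecoli_ug)

# Mouse lysate is identical across samples, so its expected ratio is always 1.
mouse_fc = Fraction(mouse_ug, mouse_ug)

lowest = min(ecoli_fc.values())   # 5 / 20
highest = max(ecoli_fc.values())  # 20 / 5

for (a, b), fc in sorted(ecoli_fc.items()):
    print(f"{a:>2} ug vs {b:>2} ug: expected fold change {float(fc):.3f}")
print(f"E. coli range: {float(lowest)} to {float(highest)}; mouse: {float(mouse_fc)}")
```

Exact fractions are used so that ratios such as 5/15 are not rounded before they are compared.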
A hierarchical statistical model was developed that uses quantitative information at the peptide and protein levels simultaneously to estimate the variation present in each individual peptide and protein. A novel data analysis strategy for iTRAQ, denoted WHATraq, was subsequently proposed, and its performance was evaluated by the proportion of E. coli proteins successfully identified as differentially expressed. Compared with two benchmark data analysis strategies, WHATraq identified at least 62.8% more true-positive differentially expressed proteins. In further validation on a biological iTRAQ dataset comprising multiple biological replicates from varied murine cell lines, WHATraq performed consistently and identified 375% more proteins as differentially expressed among the cell lines than the other data analysis strategies.