Villanueva Josep, Philip John, Chaparro Carlos A, Li Yongbiao, Toledo-Crow Ricardo, DeNoyer Lin, Fleisher Martin, Robbins Richard J, Tempst Paul
Protein Center, Molecular Biology Program, Engineering Resource Laboratory, Department of Clinical Laboratories, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, New York, NY 10021, USA.
J Proteome Res. 2005 Jul-Aug;4(4):1060-72. doi: 10.1021/pr050034b.
"Molecular signatures" are the qualitative and quantitative patterns of groups of biomolecules (e.g., mRNA, proteins, peptides, or metabolites) in a cell, tissue, biological fluid, or an entire organism. To apply this concept to biomarker discovery, the measurements should ideally be noninvasive and performed in a single read-out. We have therefore developed a peptidomics platform that couples magnetics-based, automated solid-phase extraction of small peptides with a high-resolution MALDI-TOF mass spectrometric readout (Villanueva, J.; Philip, J.; Entenberg, D.; Chaparro, C. A.; Tanwar, M. K.; Holland, E. C.; Tempst, P. Anal. Chem. 2004, 76, 1560-1570). Since hundreds of peptides can be detected in microliter volumes of serum, it allows to search for disease signatures, for instance in the presence of cancer. We have now evaluated, optimized, and standardized a number of clinical and analytical chemistry variables that are major sources of bias; ranging from blood collection and clotting, to serum storage and handling, automated peptide extraction, crystallization, spectral acquisition, and signal processing. In addition, proper alignment of spectra and user-friendly visualization tools are essential for meaningful, certifiable data mining. We introduce a minimal entropy algorithm, "Entropycal", that simplifies alignment and subsequent statistical analysis and increases the percentage of the highly distinguishing spectral information being retained after feature selection of the datasets. Using the improved analytical platform and tools, and a commercial statistics program, we found that sera from thyroid cancer patients can be distinguished from healthy controls based on an array of 98 discriminant peptides. With adequate technological and computational methods in place, and using rigorously standardized conditions, potential sources of patient related bias (e.g., gender, age, genetics, environmental, dietary, and other factors) may now be addressed.