Office of Chief Medical Examiner, 421 East 26th Street, New York, NY, 10016, USA.
Institute for Systems Genetics, Department of Biochemistry and Molecular Pharmacology, NYU Grossman School of Medicine, New York, NY, 10016, USA.
Sci Rep. 2021 May 25;11(1):10900. doi: 10.1038/s41598-021-90231-5.
Proteogenomics is an increasingly common method for species identification because it allows rapid and inexpensive interrogation of an unknown organism's proteome, even when the proteome is partially degraded. The proteomic method typically uses tandem mass spectrometry to survey all peptides detectable in a sample that frequently contains hundreds or thousands of proteins. Species identification is based on detection of a small number of species-specific peptides. Genetic analysis of proteins by mass spectrometry, however, is a developing field, and the bone proteome, typically consisting of only two proteins, pushes the limits of this technology. Nearly 20% of highly confident spectra from modern human bone samples identify non-human species when searched against a vertebrate database, as would be necessary with a fragment of unknown bone. These non-human peptides are often the result of current limitations in mass spectrometry or of errors in algorithmic interpretation. Consequently, it is difficult to know whether a "species-specific" peptide used to identify a sample is actually present in that sample. Here we evaluate the causes of peptide sequence errors and propose an unbiased, probabilistic approach to determine the likelihood that a species is correctly identified from bone without relying on species-specific peptides.
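The abstract does not describe the probabilistic model itself, so the following is only a toy illustration of the general idea of scoring species from all peptide matches rather than from a few species-specific ones. It is not the authors' method. The function name `species_posterior`, the independence assumption, the uniform prior, and the example data are all hypothetical; the default `error_rate` of 0.2 is loosely borrowed from the roughly 20% of confident spectra that the abstract reports as pointing to non-human species.

```python
import math


def species_posterior(psm_species_sets, candidates, error_rate=0.2):
    """Toy posterior over candidate species under a uniform prior.

    Assumption (not from the paper): each peptide-spectrum match (PSM) is
    independent evidence. A PSM whose peptide occurs in a species supports
    that species with weight (1 - error_rate); otherwise it is treated as
    an erroneous match with weight error_rate.
    """
    log_like = {}
    for species in candidates:
        total = 0.0
        for matched in psm_species_sets:
            weight = 1.0 - error_rate if species in matched else error_rate
            total += math.log(weight)
        log_like[species] = total

    # Subtract the maximum before exponentiating to avoid underflow
    # when there are many PSMs.
    peak = max(log_like.values())
    weights = {sp: math.exp(v - peak) for sp, v in log_like.items()}
    norm = sum(weights.values())
    return {sp: w / norm for sp, w in weights.items()}


# Hypothetical example: each set lists the species whose proteome
# contains the peptide assigned to one spectrum.
psms = [
    {"human", "chimp"},
    {"human"},
    {"human", "chimp", "cow"},
    {"cow"},  # a confident but possibly erroneous non-human match
    {"human", "chimp"},
]
posterior = species_posterior(psms, ["human", "chimp", "cow"])
for name, prob in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {prob:.3f}")
```

In this sketch, peptides shared between species still contribute evidence, and a single conflicting match lowers the score of a species without excluding it, which is the kind of behavior a method that does not depend on species-specific peptides would need.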