School of Pharmacy, University of Wisconsin-Madison, 777 Highland Avenue, Madison, Wisconsin 53705, United States.
Department of Chemistry, University of Wisconsin-Madison, 1101 University Avenue, Madison, Wisconsin 53706, United States.
J Am Soc Mass Spectrom. 2024 Aug 7;35(8):1902-1912. doi: 10.1021/jasms.4c00192. Epub 2024 Jul 26.
Endogenous peptides are an abundant and versatile class of biomolecules with vital roles in the nervous, endocrine, immune, and other systems. Mass spectrometry is a premier technique for identifying endogenous peptides, yet the field still lacks optimized computational resources for reliable analysis and interpretation of raw mass spectra. Current database searching programs can yield discrepant results because endogenous peptides have unique properties that typically require specialized search considerations. Herein, we present a novel, high-throughput scoring algorithm for extracting and ranking conserved amino acid sequence motifs within any endogenous peptide database. Motifs are patterns conserved across organisms, representing sequence moieties crucial for biological functions, including maintenance of homeostasis. MotifQuest, our novel motif database generation algorithm, is designed to work in partnership with EndoGenius, a program optimized for database searching of endogenous peptides that uses a motif database to capitalize on biological context when producing identifications. MotifQuest aims to quickly build motif databases without any prior knowledge, a laborious task not possible with traditional sequence alignment resources. In this work, we illustrate how MotifQuest extends EndoGenius' identification capability to other endogenous peptides by showcasing its ability to identify antimicrobial peptides. Additionally, we discuss the potential of MotifQuest to parse out motifs from a FASTA database file that can be further validated as new peptide drug candidates.
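To make the idea of extracting and ranking conserved motifs from a FASTA database concrete, the following is a minimal sketch only. It is not the MotifQuest scoring algorithm, which the abstract does not specify. It assumes a deliberately simple stand-in score (the number of distinct database sequences that contain a given substring), and the function names `parse_fasta` and `rank_motifs`, the length bounds, and the toy sequences are all hypothetical choices made for illustration.

```python
from collections import defaultdict


def parse_fasta(text):
    """Return the list of sequences found in FASTA-formatted text."""
    sequences, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header line closes the previous record, if any.
            if current:
                sequences.append("".join(current))
                current = []
        else:
            current.append(line.upper())
    if current:
        sequences.append("".join(current))
    return sequences


def rank_motifs(sequences, min_len=3, max_len=5, min_support=2):
    """Rank substrings by how many distinct sequences contain them.

    Assumption: "support" (count of sequences containing the substring)
    is an illustrative stand-in score, not the published scoring scheme.
    """
    support = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for k in range(min_len, max_len + 1):
            for start in range(len(seq) - k + 1):
                support[seq[start:start + k]].add(idx)
    ranked = [
        (motif, len(ids))
        for motif, ids in support.items()
        if len(ids) >= min_support
    ]
    # Highest support first, then longer motifs, then alphabetical order.
    ranked.sort(key=lambda item: (-item[1], -len(item[0]), item[0]))
    return ranked


if __name__ == "__main__":
    # Toy, invented peptide sequences (not taken from any real database).
    toy_fasta = """>pep1
AAFLRF
>pep2
GGFLRF
>pep3
KKFLRF
>pep4
PPQWST
"""
    for motif, count in rank_motifs(parse_fasta(toy_fasta)):
        print(motif, count)
```

In this toy input, three of the four sequences share a C-terminal stretch, so substrings from that stretch rank highest. A realistic scoring scheme would also need to account for factors such as motif position and redundancy between overlapping substrings, which this sketch ignores.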