Bochalis Eleftherios, Patsakis Michail, Chantzi Nikol, Mouratidis Ioannis, Chartoumpekis Dionysios, Georgakopoulos-Soares Ilias
Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA.
Department of Internal Medicine, Division of Endocrinology, Medical School, University of Patras, Patras, Greece.
bioRxiv. 2025 Feb 10:2025.02.05.636664. doi: 10.1101/2025.02.05.636664.
The identification of succinct, universal fingerprints that characterize individual taxa can reveal insights into trait development and has widespread applications in pathogen diagnostics, human healthcare, ecology and the characterization of biomes. Here, we investigated the existence of peptide k-mer sequences that are present exclusively in one taxon and absent from all other taxa, termed taxonomic quasi-primes. By analyzing proteomes across 24,073 species, we identified quasi-prime peptides specific to superkingdoms, kingdoms, and phyla, uncovering their taxonomic distributions and functional relevance. These peptides exhibit remarkable sequence uniqueness at six- and seven-amino-acid lengths, offering insights into evolutionary divergence and lineage-specific adaptations. Moreover, we show that human quasi-prime loci are more prone to harboring pathogenic variants, underscoring their functional significance. This study introduces taxonomic quasi-primes and offers insights into their contributions to proteomic diversity, evolutionary pathways, and functional adaptations across the tree of life, while emphasizing their potential impact on human health and disease.
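The definition of a taxonomic quasi-prime (a peptide k-mer found in exactly one taxon and in no other) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `kmers` and `taxonomic_quasi_primes`, the in-memory dictionary layout, and the toy sequences are all assumptions made for illustration, and a real analysis over 24,073 proteomes would need a far more memory-efficient approach.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every overlapping peptide k-mer of length k in a protein sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def taxonomic_quasi_primes(proteomes, k):
    """Return, for each taxon, the k-mers that occur in that taxon only.

    `proteomes` is a hypothetical input format: a dict mapping a taxon
    name to a list of protein sequences (one-letter amino acid codes).
    """
    # Record which taxa each k-mer has been observed in.
    seen_in = defaultdict(set)
    for taxon, proteins in proteomes.items():
        for protein in proteins:
            for kmer in kmers(protein, k):
                seen_in[kmer].add(taxon)

    # A k-mer is a quasi-prime for a taxon if that is the only taxon containing it.
    result = {taxon: set() for taxon in proteomes}
    for kmer, taxa in seen_in.items():
        if len(taxa) == 1:
            result[next(iter(taxa))].add(kmer)
    return result


# Toy data, invented for illustration only (not real proteomes).
toy = {
    "Bacteria": ["MKTAYIAK", "MKTLLV"],
    "Archaea": ["MKTAYQ", "GGSLLV"],
}
qp = taxonomic_quasi_primes(toy, 3)
print(sorted(qp["Bacteria"]))
print(sorted(qp["Archaea"]))
```

In this toy case the 3-mers `MKT`, `KTA`, `TAY`, and `LLV` appear in both groups and are therefore excluded. Only k-mers confined to a single group are reported. The abstract reports that such uniqueness emerges at lengths of six and seven amino acids. `k=3` is used here only to keep the example small.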