Escudeiro Pedro, Henry Christopher S, Dias Ricardo P M
BioISI - Instituto de Biosistemas e Ciências Integrativas, Faculdade de Ciências, Universidade de Lisboa, Lisboa 1749-016, Portugal.
Argonne National Laboratory, Lemont, Illinois, USA.
Curr Res Microb Sci. 2022 Aug 7;3:100159. doi: 10.1016/j.crmicr.2022.100159. eCollection 2022.
Between eight hundred thousand and one trillion prokaryotic species may inhabit our planet, yet fewer than two hundred thousand have been described. This uncharted fraction of microbial diversity, together with its undisclosed coding potential, is known as the "microbial dark matter" (MDM). Next-generation sequencing has made it possible to collect massive amounts of genome sequence data, leading to unprecedented advances in genomics. Still, extracting new functional information from the genomes of uncultured prokaryotes is often limited by standard classification methods, which typically rely on sequence similarity searches against reference genomes from cultured species. This reliance hinders the discovery of unique genetic elements that are absent from the cultivated realm. It also contributes to the accumulation of prokaryotic gene products of unknown function in public sequence data repositories, highlighting the need for new approaches to sequencing data analysis and classification. Increasing evidence indicates that these proteins of unknown function might be a treasure trove of biotechnological potential. Here, we outline the challenges, opportunities, and potential hidden within the functional dark matter (FDM) of prokaryotes. We also discuss the pitfalls of the molecular and computational approaches currently used to probe these uncharted waters, and consider future opportunities for research and applications.