Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), State Key Laboratory of Biocontrol, Sun Yat-sen University, Guangzhou, 510006, China.
Institute of Marine Science and Technology, Shandong University, Qingdao, 266237, China.
Microbiome. 2022 Jul 4;10(1):101. doi: 10.1186/s40168-022-01292-1.
Phosphorus (P) is one of the most essential macronutrients on the planet, and microorganisms (including bacteria and archaea) play a key role in P cycling in all living things and ecosystems. However, a comprehensive understanding of key P cycling genes (PCGs) and microorganisms (PCMs), as well as their ecological functions, remains elusive even with the rapid advancement of metagenome sequencing technologies. One of the major challenges is the lack of a comprehensive and accurately annotated P cycling functional gene database.
In this study, we constructed a well-curated P cycling database (PCycDB) covering 139 gene families and 10 P metabolic processes, including several previously ignored PCGs such as pafA, which encodes a phosphate-insensitive phosphatase, ptxABCD (phosphite-related genes), and the novel aepXVWPS genes for 2-aminoethylphosphonate transporters. On simulated gene datasets, PCycDB achieved an annotation accuracy, positive predictive value (PPV), sensitivity, specificity, and negative predictive value (NPV) of 99.8%, 96.1%, 99.9%, 99.8%, and 99.9%, respectively. Compared to other orthology databases, PCycDB is more accurate, more comprehensive, and faster at profiling PCGs. We used PCycDB to analyze P cycling microbial communities from representative natural and engineered environments and showed that it is applicable across different environments.
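The five evaluation metrics reported above follow the standard confusion-matrix definitions. The sketch below shows how they can be computed from counts of true and false positives and negatives. The `ConfusionCounts` class, the `annotation_metrics` function, and the example counts are hypothetical and chosen only for illustration; they are not the study's data or evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts of annotation outcomes (hypothetical container for illustration)."""
    tp: int  # sequences correctly annotated as a target gene
    fp: int  # sequences wrongly annotated as a target gene
    tn: int  # non-target sequences correctly left unannotated
    fn: int  # target sequences that were missed


def annotation_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Compute the standard confusion-matrix metrics as fractions in [0, 1]."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "ppv": c.tp / (c.tp + c.fp),          # positive predictive value
        "sensitivity": c.tp / (c.tp + c.fn),  # true positive rate
        "specificity": c.tn / (c.tn + c.fp),  # true negative rate
        "npv": c.tn / (c.tn + c.fn),          # negative predictive value
    }


# Made-up counts, NOT taken from the PCycDB evaluation.
example = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
metrics = annotation_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

When target genes are rare relative to non-target sequences, as in these made-up counts, accuracy, specificity, and NPV can all be high while PPV is noticeably lower, which is why PPV is reported separately.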
We demonstrate that PCycDB is a powerful tool for advancing our understanding of microbially driven P cycling in the environment, offering high coverage, high accuracy, and rapid analysis of metagenome sequencing data. PCycDB is available at https://github.com/ZengJiaxiong/Phosphorus-cycling-database.