Novikova Polina V, Bhanu Busi Susheel, Probst Alexander J, May Patrick, Wilmes Paul
Systems Ecology, Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette L-4362, Luxembourg.
UK Centre for Ecology and Hydrology, Wallingford, OX10 8BB, United Kingdom.
ISME Commun. 2024 Jan 10;4(1):ycad014. doi: 10.1093/ismeco/ycad014. eCollection 2024 Jan.
The human gastrointestinal tract contains diverse microbial communities, including archaea. Among them, Methanobrevibacter smithii represents a highly active and clinically relevant methanogenic archaeon that is involved in gastrointestinal disorders such as inflammatory bowel disease and obesity. Herein, we present an integrated approach that combines sequence and structure information to improve the annotation of M. smithii proteins with advanced protein structure prediction and annotation tools, such as AlphaFold2, trRosetta, ProFunc, and DeepFri. Of an initial set of 873 481 archaeal proteins, we found 707 754 proteins exclusively present in the human gut. Having analysed archaeal proteins together with 87 282 994 bacterial proteins, we identified unique archaeal proteins and archaeal-bacterial homologs. We then predicted and characterized functional domains and structures of 73 unique and homologous archaeal protein clusters linked to the human gut and M. smithii. We refined annotations based on the predicted structures, extending existing sequence similarity-based annotations. We identified gut-specific archaeal proteins that may be involved in defense mechanisms, virulence, adhesion, and the degradation of toxic substances. Interestingly, we identified potential glycosyltransferases that could be associated with N-linked and O-glycosylation. Additionally, we found preliminary evidence for interdomain horizontal gene transfer between species and , which includes . Our study broadens the understanding of archaeal biology, particularly of M. smithii, and highlights the importance of considering both sequence and structure for the prediction of protein function.