Boer Meine D, Melkonian Chrats, Zafeiropoulos Haris, Haas Andreas F, Garza Daniel R, Dutilh Bas E
Theoretical Biology and Bioinformatics, Utrecht University, 3584 CH Utrecht, the Netherlands.
Department Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research, PO Box 59, Den Burg 1790 AB, Texel, The Netherlands.
iScience. 2024 Nov 7;27(12):111349. doi: 10.1016/j.isci.2024.111349. eCollection 2024 Dec 20.
Deciphering microbial metabolism is essential for understanding ecosystem functions. Genome-scale metabolic models (GSMMs) predict metabolic traits from genomic data, but constructing GSMMs for uncultured bacteria is challenging because metagenome-assembled genomes are often incomplete, which leaves many gaps in the reconstructed networks. We introduce the deep neural network-guided imputation of reactomes (DNNGIOR), which uses AI to improve gap-filling by learning from the presence and absence of metabolic reactions across diverse bacterial genomes. Two key factors determine prediction accuracy: (1) the frequency of a reaction across all bacteria and (2) the phylogenetic distance of the query genome to the training genomes. DNNGIOR predictions achieve an average F1 score of 0.85 for reactions present in over 30% of training genomes. Compared with unweighted gap-filling, DNNGIOR-guided gap-filling was 14 times more accurate for draft reconstructions and 2 to 9 times more accurate for curated models.
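To make the two quantities in the abstract concrete, the following minimal sketch shows how an F1 score can be computed for predicted reaction presence, and how predicted probabilities might be turned into gap-filling weights. It is an illustration under stated assumptions, not the DNNGIOR implementation: the reaction IDs, the 0.5 threshold, the function names, and the `1 - p` cost rule are all hypothetical, and the abstract does not specify how the published method derives its weights.

```python
def f1_score(true_present, predicted_present):
    """F1 score for two sets of reaction IDs (true vs. predicted presence)."""
    tp = len(true_present & predicted_present)   # correctly predicted reactions
    fp = len(predicted_present - true_present)   # predicted but truly absent
    fn = len(true_present - predicted_present)   # present but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def gapfill_costs(probabilities, floor=0.01):
    """Hypothetical weighting: likely reactions get a low cost for the gap-filler.

    ASSUMPTION: cost = 1 - p with a small floor. This is an illustrative rule,
    not the weighting scheme reported for DNNGIOR.
    """
    return {rxn: max(floor, 1.0 - p) for rxn, p in probabilities.items()}


# Toy data with made-up reaction IDs and made-up network outputs.
true_rxns = {"R1", "R2", "R3", "R4"}
scores = {"R1": 0.95, "R2": 0.80, "R3": 0.40, "R5": 0.70, "R6": 0.05}

# ASSUMPTION: a reaction is called present when its score is at least 0.5.
predicted = {rxn for rxn, p in scores.items() if p >= 0.5}

f1 = f1_score(true_rxns, predicted)
costs = gapfill_costs(scores)
print(f"F1 = {f1:.3f}")
print(costs)
```

In an unweighted gap-filler every candidate reaction carries the same cost, so the solver has no reason to prefer a biologically plausible reaction over an implausible one. Per-reaction costs such as these let it favor reactions that the network considers likely to be present.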