Liu Anqi, Tian Bo, Qiu Chuan, Su Kuan-Jui, Jiang Lindong, Zhao Chen, Song Meng, Liu Yong, Qu Gang, Zhou Ziyu, Zhang Xiao, Gnanesh Shashank Sajjan Mungasavalli, Thumbigere-Math Vivek, Luo Zhe, Tian Qing, Zhang Li-Shu, Wu Chong, Ding Zhengming, Shen Hui, Deng Hong-Wen
Tulane Center for Biomedical Informatics and Genomics, Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA, USA.
Center for System Biology, Data Sciences, and Reproductive Health, School of Basic Medical Science, Central South University, Yuelu, Changsha, P.R. China.
bioRxiv. 2024 Sep 27:2024.09.25.614767. doi: 10.1101/2024.09.25.614767.
Short-chain fatty acids (SCFAs) are the main metabolites produced by bacterial fermentation of dietary fiber within the gastrointestinal tract. SCFAs produced by the gut microbiota (GM) are absorbed by the host, reach the bloodstream, and are distributed to different organs, thereby influencing host physiology. However, because of limited budgets or insufficient instrument sensitivity, most GM studies have incomplete blood SCFA data, which limits our understanding of metabolic processes within the host. To address this gap, we developed a multi-task multi-view integrative approach, MAE (Multi-task Multi-View Attentive Encoders), to impute blood SCFA levels from gut metagenomic sequencing (MGS) data while accounting for the interplay among the gut microbiome, dietary features, and host characteristics, as well as the dynamics of SCFAs within the body. Here, each view represents a distinct type of input data (i.e., gut microbiome composition, dietary features, or host characteristics). Our method jointly learns view-specific representations and cross-view correlations to predict SCFAs. We applied MAE to two in-house datasets, from 964 and 171 subjects respectively, each of which includes MGS profiles, blood SCFA profiles, host characteristics, and dietary features. Results from both datasets demonstrated that MAE outperforms traditional regression-based and neural-network-based approaches in imputing blood SCFAs. Furthermore, a series of gut bacterial species, host characteristics (e.g., race, gender), and dietary features (e.g., intake of fruits, pickles) were shown to contribute greatly to the imputation of blood SCFAs. These findings suggest that the GM, dietary features, and host characteristics may contribute to the complex biological processes involved in blood SCFA production, and they may pave the way for a deeper understanding of how these factors affect human health.
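The abstract describes MAE only at a high level: one encoder per view, attention across views, and several prediction tasks. As a rough illustration of that general pattern, and not the authors' architecture, the following minimal sketch runs an untrained forward pass in plain Python. Everything in it is an assumption made for illustration: the feature values are toy numbers, the layer sizes are arbitrary, the attention is a simple dot product against a shared query vector, and the three task names (acetate, propionate, butyrate) are hypothetical, since the abstract does not list which SCFAs are imputed.

```python
import math
import random


def init_layer(rng, n_in, n_out):
    """Small random weights and zero biases (illustrative, untrained)."""
    w = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return w, [0.0] * n_out


def linear(x, w, b):
    """Dense layer for a single sample: returns w @ x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(views, encoders, query, heads):
    # 1. View-specific representations: one encoder per input view.
    hidden = {
        name: [math.tanh(v) for v in linear(x, *encoders[name])]
        for name, x in views.items()
    }
    # 2. Cross-view attention: score each view embedding against a shared
    #    query vector, then normalise the scores into weights.
    names = list(hidden)
    scores = [sum(q * h for q, h in zip(query, hidden[n])) for n in names]
    weights = softmax(scores)
    # 3. Fuse the views as an attention-weighted sum of their embeddings.
    fused = [
        sum(w * hidden[n][j] for w, n in zip(weights, names))
        for j in range(len(query))
    ]
    # 4. Multi-task output: one linear head per target on the fused vector.
    preds = {task: linear(fused, *head)[0] for task, head in heads.items()}
    return preds, dict(zip(names, weights))


rng = random.Random(0)
HIDDEN = 4  # arbitrary embedding size for this toy example

# Toy inputs for a single subject; real views would be far higher-dimensional.
views = {
    "microbiome": [0.2, 0.1, 0.7],
    "diet": [1.0, 0.0],
    "host": [0.5, 1.0, 0.0, 1.0],
}
encoders = {name: init_layer(rng, len(x), HIDDEN) for name, x in views.items()}
query = [rng.gauss(0.0, 1.0) for _ in range(HIDDEN)]
# Hypothetical task names; the abstract does not specify the imputed SCFAs.
heads = {t: init_layer(rng, HIDDEN, 1) for t in ("acetate", "propionate", "butyrate")}

preds, attn = forward(views, encoders, query, heads)
print(preds)
print(attn)
```

In a sketch like this, the attention weights give a per-subject indication of how much each view contributes to the fused representation, which is one simple way a model of this kind could attribute predictions to the microbiome, dietary, or host inputs. A trained model would learn the encoder, query, and head parameters from subjects with measured blood SCFAs.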