School of Mathematics and Statistics, Henan University of Science and Technology, Luoyang, China.
Big Data Institute, Central South University, Changsha, China.
Brief Bioinform. 2023 Nov 22;25(1). doi: 10.1093/bib/bbad465.
The development of colorectal cancer (CRC) generally passes through a critical state, or tipping point, between one stable state and another, beyond which a significant qualitative transition occurs. Gut microbiome sequencing data can be collected non-invasively from fecal samples, which makes them convenient to obtain. Furthermore, these data contain phylogenetic information at various levels, which can be used to reliably identify critical states and thereby provide more accurate and effective early warning signals. Yet pinpointing critical states from gut microbiome data is a formidable challenge because the data are high-dimensional and strongly noisy. To address this challenge, we introduce a novel approach, the specific network information gain (SNIG) method, to detect the critical states of CRC at various taxonomic levels from gut microbiome data. Numerical simulation indicates that the SNIG method is robust under different noise levels and outperforms existing methods in detecting critical states. Moreover, applying SNIG to two real CRC datasets enabled us to discern the critical states preceding deterioration and to identify their associated dynamic network biomarkers at different taxonomic levels. Notably, we discovered certain 'dark species' and pathways closely linked to CRC progression. In addition, we accurately detected the tipping points in an individual dataset of type I diabetes.