Lu Youtao, Zhou Xiaoyuan, Nardini Christine
CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, P. R. China.
Mol Biosyst. 2017 Sep 26;13(10):2083-2091. doi: 10.1039/c7mb00248c.
Amid the current deluge of omics data, module networks stand out as methods that not only identify inherently coherent groups (modules), thereby reducing dimensionality, but also hypothesize cause-effect relationships between modules and their regulators. Module networks were first designed in the transcriptomic era and later exploited in the multi-omic context, for example to assess miRNA regulation of gene expression. Despite a number of available implementations, the expansion of module networks to other omics is constrained by the limited characterization of the accuracy and stability of their solutions (modules plus regulators), a characterization that is urgently needed to better capture the complexity of molecular biology in silico. We therefore carefully assessed, for LemonTree (a popular open-source module network implementation), how the software's performance (sensitivity, specificity, false discovery rate, and stability of the solutions) depends on the input parameters and on data quality (sample size, expression noise), using both synthetic and real data. In the process, we uncovered and fixed an issue in the code for the regulator assignment procedure. We concluded this evaluation with a table of recommended parameter settings. Finally, we applied these recommended settings to gut metagenomic data from rheumatoid arthritis patients to characterize the evolution of the gut microbiome under different pharmaceutical regimens (methotrexate and prednisone), and, based on the computed module network, we inferred innovative clinical recommendations with therapeutic potential.
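The performance measures named in the abstract (sensitivity, specificity, false discovery rate) can be illustrated with a minimal sketch that compares predicted regulator-module assignments against a known ground truth, as one might do with synthetic data. This is not LemonTree's code or API, and it is not necessarily how the authors computed their figures: the function name `edge_metrics`, the representation of a solution as (regulator, module) pairs, and the toy inputs are all assumptions made for illustration.

```python
from itertools import product


def edge_metrics(predicted, truth, regulators, modules):
    """Score predicted (regulator, module) pairs against a ground truth.

    Hypothetical helper: the candidate universe is assumed to be every
    regulator paired with every module.
    """
    universe = set(product(regulators, modules))
    predicted = set(predicted) & universe
    truth = set(truth) & universe

    tp = len(predicted & truth)          # correctly predicted assignments
    fp = len(predicted - truth)          # predicted but not true
    fn = len(truth - predicted)          # true but missed
    tn = len(universe) - tp - fp - fn    # correctly left unassigned

    def ratio(num, den):
        # Return NaN rather than raising when a denominator is zero.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "fdr": ratio(fp, fp + tp),
    }


# Toy example (made-up labels, not data from the study).
regulators = ["R1", "R2", "R3"]
modules = ["M1", "M2"]
truth = {("R1", "M1"), ("R2", "M2")}
predicted = {("R1", "M1"), ("R3", "M2")}

metrics = edge_metrics(predicted, truth, regulators, modules)
print(metrics)
```

In this toy case one of the two true assignments is recovered and one of the two predictions is wrong, so sensitivity and false discovery rate are both 0.5, while three of the four non-assignments are correctly left out, giving a specificity of 0.75.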