Institut des Hautes Etudes Scientifiques, Le Bois-Marie 35 rte de Chartres, 91440, Bures-sur-Yvette, France.
Université Paris Diderot, Paris, France.
BMC Genomics. 2020 Sep 14;21(1):632. doi: 10.1186/s12864-020-07047-2.
Functional genomics employs several experimental approaches to investigate gene functions. High-throughput techniques, such as loss-of-function screening and transcriptome profiling, make it possible to identify lists of genes potentially involved in biological processes of interest (so-called hit lists). Several computational methods exist to analyze and interpret such lists; the most widespread aim either at identifying significantly enriched biological processes or at extracting significantly represented subnetworks.
Here we propose a novel network analysis method, with corresponding software, that uses a shortest-path approach and a centrality measure to discover members of the molecular pathways leading to the studied phenotype, based on functional genomics screening data. The method works on integrated interactomes that combine directed and undirected networks: HIPPIE, SIGNOR, SignaLink, TFactS, KEGG, TransmiR and miRTarBase. It finds nodes and short simple paths with significantly high centrality in subnetworks induced by the hit genes and by so-called final implementers, the genes involved in the molecular events responsible for the final phenotypic realization of the biological processes of interest. We present the application of the method to data from a miRNA loss-of-function screen and transcriptome profiling of terminal human muscle differentiation, and to a gene loss-of-function screen exploring the genes that regulate human oxidative DNA damage recognition. The analysis highlighted the possible role of several known myogenesis regulatory miRNAs (miR-1, miR-125b, miR-216a) and their targets (AR, NR3C1, ARRB1, ITSN1, VAV3, TDGF1). It also linked two major regulatory molecules of skeletal myogenesis, MYOD and SMAD3, to their previously known muscle-related targets (TGFB1, CDC42, CTCF) and to a number of proteins, such as C-KIT, that have not previously been studied in the context of muscle differentiation. The analysis also showed the role of the interaction between H3 and SETDB1 proteins in oxidative DNA damage recognition.
The current work provides a systematic methodology for discovering members of molecular pathways in integrated networks using functional genomics screening data. It also offers a valuable instrument for explaining why a set of genes not previously associated with the process of interest appears in the hit list of a particular functional genomics screen.
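The core idea described above, scoring nodes by how often they lie on shortest paths from hit genes to final implementers in a mixed directed and undirected network, can be illustrated with a minimal sketch. This is not the authors' software: the function names (`build_graph`, `shortest_paths`, `path_centrality`), the toy node labels and the fractional scoring rule are assumptions made for illustration, and the significance testing mentioned in the abstract is omitted.

```python
from collections import defaultdict, deque


def build_graph(directed_edges, undirected_edges):
    """Adjacency sets for an integrated network (hypothetical helper).

    Undirected interactions are stored as two opposite directed edges.
    """
    graph = defaultdict(set)
    for u, v in directed_edges:
        graph[u].add(v)
    for u, v in undirected_edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph


def shortest_paths(graph, source, target):
    """Return all shortest simple paths from source to target (BFS)."""
    dist = {source: 0}
    parents = defaultdict(list)
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                parents[nxt].append(node)
                queue.append(nxt)
            elif dist[nxt] == dist[node] + 1:
                # Another equally short route into nxt.
                parents[nxt].append(node)
    if target not in dist:
        return []
    # Walk the parent links back from target to source.
    paths, stack = [], [[target]]
    while stack:
        path = stack.pop()
        if path[-1] == source:
            paths.append(path[::-1])
        else:
            for parent in parents[path[-1]]:
                stack.append(path + [parent])
    return paths


def path_centrality(graph, hits, implementers):
    """Score intermediate nodes on hit -> implementer shortest paths.

    Each (hit, implementer) pair contributes a total weight of 1,
    split evenly across its shortest paths (an assumed scoring rule).
    """
    scores = defaultdict(float)
    for hit in hits:
        for impl in implementers:
            if hit == impl:
                continue
            paths = shortest_paths(graph, hit, impl)
            for path in paths:
                for node in path[1:-1]:
                    scores[node] += 1.0 / len(paths)
    return dict(scores)


# Toy network with made-up node labels (not real interaction data).
toy = build_graph(
    directed_edges=[("hitA", "n1"), ("n1", "implZ"),
                    ("hitB", "n1"), ("hitA", "n2")],
    undirected_edges=[("n2", "implZ")],
)
scores = path_centrality(toy, hits=["hitA", "hitB"], implementers=["implZ"])
print(sorted(scores.items()))  # [('n1', 1.5), ('n2', 0.5)]
```

In this toy network `n1` scores higher because it lies on shortest paths from both hits. A real analysis would still need to compare such scores against a null model, for example randomized hit lists, before calling a node significantly central.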