Research Programs Unit, Genome-Scale Biology, Faculty of Medicine, University of Helsinki, Helsinki, POB 63, 00014, Finland.
Comput Methods Programs Biomed. 2018 Jan;153:129-136. doi: 10.1016/j.cmpb.2017.10.013. Epub 2017 Oct 12.
BACKGROUND AND OBJECTIVE: High-throughput measurement technologies have triggered a rise in large-scale cancer studies containing multiple levels of molecular data. While there are a number of efficient methods to analyze individual data types, there are far fewer that enhance data interpretation after analysis. We present the R package Director, a dynamic visualization approach to linking and interrogating multiple levels of molecular data after analysis for clinically meaningful, actionable insights.
METHODS: Sankey diagrams are traditionally used to represent quantitative flows through multiple, distinct events. Regulation can be interpreted as a flow of biological information through a series of molecular interactions. Functions in Director introduce novel drawing capabilities to make Sankey diagrams robust to a wide range of quantitative measures and to depict molecular interactions as regulatory cascades. The package streamlines diagram creation: its input is a set of quantitative measurements in which nodes represent molecules of interest and paths represent the interaction strength between two molecules.
RESULTS: Director's utility is demonstrated with quantitative measurements of candidate microRNA-gene networks identified in an ovarian cancer dataset. A recent study reported eight miRNAs as master regulators of signature genes in epithelial-mesenchymal transition (EMT). The Sankey diagrams generated with data from this study further the interpretation of the miRNAs' roles by revealing potential co-regulatory behavior in the extracellular matrix (ECM). An additional analysis identified 32 genes differentially expressed between good- and poor-prognosis patients in four significant pathways (FDR ≤ 0.1), three of which support a complementary role of the ECM in ovarian cancer. The resulting diagram created with Director suggests that elevated levels of COL11A1, INHBA, and THBS2 (a signature feature of metastasis [1]) and decreased levels of their targeting miRNAs define poor prognosis.
CONCLUSION: We have demonstrated a visualization approach suitable for implementation in an analysis workflow, linking multiple levels of molecular data to gain a novel perspective on candidate biomarkers in a complex disease. The diagrams are dynamic, easily replicable, and rendered locally as HTML files to facilitate sharing. The R package Director is simple to use and widely available on all operating systems through Bioconductor (http://bioconductor.org/packages/Director) and GitHub (http://kzouchka.github.io/Director).
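The input model described under METHODS (nodes as molecules, paths as interaction strengths) can be illustrated with a short sketch. This is not Director's API, and it is written in Python rather than R; the `Interaction` class, the `build_sankey` function, the miRNA labels, and every numeric value are hypothetical placeholders chosen for illustration, not data from the study. It only shows how signed measurements might be mapped to the node and link lists that a Sankey renderer typically consumes, assuming link width tracks the absolute interaction strength and the sign is kept separately for coloring.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    source: str      # regulator, e.g. a miRNA (hypothetical label)
    target: str      # regulated molecule, e.g. a gene
    strength: float  # signed interaction strength, e.g. a correlation


def build_sankey(interactions, node_values):
    """Map signed measurements to (nodes, links) for a Sankey-style diagram."""
    # Collect node names in order of first appearance.
    names = []
    for it in interactions:
        for name in (it.source, it.target):
            if name not in names:
                names.append(name)
    index = {name: i for i, name in enumerate(names)}

    # Each node carries its own quantitative measure (0.0 if none was given).
    nodes = [{"name": n, "value": node_values.get(n, 0.0)} for n in names]

    # Width uses the magnitude so negative values can still be drawn;
    # the sign is stored separately so a renderer can color the path.
    links = [
        {
            "source": index[it.source],
            "target": index[it.target],
            "width": abs(it.strength),
            "sign": "negative" if it.strength < 0 else "positive",
        }
        for it in interactions
    ]
    return nodes, links


# Placeholder values only; gene names are taken from the abstract.
interactions = [
    Interaction("miRNA-A", "COL11A1", -0.6),
    Interaction("miRNA-A", "INHBA", -0.4),
    Interaction("miRNA-B", "THBS2", -0.5),
]
node_values = {
    "miRNA-A": -1.2, "miRNA-B": -0.8,
    "COL11A1": 2.0, "INHBA": 1.5, "THBS2": 1.7,
}
nodes, links = build_sankey(interactions, node_values)
```

Separating magnitude from sign is one simple way to make a flow diagram tolerate negative measures such as inverse correlations; how Director itself handles this is described in the package documentation.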