Bioinformatics Interdepartmental Program, University of California, Los Angeles (UCLA), Los Angeles, CA, USA.
Bioinformatics Interdepartmental Program, University of California, Los Angeles (UCLA), Los Angeles, CA, USA; Department of Bioengineering, UCLA, Los Angeles, CA, USA; Jonsson Comprehensive Cancer Center, UCLA, Los Angeles, CA, USA; Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, Los Angeles, CA, USA.
Cell Syst. 2024 Aug 21;15(8):679-693. doi: 10.1016/j.cels.2024.07.004.
Recent biological studies have been revolutionized in scale and granularity by multiplex and high-throughput assays. Profiling cell responses across several experimental parameters, such as perturbations, time, and genetic contexts, leads to richer and more generalizable findings. However, these multidimensional datasets necessitate a reevaluation of the conventional methods for their representation and analysis. Traditionally, experimental parameters are merged to flatten the data into a two-dimensional matrix, sacrificing crucial experiment context reflected by the structure. As Marshall McLuhan famously stated, "the medium is the message." In this work, we propose that the experiment structure is the medium in which subsequent analysis is performed, and the optimal choice of data representation must reflect the experiment structure. We review how tensor-structured analyses and decompositions can preserve this information. We contend that tensor methods are poised to become integral to the biomedical data sciences toolkit.
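The contrast the abstract draws between flattening and tensor-structured representation can be illustrated with a small NumPy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the axis sizes, the variable names, and the choice of a synthetic rank-1 signal. It shows only that merging two experimental parameters into one matrix axis keeps the values but hides the separate axes, and that a rank-1 description needs more parameters once the structure is flattened.

```python
import numpy as np

# Hypothetical experiment (sizes are illustrative assumptions):
# 4 perturbations x 5 time points x 3 genetic contexts.
rng = np.random.default_rng(0)
n_pert, n_time, n_ctx = 4, 5, 3

# Synthetic rank-1 three-way signal: one factor vector per experimental axis.
pert_factor = rng.random(n_pert)
time_factor = rng.random(n_time)
ctx_factor = rng.random(n_ctx)
tensor = np.einsum("i,j,k->ijk", pert_factor, time_factor, ctx_factor)

# Conventional flattening: merge time and genetic context into one column axis.
flat = tensor.reshape(n_pert, n_time * n_ctx)

# The values survive, but recovering the axes requires knowing the original
# shape, which the matrix itself no longer records.
restored = flat.reshape(n_pert, n_time, n_ctx)

# Number of parameters in a rank-1 description of each representation.
params_tensor = n_pert + n_time + n_ctx      # one vector per axis
params_matrix = n_pert + n_time * n_ctx      # row vector plus merged-column vector

print(tensor.shape, flat.shape)
print(params_tensor, params_matrix)
```

In this toy case the three-way description uses one short vector per experimental parameter, while the flattened description needs a separate entry for every time and context combination. A full tensor decomposition of real data would typically use a dedicated library rather than this hand-built example.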