Zhu Yan, Koleilat Mohamad Karim I, Roszik Jason, Kwong Man Kam, Wang Zhonglin, Maru Dipen M, Kopetz Scott, Kwong Lawrence N
Department of Translational Molecular Pathology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Department of Melanoma Medical Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
Cancers (Basel). 2024 May 15;16(10):1886. doi: 10.3390/cancers16101886.
A challenge in studying cancer transcriptomes is distilling the wealth of information into manageable portions. In this resource, we develop an approach that creates cancer type-specific gene expression modules and assembles them into flexible barcodes that can be adapted to a wide variety of uses. Specifically, we propose that modules derived organically from high-quality gold standards such as The Cancer Genome Atlas (TCGA) can accurately capture and describe functionally related genes that are relevant to specific cancer types. We show that such modules can: (1) uncover novel gene relationships and nominate new functional memberships, (2) improve and speed up analysis of smaller or lower-resolution datasets, (3) re-create and expand known cancer subtyping schemes, (4) act as a "decoder" to bridge seemingly disparate established gene signatures, and (5) efficiently apply single-cell RNA sequencing information to other datasets. Moreover, such modules can be combined with native spreadsheet program commands to create a powerful and rapid approach to hypothesis generation and testing that is readily accessible to non-bioinformaticians. Finally, we provide tools for users to create and interpret their own modules. Overall, the flexible, modular nature of the proposed barcoding provides a user-friendly approach to rapidly decoding transcriptome-wide data for research or, potentially, clinical uses.
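To make the idea of a module-based barcode concrete, the following is a minimal sketch and not the authors' procedure. It assumes that a module score is the mean per-gene z-score of the module's member genes, and that a barcode is formed by thresholding each module score at zero. The scoring rule, the threshold, the function names, and the gene and module names (`GENE_A`, `module_1`, and so on) are all hypothetical and chosen only for illustration; the abstract does not specify any of them.

```python
from statistics import mean, pstdev


def zscore_genes(expr):
    """Z-score each gene across samples. expr maps gene -> list of values."""
    out = {}
    for gene, values in expr.items():
        mu = mean(values)
        sd = pstdev(values)
        # A gene with no variation carries no signal; score it as 0.
        out[gene] = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
    return out


def module_scores(expr, modules):
    """Assumed scoring rule: mean z-score of the module genes present in expr."""
    z = zscore_genes(expr)
    n_samples = len(next(iter(expr.values())))
    scores = {}
    for name, genes in modules.items():
        present = [g for g in genes if g in z]
        if not present:
            continue  # skip modules with no measured genes
        scores[name] = [mean(z[g][i] for g in present) for i in range(n_samples)]
    return scores


def barcodes(scores, threshold=0.0):
    """Assumed barcode: one bit per module (sorted by name), 1 if score > threshold."""
    names = sorted(scores)
    if not names:
        return []
    n_samples = len(scores[names[0]])
    return [
        "".join("1" if scores[m][i] > threshold else "0" for m in names)
        for i in range(n_samples)
    ]


# Toy data: four samples, four hypothetical genes, two hypothetical modules.
expr = {
    "GENE_A": [1, 2, 9, 10],
    "GENE_B": [2, 1, 10, 9],
    "GENE_C": [8, 9, 1, 2],
    "GENE_D": [9, 8, 2, 1],
}
modules = {
    "module_1": ["GENE_A", "GENE_B"],
    "module_2": ["GENE_C", "GENE_D"],
}

codes = barcodes(module_scores(expr, modules))
print(codes)
```

In this toy example the first two samples have low `module_1` and high `module_2` scores, and the last two show the reverse, so the barcodes separate the samples into two groups. The same per-module averaging and thresholding could also be done with built-in spreadsheet functions, which is in the spirit of the accessibility the abstract describes.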